Customizing models for legal professionals | AI BriefWire
OpenAI News | Enterprise | Core AI | Heat 64
Harvey has partnered with OpenAI to build a custom-trained case law model for legal professionals. The model is intended to handle legal terminology and research workflows better than a general-purpose model. This matters because it could make AI more useful for legal research and document review.
Original article excerpt
Harvey partners with OpenAI to build a custom-trained model for legal professionals.
Over the past year, Harvey has established itself as a secure generative AI platform for professionals in law, tax, and finance. They’ve grown to a team of over 100 people, increased revenue over 10x in 2023, and raised $80M in Series B funding at a $715M valuation.
Recently, Harvey partnered with OpenAI to create a custom-trained case law model. This has allowed Harvey to deliver AI systems that help with tasks requiring complex reasoning, extensive domain knowledge, and capabilities beyond a single model call—such as drafting documents, answering questions about complex litigation scenarios, and identifying material discrepancies between hundreds of contracts.
Harvey was founded by Winston Weinberg, an attorney with a background in antitrust and securities litigation, and Gabe Pereyra, an AI researcher who previously worked on large language models (LLMs) for Google Brain and Meta. They saw an opportunity to use LLMs to synthesize information and present it to lawyers for review. “Both transactional work and litigation have been getting increasingly complex: there might be hundreds of thousands of contracts to go through for an international merger, and millions of emails to review for litigation,” Weinberg explained. With AI helping synthesize documents, lawyers can spend less time sifting through and drafting legal texts, and more time making decisions and helping clients.
An early proof point came when Weinberg and Pereyra pulled landlord/tenant questions from Reddit’s r/legaladvice and used GPT‑3 to generate answers, which they shared with attorneys. “For 86 out of 100 questions, the attorneys said they would have just sent the answer to the client without editing,” Weinberg said. “It was an aha moment.”
For case law research, the team at Harvey envisioned an experience where you could copy/paste a client question into a case law model, and it would answer that question thoroughly and cite all its sources. They tried the obvious techniques first: fine-tuning foundation models via public APIs and building retrieval-augmented generation (RAG) systems. But they ran into limitations with such a uniquely complex, open-ended use case. “If you just do retrieval, you can answer very simple questions about areas of law that you aren’t really an expert in, but that’s actually not that useful for most attorneys,” Weinberg explained. “With case law research, you’re finding ammo for your argument, and that’s much more difficult to do.”
Foundation models were strong at reasoning, but lacked the knowledge required for legal work. So, Harvey decided to partner with OpenAI to build a custom-trained model that would allow them to inject new knowledge, and ways of reasoning about that knowledge, into base models. “None of these problems have a clear-cut solution,” Pereyra said. “A lot of it was sitting down together, having our lawyers explain how case law research works, having our researchers show what we’ve done, and learning from OpenAI about the levers we had to approach the problem.”
Harvey and OpenAI worked together to add the depth of context needed, first starting with case law from Delaware, and then expanding to include all of U.S. case law. They added the equivalent of 10 billion tokens worth of data to power the custom-trained case law model.
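For readers unfamiliar with the retrieval-only approach described above, the following is a minimal sketch of the retrieval step in a RAG system, not Harvey's or OpenAI's implementation. It ranks passages by word-overlap cosine similarity and assembles a prompt that asks for cited answers. The function names, the prompt wording, and the sample passages are all hypothetical; the passages are invented placeholders, not statements of law.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, passages, k=2):
    """Return up to k passages that share the most vocabulary with the question."""
    q = Counter(tokenize(question))
    scored = [(cosine(q, Counter(tokenize(p))), p) for p in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for score, p in scored[:k] if score > 0]


def build_prompt(question, passages, k=2):
    """Assemble a prompt that asks a model to answer from numbered sources."""
    context = retrieve(question, passages, k)
    sources = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(context))
    return (
        "Answer using only the numbered sources and cite them.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )


# Hypothetical placeholder passages, invented for illustration only.
PASSAGES = [
    "A landlord must return the security deposit within thirty days of move-out.",
    "A merger agreement may include a material adverse change clause.",
    "A tenant may withhold rent when the landlord fails to make essential repairs.",
]

if __name__ == "__main__":
    print(build_prompt("Can a tenant withhold rent for repairs?", PASSAGES, k=1))
```

A lookup like this can surface a passage that matches the wording of a simple question, which is consistent with Weinberg's point: retrieval alone finds text that resembles the question, but it does not build an argument from it.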
To test the case law model, Harvey worked with 10 of the largest law firms. They provided attorneys with side-by-sides of the output from the custom case law model, versus the output from GPT‑4 for the same question. They were surprised by how strong the reaction was.