Small models as paralegals: LexisNexis distills models to build AI assistant


When legal research company LexisNexis created its AI assistant Protégé, it wanted to figure out the best way to leverage its expertise without deploying a large model.

Protégé aims to help lawyers, associates and paralegals write and proof legal documents and ensure that anything they cite in complaints and briefs is accurate. However, LexisNexis didn’t want a general legal AI assistant; it wanted to build one that learns a firm’s workflow and is more customizable.

LexisNexis saw the opportunity to bring in the power of large language models (LLMs) from Anthropic and Mistral and find the models that best answer user questions, Jeff Riehl, CTO of LexisNexis Legal and Professional, told VentureBeat.

“We use the best model for the specific use case as part of our multi-model approach. We use the model that provides the best result with the fastest response time,” Riehl said. “For some use cases, that will be a small language model like Mistral or we perform distillation to improve performance and reduce cost.”

While LLMs still provide value in building AI applications, some organizations turn to small language models (SLMs) or distill LLMs into smaller versions of the same model.

Distillation, where an LLM “teaches” a smaller model, has become a popular method for many organizations.
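
The article does not describe how LexisNexis performs distillation. As a rough illustration of the general idea only, the sketch below shows one classic formulation, in which the student is penalized for diverging from the teacher's temperature-softened output distribution. The function names and the temperature value are assumptions made for this example, and in practice LLM distillation is also often done by fine-tuning the smaller model on text generated by the larger one.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    Hypothetical helper for illustration: the loss is zero when the student
    reproduces the teacher exactly and grows as the two distributions diverge.
    """
    p = softmax(teacher_logits, temperature)  # teacher ("soft labels")
    q = softmax(student_logits, temperature)  # student prediction
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# A student that matches the teacher incurs no loss; a mismatched one does.
matched = distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mismatched = distillation_loss([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

During training, a loss of this kind would be minimized over many examples so that the smaller model learns to imitate the larger one's behavior at lower inference cost.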

Small models often work best for apps like chatbots or simple code completion, which is what LexisNexis wanted to use for Protégé.

This is not the first time LexisNexis has built AI applications; it did so even before launching its legal research hub LexisNexis + AI in July 2024.

“We have used a lot of AI in the past, which was more around natural language processing, some deep learning and machine learning,” Riehl said. “That really changed in November 2022 when ChatGPT was launched, because prior to that, a lot of the AI capabilities were kind of behind the scenes. But once ChatGPT came out, the generative capabilities, the conversational capabilities of it was very, very intriguing to us.”

Small, fine-tuned models and model routing

Riehl said LexisNexis uses different models from most of the major model providers when building its AI platforms. LexisNexis + AI used Claude models from Anthropic, OpenAI’s GPT models and a model from Mistral.

This multi-model approach helped break down each task users wanted to perform on the platform. To do this, LexisNexis had to architect its platform to switch between models.

“We would break down whatever task was being performed into individual components, and then we would identify the best large language model to support that component. One example of that is we will use Mistral to assess the query that the user entered in,” Riehl said.

For Protégé, the company wanted faster response times and models more fine-tuned for legal use cases. So it turned to what Riehl calls “fine-tuned” versions of models, essentially smaller weight versions of LLMs or distilled models.

“You don’t need GPT-4o to do the assessment of a query, so we use it for more sophisticated work, and we switch models out,” he said.

When a user asks Protégé a question about a specific case, the first model it pings is a fine-tuned Mistral “for assessing the query, then determining what the purpose and intent of that query is” before switching to the model best suited to complete the task. Riehl said the next model could be an LLM that generates new queries for the search engine or another model that summarizes results.
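
The routing flow Riehl describes can be pictured as a two-stage pipeline: a small model classifies the query, then a dispatcher hands it to whichever model suits that intent. The sketch below is a toy illustration under heavy assumptions, not LexisNexis's implementation. Keyword rules stand in for the fine-tuned Mistral classifier, plain functions stand in for the downstream models, and every name and intent label is hypothetical.

```python
from typing import Callable, Dict

def assess_query(query: str) -> str:
    """Stand-in for the small model that determines a query's intent.

    Assumption: a real system would call a fine-tuned classifier here;
    simple keyword rules keep this example self-contained.
    """
    text = query.lower()
    if "summarize" in text or "summary" in text:
        return "summarize"
    if "find" in text or "search" in text:
        return "search"
    return "draft"

# Hypothetical downstream "models", each represented by a plain function.
def search_model(query: str) -> str:
    return f"[search-model] new search queries for: {query}"

def summary_model(query: str) -> str:
    return f"[summary-model] summary for: {query}"

def drafting_model(query: str) -> str:
    return f"[drafting-model] draft for: {query}"

ROUTES: Dict[str, Callable[[str], str]] = {
    "search": search_model,
    "summarize": summary_model,
    "draft": drafting_model,
}

def route(query: str) -> str:
    """Assess intent with the cheap model, then dispatch to the matching handler."""
    intent = assess_query(query)
    return ROUTES[intent](query)

answer = route("Summarize the holding in this case")
```

The appeal of this design is that the cheap classification step runs on every request, while larger and slower models are only invoked for the work that needs them.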

Right now, LexisNexis mostly relies on a fine-tuned Mistral model, though Riehl said it used a fine-tuned version of Claude “when it first came out; we are not using it in the product today but in other ways.” LexisNexis is also interested in using other OpenAI models, especially since the company came out with new reinforcement fine-tuning capabilities last year. LexisNexis is in the process of evaluating OpenAI’s reasoning models, including o3, for its platforms.

Riehl added that it may also look at using Gemini models from Google.

LexisNexis backs all of its AI platforms with its own knowledge graph to perform retrieval augmented generation (RAG) capabilities, especially as Protégé could help launch agentic processes later.
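
The article gives no detail on how the knowledge graph feeds retrieval, so the following is only a minimal sketch of the general pattern: look up graph facts about entities named in the question, then place those facts in the prompt so the model answers from them. The graph contents, entity names and function names are invented placeholders, not real cases or a real LexisNexis API.

```python
from typing import Dict, List, Tuple

# Toy knowledge graph: entity -> list of (relation, target) edges.
# All entries are made-up placeholders for illustration.
GRAPH: Dict[str, List[Tuple[str, str]]] = {
    "Case A": [("cites", "Case B"), ("interprets", "Statute X")],
    "Case B": [("overruled_by", "Case C")],
}

def retrieve_facts(query: str, graph: Dict[str, List[Tuple[str, str]]]) -> List[str]:
    """Return graph facts for every entity whose name appears in the query."""
    lowered = query.lower()
    facts = []
    for entity, edges in graph.items():
        if entity.lower() in lowered:
            for relation, target in edges:
                facts.append(f"{entity} {relation} {target}")
    return facts

def build_prompt(query: str, graph: Dict[str, List[Tuple[str, str]]] = GRAPH) -> str:
    """Assemble a grounded prompt: retrieved facts first, then the question."""
    facts = retrieve_facts(query, graph)
    context = "\n".join(f"- {fact}" for fact in facts) or "- (no matching facts)"
    return f"Answer using only these facts:\n{context}\n\nQuestion: {query}"

prompt = build_prompt("Is Case B still good law?")
```

A production system would use entity linking and graph queries rather than substring matching, but the shape is the same: retrieval narrows what the model may cite before generation begins.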

Even before the advent of generative AI, LexisNexis tested the possibility of putting chatbots to work in the legal industry. In 2017, the company tested an AI assistant that would compete with IBM’s Watson-powered Ross. Protégé sits in the company’s LexisNexis + AI platform, which brings together the AI services of LexisNexis.

Protégé helps law firms with tasks that paralegals or associates tend to do. It helps write legal briefs and complaints that are grounded in firms’ documents and data, suggest next steps in legal workflows, suggest new prompts to refine searches, draft questions for depositions and discovery, link quotes in filings for accuracy, generate timelines and, of course, summarize complex legal documents.

“We see Protégé as the initial step in personalization and agentic capabilities,” Riehl said. “Think about the different types of lawyers: M&A, litigators, real estate. It’s going to continue to get more and more personalized based on the specific task you do. Our vision is that every legal professional will have a personal assistant to help them do their job based on what they do, not what other lawyers do.”

Protégé now competes against other legal research and technology platforms. Thomson Reuters customized OpenAI’s o1-mini model for its CoCounsel legal assistant. Harvey, which raised $300 million from investors including LexisNexis, also has a legal AI assistant.