LLM (Large Language Model)
A large language model (LLM) is an AI system trained on large volumes of text data to predict and generate human-like text; it serves as the core engine underlying most legal AI tools for research, drafting, and document analysis.
Last reviewed: 2026/05/19
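To make "trained on text to predict and generate text" concrete, here is a minimal sketch of next-word prediction. It is a toy word-pair (bigram) counter, not how a production LLM is built: real models use neural networks over tokens and far larger corpora. The corpus, function names, and greedy generation loop are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Tiny illustrative corpus (hypothetical, not real training data).
CORPUS = (
    "the court held that the contract was void . "
    "the tribunal held that the contract was void . "
    "the court found that the contract was valid ."
)

def train_bigrams(text):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def most_likely_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

def generate(counts, start, max_words=8):
    """Greedily extend `start` one predicted word at a time."""
    out = [start]
    while len(out) < max_words:
        nxt = most_likely_next(counts, out[-1])
        if nxt is None:
            break
        out.append(nxt)
        if nxt == ".":  # stop at the end of a sentence
            break
    return " ".join(out)

model = train_bigrams(CORPUS)
print(generate(model, "the"))
```

The sketch produces fluent-looking text purely from word frequencies, with no check on whether the statement is true. That is the reason output from any predictive text model needs verification against a source.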
Frequently Asked Questions
- Q1: What is the difference between an LLM and a traditional legal research database?
- A traditional legal database is a structured repository that retrieves documents matching search queries. An LLM generates new text based on statistical patterns learned during training. On its own it does not retrieve documents; it produces text that reads like an authoritative answer. Legal research AI tools typically combine the two, using LLMs for language understanding and generation and databases for sourcing verified legal content.
- Q2: Are LLMs trained on confidential client information?
- That depends on the tool and the contract. General-purpose LLMs are trained largely on publicly available text. When a lawyer submits client documents to a legal AI tool, whether that data is used for model training depends on the vendor's data processing terms. Most enterprise legal AI vendors state that submitted content is not used for model training, but this should be confirmed in the vendor agreement before submitting any client-confidential material.
- Q3: How often do legal AI tools update their underlying LLM?
- Update frequency varies by vendor and is not always disclosed publicly. Foundation models are periodically updated or replaced by their developers, and legal AI vendors may update their applications to use newer model versions. However, the legal content corpus used in RAG applications is typically updated more frequently than the underlying model weights.
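The combination described in Q1, a database for sourcing content plus an LLM for generation, can be sketched as a retrieve-then-prompt step. This is a simplified illustration under stated assumptions: the three documents are invented, retrieval is plain word overlap rather than the search a real product would use, and the final call to a language model is left out because it depends on the vendor.

```python
import re

# Hypothetical mini-corpus standing in for a vetted legal database.
DOCUMENTS = {
    "doc-1": "A contract requires offer, acceptance, and consideration.",
    "doc-2": "The limitation period for breach of contract claims varies by jurisdiction.",
    "doc-3": "Trademark registration protects distinctive marks used in commerce.",
}

def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(query, documents, top_k=2):
    """Rank documents by the number of words they share with the query."""
    q = tokenize(query)
    scored = [(len(q & tokenize(body)), doc_id) for doc_id, body in documents.items()]
    scored = [(score, doc_id) for score, doc_id in scored if score > 0]
    # Highest overlap first; ties broken by document id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:top_k]]

def build_prompt(query, documents, doc_ids):
    """Assemble a prompt asking the model to answer only from the sources."""
    sources = "\n".join(f"[{d}] {documents[d]}" for d in doc_ids)
    return (
        "Answer using only the sources below and cite their ids.\n"
        f"Sources:\n{sources}\n"
        f"Question: {query}"
    )

query = "What is the limitation period for a breach of contract claim?"
hits = retrieve(query, DOCUMENTS)
prompt = build_prompt(query, DOCUMENTS, hits)
print(hits)
print(prompt)
```

Because the sources live outside the model, the corpus can be refreshed without retraining, which is consistent with the point in Q3 that content is typically updated more often than model weights.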
Related Concepts
Hallucination (in Legal AI)
Hallucination in legal AI refers to instances where an AI model generates factually incorrect, fabricated, or unsupported output — such as nonexistent case citations, invented statutes, or inaccurate summaries of legal holdings — presented with apparent confidence.
RAG (Retrieval-Augmented Generation)
Retrieval-Augmented Generation (RAG) is an AI architecture that combines a retrieval system — which fetches relevant documents from a specified corpus — with a generative language model that produces answers grounded in those retrieved documents, rather than relying solely on the model's training data.
Fine-tuning
Fine-tuning is the process of further training a pre-trained large language model on a domain-specific dataset to improve its performance on tasks in that domain, such as legal document analysis, contract drafting, or jurisdiction-specific research.
Training Data
Training data is the corpus of text and examples used to train a large language model, establishing its capabilities, knowledge, and limitations; the quality, recency, and composition of training data directly affects the model's reliability for legal tasks.
Related Tools
- Harvey AI
The most expensive legal AI in the market — Am Law 100 firms only.
- CoCounsel
Thomson Reuters' GPT-backed research and drafting with Westlaw integration.
- Westlaw Precision AI
AI-powered legal research with citation-validated answers from Westlaw.
- Lexis+ AI
Conversational legal research with real-time Shepard's citation validation.
- Spellbook
AI contract drafting and review inside Microsoft Word for transactional lawyers.
Last reviewed: 2026/05/19. Definitions are written by the LawyerAI Editorial team. We do not accept affiliate commissions; Featured placement is clearly labeled and does not influence editorial content.