LawyerAI Independent Reviews

TAR vs. CAL in eDiscovery

Two AI-assisted document review approaches in eDiscovery: TAR 1.0 uses a frozen trained model; CAL continuously updates as reviewers code documents.

Last reviewed: 2026/05/22

Frequently Asked Questions

Is TAR/CAL accepted by courts as defensible in all federal districts?
TAR is broadly accepted in federal courts following Da Silva Moore and its progeny, but acceptance is not uniform across districts, and some state courts remain skeptical. The critical factor is process transparency: courts that have accepted TAR have done so because the producing party documented its methodology clearly and was willing to explain it. Courts that have been critical of TAR productions focused on failures of protocol documentation rather than the technology itself. Check your specific district's local rules and any standing ESI orders.
Can the requesting party demand to review the producing party's seed set?
This is one of the most litigated procedural questions in TAR discovery. Courts are divided. Some hold that seed set documents are work product and not discoverable; others require disclosure of at least the non-privileged coding decisions. The strongest approach is to negotiate this issue in the Rule 26(f) meet-and-confer and reach an agreement before the protocol is in place. Stipulating in advance that seed set coding will not be challenged typically resolves the issue without court involvement.
How does TAR/CAL interact with privilege review?
TAR and CAL tools can prioritize documents by relevance, but privilege review is typically handled as a separate process — often a second-pass review of the relevant population using keyword filters and attorney-client relationship data before production. Some platforms offer "privilege prediction" models, but attorney-client privilege and work product protection determinations require attorney judgment on specific documents. Privilege prediction should be used to prioritize attorney review, not to substitute for it.

Related Concepts

Tech / Model

Continuous Active Learning (CAL)

An eDiscovery review method where the AI updates its relevance predictions after every reviewer decision, continuously prioritizing the most likely-relevant documents.

Legal Practice

eDiscovery (Electronic Discovery)

The process of identifying, preserving, collecting, processing, reviewing, and producing electronically stored information in litigation or regulatory investigations under FRCP and equivalent rules.

Legal Practice

Bates Numbering

Bates numbering assigns a unique sequential identifier to every page of every document produced in litigation, enabling parties, witnesses, and courts to cite exhibits unambiguously.

Related Tools

  • Relativity

    The industry-standard e-discovery platform for processing, reviewing, and analyzing large-scale document collections.

  • Everlaw

    Cloud eDiscovery with AI predictive coding and document summarization.

  • Casepoint

    Cloud-based eDiscovery and legal hold platform with AI-powered document review for government and enterprise.

  • Logikcull

    Self-service eDiscovery platform designed for instant setup, used by solo firms through Fortune 500 legal teams.

Related Reading

  • The Complete Guide to AI in Litigation and eDiscovery (2026)

Last reviewed: 2026/05/22. Definitions are written by the LawyerAI Editorial team. We do not accept affiliate commissions; Featured placement is clearly labeled and does not influence editorial content.


Definition

Technology-Assisted Review (TAR) is an umbrella term for using machine learning to prioritize, classify, or predict the relevance of documents in litigation document review. Two distinct implementations of TAR have evolved into industry standards, and understanding the difference between them is essential for litigation attorneys managing large-scale eDiscovery: TAR 1.0 (predictive coding) and TAR 2.0, also known as Continuous Active Learning (CAL).

TAR 1.0, also called predictive coding, works in discrete phases: the review team assembles a control set of randomly selected documents, a senior attorney reviews a seed set of documents to train the model, and the trained model then predicts relevance scores for the remaining document corpus. Once the model is validated against the control set and deemed accurate, it is "frozen" — it stops learning and applies static predictions to the full review population. The model does not update as reviewers continue to code documents.

CAL operates differently. In a CAL workflow, the model updates continuously as reviewers code documents. Each relevance decision a reviewer makes (relevant or not relevant) is fed back into the model in near-real time. The model constantly re-ranks the remaining uncoded documents to surface the most likely-relevant items next. There is no static training phase or model freeze. The model learns throughout the entire review period.

This architectural difference has practical consequences for review efficiency, defensibility, timeline, and cost.

Why It Matters for Lawyers

The choice between TAR 1.0 and CAL is not primarily a technology decision. It is a litigation strategy decision with evidentiary, scheduling, and cost implications that attorneys need to understand.

Document review is one of the largest and most controllable costs in complex litigation. A 2012 RAND Institute for Civil Justice study, Where the Money Goes, found that review accounted for roughly 73% of the cost of producing electronically stored information in the cases it examined; in large commercial cases that cost can run to millions of dollars. Proportionality, the governing standard under FRCP Rule 26(b)(1) since the 2015 amendments, requires that the burden of discovery be proportional to the needs of the case. Courts have accepted TAR as a proportionality tool in numerous rulings, and the choice of TAR protocol is now a standard topic in meet-and-confer discussions under FRCP Rule 26(f).

Judicial acceptance of predictive coding began with the seminal ruling in Da Silva Moore v. Publicis Groupe (S.D.N.Y. 2012), in which Magistrate Judge Andrew Peck became the first US federal judge to approve the use of predictive coding for document review. Judge Peck's ruling made a transparent process the touchstone: the producing party should be willing to explain its TAR methodology to opposing counsel and the court. He returned to the subject in Rio Tinto plc v. Vale S.A. (S.D.N.Y. 2015), writing that it is now "black letter law" that courts will permit a producing party to use TAR for document review, while leaving open how much seed set disclosure is required.

Livingston v. City of Chicago (N.D. Ill. 2020) extended judicial acceptance to CAL specifically, allowing the producing party to use a continuous active learning approach over the objection of the requesting party. The court's analysis emphasized FRCP proportionality: CAL achieved comparable recall at substantially lower cost than manual review, satisfying the producing party's discovery obligations efficiently.

For litigators, the practical stakes are: which methodology achieves the required recall at lower cost and is most defensible if challenged by opposing counsel or scrutinized by the court?

How It Works (Technical)

The underlying machinery of both approaches can be understood through a concrete analogy. Imagine you are trying to find all the relevant fish in a large, murky lake. Manual review is fishing with a rod — you pull out fish one by one. TAR 1.0 is building a fish-finder device: you study a sample of the lake, calibrate the device to the fish characteristics in that sample, then run the device across the full lake and rank locations by likelihood of containing fish. Once you've built and calibrated the device, it doesn't change — it applies the same calibration to every square meter. CAL is a self-adjusting fish-finder: every fish you catch (and don't catch) updates the device's calibration in real time, so the most promising areas keep shifting based on new information.

TAR 1.0 workflow in practice:

  1. The review team assembles a randomly selected control set, typically 2,000–10,000 documents, which is set aside to validate model performance.

  2. A senior attorney or subject matter expert reviews a seed set (often 500–2,000 documents) to train the model with initial positive and negative relevance examples.

  3. The trained model assigns relevance scores to the full document corpus.

  4. The team reviews the highest-scoring documents first, periodically validating model performance against the control set using standard recall and precision metrics.

  5. When validation confirms the model has achieved sufficient performance (typically 75%+ recall against the control set), the model is frozen and its predictions are applied to the remaining documents for final disposition.
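The validation step in this workflow can be illustrated with a minimal Python sketch. It assumes each control-set document carries a model score and an attorney coding; the `ControlDoc` type, the 0.5 cutoff, and the sample values are hypothetical and not taken from any platform.

```python
from dataclasses import dataclass

@dataclass
class ControlDoc:
    score: float     # model's predicted relevance, 0.0 to 1.0
    relevant: bool   # attorney's coding of this control-set document

def control_set_recall(control_set, cutoff):
    """Share of truly relevant control documents scoring at or above the cutoff."""
    relevant = [d for d in control_set if d.relevant]
    if not relevant:
        raise ValueError("control set contains no relevant documents")
    found = sum(1 for d in relevant if d.score >= cutoff)
    return found / len(relevant)

def ready_to_freeze(control_set, cutoff, target_recall=0.75):
    """True when control-set recall at this cutoff meets the agreed target."""
    return control_set_recall(control_set, cutoff) >= target_recall

# Tiny hypothetical control set: four relevant and four non-relevant documents.
control = [
    ControlDoc(0.92, True), ControlDoc(0.81, True), ControlDoc(0.64, True),
    ControlDoc(0.30, True), ControlDoc(0.55, False), ControlDoc(0.12, False),
    ControlDoc(0.08, False), ControlDoc(0.71, False),
]
recall_at_half = control_set_recall(control, cutoff=0.5)
print(recall_at_half, ready_to_freeze(control, cutoff=0.5))
```

At a cutoff of 0.5, three of the four relevant control documents score high enough, so recall is 0.75 and the 75% target is just met. Raising the cutoff to 0.7 drops recall to 0.5, which is why the cutoff and the recall target are usually fixed in the protocol rather than chosen after the fact.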

CAL workflow in practice:

In CAL, there is no static seed set and no up-front control set. Reviewers begin coding documents immediately, typically starting with a random sample to expose the model to document variety, or with known relevant documents if they exist. As documents are coded, the model updates its rankings continuously. The platform surfaces the highest-predicted-relevance uncoded documents first, so reviewers always work on the most likely-relevant items. Review continues until the rate at which new relevant documents turn up falls below a defined threshold, typically when the proportion of relevant documents per batch is low enough to indicate diminishing returns.
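The loop described here can be sketched in a few lines of Python. This is a toy under stated assumptions, not any vendor's algorithm: the model is a crude per-term weight table, re-ranking happens once per batch rather than after every decision, and `cal_review`, `stop_rate`, and the synthetic corpus are all hypothetical.

```python
from collections import defaultdict

def cal_review(corpus, code_document, batch_size=10, stop_rate=0.1):
    """Toy CAL loop. corpus maps doc id -> set of terms; code_document(doc_id)
    stands in for a human reviewer and returns True for a relevant document."""
    weights = defaultdict(float)  # term -> learned weight (toy linear model)
    coded = {}                    # doc id -> reviewer decision

    def score(doc_id):
        return sum(weights[term] for term in corpus[doc_id])

    while len(coded) < len(corpus):
        uncoded = [d for d in corpus if d not in coded]
        # Re-rank everything still uncoded and take the top of the list.
        batch = sorted(uncoded, key=score, reverse=True)[:batch_size]
        hits = 0
        for doc_id in batch:
            is_relevant = code_document(doc_id)
            coded[doc_id] = is_relevant
            hits += is_relevant
            # Feed the decision straight back into the model.
            for term in corpus[doc_id]:
                weights[term] += 1.0 if is_relevant else -1.0
        # Stop once a batch yields too few relevant documents.
        if hits / len(batch) < stop_rate:
            break
    return coded

# Synthetic corpus: every fifth document is relevant and mentions "pricing".
corpus = {i: ({"pricing", "memo"} if i % 5 == 0 else {"lunch", "memo"})
          for i in range(100)}
truth = {i: i % 5 == 0 for i in corpus}

coded = cal_review(corpus, truth.__getitem__)
reviewed = len(coded)
found = sum(coded.values())
print(reviewed, found)
```

On this deliberately easy corpus the loop stops after coding 40 of the 100 documents, having found all 20 relevant ones. Real corpora are far noisier, so the stopping threshold is normally paired with a separate validation sample rather than trusted on its own.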

Performance benchmarking:

Research from Gordon Cormack and Maura Grossman, published through the TREC (Text Retrieval Conference) Legal Track, is the most cited independent source for TAR vs. CAL performance comparison. Their findings indicate that CAL achieves comparable or higher recall than TAR 1.0 at materially lower review effort in the majority of tested scenarios. Industry benchmarks from major eDiscovery platforms generally indicate:

  • CAL: 85–95% recall at 20–40% of documents reviewed
  • TAR 1.0: 70–85% recall at 40–60% of documents reviewed

These figures vary substantially by corpus complexity, document type, and reviewer consistency. They should be treated as illustrative ranges, not guarantees.
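To see what these ranges mean in document counts, the short sketch below applies them to a hypothetical 500,000-document corpus. The corpus size is an assumption chosen for round numbers, and the percentages are the illustrative ranges quoted above, not measured results.

```python
def review_range(corpus_size, low_pct, high_pct):
    """Documents reviewed at the low and high end of a percentage range."""
    return corpus_size * low_pct // 100, corpus_size * high_pct // 100

corpus_size = 500_000                            # hypothetical corpus
cal_docs = review_range(corpus_size, 20, 40)     # CAL: 20-40% reviewed
tar1_docs = review_range(corpus_size, 40, 60)    # TAR 1.0: 40-60% reviewed

print("CAL reviews between", cal_docs[0], "and", cal_docs[1], "documents")
print("TAR 1.0 reviews between", tar1_docs[0], "and", tar1_docs[1], "documents")
```

Under these assumptions CAL would involve reviewing 100,000 to 200,000 documents and TAR 1.0 would involve 200,000 to 300,000, so the two ranges meet at the boundary rather than one always beating the other.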

Recall vs. precision defined for litigators:

Recall measures what percentage of all actually relevant documents were found. Precision measures what percentage of the documents designated as relevant were actually relevant. High recall means few relevant documents were missed, which is the producing party's primary obligation. High precision means reviewers' time is spent efficiently and fewer irrelevant documents end up in the produced set. In litigation, recall is the primary defensibility metric; precision drives review cost. CAL generally achieves better recall at lower review cost by continuously prioritizing the highest-value documents.
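A short worked example may make the two metrics concrete. The counts below are invented for illustration: suppose 10,000 relevant documents exist in the corpus, and the review designates 9,500 documents as relevant, of which 8,000 truly are.

```python
def recall_precision(true_positives, false_negatives, false_positives):
    """Return (recall, precision) from the three basic review counts."""
    recall = true_positives / (true_positives + false_negatives)
    precision = true_positives / (true_positives + false_positives)
    return recall, precision

# Hypothetical review outcome.
true_positives = 8_000    # relevant documents the review found
false_negatives = 2_000   # relevant documents the review missed
false_positives = 1_500   # non-relevant documents designated as relevant

recall, precision = recall_precision(true_positives, false_negatives, false_positives)
print(f"recall = {recall:.1%}, precision = {precision:.1%}")
```

Here recall is 80% (8,000 of 10,000 relevant documents found) and precision is about 84% (8,000 of 9,500 designated documents were actually relevant). The missed 2,000 documents are what a recall target in a protocol is meant to bound.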

How Legal AI Vendors Address It

Relativity Active Learning is the most widely used CAL implementation in the AmLaw 100 and large corporate eDiscovery programs. It is built into the Relativity platform and can be configured by Relativity Certified Administrators with Active Learning experience. Relativity Active Learning offers the deepest feature set for protocol documentation, control set management, and performance reporting, all of which matter for defensibility. Limitation: Relativity's depth requires skilled administrators; improperly configured Active Learning workflows produce poor results and can create defensibility issues rather than resolving them. Organizations without certified Relativity administrators should use a managed services provider for Active Learning configuration.

Everlaw Predict provides CAL in a cloud-native platform that is faster to deploy than Relativity for straightforward matters. Everlaw's interface is more accessible for legal teams without dedicated eDiscovery specialists, and the platform's shared-workspace model facilitates cooperation with opposing counsel in agreed-protocol productions. Limitation: Everlaw offers less granular control over CAL configuration than Relativity Active Learning; for highly complex or disputed eDiscovery protocols, the platform's lighter controls may create defensibility gaps.

Casepoint has strong adoption in government and FOIA-intensive matters, with a reliable CAL implementation and strong support for the specific document types and review protocols common in federal regulatory investigations. Limitation: Casepoint offers less customization of CAL parameters than Relativity; for cutting-edge protocol designs (e.g., multi-model ensemble approaches), the platform's configuration options are more constrained.

Logikcull targets small to mid-size law firms and corporate legal departments handling discovery in-house for the first time. Its AI features are simpler than enterprise platforms and include basic predictive prioritization. Limitation: Logikcull's AI-assisted review is not a full TAR 1.0 or CAL implementation in the technical sense — it lacks the protocol documentation features and performance validation tools that courts expect when TAR is raised in discovery disputes.

How Lawyers Should Negotiate and Document TAR/CAL Protocol

  1. Raise TAR/CAL in the Rule 26(f) meet-and-confer. Do not wait until after collection to discuss methodology. Da Silva Moore made transparency the touchstone, and courts look favorably on parties that raise TAR methodology early and cooperatively. Identify whether your party or opposing counsel intends to use TAR, which implementation, and the expected recall target.

  2. Document the agreed protocol in the Rule 26(f) report or a separate ESI protocol stipulation. The protocol should specify: methodology (TAR 1.0 or CAL), platform, recall target (typically 75–85% depending on case complexity), validation approach, and what production documentation the producing party will provide to demonstrate methodology compliance. A written protocol signed by both parties and entered by the court is your primary defense if the production is challenged.

  3. Maintain all training and validation logs. For TAR 1.0, retain the seed set coding decisions, control set validation results, and model performance metrics at each iteration. For CAL, retain the platform's built-in performance reports and log the total documents reviewed, total found relevant, and estimated recall at conclusion. These records demonstrate that the methodology was correctly applied.

  4. Specify who has authority to code training documents. Da Silva Moore and subsequent cases emphasize that senior attorneys or subject matter experts, not contract reviewers, should make the coding decisions that train the model. Treat this as a defensibility safeguard, not merely a best practice. Document who made which training decisions and their qualifications.

  5. Consider a joint expert or neutral protocol referee for high-stakes disputes. In cases where TAR methodology is likely to be contested (e.g., the requesting party has objected or the court has expressed skepticism), consider retaining a joint eDiscovery neutral — a role recognized in several federal districts — to review and certify the protocol. This pre-empts disputes about methodology after production.
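The closing record described in steps 2 and 3 can be kept in a simple structured form. The sketch below is one hypothetical layout, not a court-approved template: the `CalReviewLog` fields and the platform name are invented for illustration. It also assumes recall is estimated from an elusion sample (a random sample of unreviewed documents), which is a common validation method but is not prescribed by the text above.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class CalReviewLog:
    """Hypothetical closing record for a CAL review; field names are illustrative."""
    methodology: str
    platform: str
    recall_target: float
    corpus_size: int
    documents_reviewed: int
    relevant_found: int
    elusion_sample_size: int      # random sample drawn from unreviewed documents
    elusion_sample_relevant: int  # how many of that sample were coded relevant

    def estimated_recall(self) -> float:
        """Point estimate only; it ignores the sampling error of the elusion sample."""
        unreviewed = self.corpus_size - self.documents_reviewed
        elusion_rate = self.elusion_sample_relevant / self.elusion_sample_size
        estimated_missed = elusion_rate * unreviewed
        return self.relevant_found / (self.relevant_found + estimated_missed)

    def meets_target(self) -> bool:
        return self.estimated_recall() >= self.recall_target

log = CalReviewLog(
    methodology="CAL", platform="ExamplePlatform", recall_target=0.80,
    corpus_size=100_000, documents_reviewed=30_000, relevant_found=9_000,
    elusion_sample_size=1_500, elusion_sample_relevant=15,
)
record = asdict(log)
record["estimated_recall"] = round(log.estimated_recall(), 4)
print(json.dumps(record, indent=2))
```

With these invented numbers, 1% of the elusion sample is relevant, which implies roughly 700 relevant documents among the 70,000 left unreviewed and an estimated recall of about 92.8%. A production record would normally also report a confidence interval around that estimate.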