Zero Data Retention in Legal AI: Why It Matters for Lawyers

Vendors say they don't train on your data. Four dodge patterns make that claim meaningless without the right DPA. This guide explains how to verify genuine ZDR commitments.

Publisher

Sarah Chen, Senior Legal Tech Analyst

2026/11/11

The sales rep says "we don't train on your data." You sign. Six months later, a Wired story reveals the vendor's ToS carves out "aggregated, anonymized" training rights. Your client's confidential matter language is in someone else's model.

About This Guide

This is our analysis of zero data retention claims in legal AI tools in 2026, written for lawyers, legal ops professionals, and firm technology leaders who need to evaluate AI vendor data handling before deployment. LawyerAI built this guide. We earn no affiliate revenue from these tools.

Here are the 4 rules we set for ourselves before writing this:

Each platform gets a real limitation. Even tools we recommend.
We state pricing when published, and mark "not published" when vendors don't disclose.
Accuracy numbers come only from independent benchmarks (Stanford RegLab, etc.). Vendor-authored accuracy claims don't count.
The decision tree near the end sends you to the right tool for your primary job.

We re-review this list every quarter.

Short Answer

Harvey AI has the most credible contractual ZDR commitment for enterprise legal work. Lexis+ AI and CoCounsel provide enterprise DPA no-training clauses at the research tool tier. Spellbook provides ZDR commitments at SMB scale with lighter audit rights. General LLMs without enterprise agreements should never be used for privileged client content under any circumstances.

ZDR Commitment Comparison

Tool	No-Training DPA	Audit Rights	Subprocessor List	Data Deletion Timeline	Certification
Harvey AI	Explicit contractual prohibition	Yes	Yes	Session end + contract termination	SOC 2 Type II + ISO 27001
Lexis+ AI	Yes (enterprise tier)	Enterprise audit	Yes	Defined in DPA	SOC 2 Type II
Spellbook	Yes (DPA available)	Limited	Yes	Defined in DPA	SOC 2 Type II
CoCounsel	Yes (enterprise DPA)	Enterprise audit	Yes	Defined in DPA	SOC 2 Type II
LegalFly	EU-native design; no training by architecture	Audit available	EU-resident	Session end	EU data residency

What Zero Data Retention Actually Means

The Technical Definition

Zero data retention (ZDR) in the AI context means that query inputs and model outputs are not persisted — stored — after the session ends. When you submit a document to an AI tool, the document and the tool's response exist in memory during processing, but are deleted from the vendor's systems when the session closes. No copy is kept for any purpose, including model training, quality improvement, debugging, or audit.

That is the technical ideal. In practice, most vendors maintain some operational logging — audit trails, error logs, system diagnostics — that temporarily persist session data for operational reasons. The critical distinction is whether that operational logging is (a) clearly time-limited (e.g., deleted within 30 days), (b) explicitly excluded from training use, and (c) stored with the same security controls as the production system.

Genuine ZDR requires three things together: no training use, no persistent storage beyond defined operational minimums, and audit rights to verify compliance. A vendor that logs your sessions for 90 days and says the logs are "not used for training" is offering something weaker than ZDR: a no-training commitment combined with temporary data retention. The distinction matters if you experience a breach or a regulatory inquiry during that 90-day window.
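The distinction between genuine ZDR and a no-training commitment with temporary retention can be sketched as a small classifier. This is a minimal illustration, not a legal test: the `RetentionTerms` fields, the label strings, and the 30-day operational threshold are all assumptions chosen to mirror the examples in this section.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RetentionTerms:
    """Hypothetical summary of what a DPA says about session data."""
    used_for_training: bool
    log_retention_days: Optional[int]  # None = no stated limit
    audit_rights: bool


def classify_retention(terms: RetentionTerms, operational_max_days: int = 30) -> str:
    """Label a vendor's posture; the 30-day default is an assumed threshold."""
    if terms.used_for_training:
        return "trains on customer data"
    if terms.log_retention_days is None:
        return "no-training commitment, open-ended retention"
    if terms.log_retention_days > operational_max_days:
        return "no-training commitment with temporary retention"
    if not terms.audit_rights:
        return "ZDR claimed but not verifiable"
    return "genuine ZDR"


# The 90-day logging vendor described above.
print(classify_retention(RetentionTerms(False, 90, True)))
```

Your firm may set a different operational minimum. The point is that the threshold is written down and checked against the DPA rather than assumed.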

For the full technical and contractual definition of ZDR, see the zero-data-retention-policy glossary entry.

The Four Vendor Dodge Patterns

Based on our review of vendor terms of service and data processing agreements across the legal AI market in 2026, here are the four patterns used to describe data handling practices that are weaker than genuine ZDR:

Pattern 1: "Aggregated, anonymized training." The vendor trains on data that has been stripped of obvious identifiers — names, addresses, specific identifying terms. The claim is that anonymization protects client confidentiality because no individual client can be identified in the training data.

The problem: anonymization is not proof against re-identification. If your matter involves a distinctive transaction structure, a unique regulatory situation, or specific contractual language that is uncommon in the market, that content may survive anonymization in a form that is recognizable to someone with contextual knowledge. A securities lawyer who knows the deal can recognize the deal from the contractual language even without the parties' names. "Aggregated, anonymized" is not a confidentiality commitment. It is a technical description of a process that may or may not protect your client's matter content.

Pattern 2: "You can opt out of training." The default is that the vendor may train on your data. You can opt out by taking an affirmative action, usually navigating to a privacy settings page, sending an email to a data team address, or requesting an enterprise agreement. The problem: firms that sign without reading the fine print, which is most firms, are opted in by default. The appropriate standard for any tool processing privileged legal content is the reverse: training is off by default and happens only if the firm affirmatively opts in.

Pattern 3: "We use your data to improve our product." "Product improvement" is deliberately broad. It is commonly used as a synonym for training and fine-tuning in vendor terms. A vendor that says "we use your data to improve our product" is telling you they train on your data in language designed not to say that directly. Read "product improvement" as training until the vendor explicitly clarifies otherwise in the DPA.

Pattern 4: "Customer-isolated fine-tuning." Some vendors offer to fine-tune a custom model specifically on your firm's data. The pitch is that your data improves only your model, not the shared base model — so other customers do not benefit from your data. This is materially better than shared training. But it still means your data is used for training. And it raises three questions: What happens to the fine-tuned model if you terminate the contract? Can the vendor extract insights from the model about your client data (model inversion attacks)? What security controls apply to a model that is fine-tuned on your privileged matter data?
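As a first-pass triage aid, the four patterns can be turned into a simple phrase scan over a vendor's terms. This is a rough sketch under stated assumptions: the regular expressions and the `flag_dodge_patterns` helper are illustrative, not an exhaustive or authoritative list, and a match only means a human should read that clause closely.

```python
import re
from typing import List

# Illustrative phrase patterns, one per dodge pattern; not an exhaustive list.
DODGE_PATTERNS = {
    "aggregated/anonymized training": re.compile(
        r"\b(?:aggregated?|anonymi[sz]ed|de-identified)\b", re.I),
    "opt-out of training": re.compile(r"\bopt(?:ing)?[\s-]?out\b", re.I),
    "product improvement": re.compile(
        r"\bimprov(?:e|ing|ement)\b.{0,40}\b(?:product|service)s?\b", re.I),
    "customer-isolated fine-tuning": re.compile(
        r"\bfine[\s-]?tun(?:e|ed|ing)\b", re.I),
}


def flag_dodge_patterns(text: str) -> List[str]:
    """Return the names of the patterns whose phrases appear in the text."""
    return [name for name, pattern in DODGE_PATTERNS.items()
            if pattern.search(text)]


clause = "We may use aggregated, anonymized data to improve our services."
print(flag_dodge_patterns(clause))
# ['aggregated/anonymized training', 'product improvement']
```

A keyword scan cannot tell a permission from a prohibition: a clause that forbids fine-tuning on customer data will also be flagged. Treat the output as a reading list, not a verdict.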

Training Data Opt-Out vs. Genuine ZDR

There is a meaningful difference between "opt-out of training" and genuine ZDR, and vendors conflate them deliberately.

Opt-out of training means the vendor will not use your data for model training if you actively request exclusion. Your data may still be retained, logged, and processed. If you fail to opt out, your data is available for training.

Genuine ZDR means no training use, no persistent data retention beyond operational minimums, and contractual commitments covering both the vendor and their subprocessors. Opt-out is a feature; ZDR is an architecture.

For law firms processing privileged client matter data, opt-out is insufficient. The contractual commitment must be an explicit prohibition on training use — not an option the firm can elect.

How to Verify: What to Look for in the DPA, Not the Sales Deck

The sales deck will say whatever it needs to say to close the deal. The DPA is the legally binding document. Verification of ZDR requires reading the DPA, not the marketing materials.

In the DPA, you are looking for:

An explicit prohibition on using customer data for training. The language should say: "Vendor shall not use Customer Data to train, fine-tune, or improve any machine learning model." If the language only says "Vendor may use aggregated, anonymized data," you do not have a ZDR commitment.
A data deletion timeline. The DPA should specify: (a) when customer data is deleted after session end, and (b) when customer data is deleted after contract termination. "Promptly" is not a timeline. A specific number of days is a timeline.
A subprocessor list. Every third party that processes your data — the inference provider (OpenAI, Anthropic, AWS, etc.), the storage provider, the logging provider — must be disclosed and bound by equivalent restrictions. A vendor with a perfect DPA whose inference provider has no ZDR commitment has a gap that defeats the vendor's commitment.
Audit rights. The DPA should give you the right to audit the vendor's compliance with data handling commitments. In practice, enterprise vendors offer audit reports (SOC 2 Type II reports) rather than direct access, which is an acceptable substitute if the audit scope covers the specific commitments in the DPA.
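The four items above can be recorded as a structured review so that gaps are explicit. The sketch below is one hypothetical way to do that: the `DpaReview` fields and the gap messages are assumptions, and the timeline check simply looks for a stated number of days or hours, reflecting the point that "promptly" is not a timeline.

```python
import re
from dataclasses import dataclass
from typing import List

# A term counts as a timeline only if it names a number of days or hours.
_TIMELINE = re.compile(
    r"\b\d+\)?\s*(?:calendar\s+|business\s+)?(?:day|hour)s?\b", re.I)


@dataclass(frozen=True)
class DpaReview:
    """Hypothetical notes a reviewer records while reading a DPA."""
    explicit_training_prohibition: bool
    session_deletion_term: str        # wording copied from the DPA
    termination_deletion_term: str    # wording copied from the DPA
    subprocessors_disclosed_and_bound: bool
    audit_rights_or_scoped_soc2: bool


def is_specific_timeline(term: str) -> bool:
    return bool(_TIMELINE.search(term))


def dpa_gaps(review: DpaReview) -> List[str]:
    """List every checklist item the DPA fails to satisfy."""
    gaps = []
    if not review.explicit_training_prohibition:
        gaps.append("no explicit prohibition on training use")
    if not is_specific_timeline(review.session_deletion_term):
        gaps.append("no specific deletion timeline after session end")
    if not is_specific_timeline(review.termination_deletion_term):
        gaps.append("no specific deletion timeline after termination")
    if not review.subprocessors_disclosed_and_bound:
        gaps.append("subprocessors not disclosed or not bound")
    if not review.audit_rights_or_scoped_soc2:
        gaps.append("no audit rights or suitably scoped audit report")
    return gaps


review = DpaReview(True, "promptly", "within 30 days", True, True)
print(dpa_gaps(review))
# ['no specific deletion timeline after session end']
```

An empty list means the DPA passed this particular checklist. It does not mean the DPA is sufficient for your firm.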

See the data-processing-agreement glossary entry for the full DPA requirements checklist.

The Subprocessor Problem

This is the gap that catches firms that have done the right things and still have exposure. A firm signs a DPA with their legal AI vendor. The DPA has a good no-training clause. The vendor's DPA applies to the vendor's own systems. But the vendor runs their AI inference on an OpenAI API, and the vendor's contract with OpenAI is the consumer API without a no-training commitment.

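The vendor's DPA says "we don't train on your data." In this scenario, the inference provider's terms for that tier say something like "we may use API inputs to improve our models" (unless the vendor is on an enterprise API with specific no-training terms). Provider terms differ by tier and change over time, so check the current version. The result is that the subprocessor may be training on your data, and the vendor's DPA did not adequately bind the subprocessor.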

The correct DPA language requires: "Vendor shall ensure that all subprocessors who process Customer Data are bound by data handling restrictions no less protective than those in this Agreement, including the prohibition on using Customer Data for model training."

Then verify: Does your vendor's inference provider (OpenAI, Anthropic, Cohere, etc.) have enterprise API terms that include a no-training commitment? For the major inference providers, enterprise API tiers generally include no-training commitments; consumer-tier APIs generally do not. Ask your vendor explicitly: are you running on an enterprise API contract with your inference provider that includes a no-training commitment?
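The weakest-link logic of the subprocessor problem can be sketched in a few lines. The `Subprocessor` type and the names `ExampleInference` and `ExampleCloud` are placeholders invented for illustration; they do not describe any real vendor's contracts.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Subprocessor:
    name: str   # placeholder names below, not real contract data
    role: str   # e.g. "inference", "storage", "logging"
    no_training_commitment: bool


def unbound_subprocessors(chain: List[Subprocessor]) -> List[str]:
    """Names of subprocessors with no no-training commitment."""
    return [s.name for s in chain if not s.no_training_commitment]


def effective_no_training(vendor_commits: bool, chain: List[Subprocessor]) -> bool:
    """The commitment holds only if the vendor and every subprocessor are bound."""
    return vendor_commits and not unbound_subprocessors(chain)


chain = [
    Subprocessor("ExampleInference", "inference", False),
    Subprocessor("ExampleCloud", "storage", True),
]
print(effective_no_training(True, chain))  # False
print(unbound_subprocessors(chain))        # ['ExampleInference']
```

A vendor with a strong DPA still evaluates to `False` here as long as one subprocessor in the chain is unbound, which is the gap described in this section.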

The audit-log-legal-ai glossary entry covers what audit trails you should require to verify subprocessor compliance.

Contractual Protections: What Clauses to Require

Beyond the core no-training prohibition, require these contractual protections before submitting privileged client data to any AI tool:

Breach notification. The DPA should require vendor notification of any unauthorized access to customer data within 72 hours. That borrows the 72-hour window GDPR Article 33 gives controllers to notify the supervisory authority, and it is a reasonable benchmark even if your firm is not EU-based.

Confidentiality of outputs. AI outputs generated from your privileged matter content are themselves potentially privileged. The DPA should prohibit the vendor from disclosing outputs as well as inputs.

Return or destruction on termination. When you terminate the contract, all customer data must be returned to you or destroyed, with written certification.

No disclosure obligation override. If the vendor receives a government request for your data, the DPA should require them to (a) notify you before complying (to the extent legally permissible) and (b) seek a protective order if legally appropriate.
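For the breach notification clause, the 72-hour window is simple arithmetic, but it helps to fix when the clock starts. The sketch below assumes the clock starts when the vendor discovers the unauthorized access; the guide does not say this, so your DPA should define the trigger explicitly. The function names are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Assumed window from the clause above.
NOTIFICATION_WINDOW = timedelta(hours=72)


def notification_deadline(discovered_at: datetime) -> datetime:
    """Latest permitted notification time, counted from discovery (an assumption)."""
    return discovered_at + NOTIFICATION_WINDOW


def notified_in_time(discovered_at: datetime, notified_at: datetime) -> bool:
    return notified_at <= notification_deadline(discovered_at)


discovered = datetime(2026, 1, 5, 9, 0, tzinfo=timezone.utc)
print(notification_deadline(discovered).isoformat())
# 2026-01-08T09:00:00+00:00
```

Using timezone-aware timestamps avoids disputes about which local time the deadline refers to.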

Tool-by-Tool ZDR Assessment

Harvey AI

Harvey AI provides an explicit contractual no-training commitment as a standard term in its enterprise DPA. The commitment covers customer data submitted as prompts and documents — not an opt-out, but a blanket prohibition. Harvey's security certifications (SOC 2 Type II and ISO 27001) support the claim with independent audit coverage.

What works: Harvey's DPA no-training language is the most straightforward among the tools in this guide. The prohibition is explicit, not buried in an anonymization carve-out. Harvey's enterprise customer base (BigLaw firms) has created negotiating pressure for strong data protection terms, and the resulting DPA reflects that pressure.

Real limitations: Verify Harvey's subprocessor list before signing. Harvey runs on OpenAI infrastructure — confirm that Harvey's OpenAI API agreement is an enterprise agreement with a no-training commitment. This information is not publicly disclosed; request confirmation in writing from Harvey's legal team. For EU-based firms, the ZDR commitment coexists with the Schrems II transfer mechanism requirement — both need to be in place. See gdpr-compliance-ai for the EU layer.

Lexis+ AI

LexisNexis's enterprise data handling standards are well-established from decades of managing sensitive legal data. Lexis+ AI's enterprise DPA includes a no-training commitment for customer-submitted content. The commitment applies at the enterprise tier; verify which tier your organization has signed.

What works: LexisNexis's institutional DPA framework is reliable. The enterprise agreement gives legal teams leverage to require audit access and subprocessor disclosure.

Real limitations: Lexis+ AI is primarily a legal research tool — the primary input is search queries and legal research prompts, not raw client matter documents. For document analysis use cases where full client documents are submitted, verify that the no-training commitment explicitly covers document content, not just search queries. Enterprise tier required for full DPA coverage.

Spellbook

Spellbook provides SOC 2 Type II certification and a data processing agreement with a no-training commitment. The DPA is available for review before signing. The commitment covers documents submitted for contract review and drafting.

What works: Spellbook's DPA provides a genuine no-training commitment at a price point accessible to small firms. The SOC 2 Type II certification provides independent audit coverage.

Real limitations: Spellbook's audit rights provisions are less extensive than Harvey's enterprise-level DPA. For SMB customers, negotiating DPA customization is less feasible — you are largely taking the standard form. If your firm's privileged content volume is high or your matters are particularly sensitive (high-stakes M&A, government investigations), Harvey's enterprise DPA framework provides more protection.

CoCounsel

Thomson Reuters' DPA for CoCounsel includes a no-training commitment for enterprise customers. Thomson Reuters has large enterprise relationships and strong institutional data protection standards.

What works: Thomson Reuters' institutional credibility and enterprise DPA infrastructure are reliable.

Real limitations: CoCounsel is built on OpenAI infrastructure; verify that Thomson Reuters' API agreement with OpenAI includes no-training terms covering customer data. CoCounsel is primarily a US-focused research and drafting tool. EU compliance documentation (GDPR DPA, SCCs) is less developed than Harvey's. For firms with material EU exposure, verify the EU layer separately.

Decision Tree

If you are working on privilege-sensitive matters (M&A, litigation strategy, regulatory investigations) → Harvey AI — require the enterprise DPA with explicit no-training prohibition; verify subprocessors; most credible ZDR commitment in the market

If you primarily need AI for legal research tasks → Lexis+ AI or CoCounsel — enterprise DPAs available; research queries carry lower privilege risk than document submission; verify enterprise tier before signing

If you are a small firm doing commercial contract work → Spellbook — SOC 2 Type II; DPA with no-training commitment; adequate for commercial contract use cases at SMB scale

If you have EU clients and need ZDR plus EU data residency → LegalFly — EU-native design; no training by architecture; no Schrems II transfer issue

Never under any circumstances → general-purpose consumer LLMs (ChatGPT consumer, Claude.ai consumer, Gemini free tier) for privileged client content without an enterprise agreement and signed DPA with an explicit no-training prohibition
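The decision tree above can be written as a lookup plus one guard. This is a sketch only: the `PrimaryNeed` enum, the `recommend` function, and the wording of the guard message are assumptions, and the mapping simply restates the branches listed above.

```python
from enum import Enum


class PrimaryNeed(Enum):
    PRIVILEGE_SENSITIVE = "privilege-sensitive matters"
    LEGAL_RESEARCH = "legal research"
    SMALL_FIRM_CONTRACTS = "small-firm commercial contracts"
    EU_RESIDENCY = "EU clients needing EU data residency"


# One entry per branch of the decision tree.
RECOMMENDATIONS = {
    PrimaryNeed.PRIVILEGE_SENSITIVE: "Harvey AI",
    PrimaryNeed.LEGAL_RESEARCH: "Lexis+ AI or CoCounsel",
    PrimaryNeed.SMALL_FIRM_CONTRACTS: "Spellbook",
    PrimaryNeed.EU_RESIDENCY: "LegalFly",
}


def recommend(need: PrimaryNeed, signed_no_training_dpa: bool) -> str:
    """Return the tool for a need, gated on a signed no-training DPA."""
    if not signed_no_training_dpa:
        # Final branch: no privileged content without a signed DPA.
        return "Do not submit privileged content until a no-training DPA is signed"
    return RECOMMENDATIONS[need]


print(recommend(PrimaryNeed.SMALL_FIRM_CONTRACTS, signed_no_training_dpa=True))
# Spellbook
```

The guard runs first on purpose: the choice of tool is irrelevant until the contractual baseline is in place.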

Frequently Asked Questions

What does zero data retention actually mean?

Zero data retention means the AI vendor does not keep your query inputs or model outputs after the session ends. No persistent copy for training, quality improvement, debugging, or any other purpose. In the strictest technical definition, no session data persists after session close. In practice, most vendors maintain time-limited operational logs (30-90 days) for system diagnostics, which is acceptable if those logs are explicitly excluded from training use and stored with equivalent security controls. The key test is not whether any data is retained temporarily — it is whether data is retained in a way that could be used for model training or disclosed to any third party. See zero-data-retention-policy for the full definition.

How do I verify a vendor's ZDR claim?

Four steps: (1) Read the DPA, not the sales deck. Find the explicit no-training prohibition language. If it is not there, the claim is not contractual. (2) Check the subprocessor list. Every inference provider, storage provider, and logging provider must be identified and bound by equivalent restrictions. (3) Ask whether the vendor's inference provider (OpenAI, Anthropic, etc.) is on an enterprise API contract with no-training terms. Get the answer in writing. (4) Review the audit rights provision. Confirm that the SOC 2 Type II audit scope covers the specific data handling commitments in the DPA — not just general security controls.

Which vendors have the strongest ZDR commitments?

Among the tools reviewed in this guide, Harvey AI has the most explicit and well-documented contractual ZDR commitment for enterprise legal work. The no-training prohibition is a standard DPA term (not an opt-out), and the SOC 2 Type II and ISO 27001 certifications provide independent audit coverage. LexisNexis (Lexis+ AI) and Thomson Reuters (CoCounsel) have institutional data handling standards that are reliable at enterprise tier but require tier verification. Spellbook provides credible ZDR at SMB scale with lighter audit rights. LegalFly provides ZDR through EU-native architecture for EU-based firms. The attorney-client-privilege-ai glossary entry covers the privilege implications of each tier.

What is a subprocessor and why does it matter for ZDR?

A subprocessor is a third party that the AI vendor uses to process your data as part of delivering the service. For legal AI tools, the primary subprocessors are: the AI model inference provider (OpenAI, Anthropic, Cohere, Google), the cloud infrastructure provider (AWS, Azure, GCP), and operational services (logging, monitoring, security). Your ZDR commitment from the vendor is only as strong as the vendor's commitments to its subprocessors. If the vendor's inference provider does not have a no-training commitment covering your data, the vendor's DPA has a gap. The GDPR standard under Article 28(4) requires subprocessors to be bound by the same data protection obligations as the primary processor — the same principle applies to ZDR requirements whether or not GDPR applies to your firm.

Can I require ZDR contractually from any AI vendor?

Yes — ZDR is a contractual requirement, and any vendor that wants your business can be required to agree to it as a DPA term. The practical limitation is negotiating leverage. Large enterprise vendors (Harvey, LexisNexis, Thomson Reuters) have standard enterprise DPAs that already include ZDR-equivalent commitments. Smaller vendors may not have standard ZDR DPA language and may require negotiation, legal review on their side, and longer sales timelines. Consumer-tier tools (ChatGPT without an enterprise agreement) cannot be made compliant by requesting a DPA — the enterprise tier is a distinct product with a distinct contractual framework. If a vendor cannot or will not provide an explicit no-training commitment in writing, do not use the tool for privileged client content. That is the baseline.

Editorial Independence

LawyerAI evaluations are independent. We do not accept payment that influences our editorial scores. Featured placements are clearly labeled and do not affect our 5-dimension methodology (Accuracy / Speed / Usability / Value / Security). We re-review tools every 6 months.

If you believe any information is inaccurate, contact editor@lawyerai.directory.
