eDiscovery, or electronic discovery, is the process of identifying, preserving, collecting, processing, reviewing, and producing electronically stored information (ESI) in response to litigation, regulatory investigations, or government inquiries. ESI encompasses email, documents, spreadsheets, databases, instant messages, social media content, voicemails, metadata, and any other information stored in electronic form.
In US federal litigation, eDiscovery is governed primarily by the Federal Rules of Civil Procedure — particularly Rule 26 (scope of discovery and disclosure obligations), Rule 34 (requests for production of ESI), Rule 37(e) (sanctions for failure to preserve ESI), and Rule 45 (third-party subpoenas). Equivalent rules govern state court practice, UK disclosure under the Civil Procedure Rules, EU civil procedure frameworks, and international arbitration proceedings.
The Electronic Discovery Reference Model (EDRM), developed by industry practitioners and widely referenced by courts, defines nine stages of the eDiscovery process: Information Governance, Identification, Preservation, Collection, Processing, Review, Analysis, Production, and Presentation. Each stage generates specific professional obligations, cost exposure, and risk of sanctions for non-compliance.
eDiscovery is the budget-defining event in most commercial litigation. RAND Institute for Civil Justice studies estimate that document review alone (one stage of the nine-stage process) accounts for 50 to 70 percent of total litigation budgets. At attorney review rates of $50 to $200 per hour (depending on seniority, matter complexity, and jurisdiction), a commercial dispute generating two million documents can produce review costs exceeding the value of the underlying claim. This economic reality has made eDiscovery efficiency a strategic competency for litigation practices, not merely a back-office function.
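The arithmetic behind that claim can be sketched in a few lines. This is a rough illustration only: it treats the $50 to $200 figure as an hourly rate, and the throughput of 50 documents per reviewer-hour is an assumption chosen for the example, not a figure from any study cited here.

```python
def review_cost(num_documents: int, hourly_rate: float, docs_per_hour: float) -> float:
    """Estimate linear review cost: reviewer hours needed times the hourly rate."""
    if docs_per_hour <= 0:
        raise ValueError("docs_per_hour must be positive")
    hours = num_documents / docs_per_hour
    return hours * hourly_rate


if __name__ == "__main__":
    DOCS = 2_000_000       # collection size from the example in the text
    DOCS_PER_HOUR = 50     # ASSUMPTION: illustrative reviewer throughput
    for rate in (50, 200):  # low and high end of the hourly range
        cost = review_cost(DOCS, rate, DOCS_PER_HOUR)
        print(f"${rate}/hour -> ${cost:,.0f}")
```

Under these assumptions, linear review of the full collection would cost between $2 million and $8 million, which is why reducing the number of documents that reach attorney review matters so much.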
The legal stakes are equally significant. FRCP Rule 37(e) authorizes courts to impose sanctions when a party fails to preserve ESI that should have been preserved in the anticipation or conduct of litigation. The rule applies where the party failed to take reasonable steps to preserve the information and the lost ESI cannot be restored or replaced through additional discovery. On a finding of prejudice, a court may order curative measures no greater than necessary to cure that prejudice; the most severe sanctions (adverse inference instructions, dismissal, or default judgment) require a finding that the party acted with intent to deprive another party of the information's use. Courts have sanctioned parties whose litigation hold notices were issued weeks after counsel knew litigation was reasonably anticipated, so a late hold creates exposure even when the data loss was inadvertent.
FRCP Rule 26(b)(1) imposes a proportionality standard: discovery must be proportional to the needs of the case, considering the importance of the issues at stake, the amount in controversy, the parties' relative access to relevant information, the parties' resources, the importance of the discovery in resolving the issues, and whether the burden or expense of the proposed discovery outweighs its likely benefit. Courts now actively enforce this standard: federal judges in the District of Delaware, Southern District of New York, and Northern District of California have issued standing orders requiring parties to meet and confer about proportionate discovery protocols and to explain their use of technology-assisted review. A litigation attorney who cannot speak fluently about eDiscovery methodology is at a disadvantage in discovery conferences before technologically sophisticated judges.
The 2024 EDRM Future of eDiscovery Survey found that outside counsel eDiscovery costs have increased by 8 to 12 percent annually since 2018, driven by data volume growth (primarily from collaboration platforms such as Teams, Slack, and Zoom), increased mobile device data, and expanded regulatory investigation activity. AI-assisted review is now viewed as a cost-control necessity rather than an optional upgrade in matters generating more than 100,000 documents.
How It Works (Technical)
Information Governance is the foundation: policies and technology that manage data creation, retention, and destruction before litigation arises. Firms that advise on information governance help clients implement retention schedules, hold management procedures, and data maps that reduce the burden and risk of future eDiscovery obligations.
Identification involves determining what data sources likely contain relevant ESI — which custodians (individuals) have potentially relevant documents, which systems contain them, and what date ranges are applicable. A data map documenting data sources, custodians, and retention policies is the output of this stage.
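A data map can be as simple as one structured record per source. The sketch below is illustrative only: the class and field names (`DataSource`, `DataMap`, `retention_days`, and so on) are hypothetical and not drawn from any particular platform. It also shows one check a data map makes possible, namely flagging sources whose retention policy may already have deleted data inside the relevant date range.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class DataSource:
    system: str                    # e.g. "Exchange mailbox", "Slack workspace"
    custodian: str                 # person whose data the source holds
    retention_days: Optional[int]  # None means no automatic deletion
    range_start: date              # earliest date relevant to the matter
    range_end: date                # latest date relevant to the matter


@dataclass
class DataMap:
    matter: str
    sources: List[DataSource] = field(default_factory=list)

    def custodians(self) -> List[str]:
        """Unique custodians across all sources, sorted for stable output."""
        return sorted({s.custodian for s in self.sources})

    def at_risk(self, today: date) -> List[DataSource]:
        """Sources whose retention policy may already have deleted in-range data."""
        risky = []
        for s in self.sources:
            if s.retention_days is None:
                continue
            oldest_retained = today - timedelta(days=s.retention_days)
            if s.range_start < oldest_retained:
                risky.append(s)
        return risky


if __name__ == "__main__":
    data_map = DataMap("Example matter", [
        DataSource("Email", "Lee", 90, date(2023, 1, 1), date(2023, 6, 30)),
        DataSource("File share", "Kim", None, date(2023, 1, 1), date(2023, 6, 30)),
    ])
    print(data_map.custodians())
    print([s.system for s in data_map.at_risk(date(2024, 1, 1))])
```

Sources flagged by `at_risk` are the ones where a prompt legal hold and early collection matter most.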
Preservation requires issuing legal hold notices to identified custodians and IT personnel, instructing them to suspend routine document retention and deletion policies for data within the hold scope. The duty to preserve, which FRCP Rule 37(e) presupposes, arises when litigation is "reasonably anticipated," not when a complaint is filed. Preservation failures that occur between the point when litigation was anticipated and the point when a hold was issued are a common source of sanctions motions.
Collection is the forensically sound transfer of data from its native location to a collection environment. Forensic collection preserves metadata (creation dates, modification history, email routing information) that may be material to authentication and chain-of-custody arguments. Self-collection by custodians (asking a custodian to search their own email and send relevant items) is generally disfavored by courts because it lacks independent verification and is susceptible to inadvertent or deliberate omission.
Processing converts raw collected data into a format suitable for review. This includes deduplication (removing multiple identical copies of the same document), near-duplicate identification (grouping documents that are nearly identical but differ in minor ways), email threading (assembling email conversations into chronological chains), format conversion (converting proprietary formats to reviewable PDFs or TIFF images), and optical character recognition (OCR) for scanned documents.
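Exact deduplication is commonly done by comparing cryptographic hashes of document content. The sketch below shows that idea under simplifying assumptions: it hashes raw bytes with SHA-256 and keeps the first copy seen. The function name `deduplicate` is hypothetical, and commercial platforms differ in which fields they hash (for example, email metadata or whole document families), so this is not a description of any vendor's method.

```python
import hashlib
from typing import Dict, List, Tuple


def deduplicate(docs: Dict[str, bytes]) -> Tuple[List[str], Dict[str, List[str]]]:
    """Return (kept_ids, duplicates), where duplicates maps kept_id -> suppressed ids."""
    first_seen: Dict[str, str] = {}        # digest -> id of first document with it
    kept: List[str] = []
    duplicates: Dict[str, List[str]] = {}
    for doc_id, content in docs.items():
        digest = hashlib.sha256(content).hexdigest()
        if digest in first_seen:
            # Identical bytes already seen: suppress, but record where it came from.
            duplicates.setdefault(first_seen[digest], []).append(doc_id)
        else:
            first_seen[digest] = doc_id
            kept.append(doc_id)
    return kept, duplicates


if __name__ == "__main__":
    collection = {
        "DOC-1": b"Q3 pricing memo",
        "DOC-2": b"Board minutes",
        "DOC-3": b"Q3 pricing memo",   # byte-identical copy of DOC-1
    }
    kept, dupes = deduplicate(collection)
    print(kept)
    print(dupes)
```

Recording which documents were suppressed, rather than discarding them silently, keeps the deduplication step auditable if the methodology is later questioned. Near-duplicate identification needs a similarity measure rather than an exact hash and is not covered by this sketch.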
Review is where attorneys make relevance, responsiveness, and privilege determinations on individual documents. This is the most expensive stage and the primary target of AI optimization. Technology-Assisted Review (TAR) and Continuous Active Learning (CAL) methods reduce review volume by predicting relevance before attorney eyes touch a document.
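The core loop of Continuous Active Learning can be sketched as: rank unreviewed documents, have an attorney code the top-ranked one, update the model with that decision, and repeat. The toy below is a deliberately simplified illustration, not how any platform implements it. The token-weight scorer stands in for a trained classifier, the "stop after N consecutive non-relevant documents" rule is an assumption made for brevity (real workflows rely on statistical validation), and all function names are hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, List, Tuple


def tokenize(text: str) -> List[str]:
    return text.lower().split()


def score(text: str, weights: Counter) -> int:
    # Sum the learned weight of each distinct token; unseen tokens count as 0.
    return sum(weights[t] for t in set(tokenize(text)))


def cal_review(
    docs: Dict[str, str],
    is_relevant: Callable[[str], bool],   # stands in for the attorney's coding call
    seed_terms: Iterable[str],
    stop_after_misses: int = 3,
) -> Tuple[List[str], List[str]]:
    """Return (relevant_ids, review_order) from a toy CAL loop."""
    weights = Counter({t: 1 for t in seed_terms})  # initial guess at relevant vocabulary
    unreviewed = dict(docs)
    relevant_ids: List[str] = []
    review_order: List[str] = []
    consecutive_misses = 0
    while unreviewed and consecutive_misses < stop_after_misses:
        # Rank the remaining documents and review the highest-scoring one.
        doc_id = max(unreviewed, key=lambda d: score(unreviewed[d], weights))
        text = unreviewed.pop(doc_id)
        review_order.append(doc_id)
        if is_relevant(doc_id):
            relevant_ids.append(doc_id)
            consecutive_misses = 0
            delta = 1
        else:
            consecutive_misses += 1
            delta = -1
        # Feed the coding decision back into the model before the next ranking.
        for tok in set(tokenize(text)):
            weights[tok] += delta
    return relevant_ids, review_order


if __name__ == "__main__":
    docs = {
        "d1": "merger pricing agreement draft",
        "d2": "lunch menu friday",
        "d3": "pricing call with competitor about merger",
        "d4": "holiday party rsvp",
        "d5": "fantasy football league",
        "d6": "competitor pricing spreadsheet",
    }
    truth = {"d1", "d3", "d6"}
    found, order = cal_review(docs, lambda d: d in truth, ["pricing"], stop_after_misses=2)
    print(found, order)
```

In this toy run the three relevant documents are reviewed first and the loop stops before every non-relevant document has been read, which is the volume reduction the technique aims for. Whether the stopping point is defensible is a separate question that validation sampling addresses.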
Production is the delivery of responsive, non-privileged documents to the requesting party in the agreed format (TIFF with extracted text, native format, PDF) with Bates numbering applied.
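Bates numbering is essentially a running counter with a prefix. The sketch below assumes an imaged production in which every page receives one sequential number, and it returns the begin and end label for each document. The prefix, the zero-padding width, and the function name `assign_bates` are illustrative choices; in practice the format is whatever the parties' ESI protocol specifies.

```python
from typing import List, Tuple


def assign_bates(
    prefix: str, start: int, page_counts: List[int], width: int = 7
) -> List[Tuple[str, str]]:
    """Return a (begin_label, end_label) pair for each document, in order."""
    ranges: List[Tuple[str, str]] = []
    next_number = start
    for pages in page_counts:
        if pages < 1:
            raise ValueError("each document needs at least one page")
        begin = f"{prefix}{next_number:0{width}d}"
        end = f"{prefix}{next_number + pages - 1:0{width}d}"
        ranges.append((begin, end))
        next_number += pages  # the next document continues the sequence
    return ranges


if __name__ == "__main__":
    # Three documents of 3, 1, and 2 pages, numbered from ABC0000001.
    for begin, end in assign_bates("ABC", 1, [3, 1, 2]):
        print(begin, end)
```

The sequence never resets between documents, so any page in the production can be cited unambiguously by its label.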
How Legal AI Vendors Address It
Relativity is the dominant platform for large-matter eDiscovery in the US. It provides full EDRM workflow coverage: legal hold management (Relativity Legal Hold), collection and processing (Collect), review and analysis (Relativity Review), AI-assisted review (Active Learning), and production. Relativity's ecosystem depth — hundreds of third-party integrations, a large certified administrator community, and deep customization capability — makes it the default choice for AmLaw 200 litigation departments and large eDiscovery service providers. The limitation is operational complexity and cost: Relativity requires a skilled administrator to configure and maintain, and setup for complex matters involving foreign-language documents, multi-issue review, or intricate privilege protocols is time-intensive.
Everlaw is cloud-native and positioned as a faster-to-deploy alternative to Relativity for matters in the 50,000 to 2 million document range. Its collaboration features — real-time co-review, attorney annotation sharing, deposition preparation tools — are strong. The onboarding timeline from data upload to active review is shorter than Relativity for standard commercial litigation. The trade-off is configurability: Everlaw's TAR/CAL implementation is less granular for complex multi-issue reviews, and its production functionality offers fewer customization options for matters requiring specialized production formats.
Logikcull is a self-service platform designed for litigation teams that need eDiscovery capability without a managed services overlay. It is appropriate for standard commercial litigation generating up to a few hundred thousand documents, with straightforward relevance determinations and limited foreign-language content. Attorneys upload data and run the process themselves. The limitation is that Logikcull's AI review tools provide limited visibility into model parameters, which makes it less defensible than Relativity or Everlaw for matters where the opposing party is likely to challenge the review methodology.
DISCO eDiscovery is an AI-first platform that has designed its review workflow around AI-assisted document categorization from the ground up, rather than adding AI onto a legacy review interface. Its analytics capabilities — timeline visualization, network graph analysis of email communications, issue-based clustering — are strong for investigative and complex commercial matters. DISCO is a newer platform with lower market share than Relativity or Everlaw, which means fewer experienced project managers in the eDiscovery services market have production experience with the platform.
Nuix is the platform of choice for forensic and government investigation matters. Its processing engine handles high-volume, multi-format data collections more efficiently than most competitors, including encrypted containers, database exports, and mobile device images. It is widely used by government agencies, law enforcement, and corporations conducting internal investigations. Its document review interface is less polished than Relativity or Everlaw for standard attorney review workflows.
How Lawyers Should Verify and Apply eDiscovery AI
- Issue a litigation hold before collecting. The obligation to preserve arises at the point litigation is reasonably anticipated. Do not wait for a formal complaint or service of process. The hold notice must identify the scope of materials subject to preservation, instruct recipients to suspend deletion, and be sent to all reasonably identifiable custodians. Retain records of when holds were issued, to whom, and what acknowledgments were received.
- Agree on an ESI protocol at the Rule 26(f) conference. Federal courts expect parties to address ESI during the mandatory meet-and-confer conference. Come prepared with a proposed ESI protocol that specifies: data sources in scope, custodians to be collected, format of production, de-duplication methodology, and, if AI-assisted review will be used, the proposed review methodology and recall target. A written, court-filed ESI protocol protects against later completeness challenges.
- Require a proportionality analysis before committing to a review approach. For every matter, document the analysis: how many documents are in the collection, what is the case value and complexity, what review approach will be used, and what is the estimated cost? This analysis supports your Rule 26(g) certification, grounds your proportionality arguments under FRCP Rule 26(b)(1), and provides a basis for opposing disproportionate discovery requests.
- Use a validation sample before certifying production. Before signing a Rule 26(g) certification that the production is complete, have a senior attorney review a random sample of documents coded as not-responsive. If the sample reveals a materially higher elusion rate than expected, extend the review before certifying.
- Maintain a privilege log concurrently with review. Do not defer privilege logging until after production. Log privileged documents as they are identified during review, capturing: document type, date, author, recipients, subject matter (without disclosing the privileged content), and basis for privilege. A privilege log produced weeks after production that appears to have been reconstructed from memory is a credibility problem in any discovery dispute.
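The validation-sample step in the checklist above can be sketched as an elusion test: draw a random sample from the documents coded not-responsive, have an attorney re-review it, and project the error rate onto the whole not-responsive set. The function name, the fixed random seed, and the returned field names are assumptions for illustration, and this point estimate omits the confidence interval a defensible protocol would normally report.

```python
import random
from typing import Callable, Dict, Sequence


def elusion_estimate(
    null_set_ids: Sequence[str],
    is_responsive: Callable[[str], bool],   # stands in for the senior attorney's call
    sample_size: int,
    seed: int = 0,
) -> Dict[str, float]:
    """Sample the not-responsive set and estimate how many responsive docs were missed."""
    if not null_set_ids:
        raise ValueError("the not-responsive set is empty")
    if sample_size < 1:
        raise ValueError("sample_size must be at least 1")
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced later
    sample = rng.sample(list(null_set_ids), min(sample_size, len(null_set_ids)))
    hits = sum(1 for doc_id in sample if is_responsive(doc_id))
    rate = hits / len(sample)
    return {
        "sample_size": len(sample),
        "hits": hits,
        "elusion_rate": rate,
        "estimated_missed": round(rate * len(null_set_ids)),
    }


if __name__ == "__main__":
    null_set = [f"DOC-{i}" for i in range(10_000)]
    # ASSUMPTION for the demo: every 50th document was actually responsive.
    missed = {f"DOC-{i}" for i in range(0, 10_000, 50)}
    print(elusion_estimate(null_set, lambda d: d in missed, sample_size=500))
```

Recording the seed and the sample alongside the result lets the team reproduce the test if the opposing party challenges the completeness of the production.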