Active Learning (eDiscovery)
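An iterative machine-learning approach to eDiscovery review in which the model updates its relevance predictions continuously as reviewers code documents and uses those predictions to choose what is reviewed next, typically the documents it is least certain about or, in continuous active learning workflows, the documents it scores as most likely relevant.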
Last reviewed: 2026/05/19
Definition
Why It Matters for Lawyers
How AI Tools Handle It
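The loop described in the definition at the top of this entry can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of how any particular review platform works: the function names (train, score, next_batch, active_learning_review), the word-count model, and the reviewer callback are all hypothetical, and the selection rule shown is uncertainty sampling, which is only one of the prioritization strategies used in practice.

```python
import math
from collections import Counter


def train(coded):
    """Fit smoothed per-word log-odds weights from (text, is_relevant) pairs."""
    relevant_words, other_words = Counter(), Counter()
    for text, is_relevant in coded:
        target = relevant_words if is_relevant else other_words
        target.update(set(text.lower().split()))
    n_relevant = sum(1 for _, is_relevant in coded if is_relevant)
    n_other = len(coded) - n_relevant
    weights = {}
    for word in set(relevant_words) | set(other_words):
        # Add-one smoothing keeps both probabilities strictly between 0 and 1.
        p_relevant = (relevant_words[word] + 1) / (n_relevant + 2)
        p_other = (other_words[word] + 1) / (n_other + 2)
        weights[word] = math.log(p_relevant) - math.log(p_other)
    return weights


def score(weights, text):
    """Return a relevance score in (0, 1); 0.5 means the model has no opinion."""
    z = sum(weights.get(word, 0.0) for word in set(text.lower().split()))
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp()
    return 1.0 / (1.0 + math.exp(-z))


def next_batch(weights, unreviewed, batch_size):
    """Uncertainty sampling: choose the documents scored closest to 0.5."""
    ranked = sorted(unreviewed, key=lambda doc: abs(score(weights, doc) - 0.5))
    return ranked[:batch_size]


def active_learning_review(documents, reviewer, rounds, batch_size):
    """Alternate between retraining and human coding; no seed set is required."""
    coded, unreviewed = [], list(documents)
    for _ in range(rounds):
        if not unreviewed:
            break
        weights = train(coded)  # empty on the first round, so every score is 0.5
        for doc in next_batch(weights, unreviewed, batch_size):
            coded.append((doc, reviewer(doc)))  # reviewer is the human coding call
            unreviewed.remove(doc)
    return train(coded), coded, unreviewed
```

In this sketch the model is retrained from scratch before each batch, which is affordable only because the toy model is tiny. The first batch is effectively arbitrary because nothing has been coded yet, which mirrors the absence of a separate seed set phase.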
Frequently Asked Questions
- Q: How do active learning and traditional predictive coding differ in practice?
- With traditional predictive coding, attorneys build and code a seed set first, then the model predicts relevance across the full population. With active learning, the model starts predicting immediately and improves as reviewers code, eliminating the separate seed set phase. Active learning is generally more efficient, meaning it tends to reach a given level of recall with less review effort, particularly on large, heterogeneous document populations.
- Q: How do I know when active learning review is done?
- Completion is assessed through recall validation sampling, often called an elusion test: reviewers code a random sample of the unreviewed, low-scoring documents and measure how many are actually relevant. When the estimated number of relevant documents remaining in the unreviewed set falls below an agreed threshold, review is complete. The appropriate threshold is a legal judgment based on case needs, not a fixed technical standard.
- Q: Does active learning work on small document sets?
- Active learning provides the most efficiency gain on large document sets (100,000+ documents). On smaller sets, the efficiency advantage over linear review or simple keyword filtering is less pronounced. For very small document sets, linear review may be more straightforward.
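The arithmetic behind the recall validation answer above can be made concrete with a short Python sketch. It is illustrative only: the names (estimate_remaining, below_threshold, RemainingEstimate), the use of a Wilson score upper bound, and the 95% confidence level (z = 1.96) are assumptions chosen for the example, and the threshold is deliberately left as an input because it is a legal judgment rather than a technical default.

```python
import math
from dataclasses import dataclass


@dataclass
class RemainingEstimate:
    elusion_rate: float      # share of the sample found relevant
    estimated_missed: float  # point estimate of relevant documents left unreviewed
    upper_missed: float      # cautious upper bound on the same quantity
    estimated_recall: float  # relevant found / (relevant found + estimated missed)


def wilson_upper(hits, n, z):
    """Upper end of the Wilson score interval for a proportion hits / n."""
    p = hits / n
    centre = p + z * z / (2 * n)
    spread = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return min(1.0, (centre + spread) / (1 + z * z / n))


def estimate_remaining(unreviewed_count, sample_size, relevant_in_sample,
                       relevant_found, z=1.96):
    """Project a random sample of the unreviewed set onto the whole unreviewed set."""
    if sample_size <= 0 or not 0 <= relevant_in_sample <= sample_size:
        raise ValueError("sample counts are inconsistent")
    rate = relevant_in_sample / sample_size
    missed = rate * unreviewed_count
    upper = wilson_upper(relevant_in_sample, sample_size, z) * unreviewed_count
    total = relevant_found + missed
    # If nothing relevant was found or estimated, recall is reported as 1.0.
    recall = relevant_found / total if total > 0 else 1.0
    return RemainingEstimate(rate, missed, upper, recall)


def below_threshold(estimate, max_missed, conservative=False):
    """Compare the estimate (or its upper bound) with the agreed threshold."""
    value = estimate.upper_missed if conservative else estimate.estimated_missed
    return value < max_missed
```

As a hypothetical example, suppose 80,000 documents remain unreviewed, a random sample of 1,500 of them contains 15 relevant documents, and reviewers have already found 9,000 relevant documents. The elusion rate is 15 / 1,500 = 1%, so about 800 relevant documents are estimated to remain, and estimated recall is 9,000 / 9,800, or roughly 92%. Comparing the upper bound rather than the point estimate is the more cautious choice, since a sample can understate what remains.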
Definitions are written by the LawyerAI Editorial team. We do not accept affiliate commissions; Featured placement is clearly labeled and does not influence editorial content.