
AI-Driven E-Discovery Tools: Essential for Better Practice?

The legal industry, known for being deeply traditional, is undergoing a slow but significant technological transformation, driven by advances in Artificial Intelligence (AI), Generative AI (GenAI), and other machine learning technologies. With a view to increasing speed and accuracy in workflows, especially at the discovery stage of litigation, technology-assisted review (TAR) has steadily gained popularity over the past decade. With the emergence of TAR 2.0, the technology is set to showcase more of AI’s potential, allowing legal professionals to target only essential information and improve their overall efficiency in reviewing documents for court actions.



Author: Erica So, Associate Solicitor



The use of AI in e-discovery since early 2010 has paved the way for the development of TAR 2.0. The TAR workflow streamlines the document review process by predicting which documents are relevant based on patterns identified in a set of sample documents. It harnesses machine learning to identify potentially relevant documents during discovery, the stage in litigation where the parties exchange information and evidence, looking for points on which to challenge each other and build their own arguments.


Now, let’s delve into how the AI is trained and used in this process. Practitioners code a set of sample documents, usually referred to as the “seed set”, which serves as a training dataset for the AI. By categorizing and annotating this selection of documents, they teach the AI system how to recognize relevant information. Using TAR, practitioners can then examine the characteristics of relevant documents, such as keywords and phrases pivotal to their cases.


The system then identifies similar documents and organises them according to their level of relevance. This predictive coding not only categorizes the documents but also assigns each one a probability score indicating how likely it is to be relevant to the case. Predictive coding can significantly reduce the time and effort required to sift through massive amounts of data, making the review process more efficient and accurate.
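To make the idea of a probability score concrete, here is a minimal sketch of predictive coding in Python. It is an illustration only, not how any particular e-discovery product works: it assumes a simple naive Bayes word-count model, and the function names (train, relevance_score, rank) and the sample documents are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Deliberately crude: lowercase and split on whitespace.
    return text.lower().split()


def train(seed_set):
    """Learn word counts from a seed set of (text, is_relevant) pairs."""
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, is_relevant in seed_set:
        doc_counts[is_relevant] += 1
        word_counts[is_relevant].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocab


def relevance_score(model, text):
    """Return an estimated probability (0 to 1) that the text is relevant."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_prob = {}
    for label in (True, False):
        # Smoothed prior: share of seed documents carrying this label.
        lp = math.log((doc_counts[label] + 1) / (total_docs + 2))
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word in vocab:  # words never seen in the seed set are ignored
                lp += math.log(
                    (word_counts[label][word] + 1) / (total_words + len(vocab))
                )
        log_prob[label] = lp
    # Convert the two log scores into a normalised probability.
    top = max(log_prob.values())
    rel = math.exp(log_prob[True] - top)
    non = math.exp(log_prob[False] - top)
    return rel / (rel + non)


def rank(model, documents):
    """Order documents from most to least likely relevant."""
    return sorted(documents, key=lambda d: relevance_score(model, d), reverse=True)


# Hypothetical seed set coded by reviewers.
seed_set = [
    ("breach of supply contract", True),
    ("contract termination notice", True),
    ("office lunch menu", False),
    ("holiday party invitation", False),
]
model = train(seed_set)

unreviewed = [
    "lunch party on friday",
    "quarterly sales report",
    "notice of contract breach",
]
ranked = rank(model, unreviewed)
for doc in ranked:
    print(f"{relevance_score(model, doc):.2f}  {doc}")
```

In this toy example the document that shares vocabulary with the relevant seed documents rises to the top of the ranking, while a document with no words in common with the seed set scores exactly 0.5, meaning the model has no evidence either way.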


A Leap from TAR 1.0 to 2.0: building on what it has already learned

TAR 1.0 processes have cut down the number of documents requiring manual review in litigation and drastically reduced review costs in lawyers' billable time. They have also improved accuracy, as TAR allows reviewers to focus primarily on the most relevant documents. However, TAR 1.0 depends on the ‘seed set’ of documents initially coded into the algorithm: it relies on a single static training set to train the software to spot relevant information and categorize it according to the needs of the practitioners. Simply put, because the sample sets are randomly selected, lawyers may have to perform multiple iterations of these sample reviews themselves to achieve the desired accuracy.


This reliance has led to the push for a more advanced version of TAR, namely TAR 2.0, which largely replaces its predecessor.

TAR 2.0 leverages Continuous Active Learning (CAL) models. As with TAR 1.0, a CAL model is initially trained using a ‘seed set’ of documents prepared by human reviewers. Unlike its predecessor, however, the software continues to learn and adapt after the initial coding. As practitioners continue to utilise the software, it refines its predictions based on their ongoing coding decisions, increasing its accuracy in identifying relevant documents and making the review process smoother. This process is tailored to each specific matter, as the 'seed set' is matter-specific, ensuring that the tool finds useful information even when switching from one topic for one client to another topic for a different client. This continuous learning capability allows TAR 2.0 to maintain high accuracy and efficiency across various cases and topics.
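The continuous learning loop described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a description of any vendor's CAL implementation: the scoring model is a basic word log-odds score, the reviewer argument is a hypothetical stand-in for a human coding decision, and the stopping rule (stop after a batch with no relevant documents) is a toy substitute for the statistical validation a real review would rely on.

```python
import math
from collections import Counter


def fit(coded):
    """Build smoothed per-word log-odds of relevance from (text, label) pairs."""
    rel, non = Counter(), Counter()
    for text, is_relevant in coded:
        (rel if is_relevant else non).update(text.lower().split())
    vocab = set(rel) | set(non)
    n_rel, n_non = sum(rel.values()), sum(non.values())
    return {
        w: math.log((rel[w] + 1) / (n_rel + len(vocab)))
        - math.log((non[w] + 1) / (n_non + len(vocab)))
        for w in vocab
    }


def score(weights, text):
    # Unknown words contribute nothing to the score.
    return sum(weights.get(w, 0.0) for w in text.lower().split())


def continuous_active_learning(seed_set, unreviewed, reviewer,
                               batch_size=1, stop_after_misses=1):
    """Repeatedly retrain, surface the top-ranked documents, and record coding."""
    coded = list(seed_set)
    pool = list(unreviewed)
    found = []
    empty_batches = 0
    while pool and empty_batches < stop_after_misses:
        weights = fit(coded)  # retrain on every coding decision made so far
        pool.sort(key=lambda d: score(weights, d), reverse=True)
        batch, pool = pool[:batch_size], pool[batch_size:]
        hits = 0
        for doc in batch:
            is_relevant = reviewer(doc)  # the human decision in a real review
            coded.append((doc, is_relevant))
            if is_relevant:
                found.append(doc)
                hits += 1
        empty_batches = 0 if hits else empty_batches + 1
    return found, pool


# Hypothetical matter: anything mentioning a contract or agreement is relevant.
def reviewer(doc):
    return "contract" in doc or "agreement" in doc


seed_set = [
    ("breach of supply contract", True),
    ("office lunch menu", False),
]
unreviewed = [
    "lunch menu for friday",
    "supplier terminated agreement early",
    "notice of contract breach",
    "agreement terminated by supplier notice",
    "holiday party invitation",
]
found, remaining = continuous_active_learning(seed_set, unreviewed, reviewer)
print(found)
print(remaining)
```

The point of the sketch is the feedback loop. The document "supplier terminated agreement early" shares no words with the seed set, so the initial model cannot distinguish it from unrelated material. It is only ranked ahead of the irrelevant documents after the reviewer has coded other documents that mention a supplier, a termination, and an agreement, which a single static training set would not have captured.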


Fully accurate legal automation has yet to arrive

While studies have found that TAR can provide more accurate results than manual review, for a fraction of the cost and time normally spent on the document review process, it should be noted that the CAL model has its own limitations. For instance, it is built on the textual information in each document, meaning that files such as images, audio, and video are not suitable for this type of application. Spreadsheets such as Excel files are also often excluded because the model has difficulty grasping the context of highly structured documents. Overall, the goal remains to increase the volume of documentation that can be accurately analysed using machine learning, which in turn assists legal professionals in designing an effective workflow strategy tailored to their practice and the needs of their clients. How legal automation will develop to support this goal remains to be seen.


 

Disclaimer: Whilst every effort has been made to ensure the accuracy of this article, it is general in nature and does not constitute legal advice of any kind. You should seek your own personal legal advice before taking legal action. We accept no liability whatsoever for loss arising out of the use or misuse of this article.


For specific advice about your situation, please contact:


Erica So

Associate Solicitor

+852 2388 3899
