Can Artificial Intelligence Ease the EDD Burden?


By Nick Brestoff

Law Technology News

January 20, 2011


We don't handwrite letters or sit on park benches to confer with each other anymore, do we? We text-message each other, write e-mails, or use still other forms of instant communication. We may even send signals up to a satellite and then down to the person in the same room.

Welcome to the age of legal informatics. Legal informatics is a sort of fusion between artificial intelligence and the law. It presents the question "Will AI put information management, that far left and oft-neglected process described by the Electronic Discovery Reference Model, in the e-discovery spotlight?" In other words, can AI help prevent or alleviate our e-discovery burdens?

It's safe to say that the vast majority of our business and personal information is stored and transmitted in digital format. And we create a lot of it. According to IBM, every day we create enough information to fill all the libraries in the United States; oh, sorry, eight times all the libraries in the U.S., every day.

In this coming year alone, we humans will create as much information as we created in all the years that preceded it, combined -- all of them. Which brings me to what I learned at TEDxCaltech, a conference I attended as an alum on January 14, 2011.

TED stands for Technology, Entertainment, Design. TED's purpose is to provide a platform for very interesting people to share their very interesting ideas, across disciplines, in a fast-paced and entertaining way. Each presentation is limited to 18 minutes, and they are likely the best presentations by the best and brightest minds you will ever see. I recommend them.

The "x" means that a particular event is independently organized -- in this case, by and for Caltech. The TED organization provides the format, some loose guidelines, and the spirit of sharing ideas.

Conference session 3 was about nanoscience and future biology, arenas that seemingly have no relevance to e-discovery but which may have important implications for legal professionals. These fields are moving quickly in large part because modern scientists can generate enormous and still-growing volumes of data. In one experiment, a single scientist can now generate well over a gigabyte of data. To interpret this overwhelming amount of data, biologists have turned to the field of bioinformatics, which applies advanced search strategies to the data. They are uncovering relationships that would otherwise have been impossible to appreciate.

Perhaps the use of bioinformatics in nanoscience is analogous to the use of legal informatics in e-discovery. Can we apply a variation of bioinformatics in the legal world?


Yes. In e-discovery, we are forced in many cases to deal with gigabytes and terabytes of data that consist of e-mails, documents, and spreadsheets. When we first met search technology about 35 years ago, we learned how to use keywords to search a database filled with appellate decisions. But now we have to deal with unstructured and vastly larger databases.


We are adapting. Although you may not realize it, many of us are using predictive coding to go well beyond the searches we did using keywords. And by so doing, we are finding a much higher percentage of potentially relevant documents in the datasets we receive from opposing counsel than the keywords approach could ever produce.




What's under the hood of predictive coding, also known as concept searching? A discipline completely foreign to most attorneys: mathematics -- matrices, linear algebra, vector clusters, and relevance rankings.
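For readers who want to see what "vectors" and "relevance rankings" might look like in practice, here is a minimal Python sketch. It is an illustration only, not how any particular review tool works: it assumes a simple bag-of-words model and cosine similarity, whereas real predictive coding systems typically learn from documents that attorneys have already coded. The function names, the sample documents, and the seed phrase are all hypothetical.

```python
import math
from collections import Counter


def term_vector(text):
    """Turn text into a bag-of-words vector: each term mapped to its count."""
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    return Counter(w for w in words if w)


def cosine_similarity(a, b):
    """Cosine of the angle between two term vectors (0.0 means no shared terms)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_relevance(seed_text, documents):
    """Score every document against the seed text, highest score first."""
    seed = term_vector(seed_text)
    scored = [
        (cosine_similarity(seed, term_vector(body)), name)
        for name, body in documents.items()
    ]
    return sorted(scored, reverse=True)


# Hypothetical three-document collection, invented for this illustration.
documents = {
    "email_1": "The brake defect report was never sent to the regulator.",
    "email_2": "Lunch on Friday? The new place near the office is good.",
    "memo_3": "Engineering flagged a defect in the brake assembly last quarter.",
}

ranking = rank_by_relevance("brake defect report", documents)
for score, name in ranking:
    print(f"{name}: {score:.2f}")
```

Instead of a yes-or-no keyword hit, each document receives a score, so a reviewer can start with the documents most likely to matter and work down the list.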

In e-discovery, we search through massive amounts of documents by treating them all as data. All those e-mails, documents, and spreadsheets are converted into long, unique series of just two numbers: 1 and 0.

But computers are very fast now and they are becoming faster, smarter, and less expensive. Next month an IBM computer named Watson will play against two Jeopardy champions. Watson will ingest the Jeopardy categories and clues, deal with puns and so on, hit the button, and in seconds put its answer in the proper Jeopardy format.


I'm betting on Watson. It already won a practice round against its two human competitors. If Watson wins, it would be a sort of "proof of concept" for legal informatics.

But for what? In e-discovery, we find ourselves increasingly connected to the worlds of data mining and artificial intelligence, and this is unfamiliar territory. Could we re-educate ourselves to figure out how to avoid e-discovery altogether?


One thing is clear: if you went to law school to avoid science, technology, engineering, and math, I'm sorry to bring you the news that you can avoid it no longer. It's here; it's in the middle of nearly every lawsuit; and it's quickly taking over giant swaths of our jurisprudential territory.

So it's high time for attorneys to upgrade their search methodologies. In a world of gigabytes and terabytes, the Boolean keyword search technique is simply not up to the job.


But why do the work if you can avoid it? If Watson wins, we will be on the threshold of preventing the need for e-discovery and its costs by moving analysis and review into that far left category of the EDRM called information management or, in other words, pre-litigation information management. Like Smokey the Bear, I'd want to find those embers of potential liability in real time, and try to put those fires out the moment they begin to flicker.


All this does not spell "early case assessment." It suggests real-time review and analysis as part of information management, before any damage is done: no damage means no lawsuit; no lawsuit means no e-discovery.


For those who have to pay the bill, liability prevention is far better than having to start preserving data because of liability potential.


Will you hear this message at LegalTech New York? I doubt it. But if you need to be persuaded that we are in the middle of an enormously significant paradigm shift for how we practice law, here's another conference to consider attending: the International Conference on Artificial Intelligence and Law at the University of Pittsburgh School of Law from June 6 to 10, 2011.


You won't find Smokey there saying, "Only AI can prevent litigation." But you should look for Watson.




Nick Brestoff was a litigator in California for 38 years. He is the principal author of Preventing Litigation: An Early Warning System to Get Big Value Out of Big Data (Business Expert Press, 2015), and is the founder and CEO of Intraspexion Inc.




Copyright 2011. ALM Media Properties, LLC. Further duplication without permission is prohibited. All rights reserved.