Why Enterprise Paralegals Will Be More Powerful Than the Senior Partners in Outside Law Firms

On May 17, 2018, I listened to a talk by Dr. Lewis Z. Liu, the CEO of Eigen Technologies. (His Ph.D. is in atomic and laser physics from the University of Cambridge.)

The point of Dr. Liu’s fine talk was to suggest why, with assistance from machines using artificial intelligence in the form of deep learning and natural language processing (NLP), the young associates of the law firms of tomorrow will know more and be more efficient, more productive than the highest paid senior partners of the law firms of today.

In an equation of sorts, Dr. Liu argued that being AI-enabled means this (using > to mean “greater value than”):

young associates of the law firms of tomorrow > senior partners of the law firms of today

I believe that, but I’ll take Dr. Liu’s contention to an even higher level.

My contention is that, in the future, the paralegals in the enterprise Law Departments (whether corporate or governmental) will, with the assistance of machines using the same tools, e.g., artificial intelligence in the form of deep learning and NLP, be more efficient and more productive than the most senior partners of the outside Law Firms.

Here’s my contention (where >> means “much greater value than”):

paralegals in Law Departments >> senior partners in Law Firms.

Now how am I going to convince you of that outlandish contention?

Let’s start with the paralegal in the enterprise Law Department of today. When a lawsuit is filed, custodians of potentially relevant documents receive litigation hold notices, and collections are assembled into a case-specific corpus of documents.

These collections need to be separated into documents that are irrelevant and need not be produced; documents that are potentially relevant; and documents that may be potentially relevant but are protected from production by an applicable privilege, e.g., the attorney work-product doctrine or the attorney-client privilege.

In addition, an early assessment of the relevant but non-privileged documents can be done. That assessment may reveal whether risky or “smoking gun” documents are contained in the set that must be produced to an adversary unless otherwise privileged from being disclosed.

This information is useful in that such “smoking gun” documents, if they exist, may be weak and indicate that the lawsuit is defensible, or may be terrifying and suggest that the case is a looming disaster.

But the context is an already-filed lawsuit. In other words, if it poses a terrible risk, the enterprise is in big trouble. If that’s the nature of the lawsuit, it’s already too late.

Could a paralegal in the Law Department have seen such a risk coming, say by accessing and assessing the “smoking gun” or risky emails or other text-based documents before the situation devolved into litigation?

Before Intraspexion began applying artificial intelligence in the form of deep learning to internal enterprise communications such as emails, the answer was no.

And the reason is simple: The amount of data in yesterday’s emails was too large to read and there were no tools available to do the job.

But suppose that such a tool exists? We at Intraspexion have used artificial intelligence in the form of deep learning and created a patented software system for finding specific types of litigation risks in yesterday’s emails.

The system is currently trained for the risk of employment discrimination and, after the deep learning analysis “engine” is trained, it is very sharp.

What do I mean by “sharp”? I mean this:

I’ve said that our first classification is “employment discrimination.”

In our white paper, I described our steps for training a machine to “understand” this litigation risk category.

For the “positive” training set, we created a set of examples from the factual allegations in hundreds of previously filed discrimination complaints in the federal court litigation database called PACER, under the classification for “Civil Rights-Jobs,” which (in PACER) is Nature of Suit code 442.

Now, for our purposes, we didn't care about the legalese of "jurisdiction and venue," the names of the parties, or the specific claims that were being made. And it didn't matter whether the discrimination was for age, race, sex, or any other sub-category of discrimination. PACER has no sub-classifications for them.

Next, we created a “negative” set of examples that was “unrelated” to “employment discrimination.” This negative set consisted of newspaper and Wikipedia articles and other text, including emails.

But to the best of our knowledge, there were no "discrimination" articles or emails in these sources for our "negative" examples.

After that, we switched from PACER and looked at Enron emails and, to make a long story short, found four (4) examples of true risks for employment discrimination. We found them in the subsets for Lay, Kenneth (Ken Lay was the Chairman and CEO of Enron); Derrick, J.; and a few other former Enron employees.

Now, having found four "true risks," we knew that we had something special. Enron is known for fraud, not employment discrimination. And, as far as we know, no one before us had surfaced emails that were about "discrimination."

Thus, we had successfully trained our Deep Learning model to "learn" the pattern for "discrimination."
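To make the idea of training on a "positive" set and a "negative" set concrete, here is a minimal sketch in Python. It is not Intraspexion's deep learning model. It is a much simpler word-count stand-in (in the style of naive Bayes), the four one-line example texts are invented purely for illustration, and the function names train and score are my own assumptions.

```python
import math
from collections import Counter

# Invented toy examples; the real training sets were far larger.
POSITIVE = [
    "she was passed over for promotion because of her age",
    "he was fired after complaining about racial slurs",
]
NEGATIVE = [
    "the quarterly meeting is moved to friday",
    "please review the attached gas pipeline contract",
]

def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()

def train(positive_docs, negative_docs):
    """Count how often each word appears in each training set."""
    pos_counts = Counter(w for d in positive_docs for w in tokenize(d))
    neg_counts = Counter(w for d in negative_docs for w in tokenize(d))
    vocab = set(pos_counts) | set(neg_counts)
    return pos_counts, neg_counts, vocab

def score(text, model):
    """Sum of per-word log-likelihood ratios; above 0 leans 'related'."""
    pos_counts, neg_counts, vocab = model
    # Add-one (Laplace) smoothing so unseen words never give log(0).
    pos_total = sum(pos_counts.values()) + len(vocab)
    neg_total = sum(neg_counts.values()) + len(vocab)
    total = 0.0
    for word in tokenize(text):
        if word in vocab:  # ignore words never seen in training
            total += math.log((pos_counts[word] + 1) / pos_total)
            total -= math.log((neg_counts[word] + 1) / neg_total)
    return total

model = train(POSITIVE, NEGATIVE)
related = score("passed over because of her age", model)
unrelated = score("the meeting is moved", model)
print(related > 0, unrelated < 0)
```

The point of the sketch is only the shape of the workflow: examples of the risk on one side, unrelated text on the other, and a score that says which side a new email resembles.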

Our third step was a benchmarking project we can't discuss.

Then, after that "first light" pilot project, we added 10,000 Enron non-discrimination emails to the unrelated set, so the model could “understand” English in the context of emails.

Then we looked at a held-out set of 20,401 Enron emails that our system had never analyzed previously.

Result: Our "model" called out 25 emails as being "related" to discrimination, and our 4 "needles" were among the 25.

That's 25 out of 20,401 emails, a fraction of 0.001225, which is a little less than one-eighth of one percent.
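For readers who want to check the arithmetic, here is a small Python sketch using only the numbers reported above. The variable names are mine. The last line treats the four known "needles" as the only confirmed true risks among the 25, which is an assumption, since I haven't described the other 21 here.

```python
flagged = 25        # emails the model called out as "related"
held_out = 20_401   # held-out Enron emails
needles = 4         # known true risks, all among the flagged 25

fraction = flagged / held_out
print(f"fraction flagged: {fraction:.6f}")
print(f"as a percentage: {fraction:.4%}")
print(f"one-eighth of one percent: {0.125 / 100:.6f}")
print(f"held-out emails per flagged email: {held_out / flagged:.0f}")
# Assumption: only the 4 known needles are counted as confirmed risks.
print(f"known needles as a share of flagged emails: {needles / flagged:.0%}")
```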

So if a machine can winnow 20,401 down to only 25 emails that it calls out as “related” to the risk of employment discrimination, that’s great, but then what?

What’s a machine to do? What’s next?

It doesn’t know.

Now I can conclude my argument in three short steps.

  1. Enterprise paralegals by themselves are closest to the risky data but don’t swim well in that ocean. They don’t bother to even look because the water is so murky.

  2. A computer trained using artificial intelligence in the form of deep learning can find the text related to the risks for which it’s been properly trained, but by itself cannot deal with the results.

  3. But paralegals who receive the results can assess the results, and then open and conduct an internal investigation, and, perhaps after sharing the investigation results with others, advise a control group executive about the potentially adverse situation.

Thus, the enterprise may be proactive and avoid the lawsuit altogether.

Senior partners in the very best outside law firms can never hope to be so helpful. They’re not close enough to the risky data when it’s only risky.

Conclusion: the enterprise paralegals of the Law Departments of the Future, enabled by artificial intelligence using deep learning, will be more powerful and beneficial to the enterprise than the outside law firm’s most senior partners.



Book Review: Prediction Machines

In my last blog, I said that I’d attended the Summit on Law and Innovation at Vanderbilt Law in Nashville. I try to write at least one blog a month, but something wonderful happened as I went to the Summit, and I have to report it to you here.

On the plane going to the Summit, I finally had time to read a book. On the plane going back to Seattle, I re-read it.

What an eye-opener.

So the book is Prediction Machines, subtitled The Simple Economics of Artificial Intelligence, written by Ajay Agrawal, Joshua Gans, and Avi Goldfarb, all three of whom are economists and professors at the University of Toronto’s Rotman School of Management.

See https://www.predictionmachines.ai/

But these three academics also had some hands-on experience to back up their teachings. They built the Creative Destruction Lab (CDL), a seed-stage program to support science-based startups. As they put it on page 2 of their book, CDL’s most exciting ventures were AI-enabled companies and, as of September 2017, the CDL had (for the third year in a row) interfaced with the largest cohort of AI startups of any program on the planet.

From that advantaged perch, the authors launch their book with their “first key insight,” which is that “the new wave of artificial intelligence does not actually bring us intelligence but instead a critical component of intelligence—prediction.” (Italics in the original.)

And they were still on page 2.

At the end of Chapter 1, they provide Key Points in bullet-point fashion. They do this with every chapter. In other words, they take notes for you.

In Chapter 2, there was a second major insight. Prediction, they say, is “the process of filling in missing information.” Cheaper predictions will mean more predictions because, they say, when the cost of something valuable falls, we will do more of it.

That puts us on the road to disruption. Predictions are being used to solve traditional problems now, but they will be used to solve non-traditional problems in the future. And then something else happens: the value of other things, which they and other economists call “complements,” increases. As examples, they write that if the cost of coffee goes down, and we drink more of it, the demand for and value of sugar and cream goes up. When autonomous vehicles make highly accurate predictions, the value of sensors to capture the data representing the oncoming surroundings goes up.  

In fact, they write, “Some AIs will affect the economics of a business so dramatically that they will no longer be used to simply enhance productivity in executing against the strategy; they will change the strategy itself.”

Now what do they mean by that? They mean that, for Amazon, the current strategy is to enable “shop, then ship.” But they also mean that if the processes of delivery and handling returns are so well predicted that their respective costs go down significantly, then a new model might emerge: “ship, then shop.”

And that’s just the end of Chapter 2.

By the time these brilliant authors reach the end of their book, they are explaining why the likes of the AI-enabled tech companies—Google and Microsoft—have seen the future and, having seen it, transformed their companies from “mobile-first” to “AI-first.”

In Chapter 17, they’re explaining that such a shift “means compromising on other goals such as maximizing revenue, user numbers, or user experience.”

Why? What’s the explanation? Here it is, on p. 194:

“AI can lead to disruption because incumbent firms often have weaker economic incentives than startups to adopt the technology. AI-enabled products are often inferior at first because it takes time to train a prediction machine to perform as well as a hard-coded device that follows human instructions rather than learning on its own. However, once deployed, an AI can continue to learn and improve, leaving its unintelligent competitors’ products behind….”

Without reservation and with sincere (and highest) compliments, I recommend Prediction Machines.

Published by the Harvard Business Review Press. 2018.

Call to action: Get this book.



How to Quantify the Bullets Dodged

On April 30, I attended the Summit on Law and Innovation at Vanderbilt Law School in Nashville. It was one of the best Summits ever. Vanderbilt Law is a pioneer that fosters innovation year in and year out.

Now to my post. One of the presenters was Lawton Penn, a partner at Davis Wright Tremaine who leads the DWT De Novo team. She too is a true innovator.

One of the stories she told involved a GC who expressed concern about having to go to a meeting with the CFO.

It was all about budgets.

The GC said that the leaders of other departments had multi-million-dollar budgets and were typically coming within one-half of one percent of meeting them, and so she was not looking forward to saying, “Just trust me.”

The GC was asked why she feared the meeting. She went on to say that the Law Department had done very well and had avoided some scary bullets. But then she asked a memorable question: “How do you quantify the value of dodging bullets?”

My mind switched to how I would quantify “dodging bullets” for litigation.

First, I’d build a spreadsheet of the past. The spreadsheet would show how many lawsuits were filed against the company each year, and then break them down into case types, and then show the trends over time, and the total, average, and distribution of the litigation “cost,” meaning the cost of verdicts and settlements, defense attorneys’ fees, expert witness costs, and administrative costs.

No matter what, all of this cost would be a loss to the company. I say “loss” because any lawsuit filed against a company is a forthcoming loss. No matter whether the case is “favorably” settled or won at trial or on appeal, there are losses in productivity, potential damage to the brand, and the costs of administration by the Corporate Law Department, the fees charged by outside counsel, eDiscovery vendors, experts, and the payouts for settlements or verdicts.

Not even insurance, if any, will cover it all.

But those costs can be compared against the costs in prior years and even the net revenues of the company.

I can only show you a template for such a spreadsheet. I don’t have actual data to present, so I will use old, and not market-relevant, data for a company I won’t name.

In part one, I will use only the publicly available litigation data from PACER (the federal court litigation database). Since the data does not include state court cases, it’s incomplete in that and in other ways.

But the six (6) Nature of Suit (NOS) codes listed below account for 152 cases out of 190 total cases over a 5-year period, or about 80% of the total.

In part two, I will use publicly available financial data and also use the $350,000 per case data that I compiled in my book, Preventing Litigation: An Early Warning System, etc. (2015).

(Important note: Each Law Department can compute its own cost average per case each year.)

The six Nature of Suit codes I’ll use are:

NOS descriptions

190 -- Breach of Contract

440 -- Civil Rights:  Other

442 -- Civil Rights:  Employment (i.e., employment discrimination)

710 -- Labor:  Fair Labor Standards Act (FLSA)

791 -- Labor:  ERISA

830 -- Patent

COMPANY X

Year      190    440    442    710    791    830
2013        9      5      7      9     13      2
2012        7      0     11      1      6      9
2011        2      1      4      1      7      1
2010        9      7      5      2      7      9
2009        3      4      4      0      6      1
Total      30     17     31     13     39     22
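As a quick check on the table above, here is a minimal Python sketch that adds up each column and confirms the share of the 190 total cases. The variable names are my own, and the sketch uses only the numbers already shown.

```python
# Case counts per year from the Company X table, in NOS-code column order.
NOS_CODES = [190, 440, 442, 710, 791, 830]
CASES = {
    2013: [9, 5, 7, 9, 13, 2],
    2012: [7, 0, 11, 1, 6, 9],
    2011: [2, 1, 4, 1, 7, 1],
    2010: [9, 7, 5, 2, 7, 9],
    2009: [3, 4, 4, 0, 6, 1],
}
# Total cases over the five years, per the text (not to be confused
# with NOS code 190, which happens to be the same number).
ALL_CASES = 190

# Sum each column to get the five-year total per NOS code.
totals = {code: sum(row[i] for row in CASES.values())
          for i, code in enumerate(NOS_CODES)}
covered = sum(totals.values())

for code, count in totals.items():
    print(f"NOS {code}: {count}")
print(f"covered: {covered} of {ALL_CASES} ({covered / ALL_CASES:.0%})")
```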

Now read this from the bottom up; that is, starting in 2009 and going forward in time.

There’s a certain magic about that sort of rear-view-mirror review: Trends and/or problems appear.

For example, take breach of contract (NOS 190): the count was low at 3, bounced up to 9, then down to 2, then up to 7, and then back up to 9.

But notice that Civil Rights: Employment went from 4 to 5 to 4, which is essentially flat, but then flared up to 11 and then moved down again, but only to 7.

And FLSA cases went from 0 to 2 to 1 and then from 1 to 9.

And the ERISA cases went from 6 to 7 to 7 to 6 and then more than doubled to 13.

Wow, what’s happening with employment issues? Hmmm.
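One way to test that hunch is to add up the three employment-related codes (442, 710, and 791) year by year. Grouping those three together is my own reading of the table, and the names in this Python sketch are assumptions, but the counts are taken straight from the numbers above.

```python
# Yearly counts from the Company X table, oldest year first.
YEARS = [2009, 2010, 2011, 2012, 2013]
EMPLOYMENT = {
    "442 Civil Rights: Employment": [4, 5, 4, 11, 7],
    "710 FLSA": [0, 2, 1, 1, 9],
    "791 ERISA": [6, 7, 7, 6, 13],
}

# Combined employment-related cases in each year.
combined = [sum(series[i] for series in EMPLOYMENT.values())
            for i in range(len(YEARS))]
# Change from each year to the next.
changes = [later - earlier for earlier, later in zip(combined, combined[1:])]

for year, count in zip(YEARS, combined):
    print(year, count)
print("year-over-year change:", changes)
```

Seen this way, the combined count nearly triples between 2009 and 2013, with the biggest jump in the final year.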

Let’s step back and think more generally.

With trends, a GC might make hiring plans for on-coming threats (as in L & E, above, or the wave of asbestos lawsuits).

Or, perhaps, she might ask questions about whether employment or product liability insurance levels are set right or need to be adjusted.

And then there’s the financial context. With cost and caseload numbers, the GC could compute the cost per case in each year. And, of course, the cost per case, when multiplied by the number of cases, would tie back to the total cost.
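Here is a rough Python sketch of that tie-back. It applies the flat $350,000 per-case average mentioned earlier to the yearly case counts for the six NOS codes, so the dollar figures are illustrative estimates, not Company X's actual costs. A Law Department with its own numbers would substitute its own yearly averages. The variable names are mine.

```python
# Average per-case figure cited in the text (from the 2015 book).
COST_PER_CASE = 350_000

# Cases per year across the six NOS codes in the Company X table.
CASES_PER_YEAR = {2009: 18, 2010: 39, 2011: 16, 2012: 34, 2013: 45}

estimated_cost = {year: n * COST_PER_CASE for year, n in CASES_PER_YEAR.items()}
total_cases = sum(CASES_PER_YEAR.values())
total_cost = sum(estimated_cost.values())

for year, cost in estimated_cost.items():
    print(f"{year}: {CASES_PER_YEAR[year]} cases, about ${cost:,}")
# The cost per case times the number of cases ties back to the total.
print(f"total: {total_cases} cases, about ${total_cost:,}")
```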

Being a data-driven person, that’s how I’d try to manage the Law Department, and I think I’d be looking for any tool that would help me drive down the frequency and the cost of the top five types of lawsuits that bedevil the company.

And if I ever found a tool that would help me drive the frequency of a particular type of litigation down on a year-over-year basis, then I would be able to explain what went right.

But in general, I’d have data and metrics and trends and a principled way of holding my head up in a meeting with anyone in the C-suite.

But here’s the problem.

I think everyone in the legal profession is aware that law firm leaders are in their positions to build revenues for the firm and to increase the annual distributions.

Law Departments of the Future should not be mired in such short-term thinking. If they persist in that myopia they’ll have no budget for innovation. And then they’ll be stuck with contending that the profession is “special,” and that “Trust me” is an answer that’s good enough.

It’s not. In these tech-turbulent times, the tech companies are moving from mobile-first to AI-first. Law Departments should take note, argue for innovation budgets, and move from increased efficiency to creative disruption.

Put another way: How about a paradigm shift to innovation, experiments with new tools, and a management style that is more data-driven?

With new AI and blockchain tools that are just now coming over the horizon for the Corporate Law Departments of the Future, my argument is simple: Give them a try. See if they make a difference.




Will Quantum Computing Use Quantum Walk Neural Networks?

I’ll answer my own question. Of course it will.

Now let me set the stage.

In my last blog, I recounted having attended an event in Seattle last month put on by the United States Chamber of Commerce. It was jointly presented by the Institute for Legal Reform and the Center for Emerging Technologies.

One of the speakers said something like “Big Data is the new oil.” I made a connection then with AI in the form of Deep Learning, and I’m repeating it here:

“If Big Data is the new oil, deep learning is the new refinery.”

I also summarized the ingredients of the New Refinery, asserting that they consist of Central Processing Units (CPUs) + Graphics Processing Units (GPUs) + Deep Learning algorithms (DL) + Applications revealing useful insights (A); i.e., CPUs + GPUs + DL + As.

I ended my last blog by suggesting that another Revolution is on the horizon and that I would give you an “early warning” about it.

Here’s what I see.

On the hardware side, I think we must take note of the efforts to develop better hardware, the so-called Quantum Computers.

Quantum Computing

A Quantum Computer (QC) is a computer that operates in an entirely different way from the computers with which we’re familiar. In the world we experience as humans, the physics is a classical approximation of Nature, which is quantum mechanical.

Our computers are digital and operate in two specific states of either one (1) or zero (0), a binary “yes” or “no” choice. The data encoded in this way are called “bits.” 

A QC uses a quantum processing unit (QPU) and “quantum bits,” or “qubits” for short, which can exist in a “superposition” of the one and zero states.
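If "superposition" sounds abstract, here is a minimal Python sketch of the standard textbook picture of a single qubit: two amplitudes, one for the zero state and one for the one state, whose squared magnitudes give the odds of reading a 0 or a 1. The equal-superposition values are chosen only as an example, and the sketch is ordinary arithmetic on a classical computer, not a quantum program.

```python
import math

# A single qubit's state is a pair of amplitudes (alpha, beta)
# for the states |0> and |1>. Example: an equal superposition.
alpha = complex(1 / math.sqrt(2), 0)
beta = complex(1 / math.sqrt(2), 0)

# Measuring yields 0 with probability |alpha|^2 and 1 with |beta|^2,
# and the two probabilities must add up to 1.
p_zero = abs(alpha) ** 2
p_one = abs(beta) ** 2

print(f"P(0) = {p_zero:.2f}, P(1) = {p_one:.2f}, sum = {p_zero + p_one:.2f}")
```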

I won’t delve into the development of the QC except to alert you to a few details:

The idea was first proposed in 1980. See the Timeline section of the Wikipedia article for Quantum Computing at https://en.wikipedia.org/wiki/Quantum_computing.

In 1981, the late (and great) Richard Feynman is reported to have said:  "Nature isn't classical, dammit, and if you want to make a simulation of nature, you'd better make it quantum mechanical, and by golly it's a wonderful problem, because it doesn't look so easy."

For Feynman’s quote, see this IBM blog post from May of 2016: https://www.ibm.com/blogs/think/2016/05/the-quantum-age-of-computing-is-here/.

Scroll through the Wikipedia timeline linked above, and you’ll see that QC has been developing for a very long time, but also that the pace of innovation is accelerating.

More recently, here are some key developments by some of our tech giants:

In March 2017, IBM announced a commercially available quantum computing system with an open Application Programming Interface (API) called (not surprisingly) IBM Q.

In December 2017, Microsoft announced a preview version of a development kit with a programming language called Q#. This language is for writing programs that run on an emulated quantum computer.

In March 2018, Google’s Quantum AI Lab announced a 72-qubit processor called Bristlecone.

(All of these developments are in the Wikipedia article for Quantum Computing, linked above.)


So now what about software?

Yes, that’s beginning to appear too.

Quantum Walk Neural Networks

In January of 2018, a paper by Dernbach, Mohseni-Kabir, Towsley and Pal was published called Quantum Walk Inspired Neural Networks for Graph-Structured Data. We now have yet another abbreviation: QWNNs.

The first author is Stefan Dernbach, who is currently a PhD student at the Computer Networks Research Group at the University of Massachusetts College of Information and Computer Sciences. There are three co-authors: Arman Mohseni-Kabir, Don Towsley and Siddharth Pal. Towsley is Dernbach’s PhD advisor. Mohseni-Kabir is a graduate student in the physics department at UMass Amherst. Pal is a scientist with Raytheon BBN Technologies.

The link to this two-page article is here:


The abstract reads in part:

“We propose quantum walk neural networks (QWNN), a new graph neural network architecture based on quantum random walks, the quantum parallel to classical random walks. A QWNN learns a quantum walk on a graph to construct a diffusion operator which can be applied to a signal on a graph. We demonstrate the use of the network for prediction tasks for graph structured signals.”
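The abstract describes quantum walks as the quantum parallel to classical random walks, so it may help to see the classical version first. The Python sketch below is only that classical counterpart, not the QWNN from the paper: one step of a random walk spreading a signal across a tiny graph. The three-node graph and the function name diffuse are invented for illustration.

```python
# A three-node path graph, 0 - 1 - 2, as an adjacency list (invented example).
GRAPH = {0: [1], 1: [0, 2], 2: [1]}

def diffuse(signal, graph):
    """One step of a classical random walk: each node splits its
    value equally among its neighbors."""
    result = [0.0] * len(signal)
    for node, neighbors in graph.items():
        share = signal[node] / len(neighbors)
        for neighbor in neighbors:
            result[neighbor] += share
    return result

signal = [1.0, 0.0, 0.0]   # all of the signal starts on node 0
one_step = diffuse(signal, GRAPH)
two_steps = diffuse(one_step, GRAPH)
print(one_step, two_steps)
```

In the classical walk the spreading rule is fixed by the graph. As the abstract puts it, a QWNN instead learns a quantum walk and uses it to construct the diffusion operator.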

Note the phrase, “prediction tasks.”

That’s what’s so promising. Like it or not, we want (and need) AI to help us do in the future what we alone cannot do now. As I will discuss in my next blog, devices and software systems using AI, especially in the form of Deep Learning, are (to quote the title of a book I will review in my next post) "prediction machines."

But humans are also prediction machines, and I will make this prediction now: We’ll be hearing a lot more about the combination of QCs with some variation of QWNNs.




Deep Learning is the New Refinery

Last month, I quoted Aleksandra Zimonjic, an attorney with the Los Angeles law firm of Landau Gottfried & Berger LLP, for her quip about what Intraspexion does. She said, “Oh, I see. Better a toothbrush than a drill.”

That was amazing and, after thinking about it, I can only add a dimension of time:

“It’s better to use a toothbrush daily than to face a dentist’s drill even once a year.”

This month I attended the joint meeting, in Seattle no less, of the Institute for Legal Reform of the United States Chamber of Commerce and the Chamber’s recent initiative to focus on Emerging Technologies. 

One of the speakers said something like “Big Data is the new oil.” I made this connection then and I’m writing it here:

“If Big Data is the new oil, deep learning is the new refinery.”

It makes sense, right? In a previous Revolution, oil was rare and valuable because so many technological advances could be made with it. The race was on to find it.

Now we’re in the era of Big Data and the Data Lake has become the Data Ocean.

It’s turning out that while oil was hard to find, Big Data is overwhelmingly present. And while Big Data also can be used for valuable technological (and other) advances, it’s so ever-present that we need tools to extract the business insights that matter.

Of course, the business insights that matter for a company’s bottom line are the ones that either drive revenue up or drive costs down. 

Intraspexion is one type of New Refinery because we use Deep Learning to monitor yesterday’s emails to find the risks of tomorrow’s lawsuits. We can avoid (or mitigate) those lawsuits only if we see the risks coming over the horizon.

And, like the ballistic missile early warning system (BMEWS) I heard so much about in decades now gone by, I can at least accept the notion that an early warning system of a bad thing is a good thing.  

I think the elements for all New Refineries are three: hardware, software, and applications that provide valued insights (which we call “business intelligence”).

Currently, the hardware consists of computers that combine Central Processing Units (CPUs) with Graphics Processing Units (GPUs), and which are optimized to run Deep Learning (DL) algorithms for applications (A) that matter.

To summarize: the New Refineries consist of CPUs + GPUs + DL + As.

We’ve barely scratched the surface of this Fourth Industrial Revolution. I’ll write about the phase I see on the horizon and give you an “early warning” about it in my next post.