Intraspexion Receives a Rave Review

Every once in a while, I put “intraspexion” into a search engine and see what comes up.

Today I found an article about Intraspexion I hadn’t seen before.

The title of the article is “Predict the Risk of a Law Suit? Intraspexion’s Deep Learning Model Makes it Possible.”

This article was published on February 21, 2018, and was written by Pranav Dar, an editor at Analytics Vidhya.

According to the Analytics Vidhya website, Pranav Dar is a “Data visualization and Six Sigma practitioner who loves reading and delving deeper into the data science and machine learning arts. Always looking for new ways to improve processes using ML and AI.”

Dar’s article is here:

I’m only going to quote a portion of the article’s last paragraph:

“Intraspexion is as unique as it is transcendent. Their software has the capability of saving millions of dollars for firms and sets the precedent for others to follow.”

From what I see in other articles on Analytics Vidhya's website, they know Deep Learning. So quite apart from the (previously unknown) rave review Intraspexion received, I’m a subscriber now.

Is Intraspexion like "Minority Report" or "WarGames"?

Minority Report, the 2002 sci-fi film directed by Steven Spielberg, comes to mind because the “precogs” were predicting crimes. We use software to “predict” litigation.

In December of 2016, Richard Tromans, who writes a popular blog from London as the Artificial Lawyer, made this connection. He wrote, "Think 'Minority Report,' but using algorithms instead of pre-cogs."

A few months later, Attorney Jeff Cox made that connection too. In an article for the ACC’s Legal Ops Observer in March of 2017, “AI for Legal Ops and Corporate Counsel - The First Wave,” Jeff wrote about several AI startups and covered Intraspexion first. He wrote, “Intraspexion is the Minority Report of litigation.”

When Jeff conducted his pre-article interview, he told me that a reference to Minority Report helped him explain Intraspexion to others, and that it worked for them in a flash of understanding.

But when this question comes up, I also recall the 1983 film WarGames. (Click the title for a third party's YouTube clip of the movie's last few minutes. Ad supported.)

The clip may be the best final four minutes of any movie, but I’ll summarize it for you:

At the end of the movie, a computer named Joshua is playing a game called Global Thermonuclear War, but doesn't realize it's a game.

Towards the end of the film, while Joshua is trying to obtain the President's launch codes, the hero (played by Matthew Broderick) directs it to play tic-tac-toe, and to play against itself in order to learn "futility." Joshua plays tic-tac-toe repeatedly, until it finally "understands" that every scenario produces no winner.

Once it "understands" futility and transfers that learning to the War Game it's been playing (which tech readers may recognize as transfer learning), Joshua utters a memorable line:

"Greetings, Professor Falken. A strange game. The only winning move is not to play." 

The computer then shuts down Global Thermonuclear War and asks, "How about a nice game of chess?"
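As a side note for technical readers, Joshua's "futility" lesson is easy to verify. The sketch below (my illustration, not anything from the film or from Intraspexion's software) is a plain minimax search confirming that perfect play by both sides in tic-tac-toe always produces a draw:

```python
# A minimal check of Joshua's lesson: with perfect play on both sides,
# tic-tac-toe always ends in a draw. Plain minimax, no pruning.

WINS = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
        (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
        (0, 4, 8), (2, 4, 6)]              # diagonals

def winner(board):
    for i, j, k in WINS:
        if board[i] != ' ' and board[i] == board[j] == board[k]:
            return board[i]
    return None

def minimax(board, player):
    """Return +1 if X can force a win, -1 if O can, 0 if best play draws."""
    w = winner(board)
    if w == 'X':
        return 1
    if w == 'O':
        return -1
    if ' ' not in board:
        return 0
    scores = []
    for i in range(9):
        if board[i] == ' ':
            board[i] = player
            scores.append(minimax(board, 'O' if player == 'X' else 'X'))
            board[i] = ' '
    return max(scores) if player == 'X' else min(scores)

value = minimax([' '] * 9, 'X')
print(value)  # 0: every line of perfect play is a draw; the only winning move is not to play
```

The exhaustive search visits every reachable position, so the printed 0 is exactly the "futility" Joshua learns.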

Litigation is like Global Thermonuclear War.

Why? Because even if the company wins in the end, it loses.

Everywhere you look, financial resources are burning.

First, if the company's a net winner, it's unlikely to collect.

Or, if it prevails against its adversary, and owes nothing, the company must still pay the defense attorney fees, expert witness fees, the costs of eDiscovery, and so on.

And if it settles or loses at trial, it must also pay the amount of the settlement or the verdict.

Business leaders have always been smarter than Joshua. They know that the only winning move is to avoid or prevent litigation. Until now, with Intraspexion, there just hasn't been a way to do that.  

What's the Business Case for Preventing Litigation?

In Nick Brestoff's book, Preventing Litigation: An Early Warning System to Get Big Value from Big Data (Business Expert Press 2015) (hereinafter Preventing Litigation), he calculated the average cost per case at $408,000. See Chapter 4 ("What's it worth to prevent a lawsuit?").

The global numbers, for the ten-year period from 2001 through 2010, are as follows:

The “cost” of commercial tort litigation (rounded) is: 

$1.6 Trillion

The number of Federal and State lawsuits (also for 2001-2010) (rounded) is: 

Four Million Lawsuits

These two numbers (reduced by 15 percent and rounded) indicate that the average litigation "cost" per lawsuit is:

$408,000

Intraspexion's Deep Learning patent family ranks high, behind only Google, IBM, Microsoft, and Siemens Healthcare

I wrote a blog on May 18th, which I have now deleted. I deleted it because I had a second thought, prompted by an email from Jagannath Rajagopal, Intraspexion's Chief Data Scientist. He questioned why I had searched the Patent Full-Text and Image Database of the United States Patent and Trademark Office (USPTO) only for "deep learning" using two search categories: Issue Date and Title.

I thought, right, why not Claims? After all, claims are an invention's "metes and bounds."

So I searched for "deep learning" as Term 1 and "Claim(s)" as Field 1. I also searched for each year, beginning with 2013, as Term 2 and "Issue Date" as Field 2.

Subsequently I realized that "deep neural" would be an accurate and potentially broader proxy for "deep neural network(s)." It was. These searches surfaced the 167 patents the USPTO has granted during the last five years (from 2013 to the present). After opening each patent, I could readily see the names of the companies to which the patents were assigned. I used the Quick Search modality, and the ranking below was the result.
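For readers who want to reproduce the search, here is a sketch of how those queries can be composed. The ACLM (claims) and ISD (issue date) field codes come from the public PatFT query syntax; treat the exact strings as my illustration, not the literal queries I ran.

```python
# Sketch: build PatFT-style query strings for the searches described above.
# ACLM = claims field, ISD = issue-date field (public PatFT field codes);
# the exact syntax here is illustrative, not a quote of the actual queries.

def patft_query(term, year):
    """One query: `term` in the Claims field, limited to patents issued in `year`."""
    return f'ACLM/"{term}" AND ISD/1/1/{year}->12/31/{year}'

# One query per search term per year, 2013 through 2018.
queries = [patft_query(term, year)
           for term in ("deep learning", "deep neural")
           for year in range(2013, 2019)]

for q in queries[:3]:
    print(q)
```

Tallying the assignee names on the patents each query returns yields the ranking below.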

I later used the Advanced Search modality and found a variance of one patent each for the top three, but no change in ranking. The results below are as of May 29, 2018.

The Top 5 results were:

1. Google: 23 patents;

2. IBM: 18 patents;

3. Microsoft: 13 patents;

4. Siemens Healthcare GmbH: 10 patents; and

5. Intraspexion: 7 patents.

A detailed spreadsheet based on the Quick Search modality is available upon request. The spreadsheet lists the patents in numerical order by year and the names of the Assignees, and splits the results into two columns: one where Term 1 is "deep learning" and a second where Term 1 is "deep neural."


# # #


Why Enterprise Paralegals Will Be More Powerful Than the Senior Partners in Outside Law Firms

On May 17, 2018, I listened to a talk by Dr. Lewis Z. Liu, the CEO of Eigen Technologies. (His Ph.D. is in atomic and laser physics from the University of Cambridge.)

The point of Dr. Liu's fine talk was to suggest why, with assistance from machines using artificial intelligence in the form of deep learning and natural language processing (NLP), the young associates of the law firms of tomorrow will know more and be more efficient and more productive than the highest-paid senior partners of the law firms of today.

In an equation of sorts, Dr. Liu argued that being AI-enabled means this (using > to mean "greater value than"):

young associates of the law firms of tomorrow > senior partners of the law firms of today

I believe that, but I’ll take Dr. Liu’s contention to an even higher level.

My contention is that, in the future, the paralegals in enterprise Law Departments (whether corporate or governmental), with the assistance of machines using the same tools, e.g., artificial intelligence in the form of deep learning and NLP, will be more efficient and more productive than the most senior partners of outside Law Firms.

Here’s my contention (where >> means “much greater value than”):

paralegals in Law Departments >> senior partners in Law Firms.

Now how am I going to convince you of that outlandish contention?

Let’s start with the paralegal in the enterprise Law Department of today. When a lawsuit is filed, custodians of potentially relevant documents receive litigation hold notices, and collections are assembled into a case-specific corpus of documents.

These collections need to be separated into documents that are irrelevant and need not be produced; documents that are potentially relevant; and documents that may be potentially relevant but are shielded from production by an applicable privilege, e.g., the attorney work-product doctrine or the attorney-client privilege.

In addition, an early assessment of the relevant but non-privileged documents can be done. That assessment may reveal whether risky or “smoking gun” documents are contained in the set that must be produced to an adversary unless otherwise privileged from being disclosed.

This information is useful in that such "smoking gun" documents, if they exist, may be weak, indicating that the lawsuit is defensible, or may be terrifying, suggesting that the case is a looming disaster.

But the context is an already-filed lawsuit. In other words, if it poses a terrible risk, the enterprise is in big trouble. If that’s the nature of the lawsuit, it’s already too late.

Could a paralegal in the Law Department have seen such a risk coming, say by accessing and assessing the “smoking gun” or risky emails or other text-based documents before the situation devolved into litigation?

Before Intraspexion began applying artificial intelligence in the form of deep learning to internal enterprise communications such as emails, the answer was no.

And the reason is simple: the amount of data in yesterday's emails was too large to read, and there were no tools available to do the job.

But suppose such a tool exists. We at Intraspexion have used artificial intelligence in the form of deep learning to create a patented software system for finding specific types of litigation risk in yesterday's emails.

The system is currently trained for the risk of employment discrimination and, after the deep learning analysis “engine” is trained, it is very sharp.

What do I mean by “sharp”? I mean this:

I’ve said that our first classification is “employment discrimination.”

In our white paper, I described our steps for training a machine to “understand” this litigation risk category.

For the "positive" training set, we created examples from the factual allegations in hundreds of previously filed discrimination complaints in PACER, the federal court litigation database, under the classification "Civil Rights-Jobs," which in PACER is Nature of Suit code 442.

Now, for our purposes, we didn't care about the legalese of "jurisdiction and venue," the names of the parties, or the specific claims being made. And it didn't matter whether the discrimination was for age, race, sex, or any other sub-category of discrimination; PACER has no sub-classifications for them.

Next, we created a “negative” set of examples that was “unrelated” to “employment discrimination.” This negative set consisted of newspaper and Wikipedia articles and other text, including emails.

But to the best of our knowledge, there were no "discrimination" articles or emails in these sources for our "negative" examples.
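To make the positive/negative idea concrete, here is a toy sketch. The tiny Naive Bayes scorer below is emphatically not our patented deep learning engine, and the training examples are invented stand-ins; it only illustrates how labeled "related" and "unrelated" text trains a binary classifier.

```python
import math
from collections import Counter

# Toy illustration of the positive/negative training-set idea described above.
# This is NOT Intraspexion's deep learning model; the examples are invented.

positive = [  # stand-ins for factual allegations from discrimination complaints
    "plaintiff was terminated because of her age and sex",
    "employer denied promotion on the basis of race",
    "supervisor made discriminatory remarks and retaliated",
]
negative = [  # stand-ins for unrelated news articles and ordinary emails
    "the quarterly earnings report is attached for review",
    "the team meeting is rescheduled to thursday afternoon",
    "wikipedia article about the history of natural gas pipelines",
]

def word_counts(docs):
    counts = Counter()
    for d in docs:
        counts.update(d.split())
    return counts

pos_counts, neg_counts = word_counts(positive), word_counts(negative)
vocab = set(pos_counts) | set(neg_counts)

def log_odds(text):
    """Positive score => closer to the 'related to discrimination' class."""
    score = 0.0
    for w in text.split():
        if w not in vocab:
            continue  # ignore words never seen in training
        p = (pos_counts[w] + 1) / (sum(pos_counts.values()) + len(vocab))
        n = (neg_counts[w] + 1) / (sum(neg_counts.values()) + len(vocab))
        score += math.log(p / n)
    return score

print(log_odds("she was denied promotion because of age"))    # > 0: flag for review
print(log_odds("please review the attached earnings report")) # < 0: unrelated
```

The point of the sketch is only the shape of the task: a labeled "related" set and a labeled "unrelated" set, and a scorer that separates new text into the two classes.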

After that, we switched from PACER and looked at Enron emails and, to make a long story short, found four (4) examples of true risks for employment discrimination. We found them in the subsets for Lay, Kenneth (Ken Lay was the Chairman and CEO of Enron); Derrick, J.; and a few other former Enron employees.

Now, having found four "true risks," we knew that we had something special. Enron is known for fraud, not employment discrimination. And, as far as we know, no one before us had previously surfaced emails that were about "discrimination." 

Thus, we had successfully trained our Deep Learning model to "learn" the pattern for "discrimination."

Our third step was a benchmarking project we can't discuss.

Then, after that "first light" pilot project, we added 10,000 Enron non-discrimination emails to the unrelated set, so the model could “understand” English in the context of emails.

Then we looked at a held-out set of 20,401 Enron emails that our system had never analyzed previously.

Result: Our "model" called out 25 emails as being "related" to discrimination, and our 4 "needles" were among the 25.

That's 25 out of 20,401 emails, a fraction of 0.001225, which is a little less than one-eighth of one percent.
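For readers who like their numbers in information-retrieval terms, the result can be restated as a flag rate, recall, and precision. Note the assumption: the precision figure treats the four known needles as the only true positives, which may undercount, since some of the other 21 flagged emails could also be genuine risks.

```python
# Worked arithmetic for the result above. Precision assumes the four known
# "needles" are the only true positives, which is an assumption, not a finding.

total_emails = 20_401
flagged = 25
needles_found = 4   # all four known needles were among the 25 flagged

flag_rate = flagged / total_emails   # fraction of the corpus a human must read
recall = needles_found / 4           # found every known needle
precision = needles_found / flagged  # under the assumption above

print(f"flag rate: {flag_rate:.6f} ({flag_rate:.4%})")
print(f"recall:    {recall:.0%}")
print(f"precision: {precision:.0%}")
```

The flag rate is the 0.001225 fraction quoted above; recall on the known needles is 100 percent.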

So if a machine can winnow 20,401 down to only 25 emails that it calls out as “related” to the risk of employment discrimination, that’s great, but then what?

What’s a machine to do? What’s next?

It doesn’t know.

Now I can conclude my argument in three short steps.

  1. Enterprise paralegals by themselves are closest to the risky data but don’t swim well in that ocean. They don’t even bother to look, because the water is so murky.

  2. A computer trained using artificial intelligence in the form of deep learning can find the text related to the risks for which it’s been properly trained, but by itself cannot deal with the results.

  3. But paralegals who receive the results can assess them, open and conduct an internal investigation, and, perhaps after sharing the findings with others, advise a control-group executive about the potentially adverse situation.

Thus, the enterprise may be proactive and avoid the lawsuit altogether.

Senior partners in the very best outside law firms can never hope to be so helpful. They’re not close enough to the risky data when it’s only risky.

Conclusion: the enterprise paralegals of the Law Departments of the Future, enabled by artificial intelligence using deep learning, will be more powerful and beneficial to the enterprise than the outside law firm’s most senior partners.