Prediction

“Prediction takes information you have … and uses it to generate information you don’t have.”

Ajay Agrawal, Joshua Gans, and Avi Goldfarb, Prediction Machines: The Simple Economics of Artificial Intelligence (Boston: Harvard Business Review Press, 2018), at 24.

what’s your risk of litigation?

You’re in the Corporate Legal Department. As things stand now, you can only manage the lawsuits that come in the door. Even when you manage your litigation as efficiently as possible, that’s all you can do. You’re closest to the risky emails, but there are too many of them. There’s no way for you to see the litigation risks in time to avoid the lawsuits.

So you’re stuck in reactive mode.

intraspexion uses deep learning

But what if Intraspexion’s software system could surface for you only a few candidates for the risky or “smoking gun” emails, and bring them to your attention before they show up in a lawsuit?

That’d be a game-changer for you. You’d be using your education and training to make the calls. You’re the human-in-the-loop. Now you can be proactive about the risks and drive the frequency (and cost) of litigation down.

That’s the service we provide. We help you to see the risks, and we use AI in the form of “deep learning” to do it. 

let’s go from precedents in law and patterns in deep learning to predictions

Creating a “deep learning” model for a particular litigation risk requires examples of text that typify a specific classification of litigation. In law school, you learned the big categories: contracts, torts, and so on. Now, when you’re facing a new situation, it’s second nature for you to categorize it. The federal judiciary’s litigation database (PACER) is organized the same way. In the Civil silo of PACER, there are many business-relevant categories, and each of them has an associated Nature of Suit code.

But to cook up a deep learning model for a specific case-type, we need only two basic ingredients: (1) a classification of text, which we’ve already described, and (2) examples of the classification.

With classified data, we’ll create a dataset that’s “positive” for a specific litigation case-type and, for contrast, a “negative” dataset of text that’s unrelated to it. A model trained to tell the two apart is what’s called a binary classifier.
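
For readers who want to see the mechanics, here is a minimal sketch of that idea in Python with Keras. The toy example texts, the tiny architecture, and every variable name are our own illustrative assumptions, not Intraspexion’s actual model or training data.

```python
# A minimal sketch of a binary text classifier: "positive" examples typify
# one litigation case-type (here, employment discrimination), "negative"
# examples are unrelated text. Toy data and architecture are assumptions.
import tensorflow as tf
from tensorflow.keras import layers

# Hypothetical training examples: 1 = related to the risk, 0 = not related.
texts = [
    "they passed her over for the promotion because of her age",   # positive
    "please send the Q3 sales forecast before Friday's meeting",   # negative
]
labels = [1, 0]

# Turn raw email text into padded integer sequences.
vectorizer = layers.TextVectorization(max_tokens=20000, output_sequence_length=100)
vectorizer.adapt(texts)
X = vectorizer(tf.constant(texts))
y = tf.constant(labels)

# A small neural network that outputs a score between 0 and 1.
model = tf.keras.Sequential([
    layers.Embedding(input_dim=20000, output_dim=64),
    layers.GlobalAveragePooling1D(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),   # closer to 1 = "related"
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, verbose=0)
```

The point of the sketch is simply that the model learns from labeled examples of the case-type, rather than from hand-written keyword rules.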

So now we have a model for a specific classification of risk, e.g., employment discrimination (Nature of Suit code 442). Now what? Answer: we look at yesterday’s batch of emails, index them, and pass them through the classifier. The results go to designated personnel in the law department, who see in the User Interface only the high-scoring emails that pattern-match to employment discrimination.
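
Continuing the same sketch (the 0.9 cutoff, the placeholder emails, and the variable names are assumptions, not Intraspexion’s actual settings), the daily pass might look like this:

```python
# Hypothetical daily pass: score yesterday's emails with the trained model
# and keep only the high scorers for the law department to review.
# The 0.9 cutoff is an assumed threshold, used here only for illustration.
yesterdays_emails = ["...", "..."]   # assumed: text pulled from the email index

scores = model.predict(vectorizer(tf.constant(yesterdays_emails)), verbose=0)
flagged = [
    (email, float(score))
    for email, score in zip(yesterdays_emails, scores[:, 0])
    if score >= 0.9
]
# 'flagged' is the short list shown in the User Interface for human review.
```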

In other words, with Intraspexion you’ll get an early warning of the few emails from yesterday that match up to that risk!

So Intraspexion isn’t about helping you do the things you already do more efficiently. It’s about helping you identify the risks in yesterday’s set of emails so that you can investigate further and, hopefully, nip the risk in the bud.

With Intraspexion, you’re in the prediction business.

what’s the “accuracy” of this pattern-matching?

We ran an experiment with a held-out set of 20,401 emails (from the Enron dataset), and the system output only 25 as being “related” to the risk. That fraction—25 out of 20,401—is equal to about one-eighth of one percent, i.e., 0.0012.

Let’s see how that plays out at scale. An attorney for an NYSE company once told us that it was typical for the company to handle two million emails per month. So start with 2,000,000 emails per month. Divide by 4.3 weeks per month and you get about 465,116 emails per week. Divide that by 5 workdays per week and you get about 93,023 emails per day.

At that point, without Intraspexion, you’d stop and say, “You want me to look at 93,023 emails per day? Yikes. I’d need an army.”

But let’s continue the calculation. The number of emails the Intraspexion system would surface as related to the risk is 93,023 multiplied by 0.0012, or about 112 emails per day. Now that’s doable. That’s the power of a deep learning model for a risk.
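
If you want to check the arithmetic, it fits in a few lines; the 0.0012 rate is the rounded 25-out-of-20,401 fraction from the experiment above.

```python
# Back-of-the-envelope arithmetic from the text; 0.0012 is the rounded
# 25 / 20,401 fraction from the Enron experiment described above.
emails_per_month = 2_000_000
emails_per_week = emails_per_month / 4.3      # about 465,116
emails_per_day = emails_per_week / 5          # about 93,023

flag_rate = 0.0012
flagged_per_day = emails_per_day * flag_rate  # about 112 emails per day

print(round(emails_per_day), round(flagged_per_day))   # 93023 112
```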

So give us a try. Request a demo.