
When you’re driving down the street in your car (or it’s driven by A.I., if you’re reading this in the near future) and you want to know if you’re speeding, you look at the speedometer. Pretty simple.

That’s a measurement of a rate (mph or km/h), and a really important one at that. If you’re speeding, you want to know. And generally, we trust the speedometer to be accurate.

A.I. systems have similar measures. These are the numbers that we produce to help us understand how our models are performing. These metrics help us make all sorts of decisions:

  • Is the new model “better than” the old model? If so, it may be time to test the new model in production.
  • Is the current model still performing well? Drift or changes to input data can have big effects on revenue; we monitor metrics to identify and correct for drift early on.
  • Do we have the necessary data to predict churn or fraud? Metrics tell us how accurately we can predict such events.

Ok, but what does this have to do with driving a car and reading the speedometer?

Say you’re speeding down the freeway at 90 mph and the speed limit is 70 mph. Well, you’re in a prime position to get a ticket. Is it your fault? Yes, of course it is! You were speeding. But what if your speedometer said you were going 70 mph? That doesn’t matter, it’s still your fault. You made a mistake when you trusted your speedometer.

Specifically, you accepted a false negative answer to the question, “Am I speeding?” The speedometer said “no,” you believed it, and it was wrong.

What’s the penalty for a false negative like this? Well, probably a few hundred dollars and maybe a weekend day in traffic school! No thank you!

Could you go back to the car manufacturer and demand an explanation? Of course you could. And if they sent a lot of cars onto the streets that made errors like these, they would certainly hear about it. In other words, the cost of false negatives is very real for car manufacturers too.

What about the other way around: false positives? In the car example, a false positive to the question, “Am I speeding?” would be a “yes” when you were actually driving at or below 70 mph. No big deal. You’d simply slow down. The consequence here is entirely different.

Now we have enough information to put together a compelling way to think about how accurate speedometers need to be.

As a driver, I would take a higher false positive rate if it meant keeping my false negative rate VERY low. In practice this would mean that occasionally I might buy a car that shows I’m driving 72, when I’m really only going 70. In this case I would slow down a bit, but would never be in danger of a ticket. I would be willing to accept that tradeoff much more often than the other case; only very rarely would I want to have a speedometer that says I’m going 68 mph when I’m really going 75 mph.
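
To make the four possible outcomes concrete, here is a minimal Python sketch that labels a single speedometer reading against the 70 mph limit from the example. The function name, and the rule that “speeding” means strictly above the limit, are assumptions made for illustration.

```python
SPEED_LIMIT_MPH = 70  # the limit used in the freeway example


def classify_reading(displayed_mph, actual_mph, limit=SPEED_LIMIT_MPH):
    """Label one reading for the question "Am I speeding?".

    Assumption: "speeding" means strictly above the limit.
    """
    says_speeding = displayed_mph > limit  # what the speedometer tells you
    is_speeding = actual_mph > limit       # what is really happening
    if says_speeding and is_speeding:
        return "true positive"
    if says_speeding and not is_speeding:
        return "false positive"
    if not says_speeding and is_speeding:
        return "false negative"
    return "true negative"


print(classify_reading(70, 90))  # the ticket scenario: false negative
print(classify_reading(72, 70))  # reads a little high: false positive
print(classify_reading(68, 75))  # reads low while speeding: false negative
```

The first and third readings are the expensive kind of error; the second only costs you a little speed.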

The tradeoff I’m discussing here is illustrated beautifully in a plot called a “receiver operating characteristic curve” (no one calls it that; it’s abbreviated to ROC curve and pronounced like the word “rock”). The curve plots the true positive rate against the false positive rate at every possible decision threshold. Because the false negative rate is simply one minus the true positive rate, choosing a point on the curve (the operating point) fixes both error rates at once: data scientists can start from an acceptable false negative rate and read off the false positive rate that comes with it.
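
As a rough sketch of how this works, the Python below computes the points of a ROC curve by hand and then picks an operating point from a tolerated false negative rate. The labels and scores are made-up toy data, the function names are hypothetical, and the code assumes the data contains at least one positive and one negative case. In practice a library routine would usually do the first step.

```python
def roc_points(labels, scores):
    """Return (threshold, fpr, tpr) for each distinct score used as a threshold.

    A case is predicted positive when its score is >= the threshold.
    Assumes labels are 0/1 and both classes are present.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((threshold, fp / negatives, tp / positives))
    return points


def pick_operating_point(points, max_fnr):
    """Lowest-FPR point whose false negative rate (1 - TPR) is within max_fnr."""
    acceptable = [p for p in points if 1 - p[2] <= max_fnr]
    return min(acceptable, key=lambda p: p[1]) if acceptable else None


# Toy data (illustrative only): 1 = positive case, higher score = more suspicious.
labels = [1, 1, 0, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]

points = roc_points(labels, scores)
chosen = pick_operating_point(points, max_fnr=0.25)
print(chosen)  # (0.6, 0.25, 0.75): threshold, false positive rate, true positive rate
```

Tightening the tolerance to max_fnr=0.0 moves the operating point to a threshold of 0.4, where the false positive rate doubles to 0.5. That is the tradeoff the curve makes visible.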

Generally, the operating point is determined by product leaders, as it has a lot to do with business risk (think about the risk to an automobile manufacturer when its speedometers read too low for the actual speed).

We’ve used ROC analysis often. One example is fraud detection for financial institutions where a false positive means calling a customer or sending an SMS when there was no fraud event.

That’s an inconvenience, but compared to a false negative (when fraud happens and it is not detected), it’s a lot less expensive for the bank. In this case, banks are usually willing to send out a few unwarranted messages to customers rather than lose money.
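
One way to reason about that choice is to attach a cost to each kind of error and compare candidate operating points by expected cost per transaction. The sketch below does this in Python. Every number in it (the fraud rate, the dollar costs, the candidate points) is invented for illustration, and the function names are hypothetical.

```python
def expected_cost(fpr, tpr, fraud_rate, cost_fp, cost_fn):
    """Expected cost per transaction at one operating point.

    False positives occur on legitimate transactions; false negatives
    (missed fraud) occur on fraudulent ones, at a rate of 1 - tpr.
    """
    false_positive_cost = (1 - fraud_rate) * fpr * cost_fp
    false_negative_cost = fraud_rate * (1 - tpr) * cost_fn
    return false_positive_cost + false_negative_cost


def cheapest_operating_point(candidates, fraud_rate, cost_fp, cost_fn):
    """Return the (fpr, tpr) pair with the lowest expected cost."""
    return min(
        candidates,
        key=lambda p: expected_cost(p[0], p[1], fraud_rate, cost_fp, cost_fn),
    )


# Hypothetical candidate points read off a ROC curve: (fpr, tpr).
candidates = [(0.01, 0.60), (0.05, 0.85), (0.20, 0.98)]

# Hypothetical costs: an unnecessary SMS costs $1, a missed fraud costs $500.
best = cheapest_operating_point(candidates, fraud_rate=0.01, cost_fp=1.0, cost_fn=500.0)
print(best)  # (0.2, 0.98)
```

With a missed fraud this much more expensive than an unnecessary message, the cheapest point is the one with the most false positives. If the two costs were equal, the same calculation would favor the strictest point, (0.01, 0.60).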

You can read more here about ROC curves, the importance of picking the right operating point, and why doing so is the job of product leaders.

Of Interest

Biased Algorithms Are Easier to Fix Than Biased People
In one study published 15 years ago, two people applied for a job. Their résumés were about as similar as two résumés can be. One person was named Jamal, the other Brendan. In a study published late last year, two patients sought medical care. Both were grappling with diabetes and high blood pressure. One patient was black, the other was white. Both studies documented racial injustice: In the first, the applicant with a black-sounding name got fewer job interviews. In the second, the black patient received worse care. It’s easier for analytics professionals to de-bias data and generate race-independent predictions than it is for individuals to change their unconscious bias.

Using Algorithms to Understand the Biases in Your Organization
There’s no doubt there’s bias in A.I. algorithms. This article talks about using A.I. to identify that bias. Organizations should use statistical algorithms for the magnifying glasses they are: Algorithms can aggregate individual data points with the purpose of unearthing patterns that people have difficulty detecting.

Soft Skills for Data Science
Success as a data scientist takes a lot of math and analytical skill. That’s fairly well accepted. True success, however, requires a few soft skills as well. In this article, you will see how something like skepticism can be essential to succeeding as an analytics professional.