Tuesday, January 29, 2019

Can artificial intelligence correct the problems of artificial intelligence?

First, let me say that the views in this post are my own and not necessarily the views of my employer.

Why the caveat?

Because my employer, like many, is doing things in the artificial intelligence arena.

By Grafiker61 - Own work, CC BY-SA 4.0, Link

Because of this, I make a point of monitoring discussions of artificial intelligence - especially recent discussions regarding possible bias in artificial intelligence algorithms. And while I'm not going to touch [REDACTED] topic, I'll take a moment to highlight this discussion from the MIT Technology Review:

Risk assessment tools are designed to do one thing: take in the details of a defendant’s profile and spit out a recidivism score—a single number estimating the likelihood that he or she will reoffend. A judge then factors that score into a myriad of decisions that can determine what type of rehabilitation services particular defendants should receive, whether they should be held in jail before trial, and how severe their sentences should be. A low score paves the way for a kinder fate. A high score does precisely the opposite.

By RoyHalzenski - Taken by RoyHalzenski (Myself), Public Domain, Link

Obviously an important tool - as author Karen Hao notes, the scores can dramatically impact your life. But Hao has one concern.

Modern-day risk assessment tools are often driven by algorithms trained on historical crime data.

If the potential problem with this isn't immediately apparent to you, let me share an example from outside the criminal world - in fact, it's from the hiring world. And this involves a company that's known for its involvement in AI - the company's named for a river or sumfin.

The team had been building computer programs since 2014 to review job applicants’ resumes with the aim of mechanizing the search for top talent...

“Everyone wanted this holy grail,” one of the people said. “They literally wanted it to be an engine where I’m going to give you 100 resumes, it will spit out the top five, and we’ll hire those.”

Sounds great, doesn't it! So what data was used to power that engine?

[The] computer models were trained to vet applicants by observing patterns in resumes submitted to the company over a 10-year period.

So what happens when you use such data to train algorithms? Because the company was based in the Pacific Northwest, presumably the algorithm favored applicants who liked coffee, wore flannel, and didn't have strong suntans. Those issues in and of themselves were not problems. However, the fact that the source data had a lot of successful candidates who were male - something that led the algorithm to prefer male candidates - WAS a problem, and it caused the company to scrap the effort.
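To make the mechanism concrete, here's a toy sketch in Python. Everything in it is made up for illustration: the keywords, the counts, and the learn_hire_rates helper are my own assumptions, not anything from the company's actual system. It just shows how a scorer that learns from past outcomes will faithfully reproduce whatever skew those outcomes contain.

```python
from collections import Counter

# Hypothetical, invented history: (resume keyword, was_hired).
# Each keyword stands in for any resume feature that happens to
# correlate with gender in the historical data.
history = (
    [("proxy_for_male", True)] * 80
    + [("proxy_for_male", False)] * 20
    + [("proxy_for_female", True)] * 5
    + [("proxy_for_female", False)] * 15
)

def learn_hire_rates(records):
    """Return the fraction of past applicants hired for each keyword."""
    seen = Counter(keyword for keyword, _ in records)
    hired = Counter(keyword for keyword, was_hired in records if was_hired)
    return {keyword: hired[keyword] / seen[keyword] for keyword in seen}

# The "model" is nothing more than the historical hire rate per keyword.
rates = learn_hire_rates(history)
for keyword, rate in sorted(rates.items()):
    print(keyword, round(rate, 2))
```

The scorer never sees gender directly. It only sees a keyword, but because the history is lopsided, the keyword ends up carrying the bias.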

The "garbage in, garbage out" issue isn't limited to AI, but it's something that AI needs to address if algorithms are going to work properly. So if you have garbage data, how do you make it ungarbage?

Another part of MIT proposes to solve the AI data bias problem with...AI.

A team from MIT CSAIL is working on a solution, with an algorithm that can automatically “de-bias” data by resampling it to be more balanced.

The algorithm can learn both a specific task...as well as the underlying structure of the training data, which allows it to identify and minimize any hidden biases. In tests the algorithm decreased "categorical bias" by over 60 percent...while simultaneously maintaining the overall precision of these systems.
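I can't reproduce the CSAIL algorithm here, but the general idea of "resampling data to be more balanced" can be sketched in a few lines of Python. Big assumption: this sketch is handed the group label for every record, whereas the quoted description says the MIT algorithm learns the underlying structure of the data on its own. So treat this as a much simpler cousin of the technique, with hypothetical names (balanced_resample, group_of) and made-up data.

```python
import random
from collections import Counter

def balanced_resample(records, group_of, n_samples, seed=0):
    """Draw n_samples records with replacement, weighting each record by
    1 / (size of its group) so every group is equally likely to be drawn."""
    group_sizes = Counter(group_of(r) for r in records)
    weights = [1.0 / group_sizes[group_of(r)] for r in records]
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return rng.choices(records, weights=weights, k=n_samples)

# Hypothetical skewed dataset: 90 records from group "a", 10 from group "b".
records = [("a", i) for i in range(90)] + [("b", i) for i in range(10)]

sample = balanced_resample(records, group_of=lambda r: r[0], n_samples=10_000)
counts = Counter(group for group, _ in sample)
print(counts)  # the two groups should now appear in roughly equal numbers
```

Note the trade-off: the ten records from the small group get drawn over and over, so the resampled set is more balanced but not more informative about that group.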

Now I don't have the scientific smarts to determine if this other MIT group is blowing smoke. And I should mention in passing that I am more concerned about facial recognition than face detection, if you get my drift.

But if we truly can develop algorithms that look at a set of data and normalize it, then the science will take a great step forward.

Assuming, of course, that the bias reduction algorithms are not themselves biased...

By Love Krittaya - Own work, Public Domain, Link