Why would a model be prejudiced against a certain race? Very rarely do people give race as a feature to statistical models to begin with. And even if they did, models do not have human prejudices; they train on actual data and care only about making the most accurate predictions possible.
> and care only about making the most accurate predictions possible
of labels often generated by humans.
Bias can get into a classifier. It can get there through a biased model, but that's very unlikely. Much more likely is that it's trained on biased data. Which is _easy_ to do, even by accident.
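To make that failure mode concrete, here's a toy sketch (everything below is synthetic; feature names like "zip_score" and the bias strength are invented for illustration). The labels come from a biased human reviewer, the protected attribute is never given to the model, and a correlated feature ends up standing in for it anyway:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Synthetic applicants: "group" is the protected attribute, "zip_score" is a
# correlated but ostensibly neutral feature, "skill" drives true qualification.
group = rng.integers(0, 2, n)
zip_score = group + rng.normal(0, 0.5, n)   # correlates with group
skill = rng.normal(0, 1, n)                 # identically distributed in both groups

# Historical labels were assigned by biased humans: equal skill, but group 1
# was marked "qualified" less often.
qualified = (skill + rng.normal(0, 0.5, n) - 0.8 * group) > 0

# Plain logistic regression trained WITHOUT the group feature.
X = np.column_stack([skill, zip_score])
w, b = np.zeros(2), 0.0
for _ in range(1000):                       # simple gradient descent
    p = 1 / (1 + np.exp(-(X @ w + b)))
    g = p - qualified
    w -= 0.5 * (X.T @ g) / n
    b -= 0.5 * g.mean()

p = 1 / (1 + np.exp(-(X @ w + b)))
print("mean predicted score, group 0:", round(p[group == 0].mean(), 3))
print("mean predicted score, group 1:", round(p[group == 1].mean(), 3))
# Group 1 scores lower even though skill is identically distributed, because
# zip_score lets the model reproduce the humans' labeling bias.
```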
Sure, but even then they would be no worse than the humans they replace.
But in most cases, the whole point of using machine learned models is to do better than humans. Like an insurance company using ML to predict how likely a customer is to get into an accident. They aren't going to train the model to mimic actuaries; they have plenty of actual accident data to train it on.
And it is quite possible that males get into accidents much more often than females. But that doesn't mean the model is prejudiced or that it's wrong.
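To be clear about what "just predicting risk" means here, a minimal sketch with made-up numbers: if one group genuinely has a higher accident rate, a model fit on outcome data alone will simply recover that rate and price accordingly; there is no animus anywhere in the estimator.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Made-up base rates: group A genuinely files more claims per policy-year.
is_group_a = rng.random(n) < 0.5
true_rate = np.where(is_group_a, 0.12, 0.08)
had_claim = rng.random(n) < true_rate

# The "model" here is just the maximum-likelihood claim rate per group.
rate_a = had_claim[is_group_a].mean()
rate_b = had_claim[~is_group_a].mean()

avg_claim_cost = 4000   # hypothetical average cost per claim
print(f"group A: rate {rate_a:.3f} -> fair premium ~ {rate_a * avg_claim_cost:.0f}")
print(f"group B: rate {rate_b:.3f} -> fair premium ~ {rate_b * avg_claim_cost:.0f}")
# The premiums differ only because the observed risk differs; whether that
# outcome is acceptable is a policy question, not something the estimator decides.
```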
We can't inspect human brains; they are black boxes. People are incredibly biased, but also mostly unaware of their biases. For instance, judges have been found to give much harsher sentences just before lunch, when they are hungry. Attractive people get much shorter sentences and do better in job interviews, and attractiveness matters even more in waitresses' tips and in elections.
But on top of that, humans almost always do worse than even the simplest statistical baselines. Simple linear regression on a few relevant variables beats human 'experts' 99% of the time. Humans shouldn't be allowed to make decisions at all, yet everyone seems to fear the scary algorithms instead.
https://motherboard.vice.com/read/why-an-ai-judged-beauty-co... “It happens to be that color does matter in machine vision,” Alex Zhavoronkov, chief science officer of Beauty.ai, wrote me in an email, “and for some population groups the data sets are lacking an adequate number of samples to be able to train the deep neural networks.”
Well of course a beauty contest judged by AIs would go horribly wrong. Appearance is highly subjective and arbitrary.
But even so it's not clear their algorithm was the cause of the bias, or that the bias was significant. For instance, it's possible that black people have slightly worse "facial symmetry" on average, or whatever made-up metric they were using. And even if black people only scored 1% worse on average, the extremes will be dominated by whites, because of the way Gaussian distributions work. So it may appear to be way more biased than it actually is.
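The tail effect is easy to check numerically. A sketch with made-up score distributions (0-100 scale, same spread, group B averaging 1% lower): even that tiny gap noticeably skews who ends up above an extreme cutoff.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000

# Two made-up populations of "beauty scores": same spread, but group B's
# mean is 1% lower than group A's.
scores_a = rng.normal(loc=70.0, scale=5.0, size=n)
scores_b = rng.normal(loc=69.3, scale=5.0, size=n)

# "Winners" are the top 0.1% of the combined pool.
cutoff = np.percentile(np.concatenate([scores_a, scores_b]), 99.9)
top_a = int((scores_a > cutoff).sum())
top_b = int((scores_b > cutoff).sum())
print(f"winners from A: {top_a}, from B: {top_b}, ratio ~ {top_a / top_b:.2f}")
# A ~1% gap in means turns into a visibly lopsided winner list at the tail,
# so the selected extremes look more skewed than the populations actually are.
```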
Even with no bias, uncertainty can cause problems. Say an ML system is tasked with finding the top 10 candidates in terms of "confidence that they will be able to do the job". Then if it has little training data on candidates of a particular class, and those in that class are actually quite different on many different variables (so it can't generalize very well), it may not be able to reach the required levels of confidence for them.
I think this is actually the reason for a lot of accidental discrimination, because human judges would have exactly the same problem.
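Here's a toy sketch of that confidence effect (the scoring model and the "number of similar past cases" are invented; the point is only the mechanism). Both groups have the same true suitability distribution, but the system has seen far fewer cases like group B, so its estimates for them are noisier, and ranking by a lower confidence bound quietly filters them out.

```python
import numpy as np

rng = np.random.default_rng(3)

def estimate(true_score, n_similar_cases):
    """Posterior-style summary: mean near the truth, uncertainty shrinking with data."""
    std = 1.0 / np.sqrt(n_similar_cases)
    return true_score + rng.normal(0, std), std

candidates = []
for group, n_similar in [("A", 400), ("B", 4)]:   # far less data on group B
    for _ in range(500):
        true_score = rng.normal(0, 1)             # same distribution for both groups
        mean, std = estimate(true_score, n_similar)
        lcb = mean - 1.64 * std                   # "how sure are we they can do the job?"
        candidates.append((lcb, group))

top10 = sorted(candidates, reverse=True)[:10]
print("top 10 by confidence:", [g for _, g in top10])
# Group B candidates almost never make the cut -- not because they score worse,
# but because the system can't be confident enough about them.
```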
I remember in school, playing chess a couple of times against a guy fresh from Sudan. He had the most unsettling smile, and played very unorthodox openings. I won some, he won some - but I always suspected he was stronger than me, and just being polite/testing me. It's just impossible to read someone from a culture so different. I'm glad we didn't play poker, to put it like that.
That particular study is pretty bad. Look at the contestants' faces. The system might be failing to identify facial features, or not accounting for lighting, tilt, etc. It's obvious if you look at its estimated ages.
They clearly got the ages wrong; that's not really debatable. Don't take my word for it, though. There are plenty of studies in this space that got wildly different results, some of which mostly agree with hotornot ratings.
It's possible to predict race by looking at several other variables that are correlated with race. See Technical Appendix A and especially Table 8 for an example of predicting ethnicity using surname and state only: http://files.consumerfinance.gov/f/201409_cfpb_report_proxy-...
You can do this deliberately, or your machine learning model might do it spontaneously.
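A very rough sketch of what such a proxy looks like, in the spirit of that appendix: combine a surname-based table with a geography-based table. The probabilities below are invented placeholders (not the census-derived numbers the report uses), and the combination is a crude naive-Bayes-style product rather than the report's exact formula.

```python
# P(ethnicity | surname) -- in practice built from Census surname tables.
p_eth_given_surname = {
    "garcia":     {"hispanic": 0.90, "white": 0.06, "black": 0.04},
    "washington": {"hispanic": 0.02, "white": 0.09, "black": 0.89},
    "miller":     {"hispanic": 0.03, "white": 0.85, "black": 0.12},
}

# P(ethnicity | state) -- in practice from geographic demographics.
p_eth_given_state = {
    "NM": {"hispanic": 0.48, "white": 0.38, "black": 0.14},
    "VT": {"hispanic": 0.02, "white": 0.94, "black": 0.04},
}

def proxy(surname, state):
    """Combine the two weak signals and renormalize into a probability estimate."""
    scores = {}
    for eth in ("hispanic", "white", "black"):
        scores[eth] = (p_eth_given_surname[surname.lower()][eth]
                       * p_eth_given_state[state][eth])
    total = sum(scores.values())
    return {eth: round(s / total, 3) for eth, s in scores.items()}

print(proxy("Garcia", "NM"))   # the two signals reinforce each other
print(proxy("Miller", "VT"))
```

Neither field looks like "race" on its own, but together they pin it down fairly well, and a flexible model can stumble into the same combination on its own.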
I don't deny that, although I don't think anyone uses surnames as a feature; that would be pretty blatant. You also need census data to really make use of that for less common last names. Anyone going out of their way to do this might as well just discriminate directly instead of using this convoluted method.
It may be true that a sufficiently complex machine learned model could learn race as a feature, but again, why would it? It has no prejudice against specific races. Unless you really believe that black people are inherently more likely to get into car accidents, even after controlling for income, education, etc. And even then it doesn't hate black people; it's just doing its best to predict risk as accurately as possible. It's not charging blacks, as a group, a higher rate than they cost, as a group.
I fail to see why people get so upset at the mere possibility of this. I think they are anthropomorphizing AI, as if it were a human bigot with an irrational hatred of other races who strongly discriminates against them for no reason. This is more like giving people who live in neighborhoods with slightly higher accident rates slightly higher insurance rates, to make up for their increased risk. Maybe it correlates with race, maybe it doesn't; it doesn't really matter.
Based on a list of 137 questions[1] the Northpointe system predicts the risk of re-offending, and "blacks are almost twice as likely as whites to be labeled a higher risk but not actually re-offend." Meanwhile, whites are "much more likely than blacks to be labeled lower risk but go on to commit other crimes."
In other words:
- a white person labeled high risk will re-offend 66.5% of the time
- a black person labeled high risk will re-offend 55.1% of the time
- a white person labeled low risk will re-offend 47.7% of the time
- a black person labeled low risk will re-offend 28% of the time.
The model specifically avoids race as an input, but it still overestimates the risk that black defendants will re-offend while underestimating the risk for white defendants.
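For anyone wondering where numbers like those come from, here's how the conditional rates are computed from a per-group confusion matrix. The counts below are invented, chosen only so that the derived rates roughly match the figures quoted above.

```python
# Hypothetical counts: (label given by the model) x (whether the person re-offended).
groups = {
    #          hh = labeled high, re-offended   hn = labeled high, did not
    #          lh = labeled low,  re-offended   ln = labeled low,  did not
    "white": {"hh": 300, "hn": 150, "lh": 430, "ln": 470},
    "black": {"hh": 550, "hn": 450, "lh": 170, "ln": 430},
}

for name, c in groups.items():
    high = c["hh"] + c["hn"]
    low = c["lh"] + c["ln"]
    no_reoffense = c["hn"] + c["ln"]
    print(f"{name}: "
          f"P(re-offend | labeled high) = {c['hh'] / high:.2f}, "
          f"P(re-offend | labeled low) = {c['lh'] / low:.2f}, "
          f"P(labeled high | no re-offense) = {c['hn'] / no_reoffense:.2f}")
# The last column is the false positive rate: with these counts it is roughly
# twice as high for black defendants, even though race never appears as an input.
```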