From the courtroom to the workplace, important decisions are increasingly being made by so-called "automated decision systems". Critics claim that these decisions are less scrutable than those made by humans alone, but is this really the case? In the first of a three-part series, Jasmine Leonard considers the issue of algorithmic bias and how it might be avoided.
Recent advances in AI have a lot of people worried about the impact of automation. One automatable task that’s received a lot of attention of late is decision-making. So-called “automated decision systems” are already being used to decide whether or not individuals are given jobs, loans or even bail. But there’s a lack of understanding about how these systems work, and as a result, a lot of unwarranted concerns. In this three-part series I attempt to allay three of the most widely discussed fears surrounding automated decision systems: that they’re prone to bias, impossible to explain, and that they diminish accountability.
Before we begin, it’s important to be clear just what we’re talking about, as the term “automated decision” is incredibly misleading. It suggests that a computer is making a decision, when in reality this is rarely the case. What actually happens in most examples of “automated decisions” is that a human makes a decision based on information generated by a computer. In the case of AI systems, the information generated is typically a prediction about the likelihood of something happening; for instance, the likelihood that a defendant will reoffend, or the likelihood that an individual will default on a loan. A human will then use this prediction to make a decision about whether or not to grant a defendant bail or give an individual a credit card. When described like this, it seems somewhat absurd to say that these systems are making decisions. I therefore suggest that we call them what they actually are: prediction engines.
How can we assess a prediction engine? By assessing the quality of the predictions that it generates. Luckily, this does not require any understanding of the algorithm that generates the predictions; it merely requires knowledge of what those predictions were and whether the predicted outcomes occurred or not. For instance, you don’t need to understand how Google’s search algorithm works to assess whether or not the results it returns are relevant to your search query. Indeed, if Google started returning random search results – articles about the works of Shakespeare when you searched for “pizza restaurants near me”, say – you’d probably notice pretty quickly and stop using it in favour of a more useful search engine.
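To make this concrete, here is a minimal sketch of such a black-box assessment. Everything in it – the records, the probabilities, the 0.5 cut-off – is invented for illustration; all you need from the real system is each prediction and whether the predicted outcome actually happened.

```python
# Illustrative sketch: assessing a prediction engine as a black box.
# We only need its predictions and the observed outcomes, not its code.
# The records below are hypothetical.

records = [
    # (predicted_probability_of_event, event_actually_occurred)
    (0.9, True), (0.8, True), (0.7, False), (0.6, True),
    (0.4, False), (0.3, False), (0.2, True), (0.1, False),
]

# Treat a predicted probability above 0.5 as a prediction that the event will occur.
predictions = [p > 0.5 for p, _ in records]
outcomes = [occurred for _, occurred in records]

correct = sum(pred == actual for pred, actual in zip(predictions, outcomes))
accuracy = correct / len(records)
print(f"Accuracy: {accuracy:.0%} ({correct}/{len(records)} predictions correct)")
```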
And so we come to the first fear about prediction engines: that they’re prone to bias. Fundamentally, biased predictions are inaccurate predictions. Specifically, they’re inaccurate in a consistent way, because they’re generated by a system that consistently over- or under-weights the importance of certain factors. This was the case with the infamous COMPAS system, which was (and still is) used by US courts to predict the risk that defendants would reoffend. COMPAS, it transpired, was more frequently overestimating the likelihood that black defendants would reoffend and underestimating it for white defendants. In other words, it was biased because it inadvertently placed too much weight on race.
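The same black-box approach is what surfaced the COMPAS disparity: compare error rates across groups. The sketch below uses invented records (not COMPAS data) and a hypothetical pair of groups, but it shows the shape of the check – here, the false positive rate, i.e. how often people who did not reoffend were nonetheless flagged as high risk.

```python
# Illustrative sketch with invented records (not COMPAS data):
# each record is (group, predicted_high_risk, actually_reoffended).
records = [
    ("A", True,  False), ("A", True,  True), ("A", True,  False), ("A", False, False),
    ("B", False, True),  ("B", False, False), ("B", True,  True), ("B", False, True),
]

def false_positive_rate(rows):
    """Share of people who did NOT reoffend but were still flagged as high risk."""
    non_reoffenders = [r for r in rows if not r[2]]
    flagged = [r for r in non_reoffenders if r[1]]
    return len(flagged) / len(non_reoffenders) if non_reoffenders else float("nan")

for group in ("A", "B"):
    rows = [r for r in records if r[0] == group]
    print(f"Group {group}: false positive rate = {false_positive_rate(rows):.0%}")

# A consistently higher false positive rate for one group is the kind of
# systematic over-weighting of a factor described above as bias.
```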
COMPAS failed to do the one thing it was supposed to do: accurately predict whether or not a defendant would reoffend. It was therefore a bad product and ought not to have been used by the courts to guide sentencing decisions. But there’s no reason to think that, just because COMPAS was biased, all prediction engines are biased. In fact, compared to humans, prediction engines are inherently less prone to bias: computers don’t suffer from the cognitive biases that we humans must constantly fight against. Indeed, many computer biases only exist because their algorithms are designed by (biased) humans or trained on data produced by (biased) humans.
From a practical standpoint, it’s also far easier to spot bias in a computer system than it is in a human. For a start, computers can typically generate many more predictions than a human can in a given space of time, so it’s far easier to gather enough sample predictions from which to identify any bias. Computers are also more consistent than humans, so whilst a human’s bias might only be noticeable in some of their decisions, a computer’s bias will be visible across all its predictions where the biased factor is in play.
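As a rough illustration of how that volume helps (the error counts below are hypothetical), a standard two-proportion z-test can indicate whether an error-rate gap between two groups is likely to be systematic rather than noise:

```python
# Illustrative sketch: with enough predictions, a simple two-proportion z-test
# can flag whether an error-rate gap between two groups looks systematic
# rather than random. The counts below are hypothetical.

from math import sqrt, erf

def two_proportion_z_test(errors_a, n_a, errors_b, n_b):
    """Return (z statistic, two-sided p-value) for a difference in error rates."""
    p_a, p_b = errors_a / n_a, errors_b / n_b
    pooled = (errors_a + errors_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # normal approximation
    return z, p_value

# Suppose the system made 2,000 predictions per group and we count its errors:
z, p = two_proportion_z_test(errors_a=300, n_a=2000, errors_b=220, n_b=2000)
print(f"z = {z:.2f}, p = {p:.4f}")  # a small p-value suggests a systematic gap
```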
Given all this, it seems to me that worries about bias ought not to deter us from using prediction engines to aid our decision-making. We just need to apply some common sense when adopting them and, as with any new technology, test them before we use them in real-life situations. This does not require access to a system’s source code or the data on which it was trained; it just requires someone to compare its predicted outcomes with the real outcomes. And if a system fails to predict outcomes with sufficient accuracy, or displays biases in the predictions that it makes, we shouldn’t use it.
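A pre-deployment check along these lines could be as simple as the sketch below; the threshold values and figures are arbitrary examples, not recommendations.

```python
# Illustrative pre-deployment gate: accept a system only if its overall accuracy
# is high enough and its error rates don't diverge between groups by more than
# a chosen tolerance. All thresholds and figures here are arbitrary examples.

def acceptable(accuracy, group_error_rates, min_accuracy=0.85, max_gap=0.05):
    gap = max(group_error_rates.values()) - min(group_error_rates.values())
    return accuracy >= min_accuracy and gap <= max_gap

print(acceptable(0.88, {"group A": 0.17, "group B": 0.11}))  # False: gap too large
print(acceptable(0.88, {"group A": 0.12, "group B": 0.11}))  # True
```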
Jasmine currently provides programme support to the RSA’s Forum for Ethical AI, a new programme exploring the ethics of AI with citizens. She is a former software developer, tech founder and philosopher.
Related articles

- Artificial intelligence in the NHS: fad – or the future? (Asheem Singh, Jake Jooshandeh) – In partnership with NHSX we deliberated with healthcare professionals and tech specialists on artificial intelligence in our NHS.
- We need to talk about artificial intelligence (Asheem Singh) – Asheem Singh reflects on the RSA’s deliberative democratic experiment at the intersection of technology and society: the citizen-driven ‘Forum for Ethical AI’.
- The ethics of ceding more power to machines (Brhmie Balaram) – We make the case for engaging citizens in the ethics of AI and share a snapshot of public attitudes towards AI and automated decision-making.
Comments
I agree with the concerns expressed below. If the system uses a fairly simple algorithm then it should be possible to check any decision that is challenged and to understand the algorithm and the data that was used and to see whether there is bias or error.
But if the system uses machine learning then these checks cannot be made other than statistically, which raises the questions of (a) how much testing would provide adequate evidence of absence of bias, and (b) whether the test data is sufficiently relevant to the particular decision that has been challenged.
In general, neither question is answered adequately. So an automated system may be behaving unlawfully. How does society enforce social laws under these circumstances?
And then, of course, there are the requirements of GDPR - which are probably worth a separate article.
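On question (a) in the comment above – how much testing would count as adequate evidence – here is a rough, purely illustrative sketch using a standard two-proportion power calculation; the error rates, significance level and power are hypothetical choices, not recommendations.

```python
# Rough illustration of question (a): how many test predictions per group would
# be needed to detect a given gap in error rates? Standard two-proportion power
# calculation; the rates, alpha and power below are hypothetical choices.

from statistics import NormalDist

def samples_per_group(p1, p2, alpha=0.05, power=0.8):
    """Approximate sample size per group to detect a gap between error rates p1 and p2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2

# Detecting a 15% vs 10% error-rate gap needs roughly this many cases per group:
print(round(samples_per_group(0.15, 0.10)))  # about 680 per group under these assumptions
```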
As a 'punter' it's easy to assume the originators of the program involved wanted an unbiased process for their audience. This might not be the case, of course, and some deliberate manipulation may be going on, for commercial or political reasons perhaps. Or it may be an innocent case of the originators missing something, because designers always face the difficult task of imagining all the ways a thing or a system will be used once it has left the design lab and is on the street, so to speak.
So while the head and board of the commissioning organisation need to rely on the integrity of their IT people, the end user of the data produced cannot be sure that the system they have drawn information from is unbiased and not skewed in some way. The reputation of the organisation is very often all the punter has to go on, and as we have seen a number of times recently, big organisations can be as subject as anyone to misuse and mishandling. VW and the major car manufacturers are an example, with their emissions data.
This opens up big questions of trust, and a lack thereof, as well as reputational risk considerations. But the real point is that the user can never be sure that the system they are relying on is as good as it says it is. How are we supposed to know? Even if you were able to write it yourself, you are then in a position where unconscious bias may have crept in because of some assumption that does not apply in 100% of cases. Pareto's rule kicks in!
Thank you Brendan - you elevated this article into something worth reading.
I don't want to dominate the debate, but there's another perspective on the emergence of the debate on algorithmic fairness at https://medium.com/@QuantumBlack/live-from-sxsw-tackling-fairness-gender-diversity-algorithmic-52d96c48590b
From SXSW: Tackling Fairness: Gender, Diversity & Algorithmic
Data Scientist, Ines Marusic, joined a panel at The Female Quotient’s Girls Lounge to discuss how Artificial Intelligence (AI) is changing the game as well as perceptions of women in STEM. Here Ines expands on her views on this important topic.
I appreciate the RSA’s focus on this complex and challenging area and look forward to the following articles. As someone who works in the sector I am far from convinced that the article allays my fears. A few points to stimulate the debate:
I think the three main issues are data, data ownership & transparency:
• Garbage In, Garbage Out (GIGO) – how do we know that the data these systems use to predict is accurate, complete or timely? Or that any inherent structural bias has been identified and resolved? Very few datasets are as accurate or complete as one would hope or believe, and a lot of human and machine manipulation is required to ‘clean’ data for use by algorithms. Failure to do this can be catastrophic for individuals, as with COMPAS in the article. For example, how biased are the data collected by police, based on those they arrest, if uncorrected human bias towards arresting minorities permeates the historical dataset? And what does ‘more accurate’ mean – more closely aligned with prevailing biased policing, or with some hypothetical unbiased version of it? Some US courts still use COMPAS and can probably hide behind its biased decisions – a great example of the double-edged nature of such systems.
• Ownership – more and more data are concentrated in fewer commercial companies, often outside a government’s jurisdiction or international conventions. As individuals, we are largely ignorant of the value of our data and how it is used, both for profit and by government. We too willingly cede ownership of our fragmented data, as the alternative is isolation. Government and regulators are struggling to stay in touch, let alone catch up. We are well on the way to a world where ‘everything that can be digital is digital’ (the economics are compelling), but this is not a simple evolution of ‘what is’; it is a transformational shift that concentrates data into fewer hands, and those hands have no incentive to make their algorithms transparent or public – far from it. Netflix publicly claims its algorithms are worth £1 billion; Domino’s Pizza says it is in the data business, not the pizza business. The world’s top data scientists earn seven-figure salaries, and top data science graduates can demand $100k packages. There is a reason for this, and it’s not altruistic.
• Transparency – yes, algorithms can be transparent and explained, but they are not, and for the sound financial reasons stated above. Algorithms are mostly commercial intellectual property, and developers and owners have neither an incentive to make them transparent nor regulatory pressure to do so; Google’s search is a good example. Google search may return seemingly relevant articles, but how do I know what it is not returning, or why? Or why articles appear where they do on the page? Google is the subject of a legal battle with the EU for this very reason: prioritising its own or advertisers’ content. In the UK a shopkeeper is not obliged to serve a customer or to provide a reason for not serving them. There may be good reasons for this; I simply don’t know. But in a world where most things are digital, and Google dominates search, would this lack of a need for an explanation still make sense? Or does it create an opportunity for commercial abuse, malpractice or discreet discrimination, as the EU believes is the case with Google? Google has multiple biases, so the chance of anyone spotting what they are from results alone is close to impossible, and I suspect it is a struggle for the EU. As an individual, I have no access to Google’s data or algorithms; I am a slave to its results, and impotent (PS: try DuckDuckGo as an alternative!).
I’m afraid the conclusion to ‘test them before we use them in real-life situations’ is perhaps naïve and simplistic given their complexity, ownership and lack of transparency. Who will do this, when, and how? We will need to do better.