Could the Ethnicity Penalty Trigger Something Very Big?

One level on which this digital transformation of the law will happen is exemplified by, say, the Rehabilitation of Offenders Act. This allows a rehabilitated offender to no longer disclose certain information about their past. What happens, then, when an insurer’s data and analytics can find it out anyway? There’s a case for changing the law to reflect such new capabilities: a sort of updating of the law.

On another level, data analytics could be in the process of disrupting the foundations of particular laws, such that they need to be fundamentally redesigned for a digital future. Equalities legislation could be one such area of law. And the current campaign about an ethnicity penalty in motor insurance pricing might well emerge as a galvaniser for that redesign.

Food for Thought

I’m going to outline how this could be the case, drawing in large measure upon a research paper by Mireille Hildebrandt, a Professor of Law and Technology at the Vrije Universiteit Brussel in Belgium. The paper is ‘Discrimination, Data-driven AI Systems and Practical Reason’ and can be found here.

This paper came out a year ago and when I first read it, one thing struck me immediately. The example upon which much of the paper is based is insurance. And the various points it raises anticipate pretty closely the findings of Citizens Advice’s research into ethnicity and insurance pricing. It’s one of those examples of how insurance has become the case study par excellence for research into the issue of bias in digital decision-making systems.

What I’m going to set out here is very much in outline and based upon a simplified synopsis of Prof. Hildebrandt’s paper. So if your interest is piqued, I would very much recommend reading her paper in full.

The Distinction between Direct and Indirect Discrimination

We know that the law handles direct and indirect discrimination differently. To this can be added how the law handles insurance differently from other goods and services. This means that in some circumstances and for some protected characteristics, it could be lawful for insurers to directly discriminate. And in other circumstances with other protected characteristics, it could be sometimes lawful, sometimes unlawful, for insurers to indirectly discriminate.

A key point put forward by Prof. Hildebrandt is that AI systems are reducing that distinction between direct and indirect discrimination. The starting point for this is the way in which proxies are at the heart of machine learning systems.

“Data-driven AI systems are based on machine learning (ML), which develops mathematical functions that correlate input data with output data. For instance, data on residence, life-style, health records or educational background may correlate in different ways with claim-behaviour, thus offering an assessment of the risk for an insurance company. This assessment will influence decisions on acceptance and on price. The point is that ML systems necessarily work with datified proxies (variables), because computational systems cannot work with concepts like ‘health’ or ‘education’. They require formalisation in the form of discrete variables, for instance ‘health’ can be replaced by a preformatted listing of health conditions. If such a variable correlates with a prohibited ground of discrimination, this may be an indication of bias in the sense of indirect discrimination. The problem is that this bias can only be established if the correlation with prohibited grounds is known, whereas the collection of sensitive data needed to check for such bias may not be available (collecting such data may even be prohibited to prevent direct discrimination).”
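To make that last point concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the data is synthetic, and the names `group`, `area_score`, `simulate` and `audit` are hypothetical rather than taken from Prof. Hildebrandt’s paper or from any insurer’s pricing model. The pricing rule only ever sees the proxy, yet the audit shows a premium gap between groups. Notice that the audit itself needs the protected characteristic in order to run, which is exactly the data that may not be available.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=5000, seed=0):
    """Synthetic policyholders: (group, area_score, premium). Illustrative only."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # Hypothetical protected characteristic; never passed to the pricing rule.
        group = 1 if rng.random() < 0.3 else 0
        # Hypothetical proxy, e.g. an area-level score, higher on average for group 1.
        area_score = rng.gauss(0.6 if group else 0.4, 0.1)
        # The pricing rule uses the proxy and nothing else.
        premium = 400 + 500 * area_score
        rows.append((group, area_score, premium))
    return rows


def audit(rows):
    """Check the proxy against the protected characteristic (requires that data)."""
    groups = [r[0] for r in rows]
    scores = [r[1] for r in rows]
    premium_group_1 = mean(r[2] for r in rows if r[0] == 1)
    premium_group_0 = mean(r[2] for r in rows if r[0] == 0)
    return {
        "proxy_correlation": pearson(groups, scores),
        "premium_gap": premium_group_1 - premium_group_0,
    }


if __name__ == "__main__":
    print(audit(simulate()))
```

Drop the `group` column from the rows and `audit` has nothing to work with, even though the premium gap is still there.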

Less Clarity, Not More?

The more variables you collect, the more complex your algorithms have to become in order to find the mathematical relationships among them all. A massive number of variables results in algorithms so complex and fluid that the question of which variable is a proxy for which other variable becomes unclear. As machine learning drives this sophistication ever further, the relationship between protected grounds of discrimination and these highly complex combinations of variables becomes ever less clear.

Bring together two things. Firstly, all data is a proxy for the outcome your decision is looking for. Secondly, massive combinations of variables create highly detailed identities for you. Add in the likelihood that variables for prohibited discrimination may not be collected. What this could amount to is the eroding of the distinction between direct and indirect discrimination.
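A rough way to picture this is the sketch below, again in Python and again on purely synthetic data. The assumptions are mine: 40 hypothetical variables, each only weakly related to a protected characteristic, with the names `simulate` and `proxy_strength` invented for the illustration. No single variable looks like much of a proxy, but a simple combination of them tracks the protected characteristic far more closely.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n_people=4000, n_proxies=40, shift=0.3, seed=1):
    """Synthetic data: each variable is shifted only slightly for group 1."""
    rng = random.Random(seed)
    groups, proxies = [], []
    for _ in range(n_people):
        group = rng.randint(0, 1)  # hypothetical protected characteristic
        groups.append(group)
        proxies.append([rng.gauss(shift * group, 1.0) for _ in range(n_proxies)])
    return groups, proxies


def proxy_strength(groups, proxies):
    """Return (strongest single-variable correlation, correlation of the sum)."""
    n_proxies = len(proxies[0])
    single = [
        abs(pearson(groups, [row[j] for row in proxies]))
        for j in range(n_proxies)
    ]
    combined = abs(pearson(groups, [sum(row) for row in proxies]))
    return max(single), combined


if __name__ == "__main__":
    strongest_single, combined = proxy_strength(*simulate())
    print(f"strongest single variable: {strongest_single:.2f}")
    print(f"all variables combined:    {combined:.2f}")
```

A plain sum is about the crudest combination possible. A machine learning system searching across many more variables has far more ways of arriving at something similar, without anyone having labelled it as a proxy.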

Discernment and Discretion

As these many proxies are brought together in myriad combinations, what then matters is finding the ‘difference that makes a difference’. In other words, distinguishing what matters from what doesn’t matter. This is not a matter of mathematical accuracy. It is a matter of judgement and is something we do all the time, at work and at home, as individuals and as businesses.

We know that equality law here in the UK allows insurers to indirectly discriminate where it amounts to a proportionate means of achieving a legitimate aim. You can see the judgement inherent in that estimation of proportionality and legitimacy. So the question then becomes this. How do we discern the difference that makes a difference between the way in which AI systems discriminate, and the way in which society distinguishes between lawful and unlawful discrimination?

And right behind this come two further questions. What do we accept as evidence of that difference? And what discretion do we allow, in our case, insurers in doing this? The answers to these lie not in logic or accuracy, but in judgement and reason.

Fairness

You can see here how fairness is present in the warp and weft of all this. What the ethnicity penalty is all about is the question of how we design automated decision systems so that they avoid both the direct and indirect exclusion of people from what they deserve, desire and need.

As we enter into a world of decisions based upon huge lochs of micro-proxies, leveraged by ever more complex systems, how do we ensure that the default for people with protected characteristics is inclusion, rather than the exclusion that seems too often to be the case now? Can existing equalities legislation deliver this? What Prof. Hildebrandt suggests is that there are some significant questions hanging over this.

So while we can see Citizens Advice’s ethnicity report as something for the here and now (and importantly so), it may also represent something more significant in the longer term. And perhaps Citizens Advice realises this, for they very clearly say that this insurance angle to their research is just the start of a longer campaign.

A Can of Worms?

Does this mean that the ethnicity report is part of a much larger and more complex picture than it presents? Yes and no. Yes, in that it cannot be seen in isolation from that bigger picture. And no, in that I think Citizens Advice can see that bigger picture but, like many others, sense that it needs to be explored and mapped out carefully. It’s just that they have chosen insurance to explore it, just as academics have increasingly been doing too.

Let’s look ahead to, say, the end of 2022. Something clear might come out of this ‘insurance stage’ of the CA’s campaign and be used by them as a platform for addressing other sectors. That would be a pretty positive outcome for both consumer advocates and insurers. Alternatively, the ‘insurance stage’ could become bogged down in claim and counter claim, and end in mutual mistrust. That would be a pretty negative outcome.

And if the latter were to be the case, then 2023 could see some quite political, perhaps legal, manoeuvrings. Not good for sector reputations. Part of that scenario could involve some quite challenging questions being asked not just of insurers’ exemptions under equalities legislation, but of the legislation itself. That’s why Prof. Hildebrandt’s paper is so timely.
