May 2, 2024

Why Secondary Data Could Turn Toxic for Insurers

Digital innovations are helping the sector break new ground in how policies are underwritten. The problem is that some practices could be landing insurers outside of the law. The cause is not bias in data or analytics, but ill-advised decisions about the strategic use of secondary data.

Could secondary data turn into a brick wall?

Insurers enjoy exemptions in equalities legislation that allow them certain uses of data in relation to protected characteristics. Those exemptions do not apply to race and gender, but they do apply to other characteristics like age and disability. And the terms of those exemptions are carefully worded. To apply, certain conditions must be met.

I want to highlight here certain ways in which insurers may not be meeting the conditions of those exemptions. This is not down to bias in the data or in the analytics. It is happening because the sector may not have paid enough attention to the implications of a key aspect of its digital strategies: the use of ever larger amounts of secondary data.

What this does is expose the sector to reputational and/or financial challenge. Such a challenge does not seem to have emerged so far. I can find no case law relating to it. Yet it is an exposure that insurers should pay attention to. Indeed, the overall sector needs to weigh up its implications, for it’s an exposure that seems at odds with the direction in which the market is heading.

Some Background

Let’s clarify two things: the nature of secondary data and the conditions applying to the exemptions in equalities legislation that insurers rely on.

Secondary data is data collected by someone for a particular purpose, but which is then used by someone else for a different purpose. So the data starts out as primary in one context, but then becomes secondary through its use in another context.

One example would be data relating to someone’s shopping choices, which is then used by an insurer in its underwriting. There are motor insurers in the US who use whether you drink bottled water or tap water as a rating factor.

The gathering and categorisation of secondary data has been at the heart of many an insurer’s digital strategy. Two factors are at play here. One is that it makes the customer journey when taking out a policy much less complicated. There’s no need for lots of questions, as the insurer can work out what they want to know from all their secondary data.

And the other factor is that insurers hope to uncover new underwriting insights from all that secondary data – what I’ve seen described as the ‘social determinants of risk’. Uncovering such insights and using them to select and price risk are seen as key competitive advantages.

Conditions and Exemptions

As I’ve outlined above, there are exemptions in equalities legislation that allow data about someone’s protected characteristics to be used in the assessment of risk within financial services. Let’s use disability as an example of what this means.

The condition within the exemption refers to the risk assessment having to be done “…by reference to information that is both relevant to the assessment of the risk to be insured and from a source on which it is reasonable to rely…” and that “it is reasonable to do that thing.”

In other words, the insurer’s risk assessment has to satisfy two tests: is the information upon which the risk assessment is being done relevant, and is it reasonable to use it?

The $64k Question(s)

The $64k question therefore is: does secondary data pass those two tests? Or to put it another way: what types of secondary data fail either of those two tests? And what is the rationale for such a fail?

Let’s go back to the bottled water rating factor and assume that it is being used by a UK motor insurer. That insurer needs to have asked itself the following questions.

  • is it information that is relevant to motor underwriting?
  • does it come from a source on which it is reasonable to rely?
  • is it a reasonable thing to do?

The insurer could well have assembled statistical information that points to the drinking of bottled water being relevant to underwriting. The question then arises: is that level of relevance sufficient? I suspect it would be down to a court to decide that. Account would no doubt be taken of those statistics, but would that be enough? I would expect the court to look for intuitive explanations of why two things are related.

Intuitive Explanations

Professor Barbara Kiviat of Stanford University has done some interesting research on intuitive explanations for the use of different insurance rating factors (more here). Suffice it to say that it is research like this that the courts and/or regulators are likely to refer to when judging the relevance of information like shopping habits or online browsing patterns.

What Professor Kiviat also found was that the context within which the information was gathered matters as well. So in addressing the question of whether the information came from a source on which it is reasonable to rely (the second of the three questions listed above), context and the consent associated with it will matter.

Can information disclosed under the terms of very generic consent in one context then be relied upon in another quite different context? Could I, as a person on the Clapham Omnibus, have reasonably expected that my shopping information would be used to underwrite my motor policy? Again, Professor Kiviat had some interesting findings about what consumers thought about this. Needless to say, they weren’t totally in sync with the views of present-day insurers.

Then there is the reasonableness question. Insurers would say that, within the context of the digital transformation of insurance, it is reasonable to use social determinants of risk (aka secondary data) in their underwriting. The problem here is that those insurers could well be inviting closer scrutiny of their digital strategies and the reasonableness of their sourcing of data in terms of equalities and data protection legislation. Remember that the ICO sees data protection as encompassing not just privacy, but fairness and autonomy as well. With that in mind, I suspect the end result would not fall in the insurer’s favour.

To Sum Up

Insurers’ use of secondary data could be happening outside of the conditions attached to the exemptions they enjoy under equalities legislation for the use of information about protected characteristics. It does not yet appear to have been challenged, but that may only be a matter of time.

It is likely to be a matter of context and reasonableness. That said, the perspective through which such things are often judged is not that of a business sector wanting to innovate, but that of someone on the Clapham Omnibus. That’s where research into the intuitive understanding of data and risk comes in.

Innovation is a good thing, so long as it doesn’t cut across the grain of society’s sense of justice. Insurers need to challenge themselves on this, before someone else does.

If you'd like to explore this further, get in touch
Duncan Minty
Duncan has been researching and writing about ethics in insurance for over 20 years. As a Chartered Insurance Practitioner, he combines market knowledge with a strong and independent radar on ethics.