One of the privacy issues that the insurance sector will increasingly be confronted with over the next few years is the selling and buying of personal data, and the influence that anonymisation and re-identification have on those transactions. I’ve written about this in two posts, here and here.
In following up on some recently published guidance on anonymisation, I came across an interesting paper by Paul Ohm, an Associate Professor at the University of Colorado Law School, called ‘Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization’. Ohm is a well-known commentator on matters of data and privacy and his work is worth following.
His ‘Broken Promises…’ paper included the following story about how one insurance organisation got into hot water when it released claimant data that it thought had been suitably anonymised. I’ve taken the liberty of copying the story below, as it’s a salutary tale that every insurer should think about. Remember that the events took place in the mid-1990s: since then, data sources have proliferated and re-identification techniques have become more sophisticated, so the risk described here is now much, much greater.
“In Massachusetts, a government agency called the Group Insurance Commission (GIC) purchased health insurance for state employees. At some point in the mid-1990s, GIC decided to release records summarizing every state employee’s hospital visits at no cost to any researcher who requested them. By removing fields containing name, address, social security number, and other “explicit identifiers”, GIC assumed it had protected patient privacy, despite the fact that “nearly one hundred attributes per” patient and hospital visit were still included, including the critical trio of ZIP code, birth date, and sex. At the time that GIC released the data, William Weld, then-Governor of Massachusetts, assured the public that GIC had protected patient privacy by deleting identifiers. In response, then-graduate student Latanya Sweeney started hunting for the Governor’s hospital records in the GIC data. She knew that Governor Weld resided in Cambridge, Massachusetts, a city of fifty-four thousand residents and seven ZIP codes. For twenty dollars, she purchased the complete voter rolls from the city of Cambridge – a database containing, among other things, the name, address, ZIP code, birth date, and sex of every voter. By combining this data with the GIC records, Sweeney found Governor Weld with ease. Only six people in Cambridge shared his birth date; only three were men, and of the three, only he lived in his ZIP code. In a theatrical flourish, Dr. Sweeney sent the governor’s health records (including diagnoses and prescriptions) to his office.”
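To make the mechanics of that story concrete, here is a minimal sketch in Python of the kind of linkage it describes: matching a ‘de-identified’ release against a public register on ZIP code, birth date and sex. Everything in it is an assumption for illustration only. The records are entirely made up, and the field names and the `link_records` function are hypothetical; it is not a reconstruction of Sweeney’s actual method or data.

```python
from collections import defaultdict

# The "critical trio" named in the quoted story.
QUASI_IDENTIFIERS = ("zip", "birth_date", "sex")

# Hypothetical public register (stands in for a voter roll). All values are invented.
voter_roll = [
    {"name": "Voter A", "zip": "00001", "birth_date": "1950-01-01", "sex": "M"},
    {"name": "Voter B", "zip": "00002", "birth_date": "1950-01-01", "sex": "M"},
    {"name": "Voter C", "zip": "00001", "birth_date": "1950-01-01", "sex": "F"},
    {"name": "Voter D", "zip": "00003", "birth_date": "1962-07-15", "sex": "F"},
    {"name": "Voter E", "zip": "00003", "birth_date": "1962-07-15", "sex": "F"},
]

# Hypothetical "anonymised" release: names removed, quasi-identifiers left in.
released_records = [
    {"record_id": 1, "zip": "00001", "birth_date": "1950-01-01", "sex": "M",
     "diagnosis": "condition X"},
    {"record_id": 2, "zip": "00003", "birth_date": "1962-07-15", "sex": "F",
     "diagnosis": "condition Y"},
]


def link_records(released, public, keys=QUASI_IDENTIFIERS):
    """Return {record_id: name} for released records that match exactly one person."""
    # Index the public register by its combination of quasi-identifiers.
    index = defaultdict(list)
    for person in public:
        index[tuple(person[k] for k in keys)].append(person["name"])

    matches = {}
    for record in released:
        candidates = index.get(tuple(record[k] for k in keys), [])
        # A single candidate means the record is effectively re-identified.
        if len(candidates) == 1:
            matches[record["record_id"]] = candidates[0]
    return matches


reidentified = link_records(released_records, voter_roll)
print(reidentified)
```

In this toy data, record 1 matches only one person on all three fields, so it is re-identified, while record 2 matches two people and stays ambiguous. The same check, run the other way round, is one rough way for an insurer to test a dataset before release: count how many records are unique on the fields an outsider could plausibly obtain elsewhere.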
Latanya Sweeney, now a professor at Harvard University, showed that anonymisation is difficult and re-identification is easy. Both carry ethical risks. Every insurer needs to approach anonymisation and re-identification with considerable caution. After all, all 650 MPs in the UK Parliament buy some form of insurance.