Latanya Sweeney: When anonymized data is anything but anonymous

Published on Apr 9, 2018

Relatively simple data science experiments can yield major insights and have a significant impact.

Many experiments in data science are expensive and time-consuming to pursue. But Latanya Sweeney, professor of government and technology at Harvard University, has shown that even relatively simple studies conducted by students can have a significant impact on public policy and society.

As a student in the 1990s, Sweeney discovered that by applying a couple of filters to a database containing supposedly anonymized health records of Massachusetts state employees, she was able to identify the medical history of Gov. William Weld. That simple experiment led to a broader conclusion: Most people in the United States are the only ones in their ZIP code with a particular date of birth, which means it is relatively easy to discover their identities in much the same way Sweeney found Weld’s history.
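The idea behind that finding can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Sweeney's actual procedure: the records, the named roster, and the helper functions `unique_fraction` and `reidentify` are all hypothetical. It counts how many records are the only one with a given ZIP code and date of birth, then links those records to a list that carries names alongside the same two fields.

```python
from collections import Counter

# Hypothetical toy records: (zip_code, date_of_birth, diagnosis). Not real data.
health_records = [
    ("02138", "1945-07-31", "diagnosis A"),
    ("02138", "1962-03-14", "diagnosis B"),
    ("02139", "1962-03-14", "diagnosis C"),
    ("02139", "1980-11-02", "diagnosis D"),
    ("02139", "1980-11-02", "diagnosis E"),
]

# Hypothetical named list sharing the same two fields (an assumption for illustration).
public_roster = [
    ("Person One", "02138", "1945-07-31"),
    ("Person Two", "02139", "1980-11-02"),
]


def unique_fraction(records):
    """Share of records whose (ZIP, birth date) pair appears exactly once."""
    counts = Counter((z, dob) for z, dob, _ in records)
    unique = sum(1 for z, dob, _ in records if counts[(z, dob)] == 1)
    return unique / len(records)


def reidentify(records, roster):
    """Link roster names to records that are unique on (ZIP, birth date)."""
    counts = Counter((z, dob) for z, dob, _ in records)
    by_key = {(z, dob): diag for z, dob, diag in records if counts[(z, dob)] == 1}
    return {name: by_key[(z, dob)] for name, z, dob in roster if (z, dob) in by_key}


print(unique_fraction(health_records))
print(reidentify(health_records, public_roster))
```

In this toy data, three of the five records are unique on the two fields, so a single named match is enough to attach a diagnosis to a person. The two records that share a ZIP code and birth date cannot be told apart and are left unlinked.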

“That impact, the ability to have a simple experiment and have dramatic impact, was huge, and something that stayed with me forever. That simple experiment was quoted in the preamble of HIPAA and the rewrite of privacy laws around the world,” Sweeney said during a talk at this year’s Women in Data Science (WiDS) conference at Stanford University.

Over the years, Sweeney and her associates have flagged numerous instances of flaws in public databases that have caused significant harm. And they’ve found instances where data sources have been misused or applied in a discriminatory manner.

A query by a reporter prompted Sweeney to look for a correlation between names typically given to African-Americans and online ads mentioning arrest records. She found it. Online searches containing a name that sounds like it belongs to a black person were 80 percent more likely to generate an ad mentioning arrest records than searches for stereotypically white names.

“Somebody goes online to see what they can find out about you and Googles your name. And if the ads are popping up implying that you have an arrest record, then in fact, you’re at a disadvantage. It’s not about the intent or whether it was intended,” said Sweeney, who was chief technology officer for the Federal Trade Commission from January 2014 until December 2014.

A study by Sweeney’s students found that a major SAT tutoring company charged higher prices to Asians. And Airbnb modified its pricing policies after the students found price discrimination against certain groups.

Referring to the examples she gave during her talk, Sweeney said: “I like to think I’m really smart, but the truth is these are really simple experiments. But they have profound impact because they empower someone else to be able to do their job better.”
