 June 17, 2011

Klint Finley writes:

Data collected from customers is routinely anonymized and then sold or otherwise disseminated for research purposes. But does anonymization work? One particularly high-profile case was Netflix’s release of customer data as part of its machine learning algorithm contest. According to Forbes’ Firewall blog, researchers were able to de-anonymize some of this data by cross-referencing it with sources such as IMDb comments. Netflix wound up canceling a later contest.

But is this sort of re-identification practical, and does it make anonymizing data a pointless endeavor?

