
Two researchers at the University of Texas at Austin recently released a paper showing that seemingly benign, supposedly anonymous information, or 'micro-data', can be combined with unrelated outside sources to de-anonymize the people behind it. If that confuses you as much as it confused me to write it, stick with me.
The researchers proved their point using the Netflix Prize micro-data set, which simply contained the movie ratings of 500,000 Netflix subscribers, sans names. This information is publicly available, as most micro-data is nowadays, and is used for data-mining research and the like. They then took individuals' movie tastes and linked these quasi-identifiers to other public records, and were able to come up with not only names but, in some cases, addresses, Social Security numbers, and other potentially sensitive information.
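To make the linking step concrete, here is a rough sketch in Python. It is not the researchers' actual algorithm, just a toy version of the idea: every name, ID, and rating is invented, and `overlap_score` and `best_match` are hypothetical helpers that count identical ratings between a public profile and each anonymized record.

```python
# Toy data: every ID, name, movie, and rating here is invented for illustration.
anonymized = {
    "user_1041": {"Movie A": 5, "Movie B": 1, "Movie C": 4, "Movie D": 2},
    "user_2307": {"Movie A": 3, "Movie E": 5, "Movie F": 4},
    "user_3112": {"Movie B": 4, "Movie C": 4, "Movie G": 1},
}

# Ratings someone posted publicly under their real name (also invented).
public_reviews = {
    "Alice Example": {"Movie A": 5, "Movie C": 4, "Movie D": 2},
    "Bob Example": {"Movie E": 5, "Movie F": 4},
}


def overlap_score(known, candidate):
    """Count the movies that both profiles rated with the same score."""
    return sum(1 for movie, rating in known.items() if candidate.get(movie) == rating)


def best_match(known, records, min_score=2):
    """Return the anonymized ID with the highest overlap.

    Returns None when there are no records, when the best score is below
    min_score, or when the top score is tied (an ambiguous match).
    """
    scores = sorted(
        ((overlap_score(known, ratings), uid) for uid, ratings in records.items()),
        reverse=True,
    )
    if not scores:
        return None
    top_score, top_id = scores[0]
    if top_score < min_score:
        return None
    if len(scores) > 1 and scores[1][0] == top_score:
        return None
    return top_id


matches = {name: best_match(profile, anonymized) for name, profile in public_reviews.items()}
```

With this toy data, a handful of matching ratings is enough to pin each named reviewer to a single "anonymous" record, which is the gist of why sparse taste data works as a fingerprint.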
OK, I know you want them, so here are a couple of examples of de-anonymization from the recent past.
A Massachusetts hospital's discharge list was coupled with the state's public voter database to reveal sensitive information about the patients. Best of all, last year America Online's chief technology officer resigned after a massive dataset of 20 million searches performed by 658,000 people was published for use in research. The data was believed to be anonymized, but it revealed sensitive details of the searchers' private lives, including Social Security numbers, credit-card numbers, addresses, and, in one case, apparently a searcher's intent to kill their wife.
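The hospital example boils down to a join between two tables on fields they happen to share. Here is a minimal Python sketch of that kind of join. The choice of fields (ZIP code, birth date, sex) is my assumption, not something stated above, and all the records, names, and the `link` helper are made up for illustration.

```python
from collections import defaultdict

# Invented "anonymized" discharge records: no names, but quasi-identifiers remain.
discharges = [
    {"zip": "02138", "birth_date": "1950-01-15", "sex": "F", "diagnosis": "condition X"},
    {"zip": "02139", "birth_date": "1972-06-02", "sex": "M", "diagnosis": "condition Y"},
    {"zip": "02140", "birth_date": "1980-03-09", "sex": "F", "diagnosis": "condition Z"},
]

# Invented public voter roll: names plus the same quasi-identifiers.
voters = [
    {"name": "Carol Example", "zip": "02138", "birth_date": "1950-01-15", "sex": "F"},
    {"name": "Dan Example", "zip": "02139", "birth_date": "1972-06-02", "sex": "M"},
    {"name": "Erin Example", "zip": "02140", "birth_date": "1980-03-09", "sex": "F"},
    {"name": "Fay Example", "zip": "02140", "birth_date": "1980-03-09", "sex": "F"},
]

# Assumed join key; the fields shared by both tables.
KEY = ("zip", "birth_date", "sex")


def link(discharges, voters):
    """Map voter names to diagnoses wherever the join key is unique in the voter roll."""
    by_key = defaultdict(list)
    for voter in voters:
        by_key[tuple(voter[k] for k in KEY)].append(voter["name"])

    linked = {}
    for record in discharges:
        names = by_key.get(tuple(record[k] for k in KEY), [])
        if len(names) == 1:  # skip ambiguous keys shared by several voters
            linked[names[0]] = record["diagnosis"]
    return linked


linked = link(discharges, voters)
```

In this toy data, two of the three patients share their key with exactly one voter and get a name attached. The third stays hidden only because two voters share her ZIP code, birth date, and sex.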
Take it how you'd like. I found it interesting, and it does make sense, but I can think of much easier ways to get someone's sensitive information. Sometimes all you have to do is ask.