Interesting People mailing list archives

not so anonymous


From: "David Farber" <dave () farber net>
Date: Mon, 24 Dec 2007 13:20:32 -0500



-----Original Message-----
From: Dean F. Sutherland [mailto:dfsuther () cs cmu edu] 
Sent: Monday, December 24, 2007 11:35 AM
To: David Farber
Subject: for IP? Fwd: [read20-l] not so anonymous

For IP if you like.

Dean F. Sutherland
dfsuther () cs cmu edu

"He that publisheth not a sufficiency, he shall perish: yea, all  
grants shall be refused him, his contract shall not be renewed, he  
shall vanish from the sight of his fellows even unto the depths of the  
teacher's colleges, and his name shall vanish from the footnotes." --  
source unknown

Begin forwarded message:

-------- Original Message --------
Subject:      [read20-l] not so anonymous
Date:         Sun, 23 Dec 2007 14:21:26 -0800
From:         Peter Brantley <peebsley () gmail com>
Reply-To:     Peter Brantley <peebsley () gmail com>
Organization:         Digital Library Federation
To:   Read20 List <read20-l () lists panix com>



recently, netflix released some anonymized usage data in order
to seed a technical challenge (on recommendation algorithms).

bruce schneier reports that a team of Univ. of Texas researchers
de-anonymized a subset of the data through correlation with public
IMDb (internet movie database) entries.

bruce generalizes from this to point out how easy it really is,
and he notes the obvious analogy to book purchasing habits:

http://www.schneier.com/blog/archives/2007/12/anonymity_and_t_2.html

"Someone with access to an anonymous dataset of telephone records,
for example, might partially de-anonymize it by correlating it
with a catalog merchant's telephone order database. Or Amazon's
online book reviews could be the key to partially de-anonymizing
a public database of credit card purchases, or a larger database
of anonymous book reviews.

"Google, with its database of users' internet searches, could
easily de-anonymize a public database of internet purchases, or
zero in on searches of medical terms to de-anonymize a public
health database. Merchants who maintain detailed customer and
purchase information could use their data to partially de-anonymize
any large search engine's data, if it were released in an
anonymized form. A data broker holding databases of several
companies might be able to de-anonymize most of the records in
those databases.

"What the University of Texas researchers demonstrate is that this
process isn't hard, and doesn't require a lot of data. It turns out
that if you eliminate the top 100 movies everyone watches, our
movie-watching habits are all pretty individual. This would
certainly hold true for our book reading habits, our internet
shopping habits, our telephone habits and our web searching habits."
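
A rough illustration of the correlation idea described in the quote, not
the Texas researchers' actual method (which is not detailed here): drop
the most popular titles, then link each anonymous record to the public
profile that shares the most remaining titles. The function name, the
parameters top_n and min_overlap, and the toy data are all hypothetical.

```python
from collections import Counter

def link_records(anon, public, top_n=1, min_overlap=2):
    """Map anonymous ids to public names by overlap of less-popular items.

    anon:   {anon_id: set of item titles}
    public: {name: set of item titles}
    """
    # Count how often each item appears in the anonymous data set,
    # then set aside the top_n most popular items ("what everyone watches").
    counts = Counter(item for items in anon.values() for item in items)
    popular = {item for item, _ in counts.most_common(top_n)}

    matches = {}
    for anon_id, items in anon.items():
        rare = items - popular
        best_name, best_score, runner_up = None, 0, 0
        for name, pub_items in public.items():
            score = len(rare & pub_items)
            if score > best_score:
                best_name, runner_up, best_score = name, best_score, score
            elif score > runner_up:
                runner_up = score
        # Accept only a clear, unique best match with enough shared items.
        if best_score >= min_overlap and best_score > runner_up:
            matches[anon_id] = best_name
    return matches

# Toy data, invented for illustration only.
anon = {
    "user_1": {"Blockbuster", "Obscure A", "Obscure B"},
    "user_2": {"Blockbuster", "Obscure C", "Obscure D"},
    "user_3": {"Blockbuster"},
}
public = {
    "alice": {"Blockbuster", "Obscure A", "Obscure B"},
    "bob": {"Obscure C", "Obscure D"},
}

print(link_records(anon, public))
```

In this toy case the two records with distinctive titles are linked, and
the record containing only the popular title is left unmatched.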


________________________________________
read20-l : sponsored by Panix in New York City


