[Air-l] AOL Releases Search Logs from 500,000 Users
Michael Zimmer
michael.zimmer at nyu.edu
Mon Aug 7 04:02:41 PDT 2006
I noticed they warn users that the database has not been censored for
sexually explicit search terms. But what about personally-
identifiable search terms? Someone concerned about whether their
personal information is online might search for "Michael Zimmer xxx-
xx-xxxx" with their social security number. Or how about personally
embarrassing searches, such as "Michael Zimmer nude karaoke"? The
presence of such searches in a public database is problematic.
-m
-----
Michael T. Zimmer
Doctoral Candidate, Culture and Communication, New York University
Student Fellow, Information Law Institute, NYU Law School
e: michael.zimmer at nyu.edu
w: http://michaelzimmer.org
On Aug 7, 2006, at 6:27 AM, Maciej Kos wrote:
> This may be useful to some of us.
>
> "AOL just released the logs of all searches done by 500,000 of their
> users over the course of three months earlier this year. That means
> that if you happened to be randomly chosen as one of these users,
> everything you searched for from March to May (2006) is now public
> information on the internet."
>
> "Update: Seems like AOL took it down. There are some mirrors of the
> data in the comments of the digg story, linked below. I estimate
> about 1000 people have the file, so it's definitely going to be
> circulated around. The main AOL research page is still up, with some
> other data collections. The google cache of the download page is
> still up, but you can't get the data."
>
> "500k User Session Collection
> ----------------------------------------------
> This collection is distributed for NON-COMMERCIAL RESEARCH USE ONLY.
> Any application of this collection for commercial purposes is
> STRICTLY PROHIBITED.
>
> Brief description:
>
> This collection consists of ~20M web queries collected from ~650k
> users over three months.
> The data is sorted by anonymous user ID and sequentially arranged."
>
>
>
> http download
> http://www.yousendit.com/transfer.php?action=download&ufid=DDD1D4D0017BB5BE
>
> Torrent
> http://thepiratebay.org/details.php?id=3510027
> http://www.mininova.org/tor/388815
>
> AOL website's cache:
> http://72.14.207.104/search?q=cache:2Qvd2z9VbuIJ:research.aol.com/pmwiki/pmwiki.php%3Fn%3DResearch.500kUserQueriesSampledOver3Months+&hl=en&gl=us&ct=clnk&cd=1
>
>
> _______________________________________________
> The air-l at listserv.aoir.org mailing list
> is provided by the Association of Internet Researchers http://aoir.org
> Subscribe, change options or unsubscribe at:
> http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
>
> Join the Association of Internet Researchers:
> http://www.aoir.org/