[Air-l] AOL Releases Search Logs from 500,000 Users

Michael Zimmer michael.zimmer at nyu.edu
Mon Aug 7 04:02:41 PDT 2006


I noticed they warn users that the database has not been censored for
sexually explicit search terms. But what about personally
identifiable search terms? Someone concerned about whether their
personal information is online might search for
"Michael Zimmer xxx-xx-xxxx" with their social security number. Or how
about personally embarrassing searches, such as "Michael Zimmer nude
karaoke"? The presence of such searches in a public database is
problematic.

-m


-----
Michael T. Zimmer
  Doctoral Candidate, Culture and Communication, New York University
  Student Fellow, Information Law Institute, NYU Law School
e: michael.zimmer at nyu.edu
w: http://michaelzimmer.org



On Aug 7, 2006, at 6:27 AM, Maciej Kos wrote:

> This may be useful to some of us.
>
> "AOL just released the logs of all searches done by 500,000 of their users
> over the course of three months earlier this year. That means that if you
> happened to be randomly chosen as one of these users, everything you searched
> for from March to May (2006) is now public information on the internet."
>
> "Update: Seems like AOL took it down. There are some mirrors of the data in
> the comments of the digg story, linked below. I estimate about 1000 people
> have the file, so it's definitely going to be circulated around. The main AOL
> research page is still up, with some other data collections. The google cache
> of the download page is still up, but you can't get the data."
>
> "500k User Session Collection
> ----------------------------------------------
> This collection is distributed for NON-COMMERCIAL RESEARCH USE ONLY.
> Any application of this collection for commercial purposes is
> STRICTLY PROHIBITED.
>
> Brief description:
>
> This collection consists of ~20M web queries collected from ~650k users over
> three months.
> The data is sorted by anonymous user ID and sequentially arranged."
>
>
>
> http download
> http://www.yousendit.com/transfer.php?action=download&ufid=DDD1D4D0017BB5BE
>
> Torrent
> http://thepiratebay.org/details.php?id=3510027
> http://www.mininova.org/tor/388815
>
> AOL website's cache:
> http://72.14.207.104/search?q=cache:2Qvd2z9VbuIJ:research.aol.com/pmwiki/pmwiki.php%3Fn%3DResearch.500kUserQueriesSampledOver3Months+&hl=en&gl=us&ct=clnk&cd=1
>
>
> _______________________________________________
> The air-l at listserv.aoir.org mailing list
> is provided by the Association of Internet Researchers http://aoir.org
> Subscribe, change options or unsubscribe at:
> http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
>
> Join the Association of Internet Researchers:
> http://www.aoir.org/



