[Air-l] AOL and research ethics

elw at stderr.org elw at stderr.org
Tue Aug 29 17:54:04 PDT 2006




> Subject: Re: [Air-l] AOL and research ethics
> 
> I'm wondering what kind of sanitization method(s) would have worked in 
> this case?


Data sanitization is very difficult to do, and just about equally 
difficult to audit or evaluate.

The problem can go so far that even *single inferences* drawn from the 
data are damaging - for example, a user who searches for several uncommon 
proper names *plus* a string of terms regarding some pretty unsavory 
pornography.  How hard would it be, in this case, to trace out their 
social network and figure out who the person really is?
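
To make that inference concrete, here is a minimal sketch in Python.  It 
assumes a log in which names were swapped for numeric ids but each user's 
queries still share one id.  The log entries, the RARE_NAMES and SENSITIVE 
sets, and both function names are hypothetical placeholders for 
illustration, not anything taken from the actual release.

```python
from collections import defaultdict

# Hypothetical toy log: (pseudonymous user id, query). Not real data.
LOG = [
    (1001, "jane q. example"),
    (1001, "john z. placeholder"),
    (1001, "embarrassing topic"),
    (1002, "weather"),
    (1002, "cheap flights"),
]

# Assumed lookup sets, standing in for real name and topic detection.
RARE_NAMES = {"jane q. example", "john z. placeholder"}
SENSITIVE = {"embarrassing topic"}


def group_by_user(log):
    """Collect every query issued under the same pseudonymous id."""
    histories = defaultdict(list)
    for user_id, query in log:
        histories[user_id].append(query)
    return dict(histories)


def at_risk(histories, min_names=2):
    """Return ids whose history links several rare names to a sensitive query."""
    flagged = []
    for user_id, queries in histories.items():
        names = sum(1 for q in queries if q in RARE_NAMES)
        sensitive = any(q in SENSITIVE for q in queries)
        if names >= min_names and sensitive:
            flagged.append(user_id)
    return sorted(flagged)


print(at_risk(group_by_user(LOG)))  # prints [1001]
```

The id itself carries no name, but because it stays constant it joins the 
rare names to the sensitive query, and that join is the damaging part.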

I would guess that you might be able to sanitize this sort of data by 
removing all of the nouns from it.  *smirk*

--elijah



> On 8/29/06, Barry Wellman <wellman at chass.utoronto.ca> wrote:
>> Besides legal stuff, it's clear that AOL didn't follow what's SOP
>> for preserving respondent privacy in the social sciences. They
>> thought they were by not directly releasing account holders' names and
>> accounts, but were so eager to be helpful that they didn't sanitize the
>> data well.
>> Always a concern, but one routinely dealt with.
>> OK, I gotta stop myself or else I go on another rant about computer
>> scientists not knowing any social science -- in this case methods.



