[Air-L] Anonymizing Twitter handles

Bernhard Rieder berno.rieder at gmail.com
Fri Apr 14 04:07:19 PDT 2017


> On 14 Apr 2017, at 7:47, Maurice Vergeer <m.vergeer at maw.ru.nl> wrote:

> Still, anonymizing is fairly easy when you have the data in a statistical
> program such as SPSS, R or even Excel: replace the userhandles with a
> unique number (from 1 to N).
> Then remove the userhandles from the dataset. Still, I would always advise
> keeping a secure file with both key variables (the userhandles and the new
> identifier) for future research.

If you hash the userhandle, e.g. with SHA-1 or similar (which is even possible in Excel with a small formula), there is no need to keep a correspondence file, because hashing the same string will always yield the same hash - while making reversal practically infeasible (i.e. you cannot compute the handle back from the hash).
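
For illustration, here is a minimal Python sketch of the idea, not a definitive implementation. The function names and the normalization step (dropping a leading "@" and lowercasing) are my own assumptions, made so that the same account always maps to the same value. The second function is an optional keyed variant (HMAC): since Twitter handles are public, someone could hash a list of candidate handles and compare the results against a plain, unkeyed hash, and a secret key prevents that while still not requiring a correspondence file.

```python
import hashlib
import hmac


def normalize_handle(handle: str) -> str:
    # Assumption: handles are case-insensitive and may carry a leading "@".
    return handle.strip().lstrip("@").lower()


def hash_handle(handle: str) -> str:
    # Plain SHA-1: the same handle always yields the same 40-character hex digest.
    return hashlib.sha1(normalize_handle(handle).encode("utf-8")).hexdigest()


def hash_handle_keyed(handle: str, secret_key: bytes) -> str:
    # Optional variant: HMAC-SHA256 with a secret key that is kept private.
    # Without the key, hashing candidate handles will not reproduce these values.
    message = normalize_handle(handle).encode("utf-8")
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


if __name__ == "__main__":
    # Hypothetical handles, for demonstration only.
    for h in ["@example_user", "Example_User", "another_user"]:
        print(h, hash_handle(h))
```

Running either function over the userhandle column and then dropping the original column gives a stable identifier per account; with the keyed variant, the key has to be stored securely and reused for any later data collection.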

best,
Bernhard

--
Bernhard Rieder | Associate Professor | New Media and Digital Culture
University of Amsterdam | Turfdraagsterpad 9 | 1012 XT  Amsterdam | The Netherlands
http://thepoliticsofsystems.net | http://rieder.polsys.net | https://www.digitalmethods.net | @RiederB



