[Air-L] Are Twitter Handles identifiable information? (eProtocol section 13)

Ansgar Koene Ansgar.Koene at nottingham.ac.uk
Thu Apr 20 04:26:54 PDT 2017

Dear Ye Na Lee,

With regard to Twitter handles being identifiable participant information, I would strongly agree with your IRB on this point, since knowing a Twitter handle will in most cases be enough to find the relevant Twitter account and all the tweets associated with it. The information in those tweets is very likely to provide enough material for further cross-referencing that will result in the identification of a person.

I think the main concern you were raising about your ability to do the study was the need to go back to the same Twitter accounts and follow discussion threads. The quote you provided from your IRB, "Identifiers should be removed from data/specimens as soon as possible following collection, except in cases where the identifiers are embedded (e.g., voices in audio or faces in video recordings).", says "as soon as possible following collection". I think you can make a case that removing the identifiers before the end of your 13-day collection period is not possible in the context of this study, since it would interfere with your ability to build a coherent data set. During the data collection period you will store the data in a secure, encrypted system so that the identifiers are not released to the outside. Once the complete data set has been collected, you will remove the Twitter handles, replacing them with random alphanumeric IDs that allow you to retain the structure of the data (which tweets are from the same person).
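
To illustrate that last step, here is a minimal sketch in Python. It assumes the collected tweets can be exported as a list of dictionaries with hypothetical "handle" and "text" fields (this layout and the function name are my own illustration, not something NVivo produces as such). Each handle is mapped to one random alphanumeric ID, so tweets from the same account stay linked, and @mentions inside the tweet text are replaced with the same IDs.

```python
import re
import secrets
import string

ALPHABET = string.ascii_uppercase + string.digits


def pseudonymize(tweets, id_length=8):
    """Replace Twitter handles with random IDs, one stable ID per handle.

    `tweets` is assumed to be a list of dicts with hypothetical keys
    "handle" and "text". Returns the cleaned tweets and the handle -> ID key.
    """
    key = {}      # normalized handle -> random ID (the linking key)
    used = set()  # IDs already handed out, to avoid collisions

    def id_for(handle):
        handle = handle.lstrip("@").lower()  # handles are case-insensitive
        if handle not in key:
            new_id = "".join(secrets.choice(ALPHABET) for _ in range(id_length))
            while new_id in used:
                new_id = "".join(secrets.choice(ALPHABET) for _ in range(id_length))
            used.add(new_id)
            key[handle] = new_id
        return key[handle]

    cleaned = []
    for tweet in tweets:
        author_id = id_for(tweet["handle"])
        # Also replace @mentions in the text so handles do not leak there.
        text = re.sub(r"@(\w+)", lambda m: "@" + id_for(m.group(1)), tweet["text"])
        cleaned.append({"author_id": author_id, "text": text})
    return cleaned, key


if __name__ == "__main__":
    sample = [
        {"handle": "@Alice", "text": "hi @bob"},
        {"handle": "bob", "text": "hello"},
        {"handle": "alice", "text": "again"},
    ]
    cleaned, key = pseudonymize(sample)
    for row in cleaned:
        print(row)
```

If the returned key is deleted once the data set is complete, the IDs can no longer be traced back to handles by lookup. If it is kept, it is the "key to the code" that the IRB text asks about, and it would need to be stored separately and protected. Note that the tweet text itself remains searchable, so this only addresses the handles.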

Hope that helps,


Dr. Ansgar Koene
Senior Research Fellow: Horizon Policy Impact, CaSMa & UnBias
Horizon Digital Economy Research Institute
University of Nottingham
From: Air-L <air-l-bounces at listserv.aoir.org> on behalf of Ye Na Lee <jpt2007 at berkeley.edu>
Sent: Thursday, April 20, 2017 3:54:39 AM
To: air-l at listserv.aoir.org
Subject: [Air-L] Are Twitter Handles identifiable information? (eProtocol section 13)

Dear members of AoIR,
I am currently going through the IRB process (non-exempt) for my research,
which involves analyzing tweets containing certain hashtags. In my opinion,
Twitter handles do not constitute identifiable participant information.
However, I got a response from the IRB, saying:

"Twitter handles may or may not be identifiable, depending on the user. In
some cases they are not identifiable, in many other cases they are
identifiable (i.e., twitter users' identities are known). Names and emails
are not the only variable that are considered "identifiable." Any
information that can lead to identification of an individual is considered
"identifiable." (See FAQ for further information:
http://cphs.berkeley.edu/faqs.html#e1). Please revise section 13a to
account for this. Ensure that corresponding updates are also integrated
into pertinent sections/documents."

This means I need to remove identifiers (Twitter handles, in this case) from
tweets as soon as I collect them, as item 13d says:

"Identifiers should be removed from data/specimens as soon as possible
following collection, except in cases where the identifiers are embedded
(e.g., voices in audio or faces in video recordings). If data are coded in
order to retain a link between the data and identifiable information,
explain where the key to the code will be stored, how it will be protected,
who will have access to it, and when it will be destroyed."

This will be very cumbersome for me because my research is not quantitative
and does not involve collecting massive amounts of tweets at once using the
Twitter API. It is qualitative research using virtual ethnography to
explore and understand the conversation around a certain hashtag movement,
which means I will need to constantly go back to certain Twitter accounts
and use snowball sampling to collect tweets, for example by checking who
they are following and looking into those accounts as well.
(I plan to use NVivo 10 to collect and code tweets.)

I assume the IRB is taking extra precautions about security because my
research involves the risk of cyberbullying. I understand the risk and will
completely anonymize Twitter handles in my paper. However, I honestly do
not see the need to separate/remove identifiers during the process of
collection. I'd really appreciate it if anyone who's done research similar
to mine could tell me about their experiences with the IRB process.
Thank you in advance.