[Air-L] Suggestions for anonymizing qual interview data
Robert Gorwa
robert.gorwa at sant.ox.ac.uk
Mon Apr 12 04:19:02 PDT 2021
Hi all,
Thanks for this really insightful discussion.
Prof. Aragon, your very on-point comment reminded me of a short paper I wrote a few years ago on vulnerable populations in qualitative research that might also be of interest to the group. I was focusing specifically on the question of how quotations and other identifying data are reproduced in text, and looked at a small sample of published articles as mini case studies to illustrate some of the strategies that others have employed.
Gorwa, R., and P.N. Howard. “Studying Politically Vulnerable Communities Online: Ethical Dilemmas, Questions, and Solutions.” 13th International Conference on Web and Social Media (ICWSM), Workshop on Exploring Ethical Trade-offs in Social Media Research. Palo Alto, USA (June 25). https://arxiv.org/abs/1806.00830
With digital interviewing or observation in online forums/on platforms the issue is, in my opinion, compounded by the fact that exact quotations, even if anonymized, could be dug up via search engines or interfaces. I think I landed on Annette Markham’s concept of ‘bricolage’ — which others have hinted at, in terms of making changes to the text or even fusing multiple participants together in terms of responses — as probably the best practice for fully ensuring anonymity.
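As a rough illustration of the search-engine concern, the sketch below checks whether an altered quote still shares a run of consecutive words with the original, since a shared run is what an exact-phrase search would match. The function name, the window size n, and the sample quotes are all hypothetical, and passing this check is only a necessary condition, not evidence that a quote is safe.

```python
import re

def shared_phrases(original, altered, n=5):
    """Return every run of n consecutive words found in both texts.

    Illustrative helper (not from any library): a non-empty result means
    an exact-phrase search on the altered quote could still surface the
    original post.
    """
    def ngrams(text):
        # Lowercase and drop punctuation so trivial edits do not hide a match.
        words = re.findall(r"[a-z0-9']+", text.lower())
        return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

    return ngrams(original) & ngrams(altered)

# Hypothetical quotes, invented for this example.
original = "I stopped posting in the forum after the moderators shared my address"
reordered = "After the moderators shared my address, I quit the forum entirely"
paraphrased = "Once the mods published where I live, I left the board"

print(shared_phrases(original, reordered, n=4))    # still overlaps
print(shared_phrases(original, paraphrased, n=4))  # no shared run of 4 words
```

Reordering clauses leaves searchable fragments intact, which is one reason a fuller rewrite or a composite of several participants may be needed.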
Cheers,
Rob
Robert Gorwa
Fellow, Weizenbaum Institute for the Networked Society, Berlin
---------------------------------------------------------
Doctoral Candidate, Department of Politics and International Relations, University of Oxford
---------------------------------------------------------
<http://gorwa.co.uk> | @rgorwa
On Apr 9, 2021, at 7:41 PM, Cecilia Aragon <aragon at uw.edu> wrote:
Cory, thanks for asking this important question, and Michael, thank you for
pointing out some issues of intersectional re-identification in qualitative
research, particularly for marginalized participants. This is a huge social
justice issue where more research needs to be done.
As a member of multiple underrepresented groups, I've frequently been asked
to be a participant in qualitative research. I've found that many very
well-meaning and intelligent people, including both researchers and IRBs,
are unaware of the risks to vulnerable populations, particularly when
intersectionality is involved. Researchers often apply naïve techniques for
anonymity, and IRBs assume they are sufficient.
Some IRB-approved projects clearly re-identified me and put me at risk,
e.g. "anonymous participant X is a female computer science PhD student who
is also an aerobatic pilot" (in a department where a professor had told me
"women don't have the intellectual ability for computer science"). So I
started requiring researchers to let me approve and/or edit the final text
to go into published reports (and yes, it was a lot of labor for me). Some
of the techniques I ended up asking researchers to use:
- Apply a qualitative analog of differential privacy: add "noise" by
obfuscating parts of the quotes and identifiers. For example, change the
above description to "anonymous participant X is a female computer science
PhD student who is also a competitive skier." This maintains the contextual
integrity of the quote but obfuscates an easily identifiable characteristic.
- Don't retain identifiers throughout the document for the same
individual. It can be easy to re-identify an individual from a series of
quotes, particularly if they have intersectional identities. Instead, you
can attribute some of the quotes to "participant X, a female computer
science student" and the rest to "participant Y, a Latinx computer science
student."
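The two techniques above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the substitution table, and the placeholder quotes are hypothetical, and the choice of a comparable stand-in detail remains a human judgment that the participant should ideally review.

```python
from itertools import cycle

def obfuscate_descriptor(descriptor, substitutions):
    """Swap each identifying phrase for a comparable stand-in (the "noise")."""
    for identifying, stand_in in substitutions.items():
        descriptor = descriptor.replace(identifying, stand_in)
    return descriptor

def split_attribution(quotes, labels):
    """Spread one participant's quotes across several labels in turn.

    Each label is assumed to be a truthful but partial description of the
    same person, so no single label carries the full set of identities.
    """
    return [(label, quote) for label, quote in zip(cycle(labels), quotes)]

print(obfuscate_descriptor(
    "a female computer science PhD student who is also an aerobatic pilot",
    {"an aerobatic pilot": "a competitive skier"},
))

labels = [
    "participant X, a female computer science student",
    "participant Y, a Latinx computer science student",
]
# Placeholder quotes standing in for one participant's interview excerpts.
for label, quote in split_attribution(["quote 1", "quote 2", "quote 3"], labels):
    print(f"{label}: {quote}")
```

Rotating labels in a fixed order is the simplest option; a researcher might instead assign labels by topic so that each label's quotes read as a plausible single voice.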
I'd love to see more research on this topic and would be happy to
collaborate on projects!
Cecilia
--
Cecilia R. Aragon, Professor
Director, Human-Centered Data Science Lab
Department of Human Centered Design & Engineering, University of
Washington, Seattle
http://faculty.washington.edu/aragon | @craragon
<https://twitter.com/craragon>
New memoir *Flying Free
<https://www.blackstonepublishing.com/flying-free-cecilia-aragon>: My
Victory over Fear to Become the First Latina Pilot on the US Aerobatic Team*
On Fri, Apr 9, 2021 at 6:43 AM Michael Muller <michael_muller at us.ibm.com>
wrote:
I think part of the thinking-process might be: How easily can someone
figure out the identity of the informant? If I were to say that I
interviewed people in our 8-person team, and if I report that one
person was working from the Pacific timezone, then it's easy to
determine which of us I am referring to. Or if I were to write that a
disabled member of the team said... then that's me. I know that seems
obvious for a tiny group, but these kinds of intersectional identities
can operate in larger groups, too.
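The "how easily can someone figure it out" question can be made concrete with a small check in the spirit of k-anonymity: count how many participants share each combination of reported attributes and flag any combination held by fewer than k people. This is a sketch only; the function name, the threshold k, and the sample team below are hypothetical, and a real assessment would also have to consider what outsiders already know.

```python
from collections import Counter

def risky_combinations(participants, attributes, k=2):
    """Return attribute combinations shared by fewer than k participants."""
    counts = Counter(tuple(p[a] for a in attributes) for p in participants)
    return {combo: n for combo, n in counts.items() if n < k}

# Hypothetical 8-person team: one member in the Pacific timezone.
team = [{"timezone": "Eastern", "role": "researcher"} for _ in range(5)]
team += [{"timezone": "Eastern", "role": "engineer"} for _ in range(2)]
team += [{"timezone": "Pacific", "role": "researcher"}]

# Reporting the timezone alone already singles out one person.
print(risky_combinations(team, ["timezone"]))

# Combining attributes is the intersectional risk: each attribute may look
# harmless alone, yet together they can shrink a group to one person.
print(risky_combinations(team, ["timezone", "role"]))
```

The same count works for larger samples, where combinations of individually common attributes can still isolate one person.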
A second way-of-thinking may involve a focus on the risk of disclosure
to the informant. However, this criterion often becomes a matter of the
researcher's imagination regarding the Other. It's been shown again and
again that people in a position of privilege and safety may not
understand the very real risks that are experienced by people who have
fewer safeguards - e.g., men writing about women's safety (how easily
can a stalker act on the information?), or straight people writing
about risk of identification of someone in one of the LGBTQIA+ spectra,
or citizens making assumptions about legal protections (or lack of
protections) for non-citizens. Of course, it's a good idea to discuss
these matters with people who are not ourselves, and who are not like
ourselves. It's also a good idea not to put the burden of explaining
bias on the person who is the target of that bias. Yes, I know that I
said two things that somewhat contradict each other. There are no easy
answers here.
A third possibility is to ask each informant to state what information
about themself would be safe to share. This is sensible only if the
informants understand publications and readerships, etc. But it may be
a more radically democratic approach to demographic description.
I'm suggesting these ideas as among a larger number of *starting
points* for thinking about difficult research questions. Please think
of them as heuristic questions - not as authoritative questions, and
certainly not as answers!
best wishes,
--michael
-----
Michael Muller, PhD, IBM Research, Cambridge MA USA
pronouns: he/him/his
ACM Distinguished Scientist
ACM SIGCHI Academy
On Wed, Apr 7, 2021 at 3:47 AM Cory Robinson
<cory.robinson at liu.se> wrote:
Hi all,
Two Master's students I recently met are conducting recorded interviews
resulting in texts they will code and quote within their theses. I have
given input about how to protect the recorded interviews (encrypted,
password protected, not stored in the cloud). I do not work with qual
data, so I need help recommending a methodology for anonymizing quotes
in their theses.
(I am inquiring about this for a student who, unfortunately, has not
received helpful advice from their supervisor.)
The students assumed they would assign each participant an
identification number, and then attribute each quote to an ID # in their
thesis. However, I feel there is surely a better way to ensure
anonymity? (It would be too easy to re-identify participants if the
research data were obtained.)
What methods do you utilize for anonymizing individual interview data?
Or are there manuscripts/books that are helpful for this? Sadly, the
students are nearing the end of the study, but late is better than
never. (It's indeed a failure of universities, as well as unequipped
supervisors!)
Best,
Cory
_______________________________________________
The Air-L at listserv.aoir.org mailing list
is provided by the Association of Internet Researchers http://aoir.org
Subscribe, change options or unsubscribe at: http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
Join the Association of Internet Researchers:
http://www.aoir.org/