[Air-L] Suggestions for anonymizing qual interview data

Robert Gorwa robert.gorwa at sant.ox.ac.uk
Mon Apr 12 04:19:02 PDT 2021


Hi all,
Thanks for this really insightful discussion.

Prof. Aragon, your very on-point comment reminded me of a short paper I wrote a few years ago on vulnerable populations in qualitative research, which might also be of interest to the group. I focused specifically on how quotations and other identifying data are reproduced in text, and looked at a small sample of published articles as mini case studies to illustrate some of the strategies that others have employed.

Gorwa, R., and P.N. Howard. “Studying Politically Vulnerable Communities Online: Ethical Dilemmas, Questions, and Solutions.” 13th International Conference on Web and Social Media (ICWSM), Workshop on Exploring Ethical Trade-offs in Social Media Research. Palo Alto, USA (June 25). https://arxiv.org/abs/1806.00830

With digital interviewing or observation in online forums/on platforms the issue is, in my opinion, compounded by the fact that exact quotations, even if anonymized, could be dug up via search engines or interfaces. I think I landed on Annette Markham’s concept of ‘bricolage’ — which others have hinted at, in terms of making changes to the text or even fusing multiple participants together in terms of responses —  as probably the best practice for fully ensuring anonymity.

Cheers,
Rob

Robert Gorwa
Fellow, Weizenbaum Institute for the Networked Society, Berlin
---------------------------------------------------------
Doctoral Candidate, Department of Politics and International Relations, University of Oxford
---------------------------------------------------------
<http://gorwa.co.uk> | @rgorwa

On Apr 9, 2021, at 7:41 PM, Cecilia Aragon <aragon at uw.edu> wrote:

Cory, thanks for asking this important question, and Michael, thank you for
pointing out some issues of intersectional re-identification in qualitative
research, particularly for marginalized participants. This is a huge social
justice issue where more research needs to be done.

As a member of multiple underrepresented groups, I've frequently been asked
to be a participant in qualitative research. I've found that many very
well-meaning and intelligent people, including both researchers and IRBs,
are unaware of the risks to vulnerable populations, particularly when
intersectionality is involved. Researchers often apply naïve techniques for
anonymity, and IRBs assume they are sufficient.

Some IRB-approved projects clearly re-identified me and put me at risk,
e.g. "anonymous participant X is a female computer science PhD student who
is also an aerobatic pilot" (in a department where a professor had told me
"women don't have the intellectual ability for computer science"). So I
started requiring researchers to let me approve and/or edit the final text
to go into published reports (and yes, it was a lot of labor for me). Some
of the techniques I ended up asking researchers to use:

  - Apply a qualitative analog of differential privacy: add "noise" by
  obfuscating parts of the quotes and identifiers. For example, change the
  above description to "anonymous participant X is a female computer science
  PhD student who is also a competitive skier." This maintains the contextual
  integrity of the quote but obfuscates an easily identifiable characteristic.
  - Don't retain identifiers throughout the document for the same
  individual. It can be easy to re-identify an individual from a series of
  quotes, particularly if they have intersectional identities. Instead, you
  can attribute some of the quotes to "participant X, a female computer
  science student" and the rest to "participant Y, a Latinx computer science
  student."
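
For readers who manage quotes in a script or spreadsheet export, the two techniques in the list above could be sketched roughly as follows. This is only an illustration, not a method anyone in this thread describes using: the function names, the `swaps` mapping, and the round-robin split are all assumptions, and the stand-in attributes and labels would still need to be chosen (and ideally approved) by the participant.

```python
from itertools import cycle


def obfuscate_descriptor(descriptor, swaps):
    """Replace identifying phrases with comparable stand-ins.

    `swaps` is a hypothetical mapping agreed with the participant,
    e.g. {"an aerobatic pilot": "a competitive skier"}.
    """
    for identifying, stand_in in swaps.items():
        descriptor = descriptor.replace(identifying, stand_in)
    return descriptor


def split_attribution(quotes, labels):
    """Spread one person's quotes across several labels.

    Round-robin is an arbitrary choice made for this sketch; any
    assignment that breaks the chain of quotes would do.
    """
    return [(label, quote) for label, quote in zip(cycle(labels), quotes)]


descriptor = "a female computer science PhD student who is also an aerobatic pilot"
print(obfuscate_descriptor(descriptor, {"an aerobatic pilot": "a competitive skier"}))

labels = [
    "participant X, a female computer science student",
    "participant Y, a Latinx computer science student",
]
for label, quote in split_attribution(["quote 1", "quote 2", "quote 3"], labels):
    print(label, "|", quote)
```

The substitution step is deliberately a plain lookup rather than anything automatic, since only the participant can judge which characteristics are identifying in their own context.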

I'd love to see more research on this topic and would be happy to
collaborate on projects!

Cecilia

--
Cecilia R. Aragon, Professor
Director, Human-Centered Data Science Lab
Department of Human Centered Design & Engineering, University of
Washington, Seattle
http://faculty.washington.edu/aragon | @craragon
<https://twitter.com/craragon>
New memoir *Flying Free: My Victory over Fear to Become the First Latina
Pilot on the US Aerobatic Team*
<https://www.blackstonepublishing.com/flying-free-cecilia-aragon>


On Fri, Apr 9, 2021 at 6:43 AM Michael Muller <michael_muller at us.ibm.com>
wrote:

  I think part of the thinking-process might be: How easily can someone
  figure out the identity of the informant? If I were to say that I
  interviewed people in our 8-person team, and if I report that one
  person was working from the Pacific timezone, then it's easy to
  determine which of us I am referring to. Or if I were to write that a
  disabled member of the team said... then that's me. I know that seems
  obvious for a tiny group, but these kinds of intersectional identities
  can operate in larger groups, too.
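
  The "how easily can someone figure it out?" question above can be turned
  into a rough check before publishing: count how many participants share
  each combination of reported attributes, and flag any combination held by
  fewer than k people. This borrows the idea of k-anonymity, which the
  message does not name; the function name, the attribute keys, and the
  8-person team data below are hypothetical, made up to mirror the example.

```python
from collections import Counter


def rare_combinations(participants, attributes, k=2):
    """Return attribute combinations shared by fewer than k participants."""
    counts = Counter(tuple(p[a] for a in attributes) for p in participants)
    return {combo: n for combo, n in counts.items() if n < k}


# Hypothetical 8-person team: one person in the Pacific timezone,
# one person who is disabled.
team = (
    [{"timezone": "Eastern", "disabled": False}] * 6
    + [{"timezone": "Pacific", "disabled": False}]
    + [{"timezone": "Eastern", "disabled": True}]
)

print(rare_combinations(team, ["timezone"]))
print(rare_combinations(team, ["timezone", "disabled"]))
```

  A combination that appears in the result describes exactly one person (for
  k=2), so reporting it alongside a quote would point to that person. Such a
  count only covers the attributes that were recorded; it says nothing about
  what a reader may already know from other sources.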
  A second way-of-thinking may involve a focus on the risk of disclosure
  to the informant. However, this criterion often becomes a matter of the
  researcher's imagination regarding the Other. It's been shown again and
  again that people in a position of privilege and safety may not
  understand the very real risks that are experienced by people who have
  fewer safeguards - e.g., men writing about women's safety (how easily
  can a stalker act on the information?), or straight people writing
  about risk of identification of someone in one of the LGBTQIA+ spectra,
  or citizens making assumptions about legal protections (or lack of
  protections) for non-citizens. Of course, it's a good idea to discuss
  these matters with people who are not ourselves, and who are not like
  ourselves. It's also a good idea not to put the burden of explaining
  bias on the person who is the target of that bias. Yes, I know that I
  said two things that somewhat contradict each other. There are no easy
  answers here.
  A third possibility is to ask each informant to state what information
  about themself would be safe to share. This is sensible only if the
  informants understand publications and readerships, etc. But it may be
  a more radically democratic approach to demographic description.
  I'm suggesting these ideas as among a larger number of *starting
  points* for thinking about difficult research questions. Please think
  of them as heuristic questions - not as authoritative questions, and
  certainly not as answers!
  best wishes,
  --michael
  -----
  Michael Muller, PhD, IBM Research, Cambridge MA USA
  pronouns: he/him/his
  ACM Distinguished Scientist
  ACM SIGCHI Academy



On Wed, Apr 7, 2021 at 3:47 AM Cory Robinson <cory.robinson at liu.se> wrote:
Hi all,

Two Master’s students I recently met are conducting recorded interviews
resulting in texts they will code and quote within their theses. I have
given input about how to protect the recorded interviews (encrypted,
password protected, not stored in the cloud). I do not work with qual data,
so I need help recommending a methodology for anonymizing quotes in
their theses.

(I am inquiring about this for a student who, unfortunately, has not
received helpful advice from their supervisor.) ☹

The students assumed they would assign each participant an
identification number, and then attribute each quote to an ID # in their
thesis. However, I feel there is surely a better way to ensure anonymity?
(Too easy to re-identify if the research data were obtained.)

What methods do you utilize for anonymizing individual interview data? Or
are there manuscripts/books helpful for this? Sadly, the students are
nearing the end of the study, but late is better than never. (It’s indeed
a failure of universities, as well as unequipped supervisors!)

Best,
Cory

_______________________________________________
The Air-L at listserv.aoir.org mailing list
is provided by the Association of Internet Researchers http://aoir.org
Subscribe, change options or unsubscribe at: http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org

Join the Association of Internet Researchers:
http://www.aoir.org/


