[Air-L] Suggestions for anonymizing qual interview data

Cecilia Aragon aragon at uw.edu
Fri Apr 9 10:41:49 PDT 2021


Cory, thanks for asking this important question, and Michael, thank you for
pointing out some issues of intersectional re-identification in qualitative
research, particularly for marginalized participants. This is a huge social
justice issue where more research needs to be done.

As a member of multiple underrepresented groups, I've frequently been asked
to be a participant in qualitative research. I've found that many very
well-meaning and intelligent people, including both researchers and IRBs,
are unaware of the risks to vulnerable populations, particularly when
intersectionality is involved. Researchers often apply naïve techniques for
anonymity, and IRBs assume they are sufficient.

Some IRB-approved projects clearly re-identified me and put me at risk,
e.g. "anonymous participant X is a female computer science PhD student who
is also an aerobatic pilot" (in a department where a professor had told me
"women don't have the intellectual ability for computer science"). So I
started requiring researchers to let me approve and/or edit the final text
to go into published reports (and yes, it was a lot of labor for me). Some
of the techniques I ended up asking researchers to use:

   - Apply a qualitative analog of differential privacy: add "noise" by
   obfuscating parts of the quotes and identifiers. For example, change the
   above description to "anonymous participant X is a female computer science
   PhD student who is also a competitive skier." This maintains the contextual
   integrity of the quote but obfuscates an easily identifiable characteristic.
   - Don't retain identifiers throughout the document for the same
   individual. It can be easy to re-identify an individual from a series of
   quotes, particularly if they have intersectional identities. Instead, you
   can attribute some of the quotes to "participant X, a female computer
   science student" and the rest to "participant Y, a Latinx computer science
   student."
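
For anyone preparing quotes with a script, the two techniques above might be
sketched roughly as follows. This is only an illustrative sketch, not an
established tool: the names Quote, swap_identifiers and split_attribution are
hypothetical, and the substitution table and persona labels are assumed to be
written by hand by the researcher, ideally with the participant's approval.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    participant: str  # internal ID, assumed never to be published
    text: str


def swap_identifiers(description: str, substitutions: dict[str, str]) -> str:
    """Replace each identifying phrase with a hand-chosen plausible stand-in."""
    for identifying, stand_in in substitutions.items():
        description = description.replace(identifying, stand_in)
    return description


def split_attribution(
    quotes: list[Quote], personas: dict[str, list[str]]
) -> list[tuple[str, str]]:
    """Rotate each participant's quotes across several published labels,
    so that no single label carries the full set of quotes."""
    used: dict[str, int] = {}
    published = []
    for quote in quotes:
        labels = personas[quote.participant]
        index = used.get(quote.participant, 0)
        published.append((labels[index % len(labels)], quote.text))
        used[quote.participant] = index + 1
    return published


description = swap_identifiers(
    "a female computer science PhD student who is also an aerobatic pilot",
    {"an aerobatic pilot": "a competitive skier"},
)

attributed = split_attribution(
    [Quote("p07", "first quote"), Quote("p07", "second quote"),
     Quote("p07", "third quote")],
    {"p07": ["participant X, a female computer science student",
             "participant Y, a Latinx computer science student"]},
)
```

A script like this only automates the bookkeeping. Deciding which details are
identifying, and which substitutes keep the context of a quote intact, still
needs human judgment and the participant's review.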

I'd love to see more research on this topic and would be happy to
collaborate on projects!

Cecilia

--
Cecilia R. Aragon, Professor
Director, Human-Centered Data Science Lab
Department of Human Centered Design & Engineering, University of
Washington, Seattle
http://faculty.washington.edu/aragon | @craragon
<https://twitter.com/craragon>
New memoir *Flying Free
<https://www.blackstonepublishing.com/flying-free-cecilia-aragon>: My
Victory over Fear to Become the First Latina Pilot on the US Aerobatic Team*


On Fri, Apr 9, 2021 at 6:43 AM Michael Muller <michael_muller at us.ibm.com>
wrote:

>    I think part of the thinking-process might be: How easily can someone
>    figure out the identity of the informant? If I were to say that I
>    interviewed people in our 8-person team, and if I report that one
>    person was working from the Pacific timezone, then it's easy to
>    determine which of us I am referring to. Or if I were to write that a
>    disabled member of the team said... then that's me. I know that seems
>    obvious for a tiny group, but these kinds of intersectional identities
>    can operate in larger groups, too.
>    A second way-of-thinking may involve a focus on the risk of disclosure
>    to the informant. However, this criterion often becomes a matter of the
>    researcher's imagination regarding the Other. It's been shown again and
>    again that people in a position of privilege and safety may not
>    understand the very real risks that are experienced by people who have
>    fewer safeguards - e.g., men writing about women's safety (how easily
>    can a stalker act on the information?), or straight people writing
>    about risk of identification of someone in one of the LGBTQIA+ spectra,
>    or citizens making assumptions about legal protections (or lack of
>    protections) for non-citizens. Of course, it's a good idea to discuss
>    these matters with people who are not ourselves, and who are not like
>    ourselves. It's also a good idea not to put the burden of explaining
>    bias on the person who is the target of that bias. Yes, I know that I
>    said two things that somewhat contradict each other. There are no easy
>    answers here.
>    A third possibility is to ask each informant to state what information
>    about themself would be safe to share. This is sensible only if the
>    informants understand publications and readerships, etc. But it may be
>    a more radically democratic approach to demographic description.
>    I'm suggesting these ideas as among a larger number of *starting
>    points* for thinking about difficult research questions. Please think
>    of them as heuristic questions - not as authoritative questions, and
>    certainly not as answers!
>    best wishes,
>    --michael
>    -----
>    Michael Muller, PhD, IBM Research, Cambridge MA USA
>    pronouns: he/him/his
>    ACM Distinguished Scientist
>    ACM SIGCHI Academy
>
>
>    >
>    > On Wed, Apr 7, 2021 at 3:47 AM Cory Robinson
>    > <cory.robinson at liu.se> wrote:
>    > Hi all,
>    >
>    > Two Master's students I recently met are conducting recorded interviews
>    > resulting in texts they will code and quote within their theses. I have
>    > given input about how to protect the recorded interviews (encrypted,
>    > password protected, not stored in the cloud). I do not work with qual data,
>    > so I need help recommending methodology or help for anonymizing quotes in
>    > their thesis.
>    >
>    > (I am inquiring about this for a student who, unfortunately, has not
>    > received helpful advice from their supervisor.)
>    >
>    > The students assumed they would assign each participant an
>    > identification number, and then attribute the quote and ID # in their
>    > thesis. However, I feel there is surely a better way to ensure anonymity?
>    > (Too easy to reidentify if research data was obtained.)
>    >
>    > What methods do you utilize for anonymizing individual interview data? Or
>    > manuscripts/books helpful for this? Sadly, the students are nearing the end
>    > of the study, but late is better than never. (It's indeed a failure of
>    > universities, as well as unequipped supervisors!)
>    >
>    > Best,
>    > Cory
>


