[Air-L] irb approach to data like panama papers / wikileaks / etc?

Fri May 20 12:17:30 PDT 2016

Hi, for a study I'm doing I was wondering if people on this list could
contact me offline at (kalev.leetaru5 at gmail.com) with any pointers or
personal experiences of how their IRBs are addressing the issue of academic
research using data from data breaches, e.g., the Panama Papers, Wikileaks,
the Sony emails, Ashley Madison data, and any of the myriad other datasets
now floating around.

Researchers from several major US institutions I've spoken with thus far
have shared IRB approval forms with me that show their particular IRBs
accepting the argument that any data, no matter how sensitive, which can be
downloaded openly from the web counts as "public domain" and falls under an
IRB exemption for existing public data. In particular, the IRB approvals
I've seen accepted that any personally identifying information in the
datasets, no matter how sensitive, is exempt because it is now publicly
accessible. I'm thus extremely curious whether this is a general trend and
how other IRBs are treating the use of hacked datasets which are widely
accessible online.

In an era in which academic researchers can, with a few mouse clicks,
easily access everything from breached medical records to password archives
to sexual preferences to financial statements to just about any other kind
of dataset you can imagine, there are all kinds of questions around whether
those datasets should be available for academic research, and I'm curious
how IRBs are leaning right now.

I realize there are myriad professional ethics guidelines out there put
forward by the various professional societies, but from browsing recent
journal issues from a cross-section of fields including the social,
information, and computer sciences, I've turned up countless papers using
breached datasets, and those papers in fields that traditionally use IRBs
have all claimed full IRB approval, while I've found in the computer
sciences that few of the researchers I've spoken with thus far have either
heard of the IRB or believe that their research is subject to IRB.

Thus, my main interest is really at the institutional level - how are IRBs
and universities handling the issue of their scholars using data from
breaches like Wikileaks, Panama Papers, Sony emails, breaches from
government agencies, etc.? I'd also like to hear from those of you who are
journal editors or who have gone through that process: how do you handle
the issue of whether to publish a paper using something like Wikileaks data
and, *in particular*, how do you handle the issue of hosting portions of the
breached data in the replication archive of your journal's website?

I know this can be a deeply impassioned area of discourse and for my
particular study I'm *solely* focused on how institutions, especially IRBs
and also journals, are currently handling the issue of breached data like
Wikileaks/Panama Papers/etc in academic publications.

Thanks so much in advance!

Kalev