[Air-L] Ethics of using hacked data.

Nathaniel Poor natpoor at gmail.com
Wed Oct 7 17:26:40 PDT 2015


Peter and list -

We are academic researchers. When I said something to the effect of “trying to help crowdfunders do it better,” I meant “I am an idealistic ex-professor who still does academic research, and I still hope some of my work will make the world a better place and not sit unread in a dusty journal on a shelf.” Since this line of work is about crowdfunding, that is the practice it will inform and improve. (We find crowdfunding won’t work for everyone, so keep alternate and longer-established funding mechanisms like the NEH in the US.)

I realize I haven’t been to AoIR in quite some time, but the first one I went to was Toronto, 2003. I’m an academic. Peter, I see we have some shared connections on LinkedIn.

We have two published papers on this topic in journals most of this list will know: NMS and iCS.

Davidson, R., & Poor, N. (2015). The barriers facing artists’ use of crowdfunding platforms: Personality, emotional labor, and going to the well one too many times. New Media & Society, 17(2), 289-307.
http://nms.sagepub.com/content/early/2014/11/24/1461444814558916.abstract

Davidson, R., & Poor, N. (Forthcoming). Why sugar daddies are only good for Bar-Mitzvahs: Exploring the limits on repeat crowdfunding. Information, Communication, and Society.

I left Roei out of it since he would think I’m a bit daft for asking (I am 99.5% sure he isn’t on this list…). He is a professor (just got tenure and is on his sabbatical year!). I, though, am not (it’s not my thing). But I’m one of those pesky “independent scholar” people: a ronin professor who doesn’t teach, which leaves me feeling misunderstood and annoyed with conference registration web pages that require an affiliation. So I don’t have an IRB or tenure committee to worry about, but I want to do good work regardless. Roei, however, does have an IRB and, one day, a promotion committee. This also means I get to mouth off about academic annoyances on Facebook, much to the irritation of some of my friends and to my great delight. (Granted, I know some of you who do this and yet are professors….)

-Nat

-------------------------------
Nathaniel Poor, Ph.D.
http://natpoor.blogspot.com/
https://sites.google.com/site/natpoor/


> On Oct 7, 2015, at 4:54 PM, Peter Timusk <peterotimusk at gmail.com> wrote:
> 
> I think one could look a little at the consequences of what you are doing. Seems you are trying to make money by researching funding data, is that right? I find that unethical, but I find all kinds of data mining unethical. There are ways to use your same skill sets that could benefit society. Maybe I don't understand what your end result is about.
> 
> Peter Timusk
> peterotimusk at gmail.com
> I do not speak for my employer or charities or political parties or unions I volunteer with or belong to, unless otherwise noted.
> 
> 
>> On Oct 7, 2015, at 4:11 PM, Nathaniel Poor <natpoor at gmail.com> wrote:
>> 
>> Hello list-
>> 
>> I recently got into a discussion with a colleague about the ethics of using
>> hacked data, specifically the Patreon hacked data (see here:
>> http://arstechnica.com/security/2015/10/gigabytes-of-user-data-from-hack-of-patreon-donations-site-dumped-online/
>> ).
>> 
>> He and I do crowdfunding work, and had wanted to look at Patreon, but as
>> far as I can tell they have no easy hook into all their projects (for
>> scraping), so, to me this data hack was like a gift! But he said there was
>> no way we could use it. We aren't doing sentiment analysis or anything, we
>> would use aggregated measures like funding levels and then report things
>> like means and maybe a regression, so there would be no identifiable
>> information whatsoever derived from the hacked data in any of our resulting
>> work (we might go to the site and pull some quotes).
>> 
>> I looked at the AoIR ethics guidelines ( http://aoir.org/reports/ethics2.pdf
>> ), and didn't see anything specifically about hacked data (I don't think
>> "hacked" is the best word, but I don't like "stolen" either, but those are
>> different discussions).
>> 
>> One relevant line I noticed was this one:
>> "If access to an online context is publicly available, do
>> members/participants/authors
>> perceive the context to be public?" (p. 8)
>> So, the problem with the data is that it's the entire website, so some was
>> private and some was public, but now it's all public and everyone knows
>> it's public.
>> 
>> To me, I agree that a lot of the data in the data-dump had been intended to
>> be private -- apparently, direct messages are in there -- but we wouldn't
>> use that data (it's not something we're interested in). We'd use data like
>> number of funders and funding levels and then aggregate everything. I see
>> that some of it was meant to be private, but given the entire site was
>> hacked and exported I don't see how currently anyone could have an
>> expectation of privacy any more. I'm not trying to torture the definition,
>> it's just that it was private until it wasn't.
>> 
>> I can see that some academic researchers -- at least those in computer
>> security -- would be interested in this data and should be able to publish
>> in peer reviewed journals about it, in an anonymized manner (probably as an
>> example of "here's a data hack like what we are talking about, here's what
>> hackers released").
>> 
>> I also think that probably every script kiddie has downloaded the data, as
>> has every grey and black market email list spammer, and probably every
>> botnet purveyor (for passwords) and maybe even the hacking arm of the
>> Chinese army and the NSA. My point here is that if we were to use the data
>> in academic research we wouldn't be publicizing it to nefarious people who
>> would misuse it since all of those people already have it. We could maybe
>> help people who want to use crowdfunding some (hopefully!) if we have some
>> results. (I guess I don't see that we would be doing any harm by using it.)
>> 
>> 
>> So, what do people think? Did I miss something in the AoIR guidelines? I
>> realize I don't think it's clear either way, or I wouldn't be asking, so
>> probably the answers will point to this as a grey area (so why do I even
>> ask, I am not sure).
>> 
>> But I'm not looking for "You can't use it because it's hacked," because I
>> don't think that explains anything. I could counter that with "It is
>> publicly available found data," because it is, although I don't think
>> that's the best reply either. Both lack nuance.
>> 
>> -Nat
>> 
>> -- 
>> Nathaniel Poor, Ph.D.
>> http://natpoor.blogspot.com
>> _______________________________________________
>> The Air-L at listserv.aoir.org mailing list
>> is provided by the Association of Internet Researchers http://aoir.org
>> Subscribe, change options or unsubscribe at: http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
>> 
>> Join the Association of Internet Researchers:
>> http://www.aoir.org/





