[Air-L] Reddit dataset

Alex Leavitt alexleavitt at gmail.com
Sat Jul 4 10:51:03 PDT 2015


Just to give more context, the reddit API cannot be used to scrape 'private'
subreddits. So yes, the dataset is entirely public data. That said, I think
there are some social data issues to consider, such as persistence and
access, but technically those issues already exist through reddit's search
functionality (and actually play an important role in accountability on the
platform for individuals).


---

Alexander Leavitt
PhD Candidate
USC Annenberg School for Communication & Journalism
http://alexleavitt.com
Twitter: @alexleavitt <http://twitter.com/alexleavitt>


On Sat, Jul 4, 2015 at 5:31 AM, Michael T Zimmer <zimmerm at uwm.edu> wrote:

> I saw that, but has it since been deleted by the OP? I can’t seem to find
> it, nor the thread on Reddit.
>
> IIRC, a Redditor used the site’s API to grab all comments ever posted to
> the site, but it wasn’t clear if this included subreddits that are set as
> “private.”
>
> In the Facebook post, the OP appeared to cast aside concerns over consent
> since the data was “public”. While I’d agree the data is public in the
> sense that anyone could access it (if they had the URLs, search
> capabilities, time, etc. to do so), this calculus is much too simplistic and
> ignores the contextual nature of those comments. As another commenter on
> the FB thread (was that you, Katy? I can’t remember) noted, just because
> someone posted a comment to Reddit doesn’t mean they’ve necessarily
> consented to having that data included in a research study. Plus, as the
> commenter noted, there likely are minors in the dataset, which complicates
> consent.
>
> This relates closely to what I discuss in my article “‘But the data is
> already public’: on the ethics of research in Facebook”, and to what others
> have covered as well, especially in the AoIR Ethics Guidelines:
> http://ethics.aoir.org/
>
> Michael
>
> --
> Michael Zimmer, PhD
> Associate Professor, School of Information Studies
> Director, Center for Information Policy Research
> University of Wisconsin-Milwaukee
> e: zimmerm at uwm.edu
> w: www.michaelzimmer.org
>
>
> > On Jul 3, 2015, at 9:41 PM, Katy Pearce <katycarvt at gmail.com> wrote:
> >
> > Someone posted a link to a dataset of Reddit posts to the AOIR Facebook
> > page.
> > I wonder what the AOIR community members feel about this in terms of this
> > being "public" data.
> > _______________________________________________
> > The Air-L at listserv.aoir.org mailing list
> > is provided by the Association of Internet Researchers http://aoir.org
> > Subscribe, change options or unsubscribe at:
> > http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
> >
> > Join the Association of Internet Researchers:
> > http://www.aoir.org/
>


