[Air-L] [External] Re: Buying tweets ?

Stuart Shulman stuart.shulman at gmail.com
Thu Sep 10 10:49:38 PDT 2020


There is a difference between not being able or willing and it not
being possible.
https://developer.twitter.com/en/docs/twitter-api/v1/tweets/compliance/overview

It is definitely possible and also required.
Many research labs in academia find ways to comply with the
documentation above.
More, I suspect, either know and ignore it or simply do not know what
is expected.
It is similar to the need to comply with the office of sponsored
research protocols for human subjects.
You have to minimize the risk of harm.
No compromises for researcher convenience or productivity demands.
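
As a rough illustration of what acting on deletions can look like in
code, here is a minimal Python sketch. It assumes tweets are kept in a
dict keyed by tweet ID string and that delete notices arrive as JSON
lines shaped like {"delete": {"status": {"id_str": ...}}}. The function
name apply_deletions and the storage layout are hypothetical, and the
exact notice format should be checked against the compliance
documentation linked above.

```python
import json


def apply_deletions(tweets, notices):
    """Return a copy of `tweets` without any status named in a delete notice.

    `tweets` is a dict keyed by tweet ID string (an assumed storage layout).
    `notices` is an iterable of JSON lines; the delete-notice shape used
    here is an assumption to verify against the compliance documentation.
    """
    deleted_ids = set()
    for line in notices:
        line = line.strip()
        if not line:
            continue  # skip blank keep-alive lines
        message = json.loads(line)
        delete = message.get("delete") if isinstance(message, dict) else None
        status = delete.get("status") if isinstance(delete, dict) else None
        if status and "id_str" in status:
            deleted_ids.add(status["id_str"])
    # Keep only tweets that no delete notice referred to.
    return {tid: tweet for tid, tweet in tweets.items() if tid not in deleted_ids}
```

Running this over a stored collection whenever new notices arrive keeps
the local copy from accumulating material that is no longer public.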

~SWS








On Thu, Sep 10, 2020 at 1:33 PM Deen Freelon <dfreelon at gmail.com> wrote:

> Sure, many countries have a right to be forgotten. The US doesn't, and
> AFAIK there's little clear case law that applies to individuals'
> presence in research datasets. If someone asks me to remove their data
> from my datasets, I'm happy to do so, but I'm not willing to
> prospectively monitor Twitter's platform for deletions so that my
> datasets always match what is currently available on Twitter. That is
> technically infeasible for me, and I suspect for many others as well.
>
> The practicality aspect I mentioned applies also to users. You can ask
> AIR-L members to remove your data, but what assurances do you have that
> they've done so? It's impossible even to check that they've actually
> read your message. Now consider all the other researchers' datasets of
> which your data may be a part--there's no way to even know who to ask.
> And all of this to prevent your data from being one point among
> millions, with no exposure of any identifying information? It's little
> wonder yours is the first data removal request I've ever received, but
> as I said, I'll honor it. /DEEN
>
> On 9/10/2020 10:49 AM, Stuart Shulman wrote:
> > There is nothing theoretical about checking in real time for
> > deletions. When you study a Tweet's content in the Twitter display, if
> > a Tweet is deleted or an account suspended or deleted, the Tweet will
> > not display. That is real time compliance. We have done it for many
> > years now, all the while advising students and faculty on the
> > ethical importance of this point.
> >
> > The "right to be forgotten" is law in many countries, so I am unsure
> > how that is unresolved. Something is either legal or it is not. If
> > anyone reading this has any of my deleted Tweets from my deleted
> > account, the Canadian part of me requests you immediately delete them.
> > If you lack the ability to check for compliance in real time, should
> > you be handling my data and violating my right to be forgotten under
> > the broad banner of research? I have tweeted extensively about acts by
> > a hostile foreign power to game the imminent election. I have recently
> > deleted personal Facebook, YouTube and Twitter accounts. Nobody has
> > any business holding that data. It is unethical.
> >
> > There are various guidelines about legally sharing lists of Tweet IDs
> > for rehydration and replication (something almost never done) versus
> > sharing spreadsheets of complete data extracts or the raw JSON, which
> > is done all the time in defiance of the Twitter ToS.
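
A minimal Python sketch of the ID-only approach, for illustration:
it reduces a stored collection to a list of Tweet IDs that others could
rehydrate. It assumes one tweet object per JSON line with an "id_str"
field, as in the v1.1 tweet format; the function name
extract_tweet_ids is hypothetical.

```python
import json


def extract_tweet_ids(json_lines):
    """Return unique tweet IDs, in first-seen order, from JSON lines.

    Assumes each non-empty line is one tweet object carrying an
    "id_str" field; lines without it are ignored.
    """
    seen = set()
    ids = []
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        tweet = json.loads(line)
        tweet_id = tweet.get("id_str")
        if tweet_id and tweet_id not in seen:
            seen.add(tweet_id)
            ids.append(tweet_id)
    return ids
```

Writing the returned IDs one per line to a text file gives a shareable
list with no text or metadata attached.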
> >
> > On Thu, Sep 10, 2020 at 9:55 AM Deen Freelon <dfreelon at gmail.com> wrote:
> >
> >     The streaming API is great--if 1) you're certain of all your search
> >     criteria in advance, and 2) you have a well-calibrated
> >     hardware/software setup dedicated to real-time data collection. In
> >     practice, researchers often need to add keywords and other criteria
> >     after the fact, or the significance of certain events/individuals
> >     only becomes clear later, or the limitations of their data
> >     collection setup do not become apparent until it fails, etc. The
> >     obvious advantage of streaming is that it's free, but it is not
> >     viable for a substantial subset of use cases. Sometimes the only way
> >     to obtain high-quality historical Twitter data is to purchase it.
> >
> >     The ethical issues Stu mentions are as yet unresolved. Theoretically,
> >     users should be able to demand removal of their data from academic
> >     databases at any time, but this is a practical impossibility. Most
> >     Twitter-based research would be impossible without storage of the
> >     full text and metadata, and there are no widely-used guidelines about
> >     how that data should be managed or sunset aside from Twitter's
> >     prohibition on the sharing of complete datasets. Certainly something
> >     we should continue to think about how best to address... /DEEN
> >
> >     On 9/10/2020 9:39 AM, Stuart Shulman wrote:
> >     > There is some scholarship on the various options which have
> >     > continued to evolve over time with respect to Twitter.
> >     > https://www.mdpi.com/1660-4601/17/3/864
> >     >
> >     > I agree that there is usually sufficient free, real time data to
> >     > gather from Twitter to reach saturation on most real time current
> >     > research questions.
> >     >
> >     > My personal experience with the cost and regulation of historical
> >     > Twitter access for academia is a cautionary tale. Most of what
> >     > academics want to study, and often the way they do it, violates the
> >     > clear language of the Twitter Terms of Service and also the
> >     > increasingly widespread right to be forgotten. If you are storing
> >     > spreadsheets of Twitter data that include, over time, more and more
> >     > material from deleted accounts or deleted Tweets, this is
> >     > problematic from a legal perspective and raises ethical review
> >     > questions that should not be glossed over in any wikileaks fashion
> >     > by journal editors or university ethics officers.
> >     >
> >     > ~Stu
> >     > Dr. Stu Shulman
> >     > U.S. Soccer Federation C-Licensed Coach
> >     > Valeo FC & Capacidad <http://capacidadprograms.org/?page_id=13> Volunteer Coach
> >     > Is your player ready to give back to the game? Contact Coach Stu
> >     > about winter & spring 2020 volunteer efforts.
> >     >
> >     > On Thu, Sep 10, 2020 at 9:24 AM Peter Joseph Gloviczki PhD
> >     > <pgloviczki at coker.edu> wrote:
> >     >
> >     >     Deen makes a great point about sponsorship. I was referring to
> >     >     having to use personal funds.
> >     >
> >     >     Fondly, Peter
> >     >
> >     >     Peter Joseph Gloviczki, PhD    he/him/his
> >     >     Associate Professor of Communication
> >     >     Coker University
> >     >     300 East College Avenue
> >     >     Hartsville, South Carolina 29550
> >     >     843.383.8379
> >     >     pgloviczki at coker.edu
> >     >
> >     >     Assistant Editor, Journal of Loss and Trauma (Taylor & Francis)
> >     >     Immediate Past Head, Cultural and Critical Studies Division, AEJMC
> >     >     1st Vice President, Carolinas Communication Association
> >     >
> >     >
> >     >     On Thu, Sep 10, 2020 at 9:20 AM Deen Freelon <dfreelon at gmail.com> wrote:
> >     >
> >     >     > I have purchased tweets directly from Twitter on multiple
> >     >     > occasions. I disagree with Dr. Gloviczki about paying research
> >     >     > costs--some of the most rigorous research is sponsored. I
> >     >     > wouldn't spend my own personal money on such costs, but if
> >     >     > you've got funding, by all means use it.
> >     >     >
> >     >     > Twitter allows university-affiliated users to buy data in a few
> >     >     > ways. I've primarily used their a la carte service (that's just
> >     >     > what I call it), where you give them a set of search criteria
> >     >     > (e.g. a keyword[s] and a time period) and they give you a quote.
> >     >     > Pricing is based on the number of days covered and the total
> >     >     > volume of tweets. Their minimum price is a little over $1k US
> >     >     > and costs can quickly run into the five-figure range, especially
> >     >     > if you want tweets over a lengthy period of time. Also, they
> >     >     > have been known to refuse certain data requests, especially
> >     >     > those related to international conflict. The criteria for
> >     >     > "acceptable" data requests are not public--I've asked.
> >     >     >
> >     >     > Twitter does not advertise this service but it does exist. Fill
> >     >     > out this form and ask about it:
> >     >     >
> >     >     > https://developer.twitter.com/en/products/twitter-api/enterprise/application
> >     >     >
> >     >     > The associated metadata are the same as provided through the
> >     >     > standard APIs. These can be found here:
> >     >     >
> >     >     > https://developer.twitter.com/en/docs/twitter-api/v1/tweets/post-and-engage/api-reference/get-statuses-lookup
> >     >     > Language tags are included, and geographic info is present only
> >     >     > when users opt in, which is rare (typically 3-5% of tweets). I
> >     >     > will also say that obtaining the data once purchased is not
> >     >     > easy--they come as GZIPped JSON files packaged in 10-minute
> >     >     > increments. So a year of data is far too much to download
> >     >     > manually--you'd need to automate your download pipeline. I've
> >     >     > written code to do this, so anyone who manages to successfully
> >     >     > buy Twitter data may feel free to contact me to access my
> >     >     > scripts.
> >     >     >
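For readers wondering what handling those files involves once they are
downloaded, here is a minimal Python sketch (not Deen's scripts, which
are not shown in this thread). It assumes the files sit in one local
directory, are named *.json.gz, and hold one JSON object per line; the
naming pattern, the line-per-object layout, and the function name
iter_tweets are all assumptions for illustration.

```python
import gzip
import json
from pathlib import Path


def iter_tweets(directory):
    """Yield one parsed JSON object per non-empty line of every
    *.json.gz file in `directory`, visiting files in sorted name order.

    The file-naming pattern and one-object-per-line layout are assumed,
    not confirmed by the thread.
    """
    for path in sorted(Path(directory).glob("*.json.gz")):
        # "rt" mode decompresses and decodes to text in one step.
        with gzip.open(path, "rt", encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if line:
                    yield json.loads(line)
```

Because it is a generator, a year of 10-minute files can be processed
one tweet at a time without loading everything into memory.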
> >     >     > Twitter also offers a couple other data purchase options,
> >     >     > including its Premium API
> >     >     > (https://developer.twitter.com/en/products/twitter-api/premium-apis)
> >     >     > and its Enterprise API
> >     >     > (https://developer.twitter.com/en/products/twitter-api/enterprise#/).
> >     >     > These charge pretty steep monthly fees and are oriented more
> >     >     > toward corporate and other well-funded clients.
> >     >     >
> >     >     > Finally, here's their portal for academic researchers, which
> >     >     > may have some relevant info:
> >     >     >
> >     >     > https://developer.twitter.com/en/solutions/academic-research/products-for-researchers
> >     >     >
> >     >     > Best, /DEEN
> >     >     >
> >     >     > On 9/10/2020 8:39 AM, Sandrine Roginsky wrote:
> >     >     > > Hello everybody,
> >     >     > >
> >     >     > > Help needed. Does anyone have experience with buying tweets
> >     >     > > from Twitter for research? We have a fairly specific query and
> >     >     > > would like to know which information is given about the tweets
> >     >     > > harvested through the query (e.g. is language or geographic
> >     >     > > information given for tweets, even if it isn't part of the
> >     >     > > query - so not a selection criterion)?
> >     >     > >
> >     >     > > Many thanks.
> >     >     > >
> >     >     > > Best wishes,
> >     >     > > Sandrine
> >     >     > >
> >     >     > >
> >     >     > >
> >     >     > > Sandrine Roginsky
> >     >     > > Associate Professor
> >     >     > >
> >     >     > > Faculty of Economic, Social and Political Sciences, and Communication
> >     >     > > Institute Language & Communication, PCOM / LASCO
> >     >     > >
> >     >     > > _______________________________________________
> >     >     > > The Air-L at listserv.aoir.org mailing list
> >     >     > > is provided by the Association of Internet Researchers http://aoir.org
> >     >     > > Subscribe, change options or unsubscribe at:
> >     >     > > http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
> >     >     > >
> >     >     > > Join the Association of Internet Researchers:
> >     >     > > http://www.aoir.org/
> >     >     >
> >     >     > --
> >     >     > Deen Freelon, Ph.D.
> >     >     > Associate Professor -> Hussman School of Journalism and Media
> >     >     > Principal Researcher -> Center for Information, Technology, and Public Life
> >     >     > University of North Carolina at Chapel Hill
> >     >     > http://dfreelon.org | @dfreelon <https://twitter.com/dfreelon> |
> >     >     > https://github.com/dfreelon | https://citap.unc.edu/
> >     >     > Schedule an appointment with me
> >     >     > <https://doodle.com/mm/deenfreelon/book-a-time>


