[Air-L] twarc & Twitter's v2 API

Ed Summers ehs at pobox.com
Thu Apr 8 05:20:33 PDT 2021


The Documenting the Now project is pleased to announce that the twarc
data collection utility [1] has been updated to work with the Twitter
v2 API:

https://news.docnow.io/twarc2-779278e66ea0

This means that twarc can now be used to collect data from the
historical archive, if you have been granted access to Twitter's
Academic Research Product Track. Previously, only the last 7 days or so
of tweets were available from Twitter without paying.
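
As a rough illustration of what historical archive access looks like at
the API level, here is a minimal Python sketch that builds (but does not
send) a request to the v2 full-archive search endpoint. The helper name
build_archive_search, the example query, and the dates are made up for
illustration and are not part of twarc, which also takes care of paging
and rate limits for you.

```python
from urllib.parse import urlencode
from urllib.request import Request

# v2 full-archive search endpoint (needs Academic Research access).
SEARCH_ALL_URL = "https://api.twitter.com/2/tweets/search/all"


def build_archive_search(bearer_token, query, start_time, end_time,
                         max_results=100):
    """Build (without sending) a GET request for full-archive search.

    Hypothetical helper for illustration only; not part of twarc.
    """
    params = urlencode({
        "query": query,
        "start_time": start_time,  # UTC timestamps, e.g. 2014-08-01T00:00:00Z
        "end_time": end_time,
        "max_results": max_results,
    })
    return Request(
        f"{SEARCH_ALL_URL}?{params}",
        headers={"Authorization": f"Bearer {bearer_token}"},
    )


# Placeholder values, not real credentials or a recommended query.
request = build_archive_search(
    "YOUR_BEARER_TOKEN",
    "#archives",
    "2014-08-01T00:00:00Z",
    "2014-09-01T00:00:00Z",
)
print(request.full_url)
```

The point is only that start_time and end_time can now reach back well
beyond the last week; nothing is sent over the network here.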

It also means you can take advantage of new features of the v2 API such
as retrieving all the tweets in a conversation thread, or entity
annotations for people, places, products and organizations mentioned in
tweets.
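
To give a sense of what a conversation thread looks like in the data,
here is a small Python sketch that rebuilds the reply structure from
tweets sharing a conversation_id. The sample tweets are made up, and
build_reply_tree is a hypothetical helper rather than anything twarc
provides; it assumes each tweet carries the referenced_tweets field.

```python
from collections import defaultdict


def build_reply_tree(tweets):
    """Map each tweet id to the ids of the tweets replying to it directly."""
    replies = defaultdict(list)
    for tweet in tweets:
        for ref in tweet.get("referenced_tweets", []):
            if ref["type"] == "replied_to":
                replies[ref["id"]].append(tweet["id"])
    return dict(replies)


# Made-up tweets shaped like v2 tweet objects. They all share one
# conversation_id, which is the id of the tweet that started the thread.
thread = [
    {"id": "1", "conversation_id": "1", "text": "original tweet"},
    {"id": "2", "conversation_id": "1", "text": "first reply",
     "referenced_tweets": [{"type": "replied_to", "id": "1"}]},
    {"id": "3", "conversation_id": "1", "text": "reply to the reply",
     "referenced_tweets": [{"type": "replied_to", "id": "2"}]},
    {"id": "4", "conversation_id": "1", "text": "second reply",
     "referenced_tweets": [{"type": "replied_to", "id": "1"}]},
]

tree = build_reply_tree(thread)
print(tree)
```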

Although it had been fairly stable for the last ten years, Twitter's
canonical JSON representation of a tweet is greatly transformed by the
v2 API, since many aspects of the data are now optional. One of the most
useful features of the new version of twarc is that it will collect the
maximum amount of information available for a tweet, which lets you
perform your data collection knowing that all the various expansions
and fields will be present when you do your research.
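
To show why the expansions matter, here is a simplified Python sketch of
a v2 response, where the tweets live under "data" and the expanded
objects (such as the authors) live separately under "includes". The
function attach_authors, the "author" key, and the sample response are
assumptions for illustration; this is not twarc's own implementation.

```python
def attach_authors(response):
    """Return copies of the tweets with their author's user object attached.

    Hypothetical helper: looks up each tweet's author_id among the users
    that the API returned under includes.users.
    """
    users = {u["id"]: u for u in response.get("includes", {}).get("users", [])}
    tweets = []
    for tweet in response.get("data", []):
        merged = dict(tweet)  # shallow copy so the response is left untouched
        merged["author"] = users.get(tweet.get("author_id"))
        tweets.append(merged)
    return tweets


# Made-up response shaped like a v2 API page with the author expansion.
response = {
    "data": [
        {"id": "10", "author_id": "7", "text": "hello"},
    ],
    "includes": {
        "users": [
            {"id": "7", "username": "example_user"},
        ],
    },
}

flat = attach_authors(response)
print(flat[0]["author"]["username"])
```

If the author expansion had not been requested at collection time, the
lookup would come back empty, which is the situation that collecting
everything up front avoids.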

Finally, we've redesigned twarc to allow for plugins [2] which extend
the twarc2 base command to let it do things like extract videos [3]
from tweet data, or convert Twitter's JSON to CSV [4]. You can install
these plugins just like you install twarc, and write and share your
own.

If you need help getting started or have feedback for us, we would love
to hear from you here, in comments on the blog post, or in the
Documenting the Now Slack [5]. And if it's appropriate, please feel
free to share your tweet identifier datasets in the Catalog [6].

Thanks!

Ed Summers
University of Maryland

[1] https://twarc-project.readthedocs.io/en/latest/
[2] https://twarc-project.readthedocs.io/en/latest/plugins/
[3] https://pypi.org/project/twarc-videos/
[4] https://pypi.org/project/twarc-csv/
[5] https://bit.ly/docnow-slack
[6] https://catalog.docnow.io



