[Air-L] twitter update

Shulman, Stu stu at texifter.com
Mon Feb 27 03:59:43 PST 2023


It is proving harder to turn off the Twitter APIs than imagined. Data is
still flowing and there remains no clear messaging about the launch of the
new regime. It seems that operating Twitter itself depends on the API. Many
Twitter researchers already have important but understudied collections.
For example, these are my "Top 100" largest datasets:
https://tinyurl.com/100LargeDatasets. These 100 curated "medium" datasets
total 185 million records. I have about 1,000 smaller datasets collected
over the last decade. As just one person in a massive ecosystem of
disparate collections, my guess is that across all researchers at all
institutions there are more highly relevant, valuable, understudied Twitter
datasets than academia can fully parse. This is not to say the fight to
keep the data flowing and low-cost or free is unimportant. However, take a
quick look at my Top 100, then imagine all the data already stored by the
academics (especially in computer science departments) who work at
10X-100X the scale of what I do. The key is to create more accessible repositories so
that we can support teaching and research with (please forgive me) the
"bird in the hand" and not lose hope over unlimited birds in the bush. To
that end, if you see a dataset on the list and you want to study it or
teach it (maybe both), that is legal, free, and possible on demand via
DiscoverText.

~Stu

-- 
Dr. Stuart W. Shulman
Founder and CEO, Texifter
Editor Emeritus, *Journal of Information Technology & Politics*

