[Air-L] Twitter Collection and Analysis Toolkit seeks caring new owner

ehs at pobox.com ehs at pobox.com
Sun May 31 17:31:35 PDT 2020

   Hi Jacob,
   I am sorry to hear about DMI-TCAT needing to be turned off. But I
   certainly appreciate the costs of keeping a service like this online.
   I wonder if it might be feasible to generate tweet ID datasets for the
   various collections and add them to the Documenting the Now Catalog
   [1]? The tweet IDs could then be "hydrated" by people who want to use
   the data, using tools such as the Hydrator [2] or twarc [3].
   If this sounds like it might be appropriate and you need some help, I
   would be willing to lend a hand, since I work on the Documenting the
   Now project.
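   A minimal sketch of how such a tweet ID dataset might be produced,
   assuming a collection has been exported as a CSV file whose tweet ID
   column is named `id`. The column name and the helper names
   `extract_tweet_ids` and `dehydrate_csv` are assumptions for
   illustration, not part of TCAT, the Hydrator, or twarc.

```python
import csv
import io


def extract_tweet_ids(rows, id_column="id"):
    """Yield unique, numeric tweet IDs in first-seen order.

    `rows` is any iterable of dicts (for example a csv.DictReader).
    The column name "id" is an assumption about the export format.
    """
    seen = set()
    for row in rows:
        tweet_id = (row.get(id_column) or "").strip()
        # Skip blanks, malformed values, and IDs already emitted.
        if tweet_id.isascii() and tweet_id.isdigit() and tweet_id not in seen:
            seen.add(tweet_id)
            yield tweet_id


def dehydrate_csv(source, dest, id_column="id"):
    """Copy tweet IDs from a CSV file object to `dest`, one per line.

    Returns the number of IDs written.
    """
    count = 0
    for tweet_id in extract_tweet_ids(csv.DictReader(source), id_column):
        dest.write(tweet_id + "\n")
        count += 1
    return count


if __name__ == "__main__":
    # Tiny in-memory demonstration with made-up rows.
    sample = io.StringIO("id,text\n123,hello\n456,world\n123,duplicate\n")
    out = io.StringIO()
    written = dehydrate_csv(sample, out)
    print(written, "IDs written")
```

   With real files, the same function could be called on
   `open("export.csv", newline="", encoding="utf-8")` and a writable
   text file. The resulting one-ID-per-line file is the kind of input
   that hydration tools accept, for example `twarc hydrate ids.txt`.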
   Ed Summers
   [1] https://catalog.docnow.io
   [2] https://github.com/docnow/hydrator
   [3] https://github.com/docnow/twarc
   On May 30, 2020 2:28 PM, Jacob Groshek <jgroshek at gmail.com> wrote:

     Dear Colleagues and Friends,
     I hope this message finds everyone doing well and managing through
     this difficult time.
     I am posting to this list because, after more than 7 years, I am
     decommissioning a Digital Methods Initiative Twitter Collection and
     Analysis Toolkit (DMI-TCAT) installation that I oversee. More
     details on this system, developed by Erik Borra and Bernhard Rieder,
     are available on GitHub here:
     https://github.com/digitalmethodsinitiative/dmi-tcat
     While this is an open source platform that was generously made
     available by the developers (sincere and deep thanks to Erik and
     Bernhard), there can be costs involved with hosting and storage of
     the system
     data. So while there are a variety of reasons for my decision,
     including the finite amount of research funds that I can dedicate
     each year, I am saddened to be pulling the plug, especially because
     I am fully aware of the value this system carries and its importance
     as a resource for
     To cut to the chase, it costs roughly $370 per month to host and
     store TCAT on Amazon Web Services (AWS), and right now this TCAT
     install is at about 70% elastic capacity with more than
     *560 million tweets* collected over the last few years on a wide
     variety of topics, many of which relate to political and health
     communication.
     If anyone is interested in taking over this system, or if there are
     out there, please reach out to me off list and I'm happy to
     discuss. If
     there are no takers, this data will no longer be maintained by me
     after June 30, and repositories of literally hundreds of millions of
     tweets on COVID-19 and other topics will be lost to the ether.
     Sorry for the long-ish message and thanks for your consideration.
     Best regards,
     Dr. Jacob Groshek
     Ross Beach Chair in Emerging Media Research and Associate Professor
     Kansas State University
     jacobgroshek.com | @jacobgroshek <https://twitter.com/jacobgroshek>
     | google
     Honorary Associate Professor, Roskilde University
     Associate Director, CMCS @ Boston U <https://sites.bu.edu/cmcs/> |
     Founding Editor, *JoCTEC
     Previously: Erasmus U | NeSCoR <http://nescor.socsci.uva.nl/> |
     Boston Civic Media <http://bostoncivic.media/> |
     IAST <http://www.iast.fr/>
     The Air-L at listserv.aoir.org mailing list
     is provided by the Association of Internet Researchers
