[Air-L] Twitter Collection and Analysis Toolkit seeks caring new owner
ehs at pobox.com
Sun May 31 17:31:35 PDT 2020
Hi Jacob,
I am sorry to hear about DMI-TCAT needing to be turned off. But I
certainly appreciate the costs of keeping a service like this online.
I wonder if it might be feasible to generate tweet ID datasets for the
various collections and add them to the Documenting the Now Catalog
[1]? The tweet IDs could then be "hydrated" by people who want to use
the data, using tools such as the Hydrator [2] or twarc [3].
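As a rough illustration of what generating a tweet ID dataset could look like, here is a minimal Python sketch. It assumes a collection has already been exported to CSV and that the tweet ID sits in a column named `id`; that column name, the function names, and the file names in the usage note are assumptions for illustration, not confirmed details of DMI-TCAT's export format. It writes one numeric ID per line, dropping duplicates and blank or non-numeric values.

```python
import csv
import io


def extract_tweet_ids(rows, id_column="id"):
    """Yield unique tweet IDs, in first-seen order, from an iterable of dict rows.

    `id_column` is an assumed column name; adjust it to match the real export.
    """
    seen = set()
    for row in rows:
        tweet_id = (row.get(id_column) or "").strip()
        # Keep only purely numeric IDs and skip anything already emitted.
        if tweet_id.isdigit() and tweet_id not in seen:
            seen.add(tweet_id)
            yield tweet_id


def write_id_file(csv_file, out_file, id_column="id"):
    """Read CSV rows from `csv_file`, write one tweet ID per line to `out_file`.

    Returns the number of IDs written.
    """
    reader = csv.DictReader(csv_file)
    count = 0
    for tweet_id in extract_tweet_ids(reader, id_column):
        out_file.write(tweet_id + "\n")
        count += 1
    return count


if __name__ == "__main__":
    # Tiny in-memory example standing in for a real export file.
    sample = io.StringIO("id,text\n123,hello\n456,world\n123,duplicate\n")
    output = io.StringIO()
    written = write_id_file(sample, output)
    print(written, "IDs written")
    print(output.getvalue(), end="")
```

With real files, the same function could be called on an opened export (say, a hypothetical `export.csv`) and an output file such as `ids.txt`. A plain text file with one ID per line is the kind of input the Hydrator and twarc accept; with the twarc command-line tool, hydration would look something like `twarc hydrate ids.txt > tweets.jsonl`.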
If this sounds like it might be appropriate and you need some help, I
would be willing to lend a hand, since I work on the Documenting the
Now project.
Sincerely,
Ed Summers
[1] https://catalog.docnow.io
[2] https://github.com/docnow/hydrator
[3] https://github.com/docnow/twarc
On May 30, 2020 2:28 PM, Jacob Groshek <jgroshek at gmail.com> wrote:
Dear Colleagues and Friends,
I hope this message finds everyone doing well and managing through this
difficult time.
I am posting to this list because, after about 7+ years, I am
decommissioning a Digital Methods Initiative Twitter Collection and
Analysis Toolkit (DMI-TCAT) installation that I oversee. More details on
this system, developed by Erik Borra and Bernhard Rieder, are available on
github here https://github.com/digitalmethodsinitiative/dmi-tcat
While this is an open source platform that was generously made freely
available by the developers (sincere and deep thanks to Erik and Bernhard),
there can be costs involved with hosting and storage of the system and
data. So while there are a variety of reasons for my decision, including a
finite amount of research funds that I can dedicate each year, I am truly
saddened to be pulling the plug, especially because I am fully aware of the
value this system carries and its importance as a resource for academic
research.
To cut to the chase, it costs roughly $370 per month to host and store this
TCAT on Amazon Web Services (AWS), and right now, this TCAT install is at
about 70% elastic capacity with more than *560 million tweets* collected
over the last few years, on a wide variety of topics, many of which relate
to political and health communication.
If anyone is interested in taking over this system, or if there are
questions out there, please reach out to me off list and I'm happy to
discuss. If there are no takers, this data will no longer be maintained by
me after June 30, and repositories of literally hundreds of millions of
tweets on COVID-19 and other topics will be lost to the ether.
Sorry for the long-ish message and thanks for your consideration.
Best regards,
Jacob
--
Dr. Jacob Groshek
Ross Beach Chair in Emerging Media Research and Associate Professor
Kansas State University
jacobgroshek.com | @jacobgroshek <https://twitter.com/jacobgroshek> |
google scholar <https://scholar.google.nl/citations?user=G1XXhccAAAAJ&hl=en>
Honorary Associate Professor, Roskilde University
<https://ruc.dk/en/department-communication-and-arts>
Associate Director, CMCS @ Boston U <https://sites.bu.edu/cmcs/> |
Founding Editor, *JoCTEC* <http://www.joctec.org/>
Previously: Erasmus U
<https://www.eur.nl/en/eshcc/research/ermecc/people/research-fellows> |
NeSCoR <http://nescor.socsci.uva.nl/> | Boston Civic Media
<http://bostoncivic.media/> | IAST <http://www.iast.fr/>
+1-857-615-4709
_______________________________________________
The Air-L at listserv.aoir.org mailing list
is provided by the Association of Internet Researchers
http://aoir.org
Subscribe, change options or unsubscribe at:
http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
Join the Association of Internet Researchers:
http://www.aoir.org/