[Air-L] TayTweets Dataset?

Shulman, Stu stu at texifter.com
Sat Apr 16 17:42:11 PDT 2016


Unless they were all deleted, you can access them using:

sifter.texifter.com

What was the @ handle of Tay? I can share the estimate with the list,
perhaps more...

On Sat, Apr 16, 2016 at 8:39 PM, Heinz, Lisa <ls144009 at ohio.edu> wrote:

> Cory, you bring up some very good points I also thought about as I've
> searched for the dataset.
>
>
> Screenshots (and news articles) are one source, but they would contain the
> most inflammatory of the tweets and still represent only a fraction of a
> fraction of the entire dataset of more than 90,000 tweets from Tay itself.
> Also, this figure does not include all the mentions that are critical to an
> analysis of the devolution of this bot, which may at least double that
> number considering many (all?) of Tay's tweets were responses to questions
> and comments.
>
>
> Last summer, Twitter finished the process of archiving tweets clear back
> to when it launched to make them accessible to researchers/enterprise
> customers beginning last fall (go here for a short list of providers:
> http://www.pcmag.com/article2/0,2817,2489442,00.asp).  We don't have
> access at our SMART lab, yet, but if the tweets exist, they are hiding
> behind a now-private account so the usual scrapers won't find them. Unless
> someone out there has a few tricks up their sleeves...
>
>
>
>
> ~~Lisa
>
>
> ~~~~~~~~~~~~~~~~~~~
> Lisa Heinz
> PhD Student in Mass Communication
> E.W. Scripps School of Journalism
> Ohio University, Scripps College of Communication
> Twitter<http://twitter.com/livingrural>
> LinkedIn<http://linkedin.com/in/lisaheinz>
>
> ________________________________
> From: Cory Salveson <corysalveson at gmail.com>
> Sent: Saturday, April 16, 2016 4:50:50 PM
> To: Kishonna Gray
> Cc: Heinz, Lisa; air-l at listserv.aoir.org
> Subject: Re: [Air-L] TayTweets Dataset?
>
> It's not much, but Archive.org has multiple snapshots of tweets starting
> March 24 here:
> https://web.archive.org/web/20160324022856*/https://mobile.twitter.com/TayandYou
> It lacks the extended metadata available via the API, but it's better than
> nothing.
>
> In case someone out there did do a more complete capture, I put out a
> request on the "datasets" subreddit here:
> https://www.reddit.com/r/datasets/comments/4f3hmx/request_taytweets_tayandyou_archive/
> and there's also a Quora question (from somebody else) here:
> https://www.quora.com/Have-TayTweetss-tweets-been-archived
> If anything surfaces, maybe one or both of these spots will get updated.
>
> If nothing else, the unavailability of these tweets is itself interesting.
> Others can advance a more nuanced analysis than I can here, but for example, I
> wonder: on what basis are we as researchers "blocked" from accessing the
> tweets? I assume Twitter still technically has them in their databases
> somewhere, so isn't this essentially Twitter respecting Microsoft's --
> maybe even Tay's -- right to privacy under the Twitter TOS, like any other
> user? If so, then are all Twitter users really created equal with respect
> to privacy, or should some actors, as is the case for public figures
> generally, enjoy less? In other words, should the public be allowed to
> request these tweets directly from Twitter? Etc.
>
> Cory Salveson
> http://corysalveson.com
>
> On Fri, Apr 15, 2016 at 3:47 PM, Kishonna Gray <kishonnagray at gmail.com> wrote:
> it would have been great to capture them. i have archived some from the
> associated hashtags. so if anyone did, that would be amazing. "meltdown" is
> an understatement.
>
> On Fri, Apr 15, 2016 at 2:34 PM, Heinz, Lisa <ls144009 at ohio.edu> wrote:
>
> > Hello... I wondered if someone on this list was able to capture all of
> > Microsoft's Tay bot's tweets and mentions from the time it went live until
> > it was shut down, March 23-24? I did not get my collector set up in time, so
> > I am looking for someone who collected Tay's original tweets and mentions,
> > and who would be willing to share them.  I am primarily interested in the
> > avatar's meltdown as an historical marker in the adoption of this
> > technology.
> >
> >
> > ~~Lisa
> >
> >
> > ~~~~~~~~~~~~~~~~~~~
> > Lisa Heinz
> > PhD Student in Mass Communication
> > E.W. Scripps School of Journalism
> > Ohio University, Scripps College of Communication
> > Twitter<http://twitter.com/livingrural>
> > LinkedIn<http://linkedin.com/in/lisaheinz>
> >
> > _______________________________________________
> > The Air-L at listserv.aoir.org mailing list
> > is provided by the Association of Internet Researchers http://aoir.org
> > Subscribe, change options or unsubscribe at:
> > http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
> >
> > Join the Association of Internet Researchers:
> > http://www.aoir.org/



-- 
Dr. Stuart W. Shulman
Founder and CEO, Texifter
LinkedIn: http://www.linkedin.com/in/stuartwshulman


