[Air-L] TayTweets Dataset?

Shulman, Stu stu at texifter.com
Sat Apr 16 19:07:01 PDT 2016


Absolutely not. Deleted and private Tweets are not accessible via Sifter.

Sifter is a Twitter-compliant tool and not strictly speaking a scraper.

It is a Twitter-approved app for generating free estimates. It is used
primarily by academics who need 1,000-1,000,000 Tweets on a tight budget.

~Stu

On Saturday, April 16, 2016, Heinz, Lisa <ls144009 at ohio.edu> wrote:

> Thanks for trying, Stu! I wonder, though, does the scraper you used have
> the ability to access data behind a privacy lock?
>
> ~~~~~~~~~~~~~~
> Lisa M. Heinz
> PhD Student
> E.W. Scripps School of Journalism
> Ohio University, Scripps College of Communication
>
> @LivingRural on Twitter
>
> Sent from my iPhone
>
> On Apr 16, 2016, at 9:32 PM, Shulman, Stu <stu at texifter.com> wrote:
>
> Update
>
> from:TayandYou March 23-24:  0 Tweets available
>
> TayandYou March 23-24 (not from @TayandYou, but which mention TayandYou):
> ~122,000 Tweets.
>
> I suppose the folks at Microsoft decided to delete everything on that
> somewhat misguided experiment...
>
> On Sat, Apr 16, 2016 at 8:39 PM, Heinz, Lisa <ls144009 at ohio.edu> wrote:
>
>> Cory, you bring up some very good points I also thought about as I've
>> searched for the dataset.
>>
>>
>> Screenshots (and news articles) are one source, but they would contain
>> the most inflammatory of the tweets and still represent only a fraction of
>> a fraction of the entire dataset of more than 90,000 tweets from Tay
>> itself. Also, this figure does not include all the mentions that are
>> critical to an analysis of the devolution of this bot, which may at least
>> double that number considering much (all?) of Tay's tweets were responses
>> to questions and comments.
>>
>>
>> Last summer, Twitter finished the process of archiving tweets clear back
>> to when it launched to make them accessible to researchers/enterprise
>> customers beginning last fall (go here for a short list of providers:
>> http://www.pcmag.com/article2/0,2817,2489442,00.asp).  We don't have
>> access at our SMART lab, yet, but if the tweets exist, they are hiding
>> behind a now-private account so the usual scrapers won't find them. Unless
>> someone out there has a few tricks up their sleeves...
>>
>>
>>
>>
>> ~~Lisa
>>
>>
>> ~~~~~~~~~~~~~~~~~~~
>> Lisa Heinz
>> PhD Student in Mass Communication
>> E.W. Scripps School of Journalism
>> Ohio University, Scripps College of Communication
>> Twitter<http://twitter.com/livingrural>
>> LinkedIn<http://linkedin.com/in/lisaheinz>
>>
>> ________________________________
>> From: Cory Salveson <corysalveson at gmail.com>
>> Sent: Saturday, April 16, 2016 4:50:50 PM
>> To: Kishonna Gray
>> Cc: Heinz, Lisa; air-l at listserv.aoir.org
>> Subject: Re: [Air-L] TayTweets Dataset?
>>
>> It's not much, but Archive.org <http://archive.org> has multiple
>> snapshots of tweets here<
>> https://web.archive.org/web/20160324022856*/https://mobile.twitter.com/TayandYou>
>> starting March 24. It lacks the extended metadata available via the API,
>> but it's better than nothing.
>>
>> In case someone out there did do a more complete capture, I put out a
>> request on the "datasets" subreddit here<
>> https://www.reddit.com/r/datasets/comments/4f3hmx/request_taytweets_tayandyou_archive/>,
>> and there's also a Quora question (from somebody else) here<
>> https://www.quora.com/Have-TayTweetss-tweets-been-archived>. If anything
>> surfaces, maybe one or both of these spots will get updated.
>>
>> If nothing else, the unavailability of these tweets is itself
>> interesting. Others can advance a more nuanced analysis than me here, but
>> for example, I wonder: on what basis are we as researchers "blocked" from
>> accessing the tweets? I assume Twitter still technically has them in their
>> databases somewhere, so isn't this essentially Twitter respecting
>> Microsoft's -- maybe even Tay's -- right to privacy under the Twitter TOS,
>> like any other user? If so, then are all Twitter users really created
>> equally with respect to privacy, or should some actors, as is the case for
>> public figures generally, enjoy less? In other words, should the public be
>> allowed to request these tweets directly from Twitter? Etc.
>>
>> Cory Salveson
>> http://corysalveson.com
>>
>> On Fri, Apr 15, 2016 at 3:47 PM, Kishonna Gray <kishonnagray at gmail.com> wrote:
>> this would have been great to capture them. i have archived some from the
>> associated hashtags. so if anyone did, that would be amazing. "meltdown" is
>> an understatement.
>>
>> On Fri, Apr 15, 2016 at 2:34 PM, Heinz, Lisa <ls144009 at ohio.edu> wrote:
>>
>> > Hello... I wondered if someone on this list was able to capture all of
>> > Microsoft's Tay bot's tweets and mentions from the time it went live until
>> > it was shut down, March 23-24? I did not get my collector set up in time, so
>> > I am looking for someone who collected Tay's original tweets and mentions,
>> > and who would be willing to share them.  I am primarily interested in the
>> > avatar's meltdown as an historical marker in the adoption of this
>> > technology.
>> >
>> >
>> > ~~Lisa
>> >
>> >
>> > ~~~~~~~~~~~~~~~~~~~
>> > Lisa Heinz
>> > PhD Student in Mass Communication
>> > E.W. Scripps School of Journalism
>> > Ohio University, Scripps College of Communication
>> > Twitter<http://twitter.com/livingrural>
>> > LinkedIn<http://linkedin.com/in/lisaheinz>
>> >
>> > _______________________________________________
>> > The Air-L at listserv.aoir.org mailing list
>> > is provided by the Association of Internet Researchers http://aoir.org
>> > Subscribe, change options or unsubscribe at:
>> > http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
>> >
>> > Join the Association of Internet Researchers:
>> > http://www.aoir.org/
>>
>
>
>
> --
> Dr. Stuart W. Shulman
> Founder and CEO, Texifter
> LinkedIn: http://www.linkedin.com/in/stuartwshulman
>
>

-- 
Dr. Stuart W. Shulman
Founder and CEO, Texifter
LinkedIn: http://www.linkedin.com/in/stuartwshulman


