[Air-L] Advice on ripping a twitter feed

Cornelius Puschmann cornelius.puschmann at uni-duesseldorf.de
Mon Apr 6 13:00:31 PDT 2009


I have an untagged (plaintext) corpus of tweets from about 27k users (don't
have a word count but I can check) that I could share in anonymized form
(it's originally a conference dataset). Together with something like TextLAB
(http://www.niederlandistik.fu-berlin.de/textstat/software-en.html) that
works pretty well if you don't want to do anything terribly sophisticated.
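
If you would rather script it than use a GUI tool, a basic word-frequency list over a plaintext tweet corpus takes only a few lines of Python. This is a rough sketch under my own assumptions, not a description of how any particular tool works internally: the tokenizer pattern and the function name are hypothetical, and the sample tweets are made up.

```python
import re
from collections import Counter

# Assumed tokenizer: words, optionally prefixed by @ or #, with an optional
# apostrophe part ("don't"). URLs get split into pieces, which is crude.
TOKEN_RE = re.compile(r"[@#]?\w+(?:'\w+)?")


def word_frequencies(tweets):
    """Count lowercased tokens across an iterable of tweet strings."""
    counts = Counter()
    for tweet in tweets:
        counts.update(TOKEN_RE.findall(tweet.lower()))
    return counts


if __name__ == "__main__":
    # Made-up sample; a real corpus would be read from a file, one tweet per line.
    sample = ["Hello #aoir hello", "@bob says hello"]
    for token, n in word_frequencies(sample).most_common(5):
        print(n, token)
```

Keeping @mentions and #hashtags as single tokens makes it easy to count them separately later, or to replace the @mentions before sharing an anonymized version.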

Best,

Cornelius Puschmann, PhD
University of Duesseldorf


On Fri, Apr 3, 2009 at 9:15 PM, Fenwick Mckelvey <mckelveyf at gmail.com> wrote:

> Hi,
> The Infoscape Lab has some experience tracking Twitter feeds that I
> would like to share. First, Twitter feeds are RSS feeds. As a result,
> a number of tools exist to archive and analyze their content, like the
> Coding Analysis Toolkit. We use an RSS aggregator called Gregarius
> (http://gregarius.net/) to collect Twitter feeds and save them in a
> database. Importantly, Twitter moves faster than blogs. Most RSS
> aggregators collect on an hourly or daily basis. We have had to
> manually refresh our aggregator every 10 minutes to catch the flow of
> tweets in busy periods. Second, we have used hashtags as a way to
> sample discussion on Twitter. During our recent Canadian election, we
> tracked hashtags related to a leadership debate. Hashtags can be
> tricky because they change rapidly and seem to emerge organically, so
> we also relied on a basket of users.
>
> You can see our Twitter coverage of the Canadian debates here:
>
> http://www.cbc.ca/news/canadavotes/campaign2/ormiston/2008/10/debate_hangover.html
> .
> We are pretty happy with this time-sensitive sample of Twitter because
> it captured how people flock to the site during important moments like
> the debates. If anyone has any more questions about our perspective
> or method on Twitter, I'd be happy to help.
>
> All the best,
> Fen
>
> On Thu, Apr 2, 2009 at 6:50 PM, Stuart Shulman <stuart.shulman at gmail.com>
> wrote:
> > Very cool. Thanks Andrew!
> > This is a great time to send feedback about BAT. The purpose of the Blog
> > Analysis Toolkit is to establish a socially-constructed repository of blog
> > posts that are archived and accessible for research purposes. There are
> > about 250 BAT users at the present moment archiving about 200 blogs. The
> > posts are formatted in one of two ways to allow coding at the document or
> > paragraph level using another free software system, the Coding Analysis
> > Toolkit <http://cat.ucsur.pitt.edu/> (CAT). Once you join the system you
> > have access to all the archived posts and you can add new blogs to the
> > archiving process.
> >
> > We have just brought on a new programmer to improve the platform, which is a
> > free by-product of ongoing NSF-funded research. We want to increase its
> > functionality and usability, so AoIR members are strongly encouraged to let
> > us know what you want BAT to do in the future. We face challenges doing some
> > simple things, like getting the comments and the archives. If you know how,
> > perhaps join the BAT development team.
> > The quick-start BAT tutorial is online at:
> >
> > http://www.screencast.com/t/OcRziCMg
> >
> > ~Stu
> >
> >
> >
> > On Thu, Apr 2, 2009 at 6:10 PM, Andrew Long
> > <ALong at infoscience.otago.ac.nz> wrote:
> >
> >> Incidentally, I have tried the Blog Analysis Toolkit (see below) and it
> >> works fine.
> >> Grab the RSS feed from the right-hand side of the Twitter website and
> >> set this as the blog URL.
> >>
> >>
> >
> > --
> > Dr. Stuart W. Shulman
> > Assistant Professor
> > Department of Political Science
> > University of Massachusetts Amherst
> > 200 Hicks Way
> > Amherst, MA 01003
> >
> > http://people.umass.edu/stu/
> > stu at polsci.umass.edu
> > 413-545-5375
> >
> > Editor, Journal of Information Technology and Politics
> > http://www.jitp.net
> >
> > Director, QDAP-UMass
> > http://www.umass.edu/qdap/
> >
> > Associate Director, National Center for Digital Government
> > http://www.umass.edu/digitalcenter/
> > _______________________________________________
> > The Air-L at listserv.aoir.org mailing list
> > is provided by the Association of Internet Researchers http://aoir.org
> > Subscribe, change options or unsubscribe at:
> > http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
> >
> > Join the Association of Internet Researchers:
> > http://www.aoir.org/
> >
>
>
>
> --
> Fenwick McKelvey
> PhD Student in Communication and Culture
> Ryerson / York Universities
>
> Research Associate
> Infoscape Research Lab
> http://www.infoscapelab.ca
>
> Research Associate
> VideoCom Research Initiative
> http://videocom.knet.ca
>
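
For anyone who wants to script the ten-minute refresh Fenwick describes above instead of refreshing an aggregator by hand, here is a minimal sketch using only the Python standard library. It is not how Gregarius works; it assumes a plain RSS 2.0 feed, uses the item's guid (or link) to skip tweets already stored, and keeps everything in SQLite. The function names, table layout, and feed URL are all hypothetical.

```python
import sqlite3
import time
import urllib.request
import xml.etree.ElementTree as ET

POLL_SECONDS = 600  # ten minutes, the interval mentioned in the quoted message


def parse_items(rss_bytes):
    """Return (guid, title, pub_date) tuples from an RSS 2.0 document."""
    root = ET.fromstring(rss_bytes)
    items = []
    for item in root.iter("item"):
        # Fall back to the link when a feed omits <guid>.
        guid = item.findtext("guid") or item.findtext("link")
        if guid:
            items.append(
                (guid, item.findtext("title", ""), item.findtext("pubDate", ""))
            )
    return items


def open_store(path=":memory:"):
    """Open (or create) the SQLite store; the schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets ("
        "guid TEXT PRIMARY KEY, title TEXT, pub_date TEXT)"
    )
    return conn


def store_items(conn, items):
    """Insert items, skipping guids already stored; return the number added."""
    def row_count():
        return conn.execute("SELECT COUNT(*) FROM tweets").fetchone()[0]

    before = row_count()
    conn.executemany("INSERT OR IGNORE INTO tweets VALUES (?, ?, ?)", items)
    conn.commit()
    return row_count() - before


def poll(feed_url, conn, rounds, interval=POLL_SECONDS):
    """Fetch the feed `rounds` times, sleeping `interval` seconds in between."""
    for i in range(rounds):
        with urllib.request.urlopen(feed_url, timeout=30) as resp:
            added = store_items(conn, parse_items(resp.read()))
        print(f"round {i + 1}: {added} new items")
        if i + 1 < rounds:
            time.sleep(interval)
```

A call such as `poll("http://example.org/feed.rss", open_store("tweets.db"), rounds=6)` would cover one hour; the URL is a placeholder. Because the guid is the primary key, overlapping fetches do not create duplicates, and a hashtag sample can be pulled afterwards with a `LIKE` query on the title column.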



-- 
Dr. des. Cornelius Puschmann, M.A.

Department of English Language and Linguistics / University of Düsseldorf,
Germany
University Library Center (hbz), Cologne, Germany

+49 211 811 5927 (office)
+49 176 811 78067 (mobile)
+49 211 139 566 84 (home)

www.ynada.com
www.elanguage.net


