[Air-L] suggestions for Twitter aggregating & analytic tools

Axel Bruns a.bruns at qut.edu.au
Fri Sep 28 17:59:39 PDT 2012


G'day!

Erica asked:

> I was wondering if anyone can suggest particular tools for aggregating and
> analyzing Twitter content.

Maybe I'm old-school on this, but I'm surprised no-one's mentioned yourTwapperkeeper yet - in my experience, very straightforward to set up (all you need is a standard LAMP server setup to run it on), and fine for most standard Twitter capture tasks (e.g. tracking hashtags, keywords, specific users, etc.). It's open source and available here:

https://github.com/jobrieniii/yourTwapperKeeper

We've made some modifications to export datasets more easily in CSV/TSV format - see details here:

http://mappingonlinepublics.net/2012/01/09/twapperkeeper-and-beyond-a-reminder/

Personally, I don't trust most out-of-the-box Twitter analytics tools, and prefer to roll my own - for processing CSV/TSV datasets containing Twapperkeeper-format data, I've been using the scriptable command-line tool Gawk with great success. A collection of Gawk scripts for standard Twapperkeeper data processing tasks is available under a Creative Commons licence here:

http://mappingonlinepublics.net/2011/06/22/gawk-scripts-for-processing-twitter-data-vol-1/

Additionally, my 'Swiss army knife' Gawk script for extracting activity metrics from a Twapperkeeper dataset is here:

http://mappingonlinepublics.net/2012/01/31/more-twitter-metrics-metrify-revisited/
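
As a rough illustration of the kind of activity metrics meant here (this is not the Gawk script linked above, just a minimal Python sketch), the following counts tweets, retweets, @replies, original tweets and tweets containing URLs in a tab-separated export. It assumes a header row with columns named 'from_user' and 'text', no quoting of fields, and a simple "RT @user" convention for retweets; the function name activity_metrics is hypothetical, and a real export may need different column names.

```python
import csv
import io
import re
from collections import Counter

# Assumption: retweets are manual-style, i.e. the text starts with "RT @user".
RETWEET = re.compile(r"\s*RT\s+@\w+", re.IGNORECASE)


def activity_metrics(tsv_file):
    """Return (totals, per_user) counters for a tab-separated tweet export.

    Assumes a header row with 'from_user' and 'text' columns (an assumption
    about the export format, not something guaranteed by any tool).
    """
    # QUOTE_NONE: treat quotation marks inside tweets as ordinary characters.
    reader = csv.DictReader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    totals = Counter()
    per_user = Counter()
    for row in reader:
        text = row["text"]
        per_user[row["from_user"].lower()] += 1
        totals["tweets"] += 1
        if RETWEET.match(text):
            totals["retweets"] += 1
        elif text.lstrip().startswith("@"):
            totals["replies"] += 1
        else:
            totals["original"] += 1
        if "http://" in text or "https://" in text:
            totals["urls"] += 1
    return totals, per_user


# Tiny made-up sample, for illustration only.
sample = io.StringIO(
    "text\tfrom_user\n"
    "RT @alice: hello http://t.co/x\tbob\n"
    "@bob thanks\talice\n"
    "plain tweet\tBob\n"
)
totals, per_user = activity_metrics(sample)
print(dict(totals))
print(per_user.most_common(2))
```

For a real dataset, the same function could be called on a file opened with open(path, encoding="utf-8", newline="").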

The question of developing standard, case-independent metrics for the description of Twitter activity patterns is something Stefan Stieglitz and I are taking up in two forthcoming papers (happy to share drafts - email me off-list). The keynote which Jean Burgess and I presented at the recent Conference on Science and the Internet foreshadows some of this discussion, though:

Axel Bruns and Jean Burgess. "Notes towards the Scientific Study of Public Communication on Twitter." Keynote presented at the Conference on Science and the Internet, Düsseldorf, 4 Aug. 2012. http://snurb.info/files/2012/Notes%20towards%20the%20Scientific%20Study%20of%20Public%20Communication%20on%20Twitter.pdf - and the slides and video of the presentation are here: http://snurb.info/node/1678

Detailed notes on how we use these scripts to process Twitter data, and additional processing tools, are also on our Website - see http://mappingonlinepublics.net/category/twitter/ for more.

For network visualisation, I recommend the open source software Gephi. My article in Information, Communication & Society describes how I've used yourTwapperkeeper, Gawk and Gephi to create dynamic visualisations of Twitter conversation networks:

Axel Bruns. "How Long Is a Tweet? Mapping Dynamic Conversation Networks on Twitter Using Gawk and Gephi." Information, Communication & Society, 17 Nov. 2011. http://dx.doi.org/10.1080/1369118X.2011.635214
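
To give a sense of the step between captured tweets and a network graph, here is a minimal Python sketch (not the Gawk workflow described in the article) that turns @mentions into a weighted edge list. It again assumes a tab-separated export with 'from_user' and 'text' columns and no field quoting; mention_edges and write_gephi_edge_list are hypothetical names. The output uses 'Source', 'Target' and 'Weight' column headers, which Gephi's spreadsheet import recognises for edge tables. It produces a static network only; the dynamic visualisations discussed in the article would also need timestamps.

```python
import csv
import io
import re
from collections import Counter

# Match @username, but not the "@" inside an e-mail address.
MENTION = re.compile(r"(?<!\w)@(\w+)")


def mention_edges(tsv_file):
    """Count (sender, mentioned user) pairs in a tab-separated tweet export.

    Assumes a header row with 'from_user' and 'text' columns.
    """
    reader = csv.DictReader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    edges = Counter()
    for row in reader:
        sender = row["from_user"].lower()
        for target in MENTION.findall(row["text"]):
            edges[(sender, target.lower())] += 1
    return edges


def write_gephi_edge_list(edges, out_file):
    """Write the edge counts as a CSV edge table (Source, Target, Weight)."""
    writer = csv.writer(out_file)
    writer.writerow(["Source", "Target", "Weight"])
    for (source, target), weight in sorted(edges.items()):
        writer.writerow([source, target, weight])


# Tiny made-up sample, for illustration only.
sample = io.StringIO(
    "text\tfrom_user\n"
    "RT @alice: hello\tbob\n"
    "@bob thanks @carol\talice\n"
    "@Alice again\tBob\n"
)
edges = mention_edges(sample)
out = io.StringIO()
write_gephi_edge_list(edges, out)
print(out.getvalue())
```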


For more sophisticated, 'big data' research (i.e. upwards of a few million tweets per dataset), the yourTwapperkeeper approach is less useful (the LAMP framework just isn't built for big data), and you'll probably need to build your own customised solution. Eugene Liang and I discuss the pros and cons of both approaches in a recent article in First Monday (while we frame this in a crisis communication context, the discussion applies well beyond this):

Axel Bruns and Yuxian Eugene Liang. "Tools and Methods for Capturing Twitter Data during Natural Disasters." First Monday 17.4 (2012). http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/viewArticle/3937/3193


Hope that helps.

<insert obligatory plug for the "Digital data - lost, found, and made" panel at the upcoming AoIR conference, where I'm sure we can discuss the question of Twitter research methods some more as well>


--
Dr Axel Bruns              http://snurb.info/ - http://produsage.org/
ARC Centre for Creative Industries and Innovation  http://cci.edu.au/
Associate Professor, Media & Communication         a.bruns at qut.edu.au
Creative Industries Faculty, Z1-515, CIP     Twitter: @snurb_dot_info
Queensland University of Technology                    +61 7 31385548
Musk Ave, Kelvin Grove, Qld. 4059, Australia       CRICOS No.: 00213J



