Fellow AoIRistas,

Hope you've all recovered well from the excitement of #ir16. I was inspired by the research I saw in Phoenix to try something slightly different: I'd like to invite you all to participate in a data crowdsourcing experiment.

A few years ago, Stefan Stieglitz and I wrote a paper that compared some basic stats on a selection of Twitter hashtag datasets to develop a typology of hashtag uses, and we found some clear distinctions especially between media event and crisis event hashtags (preprint article here: http://eprints.qut.edu.au/55823/ - see esp. fig. 4). 

I'd now like to update this paper, with a substantially larger number of datapoints - and while I have plenty of new datasets of my own to include in the updated paper, I'm keen to include as many (and as many different) datapoints as possible. Which is where you all come in: if you're willing to participate, I would like you to send me the following datapoints for your own hashtag (or keyword) archives:

 - Hashtag(s) or keyword(s) used to capture the dataset
 - Timeframe of capture (from/to date)
 - Total number of tweets
 - Total number of tweets containing URLs - using the regular expression /http/
 - Total number of tweets that are retweets - using the regular expression /(\"@|RT @|MT @|via @)[A-Za-z0-9_]+/
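
If it helps, here is a minimal Python sketch of how the three counts could be computed. It assumes your tweet texts are already loaded as a list of strings; the function name hashtag_stats and the sample tweets are purely illustrative, and only the two regular expressions come from the list above.

```python
import re

# The two patterns from the list above.
URL_RE = re.compile(r"http")
RT_RE = re.compile(r'("@|RT @|MT @|via @)[A-Za-z0-9_]+')


def hashtag_stats(tweets):
    """Count total tweets, tweets with URLs, and retweets.

    `tweets` is assumed to be an iterable of tweet texts (strings).
    """
    texts = list(tweets)
    return {
        "total": len(texts),
        "with_urls": sum(1 for t in texts if URL_RE.search(t)),
        "retweets": sum(1 for t in texts if RT_RE.search(t)),
    }


# Made-up sample tweets, for illustration only.
sample = [
    "RT @alice: great panel at #ir16 http://example.org/a",
    "Heading to Phoenix for #ir16",
    'Agreed! "@bob: hashtags are publics" #ir16',
    "Slides from my talk: https://example.org/b #ir16",
]
stats = hashtag_stats(sample)
print(stats)
```

Note that /http/ also matches https links, and that both patterns are case-sensitive, so a lowercase "rt @user" would not be counted as a retweet.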

To make gathering these data easier, Fabio Giglietto has helpfully created a Google Form for this purpose: http://goo.gl/forms/LR1nJmH39p

If you're willing to participate, could you calculate these datapoints for your own datasets and share them through the Google Form? In return, I promise to include you as a contributing author in the paper we'll eventually develop from this exercise. It would be fabulous if we could use this crowdsourced approach to generate a set of datapoints covering a diverse range of hashtags - please help!

