[Air-L] Academic replacements for TwapperKeeper.com?

Cornelius Puschmann cornelius.puschmann at uni-duesseldorf.de
Wed Feb 23 15:29:09 PST 2011


Thanks for the information about 140kit.com; I will definitely check it out.
I'm still wondering whether a more permanent solution can be found (funding
drying up in May doesn't sound too promising).

I have a simple Bash script, run as a cron job, that pulls data from the API
at regular intervals; perhaps I should just go with that.
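
For anyone wanting to try the same approach, here is a minimal sketch of the
idea in Python rather than Bash. It is not my actual script. The names
`harvest_once`, `harvest_forever` and `fetch` are made up for illustration,
`fetch` stands in for whatever API call you use, and I am assuming each
returned item is a dict with a unique "id" field.

```python
import json
import time
from pathlib import Path
from typing import Callable, Iterable


def harvest_once(fetch: Callable[[], Iterable[dict]], archive: Path) -> int:
    """Append items not yet in the archive; return how many were added.

    `fetch` is a hypothetical placeholder for the real API request.
    Each item is assumed to carry a unique "id" key.
    """
    seen = set()
    if archive.exists():
        # Re-read the ids already stored so overlapping polls do not duplicate.
        with archive.open(encoding="utf-8") as fh:
            seen = {json.loads(line)["id"] for line in fh if line.strip()}
    added = 0
    with archive.open("a", encoding="utf-8") as fh:
        for item in fetch():
            if item["id"] not in seen:
                fh.write(json.dumps(item) + "\n")  # one JSON object per line
                seen.add(item["id"])
                added += 1
    return added


def harvest_forever(fetch: Callable[[], Iterable[dict]], archive: Path,
                    interval_seconds: float = 300.0) -> None:
    """Stand-in for the cron schedule: poll, sleep, repeat."""
    while True:
        harvest_once(fetch, archive)
        time.sleep(interval_seconds)
```

With cron you would call only `harvest_once` from a small script and let the
crontab entry handle the schedule; `harvest_forever` is just the same thing
without cron.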

@Deen: you won't get whitelisted unless Twitter have changed their policy.
I've been turned down twice on the grounds that there is whitelisting for
applications only, not for academic research.

Best,

Cornelius

On 23.02.2011 20:37, "Deen Freelon" <dfreelon at u.washington.edu> wrote:

I would also be curious to know what others have been using or plan to use
for harvesting Twitter data. I've used both TwapperKeeper and 140kit, and
found that the latter is quite good for hashtag archiving, but not as good
at keyword archiving. Further, 140kit has a max scrape time of one week,
although that is manually renewable I believe. Finally, both TK and 140kit
can be quite slow and even unavailable at times, and as we've just seen they
may shut down at any time.

All of this has made me quite wary of relying on externally managed "clouds"
for data collection. That is why I intend to set up my own Twitter
harvesting operation for use within my own department, as many CS
researchers do, and would encourage others with the necessary means and
knowledge to do the same. Much valuable data can be collected even within
the default API query limits, though I'll certainly ask Twitter to put me on
the whitelist. Running one's own archiving operation is fairly cheap, and
since you're only archiving your own data, you aren't hamstrung by hundreds
of other jobs running simultaneously.
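
As a rough illustration of working within a query limit, the polling
interval can be derived from the quota. This is only a sketch: the function
name and its parameters are hypothetical, and the quota you pass in must be
taken from whatever limit currently applies to your account.

```python
def min_interval_seconds(requests_per_window: int,
                         window_seconds: float = 3600.0,
                         queries_per_round: int = 1) -> float:
    """Shortest wait between polling rounds that keeps usage within the quota.

    requests_per_window: allowed requests per window (assumed known).
    queries_per_round: requests issued each time the harvester runs.
    """
    if requests_per_window <= 0 or queries_per_round <= 0:
        raise ValueError("quota and queries per round must be positive")
    return window_seconds * queries_per_round / requests_per_window
```

For example, with a hypothetical quota of 150 requests per hour and three
search terms polled per round, `min_interval_seconds(150, 3600.0, 3)` gives
72.0, so polling every 72 seconds or more would stay inside that limit.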

If there's any interest in learning how to set up small-scale Twitter
scrapes, let me know and I'll write something up when I have the time.

Best,
~DEEN



On 2/23/11 11:18 AM, Matt Munley wrote:
>
> Cornelius,
> How well would something like 140kit (htt...
-- 
Deen Freelon
Ph.D. Candidate, Dept. of Communication
University of Washington
dfreelon at uw.edu
http://dfreelon.org/






