[Air-L] "Big Data" Tools

kalev leetaru kalev.leetaru5 at gmail.com
Wed Apr 15 14:06:48 PDT 2015


One of the biggest issues that I see on a daily basis in the policy world
is that the vast majority of "big data" work (and even "little data" work)
is based primarily or exclusively on English-language and/or Western data
sources, yet attempts to use such sources to make arguments about current
events, narratives, and emotions in the non-English, non-Western world.
There are simply far more tools available for analyzing English material
than there are for Swahili, for example, or even Arabic, and bilingualism
is not as prevalent in many areas of study, so I end up seeing an
incredible number of studies based on English-language content about
non-English-speaking areas of the world.  Similarly, Twitter has become
the go-to dataset for social media studies even though Facebook, Weibo,
VK, Viber, WhatsApp, etc. offer better access to certain communities or
modalities.  Those platforms don't offer the same easy firehose API and
tool ecosystem, so researchers take the easier path rather than focusing
on which platform might offer the best access to the community or
phenomena they are trying to measure.

This is something that needs a great deal more attention in the
quantitative and "big data" spaces.  Two of my Foreign Policy columns on
this topic may be of interest regarding just how much our understanding of
the world is skewed by this fixation on English-language Western sources.
My most recent one, out this afternoon, explores how our understanding of
global terrorism trends is based almost exclusively on English-language
news coverage and how that has influenced our understanding of trends:

http://foreignpolicy.com/2015/04/15/why-we-cant-just-read-english-newspapers-to-understand-terrorism-big-data/

http://www.foreignpolicy.com/articles/2014/09/26/why_big_data_missed_the_early_warning_signs_of_ebola


~K



> From: Air-L [mailto:air-l-bounces at listserv.aoir.org] On Behalf Of Matthew Weber
> Sent: Thursday, April 09, 2015 11:08 PM
> To: air-l at listserv.aoir.org
> Subject: [Air-L] "Big Data" Tools
>
> AIR’ers:
>
> I’m working on compiling a rough list of tools and training modules that
> are useful for working with large-scale datasets (“Big Data”).
> Essentially, I’m trying to build *something* that I can point newbies /
> graduate students to when they say “I want to do Big Data”. I’ve got a
> rough list of Coursera / edX / blog modules, but would welcome suggestions.
> I’m happy to share back the results.
>
> (I did try to check the AIR archive, but was unable to access it.)
>
> Thanks!
> Matt
>
>
>
>
> Matthew S. Weber
> Assistant Professor
> School of Communication and Information
> Rutgers University
>
> (ph): 848-932-8718
>
> _______________________________________________
> The Air-L at listserv.aoir.org mailing list is provided by the Association
> of Internet Researchers http://aoir.org Subscribe, change options or
> unsubscribe at: http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
>
> Join the Association of Internet Researchers:
> http://www.aoir.org/

