[Air-L] Chinese language social media data mining tools

kwfu kwfu at hku.hk
Wed May 10 07:25:06 PDT 2017


I can imagine why it's hard to find a good data mining tool in social science. One reason is the sampling issue. Sample representativeness is essential to most social science research (for example, probability-based phone surveys or multi-stage stratified sampling). Because sampling schemes vary across studies, customized code is often preferred.
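
To illustrate why such code tends to be study-specific, here is a minimal Python sketch of a proportionate stratified random sample. The account records, the `region` field, the `stratified_sample` helper, and the 10% sampling fraction are all hypothetical, and this is not the procedure used in any particular study.

```python
import random
from collections import defaultdict

def stratified_sample(records, key, fraction, seed=0):
    """Draw the same fraction of records from every stratum defined by `key`."""
    rng = random.Random(seed)          # fixed seed so the draw is reproducible
    strata = defaultdict(list)
    for rec in records:
        strata[rec[key]].append(rec)   # group records by stratum
    sample = []
    for name in sorted(strata):
        members = strata[name]
        k = max(1, round(len(members) * fraction))  # at least one per stratum
        sample.extend(rng.sample(members, k))       # sample without replacement
    return sample

# Hypothetical population: 100 accounts split unevenly across three regions.
accounts = (
    [{"id": i, "region": "north"} for i in range(50)]
    + [{"id": i, "region": "south"} for i in range(50, 80)]
    + [{"id": i, "region": "west"} for i in range(80, 100)]
)
picked = stratified_sample(accounts, key="region", fraction=0.1)
```

Changing the stratifying variable, the allocation rule, or the number of stages means changing the code, which is one reason a general-purpose tool rarely fits.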

We did random sampling on Weibo and Twitter. The methodology is provided in the following two PLOS ONE papers. Hope this helps.

Fu KW, Chau M (2013) Reality Check for the Chinese Microblog Space: A Random Sampling Approach. PLoS ONE 8(3): e58356. doi:10.1371/journal.pone.0058356

Liang H, Fu KW (2015) Testing Propositions Derived from Twitter Studies: Generalization and Replication in Computational Social Science. PLoS ONE 10(8): e0134270. doi:10.1371/journal.pone.0134270

King-wa Fu
Associate Professor, Journalism and Media Studies Centre, The University of Hong Kong 
Visiting Associate Professor 2016-2017, MIT Media Lab (Fulbright Scholar)
website: https://sites.google.com/site/fukingwa/

-----Original Message-----
From: Air-L [mailto:air-l-bounces at listserv.aoir.org] On Behalf Of Gillian Bolsover
Sent: Wednesday, May 10, 2017 9:33 PM
To: Stefania Vicari <s.vicari at sheffield.ac.uk>; Helen Kennedy <h.kennedy at sheffield.ac.uk>
Cc: air-l at listserv.aoir.org
Subject: Re: [Air-L] Chinese language social media data mining tools

As part of my PhD, I did a lot of research based on data collected from both Weibo and Twitter. Finding few existing, functional tools, I wrote custom Python code to download and process various sorts of data from both platforms, including a script to tokenize Weibo posts.
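
For readers unfamiliar with what tokenizing a Weibo post can involve, here is a minimal standard-library Python sketch. It is not the code described above. The `tokenize` helper and the sample post are hypothetical, and splitting Chinese text into single characters is only a crude stand-in for a proper word segmenter.

```python
import re

URL = re.compile(r"https?://\S+")
HASHTAG = re.compile(r"#([^#\s]+)#")   # Weibo topics are wrapped in paired '#'
MENTION = re.compile(r"@[\w\-]+")      # assumes the name ends at a space or punctuation
TOKEN = re.compile(r"[\u4e00-\u9fff]|[A-Za-z0-9]+")  # one CJK character, or a Latin/digit run

def tokenize(post):
    """Split a post into hashtags, mentions, and remaining text tokens."""
    text = URL.sub(" ", post)            # drop links first
    hashtags = HASHTAG.findall(text)
    text = HASHTAG.sub(" ", text)
    mentions = MENTION.findall(text)
    text = MENTION.sub(" ", text)
    return {
        "hashtags": hashtags,
        "mentions": mentions,
        "tokens": TOKEN.findall(text),   # character-level tokens for Chinese
    }

# Hypothetical post containing a topic, a mention, and a shortened link.
result = tokenize("#香港# 今天天气很好 @example_user http://t.cn/abc OK")
```

In practice the character-level step would usually be replaced by a dedicated Chinese word segmentation library.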

Seeing this thread brings up an issue I have been thinking about: how the community of Internet researchers works with code. Other academics I know who work in the sciences share all their code online (GitHub etc.), have a practice of working together to debug it, and receive academic credit when their code is used by others. I've seen very little of this in social science research.

Are there any Internet researchers who share code they have created who could advise as to what their practices are in this regard? Is there any sort of standard among Internet researchers (and should there be) in terms of sharing code created for academic purposes with other academics? 

Gillian Bolsover
Researcher
Oxford Internet Institute
University of Oxford
PGP Key: 17EC60B3

________________________________________
From: Air-L [air-l-bounces at listserv.aoir.org] On Behalf Of Stefania Vicari [s.vicari at sheffield.ac.uk]
Sent: Tuesday, May 9, 2017 7:51 PM
To: Helen Kennedy
Cc: air-l at listserv.aoir.org
Subject: Re: [Air-L] Chinese language social media data mining tools

It may be worth looking at: https://api.anacode.de/landing/

Best,
S

On 9 May 2017 at 16:58, Helen Kennedy <h.kennedy at sheffield.ac.uk> wrote:

> Hello clever AOIR folks
>
> Asking for postgrad students: any recommendations of social media data 
> mining tools that work on Chinese social media platforms / with 
> Chinese languages?
>
> Thanks!
>
> Helen
>
>
> --
> Professor Helen Kennedy, Chair in Digital Society
> Department of Sociological Studies / Faculty of Social Sciences
> Elmfield, Northumberland Road
> Sheffield S10 2TU
> T: 0114 2226488
> E: h.kennedy at sheffield.ac.uk
>
> LATEST ARTICLE: 'The Feeling of Numbers: emotions in everyday
> engagements with data and their visualisation'
> <http://journals.sagepub.com/doi/abs/10.1177/0038038516674675?journalCode=soca>,
> *Sociology*, 2017.
> _______________________________________________
> The Air-L at listserv.aoir.org mailing list is provided by the
> Association of Internet Researchers http://aoir.org
> Subscribe, change options or unsubscribe at:
> http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
>
> Join the Association of Internet Researchers:
> http://www.aoir.org/




--
Stefania Vicari
Senior Lecturer in Digital Sociology
Programme Manager for the MA Digital Media and Society
Department of Sociological Studies
The University of Sheffield
Elmfield, Northumberland Road
Sheffield S10 2TU

Webpage:
http://www.sheffield.ac.uk/socstudies/staff/staff-profiles/stefania-vicari
Email: s.vicari at sheffield.ac.uk
Twitter: @stefaniavicari <https://twitter.com/stefaniavicari>

Recent paper: Vicari, S. & Cappai, F. (2016) Health Activism and the Logic of Connective Action <http://www.tandfonline.com/doi/full/10.1080/1369118X.2016.1154587>. *Information, Communication & Society* 19(11): 1653-1671.