[Air-L] New COVID-19 social media data release

kwfu kwfu at hku.hk
Fri May 1 22:23:48 PDT 2020


Dear all,

I am delighted to release a Sina Weibo dataset containing 1.2 million COVID-19-related posts collected by Weiboscope between December 1, 2019 and February 27, 2020. The dataset includes 2,104 censored weibos. I hope the data release can support researchers worldwide in investigating how the Chinese government controlled online information in the early stage of the pandemic.

https://doi.org/10.6084/m9.figshare.12199038

Here is the data description.

COVID-19 related Weibo Data from "Did the world overlook the media’s early warning of COVID-19?"

Between December 1, 2019 and February 27, 2020, Weiboscope collected 11,362,502 posts, among which 1,230,353 contain at least one outbreak-related keyword (please refer to the paper), and 2,104 of these (1.7 per 1,000) have been censored.
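
As a quick check of the figures above, the short Python sketch below recomputes the censorship rate. It uses only the three counts given in this description; the variable names are illustrative. The 1.7 per 1,000 rate works out when the 2,104 censored posts are divided by the 1,230,353 keyword-matched posts, not by all collected posts.

```python
# Counts taken from the data description above.
total_collected = 11_362_502   # all posts collected by Weiboscope in the period
keyword_matched = 1_230_353    # posts containing at least one outbreak-related keyword
censored = 2_104               # censored posts among the keyword-matched ones

# Share of collected posts that matched an outbreak-related keyword.
share_matched = keyword_matched / total_collected

# Censored posts per 1,000 keyword-matched posts.
censored_per_1000 = censored / keyword_matched * 1000

print(f"Keyword-matched share: {share_matched:.1%}")
print(f"Censored per 1,000 keyword-matched posts: {censored_per_1000:.1f}")
```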

Data fields:
Column 1: "created_at": date of publication
Column 2: "censorship_type": directly censored (return of "permission_denied") or retweet of censored post ("retweet of a “permission denied” post")
Column 3: "id_hashed": hashed post ID
Column 4: "retweeted_status_hashed": hashed retweet status
Column 5: "text_cleaned": text body with all @XXX mentions removed
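
For readers who want to start working with these fields, here is a minimal Python sketch of one way to read the data and tally posts by censorship type. It rests on several assumptions that this description does not confirm: that the file is a UTF-8 CSV with a header row using the five column names above, and that "censorship_type" is empty for posts that were not censored. The sample rows are synthetic placeholders, not real data, and the function name "summarize" is made up for illustration.

```python
import csv
import io
from collections import Counter

# Column names as listed in the data description.
FIELDS = [
    "created_at",
    "censorship_type",
    "id_hashed",
    "retweeted_status_hashed",
    "text_cleaned",
]

def summarize(handle):
    """Count all rows and tally rows with a non-empty censorship_type.

    Assumes a CSV with a header row; an empty censorship_type is
    treated as "not censored" (an assumption, not stated in the source).
    """
    reader = csv.DictReader(handle)
    missing = [f for f in FIELDS if f not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing expected columns: {missing}")
    total = 0
    by_type = Counter()
    for row in reader:
        total += 1
        label = (row["censorship_type"] or "").strip()
        if label:
            by_type[label] += 1
    return total, by_type

# Synthetic placeholder rows, only to show the expected shape.
SAMPLE = (
    "created_at,censorship_type,id_hashed,retweeted_status_hashed,text_cleaned\n"
    "2020-01-01,,a1,,placeholder text one\n"
    "2020-01-02,permission_denied,b2,,placeholder text two\n"
    "2020-01-03,,c3,b2,placeholder text three\n"
)

total, by_type = summarize(io.StringIO(SAMPLE))
print(total, dict(by_type))

# With the downloaded file (the path is hypothetical):
# with open("weibo_covid.csv", newline="", encoding="utf-8") as f:
#     total, by_type = summarize(f)
```

Reading row by row keeps memory use low for a file of about 1.2 million posts. If the released file uses a different delimiter or has no header row, the "csv.DictReader" call would need a "delimiter" or "fieldnames" argument.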

Please cite the following reference:
King-wa Fu & Yuner Zhu (2020) Did the world overlook the media’s early warning of COVID-19?, Journal of Risk Research, DOI: 10.1080/13669877.2020.1756380

King-wa Fu
Journalism and Media Studies Centre
The University of Hong Kong

