[Air-L] four visualizations (and open data) from data mining online news at massive scale
Michael Herlihy
mherlihy at nla.gov.au
Mon Feb 27 16:50:20 PST 2017
Brilliant work - thank you Kalev
-----Original Message-----
From: Air-L [mailto:air-l-bounces at listserv.aoir.org] On Behalf Of Gohar F. Khan
Sent: Tuesday, 28 February 2017 6:30 AM
To: air-l at listserv.aoir.org; kalev leetaru <kalev.leetaru5 at gmail.com>
Subject: Re: [Air-L] four visualizations (and open data) from data mining online news at massive scale
Interesting stuff Kalev, thanks for sharing it.
Cheers,
GFK
On Tue, 28 Feb 2017 at 6:48 AM kalev leetaru <kalev.leetaru5 at gmail.com> wrote:
> Apologies for cross-posting - I thought many of you would find of
> interest four of my latest pieces on what we can learn about the
> global structure of the news media landscape through massive mining of
> online news, and given that the underlying datasets are all open, that
> these might offer great starting points for many other research
> questions.
>
> In particular, two of the analyses rely on applying deep learning
> image cataloging to more than a quarter billion global news
> photographs from last year, one examining visual geocoding and the
> other looking at semantic visual clustering using the assigned labels.
>
> One explores what it looks like to combine multilingual textual
> geocoding and sentiment analysis (both covering 65 languages) to
> process a quarter billion news articles and 2.2 billion location
> mentions to map "global happiness" as seen through the eyes of the
> world's online news media.
>
> The final leverages visual document extraction to compile three
> quarters of a billion outlinks from 121 million articles over the last
> 10 months and uses that link graph to explore how global media outlets
> link to each other. What makes this particular analysis distinct is
> both the global scope (crossing all countries and 65 languages) and
> the use of the article link graph rather than the page link graph as
> is traditionally done (i.e. looking at only the links in the article
> text itself, rather than the myriad links found in the rest of the
> surrounding page, such as headers/footers/advertisements/etc).
>
> Thought these might be of interest re what it looks like to apply
> these techniques at scale and with a globalized scope and the open
> availability of the underlying computed datasets to enable all kinds
> of other research on online news.
>
> http://www.forbes.com/sites/kalevleetaru/2017/02/27/creating-a-massive-network-visualization-of-the-global-news-landscape-who-links-to-whom/
>
> http://www.forbes.com/sites/kalevleetaru/2017/02/25/what-does-artificial-intelligence-see-in-a-quarter-billion-global-news-photographs/
>
> http://www.forbes.com/sites/kalevleetaru/2017/02/21/visual-geocoding-a-quarter-billion-global-news-photographs-using-googles-deep-learning-api/
>
> http://www.forbes.com/sites/kalevleetaru/2017/02/22/mapping-global-happiness-in-2016-through-a-quarter-billion-news-articles/
>
> ~K
> http://kalevleetaru.com/
> http://blog.gdeltproject.org/
>
> _______________________________________________
> The Air-L at listserv.aoir.org mailing list
> is provided by the Association of Internet Researchers http://aoir.org
> Subscribe, change options or unsubscribe at:
> http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
>
> Join the Association of Internet Researchers:
> http://www.aoir.org/