[Air-L] new dataset using google's ai to annotate a week of tv news from the internet archive

kalev leetaru kalev.leetaru5 at gmail.com
Tue May 21 10:09:15 PDT 2019


For those interested in deep learning video annotation, we worked with the
Internet Archive to take a week of television news from CNN, MSNBC and
Fox News, along with the morning and evening broadcasts of ABC, CBS, NBC
and PBS, and analyze it using Google's Video AI, Vision AI, Speech-to-Text
and Natural Language APIs. The week we chose includes both the Notre Dame
fire and the Mueller report release, capturing how major international and
national stories are covered.

While the video and transcript files themselves are not available, all
614GB of deep learning annotation files are available for download,
cataloging the visual and spoken narratives and including the equivalent of
a reverse Google Images search for each 1fps frame.

We're particularly interested in how this dataset might jumpstart thinking
about multimodal efforts that connect the television and online worlds to
combat misinformation, disinformation and related problems.


The full annotations dataset is available for download:

https://blog.gdeltproject.org/ai-watching-television-news-deep-learning-meets-a-week-of-television/

A higher-level summary:

https://www.forbes.com/sites/kalevleetaru/2019/05/21/what-does-ai-see-when-it-watches-a-week-of-television-news/


Kalev


