[Air-L] mining images in social sciences

Fri Mar 22 06:29:11 PDT 2019

Alina, if the imagery dimension is more important than the social media
dimension (and noting that news media often republish the most iconic
imagery from social media), our open data catalog of global online news
imagery from 2015 to the present (around half a billion images totaling a
quarter trillion pixels and around 300 billion computed datapoints) might
be of great interest:

https://blog.gdeltproject.org/vgkg-2-0-released/

Each day it scans the images from online news coverage worldwide and
selects a random sample of around 700K images to run through Google's
Cloud Vision API, creating a fully annotated metadata record for each
image. The record holds the URL of the image, the URL of the first article
it was seen in, and a huge amount of computed metadata. That metadata spans
10K+ visually assigned labels, 2M+ entities computed from its textual
captions everywhere it appeared on the web across languages, its estimated
geographic location if recognizable, whether it is likely to depict
violence, the average facial emotion, OCR of all text in the image in its
original language (this includes protest signs), all EXIF/IPTC/XMP
metadata in the file itself, all of the other locations on the web where
the image or any piece of it appears (essentially a reverse Google Images
search), and a huge range of other attributes. The records can be linked
against our main knowledge graph to find all of the other articles the
image appeared in, allowing you to compare textual and visual narratives.
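
As a rough illustration of how records like these might be filtered once
downloaded, here is a minimal Python sketch. The ImageRecord fields, the
find_images helper, and the example.org URLs are hypothetical
simplifications for illustration only, not the actual catalog schema; the
post linked above describes the real file format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ImageRecord:
    """Hypothetical, simplified stand-in for one annotated image record."""
    image_url: str
    article_url: str  # first article the image was seen in
    labels: List[str] = field(default_factory=list)  # visually assigned labels
    ocr_text: str = ""  # text recognized inside the image, if any


def find_images(records: List[ImageRecord], label: str,
                ocr_keyword: Optional[str] = None) -> List[ImageRecord]:
    """Return records carrying `label` (case-insensitive).

    If `ocr_keyword` is given, also require it to occur in the OCR text.
    """
    wanted = label.lower()
    hits = []
    for rec in records:
        if wanted not in [name.lower() for name in rec.labels]:
            continue
        if ocr_keyword and ocr_keyword.lower() not in rec.ocr_text.lower():
            continue
        hits.append(rec)
    return hits


# Made-up sample data; real records would be parsed from the catalog files.
sample = [
    ImageRecord("https://example.org/a.jpg", "https://example.org/story-a",
                ["Protest", "Crowd"], "NO MORE CUTS"),
    ImageRecord("https://example.org/b.jpg", "https://example.org/story-b",
                ["Dog", "Park"]),
    ImageRecord("https://example.org/c.jpg", "https://example.org/story-c",
                ["protest"], ""),
]

protest_images = find_images(sample, "protest")
with_signs = find_images(sample, "protest", ocr_keyword="cuts")
print([rec.image_url for rec in protest_images])
print([rec.image_url for rec in with_signs])
```

Combining a visual label with an OCR keyword is one simple way to narrow
a set of protest images down to those whose signs mention a given topic.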

Kalev

On Fri, Mar 22, 2019 at 8:43 AM Alina Curticapean <
alina.curticapean at gmail.com> wrote:

> Dear fellow members,
>
> I am planning a research project which aims to study political
> participation by visual means. More precisely, the project plans to collect
> images related to e.g. political protest, immigration, animal rights from
> social media and use computational methods to mine them. I am a real novice
> in what concerns mining images in social sciences and I would need your
> help to guide me to the relevant literature. I would just add that text
> data will be collected as well, but I am quite familiar with mining methods
> for text data (NLP).
>
> Looking forward to helpful advice to get me started with the project I wish
> you all a nice weekend!
>
> With the best wishes,
>
> Alina
> _______________________________________________
> The Air-L at listserv.aoir.org mailing list
> is provided by the Association of Internet Researchers http://aoir.org
> Subscribe, change options or unsubscribe at:
> http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
>
> Join the Association of Internet Researchers:
> http://www.aoir.org/
>