[Air-L] emotion detection machine?

kalev leetaru kalev.leetaru5 at gmail.com
Thu Sep 5 07:00:40 PDT 2019


Charles, there are myriad sentiment tools out there, both commercial and
academic, ranging from traditional presence/absence dictionaries through
value-based dictionaries to statistical and now neural approaches. Given
that, the most important methodological consideration is how well the
specific tool's performance characteristics align with your material along
several dimensions: the medium (in your case, a newspaper); the
language/grammar (the formality and editorial structure of the paper and
how closely that aligns with the tool's training dataset); the era (many
tools are only updated periodically and/or were trained on content from a
specific period, and can have severe mismatches even for mainstream media);
the domain (most tools are not domain-adapted, which can cause severe
problems in certain domains, such as when a news source refers to "the
democratic party" but just "republicans", with "party" systematically
yielding a more positive score for the former); and, most importantly, the
definition of the specific emotion (i.e., there is no universal "anxiety"
score).
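
To make the "party" problem concrete, here is a minimal sketch of a
value-based dictionary scorer. The lexicon, its weights, and the function
names are invented for illustration and do not come from any real tool; the
point is only that a generic positive entry for "party" shifts the score of
any sentence naming the "Democratic Party", and that a simple domain
adaptation (dropping that entry) removes the shift.

```python
import re

# Hypothetical toy lexicon (word -> valence); invented for illustration only.
TOY_LEXICON = {"party": 1.0, "win": 1.0, "crisis": -1.0, "attack": -1.0}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def valence_score(text, lexicon=TOY_LEXICON):
    """Value-based dictionary score: mean valence over all tokens."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(lexicon.get(tok, 0.0) for tok in tokens) / len(tokens)


def domain_adapted_score(text, drop=("party",), lexicon=TOY_LEXICON):
    """Same scorer, with domain-ambiguous entries removed from the lexicon."""
    adapted = {word: val for word, val in lexicon.items() if word not in drop}
    return valence_score(text, adapted)


dem = "The Democratic Party criticized the bill"
rep = "Republicans criticized the bill"

# The unadapted lexicon scores the first sentence higher purely because of
# the word "party"; the adapted lexicon treats both sentences the same.
print(valence_score(dem), valence_score(rep))
print(domain_adapted_score(dem), domain_adapted_score(rep))
```

Real tools differ in tokenization, weighting and normalization, so this
only shows the mechanism, not the size of the effect in any given tool.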

Typically this involves reviewing the validation studies for each potential
tool and comparing them along these dimensions, though methodologically the
definition of the specific measure is a very important piece that is often
missed.

With GDELT, we've run 40 common tools, totaling a few thousand dimensions,
over roughly a billion global news articles
(https://blog.gdeltproject.org/?s=gcam), including multilingual versions of
some of the tools to test the impact of various translation approaches on
how well emotion is recovered. You can therefore compare dimensions and
see, for example, how different "anxiety" dimensions behave for outlets and
domains most similar to yours. Often the differences in the response curves
are themselves quite an informative signal for some applications. All of
the scores are open data, so you can get a sense of how each tool responds.
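
As a rough sketch of what comparing two "anxiety" dimensions might look
like, assume you have already extracted two daily score series for the same
outlet from two different tools. The numbers below are placeholders, not
GDELT values, and the function names are my own. Because tools report on
different scales, the sketch standardizes each series before comparing them.

```python
from math import sqrt


def zscores(values):
    """Standardize a series to mean 0 and (population) standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / n)
    if sd == 0:
        return [0.0] * n
    return [(v - mean) / sd for v in values]


def pearson(xs, ys):
    """Pearson correlation between two equal-length series."""
    if len(xs) != len(ys):
        raise ValueError("series must have the same length")
    zx, zy = zscores(xs), zscores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(xs)


def largest_gap(xs, ys):
    """Index at which the two standardized curves diverge the most."""
    zx, zy = zscores(xs), zscores(ys)
    gaps = [abs(a - b) for a, b in zip(zx, zy)]
    return max(range(len(gaps)), key=gaps.__getitem__)


# Placeholder daily "anxiety" scores from two hypothetical tools.
tool_a = [0.10, 0.12, 0.30, 0.11, 0.09]
tool_b = [2.0, 2.1, 2.2, 3.5, 2.0]

print("correlation:", pearson(tool_a, tool_b))
print("day of largest disagreement:", largest_gap(tool_a, tool_b))
```

A low correlation, or a day on which the curves sharply disagree, is a
prompt to read the underlying texts and check which definition of the
emotion each tool is actually capturing.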

Kalev



On Thu, Sep 5, 2019 at 6:52 AM Charles M. Ess <c.m.ess at media.uio.no> wrote:

> Dear colleagues,
>
> One of our students wants to analyze emotional content in the
> comment fields of a major newspaper vis-a-vis specific hot-button issues.
>
> She has a good tool (I think) for scraping the data - but she is
> stymied over the choice of an emotion analysis tool. She has looked at
> Senpy (http://senpy.gsi.upm.es/#test) and Twinword
> <https://www.twinword.com/api/emotion-analysis.php> - the latter seems
> the most accurate, but it is also expensive.
> She has recently discovered DepecheMood emotion lexicons (Staiano, J., &
> Guerini, M. (2014). Depechemood: a lexicon for emotion analysis from
> crowd-annotated news. arXiv preprint arXiv:1405.1605.) - but this
> suffers from a lack of clarity in terms of explaining its emotional
> categories: awe, indifference, sad, amusement, annoyance, joy, fear and
> anger.
>
> For my part, I am entirely clueless.  Any suggestions that she might
> pursue would be greatly appreciated.
>
> best,
> - charles ess
> --
> Professor in Media Studies
> Department of Media and Communication
> University of Oslo
> <http://www.hf.uio.no/imk/english/people/aca/charlees/index.html>
>
> Postboks 1093
> Blindern 0317
> Oslo, Norway
> c.m.ess at media.uio.no
> _______________________________________________
> The Air-L at listserv.aoir.org mailing list
> is provided by the Association of Internet Researchers http://aoir.org
> Subscribe, change options or unsubscribe at:
> http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
>
> Join the Association of Internet Researchers:
> http://www.aoir.org/
>


