[Air-L] sentiment analysis on Twitter for smoking cessation groups

Ashley Nicoles Sanders-Jackson asnsande at stanford.edu
Fri Feb 14 07:34:58 PST 2014


I am beginning a project for which we have 4 private smoking cessation
groups (3 months each) to analyze (we are writing a grant to collect more
data as well).  I have seen some of the work related to sentiment analysis
on Twitter, but I am interested in developing a system that is better
tailored to our data (e.g., being "smokefree" has a specific meaning in this
context).  We are therefore considering developing a system by content
coding a number of tweets (coded either by researchers or by smokers
themselves) for positive and negative valence and perhaps
some discrete emotions (e.g., sadness or hope).  How many coded tweets
would we need to train a simple machine learning system on our dataset
(for example, one of the many options available in R), and what are the best
out-of-the-box programs to use?  I know a bit about content analysis and about
smoking cessation but not so much about machine learning, so please bear in
mind that you are dealing with a novice.  If anyone who does know what they
are doing would be interested in collaborating on the project, they would
be welcome as well.
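
As a rough illustration of what "training a simple machine learning system"
on hand-coded tweets could look like, here is a minimal sketch.  It rests on
several assumptions that are not part of the project itself: it uses Python
rather than R, it uses a multinomial naive Bayes classifier with add-one
smoothing (just one common baseline, not a recommendation), and the four
example tweets and their codes are invented purely for illustration.  The
names tokenize, train, classify, and EXAMPLE_TWEETS are hypothetical.

```python
import math
from collections import Counter, defaultdict

PUNCT = ".,!?;:\"'()"

def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    words = (w.strip(PUNCT) for w in text.lower().split())
    return [w for w in words if w]

def train(coded_tweets):
    """Count words per code from a list of (text, code) pairs."""
    word_counts = defaultdict(Counter)  # code -> word -> count
    label_counts = Counter()            # code -> number of tweets
    vocab = set()
    for text, label in coded_tweets:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return word_counts, label_counts, vocab

def classify(text, model):
    """Return the code with the highest naive Bayes log-probability."""
    word_counts, label_counts, vocab = model
    total_tweets = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        # Log prior: share of training tweets that carry this code.
        score = math.log(label_counts[label] / total_tweets)
        # Add-one (Laplace) smoothing so unseen words do not zero out a code.
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented example data: stand-ins for tweets coded by researchers or smokers.
EXAMPLE_TWEETS = [
    ("two weeks smokefree and feeling great", "positive"),
    ("so proud to be smokefree today", "positive"),
    ("craving badly and feeling awful", "negative"),
    ("slipped up and had a cigarette feeling sad", "negative"),
]

model = train(EXAMPLE_TWEETS)
print(classify("still smokefree and proud", model))
print(classify("craving a cigarette and sad", model))
```

Because the classifier learns from our own coded tweets, a word such as
"smokefree" takes on whatever valence the coders give it in this context.
Discrete emotions (e.g., sadness or hope) could be handled the same way by
using those emotions as the codes instead of positive and negative.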

