[Air-L] Texifter News Digest

Shulman, Stu stu at texifter.com
Sun Feb 11 04:04:05 PST 2018

Bot Project Update

In early January we advertised on the AoIR listserv for coders. More than
50 people applied (in just 90 minutes) and we offered work to ten. A small
group of coders ultimately began exploring the bot problem in Twitter data.
What resulted is a work in progress. It is intended to be a collaborative
resource for anyone interested in the challenge that bots pose to academic
research, democracy, civil society, data-driven journalism, and any other
realm of life where bots are causing issues. We invite your comments about
the project.


New Paper of Note

Many DiscoverText studies feature Twitter data. It was nice to see an
ambitious new coding project emerge using Reddit data. Please have a close
look at the methods details here as they are illustrative of the
collaborative challenges when labeling any text data.


CoderRank Conference Paper

For the first time in several years, we have written our own conference
paper about the core collaborative annotation methods that underlie our
approach to humans and machines learning together. You can read it here and
we welcome comments.


Important New Paper on the Twitter APIs

Rebekah Tromble and Daniela Stockmann have a superb new paper out shedding
light on the dark corners of the Twitter Search and Streaming APIs. This
is an absolute must-read paper for anyone who uses Twitter data in academic
research. They note: "List count, verified, hashtag count, and reply are
statistically significant and positively correlated with the likelihood of
Search API capture." http://bit.ly/2nVtXTS

Gnip Twitter Metadata Dictionary

We get this question in many forms: what do the labels in Gnip Twitter
metadata mean? As one of the original group of Gnip Twitter data vendors,
we have witnessed the amazing evolution of this data. Sometimes our
DiscoverText users are looking to create a filtering “rule” to apply
against the real-time Gnip PowerTrack. Other times, they have used Sifter to
collect historical Twitter data and they are trying to interpret the
metadata payload that includes field names like “twitter quoted status
actor followers count.”
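
As a rough illustration of how such a label can be read, the sketch below
treats the words in the label as a path into a nested JSON record. The key
names used here (twitter_quoted_status, actor, followersCount) and the
sample record are assumptions made for this example, and get_nested is a
hypothetical helper, not part of DiscoverText, Sifter, or any Gnip library.
Check the names against your own export before relying on them.

```python
from typing import Any, Optional


def get_nested(payload: dict, path: str, default: Optional[Any] = None) -> Any:
    """Walk a dotted path through nested dicts; return default if any key is missing."""
    current: Any = payload
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Hypothetical, heavily trimmed record; real payloads carry many more fields.
activity = {
    "body": "An example tweet quoting another tweet",
    "twitter_quoted_status": {
        "actor": {"preferredUsername": "example_user", "followersCount": 1234},
    },
}

# "twitter quoted status actor followers count" read as a nested path
# (assumed key names, see the note above).
quoted_followers = get_nested(activity, "twitter_quoted_status.actor.followersCount")
print(quoted_followers)
```

Read this way, the label describes the follower count of the account whose
tweet was quoted, rather than the follower count of the account doing the
quoting.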


Free Web Training Sessions

If you would like to schedule a 30-minute DiscoverText web briefing, we
have a number of openings over the remainder of February. Schedule a
meeting here.


Campus Visits

Like other small business owners with academic software solutions, I make
regular visits to universities to offer free methods workshops. Attendees
get a sponsored multi-user account for 6 months at no cost. The methods are
solid, interdisciplinary innovations that will change the way you think
about the role of human coders and machine learning. Please send an email
to info at texifter.com if the librarians at your university would like to
host a cost-free event.

If there is anything we can do to help with your text analytics needs, just
let us know. There are a number of new features emerging in the software
every month.




Dr. Stuart W. Shulman

Founder and CEO, Texifter
