[Air-L] Trust Games - Annotation Tasks

Shulman, Stu stu at texifter.com
Wed Jul 17 05:18:58 PDT 2024


Follow Up

I am grateful to the list for connecting me with excellent people willing to work
on projects. Already we have a nice group forming with novel exchanges of
ideas in every meeting. Collaboration like this is fun and hopefully the
game, Trust Defender, will be as well.

I do need more students, including some willing to label Twitter data as
soon as today. I am testing a new approach to remunerating "micro" tasks
that are limited to 5 minutes. I need to know the average speed, as well as
the approximate best task size, and how fast the best human annotators can
record reliable observations. I just did a small test and my rate was 1.45
seconds per item. For this initial "live-deleted-suspended" task, the key
is a combined speed and accuracy score. If you make a keystroke error, you
need to go back and fix it, which slows you down. If you let mistakes go
unfixed, we will find them using measurement against a known gold standard
during the initial rounds of training on this task.
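
To make the speed and accuracy measurement concrete, here is a minimal sketch in Python of how one timed round could be checked against a gold standard. The function name score_round and the sample labels are hypothetical, and since no formula for combining speed and accuracy is given above, the sketch reports the two numbers separately rather than guessing at one.

```python
def score_round(labels, gold, seconds_elapsed):
    """Return (accuracy, seconds_per_item) for one timed labeling round.

    Hypothetical helper: the real scoring rule is not specified here,
    so speed and accuracy are reported side by side, not combined.
    """
    if len(labels) != len(gold):
        raise ValueError("labels and gold must have the same length")
    if not gold:
        raise ValueError("a round must contain at least one item")
    # Count the items where the annotator's label matches the gold label.
    correct = sum(1 for got, want in zip(labels, gold) if got == want)
    accuracy = correct / len(gold)
    seconds_per_item = seconds_elapsed / len(gold)
    return accuracy, seconds_per_item


# Made-up example data for the "live-deleted-suspended" task.
gold = ["live", "deleted", "suspended", "live"]
labels = ["live", "deleted", "live", "live"]
accuracy, pace = score_round(labels, gold, seconds_elapsed=8.0)
print(f"accuracy={accuracy:.2f}, pace={pace:.2f} s/item")
```

An uncorrected keystroke error shows up here as lower accuracy, while going back to fix it shows up as a slower pace.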

The base remuneration is $20 for completion of tasks that have a total of
40 minutes allotted (8 tasks) during a roughly 24 hour period. It is
definitely possible to complete all of these current tasks with close to
100% accuracy in less than 40 minutes. I am planning to start with a
requirement for labeling 60 items in five minutes (5 seconds per item) in
the early rounds. Then we will test the effect of increasing the number of
items to 75 and 100 in later rounds. I'm trying to work out a prize system
for the fastest and most accurate annotators over time. We have recently
enhanced our IRB compliance architecture for this work. As previously,
deleted and suspended Tweets are not visible and now all Tweet metadata are
also hidden from annotators. Our datasets have, in some cases, more than
70% of the items suspended. In these cases, it can be fascinating (also
shocking) and definitely educational to see what is still live on Twitter.
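
For anyone checking the pacing and pay figures quoted earlier in this message, the arithmetic can be sketched in a few lines of Python. The constant and function names are hypothetical; the numbers (5-minute tasks, 60/75/100 items, $20 for 40 allotted minutes, and my 1.45 seconds per item test) come from the text above.

```python
import math

TASK_SECONDS = 5 * 60      # each "micro" task is limited to 5 minutes
BASE_PAY_DOLLARS = 20      # base remuneration for the full set of tasks
ALLOTTED_MINUTES = 40      # total time allotted across the 8 tasks


def seconds_per_item(items_per_task):
    """Time budget per item when a task must be finished in 5 minutes."""
    return TASK_SECONDS / items_per_task


def hourly_rate():
    """Effective hourly rate if the full 40 allotted minutes are used."""
    return BASE_PAY_DOLLARS * 60 / ALLOTTED_MINUTES


def items_possible(pace_seconds):
    """Whole items that fit in one 5-minute task at a given pace."""
    return math.floor(TASK_SECONDS / pace_seconds)


for items in (60, 75, 100):
    print(f"{items} items per task: {seconds_per_item(items):.1f} s per item")

print(f"effective rate: ${hourly_rate():.2f} per hour")
print(f"items in 5 minutes at 1.45 s each: {items_possible(1.45)}")
```

Moving from 60 to 100 items per task cuts the budget from 5 seconds to 3 seconds per item, which is still about twice the 1.45 seconds per item I recorded in my own small test.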

One goal is to better understand how DiscoverText can be further modified
to have default, ready-to-play games, as well as the traditional framework
where it is simply a flexible tool, like a spreadsheet. Leaders of the
games will create new tasks, codes, rules, parameters, code books,
assignments, peer groups, and focus the students. Much of the content of
the games can be created by the professors or teams of students. Our role
now is creating archetypal game formats such as the deductive "live,
deleted, suspended" or a key new hybrid deductive/inductive "trust, don't
trust, need more info" game that has nothing to do with speed or accuracy.
More fully inductive educational games such as "bot, troll, or citizen" are
possible. Small adaptations to the existing crowdsource software will make
the "labeling" task more game-like. Students will train or play on default
games out-of-the-box that are easy to launch, but professors will also
adapt the framework with very minor training to define parameters for their
own games. I will be working with professors to design rubrics so that
students can play very short games in the fall then use the experience as
fodder for discussion.

As a reminder: DiscoverText is 100% free for academic research and
teaching. Anyone using DiscoverText to test or implement these annotation
games can also pursue their own independent research agenda with
other academics. We are creating a process that will streamline the
onboarding of entire classes to enable "quick launch" games. There will be
leaderboards and anonymization options.

https://discovertext.com/mentions/
https://calendly.com/discovertext

Stu




On Thu, Jul 11, 2024 at 8:38 AM Shulman, Stu <stu at texifter.com> wrote:

> Trust Games
>
> I am looking for collaborators to help prepare a free educational online
> game suitable for secondary and collegiate classrooms focused on whether
> content is trustworthy or not. Please contact me if you want to be a part
> of this effort. It might operate something like the "Which Face is Real"
> application, but might be used instead for identifying and discussing
> untrustworthy accounts on Twitter as a gamified learning module for classes
> this fall. I have most of the pieces ready, but I am not an expert in
> games. I'd like to form an ad hoc team and have this operational for
> September and October of 2024. My goal is to offer an IRB-compliant game
> platform that generates usable research results and better informed student
> discussions in advance of the U.S. election in November.
>
> Annotation Tasks
>
> I have a new set of annotation tasks related to planning for the game
> development. I need motivated undergraduates willing to label batches of
> Tweets under conditions that test core features of gamification in
> labeling, starting with speed and accuracy. In addition to getting paid
> more for being the fastest/most accurate labelers, students will see some
> remarkable datasets that have historical significance. If you know Jr. or
> Sr. undergraduates in the US or Canada with a >3.9 GPA, tell them to send
> me a resume.
>
> Thanks AoIR!
>
> --
> Dr. Stuart W. Shulman
> Founder and CEO, Texifter
> Editor Emeritus, *Journal of Information Technology & Politics*
>
>
>

-- 
Dr. Stuart W. Shulman
Founder and CEO, Texifter
Editor Emeritus, *Journal of Information Technology & Politics*
ResearchGate Profile <https://www.researchgate.net/profile/Stuart-Shulman>


