[Air-L] Summary of Responses to Two Questions about Directory of Organizational Social Media and YouTube Accounts

Ronald Rice rrice at comm.ucsb.edu
Wed Sep 9 13:22:13 PDT 2020


Hi folks...

As usual (but not taken for granted), AoIR folks have provided a variety of
insightful and useful comments and resources about my two prior requests.
One asked whether there is a Twitter directory where you could enter a list
of official organization names and get their main Twitter account listing,
and the second asked the same about YouTube. Also as usual, what appeared to
be a simple question can involve a lot of issues.

- Twitter Username Extractor:

https://zackproser.com/software/username-extractor/

The splash page explains that the tool uses quite detailed (and darkly
humorous) procedures.

You log in with a Twitter account.  We tried it, but it seems like it’s no
longer working.

- Maurice RM Vergeer: “yes it can be done, using R and the package rtweet.
As for the YouTube question in the other, a similar approach could be done
with R and the package tuber. It probably needs a "do for" loop. Not sure
rtweet (beware, technical lingo ahead) is vectorized for this problem. A
loop will take some time though, given the large number of organizations.
Furthermore, because one query will return multiple results, some
semi-manual evaluation needs to take place to assess which account is the
actual account. But, anyone with some experience with R could do it.”
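
The semi-manual evaluation Maurice describes could be supported by ranking
the candidate accounts before a person reviews them. Below is a minimal
sketch in Python (not R, and not using rtweet), assuming you already have
the search results for one organization as (handle, display name) pairs.
The function names and the sample data are made up for illustration, and
string similarity only orders the candidates; a human still confirms the
match.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase the name and keep only letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def rank_candidates(org_name, candidates):
    """Return (score, handle, display_name) tuples, most similar first.

    candidates is a list of (handle, display_name) pairs, e.g. the
    results of one search query for one organization.
    """
    target = normalize(org_name)
    scored = []
    for handle, display_name in candidates:
        # Compare against both the handle and the display name and
        # keep the better of the two similarity scores.
        score = max(
            SequenceMatcher(None, target, normalize(handle)).ratio(),
            SequenceMatcher(None, target, normalize(display_name)).ratio(),
        )
        scored.append((round(score, 2), handle, display_name))
    return sorted(scored, reverse=True)


# Hypothetical search results for one organization (made-up data).
results = [
    ("example_fans", "Fans of Example Food Bank"),
    ("examplefoodbank", "Example Food Bank"),
    ("foodnews", "Food News Daily"),
]
for score, handle, display in rank_candidates("Example Food Bank", results):
    print(score, handle, display)
```

Candidates with low or near-tied scores are the ones most worth checking
by hand.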

- Nicole Lemire Garlic: As Maurice mentioned, the R tuber package can
easily pull the lists of videos from YouTube. There are vectorized and
loop-based methods for this. Feel free to email me if you’d like some
sample R scripts.

- Stu Shulman: “I would add that Maurice points to the non-trivial task of
disambiguation when an organization name overlaps terms in common usage.
For example, United Airlines is an organization, but it is most commonly
referred to as United. Manchester United is a very popular football
organization, most often referred to as United. The list of other
widespread uses of this common organization name sums up the
disambiguation problem. It can be done with training and machine-learning,
but not for 2000 terms unless you have an army of workers and lots of
money. That suggests a second point, essentially that the practical steps
required to gather data for 2000 organizations over time and remain
compliant with rate and query limits would be daunting. You might consider
trying the task with 5 organizations to assess the challenge of performing
the task at scale. Finally, from the view of qualitative research,
depending on your end goals, you may not need such a huge number of
organizations to reach saturation during analysis. That is, say you looked
at 50 organizations and then noticed on 51-60 that you were not learning
much you had not already learned. That is saturation.”
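
To get a rough feel for the scaling issue Stu raises, one could estimate
how long a single pass takes under a fixed-window rate limit. This is a
back-of-the-envelope sketch in Python; the limit of 900 queries per 15
minutes and the 3 queries per organization are assumptions chosen for
illustration, not figures from any platform's documentation.

```python
import math


def estimate_wait_minutes(n_orgs, queries_per_org, limit_per_window, window_minutes):
    """Estimate minutes spent waiting on a fixed-window rate limit.

    Ignores the time the requests themselves take, so it is a lower bound.
    """
    total_queries = n_orgs * queries_per_org
    windows_needed = math.ceil(total_queries / limit_per_window)
    # The first window starts immediately, so only the windows before
    # the last one have to be waited out in full.
    return max(windows_needed - 1, 0) * window_minutes


# Assumed limit: 900 queries per 15-minute window (illustrative only).
pilot = estimate_wait_minutes(5, 3, 900, 15)
full = estimate_wait_minutes(2000, 3, 900, 15)
print("pilot with 5 organizations:", pilot, "minutes of waiting")
print("full run with 2000 organizations:", full, "minutes of waiting")
```

Collecting data repeatedly over time multiplies the full-run figure by the
number of collection rounds.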

- Ed Summers: Based on our several email discussions, he developed a simple
program (called luckysocial) that takes a list of official organization
names and either finds each website through Google or, if you already have
the website, takes that directly, and then goes to the website to identify
their Facebook, Twitter, Instagram, YouTube, and RSS accounts. We ran that
using 2000+ nonprofit organization names and it worked great!
https://github.com/edsu/luckysocial#readme
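
The general idea of reading an organization's homepage and picking out its
social media links can be sketched as follows. This is not luckysocial's
code and is not based on its source; it is a simplified Python
illustration using only the standard library, with a hypothetical platform
list and sample HTML, and it skips the website lookup and RSS steps.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Domains treated as social media platforms (illustrative, not exhaustive).
PLATFORMS = {
    "facebook.com": "facebook",
    "twitter.com": "twitter",
    "instagram.com": "instagram",
    "youtube.com": "youtube",
}


class SocialLinkParser(HTMLParser):
    """Collect the first link to each known platform found in <a href> tags."""

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        host = urlparse(href).netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        platform = PLATFORMS.get(host)
        # Keep only the first link seen for each platform.
        if platform and platform not in self.found:
            self.found[platform] = href


def find_social_links(html):
    """Return a dict mapping platform name to the first matching link."""
    parser = SocialLinkParser()
    parser.feed(html)
    return parser.found


# Made-up homepage fragment for illustration.
sample = """
<footer>
  <a href="https://www.facebook.com/exampleorg">Facebook</a>
  <a href="https://twitter.com/exampleorg">Twitter</a>
  <a href="/about">About us</a>
</footer>
"""
print(find_social_links(sample))
```

Keeping only the first link per platform is a simplification; a homepage
that links to several accounts on one platform would need manual review.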

- Muira McCammon: Muira has done a lot of work identifying and tracking US
Federal and state government Twitter accounts. She also writes: If you do
want to get technical and/or exhaustive about this, it may be worth using
Wayback to double check that the social media accounts currently associated
with an org weren't preceded/predated by others. Many orgs these days will
have one primary Twitter account but then will launch smaller accounts
related to specific initiatives/campaigns/etc. A lot of orgs these days
aren't updating their homepages to reflect the full extent of their social
media presence.
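
One way to do the Wayback check Muira suggests is to ask the Internet
Archive's availability endpoint for the snapshot of an organization's
homepage closest to a given date, then look at which social media links
that older page carried. The sketch below is in Python and reflects my
understanding of that endpoint (the `url` and `timestamp` parameters and
the `archived_snapshots` response shape); treat those details as
assumptions to verify. The sample response values are made up, and
`lookup` is included but not called because it needs network access.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp):
    """Build a query for the snapshot closest to timestamp (YYYYMMDD)."""
    return API + "?" + urlencode({"url": page_url, "timestamp": timestamp})


def closest_snapshot(response):
    """Return (timestamp, snapshot_url), or None if nothing is archived."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["timestamp"], closest["url"]


def lookup(page_url, timestamp):
    """Fetch and parse the availability record (requires network access)."""
    with urlopen(availability_url(page_url, timestamp), timeout=30) as resp:
        return closest_snapshot(json.load(resp))


# Offline example with a response shaped like the expected JSON
# (all values here are made up).
sample_response = {
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20150101000000/http://example.org/",
            "timestamp": "20150101000000",
            "status": "200",
        }
    }
}
print(availability_url("example.org", "20150101"))
print(closest_snapshot(sample_response))
```

Comparing the social links on an archived homepage with the current ones
would show whether an earlier account was replaced.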

- Peter Timusk: Web sites tend to be heterogeneous and canned CMS sites
with very little CSS or HTML5 exposed to gather.

- Jakob Jünger: Interestingly, though identifying accounts is a common
task, no common protocols or methods have evolved so far. Cross-checking
web links, directories, social media search and so on is definitely
necessary to get high quality lists. Each of the sources brings its own
bias. Both of your questions can be solved using Facepager, see the getting
started tutorials on the wiki: https://github.com/strohne/Facepager. I
(Ron) checked this out and it’s quite a capable program, with lots of good
documentation, more of a general interface for using APIs to collect data
from various social media and YouTube. Learning this could be a good
investment of time and effort.



Thanks to all!
-- 
Ronald E. Rice
Arthur N. Rupe Professor in the Social Effects of Mass Communication
Department of Communication
4127 SS&MS Bldg
Santa Barbara, CA 93106-4020
805-893-8696; rrice at comm.ucsb.edu
https://www.comm.ucsb.edu/people/ronald-e-rice


