[Air-L] Subject: Re: [External] Re: Buying tweets ?

Thu Sep 10 16:15:50 PDT 2020

Hello
I am genuinely curious about how the ethics of research on available
personal data is implemented. As a relatively new academic I would love to
do this type of research but see many ethical hurdles. I have stuck to the
organizational level rather than looking at data at the level of
individuals, which to me is fraught with ethical issues of data extraction
and exploitation.

When Clearview AI used data on publicly available images, many of us said
it was unethical. I have always wondered how a research study such as the
infamous gaydar experiment passed ethics protocols at a reputable
post-secondary academic institution.
https://thenextweb.com/artificial-intelligence/2018/02/20/opinion-the-stanford-gaydar-ai-is-hogwash/

And given studies such as the one done recently by Mozilla
https://www.zdnet.com/article/mozilla-research-browsing-histories-are-unique-enough-to-reliably-identify-users/
indicating that an individual is identifiable from 50-150 favorite sites,
"no exposure of any identifying information" is a meaningless phrase.

Sincerely
Ushnish Sengupta

Message: 3
Date: Thu, 10 Sep 2020 11:04:36 -0400
From: Deen Freelon <dfreelon at gmail.com>
To: "air-l at listserv.aoir.org" <air-l at listserv.aoir.org>
Subject: Re: [Air-L] [External] Re: Buying tweets ?
Message-ID: <83ec2d80-c91c-ea9e-6f2b-395297079452 at gmail.com>
Content-Type: text/plain; charset=utf-8; format=flowed

Sure, many countries have a right to be forgotten. The US doesn't, and
AFAIK there's little clear case law that applies to individuals'
presence in research datasets. If someone asks me to remove their data
from my datasets, I'm happy to do so, but I'm not willing to
prospectively monitor Twitter's platform for deletions so that my
datasets always match what is currently available on Twitter. That is
technically infeasible for me, and I suspect for many others as well.

The practicality aspect I mentioned applies also to users. You can ask
AIR-L members to remove your data, but what assurances do you have that
they've done so? It's impossible even to check that they've actually
read your message. Now consider all the other researchers' datasets of
which your data may be a part--there's no way to even know who to ask.
And all of this to prevent your data from being one point among
millions, with no exposure of any identifying information? It's little
wonder yours is the first data removal request I've ever received, but
as I said, I'll honor it. /DEEN

On 9/10/2020 10:49 AM, Stuart Shulman wrote:
> There is nothing theoretical about checking in real time for
> deletions. When you study a Tweet's content in the Twitter display, if
> a Tweet is deleted or an account suspended or deleted, the Tweet will
> not display. That is real time compliance. We have done it for many
> years now, all the while advising students and faculty on the
> ethical importance of this point.
>
> The "right to be forgotten" is law in many countries, so I am unsure
> how that is unresolved. Something is either legal or it is not. If
> anyone reading this has any of my deleted Tweets from my deleted
> account, the Canadian part of me requests you immediately delete them.
> If you lack the ability to check for compliance in real time, should
> you be handling my data and violating my right to be forgotten under
> the broad banner of research? I have tweeted extensively about acts by
> a hostile foreign power to game the imminent election. I have recently
> deleted personal Facebook, YouTube and Twitter accounts. Nobody has
> any business holding that data. It is unethical.
>
> There are various guidelines about legally sharing lists of Tweet IDs
> for rehydration and replication (something almost never done) versus
> sharing spreadsheets of complete data extracts or the raw JSON, which
> is done all the time in defiance of the Twitter ToS.
>
