[Air-L] Text/Data Mining Software Suggestions: for YouTube, Facebook & Instagram?

Brooke Criswell bcriswell at email.fielding.edu
Tue Nov 10 09:35:37 PST 2020


That's wild to me.
Thanks for the discussion, information, and links! Appreciate it all.

On Tue, Nov 10, 2020, 11:32 AM Bernhard Rieder <berno.rieder at gmail.com>
wrote:

> Brooke,
>
> I am no legal expert myself in any form or function, but here is a case in
> the US that made the rounds some time ago:
> https://arstechnica.com/tech-policy/2019/09/web-scraping-doesnt-violate-anti-hacking-law-appeals-court-rules/
>
> This may also get interesting:
> https://www.wsj.com/articles/facebook-seeks-shutdown-of-nyu-research-project-into-political-ad-targeting-11603488533?mod=hp_lista_pos1
>
> The problem is that the legal situation is simply not 100% clear, either
> in the US or in Europe.
>
> Best,
> Bernhard
>
> > On 10 Nov 2020, at 17:19, Brooke Criswell <bcriswell at email.fielding.edu>
> wrote:
> >
> > Bernhard,
> >
> > Do you have links to any of those court cases that you could send me? I
> would really love to learn more about this subject, especially as an early
> career researcher. And are the laws very different in Europe compared to
> the US? (I am in the US.)
> >
> > This is all so interesting to me!
> >
> >
> >
> > On Tue, Nov 10, 2020, 11:11 AM Bernhard Rieder <berno.rieder at gmail.com>
> wrote:
> > Hi again,
> >
> > No need to apologize, Brooke, we are all in a situation that is marred
> by insecurity, opacity, and conflicting information. My apologies if my
> comments came off too strong, also to Stuart.
> >
> > With regard to Facepager, the tool made it through Facebook's app
> review (Jakob, I'll have to ping you sometime soon to ask how you did it),
> which means that its functionalities were audited by the company, giving
> some legal security. This does not, of course, eliminate ethical questions.
> >
> > With regard to Instagram, what baffles me is that scraping via
> instaloader actually works better than data retrieval via the API ever did,
> which means that there is some level of acquiescence. One can easily
> collect hundreds of thousands of posts for a given hashtag.
> >
> > What Mirko is saying about university support is super important, but I
> also want to highlight the great work by AlgorithmWatch and colleagues Jef
> Ausloos, Paddy Leerssen and Pim ten Thije on legal frameworks for more
> robust data access, for example here:
> https://www.ivir.nl/publicaties/download/GoverningPlatforms_IViR_study_June2020-AlgorithmWatch-2020-06-24.pdf
> >
> > This may be naive, but I have the hope that the upcoming EU Digital
> Services Act will have some provisions for academic research, or at least
> some clarifications. The current situation is creating serious chilling
> effects for research, without protecting data subjects from the most
> predatory practices, since scraping works so well (in technical terms) in
> many cases - or not at all in others. Commissioner Vestager has sent some
> positive signals in that direction.
> >
> > The reason I am very hesitant about taking ToS as legal gospel is a)
> that courts have ruled otherwise when it comes to scraping and b) that I
> find the idea that platform companies can dictate what we are able to know
> about platforms, how they operate and what happens on them highly
> problematic and worth fighting against. Jeanette Hofmann and I have a paper
> on that front forthcoming very soon ,-)
> >
> > All the best,
> > Bernhard
> >
> >
> > > On 10 Nov 2020, at 15:53, Brooke Criswell <
> bcriswell at email.fielding.edu> wrote:
> > >
> > > My apologies. I was just passing along what I have been told because
> of privacy settings within Facebook and Instagram. Facebook told me
> specifically that there is no "legal" way to scrape comments or similar
> content. As for likes, shares, etc., I have no idea. I am by no means an
> expert on all the ways to collect data and was not aware of options like
> Facepager. I just know Facebook is very strict with its data, especially
> because of the privacy policy and the settings people can individually
> choose. I have also been told that Facebook closed off its API except for
> collaborations or for researchers specifically approved to get data from
> its research team.
> > >
> > > Very sorry if I gave wrong information. This is just what I have
> learned and been told and would never want anyone to get into trouble or
> collect items they weren't technically supposed to.
> > >
> > > Best of luck and if you do find anything please share!
> > >
> > > Take care all.
> > >
> > > On Tue, Nov 10, 2020, 5:35 AM Bernhard Rieder <berno.rieder at gmail.com>
> wrote:
> > > Dear colleagues,
> > >
> > > I would like to disagree with Brooke here. Facebook data can still be
> accessed through the API rather than by scraping, most importantly with the
> awesome Facepager.
> > >
> > > For Instagram, scraping is indeed the go-to technique (instaloader
> works very well) and I would like to defend the idea that ToS should not
> hinder researchers if the social relevance of the topic warrants it.
> Adhering to corporate policy is not the gold standard for what independent
> research should strive for, in my view. Proposing topics to people at
> Facebook may be a strategy for certain topics, but for anything that does
> not fit within the narrow interests of the platform, this will most likely
> go nowhere.
> > >
> > > For YouTube, you can also check out the YouTube Data Tools that I have
> been maintaining here: https://tools.digitalmethods.net/netvizz/youtube/
> > >
> > > All the best,
> > > Bernhard
> > >
> > >
> > > > On 10 Nov 2020, at 05:22, Brooke Criswell via Air-L <
> air-l at listserv.aoir.org> wrote:
> > > >
> > > > Facebook and Instagram are strict and, according to their terms and
> conditions,
> > > > they don't allow any data scraping.
> > > >
> > > > Your best bet is to propose your study to a researcher at Facebook.
> > > >
> > > > On Mon, Nov 9, 2020, 2:21 AM Alexandre Leroux <alleroux at ulb.ac.be>
> wrote:
> > > >
> > > >> Facepager for FB and YT: it has a user interface and decent
> documentation.
> > > >>
> > > >> There are scrapers for Instagram, but those don't comply with the
> > > >> platform's terms of use and, afaik, are terminal-only.
> > > >>
> > > >>
> > > >> On 6/11/20 14:59, Cristina Migliaccio wrote:
> > > >>> Dear Colleagues,
> > > >>>
> > > >>> Advance apologies if this question has been addressed (as I am
> certain it
> > > >>> has been) in some previous forum/email---does an easy to use
> text/data
> > > >>> mining software/platform exist that works across these 3 social
> media
> > > >>> platforms: YouTube, Facebook & Instagram?
> > > >>>
> > > >>> I would like to collect data on alphabetic features but also
> > > >> paralinguistic
> > > >>> features such as likes, shares, etc.
> > > >>>
> > > >>> Any suggestions whatsoever for a text/data mining beginner would be
> > > >> greatly
> > > >>> appreciated (videos, lectures to this end also appreciated!)
> > > >>>
> > > >>> Warm thanks-
> > > >>> Cristina Migliaccio
> > > >>> _______________________________________________
> > > >>> The Air-L at listserv.aoir.org mailing list
> > > >>> is provided by the Association of Internet Researchers
> http://aoir.org
> > > >>> Subscribe, change options or unsubscribe at:
> > > >> http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
> > > >>>
> > > >>> Join the Association of Internet Researchers:
> > > >>> http://www.aoir.org/
> > > >>>
> > > >>
> > > >> --
> > > >> Alexandre Leroux
> > > >> Ph.D candidate
> > > >> Group for research on Ethnic Relations, Migrations and Equality
> (GERME)
> > > >> Université Libre de Bruxelles (ULB)
> > > >> alleroux at ulb.ac.be
> > >
> >
>
>


