[Air-L] Text/Data Mining Software Suggestions: for YouTube, Facebook & Instagram?

Brooke Criswell bcriswell at email.fielding.edu
Tue Nov 10 14:16:12 PST 2020


Nicole, this is great! Thanks for sharing.

On Tue, Nov 10, 2020, 4:14 PM Nicole Lemire Garlic <nlgarlic at temple.edu>
wrote:

> Hello all,
>
> The Building Legal Literacies for Text Data Mining (Building LLTDM)
> Institute at Berkeley last year put together some useful tutorials on all
> the issues being discussed here (copyright, terms of service, international
> approaches, ethics, privacy, etc.) for text data mining.
>
> Here is a link to one video but there's a whole playlist.
>
> https://www.youtube.com/watch?reload=9&v=ClFG2DAFMzM
>
> Nikki
>
> On Tue, Nov 10, 2020 at 4:48 PM <air-l-request at listserv.aoir.org> wrote:
>
> >
> > Today's Topics:
> >
> >    1. Re: Text/Data Mining Software Suggestions: for YouTube,
> >       Facebook & Instagram? (Bernhard Rieder)
> >    2. Re: Text/Data Mining Software Suggestions: for YouTube,
> >       Facebook & Instagram? (Brooke Criswell)
> >    3. Re: Text/Data Mining Software Suggestions: for YouTube,
> >       Facebook & Instagram? (Schaefer, M.T. (Mirko))
> >    4. Re: Text/Data Mining Software Suggestions: for YouTube,
> >       Facebook & Instagram? (Ganiat.Kazeem)
> >
> >
> > ----------------------------------------------------------------------
> >
> > Message: 1
> > Date: Tue, 10 Nov 2020 17:32:10 +0000
> > From: Bernhard Rieder <berno.rieder at gmail.com>
> > To: Brooke Criswell <bcriswell at email.fielding.edu>
> > Cc: Air-L at listserv.aoir.org
> > Subject: Re: [Air-L] Text/Data Mining Software Suggestions: for
> >         YouTube, Facebook & Instagram?
> > Message-ID: <40C672E9-C53F-4851-8B7E-4A87C419717D at gmail.com>
> > Content-Type: text/plain;       charset=utf-8
> >
> > Brooke,
> >
> > I am no legal expert myself in any form or function, but here is a case
> > in the US that made the rounds some time ago:
> >
> > https://arstechnica.com/tech-policy/2019/09/web-scraping-doesnt-violate-anti-hacking-law-appeals-court-rules/
> >
> > This may also get interesting:
> >
> > https://www.wsj.com/articles/facebook-seeks-shutdown-of-nyu-research-project-into-political-ad-targeting-11603488533?mod=hp_lista_pos1
> >
> > The problem is that the legal situation is simply not 100% clear, either
> > in the US or in Europe.
> >
> > Best,
> > Bernhard
> >
> > > On 10 Nov 2020, at 17:19, Brooke Criswell <bcriswell at email.fielding.edu> wrote:
> > >
> > > Bernhard,
> > >
> > > Do you have links to any of those court cases that you could send me? I
> > > would really love to learn more about this subject, especially as an
> > > early-career researcher. And are the laws very different in Europe
> > > compared to the US? (I am in the US).
> > >
> > > This is all so interesting to me!
> > >
> > >
> > >
> > > On Tue, Nov 10, 2020, 11:11 AM Bernhard Rieder <berno.rieder at gmail.com> wrote:
> > > Hi again,
> > >
> > > No need to apologize, Brooke, we are all in a situation that is marred
> > > by insecurity, opacity, and conflicting information. My apologies if my
> > > comments came off too strong, also to Stuart.
> > >
> > > With regard to Facepager, the tool made it through Facebook's app
> > > review (Jakob, I'll have to ping you sometime soon to ask how you did it),
> > > which means that its functionalities were audited by the company, giving
> > > some legal security. This does not, of course, eliminate ethical questions.
> > >
> > > With regard to Instagram, what baffles me is that scraping via
> > > instaloader actually works better than data retrieval via the API ever did,
> > > which means that there is some level of acquiescence. One can easily get
> > > up to hundreds of thousands of posts for a given hashtag.
> > >
> > > What Mirko is saying about university support is super important, but I
> > > also want to highlight the great work by AlgorithmWatch and colleagues Jef
> > > Ausloos, Paddy Leerssen and Pim ten Thije on legal frameworks for more
> > > robust data access, for example here:
> > > https://www.ivir.nl/publicaties/download/GoverningPlatforms_IViR_study_June2020-AlgorithmWatch-2020-06-24.pdf
> > >
> > > This may be naive, but I have the hope that the upcoming EU Digital
> > > Services Act will have some provisions for academic research, or at least
> > > some clarifications. The current situation is creating serious chilling
> > > effects for research, without protecting data subjects from the most
> > > predatory practices, since scraping works so well (in technical terms) in
> > > many cases - or not at all in others. Commissioner Vestager has sent some
> > > positive signals in that direction.
> > >
> > > The reason why I am very hesitant about taking ToS as legal gospel is a)
> > > that courts have ruled otherwise when it comes to scraping and b) because I
> > > find the idea that platform companies can dictate what we are able to know
> > > about platforms, how they operate and what happens on them highly
> > > problematic and worth fighting against. Jeanette Hofmann and I have a paper
> > > on that front coming out very soon ,-)
> > >
> > > All the best,
> > > Bernhard
> > >
> > >
> > > > On 10 Nov 2020, at 15:53, Brooke Criswell <bcriswell at email.fielding.edu> wrote:
> > > >
> > > > My apologies. I was just passing along what I have been told about
> > > > privacy settings within Facebook and Instagram. I have been told
> > > > specifically by Facebook that there is no "legal" way to scrape comments
> > > > or similar content. As for likes, shares, etc., I have no idea. I am by
> > > > no means an expert in all of the methods and was not aware of others
> > > > like Facepager. I just know Facebook is very strict with its data,
> > > > especially because of the privacy policy and the settings people can
> > > > individually choose. I have been told Facebook closed off its API except
> > > > for collaborations or for researchers specifically accepted to get data
> > > > from its research team.
> > > >
> > > > Very sorry if I gave wrong information. This is just what I have
> > > > learned and been told, and I would never want anyone to get into trouble
> > > > or collect items they weren't technically supposed to.
> > > >
> > > > Best of luck and if you do find anything please share!
> > > >
> > > > Take care all.
> > > >
> > > > On Tue, Nov 10, 2020, 5:35 AM Bernhard Rieder <berno.rieder at gmail.com> wrote:
> > > > Dear colleagues,
> > > >
> > > > I would like to disagree with Brooke here. Facebook data can still be
> > > > accessed through non-scraping, API-based access, most importantly via
> > > > the awesome Facepager.
> > > >
> > > > For Instagram, scraping is indeed the go-to technique (instaloader
> > > > works very well) and I would like to defend the idea that ToS should not
> > > > hinder researchers if the social relevance of the topic warrants it.
> > > > Adhering to corporate policy is not the gold standard for what independent
> > > > research should strive for, in my view. Proposing topics to people at
> > > > Facebook may be a strategy for certain topics, but for anything that does
> > > > not fit within the narrow interests of the platform, this will most likely
> > > > go nowhere.
> > > >
> > > > For YouTube, you can also check out the YouTube Data Tools that I have
> > > > been maintaining here: https://tools.digitalmethods.net/netvizz/youtube/
> > > >
> > > > All the best,
> > > > Bernhard
> > > >
> > > >
> > > > > On 10 Nov 2020, at 05:22, Brooke Criswell via Air-L <air-l at listserv.aoir.org> wrote:
> > > > >
> > > > > Facebook and Instagram are strict, and according to their terms and
> > > > > conditions they don't allow any data scraping.
> > > > >
> > > > > Your best bet is to propose your study to a researcher at Facebook.
> > > > >
> > > > > On Mon, Nov 9, 2020, 2:21 AM Alexandre Leroux <alleroux at ulb.ac.be> wrote:
> > > > >
> > > > >> Facepager for FB and YT: it has a user interface and decent
> > > > >> documentation.
> > > > >>
> > > > >> There are scrapers for Instagram, but those don't comply with the
> > > > >> platform's terms of use and, afaik, are terminal-only.
> > > > >>
> > > > >>
> > > > >> On 6/11/20 14:59, Cristina Migliaccio wrote:
> > > > >>> Dear Colleagues,
> > > > >>>
> > > > >>> Advance apologies if this question has been addressed (as I am certain it
> > > > >>> has been) in some previous forum/email: does an easy-to-use text/data
> > > > >>> mining software/platform exist that works across these 3 social media
> > > > >>> platforms: YouTube, Facebook & Instagram?
> > > > >>>
> > > > >>> I would like to collect data on alphabetic features but also paralinguistic
> > > > >>> features such as likes, shares, etc.
> > > > >>>
> > > > >>> Any suggestions whatsoever for a text/data mining beginner would be greatly
> > > > >>> appreciated (videos, lectures to this end also appreciated!)
> > > > >>>
> > > > >>> Warm thanks-
> > > > >>> Cristina Migliaccio
> > > > >>> _______________________________________________
> > > > >>> The Air-L at listserv.aoir.org mailing list
> > > > >>> is provided by the Association of Internet Researchers http://aoir.org
> > > > >>> Subscribe, change options or unsubscribe at:
> > > > >>> http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
> > > > >>>
> > > > >>> Join the Association of Internet Researchers:
> > > > >>> http://www.aoir.org/
> > > > >>>
> > > > >>
> > > > >> --
> > > > >> Alexandre Leroux
> > > > >> Ph.D candidate
> > > > >> Group for research on Ethnic Relations, Migrations and Equality (GERME)
> > > > >> Université Libre de Bruxelles (ULB)
> > > > >> alleroux at ulb.ac.be
> > > >
> > >
> >
> >
> >
> > ------------------------------
> >
> > Message: 2
> > Date: Tue, 10 Nov 2020 11:35:37 -0600
> > From: Brooke Criswell <bcriswell at email.fielding.edu>
> > To: Bernhard Rieder <berno.rieder at gmail.com>
> > Cc: Air-L at listserv.aoir.org
> > Subject: Re: [Air-L] Text/Data Mining Software Suggestions: for
> >         YouTube, Facebook & Instagram?
> > Message-ID:
> >         <CABWUjBUG78CBvGc6QD5zGCJ1qHUzCjcTOPMuuCDM8VX-L9TXFQ at mail.gmail.com>
> > Content-Type: text/plain; charset="UTF-8"
> >
> > That's wild to me.
> > Thanks for the discussion, information, and links! Appreciate it all.
> >
> >
> >
> > ------------------------------
> >
> > Message: 3
> > Date: Tue, 10 Nov 2020 18:06:37 +0000
> > From: "Schaefer, M.T. (Mirko)" <m.t.schaefer at uu.nl>
> > To: Bernhard Rieder <berno.rieder at gmail.com>, Brooke Criswell
> >         <bcriswell at email.fielding.edu>
> > Cc: "air-l at listserv.aoir.org" <air-l at listserv.aoir.org>
> > Subject: Re: [Air-L] Text/Data Mining Software Suggestions: for
> >         YouTube, Facebook & Instagram?
> > Message-ID:
> >         <AM0PR05MB664151322D735D933B36F805C2E90 at AM0PR05MB6641.eurprd05.prod.outlook.com>
> >
> > Content-Type: text/plain; charset="iso-8859-1"
> >
> > Hi all,
> >
> > there is already some documentation about platforms trying to stifle
> > research: the cases Bernhard mentioned, and there were also very informative
> > links in the replies to my request about "Legal challenges for researchers"
> > on this list (28 October). Also note the case of Spotify trying to prevent
> > the publication of the Spotify Teardown book by our colleagues in Sweden (I
> > had no idea that Rolling Stone covered this:
> > https://www.rollingstone.com/pro/features/spotify-teardown-book-streaming-music-790174/
> > )
> >
> > I am still looking for examples of GDPR challenges for the kind of
> > research that is represented on this list, and I am still very much
> > interested in the extent to which universities are supporting researchers
> > in dealing with these issues.
> >
> > Cheers,
> > mirko
> >
> > ________________________________
> > From: Air-L <air-l-bounces at listserv.aoir.org> on behalf of Brooke
> > Criswell via Air-L <air-l at listserv.aoir.org>
> > Sent: 10 November 2020 18:35
> > To: Bernhard Rieder <berno.rieder at gmail.com>
> > Cc: Air-L at listserv.aoir.org <Air-L at listserv.aoir.org>
> > Subject: Re: [Air-L] Text/Data Mining Software Suggestions: for YouTube,
> > Facebook & Instagram?
> >
> > That's wild to me.
> > Thanks for the discussion, information, and links! Appreciate it all.
> >
> > On Tue, Nov 10, 2020, 11:32 AM Bernhard Rieder <berno.rieder at gmail.com>
> > wrote:
> >
> > > Brooke,
> > >
> > > I am no legal expert myself in any form or function, but here is a case
> > in
> > > the US that made the rounds some time ago:
> > >
> >
> https://arstechnica.com/tech-policy/2019/09/web-scraping-doesnt-violate-anti-hacking-law-appeals-court-rules/
> > >
> > > This may also get interesting:
> > >
> >
> https://www.wsj.com/articles/facebook-seeks-shutdown-of-nyu-research-project-into-political-ad-targeting-11603488533?mod=hp_lista_pos1
> > >
> > > The problem is that the legal situation is simply not 100% clear,
> neither
> > > in the US, nor in Europe.
> > >
> > > Best,
> > > Bernhard
> > >
> > > > On 10 Nov 2020, at 17:19, Brooke Criswell <
> > bcriswell at email.fielding.edu>
> > > wrote:
> > > >
> > > > Bernhard,
> > > >
> > > > Do you have any of those court cases you could send links to me? I
> > would
> > > really love to learn more about this subject, especially as an early
> > career
> > > researcher. And are the laws very different in Europe compared to the
> US?
> > > (I am in the US).
> > > >
> > > > This is all so interesting to me!
> > > >
> > > >
> > > >
> > > > On Tue, Nov 10, 2020, 11:11 AM Bernhard Rieder <
> berno.rieder at gmail.com
> > >
> > > wrote:
> > > > Hi again,
> > > >
> > > > No need to apologize, Brooke, we are all in a situation that is
> marred
> > > by insecurity, opacity, and conflicting information. My apologies if my
> > > comments came off too strong, also to Stuart.
> > > >
> > > > With regards to Facepager, the tool made it through Facebook's app
> > > review (Jakob, I'll have to ping you sometime soon to ask how you did
> > it),
> > > which means that its functionalities were audited by the company,
> giving
> > > some legal security. This does of course not eliminate ethical
> questions.
> > > >
> > > > With regards to Instagram, what baffles me is that scraping via
> > > instaloader actually works better than data retrieval via the API ever
> > did,
> > > which means that there is some level of acquiescence. One can easily
> get
> > up
> > > to 100s of 1000s of posts for a given hashtag.
> > > >
> > > > What Mirko is saying about university support is super important,
> but I
> > > also want to highlight the great work by AlgorithmWatch and colleagues
> > Jef
> > > Ausloos, Paddy Leerssen and Pim ten Thije on legal frameworks for more
> > > robust data access, for example here:
> > >
> >
> https://www.ivir.nl/publicaties/download/GoverningPlatforms_IViR_study_June2020-AlgorithmWatch-2020-06-24.pdf
> > > >
> > > > This may be naive, but I have the hope that the upcoming EU Digital
> > > Services Act will have some provisions for academic research, or at
> least
> > > some clarifications. The current situation is creating serious chilling
> > > effects for research, without protecting data subjects from the most
> > > predatory practices, since scraping works so well (in technical terms)
> in
> > > many cases - or not at all in others. Commissioner Vestager has sent
> some
> > > positive signals in that direction.
> > > >
> > > > The reason why I am very hesitant about taking ToS as legal gospel is
> > a)
> > > that courts have ruled otherwise when it comes to scraping and b)
> > because I
> > > find the idea that platform companies can dictate what we are able to
> > know
> > > about platforms, how they operate and what happens on them highly
> > > problematic and worth fighting against. Jeanette Hofmann and I have a
> > paper
> > > on that front coming forth very soon ,-)
> > > >
> > > > All the best,
> > > > Bernhard
> > > >
> > > >
> > > > > On 10 Nov 2020, at 15:53, Brooke Criswell <
> > > bcriswell at email.fielding.edu> wrote:
> > > > >
> > > > > My apologies. I was just passing along what I have been told
> because
> > > of privacy settings within Facebook and Instagram. I have been told
> > > specifically by Facebook there is no "legal" way to scrape comments or
> > > different things like that. Now likes and shares etc, I have no idea.
> So
> > I
> > > was just passing that along. I am by no means an expert in all of the
> > ways
> > > and was not aware of other ways like Facepager. I just know Facebook is
> > > very strict with their data especially because of the privacy policy
> and
> > > settings people can individually make. I have been told Facebook closed
> > off
> > > their API except for when working in collaborations or specifically
> > > accepted to get data from their research team.
> > > > >
> > > > > Very sorry if I gave wrong information. This is just what I have
> > > learned and been told and would never want anyone to get into trouble
> or
> > > collect items they weren't technically supposed to.
> > > > >
> > > > > Best of luck and if you do find anything please share!
> > > > >
> > > > > Take care all.
> > > > >
> > > > > On Tue, Nov 10, 2020, 5:35 AM Bernhard Rieder <
> > berno.rieder at gmail.com>
> > > wrote:
> > > > > Dear colleagues,
> > > > >
> > > > > I would like to disagree with Brooke here. Facebook data can still
> be
> > > accessed through non-scraping based API-access, most importantly the
> > > awesome Facepager.
> > > > >
> > > > > For Instagram, scraping is indeed the go-to technique (instaloader
> > > works very well) and I would like to defend the idea that ToS should
> not
> > > hinder researchers if the social relevance of the topic warrants it.
> > > Adhering to corporate policy is not the gold standard for what
> > independent
> > > research should strive for, in my view. Proposing topics to people at
> > > Facebook may be a strategy for certain topics, but for anything that
> does
> > > not fit within the narrow interests of the platform, this will most
> > likely
> > > go nowhere.
> > > > >
> > > > > For YouTube, you can also check out the YouTube Data Tools that I
> > have
> > > been maintaining here:
> https://tools.digitalmethods.net/netvizz/youtube/
> > > > >
> > > > > All the best,
> > > > > Bernhard
> > > > >
> > > > >
> > > > > > On 10 Nov 2020, at 05:22, Brooke Criswell via Air-L <
> > > air-l at listserv.aoir.org> wrote:
> > > > > >
> > > > > > Facebook and Instagram are strict and according to terms and
> > > conditions
> > > > > > they don't allow any data scraping.
> > > > > >
> > > > > > Best try is to propose your study to a researcher at Facebook
> > > > > >
> > > > > > On Mon, Nov 9, 2020, 2:21 AM Alexandre Leroux <
> alleroux at ulb.ac.be>
> > > wrote:
> > > > > >
> > > > > >> Facepager for FB and YT it has a user interface and a decent
> > > documentation.
> > > > > >>
> > > > > >> There are scrappers for instagram but those don't comply with
> the
> > > > > >> platform terms of use and afaik are terminal only.
> > > > > >>
> > > > > >>
> > > > > >> On 6/11/20 14:59, Cristina Migliaccio wrote:
> > > > > >>> Dear Colleagues,
> > > > > >>>
> > > > > >>> Advance apologies if this question has been addressed (as I am
> > > certain it
> > > > > >>> has been) in some previous forum/email---does an easy to use
> > > text/data
> > > > > >>> mining software/platform exist that works across these 3 social
> > > media
> > > > > >>> platforms: YouTube, Facebook & Instagram?
> > > > > >>>
> > > > > >>> I would like to collect data on alphabetic features but also
> > > > > >> paralinguistic
> > > > > >>> features such as likes, shares, etc.
> > > > > >>>
> > > > > >>> Any suggestions whatsoever for a text/data mining beginner
> would
> > be
> > > > > >> greatly
> > > > > >>> appreciated (videos, lectures to this end also appreciated!)
> > > > > >>>
> > > > > >>> Warm thanks-
> > > > > >>> Cristina Migliaccio
> > > > > >>> _______________________________________________
> > > > > >>> The Air-L at listserv.aoir.org mailing list
> > > > > >>> is provided by the Association of Internet Researchers
> > > http://aoir.org
> > > > > >>> Subscribe, change options or unsubscribe at:
> > > > > >> http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
> > > > > >>>
> > > > > >>> Join the Association of Internet Researchers:
> > > > > >>> http://www.aoir.org/
> > > > > >>>
> > > > > >>
> > > > > >> --
> > > > > >> Alexandre Leroux
> > > > > >> Ph.D candidate
> > > > > >> Group for research on Ethnic Relations, Migrations and Equality
> > > (GERME)
> > > > > >> Universit? Libre de Bruxelles (ULB)
> > > > > >> alleroux at ulb.ac.be
> > > > >
> > > >
> > >
> > >
> >
> >
> > ------------------------------
> >
> > Message: 4
> > Date: Tue, 10 Nov 2020 21:18:09 +0000
> > From: Ganiat.Kazeem <ganiat.kazeem at open.ac.uk>
> > To: Bernhard Rieder <berno.rieder at gmail.com>, Brooke Criswell
> >         <bcriswell at email.fielding.edu>
> > Cc: "Air-L at listserv.aoir.org" <Air-L at listserv.aoir.org>
> > Subject: Re: [Air-L] Text/Data Mining Software Suggestions: for
> >         YouTube, Facebook & Instagram?
> > Message-ID:
> >         <LNXP265MB1611F66B1DFCF4EFB00128F9D5E90 at LNXP265MB1611.GBRP265.PROD.OUTLOOK.COM>
> >
> > Content-Type: text/plain; charset="utf-8"
> >
> > Actually, mining data may present research with ethical issues such as
> > right of use of information, privacy, right of access, etc.
> >
> > So it is about much more than skipping past a corporate policy.
> >
> > Although it is not always the case, some policies are firmly rooted in
> > adhering to service level agreements between users and service providers.
> >
> > Facebook being a free platform does not mean that information about
> > users' activities, especially things like posts, comments, likes, etc.
> > (things people do because they are addressing their comment to a
> > specific person, persons, or organisation), should become public.
> >
> > In principle, I assume it would be far easier to first recruit Facebook
> > users who are willing to share their user data in this way.
> >
> > -----Original Message-----
> > From: Air-L <air-l-bounces at listserv.aoir.org> On Behalf Of Bernhard Rieder
> > Sent: 10 November 2020 11:35
> > To: Brooke Criswell <bcriswell at email.fielding.edu>
> > Cc: Air-L at listserv.aoir.org
> > Subject: Re: [Air-L] Text/Data Mining Software Suggestions: for YouTube,
> > Facebook & Instagram?
> >
> > Dear colleagues,
> >
> > I would like to disagree with Brooke here. Facebook data can still be
> > accessed via the API rather than by scraping, most notably with the
> > awesome Facepager.
> >
> > For Instagram, scraping is indeed the go-to technique (instaloader works
> > very well) and I would like to defend the idea that ToS should not hinder
> > researchers if the social relevance of the topic warrants it. Adhering to
> > corporate policy is not the gold standard for what independent research
> > should strive for, in my view. Proposing topics to people at Facebook may
> > be a strategy for certain topics, but for anything that does not fit within
> > the narrow interests of the platform, this will most likely go nowhere.
> >
> > For YouTube, you can also check out the YouTube Data Tools that I have
> > been maintaining here: https://tools.digitalmethods.net/netvizz/youtube/
> >
> > All the best,
> > Bernhard
> >
> >
> > > On 10 Nov 2020, at 05:22, Brooke Criswell via Air-L <air-l at listserv.aoir.org> wrote:
> > >
> > > Facebook and Instagram are strict: according to their terms and
> > > conditions, they don't allow any data scraping.
> > >
> > > Your best bet is to propose your study to a researcher at Facebook.
> > >
> > > On Mon, Nov 9, 2020, 2:21 AM Alexandre Leroux <alleroux at ulb.ac.be> wrote:
> > >
> > >> Facepager for FB and YT: it has a user interface and decent documentation.
> > >>
> > >> There are scrapers for Instagram, but those don't comply with the
> > >> platform terms of use and, afaik, are terminal only.
> > >>
> > >>
> > >> On 6/11/20 14:59, Cristina Migliaccio wrote:
> > >>> Dear Colleagues,
> > >>>
> > >>> Advance apologies if this question has been addressed (as I am
> > >>> certain it has been) in some previous forum/email---does an easy to
> > >>> use text/data mining software/platform exist that works across these
> > >>> 3 social media
> > >>> platforms: YouTube, Facebook & Instagram?
> > >>>
> > >>> I would like to collect data on alphabetic features but also
> > >>> paralinguistic features such as likes, shares, etc.
> > >>>
> > >>> Any suggestions whatsoever for a text/data mining beginner would be
> > >>> greatly appreciated (videos, lectures to this end also appreciated!)
> > >>>
> > >>> Warm thanks-
> > >>> Cristina Migliaccio
> > >>>
> > >>
> > >> --
> > >> Alexandre Leroux
> > >> Ph.D candidate
> > >> Group for research on Ethnic Relations, Migrations and Equality (GERME)
> > >> Université Libre de Bruxelles (ULB)
> > >> alleroux at ulb.ac.be
> >
> > ------------------------------
> >
> > End of Air-L Digest, Vol 196, Issue 12
> > **************************************
> >
>
>
> --
> Nicole "Nikki" Lemire Garlic
> PhD, Media & Communication '21
>


