[Air-L] Software to extract content of Facebook & Twitter

Bernhard Rieder berno.rieder at gmail.com
Thu Aug 28 01:49:08 PDT 2014


Hi,

Just to add to the list, I develop Netvizz (https://apps.facebook.com/netvizz/), which extracts data from personal networks, groups and pages on Facebook.

I also think that Facepager (https://github.com/strohne/Facepager) is an awesome tool.

best,
Bernhard

--
Bernhard Rieder | Associate Professor | New Media and Digital Culture
University of Amsterdam | Turfdraagsterpad 9 | 1012 XT  Amsterdam | The Netherlands
http://thepoliticsofsystems.net | http://rieder.polsys.net | https://www.digitalmethods.net | @RiederB

On 28 Aug 2014, at 9:14, Harju Anu <anu.harju at aalto.fi> wrote:

> Hi everyone,
> 
> Thank you all so much for all the suggestions! I will try them out as soon as I have the time. I suppose they work on a Mac, too.
> 
> Noha, thanks for the offer of help, I might take you up on that if I run into any problems. I'm flying out to a conference today so won't be able to do anything in this regard for a week, but thanks again, much appreciated  :)
> 
> Best,
> Anu
> 
> Sent from my iPhone
> 
> On 28.8.2014, at 8.57, "Noha Nagi" <noha.a.nagi at gmail.com> wrote:
> 
> Hi Anu,
> 
> I suggest you try NodeXL <http://nodexl.codeplex.com/>. It's simple and free. You will first need to install the social network importer <http://socialnetimporter.codeplex.com/> for NodeXL to grab Facebook, Twitter, Flickr and YouTube data.
> 
> Good luck!
> 
> 
> On Wed, Aug 27, 2014 at 9:26 PM, Harju Anu <anu.harju at aalto.fi> wrote:
> Hi everyone,
> 
> and I'm also grateful for all these suggestions for various tools. For a paper for my PhD I'm looking at YouTube comment threads, and I was wondering if any one of you might know of a tool that can extract those. It's a very laborious process to do manually and it drives me insane. I once asked a coder friend of mine, but he said it was more complicated than he initially thought, and we left it at that.
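> 
> (One possible route, sketched here as a pointer rather than a tested recipe: the YouTube Data API exposes comment threads directly, so a short script can page through a video's top-level comments. The API key and video ID below are placeholders, and the snippet assumes Python with the requests package.)
> 
>     import requests
> 
>     API_KEY = "YOUR_API_KEY"    # placeholder: a Google API key with YouTube Data API access
>     VIDEO_ID = "VIDEO_ID_HERE"  # placeholder: the ID from the video's URL
>     URL = "https://www.googleapis.com/youtube/v3/commentThreads"
> 
>     comments, page_token = [], None
>     while True:
>         params = {"part": "snippet", "videoId": VIDEO_ID, "key": API_KEY,
>                   "maxResults": 100, "textFormat": "plainText"}
>         if page_token:
>             params["pageToken"] = page_token
>         data = requests.get(URL, params=params).json()
>         for item in data.get("items", []):
>             comments.append(item["snippet"]["topLevelComment"]["snippet"]["textDisplay"])
>         page_token = data.get("nextPageToken")
>         if not page_token:  # no more pages of comments
>             break
>     print(len(comments), "top-level comments retrieved")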
> 
> Thank you in advance, and thanks for a great list! I've been a lurker for quite some time now and find it very useful.
> 
> Best,
> Anu
> 
> 
> Anu Harju
> Doctoral Candidate
> Aalto University
> Helsinki
> Finland
> 
> Sent from my iPhone
> 
> On 27.8.2014, at 18.06, "Tim Libert" <tlibert at asc.upenn.edu> wrote:
> 
>> I’d quickly point out two additional considerations when ingesting Facebook/Twitter data:
>> 
>> 1) APIs generally exclude ads (which are ‘targeted’), so depending on what you want to study and/or model, an API will never give you an accurate view of what users really see. APIs are easy, but incomplete.
>> 
>> 2) The trick with scraping content directly from the web is accounting for processing/executing JavaScript, since that is how many pages pull in content dynamically (there may also be other factors: redirects, iframes, canvas, etc.). If your tool (e.g. Python's urllib) can only access static HTML, you will not be able to pull the content you want: you will be retrieving the instructions for rendering the content dynamically rather than the content itself. I am not sure how your tool in R works, but I imagine this is a likely issue you may be facing. I have developed some software that solves problem #2 by leveraging http://phantomjs.org/, but it’s not ready for public release quite yet; in the meantime, you may want to consider an automation framework like Selenium (http://www.seleniumhq.org/).
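>> 
>> To make the Selenium route concrete, here is a minimal sketch (Python, assuming the selenium package and a local Firefox install; the URL is a placeholder) that renders a page before reading its HTML:
>> 
>>     import time
>>     from selenium import webdriver
>> 
>>     driver = webdriver.Firefox()                         # real browser, so JavaScript actually runs
>>     driver.get("https://example.com/some-dynamic-page")  # placeholder URL
>>     time.sleep(5)              # crude wait; a WebDriverWait on a known element is more robust
>>     html = driver.page_source  # the rendered DOM, not just the static HTML
>>     driver.quit()
>> 
>> The rendered html string can then be parsed with whatever you already use for static pages (e.g. an HTML parser in R or Python).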
>> 
>> - tim, phd student, upenn
> 
> 
> 
> --
> Noha A.Nagi
> _______________________________________________
> The Air-L at listserv.aoir.org mailing list
> is provided by the Association of Internet Researchers http://aoir.org
> Subscribe, change options or unsubscribe at: http://listserv.aoir.org/listinfo.cgi/air-l-aoir.org
> 
> Join the Association of Internet Researchers:
> http://www.aoir.org/



