[Air-L] Help with Facebook Research

elw at stderr.org elw at stderr.org
Wed Oct 3 07:20:08 PDT 2007


> I am planning a survey of Facebook members at NJIT, where I am a PhD 
> student. I would like to write a web crawl or similar program to 
> identify through Facebook who is part of the NJIT network. I have seen 
> other papers discuss this technique, but I need more specific details as 
> to how to accomplish it. Any ideas?


The basic sketch of the technique is this:

1) identify a starting point [initial URL]
2a) programmatically collect all linked pages (in effect, use a regex that
    matches "a href=")...
2b) ...that match criteria you specify
3) recurse
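
As a rough illustration of the steps above, here is a minimal Python sketch. It is
only one way to do it, and several things in it are my own assumptions: the names
crawl, fetch, keep and max_pages are made up for the example, the "recursion" is
done with a queue rather than actual recursive calls, and the pages live in an
in-memory dict (PAGES, with example.org URLs) so nothing is fetched from any real
site.

```python
import re
from collections import deque
from urllib.parse import urljoin

# Step 2a: a deliberately simple regex for href values inside <a> tags.
# A real HTML parser is more robust; this mirrors the 'a href=' idea only.
HREF_RE = re.compile(r'<a\s[^>]*?href\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)


def crawl(start_url, fetch, keep, max_pages=100):
    """Breadth-first crawl starting at start_url (step 1).

    fetch(url) -> HTML string, or None if the page is unavailable
                  (hypothetical callable supplied by the caller)
    keep(url)  -> True if the link matches your criteria (step 2b)
    """
    seen = {start_url}           # URLs already queued, to avoid loops
    queue = deque([start_url])   # URLs still to visit
    visited = []                 # URLs fetched, in visit order
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        for href in HREF_RE.findall(html):
            link = urljoin(url, href)        # resolve relative links
            if link not in seen and keep(link):
                seen.add(link)
                queue.append(link)           # step 3: repeat for new pages
    return visited


# Toy in-memory "site" (an assumption for the demo; no network access).
PAGES = {
    "http://example.org/": '<a href="/a">A</a> <a href="http://other.example/x">X</a>',
    "http://example.org/a": '<a href="/b">B</a> <a href="/">home</a>',
    "http://example.org/b": "no links here",
}


def demo():
    # Keep only links that stay on the toy site.
    return crawl(
        "http://example.org/",
        PAGES.get,
        lambda u: u.startswith("http://example.org/"),
    )
```

Calling demo() walks the three toy pages in breadth-first order and skips the
off-site link. To use it on real pages you would pass your own fetch function,
and only against a site whose terms permit automated access.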

However...

Web crawls of Facebook are against the Terms of Service of the site.  You 
might be able to work something out using the Facebook API, rather than by 
crawling.  It will take some work.

Your campus IRB will be highly unlikely to approve a project that 
explicitly violates the site's TOS; the TOS exists to give both Facebook 
and the other users of the site some notion of what sort of privacy 
exposure they are likely to be surrendering.

--elijah


