[Air-L] scraping google discussion groups?

יוחנן ועקנין dataneto at dataneto.com
Wed Oct 5 11:40:51 PDT 2011


Hello Andrew.
I use Web Content Extractor from newprosoft.com in my research, and it works
quite well.
Regards,
Yohanan Ouaknine
Graduate student, Knowledge management, Bar Ilan University, Israel


On Wed, Oct 5, 2011 at 8:24 PM, Andrew Schrock <aschrock at usc.edu> wrote:

> Has anybody successfully scraped a Google discussion group? I found a
> script online, but it's thrown off by the fact that you now have to log in
> to view any groups.
>
> Google is getting squirrely about spammers scraping their data, so it may
> be a big roadblock. I'm looking at authorization with the Google PHP lib,
> but I'm not sure it will get me to groups; it all seems app-focused (for
> instance, if you want to add items to a Google calendar).
>
> I'd much appreciate any ideas that don't involve me adding 6000-some
> messages to my analysis software by hand :/
>
> best
> Andrew
>
>
>
> Andrew Schrock
> USC Annenberg Doctoral Student
> aschrock at usc.edu
> 714.330.6545
>
>
>



-- 
*יוחנן ועקנין*
Yohanan Ouaknine

http://www.ois.co.il/
050-6279777
yohanan.ouaknine at ois.co.il
http://il.linkedin.com/in/yohananouaknine




