[Air-l] Favorite Link Analysis Software

elw at stderr.org elw at stderr.org
Mon Jul 2 17:55:23 PDT 2007

> Do list members have a preferred program for analyzing links between 
> webpages? In particular, I'm looking for something that can take a blog 
> (or really any website), spider through the site, and record where the 
> links go to.

We usually end up re-writing/tweaking code to do link extraction (e.g. 
weblogs, FOAF, vlogs, whatever) on a project-by-project basis.  The data 
being collected usually differs enough that one would want to do 
significant tweaking for each new project.
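
To give a sense of how small the per-project piece can be, here is a minimal sketch in Python using only the standard library. It is an illustration, not the code referred to in this message: the names LinkCollector and extract_links, the sample HTML, and the blog.example / other.example URLs are all made up, and the tab-separated "source, target" output is just one assumed layout. Fetching the pages and following links through a site are left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Make relative links absolute and drop any #fragment.
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                self.links.append(absolute)


def extract_links(html, base_url):
    """Return the absolute link targets found in one page of HTML."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical page content standing in for a fetched blog page.
sample = (
    '<p><a href="/about">About</a> '
    '<a href="http://other.example/post#c1">Post</a></p>'
)

# One "source<TAB>target" line per link, as plain text.
for target in extract_links(sample, "http://blog.example/"):
    print("http://blog.example/\t" + target)
```

The tweaking for each project would mostly go into handle_starttag: which tags and attributes count as a link, and which targets are worth keeping.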

Once you have the links out, of course, you can do whatever sort of 
analysis your heart desires.
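
As one small example of such analysis, the sketch below counts how often each outside host is linked to. It assumes the links are already sitting in plain text as one "source<TAB>target" pair per line; that layout, the function name count_target_hosts, and the sample URLs are assumptions for illustration only.

```python
from collections import Counter
from urllib.parse import urlsplit


def count_target_hosts(edge_lines):
    """Count links per target host, ignoring links that stay on the same host."""
    counts = Counter()
    for line in edge_lines:
        line = line.strip()
        if "\t" not in line:
            continue  # skip blank or malformed lines
        source, target = line.split("\t", 1)
        source_host = urlsplit(source).hostname
        target_host = urlsplit(target).hostname
        if target_host and target_host != source_host:
            counts[target_host] += 1
    return counts


# Hypothetical edge list; in practice these lines would be read from a file.
edges = [
    "http://blog.example/\thttp://blog.example/about",
    "http://blog.example/\thttp://other.example/post",
    "http://blog.example/2007/07\thttp://other.example/feed",
    "http://blog.example/2007/07\thttp://third.example/",
]

for host, n in count_target_hosts(edges).most_common():
    print(host, n)
```

The same edge list could just as easily be loaded into a graph package or a spreadsheet; the point is only that the analysis step is separate from the extraction step.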

Most of the link-extraction code that I've written is in shell scripts and 
outputs plain text files full of data.  It is pretty low-level stuff, and 
mostly free of portability problems.

I find that this is a not uncommon approach to the problem.  Once you have 
a pretty well-refined idea of what data you are looking for and what 
patterns in the data are exploitable for your data collection, getting a 
piece of code pinned down to grab the data is usually the least of your 


> While there is no lack of computer science papers on such software,
> those are usually published when the programs are in pre-alpha, so I
> was hoping that someone knows of a close-to-stable program. While
> working on a mac would be preferable, I can learn to live with a
> Windows or Linux one.
> Any advice for the list for link analysis software that works and
> isn't a huge pain to set up and use?
> Thanks in advance
> Ben Spigel
> Graduate Student
> Department of Geography
> The Ohio State University
