[Air-l] Mapping the net with crawlers/robots

rafel Lucea rafel at MIT.EDU
Wed Oct 20 07:50:27 PDT 2004


Hi all,
 
I am trying to analyze the relationships between organizations on the
web. In particular, I want to map the linking behavior of a
subjectively defined set of organizations.
 
I have explored a number of software packages (Website Watcher, Sphinx,
MnoGoSearch, Issuecrawler...) but they are either not designed for this
specific purpose (WW, Issuecrawler) or require coding skills that are
beyond my knowledge (Sphinx -OS).
 
I would be most grateful if someone could tell me whether there exists
a web crawler that lets one define
   - a set of URLs from which to start the crawl
   - the depth: how many levels to explore within a given target
domain
   - the number of iterations: how far from the original URL's domain
one wants to go
   - and a few filters: to exclude specific types of pages (PDF, for
example)
 
and that returns a map, a table of relationships (some sort of
adjacency matrix), or both.
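 
To make this more concrete, below is a rough Python sketch of the
behaviour I am after (standard library only). The seed URLs, depth
limit, and excluded file types are just placeholders, and for
simplicity it collapses "depth" and "iterations" into a single limit.
 
import collections
import urllib.parse
import urllib.request
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects the href targets of all <a> tags on a page."""
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

SEEDS = ["http://www.example.org", "http://www.example.net"]  # placeholder start URLs
MAX_DEPTH = 2                                   # how many levels of links to follow
EXCLUDED_EXTENSIONS = (".pdf", ".jpg", ".zip")  # simple file-type filter

def domain(url):
    return urllib.parse.urlparse(url).netloc

def crawl(seeds, max_depth):
    """Breadth-first crawl; returns a dict mapping each source domain to
    the set of domains it links to (an adjacency structure)."""
    adjacency = collections.defaultdict(set)
    queue = collections.deque((url, 0) for url in seeds)
    seen = set(seeds)
    while queue:
        url, depth = queue.popleft()
        if depth > max_depth or url.lower().endswith(EXCLUDED_EXTENSIONS):
            continue
        try:
            with urllib.request.urlopen(url, timeout=10) as response:
                html = response.read().decode("utf-8", errors="replace")
        except Exception:
            continue
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            target = urllib.parse.urljoin(url, href)
            adjacency[domain(url)].add(domain(target))
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return adjacency

if __name__ == "__main__":
    # Print the relationships as edge pairs (source domain -> target domain),
    # which can then be assembled into an adjacency matrix.
    for source, targets in crawl(SEEDS, MAX_DEPTH).items():
        for target in targets:
            print(source, "->", target)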
 
Thanks in advance,
 
Rafel Lucea
MIT - Sloan School of Management
 
