[Air-l] Data

elijah wright elw at stderr.org
Thu May 20 08:29:28 PDT 2004


> I am interested in hearing any thoughts you have on a data problem that
> I have, that I am sure many of you have approached, and which is, of
> course, a result of the structure of the Internet itself.  In my ideal
> world, I would be able to build a relational database of data traffic
> between the largest cities worldwide.

Social problem the first - the information you'd most like to have is
closely guarded by the involved companies.  They keep it secret so that
other companies can't deduce all of their peering agreements and thereby
figure out how best to 'take advantage' of network position for profit.

This is a pretty common problem for a decentralized network, in my
experience.

> The data I have found shows gross data traffic between nodes, which
> includes traffic originated in third-party cities and destined for
> fourth-party cities, for example, and which does not provide an estimate
> of the traffic originated in 3 and destined for 4.  This means that the
> data doesn't relate every node in the city system to every other in
> terms of network traffic inbound and outbound.

right - the nodes which are most easily measured/evaluated (the network
hubs) don't actually act as termination points for a whole lot of traffic.
they're just points in the system as a whole, with peers that serve
endpoints but are not backbone nodes themselves.

> Have you approached this problem?  Do you have any thoughts on how
> currently available data can be patched for network analysis, or how
> such a relational database could be built in the future?

a graph-like structure is good for this, IMHO.  something like this:

sourcenode	destnode	measurement	eval.date
sourcenode	destnode	measurement	eval.date
sourcenode	destnode	measurement	eval.date

ad nauseam.  you may need some more values, depending on what it is that
you're wanting to do.  but that general form (spreadsheet-like) is one of
the simpler structures to store in a database, and reformatting those
tables into something that tools like UCINet or Pajek can display is not
such a terrible task.
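
to make that concrete, here is a rough sketch in Python, using the sqlite3
module from the standard library.  the table name, the column types, the
to_pajek helper, and the sample rows are all assumptions made up for
illustration (the cities and numbers are not real measurements).  it stores
the four columns above and writes one date's worth of rows out as a Pajek
.net file, with a *Vertices section and weighted *Arcs:

```python
import sqlite3

# Hypothetical schema and sample rows; the city names and numbers are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE traffic (
           sourcenode  TEXT NOT NULL,
           destnode    TEXT NOT NULL,
           measurement REAL NOT NULL,
           eval_date   TEXT NOT NULL
       )"""
)
conn.executemany(
    "INSERT INTO traffic VALUES (?, ?, ?, ?)",
    [
        ("London", "New York", 120.0, "2004-05-01"),
        ("New York", "London", 95.0, "2004-05-01"),
        ("London", "Tokyo", 40.0, "2004-05-01"),
    ],
)


def to_pajek(conn, eval_date):
    """Return a Pajek .net string with one weighted arc per row for eval_date."""
    rows = conn.execute(
        "SELECT sourcenode, destnode, measurement FROM traffic "
        "WHERE eval_date = ? ORDER BY rowid",
        (eval_date,),
    ).fetchall()
    # Pajek numbers vertices from 1; assign ids in order of first appearance.
    ids = {}
    for src, dst, _ in rows:
        for name in (src, dst):
            ids.setdefault(name, len(ids) + 1)
    lines = [f"*Vertices {len(ids)}"]
    lines += [f'{i} "{name}"' for name, i in ids.items()]
    lines.append("*Arcs")
    lines += [f"{ids[s]} {ids[d]} {w}" for s, d, w in rows]
    return "\n".join(lines) + "\n"


pajek_text = to_pajek(conn, "2004-05-01")
print(pajek_text)
```

keeping the rows directed (source to destination) means inbound and outbound
traffic stay separate, and the date column lets you pull out one snapshot at
a time.  whether a single "measurement" column is enough depends on the data
you actually manage to get.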

elijah



