On Wed, Oct 23, 2013 at 1:26 PM, Aaron Hopkins <lists@die.net> wrote:
On Wed, 23 Oct 2013, Karsten Loesing wrote:
>> Basically the experiment does traceroutes to three groups: all "routable
>> IP prefixes", all Tor relays, and then all /24 subnets.
>
> Based on my read of your input data, these will run traceroutes to 491762,
> 9058, and 13597431 IPs respectively, sending at least 100 million packets?
> This is a bigger ask than I think you made clear.
>
> I question the utility of scanning all /24s. By definition, all /24s in a
> BGP prefix take the same path to its origin AS; the only variation will be
> within that. If you are looking for chokepoints, you've already found it
> with the origin AS.
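For a rough sense of scale (the per-target probe numbers below are my guesses based on common traceroute defaults, not values from the actual scan script):

    # Back-of-envelope probe count for the three target sets quoted above.
    # probes_per_hop and avg_hops are assumed traceroute-ish defaults, not
    # numbers taken from the experiment's script.
    targets = 491762 + 9058 + 13597431        # prefixes + relays + /24s
    probes_per_hop = 3
    avg_hops = 15
    print(f"{targets:,} targets, ~{targets * probes_per_hop * avg_hops:,} packets")
    # ~14.1M targets, ~634M packets; even 7-8 packets per target is
    # already past 100 million.

The /24 set clearly dominates the cost.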
Initially I didn't see the sense in this either; perhaps the rationale is in the referenced docs. If the internal paths of a large aggregated AS are of interest, the /24 scan could be used for that. You could also use the BGP table to reduce the /24 query set to just the /24s covered by the AS you reside in (sketch below). Then again, there can be higher-precedence side peerings that such filtering would not find.
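A minimal sketch of that filtering, assuming a routeviews/CAIDA-style prefix-to-AS dump; the file names and AS number are placeholders, and a real run over 13.6M /24s would want a radix tree (e.g. py-radix) rather than this linear longest-prefix match:

    # Keep only the /24s whose covering BGP prefix is originated by "my" AS.
    # Input format assumed: one "prefix length origin-asn" triple per line
    # (pfx2as style); paths and the AS number are placeholders.
    import ipaddress

    def load_prefixes(path):
        table = []
        with open(path) as f:
            for line in f:
                prefix, length, asn = line.split()[:3]
                table.append((ipaddress.ip_network(f"{prefix}/{length}"), asn))
        return table

    def covering(addr, table):
        # Longest-prefix match; fine for a sketch, too slow at this scale.
        best = None
        for net, asn in table:
            if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
                best = (net, asn)
        return best  # (network, origin asn) or None

    table = load_prefixes("routeviews-pfx2as.txt")    # placeholder path
    my_asn = "64496"                                  # placeholder AS number
    with open("slash24-targets.txt") as f:            # one address per /24
        kept = [ip for ip in (line.strip() for line in f)
                if (m := covering(ipaddress.ip_address(ip), table)) and m[1] == my_asn]
    print(f"{len(kept)} of the /24 targets sit inside AS{my_asn}")

The side peerings mentioned above still won't show up in a public BGP view, so this is only an approximation.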
Also, I'd merge in current BGP AS and GeoIP data for each hop of the trace. That could probably be done upon receipt of a submission if submissions happen daily. Doing it on the client may be better, though, since clients will need to update their BGP tables anyway in order to test new routes over time.
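Either way, the annotation itself is small. A sketch, reusing covering() and the prefix table from above; the geoip2 reader and the GeoLite2 database path are my assumptions, not something the project has settled on:

    # Annotate one hop of a submitted traceroute with origin AS/prefix and
    # country. covering() and table come from the earlier sketch.
    import ipaddress
    import geoip2.database
    import geoip2.errors

    geo = geoip2.database.Reader("GeoLite2-Country.mmdb")    # placeholder path

    def annotate_hop(hop_ip, table):
        m = covering(ipaddress.ip_address(hop_ip), table)
        try:
            country = geo.country(hop_ip).country.iso_code
        except geoip2.errors.AddressNotFoundError:
            country = None
        return {
            "ip": hop_ip,
            "asn": m[1] if m else None,
            "prefix": str(m[0]) if m else None,
            "country": country,
        }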
And for the Tor node... capture the IP, IP whois, DNS PTR, DNS PTR whois, BGP AS and prefix, and GeoIP.
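For instance (field names are made up, the whois lookups just shell out to the system whois client, and covering()/geo are reused from the sketches above):

    # One record per Tor relay being traced: IP, whois, PTR, PTR whois,
    # BGP origin AS and prefix, and GeoIP country.
    import ipaddress
    import socket
    import subprocess

    def whois(query):
        return subprocess.run(["whois", query], capture_output=True,
                              text=True).stdout

    def relay_record(ip, table):
        try:
            ptr = socket.gethostbyaddr(ip)[0]
        except socket.herror:
            ptr = None
        m = covering(ipaddress.ip_address(ip), table)
        return {
            "ip": ip,
            "ip_whois": whois(ip),
            "dns_ptr": ptr,
            "ptr_whois": whois(ptr) if ptr else None,
            "bgp_asn": m[1] if m else None,
            "bgp_prefix": str(m[0]) if m else None,
            "geoip": geo.country(ip).country.iso_code,
        }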
I can see this being useful for siting nodes in the future.
Traceroutes are uncommon coming from a Tor exit node, which already receives enough attention as it is.
It's opt-in so I see no issue here.
The script by default will try to keep 128 traceroute processes running all the time, which is potentially problematic for relays with limited RAM or CPU available. I'd recommend making this more clear
...and more tuneable via config file.
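E.g. something like this instead of a hard-coded 128; the config file name and the [scan] max_traceroutes option are invented for illustration:

    # Cap concurrent traceroutes, with the limit read from a config file.
    import configparser
    import subprocess
    from concurrent.futures import ThreadPoolExecutor

    cfg = configparser.ConfigParser()
    cfg.read("scan.conf")
    max_procs = cfg.getint("scan", "max_traceroutes", fallback=16)

    def run_traceroute(target):
        return subprocess.run(["traceroute", "-n", target],
                              capture_output=True, text=True).stdout

    with open("targets.txt") as f:
        targets = [line.strip() for line in f if line.strip()]

    with ThreadPoolExecutor(max_workers=max_procs) as pool, \
         open("results.txt", "w") as out:
        for result in pool.map(run_traceroute, targets):
            out.write(result)

A low default keeps small relays from being starved of RAM/CPU; operators with headroom can raise it.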
If you're looking for more network input and similar prior work, post to NANOG etc.; there have been lots of traceroute projects.