[tor-relays] Traceroute measurement from Tor relays

Das, Anupam das17 at illinois.edu
Fri Oct 25 06:31:01 UTC 2013


We have received some questions about running our traceroute measurements. Let me answer some of them:

------------------------------------------------------
From: Aaron Hopkins [lists at die.net]

>Are you trying to work backward to physical-layer chokepoints, like how the
>inter-continental network topology maps to the landing sites on
><http://www.submarinecablemap.com/>?

We are not looking for "chokepoints" exactly. We are interested in knowing as much as possible about where the traffic goes and who can see it. 

>I question the utility of scanning all /24s.  By definition, all /24s in a
>BGP prefix take the same path to its origin AS; the only variation will be
>within that.  If you are looking for chokepoints, you've already found it
>with the origin AS.

Our BGP prefixes are derived from RouteViews BGP tables. There is no guarantee that these are the prefixes used by all Internet routers, much less that routing is consistent over all IPs in the prefix.
In addition, these prefixes change over time. Tracing to every /24 would let us study finer-grained routing behavior.

>This also does all scans sequentially, which will have a couple of negative
>side-effects.  You are much more likely to trigger ICMP response
>rate-limiting on intermediate routers and more likely to trigger IDS alarms
>than if you'd randomized your target selection.  Running your target list
>through one of the following would mitigate this:

>  awk 'BEGIN{srand()}{print rand()"\t"$0}' | sort -k1 -n | cut -f2-
>  perl -MList::Util -e 'print List::Util::shuffle <>'
>  sort -R

We have randomized the prefix list and will soon randomize the /24 list too.
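
As a rough illustration of the idea (this is not the code in traceroutes.sh), expanding a prefix list into /24 targets and shuffling the result could look like the Python sketch below. The function name and the example prefixes are made up for the illustration, and it assumes IPv4 prefixes only.

```python
import ipaddress
import random


def shuffled_slash24s(prefixes, seed=None):
    """Expand IPv4 prefixes into their /24 subnets and return them shuffled.

    Hypothetical helper for illustration; not part of traceroutes.sh.
    """
    targets = set()
    for prefix in prefixes:
        net = ipaddress.ip_network(prefix, strict=False)
        if net.prefixlen > 24:
            # Longer than /24: use the single /24 that covers it.
            targets.add(net.supernet(new_prefix=24))
        else:
            # /24 or shorter: enumerate every /24 inside the prefix.
            targets.update(net.subnets(new_prefix=24))
    # Sort first so that a fixed seed gives a reproducible order.
    ordered = sorted(targets)
    random.Random(seed).shuffle(ordered)
    return [str(t) for t in ordered]


if __name__ == "__main__":
    # Documentation-range prefixes, used here only as sample input.
    for target in shuffled_slash24s(["198.51.100.0/22", "192.0.2.128/25"]):
        print(target)
```

Shuffling the fully expanded list, rather than only the prefixes, spreads consecutive probes across unrelated networks, which is what the suggestion above is after.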

>Which by default will try to keep 128 traceroute processes running all the
>time.  This is potentially problematic for relays with limited RAM or CPU
>available.  I'd recommend making this more clear.

We tested running 128 traceroutes in parallel and found the CPU (<5%) and RAM (<0.2% of 8GB) requirements to be small.
You can also choose how many traceroutes to run in parallel. For example, to run 64:

PARALLEL=64 ./traceroutes.sh &

If you are using scamper (which we encourage), you can tell the script how many packets per second to send with the following command:

PPS=800 ./traceroutes.sh &
where 1<=PPS<=1000


>I may run this from a machine on the same network as my Tor node, but
>definitely not on the Tor node itself.

Running from a machine on the same network is just as good, as long as its traffic goes through the same Internet gateways as the relay's traffic.


-------------------------------------------------------------------
From: Geoff Down geoffdown at fastmail.net 

>Your README should probably explicitly say that you need to run
>sudo chmod 04555 /path/to/scamper
>after installing scamper or it won't work.

Yes, we have now mentioned that in the README.

--------------------------------------------------------------------
From: Jesse Victors jvictors at jessevictors.com 

>How much bandwidth will this be taking up, and roughly how much will be
>uploaded/downloaded? I've cloned the repo, but I'm nervous about running
>this if it's going to be a significant bandwidth hog for a whole week.
>As Aaron said in Issue 39, it looks like it's going to be a lot of IPs
>and a large amount of packets. Also, ISPs may not take kindly to all
>these scans. What's the word on that? Has anyone run this tool, and
>what's their opinion? I'd be happy to help, but I'd like to know the
>full details of the various resources this tool will be consuming.

We have listed the resource requirements at the end of the README file. I'll summarize them here:

1. Upload: The script generates less than 500MB of data in total. By default the files are erased once they are uploaded, but you can choose not to erase any data.


2. Bandwidth: If scamper is used with all the default settings, it consumes at most 0.5 megabit per second of bandwidth. You can reduce this by changing the PPS (packets-per-second) parameter (1<=PPS<=1000) with the following command:

PPS=800 ./traceroutes.sh &

On a reasonably fast line (>100Mbps), traceroutes at this rate shouldn't be much of a load for any ISP.

3. RAM and CPU usage: We have tested the script on multiple machines and found the following resource usage: RAM <0.2% (on a machine with 8GB RAM, so roughly 16MB) and CPU <5%.
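
For a back-of-the-envelope check of the bandwidth figure in item 2, the Python sketch below converts a PPS setting into an outgoing probe rate. The 60-byte probe size is an assumption chosen for illustration, not a value taken from scamper or the README, and replies from routers are not counted.

```python
def probe_bandwidth_mbps(pps, bytes_per_packet=60):
    """Estimate the outgoing probe rate in megabits per second.

    bytes_per_packet is an assumed probe size, not a measured value.
    """
    if not 1 <= pps <= 1000:
        raise ValueError("PPS must satisfy 1 <= PPS <= 1000")
    # packets/s * bytes/packet * 8 bits/byte, scaled to megabits
    return pps * bytes_per_packet * 8 / 1_000_000


if __name__ == "__main__":
    for pps in (100, 500, 800, 1000):
        print(f"PPS={pps}: about {probe_bandwidth_mbps(pps):.2f} Mbit/s")
```

Under that assumption, even the maximum PPS=1000 stays just under the 0.5 megabit figure quoted above, and lowering PPS scales the rate down linearly.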

Hope this gives you some idea about the resource requirements.

Thanks

Anupam Das


