Re: [tor-relays] Traceroute measurement from Tor relays

25 Oct 2013

We have received some questions about running our traceroute measurements. Let me answer some of them:

------------------------------------------------------
From: Aaron Hopkins [lists@die.net]
...
Are you trying to work backward to physical-layer chokepoints, like how the
inter-continental network topology maps to the landing sites on
<http://www.submarinecablemap.com/>?
We are not looking for "chokepoints" exactly. We are interested in knowing as much as possible about where the traffic goes and who can see it.
...
I question the utility of scanning all /24s.  By definition, all /24s in a
BGP prefix take the same path to its origin AS; the only variation will be
within that.  If you are looking for chokepoints, you've already found it
with the origin AS.
Our BGP prefixes are derived from RouteViews BGP tables. There is no guarantee that these are the prefixes used by all Internet routers, much less that routing is consistent over all IPs in the prefix.
In addition, these prefixes change over time. Doing all /24 prefixes would enable us to study finer-grained routing behavior.
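
To illustrate what probing every /24 means, here is a minimal Python sketch that splits one announced prefix into its /24 blocks. The prefix 10.0.0.0/22 and the helper name slash24s are made up for illustration; this is not code from the measurement scripts.

```python
import ipaddress

def slash24s(prefix):
    """Return the /24 blocks covered by a prefix (hypothetical helper)."""
    net = ipaddress.ip_network(prefix)
    if net.prefixlen >= 24:
        # Already a /24 or more specific: nothing to split, keep as is.
        return [net]
    return list(net.subnets(new_prefix=24))

# Example prefix chosen for illustration only.
for block in slash24s("10.0.0.0/22"):
    print(block)
```

A /22 yields four /24 blocks, each of which would get its own traceroute target, so any routing differences inside the announced prefix become visible.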
...
This also does all scans sequentially, which will have a couple of negative
side-effects.  You are much more likely to trigger ICMP response
rate-limiting on intermediate routers and more likely to trigger IDS alarms
than if you'd randomized your target selection.  Running your target list
through one of the following would mitigate this:
...
awk 'BEGIN{srand()}{print rand()"\t"$0}' | sort -k1 -n | cut -f2-

...
perl -MList::Util -e 'print List::Util::shuffle <>'
...
sort -R
We have randomized the prefix list and will soon randomize the /24 list too.
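
For anyone who prefers to shuffle a target list outside the shell, the same idea as the one-liners quoted above can be sketched in Python. The helper name shuffled_targets and the sample targets are hypothetical; this is not how the measurement scripts do it.

```python
import random

def shuffled_targets(targets, seed=None):
    """Return a shuffled copy of the target list (hypothetical helper)."""
    rng = random.Random(seed)  # pass a seed only when a repeatable order is needed
    result = list(targets)     # copy, so the caller's list is left untouched
    rng.shuffle(result)
    return result

# Sample targets for illustration only.
targets = ["10.0.%d.0/24" % i for i in range(8)]
print(shuffled_targets(targets))
```

Probing in shuffled order spreads consecutive probes across unrelated networks, which is what keeps any single intermediate router from seeing a burst.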
...
Which by default will try to keep 128 traceroute processes running all the
time.  This is potentially problematic for relays with limited RAM or CPU
available.  I'd recommend making this more clear.
We tested running 128 parallel traceroutes and found the CPU (<5%) and RAM (<0.2% of 8GB) requirements to be very small.
You can also choose how many traceroutes to run in parallel using the following command:

PARALLEL=64 ./traceroutes.sh &
if you want to run 64 traceroutes in parallel

If you are using scamper (which we encourage you to do), you can tell the script how many packets per second to send using the following command:

PPS=800 ./traceroutes.sh &
where 1<=PPS<=1000
...
I may run this from a machine on the same network as my Tor node, but
definitely not on the Tor node itself.
Running from a machine on the same network is just as good, as long as both machines send their traffic through the same Internet gateways.

-------------------------------------------------------------------
From: Geoff Down geoffdown at fastmail.net
...
Your README should probably explicitly say that you need to run
sudo chmod 04555 /path/to/scamper
after installing scamper or it won't work.
Yes, we now mention that in the README.
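
For readers unsure what mode 04555 does: it combines the setuid bit (04000) with read and execute permission for everyone (0555), so the binary runs with the privileges of the file's owner. The Python sketch below only decodes the mode bits. That scamper needs this in order to open raw sockets when started by an unprivileged user is an assumption on my part, not something stated in this thread.

```python
import stat

def describe_mode(octal_mode):
    """Render permission bits the way `ls -l` shows them for a regular file."""
    return stat.filemode(stat.S_IFREG | octal_mode)

mode = 0o4555  # the mode from the chmod command quoted above
print(describe_mode(mode))
print("setuid bit set:", bool(mode & stat.S_ISUID))
```

Note that the setuid bit only grants root privileges if the file is owned by root, so it is worth checking the owner of the installed binary as well.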

--------------------------------------------------------------------
From: Jesse Victors jvictors at jessevictors.com
...
How much bandwidth will this be taking up, and roughly how much will be
uploaded/downloaded? I've cloned the repo, but I'm nervous about running
this if it's going to be a significant bandwidth hog for a whole week.
As Aaron said in Issue 39, it looks like it's going to be a lot of IPs
and a large amount of packets. Also, ISPs may not take kindly to all
these scans. What's the word on that? Has anyone run this tool, and
what's their opinion? I'd be happy to help, but I'd like to know the
full details of the various resources this tool will be consuming.
We have provided some resource requirements at the end of the README file. I'll summarize them here:

1. Upload: The script generates less than 500MB of data in total. By default the files are erased once
they are uploaded, but you can choose to keep them.

2. Bandwidth: If scamper is used with all the default settings, it consumes
at most 0.5 megabit per second of bandwidth. You can reduce the consumption by lowering
the PPS (packets-per-second) parameter (1<=PPS<=1000) using the following command:

PPS=800 ./traceroutes.sh &

On a reasonably fast line (>100Mbps), traceroutes at this rate shouldn't be much of a load for any ISP.

3. RAM and CPU usage: We have tested the script on multiple machines and found the following resource usage:

RAM <0.2% (of 8GB) and CPU <5%.
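
The bandwidth figure in item 2 and the RAM figure in item 3 can be sanity-checked with a little arithmetic. The Python sketch below assumes 60-byte probe packets, which is a guess at a typical traceroute probe size and not a measured value for scamper, and it counts only outgoing probes, not the replies.

```python
def probe_bandwidth_mbps(pps, packet_bytes=60):
    """Outgoing probe traffic in Mbit/s; the 60-byte packet size is an assumption."""
    return pps * packet_bytes * 8 / 1_000_000

def ram_share_mib(percent, total_gib=8):
    """Convert a percentage of total RAM into MiB."""
    return total_gib * 1024 * percent / 100

# 1000 is the upper end of the permitted PPS range; the others are examples.
for pps in (1000, 800, 100):
    print(pps, "pps ->", probe_bandwidth_mbps(pps), "Mbit/s")
print("0.2% of 8GB ->", ram_share_mib(0.2), "MiB")
```

Under these assumptions, 1000 packets per second works out to about 0.48 Mbit/s, in line with the 0.5 megabit figure in item 2, and 0.2% of 8GB is roughly 16MB.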

Hope this gives you some idea about the resource requirements.

Thanks

Anupam Das
