On Wed, 23 Oct 2013, Karsten Loesing wrote:
> As you may be aware, the anonymity of a connection over Tor is vulnerable to an adversary who can observe it in enough places along its route.
Are you trying to work backward to physical-layer chokepoints, such as how the intercontinental network topology maps to the landing sites on http://www.submarinecablemap.com/?
> Basically the experiment does traceroutes to three groups: all "routable IP prefixes", all Tor relays, and then all /24 subnets.
Based on my reading of your input data, these three groups come to 491762, 9058, and 13597431 IPs respectively, so a full run means sending at least 100 million packets? This is a bigger ask than I think you made clear.
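For anyone who wants to check that figure, here is a rough back-of-the-envelope sketch. The three target counts are the ones quoted above; the probes-per-target values are my own assumption, since the real number depends on path length and on how scamper or traceroute is configured.

```python
# Target counts as read from the experiment's input data (quoted above).
TARGETS = {
    "routable prefixes": 491_762,
    "Tor relays": 9_058,
    "/24 subnets": 13_597_431,
}

total_targets = sum(TARGETS.values())


def estimate_packets(n_targets: int, probes_per_target: int) -> int:
    """Lower-bound packet count, assuming a fixed number of probes per target.

    probes_per_target is an assumption (roughly hops times probes per hop);
    it is not a value taken from the experiment's scripts.
    """
    return n_targets * probes_per_target


# How many probes per target it takes to reach 100 million packets.
break_even = 100_000_000 / total_targets

print(f"total targets: {total_targets:,}")
print(f"probes per target needed for 100M packets: {break_even:.2f}")
print(f"at 8 probes per target: {estimate_packets(total_targets, 8):,} packets")
```

With a little over 14 million targets, anything above about seven probes per target crosses the 100 million mark, and a typical Internet path is longer than that.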
I question the utility of scanning all /24s. By definition, all /24s in a BGP prefix take the same path to their origin AS; the only variation will be inside that AS. If you are looking for chokepoints, you've already found them at the origin AS.
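To illustrate how much the /24 sweep multiplies the work per announced prefix, here is a small sketch. The prefixes are made-up private ranges chosen for illustration, not entries from the experiment's input, and the helper name is hypothetical. It assumes IPv4 only.

```python
import ipaddress


def count_slash24s(prefix: str) -> int:
    """Number of /24 targets a single IPv4 prefix expands into.

    Prefixes that are already /24 or longer count as one target.
    """
    net = ipaddress.ip_network(prefix)
    if net.version != 4:
        raise ValueError("this sketch only handles IPv4 prefixes")
    if net.prefixlen >= 24:
        return 1
    return 2 ** (24 - net.prefixlen)


# Illustrative prefixes only (RFC 1918 space), not real BGP announcements.
example_prefixes = ["10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16"]

for p in example_prefixes:
    print(p, "->", count_slash24s(p), "traceroutes instead of 1")
```

One traceroute per announced prefix reaches the same origin AS as each of these; the extra ones only differ in what they see inside it.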
The script also runs all scans sequentially, which has a couple of negative side effects. You are much more likely to trigger ICMP response rate-limiting on intermediate routers, and more likely to trigger IDS alarms, than if you had randomized your target selection. Running your target list through one of the following would mitigate this:
awk 'BEGIN{srand()}{print rand()"\t"$0}' | sort -k1 -n | cut -f2-
perl -MList::Util -e 'print List::Util::shuffle <>'
sort -R
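If the shuffle would be easier to do inside a script than in a shell pipeline, something like the following would work too. This is only a sketch: the function name and the sample addresses (documentation ranges) are mine, and I have not looked at how the experiment's own script loads its target list.

```python
import random


def shuffle_targets(targets, seed=None):
    """Return a shuffled copy of the target list, leaving the input untouched.

    Passing a seed makes the order reproducible, which helps when a run has
    to be repeated or resumed. With seed=None the order differs on each run.
    """
    rng = random.Random(seed)
    shuffled = list(targets)
    rng.shuffle(shuffled)
    return shuffled


# Hypothetical sample input; in practice this would be the full target file.
sample = ["192.0.2.1", "192.0.2.2", "198.51.100.7", "203.0.113.9"]
for target in shuffle_targets(sample):
    print(target)
```

Every target is still probed exactly once; only the order changes, so neighbouring addresses are no longer hit back to back.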
> These kinds of measurements are not uncommon, and they will not be done at a high rate.
They are uncommon from a Tor exit node. Mine already receives enough complaints that it is really helpful to be able to truthfully tell my ISP and others that none of the traffic was generated by me, that there's not much I can do to stop it, and that I have no logs about anything.
> If you are not able to run scamper, the script will also work with the more-common but less-accurate and slower "traceroute" utility.
In that mode the script will, by default, try to keep 128 traceroute processes running at all times. This is potentially problematic for relays with limited RAM or CPU available. I'd recommend making this clearer.
I may run this from a machine on the same network as my Tor node, but definitely not on the Tor node itself.
-- Aaron