On Thu, Jul 26, 2012 at 12:01:13PM -0400, Steve Snyder wrote:
At the same time, much of our performance improvement comes from better load balancing -- that is, concentrating traffic on the relays that can handle it better. The result though is a direct tradeoff with relay diversity: on today's network, clients choose one of the fastest 5 exit relays around 25-30% of the time, and 80% of their choices come from a pool of 40-50 relays.
From what I see on the TorStatus pages (torstatus.all.de, blutmagie.de) about a third of the roughly 3000 relays listed are at or below 64KB/sec of demonstrated bandwidth. No doubt some of these are soon-to-be-high-bandwidth servers that are just ramping up, and some are nodes having transitory networking problems. It seems reasonable to assume, though, that most of these low-bandwidth nodes are intentionally low-bandwidth, perhaps on the basis of the Tor doc stating a 20KB/sec minimum.
Yep. Note that I raised the minimum to 30KB/s a year or so back: https://www.torproject.org/docs/tor-doc-relay
Here are the current cutoffs for flags from moria1's perspective:
Jul 31 18:50:01.000 [info] Cutoffs: For Stable, 656736 sec uptime, 509452 sec MTBF. For Fast: 32768 bytes/sec. For Guard: WFU 94.512%, time-known 691200 sec, and bandwidth 128000 or 133912 bytes/sec.
Meaning if you don't have 32KB/s advertised in your relay descriptor, you won't get the Fast flag and most clients will ignore you.
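To make the Fast cutoff concrete, here is a minimal sketch that pulls the Fast threshold out of the log line above and compares a few advertised bandwidths against it. The helper names (`fast_cutoff`, `would_get_fast`) are hypothetical, the `>=` comparison is an assumption, and 1 KB is taken to be 1024 bytes. The directory authorities derive the cutoff from the current relay population, so this only illustrates the final comparison.

```python
import re

# The cutoff line quoted above, copied from moria1's log.
LOG_LINE = (
    "Jul 31 18:50:01.000 [info] Cutoffs: For Stable, 656736 sec uptime, "
    "509452 sec MTBF. For Fast: 32768 bytes/sec. For Guard: WFU 94.512%, "
    "time-known 691200 sec, and bandwidth 128000 or 133912 bytes/sec."
)


def fast_cutoff(log_line):
    """Extract the Fast cutoff (bytes/sec) from a 'Cutoffs:' log line."""
    match = re.search(r"For Fast: (\d+) bytes/sec", log_line)
    if match is None:
        raise ValueError("no Fast cutoff found in log line")
    return int(match.group(1))


def would_get_fast(advertised_bytes_per_sec, cutoff):
    """Assumption: a relay qualifies when it advertises at least the cutoff."""
    return advertised_bytes_per_sec >= cutoff


cutoff = fast_cutoff(LOG_LINE)
for kb_per_sec in (20, 30, 32, 64):
    advertised = kb_per_sec * 1024  # assumption: 1 KB = 1024 bytes
    verdict = "Fast" if would_get_fast(advertised, cutoff) else "not Fast"
    print(f"{kb_per_sec} KB/s advertised -> {verdict}")
```

Under these assumptions, a relay sitting exactly at the documented 30KB/s minimum still falls short of this particular Fast cutoff.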
With "80% of their choices come from a pool of 40-50 relays" that leaves a 20% chance for the remaining 2950 nodes. A case for low-bandwidth nodes can be made as a means to dissuade anticipated routing (due to pool size), but it seems from the stats quoted above that there is little chance that 2000+ of these 3000 nodes will ever carry Tor traffic, and thus can be ignored for purposes of traffic analysis.
You're using the wrong numbers (the 40-50 relays are just for the exit position, and there are only ~920 relays with the Exit flag), but your point is right.
Karsten made this graphic earlier to show that the top 50 exits account for 78.9% of the exit weights: https://trac.torproject.org/projects/tor/attachment/ticket/6443/exit-proport...
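For anyone who wants to reproduce that kind of number from their own list of exit weights, a rough sketch follows. The function name `top_n_share` is hypothetical, and the example weights are a made-up heavy-tailed distribution, not real consensus data, so the printed share will not match the 78.9% in the graphic.

```python
def top_n_share(weights, n):
    """Fraction of the total weight held by the n largest entries."""
    ordered = sorted(weights, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(ordered[:n]) / total


# Hypothetical weights for ~920 exits: an invented power-law-like curve,
# used only so the example runs; substitute real exit weights here.
example_weights = [1000.0 / (rank ** 1.2) for rank in range(1, 921)]

share = top_n_share(example_weights, 50)
print(f"top 50 of {len(example_weights)} exits hold {share:.1%} of the weight")
```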
Is there any justification for a low-bandwidth Tor node?
We could imagine alternate designs like Mashael's "multipath" design that spreads Tor flows across multiple circuits: http://www.cacr.math.uwaterloo.ca/techreports/2011/cacr2011-29.pdf
But currently, no, tiny nodes are not particularly helpful. There's an open research question as to whether they even hurt. Or more specifically, what the performance curve looks like if we dump the X% slowest relays: https://trac.torproject.org/projects/tor/ticket/1854
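One piece of that question is easy to compute: how much advertised capacity disappears if the slowest X% of relays are dropped. The sketch below does only that. It says nothing about the resulting performance, which is the open part. The name `capacity_lost` and the relay population are hypothetical; the population loosely follows the "about a third at or below 64KB/sec" observation, with an invented 1000 KB/s for everyone else.

```python
def capacity_lost(bandwidths, drop_fraction):
    """Fraction of total bandwidth removed by dropping the slowest relays.

    drop_fraction is the share of relays (by count) to drop, in [0, 1].
    """
    if not 0.0 <= drop_fraction <= 1.0:
        raise ValueError("drop_fraction must be between 0 and 1")
    ordered = sorted(bandwidths)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("bandwidths must sum to a positive value")
    n_drop = int(len(ordered) * drop_fraction)
    return sum(ordered[:n_drop]) / total


# Hypothetical network: 1000 relays at 64 KB/s, 2000 at a made-up 1000 KB/s.
relays_kb_per_sec = [64] * 1000 + [1000] * 2000

for percent in (10, 20, 33, 50):
    lost = capacity_lost(relays_kb_per_sec, percent / 100)
    print(f"drop slowest {percent}% of relays -> lose {lost:.2%} of capacity")
```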
I had originally imagined doing network simulations with Shadow or ExperimenTor to help answer #1854, but it's proving particularly tough to get an accurate network model at that level: https://shadow.cs.umn.edu/about/papers/tormodel-cset2012.pdf
And if so, what is the practical minimum bandwidth needed to actually see any traffic?
Actually, even these tiny relays see traffic. That's because of the sheer number of Tor clients out there -- if enough clients make enough circuits, some of them will be through the small relays. The question is whether the bandwidth cap on them makes that circuit especially no fun to use, relative to what you'd get if we squeezed all the users onto a smaller number of higher-bandwidth relays. My guess is raising the min bw for the Fast flag to 50KB/s or even 100KB/s would reduce the variance in torperf performance: https://metrics.torproject.org/performance.html?graph=torperf&source=mor...
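The "sheer number of clients" argument is just expected-value arithmetic, sketched below. It assumes a relay is picked for a given position with probability proportional to its weight, and it ignores flags, position weights, and everything else real path selection does. The function name and all numbers are hypothetical.

```python
def expected_circuits(relay_weight, total_weight, circuits_built):
    """Expected number of circuits through one relay in one position,
    assuming selection probability = relay_weight / total_weight."""
    if total_weight <= 0:
        raise ValueError("total_weight must be positive")
    return circuits_built * relay_weight / total_weight


# Hypothetical numbers: a 30 KB/s relay in a network with 3,000,000 KB/s
# of total weight, and 10,000,000 circuits built network-wide per day.
per_day = expected_circuits(relay_weight=30,
                            total_weight=3_000_000,
                            circuits_built=10_000_000)
print(f"expected circuits per day through the small relay: {per_day:.0f}")
```

Even a tiny selection probability yields a steady trickle of circuits once the circuit count is large enough.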
--Roger