revised Proposal 161: Computing bandwidth adjustments
mikeperry at fscked.org
Fri May 15 22:56:05 UTC 2009
Thus spake Nick Mathewson (nickm at torproject.org):
> I've pushed Mike's revisions to proposal 161 based on discussion
> here. I have a few questions on the revised version:
> > Two hop circuits are built using nodes from the same slice, and a large
> > file is downloaded via these circuits. For nodes in the first 15% of the
> > network, a 500K file will be used. For nodes in the next 15%, a 250K file
> > will be used. For nodes in next 15%, a 100K file will be used. The
> > remainder of the nodes will fetch a 75K file.
> The "first" 15%? Do you mean the "fastest"? And what does the 
> refer to?
I pushed a new version to a "bandwidth-proposals" branch on my public
repo last night (I think :) that added this footnote, which says that
I will perform measurements to determine these cutoffs properly (which
I'm doing now), and that gives some detail on parallelizing scans so
they can finish in a reasonable time.
> > This process is repeated 250 times, and average stream capacities are
> > assigned to each node from these results.
> > In the future, a node generator type can be created to ensure that
> > each node is chosen to participate in an equal number of circuits,
> > and the selection will continue until every live node is chosen
> > to participate in at least 7 circuits.
> By the way, what does the algorithm do with non-running nodes now?
> Does it retry them as it goes? Does a circuit that failed to build
> mean anything for the node's capacity? Does it count to the number of
> circuits built?
Non-running nodes recieve a ratio of None right now. I think we'll
just not report one and leave the authorities to abstain from voting
on a measured bandwidth for them.
Circuit failures do not affect ratios. Built circuits are not counted
as a progress metric as that is hard to enforce and still make
progress. Some nodes fail circuits endlessly.
> Also, we'll want to think more about this "until every live node is chosen
> to participate in at least 7 circuits" thing before we do it; if 7
> random circuits per live node are necessary and sufficient to measure
> the nodes' capacities, then we can get away with less (or will have to
> handle more) circuits than we're currently planning. [For example, if
> all 50 nodes are live, we'll need about 354 circuits on average. If
> only 30 are live, we'll expect to be done after 201 circuits.]
I picked 7 because we see around 10-20% circuit failure rate, and I
tended to notice node bandwidth didn't change after 3 or so
measurements. Not yet scientific.
7 each should require 7*(50-number_of_exits circuits), which is
something like 7*35 for most slices.
I also wrote the TorFlow NodeGenerator to do this last night as well,
but I still need to test it.
> > This will produce a new bandwidth value that will be output into a
> > file consisting of lines of the form:
> > node_id=<idhex> SP bw=<Bw_new> NL
> > This file can be either copied or rsynced into a directory readable
> > by the directory authority.
> When documenting and building this format, remember to note that
> programs parsing it MUST ignore unrecognized line formats and
> unrecognized fields on the lines they do recognize.
> Otherwise, this looks fine to me.
Mad Computer Scientist
fscked.org evil labs
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Size: 189 bytes
Desc: not available
More information about the tor-dev