[tor-dev] Proposal 276: Report bandwidth with lower granularity in consensus documents
teor
teor2345 at gmail.com
Tue Feb 28 11:42:23 UTC 2017
> On 27 Feb 2017, at 11:35, Nick Mathewson <nickm at alum.mit.edu> wrote:
>>> Is there any reason to think this amount of smoothing would not
>>> be save?
>> ...
>> Given the existing variance is around 2%, and the existing rounding of
>> 0.5%, rounding by 5% seems quite sensible.
>>
>> (Typical variance *between* bandwidth authorities is much more than
>> this anyway.)
I was wrong: mikeperry raised some serious issues with this scheme.
Issue 1: Increased Inaccuracy
This scheme increases the inaccuracy of the bandwidth authority votes,
compared to the bandwidth measurements. This effect is particularly
pronounced for high-bandwidth relays, because the bandwidths are
distributed according to a power law.
(Mike, feel free to correct me if I've got this wrong.)
For example, the top relay is approximately 1% of the network, or 750
Mbps. The existing variance due to rounding is 0.5% of that, or 4 Mbps.
The variance due to this scheme would be 5.5%, or 41 Mbps.
If the top 10 relays were rounded down, which would happen in
approximately 1 in 2**10 consensuses (once every 6 weeks), this could
distribute ~500 Mbps of client bandwidth to the rest of the relays
arbitrarily and without warning.
This effect gets worse as relay performance increases due to
multithreading or optimisation or processor improvements.
Issue 2: Different Levels of Inaccuracy
As a separate but related issue, rounding to 3 significant digits
means that the error ratio is much higher when those digits are
100..., compared with when those digits are 999...
The maximum error ratios vary by 10x:
100- 101 0.00499
...
999-1000 0.00050
This is an existing bug, and contributes to network instability.
We should fix it anyway, and this is a good opportunity to do so.
>>> Is there a greedier smoothing algorithm that would produce better
>>> results?
>> ...
To address issue 1, I suggest that we report higher bandwidths more
accurately, and lower bandwidths less accurately.
Here's a sketch of how that could work:
Top 1%: 3+1 significant digits (900 geometric values per decile)
Next 9%: 2+1 significant digits (90 geometric values per decile)
Next 90%: 1+1 significant digits (9 geometric values per decile)
0-100: set values: 0, 1, 5, 22 (4 geometric values for 4 deciles)
(0-100 is special, because the most common values are 0 and 20.)
To address issue 2, I suggest that we round bandwidths geometrically,
rather than linearly.
Here's a sketch of how that could work when rounding 100-1000 to
2 significant digits:
Normal linear rounding yields 90 values:
100, 110, 120, ... , 980, 990, 1000
Using the linear midpoints (a+b)/2:
105, 115, ... , 985, 995
Geometric rounding yields 90 values (using an extra significant digit):
100, 103, 105, ... , 950, 975, 1000
Using the geometric midpoints sqrt(a*b):
101, 104, ... , 962, 987
The maximum geometric error ratios vary by up to 2x, and that's due to
the rounding to 3 significant digits (otherwise, all the ratios would be
equal):
100- 103 0.01489
...
975-1000 0.01274
But this is much better than linear rounding, where the ratios vary by
up to 10x.
I've attached a script that calculates these geometric series: it's
configured to calculate the 900 values for Top 1% rounding, their
geometric error ratios, and their linear error differences.
(Changing precision to 2+1 and 1+1 gets you the 90 and 9 series,
changing precision to 1, values to 3, and high to 100.0 gets you
the tail-end series.)
This would be relatively easy to implement in C, or it could be
precalculated and stored using ~2 kilobytes (1003 * uint16_t).
And as we're also doing work on compression:
There are also opportunities for table-driven compression: each value
can be specified using 10 bits for the table index, plus a few bits
for the power of 10 of the multiplier:
Top: 750 Mbps * 1024 / 8 = 96000 KBps = 9599 * 10**1 (1 bit)
1%: 100 Mbps * 1024 / 8 = 12800 KBps = 129 * 10**2 (2 bits)
10%: 25 Mbps * 1024 / 8 = 3200 KBps = 36 * 10**2 (2 bits)
99%: 4 Mbps * 1024 / 8 = 512 KBps = 46 * 10**1 (1 bit)
End: 20 KBps = 20 KBps = 22 * 10**0 (1 bit)
6 extra bits (2 bytes total) would be sufficient to cover a 10,000-fold
increase in maximum individual relay bandwidth (up to 8 Tbps).
T
--
Tim Wilson-Brown (teor)
teor2345 at gmail dot com
PGP C855 6CED 5D90 A0C5 29F6 4D43 450C BA7F 968F 094B
ricochet:ekmygaiu4rzgsk6n
xmpp: teor at torproject dot org
------------------------------------------------------------------------
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bwmodel.py
Type: text/x-python-script
Size: 1861 bytes
Desc: not available
URL: <http://lists.torproject.org/pipermail/tor-dev/attachments/20170228/589efe8b/attachment.bin>
-------------- next part --------------
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 801 bytes
Desc: Message signed with OpenPGP
URL: <http://lists.torproject.org/pipermail/tor-dev/attachments/20170228/589efe8b/attachment.sig>
More information about the tor-dev
mailing list