On 25 Feb 2017, at 03:25, Nick Mathewson <nickm@torproject.org> wrote:
Filename: 276-lower-bw-granularity.txt
Title: Report bandwidth with lower granularity in consensus documents
Author: Nick Mathewson
Created: 20-Feb-2017
Status: Open
Target: 0.3.1.x-alpha
1. Overview
This document proposes that, in order to limit the bandwidth needed for networkstatus diffs, we lower the granularity with which bandwidth is reported in consensus documents.

...

3. Proposal
I propose that we round the bandwidth values as they are placed in the votes to two no more than significant digits.
Misordered sentence this is: I think you mean "to no more than two significant digits".
In addition, for values beginning with decimal "2" through "4", we should round the first two digits to the nearest multiple of 2. For values beginning with decimal "5" through "9", we should round to the nearest multiple of 5.
   Value     Rounding         Percentage   Distinct Values
   0-9       0                0%           10
   10-20     0.5/10-0.5/20    5%-2.5%      10
   20-50     1/20-1/50        5%-2%        15
   50-100    2.5/50-2.5/100   5%-2.5%      10
   (repeat the pattern for 100-200 etc.)
   (lower bounds are inclusive, upper bounds are exclusive)
This reduces each set of 1000 three-significant-figure bandwidth values to 35 distinct values.
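To make the scheme concrete, here is a minimal Python sketch of the rounding described above (the function name and the assumption of non-negative integer inputs are mine; the proposal doesn't specify tie-breaking, and Python's round() breaks ties to even):

    def round_bandwidth(bw):
        """Round an integer bandwidth to two significant digits, then
        coarsen the first two digits for larger leading digits, giving
        35 distinct leading patterns per decade. (Illustrative sketch,
        not code from the proposal.)"""
        if bw < 10:
            return bw  # 0-9 are reported unchanged
        # Scale so the first two significant digits form a number in [10, 100).
        scale = 10 ** (len(str(bw)) - 2)
        lead = bw / scale
        if lead < 20:
            step = 1   # 10-19: plain two significant digits
        elif lead < 50:
            step = 2   # 20-48: nearest multiple of 2
        else:
            step = 5   # 50-95: nearest multiple of 5
        return step * round(lead / step) * scale

Counting the reachable leading patterns (10-19, the even values 20-48, and the multiples of 5 from 50-95) gives 10 + 15 + 10 = 35, matching the figure above.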
It's worth noting that the existing bandwidth authority scripts already round to 3 significant figures (a change of up to 0.5%), so the overall rounding is up to 5.5%.
This change does not require a consensus method; it will take effect once enough authorities have upgraded.
The change will take effect progressively as authorities upgrade: since the median value is used, when one authority upgrades, 1/5 of the bandwidths will be rounded (on average).
Once all authorities upgrade, all bandwidths will be rounded like this.
4. Analysis
The rounding proposed above will not round any value by more than 5%, so the overall impact on bandwidth balancing should be small.
That's 5% *more* than the existing rounding.
...
5. Open questions:
Is there a greedier smoothing algorithm that would produce better results?
Rounding to one significant digit changes values that round to 10 by up to 50% (e.g. 14.9 rounds down to 10), so that's clearly unacceptable.
If we wanted 12.5% to be the maximum change, we could round by approximately double the amounts above (with some adjustments for scale):
For values beginning with decimal "1", we should round the first two digits to the nearest multiple of 2. For values beginning with decimal "2" through "4", we should round the first two digits to the nearest multiple of 5. For values beginning with decimal "5" through "9", we should round to one significant digit.
   Value     Rounding         Percentage   Distinct Values
   0-9       0                0%           10
   10-20     1/10-1/20        10%-5%       5
   20-50     2.5/20-2.5/50    12.5%-4%     6
   50-100    5/50-5/100       10%-5%       5
   (repeat the pattern for 100-200 etc.)
   (lower bounds are inclusive, upper bounds are exclusive)
This reduces each set of 1000 three-significant-figure bandwidth values to 16 distinct values.
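As an illustrative sketch of this coarser variant (again a hypothetical function, under the same assumptions as the sketch above), only the step table changes:

    def round_bandwidth_coarse(bw):
        """Coarser variant: 16 distinct leading patterns per decade,
        changing values by at most about 12.5%. (Illustrative sketch,
        not code from the proposal.)"""
        if bw < 10:
            return bw  # 0-9 are reported unchanged
        # Scale so the first two significant digits form a number in [10, 100).
        scale = 10 ** (len(str(bw)) - 2)
        lead = bw / scale
        if lead < 20:
            step = 2   # 10-18: nearest multiple of 2
        elif lead < 50:
            step = 5   # 20-45: nearest multiple of 5
        else:
            step = 10  # 50-90: one significant digit
        return step * round(lead / step) * scale

Here the reachable leading patterns (even values 10-18, multiples of 5 from 20-45, multiples of 10 from 50-90) give 5 + 6 + 5 = 16, matching the count above.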
Would a scheme like this reduce the diffs by enough to make it worthwhile?
Is there any reason to think this amount of smoothing would not be save?
s/save/safe/
Given that the existing variance is around 2%, and the existing rounding is 0.5%, rounding by 5% seems quite sensible.
(Typical variance *between* bandwidth authorities is much more than this anyway.)
It seems hard to game this scheme, and it would only result in a 5% boost anyway.
The same arguments apply to rounding by up to 12.5%. Anything more seems risky.
Would a time-aware smoothing mechanism work better?
Perhaps, but the complexity might outweigh the benefits.
T
--
Tim Wilson-Brown (teor)
teor2345 at gmail dot com
PGP C855 6CED 5D90 A0C5 29F6 4D43 450C BA7F 968F 094B
ricochet:ekmygaiu4rzgsk6n
xmpp: teor at torproject dot org