[tor-dev] tor's definition of 'median'
oneofthem at riseup.net
Tue Aug 11 13:28:57 UTC 2015
I think you are confusing the median with the mean:
Taking the median instead of the mean can be beneficial when your data
contains large outliers, which typically skew the mean heavily while
leaving the median almost unchanged.
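
To make that concrete, here is a minimal sketch. The sample values are
made up purely for illustration (they are not taken from any vote), and
it only uses Python's standard statistics module:

```python
import statistics

# Hypothetical measurements: four similar values and one large outlier.
samples = [10, 11, 12, 13, 1000]

mean_value = statistics.mean(samples)      # pulled far upward by the outlier
median_value = statistics.median(samples)  # middle element of the sorted list

print("mean:  ", mean_value)
print("median:", median_value)
```

The single outlier drags the mean far above every typical value, while
the median stays at the middle element.
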
> Is there some implementation-specific reason not to use the standard
> mathematical definition of "median"? If not, I propose changing the
> implementation to become it.
> On Tue, Aug 11, 2015 at 2:44 AM Nick Mathewson <nickm at alum.mit.edu> wrote:
>> On Mon, Aug 10, 2015 at 1:11 PM, nusenu <nusenu at openmailbox.org> wrote:
>>>> If 3 or more authorities provide a Measured= keyword for a router,
>>>> the authorities produce a consensus containing a "w" Bandwidth=
>>>> keyword equal to the median of the Measured= votes.
>>> a random sample from recent votes:
>>> grep 188.8.131.52 -A 3 *|grep Measured
>>> w Bandwidth=6869 Measured=7570
>>> w Bandwidth=6869 Measured=15500
>>> w Bandwidth=6869 Measured=18100
>>> w Bandwidth=6869 Measured=30500
>>> Tor says the median value is
>>> w Bandwidth=15500
>>> but the median of these 4 values is actually:
>>> (18100+15500)/2 = 16800
>>> Does tor have a different definition of 'median', always taking the
>>> second of the 4 ordered measurement votes, or is there a bug in
>>> the spec or implementation?
>> There's one misplaced throwaway sentence in dir-spec.txt:
>> "All ties in computing medians are broken in favor of the smaller or
>> earlier item."
>> We should bring this, and probably other things, into a "definitions"
>> section earlier in dir-spec.txt. Patches welcome. ;)
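
For what it's worth, the tie-breaking rule quoted above can be sketched
in a few lines of Python against the four Measured= values from nusenu's
sample. This is only an illustration of the rule as worded in
dir-spec.txt, not tor's actual C code; low_median is a hypothetical
helper name, and I am assuming "smaller or earlier item" means the lower
of the two middle elements after sorting:

```python
import statistics

# Measured= values from the four votes quoted in this thread.
votes = [7570, 15500, 18100, 30500]

def low_median(values):
    """Median with ties broken in favor of the smaller item.

    For an even number of values this picks the lower of the two
    middle elements instead of averaging them.
    """
    ordered = sorted(values)
    return ordered[(len(ordered) - 1) // 2]

print(low_median(votes))             # 15500, the value in the consensus
print(statistics.median_low(votes))  # 15500, the stdlib equivalent
print(statistics.median(votes))      # 16800.0, the textbook median
```

With an odd number of votes the two definitions agree, so the difference
only shows up when an even number of authorities provide a Measured=
value.
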
>> tor-dev mailing list
>> tor-dev at lists.torproject.org