[tor-bugs] #28305 [Metrics/Statistics]: Include client numbers even if we think we got reports from more than 100% of all relays

Tor Bug Tracker & Wiki blackhole at torproject.org
Sat Nov 3 20:38:47 UTC 2018


#28305: Include client numbers even if we think we got reports from more than 100%
of all relays
------------------------------------+--------------------------
     Reporter:  karsten             |      Owner:  metrics-team
         Type:  defect              |     Status:  new
     Priority:  High                |  Milestone:
    Component:  Metrics/Statistics  |    Version:
     Severity:  Normal              |   Keywords:
Actual Points:                      |  Parent ID:
       Points:                      |   Reviewer:
      Sponsor:                      |
------------------------------------+--------------------------
 The estimated fraction of user statistics reported by relays has reached
 100% and even went slightly beyond it, to 100.294% on 2018-10-27 and
 100.046% on 2018-10-28.

 The effect is that we're excluding the days when this happened from the
 statistics, because we never thought a fraction above 100% was possible:

 {{{
 WHERE a.frac BETWEEN 0.1 AND 1.0
 }}}

 However, I think this is most likely a rounding error somewhere, not a
 general issue with the approach. Stated differently, it seems wrong to
 include a number with a fraction of reported statistics of 99.9% but not
 one where that fraction is 100.1%.

 I suggest that we drop the upper limit and change the line above to:

 {{{
 WHERE a.frac >= 0.1
 }}}
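
 To illustrate the difference between the two conditions, here is a minimal
 sketch that runs both against an in-memory SQLite table. The table name
 `estimates`, the column `day`, and the two rows that are not from this
 ticket are assumptions made up for the example; only the fractions for
 2018-10-27 and 2018-10-28 are the values reported above, and the real
 query runs against a different schema and database.

```python
import sqlite3

# Hypothetical stand-in data: only the two fractions above 1.0 come from
# the ticket; the other two rows are invented for illustration.
rows = [
    ("2018-10-26", 0.999),    # invented: just under 100%
    ("2018-10-27", 1.00294),  # from the ticket: 100.294%
    ("2018-10-28", 1.00046),  # from the ticket: 100.046%
    ("2018-10-29", 0.05),     # invented: below the 10% lower limit
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE estimates (day TEXT, frac REAL)")
conn.executemany("INSERT INTO estimates VALUES (?, ?)", rows)


def kept_days(where_clause):
    """Return the days that survive the given WHERE condition."""
    query = f"SELECT a.day FROM estimates a WHERE {where_clause} ORDER BY a.day"
    return [day for (day,) in conn.execute(query)]


# Current condition: days with a fraction above 1.0 are dropped.
current = kept_days("a.frac BETWEEN 0.1 AND 1.0")

# Proposed condition: only the lower limit remains.
proposed = kept_days("a.frac >= 0.1")

print("current: ", current)
print("proposed:", proposed)
```

 With these sample rows, the current condition keeps only 2018-10-26,
 while the proposed one also keeps 2018-10-27 and 2018-10-28. Both still
 drop the row below the 10% lower limit.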

 We'll be replacing these statistics by PrivCount in the medium term
 anyway.

 Until then, though, simply excluding data points doesn't seem like an
 intuitive solution.

 Thoughts?

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/28305>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

