[tor-bugs] #8462 [Metrics Utilities]: Implement new bridge user counting algorithm (was Why don't .ir bridge users fall off when Tor gets censored by DPI?)

Tor Bug Tracker & Wiki blackhole at torproject.org
Mon May 6 14:56:36 UTC 2013


#8462: Implement new bridge user counting algorithm (was Why don't .ir bridge
users fall off when Tor gets censored by DPI?)
-------------------------------+--------------------------------------------
 Reporter:  arma               |          Owner:  karsten 
     Type:  task               |         Status:  assigned
 Priority:  normal             |      Milestone:          
Component:  Metrics Utilities  |        Version:          
 Keywords:                     |         Parent:          
   Points:                     |   Actualpoints:          
-------------------------------+--------------------------------------------

Comment(by karsten):

 An important requirement for the new implementation of our user-counting
 algorithm was to reduce the delay between observing and reporting Tor
 usage.  The earlier we learn that user numbers are dropping off in a
 given country or for a given pluggable transport, the better we can
 respond.

 I [https://trac.torproject.org/projects/tor/attachment/ticket/8462/results-delay-2013-05-06.png
 attached a graph visualizing this delay for relays] (bridges are expected
 to be similar).  This graph probably needs some explanation:

  - Uptime is the average number of relays that were running on a given
 day.  The left-most line shows relay uptime for April 4: it starts at
 00:00 (all times are UTC) of April 4 and reaches its maximum at 23:00.
 The reason is simply that we learn about uptime from consensuses, and
 these are published once per hour.

  - Bytes are bandwidth histories published by relays in their extra-info
 descriptors.  Similarly, the April 4 line starts early on April 4, but it
 reaches its maximum at about 18:00 of April 5.  The reason is that it can
 take up to 18 hours for some relays to publish their next descriptor
 containing their bandwidth history.

  - Responses are directory request statistics published by relays.  Here,
 we see that it can take until 12:00 of April 6 to report all responses for
 April 4.  The reason for this is that directory request statistics
 intervals are 24 hours long and can end at any time of the day, possibly
 on April 4 at 23:59.  In that case the following interval, which still
 covers the last minute of April 4, only ends on April 5 at 23:59.  And
 then it takes another 12 hours (possibly even 18 hours) for relays to
 publish their next descriptor.

  - Frac is the fraction of responses, weighted by bandwidth, that are
 available to estimate daily Tor users.  The higher this fraction, the
 better the estimate.  A value of 0.1 might be a fine lower limit for
 believing results at least a little bit, and a value of 0.5 means results
 are about as rock-solid as this estimation can ever be.  0.1 is reached
 during the evening of April 4, 0.5 at the end of April 5.
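
 To make the Frac item above more concrete, here is a minimal Python
 sketch.  It is not the actual metrics code: it assumes that frac can be
 approximated as the bandwidth of relays whose directory request statistics
 have arrived divided by the bandwidth of all relays, and the names
 RelayDay, frac and confidence are made up for illustration.  Only the
 thresholds 0.1 and 0.5 come from the explanation above.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class RelayDay:
    """One relay on one day (hypothetical record, not a real Tor data format)."""
    bandwidth: float  # weight of this relay, e.g. bytes from its bandwidth history
    reported: bool    # True once its directory request statistics have arrived


def frac(relays: Iterable[RelayDay]) -> float:
    """Bandwidth-weighted share of relays whose responses are available."""
    relays = list(relays)
    total = sum(r.bandwidth for r in relays)
    if total == 0:
        return 0.0  # no weight at all, so nothing can be estimated yet
    covered = sum(r.bandwidth for r in relays if r.reported)
    return covered / total


def confidence(value: float, lower: float = 0.1, solid: float = 0.5) -> str:
    """Map a frac value to a rough label using the thresholds from the comment."""
    if value < lower:
        return "too early"
    if value < solid:
        return "first guess"
    return "credible"
```

 For example, if only one relay carrying a tenth of the total weight has
 reported so far, frac is about 0.1 and the label would be "first guess".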

 tl;dr: we can make our very first guess about Tor users on a given day by
 the evening of that day and will have credible numbers by the end of the
 next day.
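
 The worst-case times quoted above for bytes and responses can be
 reproduced with a bit of date arithmetic.  This is only a sketch under
 stated assumptions: the helper name worst_case_completion is hypothetical,
 and it assumes that a 24-hour statistics interval can begin just before
 the day ends, so that the interval length and the publication delay add
 up after midnight.

```python
from datetime import datetime, timedelta


def worst_case_completion(day: datetime,
                          publish_delay_hours: int,
                          interval_hours: int = 0) -> datetime:
    """Latest time at which data for `day` can be completely reported.

    day                 -- midnight (UTC) at the start of the observed day
    publish_delay_hours -- longest wait until a relay publishes its next descriptor
    interval_hours      -- length of a statistics interval that may begin
                           right before the day ends (0 if not applicable)
    """
    end_of_day = day + timedelta(days=1)
    return end_of_day + timedelta(hours=interval_hours + publish_delay_hours)


april_4 = datetime(2013, 4, 4)

# Bandwidth histories: up to 18 hours until the next descriptor.
bytes_done = worst_case_completion(april_4, publish_delay_hours=18)

# Directory request statistics: a 24-hour interval plus 12 hours to publish.
responses_done = worst_case_completion(april_4, publish_delay_hours=12,
                                       interval_hours=24)

print(bytes_done.isoformat(), responses_done.isoformat())
```

 With these inputs the two results are 18:00 on April 5 and 12:00 on
 April 6, matching the points where the bytes and responses lines level
 off in the graph.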

-- 
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/8462#comment:7>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

