[tor-bugs] #27338 [Core Tor/sbws]: How long should sbws keep measured and observed bandwidths?

Mon Aug 27 13:06:01 UTC 2018

#27338: How long should sbws keep measured and observed bandwidths?
---------------------------+-------------------------------------
 Reporter:  teor           |          Owner:  (none)
     Type:  task           |         Status:  new
 Priority:  Medium         |      Milestone:  sbws 1.0 (MVP must)
Component:  Core Tor/sbws  |        Version:
 Severity:  Normal         |     Resolution:
 Keywords:                 |  Actual Points:
Parent ID:  #27108         |         Points:
 Reviewer:                 |        Sponsor:
---------------------------+-------------------------------------

Comment (by juga):

 > We need to decide which strategy to use, update the bandwidth file spec,
 and implement this feature in sbws.

 I'm not sure how we're going to decide this. Should we try each of the
 methods for 1 week, then graph the results and/or calculate the %
 differences with Torflow?
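
 A minimal sketch of what "calculate % differences with Torflow" could look
 like, assuming both scanners' results are already reduced to a mapping of
 relay fingerprint to bandwidth. The function name, the fingerprints and
 the values are all hypothetical and only illustrate the comparison:

```python
from statistics import mean


def percent_differences(sbws_bw, torflow_bw):
    """Return {fingerprint: % difference of sbws relative to Torflow}.

    Only relays present in both mappings are compared, and relays with a
    Torflow value of 0 are skipped to avoid dividing by zero.
    """
    diffs = {}
    for fingerprint, reference in torflow_bw.items():
        if fingerprint in sbws_bw and reference > 0:
            diffs[fingerprint] = (
                100.0 * (sbws_bw[fingerprint] - reference) / reference
            )
    return diffs


# Made-up values, for illustration only.
torflow = {"AAAA": 1000, "BBBB": 2000, "CCCC": 500}
sbws = {"AAAA": 1100, "BBBB": 1500, "DDDD": 700}

diffs = percent_differences(sbws, torflow)
mean_abs_diff = mean(abs(d) for d in diffs.values())
print(diffs)          # {'AAAA': 10.0, 'BBBB': -25.0}
print(mean_abs_diff)  # 17.5
```

 Running this once per method would give one number per method to compare,
 though a per-relay graph is probably still needed to see where they differ.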

 In case it's useful, here are some statistics on the descriptors' observed
 bandwidth stored in the results at the time of doing measurements, over
 2 days:
 - number of relays with a descriptor observed bandwidth: 6462
 - mean of all relays' descriptor observed bandwidth taking the last for
 each relay: 5621550
 - mean of all relays' descriptor observed bandwidth taking the mean from
 the relay's results: 5609508
 - median of all relays' descriptor observed bandwidth taking the last for
 each relay: 2065215
 - median of all relays' descriptor observed bandwidth taking the mean from
 the relay's results: 2060907
 - number of relays for which 1 descriptor observed bandwidth was
 collected: 5087 (79%)
 - number of relays for which 2 descriptor observed bandwidths were
 collected: 1368 (21%)
 - number of relays for which 3 descriptor observed bandwidths were
 collected: 7 (0.11%)
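
 The statistics above could be computed along these lines. This is only a
 sketch: the `results` mapping (fingerprint to a list of observed
 bandwidths, oldest first) and its values are made up, and it is not the
 code that produced the numbers in this comment:

```python
from collections import Counter
from statistics import mean, median

# Hypothetical data: descriptor observed bandwidths per relay, oldest first.
results = {
    "AAAA": [100, 300],
    "BBBB": [400],
    "CCCC": [200, 200, 500],
}

# Reduce each relay to a single value in two different ways.
last_per_relay = [values[-1] for values in results.values()]
mean_per_relay = [mean(values) for values in results.values()]

summary = {
    "relays": len(results),
    "mean_of_last": mean(last_per_relay),
    "mean_of_mean": mean(mean_per_relay),
    "median_of_last": median(last_per_relay),
    "median_of_mean": median(mean_per_relay),
}

# Number of relays grouped by how many values were collected for them.
counts = Counter(len(values) for values in results.values())

print(summary)
print(dict(counts))
```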

 I've also been collecting descriptors' observed bandwidth every hour (in
 a separate script). Would it be useful to compare only the descriptors'
 observed bandwidth collected in these 3 different ways?

 I have a lot of new code because of all the changes, tests and graphs.
 I could:
 1. continue with the experiments and make a PR only once we have decided
 this
 2. keep the experiments code so that we can reproduce them in the future,
 and start creating PRs with it.
 Is option 2 OK?

 For instance, if we collect descriptors' observed bandwidth, that's new
 code. Is it fine if I keep the code that stores descriptors' observed
 bandwidth only at the time of doing measurements? I can make it
 configurable, so that the method to be used can be passed as a parameter.
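
 One way the method could be passed as a parameter is a small lookup table
 of aggregation functions. The names `AGGREGATORS` and
 `aggregate_observed_bw` are hypothetical and do not exist in sbws; this
 only sketches the idea:

```python
from statistics import mean, median

# Hypothetical registry: method name -> function reducing a list of
# observed bandwidths (oldest first) to a single value.
AGGREGATORS = {
    "last": lambda values: values[-1],
    "mean": mean,
    "median": median,
}


def aggregate_observed_bw(values, method="last"):
    """Reduce a relay's observed bandwidths using the selected method."""
    if not values:
        raise ValueError("no observed bandwidth values")
    try:
        aggregator = AGGREGATORS[method]
    except KeyError:
        raise ValueError("unknown method: {}".format(method)) from None
    return aggregator(values)


observed = [100, 400, 100]
print(aggregate_observed_bw(observed))            # 100
print(aggregate_observed_bw(observed, "mean"))    # 200
print(aggregate_observed_bw(observed, "median"))  # 100
```

 Keeping the methods behind one function would let the experiments be
 rerun later by changing only a configuration value.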

 > Oh, and I think we should always use the latest
 {Relay,}Bandwidth{Rate,Burst}.

 Do you mean the descriptors' bandwidth burst [0]? We have not used it for
 anything yet. How should we use it?
 So far we have only used the descriptors' bandwidth average [1] to cap the
 measurements.
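
 For reference, dir-spec [0] defines the descriptor line as
 "bandwidth" bandwidth-avg bandwidth-burst bandwidth-observed, in bytes per
 second. A sketch of reading the three values and capping a measurement
 with the average follows. It assumes that capping means taking the
 minimum, and the function names are hypothetical; the actual logic in
 v3bwfile.py may differ:

```python
def parse_bandwidth_line(line):
    """Parse a descriptor 'bandwidth' line into (average, burst, observed)."""
    keyword, average, burst, observed = line.split()
    if keyword != "bandwidth":
        raise ValueError("not a bandwidth line: {!r}".format(line))
    return int(average), int(burst), int(observed)


def cap_measurement(measured_bw, descriptor_bw_average):
    """Assumption: a measurement never exceeds the advertised average."""
    return min(measured_bw, descriptor_bw_average)


# Made-up descriptor line, for illustration only.
average, burst, observed = parse_bandwidth_line(
    "bandwidth 1000000 2000000 750000"
)
print(cap_measurement(1500000, average))  # 1000000
print(cap_measurement(800000, average))   # 800000
```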

 [0] https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt#n427
 [1] https://github.com/pastly/simple-bw-scanner/blob/master/sbws/lib/v3bwfile.py#L314

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/27338#comment:2>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online