[tor-bugs] #25687 [Core Tor/Tor]: over-report of observed / self-measure bandwidth on fast hardware -- important to torflow / peerflow

Wed Sep 19 09:10:02 UTC 2018

#25687: over-report of observed / self-measure bandwidth on fast hardware --
important to torflow / peerflow
-------------------------------------------------+-------------------------
 Reporter:  starlight                            |          Owner:  (none)
     Type:  defect                               |         Status:  new
 Priority:  Medium                               |      Milestone:  Tor:
                                                 |  unspecified
Component:  Core Tor/Tor                         |        Version:  Tor:
                                                 |  0.2.6.10
 Severity:  Normal                               |     Resolution:
 Keywords:  tor-bwauth, needs-research, needs-   |  Actual Points:
  proposal?                                      |
Parent ID:                                       |         Points:
 Reviewer:                                       |        Sponsor:
-------------------------------------------------+-------------------------

Comment (by starlight):

 The essence of Torflow's active approach is that observed bandwidth
 capacity at each relay is the key measurement, that it can only be
 reliably determined locally, and that it requires adjustment: principally
 to account for used vs. unused capacity, and secondarily for the relative
 performance of each node in the asymmetric domain of internet traffic
 routing.  IMO this is indisputably correct.  The Peerflow paper tacitly
 recognizes this.

 However the simple linear adjustment algorithm cannot be fine-tuned for
 better results across the vast range of relay performance.  IIRC
 polynomial equations of sufficient order can describe curves of near
 arbitrary complexity and therefore parameterized polynomials can be used
 interactively, in a gradual empirical search, to describe an improving set
 of adjustment biases for applying scanner measurements to advertised
 bandwidths.  This link illustrates the general principle, though the idea
 is to design and construct bias curves with polynomials rather than to fit
 them somehow.

 https://en.wikipedia.org/wiki/Polynomial_regression
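
 As a rough sketch of what a designed bias curve could look like, assume
 the bias is a polynomial in log10 of the advertised bandwidth and that it
 multiplies the usual scanner ratio.  The function names and the
 coefficient list are hypothetical and not taken from Torflow; the
 coefficients stand in for the tunable knobs.

```python
import math


def poly_bias(advertised_bw, coeffs):
    """Evaluate a designed bias polynomial at x = log10(advertised_bw).

    coeffs is ordered lowest power first: c0 + c1*x + c2*x**2 + ...
    (hypothetical tuning knobs, not an existing Torflow setting).
    """
    if advertised_bw <= 0:
        raise ValueError("advertised_bw must be positive")
    x = math.log10(advertised_bw)
    # Horner's method: work from the highest power down to c0.
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def adjusted_weight(advertised_bw, scanner_ratio, coeffs):
    """Apply the scanner ratio and the bias curve to an advertised bandwidth."""
    return advertised_bw * scanner_ratio * poly_bias(advertised_bw, coeffs)


# A constant polynomial of 1 applies no bias at all, so the result is the
# plain linear adjustment (advertised bandwidth times scanner ratio).
NO_BIAS = [1.0]
```

 With NO_BIAS the output is unchanged from a purely linear adjustment, so
 higher-order coefficients can be introduced gradually from that baseline.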

 Averaging all relay measurements to a single value appears too simple in
 the face of reality.  My suggestion was and is to apply some variation of
 a moving average in such a way that relay biases are determined relative
 to relays of similar capacity rather than to all relays.  Note this is not
 at all the same as the "slice" scheme used by Torflow to group relays for
 measurement; I agree eliminating it was a good idea.

 https://en.wikipedia.org/wiki/Moving_average
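
 One possible variation, sketched below under simple assumptions: sort
 relays by capacity, then compare each relay's measurement to the mean
 measurement of its neighbours in that ordering (a centered moving
 average).  The tuple layout, function name and half_window parameter are
 hypothetical.

```python
def local_ratios(relays, half_window=2):
    """Return {name: measured / mean measurement of capacity-neighbours}.

    relays is a list of (name, capacity, measured) tuples.  Each relay is
    compared against a window of up to half_window relays on either side
    of it in capacity order, the relay itself included.
    """
    ordered = sorted(relays, key=lambda r: r[1])
    ratios = {}
    for i, (name, _capacity, measured) in enumerate(ordered):
        lo = max(0, i - half_window)
        hi = min(len(ordered), i + half_window + 1)
        window = [m for _, _, m in ordered[lo:hi]]
        mean = sum(window) / len(window)
        # Fall back to a neutral ratio if the whole window measured zero.
        ratios[name] = measured / mean if mean > 0 else 1.0
    return ratios
```

 If half_window is at least the number of relays, every window covers the
 whole network and the result is the ratio against a single global
 average, so the window size can be narrowed gradually from there.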

 Evaluating relays against others of the same class seems a good idea
 (i.e. process guards, exits and unflagged middle relays as isolated
 collections).  Torflow implemented it when PID logic was active, but
 presently it is disabled.
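
 A minimal sketch of per-class evaluation follows.  The tuple layout and
 function names are hypothetical, and treating a relay with both the Guard
 and Exit flags as an exit is an assumption made here for brevity, not
 necessarily what Torflow does.

```python
from collections import defaultdict


def relay_class(flags):
    """Map a set of relay flags to one of three classes.

    Assumption: a relay with both Guard and Exit is counted as an exit.
    """
    if "Exit" in flags:
        return "exit"
    if "Guard" in flags:
        return "guard"
    return "middle"


def class_ratios(relays):
    """Return {name: measured / mean measurement of the relay's own class}.

    relays is a list of (name, flags, measured) tuples.
    """
    groups = defaultdict(list)
    for _name, flags, measured in relays:
        groups[relay_class(flags)].append(measured)
    means = {cls: sum(vals) / len(vals) for cls, vals in groups.items()}

    ratios = {}
    for name, flags, measured in relays:
        mean = means[relay_class(flags)]
        # Fall back to a neutral ratio if the whole class measured zero.
        ratios[name] = measured / mean if mean > 0 else 1.0
    return ratios
```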

 I believe the "filtered" and "stream" logic in Torflow increases noise in
 the system rather than reducing it, and works opposite to the intent of
 the design.  The "filtered" treatment option should be eliminated.

 Central to all the above is iterative empirical adjustment, possibly
 assisted by automated collection and analysis, with control knobs exposed
 as consensus parameters.  The above can easily be implemented in a manner
 that allows the new system to start exactly where Torflow is and to
 cautiously refine, with the option to reverse course at any point.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/25687#comment:13>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online