[tor-bugs] #33538 [Core Tor/Tor]: v3bw files with too large of weights lead to relays being selected nearly uniformly at random

Thu Mar 5 20:19:02 UTC 2020

#33538: v3bw files with too large of weights lead to relays being selected nearly
uniformly at random
------------------------------+------------------------------------
     Reporter:  pastly        |      Owner:  (none)
         Type:  defect        |     Status:  new
     Priority:  Medium        |  Milestone:
    Component:  Core Tor/Tor  |    Version:  Tor: 0.4.4.0-alpha-dev
     Severity:  Normal        |   Keywords:
Actual Points:                |  Parent ID:
       Points:                |   Reviewer:
      Sponsor:                |
------------------------------+------------------------------------
 As part of working on the FlashFlow paper (currently under submission), we
 ran Shadow simulations to compare FlashFlow to TorFlow, as one would expect.

 Summary of the Shadow network used:

 - 5% of the real Tor network in size
     - 44 exits
     - 104 guards
     - 180 middles
     - 3 auths
     - 10 "markov" clients. It's not terribly important to know what
 they're doing, other than knowing they're making lots of 3-hop exit
 circuits and exchanging traffic with servers. 2 of the clients have tor
 debug logs. All 10 contribute to the relay selection data.
 - Tor version used is a63b4148229ae8ce46494fd6a0f99149c231605c (master
 branch as of March 5th, 2020) plus a small logging patch.
 [https://github.com/pastly/public-tor/tree/log-relay-weights Branch here].
 This problem existed in 0.3.5.7 as well. I don't know when it started
 because I don't know exactly what the problem is.
 - Shadow 292cd89ba52fc2972fdd9d2e27e384db9601663b (as of Jan 10th, 2020).
 - Shadow-plugin-tor 8deab15a032f5173ba7c12ad6dd0bcb1cb0c3463 (as of Oct
 2019) plus a patch so it works with newer Tor versions.
 [https://github.com/pastly/shadow-plugin-tor/tree/three-stubs Branch here].

 The only difference between the simulations is the v3bw file used.

 There are three simulations:

 1. TorFlow-derived weights (TF)
 1. FlashFlow-derived weights (FF init)
 1. FlashFlow-derived weights that have all been divided by 136 (FF scaled)

 weight-dist.pdf shows the distribution of the weights in the v3bw files,
 both with the raw absolute weights and as normalized (norm_weight = weight
 / total_weight). Despite having nearly identical normalized weight
 distributions (note: FF init and FF scaled are obviously identical), FF
 init results in (1) relays being selected seemingly uniformly at random,
 and (2) significantly worse performance as a consequence.
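
 To make the "obviously identical" remark concrete, here is a minimal
 sketch showing that dividing every weight by a constant leaves the
 normalized weights unchanged. The weight values are made up for
 illustration and are not taken from the attached v3bw files; only the
 divisor 136 comes from this ticket.

```python
def normalize(weights):
    """Return each weight divided by the sum of all weights."""
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical raw weights, standing in for "FF init".
ff_init = [136000, 272000, 680000, 1360000]

# The same weights divided by 136, standing in for "FF scaled".
ff_scaled = [w / 136 for w in ff_init]

norm_init = normalize(ff_init)
norm_scaled = normalize(ff_scaled)

# The two normalized distributions agree up to floating point error.
max_diff = max(abs(a - b) for a, b in zip(norm_init, norm_scaled))
print(norm_init)
print(norm_scaled)
print("largest difference:", max_diff)
```

 If relay selection depended only on normalized weight, FF init and FF
 scaled would therefore behave the same, which is not what the
 simulations show.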

 selection-v-weight.pdf shows how often the 10 markov clients picked each
 relay. Focus on the scatter plots. Notice how in TF and FF scaled there is
 basically a 1:1 linear relationship between additional weight and
 selection frequency, while in FF init the selection frequency is roughly
 the same regardless of the relay's weight.
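
 For reference, the expected linear relationship can be sketched with a
 plain weighted random choice. This is not Tor's path selection code (which
 also takes relay flags and position into account); it is only an
 illustration of selection frequency tracking normalized weight. The relay
 names, weights, and the helper selection_frequencies are hypothetical.

```python
import random
from collections import Counter

def selection_frequencies(weights, picks, seed=0):
    """Pick `picks` relays at random, proportionally to weight, and
    return the fraction of picks that went to each relay."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    names = list(weights)
    chosen = rng.choices(names, weights=[weights[n] for n in names], k=picks)
    counts = Counter(chosen)
    return {n: counts[n] / picks for n in names}

# Hypothetical relays and weights, not taken from the attached files.
weights = {"relayA": 1000, "relayB": 2000, "relayC": 5000, "relayD": 10000}

total = sum(weights.values())
expected = {n: w / total for n, w in weights.items()}
observed = selection_frequencies(weights, picks=100_000)

for name in weights:
    print(f"{name}: expected {expected[name]:.3f}, observed {observed[name]:.3f}")
```

 With proportional selection, the observed fractions stay close to the
 normalized weights, which is the pattern seen for TF and FF scaled but
 not for FF init.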

 I am also attaching the three v3bw files.

 I am also attaching small snippets from the debug logs of one of the
 markov clients. The snippets show some of the relay weights the client is
 using when deciding which relays to use. You can see in the FF init one
 that the weights are much more similar to each other than in FF scaled
 and TF.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/33538>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online