teor:
Hi Mike,
On 4 Jun 2019, at 06:20, Mike Perry mikeperry@torproject.org wrote:
Mike Perry:
teor:
I have an alternative proposal:
Let's deploy sbws to half the bandwidth authorities, wait 2 weeks, and see if exit bandwidths improve.
We should measure the impact of this change using the tor-scaling measurement criteria. (And we should make sure it doesn't conflict with any other tor-scaling changes.)
I like this plan. To tightly control for emergent effects of all-sbws vs all-torflow, ideally we'd switch back and forth between all-sbws and all-torflow on a synchronized schedule. But that requires enough sbws and torflow measurement instances that the authorities can choose either the sbws file or the torflow file on some schedule. It may be tricky to coordinate, but it would be the most rigorous way to do this.
We could do a version of this based on the votes/bandwidth files alone, without making the dirauths toggle back and forth. However, that would not capture emergent effects (such as quicker bandwidth adjustments in sbws due to decisions to pair relays with faster ones during measurement). Still, even comparing just the votes would be better than nothing.
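A vote-only comparison could start from the per-relay "w Bandwidth=" lines in each authority's vote. The sketch below is a rough illustration, not a full vote parser: the helper names are hypothetical, and it assumes each "w" line follows its relay's "r" line, as in the directory protocol.

```python
def vote_bandwidths(vote_text):
    """Map a relay's identity (third field of its 'r' line) to the
    Bandwidth= value on the 'w' line that follows it."""
    bandwidths = {}
    identity = None
    for line in vote_text.splitlines():
        if line.startswith("r "):
            identity = line.split()[2]  # base64-encoded identity digest
        elif line.startswith("w ") and identity:
            for item in line.split()[1:]:
                key, _, value = item.partition("=")
                if key == "Bandwidth":
                    bandwidths[identity] = int(value)
            identity = None
    return bandwidths

def compare_votes(sbws_vote_text, torflow_vote_text):
    """Per-relay ratio of sbws to torflow vote weights, for relays
    that appear in both votes with a nonzero torflow weight."""
    sbws = vote_bandwidths(sbws_vote_text)
    torflow = vote_bandwidths(torflow_vote_text)
    return {rid: sbws[rid] / torflow[rid]
            for rid in sbws.keys() & torflow.keys() if torflow[rid] > 0}
```

Running this over archived votes from an sbws authority and a torflow authority for the same consensus period would give one time series per relay to track during the transition.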
I don't know how possible this is: we would need two independent network connections per bandwidth scanner, one for sbws, and one for torflow.
(Running two scanners on the same connection means that they compete for bandwidth. Perhaps we could use Tor's BandwidthRate to share the bandwidth.)
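If we did share a connection, the split might look something like this in the torrc of the Tor client each scanner uses (the numbers are illustrative, not a recommendation):

```
# Hypothetical torrc for the sbws scanner's Tor client, capping it at
# half the link so a co-located torflow instance gets the other half.
BandwidthRate 50 MBytes
BandwidthBurst 50 MBytes
```

The torflow scanner's client would get a matching cap, though this only limits average rates and wouldn't stop the two scanners from interfering at shorter timescales.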
I also don't know how many authority operators are able to run sbws: Roger might be stuck on Python 2.
And I don't know how often they will be able to switch configs.
Let's make some detailed plans with the dirauth list.
Ok. It looks like I am still on the dirauth list. Perhaps we can come up with some way to use the dirauth-conf repo to switch things, but if we lack the machines for separate sbws and torflow, I agree that we should not try to have the same connections/machines running both.
In that case, we should just focus on tracking the metrics that are important to us as we continue to add sbws and remove torflow instances.
Do you like these metrics? Do you think we should be using different ones? Should we try a few different metrics and see what makes sense based on the results?
As additional metrics, we could do CDFs of the ratio of measured bandwidth to advertised bandwidth, and/or the metrics Karsten produced using just the measured bandwidth. (I still can't find the ticket where those were graphed during previous torflow updates, though.)
These metrics would be pretty unique to torflow/sbws experiments, but if we have enough of those in the pipeline (such as changes to the scaling factor), they may be worth tracking over time.
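The CDF metric above is cheap to compute once we have per-relay (measured, advertised) pairs from a vote and the matching descriptors. A minimal sketch (the function name and input shape are assumptions, not an agreed interface):

```python
def ratio_cdf(relays):
    """CDF points for measured/advertised bandwidth ratios.

    `relays` is an iterable of (measured_bw, advertised_bw) pairs, e.g.
    one pair per relay from a vote plus its server descriptors. Returns
    a sorted list of (ratio, fraction_of_relays_at_or_below) points.
    Relays with zero advertised bandwidth are skipped.
    """
    ratios = sorted(m / a for m, a in relays if a > 0)
    n = len(ratios)
    return [(r, (i + 1) / n) for i, r in enumerate(ratios)]
```

Overlaying one such curve per vote (sbws vs torflow) on the same axes would make systematic over- or under-measurement easy to spot.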
If we get funding for sbws experiments, we can definitely tweak the sbws scaling parameters, and do some experiments.
At the moment, I'd like to focus on fixing critical sbws issues, deploying sbws, and making sure it works at least as well as torflow.
Yes, that makes sense. A minimal version of this could be: skip the swapping back and forth, and just add sbws scanners and replace torflow scanners one by one. As we do this, we could keep a record of the metrics over the votes and consensus, and compare how the metrics look for the sbws votes vs the torflow votes vs the consensus over time.
I'll work on precise formulae for the "Per Relay Spare Capacity" metric and the "Measured to Observed Ratio" metric, and think more about how we want to graph them so they are easier to compare over time. I feel like my previous mails were a little hand-wavy. Depending on how this works out, I will either post that to tor-scaling with a complete list of specific metric equations, or write a separate post to tor-dev with them just for sbws.
We won't finalize all of the performance experiment metrics until after the Mozilla All Hands meeting (i.e., in ~3 weeks), but the two above can be retroactively computed from the router descriptor and extra-info archives.
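Pending the precise formulae, here is one plausible reading of the two metrics, sketched as Python so the inputs are explicit. Both definitions are guesses at what the metrics might mean, not the agreed formulae: the inputs would come from the consensus (measured weight) and from server descriptors and extra-info read/write history.

```python
def measured_to_observed_ratio(measured_bw, observed_bw):
    """Possible 'Measured to Observed Ratio': the consensus measured
    weight over the relay's self-observed bandwidth from its descriptor.
    Returns None when the relay reports no observed bandwidth."""
    return measured_bw / observed_bw if observed_bw else None

def per_relay_spare_capacity(observed_bw, read_bytes, write_bytes, interval):
    """Possible 'Per Relay Spare Capacity': self-observed capacity minus
    recent average traffic, taken from extra-info read/write history
    totals over an `interval` of seconds."""
    used_bw = max(read_bytes, write_bytes) / interval
    return observed_bw - used_bw
```

Whatever the final definitions are, keeping them as small pure functions like this would make the retroactive computation over the descriptor archives straightforward to re-run.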
What were you thinking for the timeframe for the complete transition to sbws?