teor:
Hi Mike,
On 4 Jun 2019, at 06:20, Mike Perry mikeperry@torproject.org wrote:
Mike Perry:
teor:
I have an alternative proposal:
Let's deploy sbws to half the bandwidth authorities, wait 2 weeks, and see if exit bandwidths improve.
We should measure the impact of this change using the tor-scaling measurement criteria. (And we should make sure it doesn't conflict with any other tor-scaling changes.)
I like this plan. To tightly control for emergent effects of all-sbws vs all-torflow, ideally we'd switch back and forth between all-sbws and all-torflow on a synchronized schedule. But that requires enough sbws and torflow measurement instances that the authorities can choose either the sbws file or the torflow file on some schedule. It may be tricky to coordinate, but it would be the most rigorous way to do this.
We could do a version of this based on the votes/bandwidth files alone, without making the dirauths toggle back and forth. However, that would not capture emergent effects (such as quicker bandwidth adjustments in sbws due to decisions to pair relays with faster ones during measurement). Still, even comparing just the votes would be better than nothing.
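A vote-only comparison could start from the per-relay "w Bandwidth=" lines in each authority's vote. The sketch below is a rough illustration, not a full vote parser: the helper names are hypothetical, and it assumes each "w" line follows its relay's "r" line, as in the directory protocol.

```python
def vote_bandwidths(vote_text):
    """Map a relay's identity (third field of its 'r' line) to the
    Bandwidth= value on the 'w' line that follows it."""
    bandwidths = {}
    identity = None
    for line in vote_text.splitlines():
        if line.startswith("r "):
            identity = line.split()[2]  # base64-encoded identity digest
        elif line.startswith("w ") and identity:
            for item in line.split()[1:]:
                key, _, value = item.partition("=")
                if key == "Bandwidth":
                    bandwidths[identity] = int(value)
            identity = None
    return bandwidths

def compare_votes(sbws_vote_text, torflow_vote_text):
    """Per-relay ratio of sbws to torflow vote weights, for relays
    that appear in both votes with a nonzero torflow weight."""
    sbws = vote_bandwidths(sbws_vote_text)
    torflow = vote_bandwidths(torflow_vote_text)
    return {rid: sbws[rid] / torflow[rid]
            for rid in sbws.keys() & torflow.keys() if torflow[rid] > 0}
```

Running this over archived votes from an sbws authority and a torflow authority for the same consensus period would give one time series per relay to track during the transition.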
I don't know how possible this is: we would need two independent network connections per bandwidth scanner, one for sbws, and one for torflow.
(Running two scanners on the same connection means that they compete for bandwidth. Perhaps we could use Tor's BandwidthRate to share the bandwidth.)
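If we did share a connection, the split might look something like this in the torrc of the Tor client each scanner uses (the numbers are illustrative, not a recommendation):

```
# Hypothetical torrc for the sbws scanner's Tor client, capping it at
# half the link so a co-located torflow instance gets the other half.
BandwidthRate 50 MBytes
BandwidthBurst 50 MBytes
```

The torflow scanner's client would get a matching cap, though this only limits average rates and wouldn't stop the two scanners from interfering at shorter timescales.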
I also don't know how many authority operators are able to run sbws: Roger might be stuck on Python 2.
And I don't know how often they will be able to switch configs.
Let's make some detailed plans with the dirauth list.
Ok. It looks like I am still on the dirauth list. Perhaps we can come up with some way to use the dirauth-conf repo to switch things, but if we lack the machines for separate sbws and torflow, I agree that we should not try to have the same connections/machines running both.
In that case, we should just focus on tracking the metrics that are important to us as we continue to add sbws and remove torflow instances.
Do you like these metrics? Do you think we should be using different ones? Should we try a few different metrics and see what makes sense based on the results?
As additional metrics, we could do CDFs of the ratio of measured bandwidth to advertised bandwidth, and/or the metrics Karsten produced using just the measured bandwidth. (I still can't find the ticket where those were graphed during previous torflow updates, though.)
These metrics would be pretty unique to torflow/sbws experiments, but if we have enough of those in the pipeline (such as changes to the scaling factor), they may be worth tracking over time.
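The CDF metric above is cheap to compute once we have per-relay (measured, advertised) pairs from a vote and the matching descriptors. A minimal sketch (the function name and input shape are assumptions, not an agreed interface):

```python
def ratio_cdf(relays):
    """CDF points for measured/advertised bandwidth ratios.

    `relays` is an iterable of (measured_bw, advertised_bw) pairs, e.g.
    one pair per relay from a vote plus its server descriptors. Returns
    a sorted list of (ratio, fraction_of_relays_at_or_below) points.
    Relays with zero advertised bandwidth are skipped.
    """
    ratios = sorted(m / a for m, a in relays if a > 0)
    n = len(ratios)
    return [(r, (i + 1) / n) for i, r in enumerate(ratios)]
```

Overlaying one such curve per vote (sbws vs torflow) on the same axes would make systematic over- or under-measurement easy to spot.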
If we get funding for sbws experiments, we can definitely tweak the sbws scaling parameters, and do some experiments.
At the moment, I'd like to focus on fixing critical sbws issues, deploying sbws, and making sure it works at least as well as torflow.
Yes, that makes sense. A minimal version of this could be: skip the swapping back and forth, and just add sbws scanners and replace torflow scanners one by one. As we do this, we could keep a record of the metrics over the votes and consensus, and compare how the metrics look for the sbws votes vs the torflow votes vs the consensus over time.
I'll work on precise formulae for the "Per Relay Spare Capacity" metric and the "Measured to Observed Ratio" metric, and think more about how we want to graph them so they are easier to compare over time. I feel like my previous mails were a little hand-wavy. Depending on how this works out, I will either post that to tor-scaling with a complete list of specific metric equations, or write a separate post to tor-dev with them just for sbws.
We won't finalize all of the performance experiment metrics until after the Mozilla All Hands meeting (i.e., in ~3 weeks), but the two above can be retroactively computed from the router descriptor and extra-info archives.
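Pending the precise formulae, here is one plausible reading of the two metrics, sketched as Python so the inputs are explicit. Both definitions are guesses at what the metrics might mean, not the agreed formulae: the inputs would come from the consensus (measured weight) and from server descriptors and extra-info read/write history.

```python
def measured_to_observed_ratio(measured_bw, observed_bw):
    """Possible 'Measured to Observed Ratio': the consensus measured
    weight over the relay's self-observed bandwidth from its descriptor.
    Returns None when the relay reports no observed bandwidth."""
    return measured_bw / observed_bw if observed_bw else None

def per_relay_spare_capacity(observed_bw, read_bytes, write_bytes, interval):
    """Possible 'Per Relay Spare Capacity': self-observed capacity minus
    recent average traffic, taken from extra-info read/write history
    totals over an `interval` of seconds."""
    used_bw = max(read_bytes, write_bytes) / interval
    return observed_bw - used_bw
```

Whatever the final definitions are, keeping them as small pure functions like this would make the retroactive computation over the descriptor archives straightforward to re-run.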
What were you thinking for the timeframe for the complete transition to sbws?