Hi asn,
Original Subject: Re: [tor-project] Meeting notes, network team meeting, 18 Dec
On 19 Dec 2017, at 07:28, Nick Mathewson nickm@torproject.org wrote:
- Met with David Stainton, Moritz and others. Talked about relay load balancing and bandwidth dirauths. People are sad about the state of the network: some relays are overloaded while others idle, many overloaded relays can't even establish circuits to each other. Need to do something about it: deploy bwscanner and start thinking about peerflow. What about isis' bridge bandwidth scanner?
Can you tell us a bit more about this meeting? What can people do if they want to be involved? When is the next meeting, or how can people find out about it?
T
-- Tim / teor
PGP C855 6CED 5D90 A0C5 29F6 4D43 450C BA7F 968F 094B ricochet:ekmygaiu4rzgsk6n
teor teor2345@gmail.com writes:
Hi asn,
Original Subject: Re: [tor-project] Meeting notes, network team meeting, 18 Dec
On 19 Dec 2017, at 07:28, Nick Mathewson nickm@torproject.org wrote:
- Met with David Stainton, Moritz and others. Talked about relay load balancing and bandwidth dirauths. People are sad about the state of the network: some relays are overloaded while others idle, many overloaded relays can't even establish circuits to each other. Need to do something about it: deploy bwscanner and start thinking about peerflow. What about isis' bridge bandwidth scanner?
Can you tell us a bit more about this meeting? What can people do if they want to be involved? When is the next meeting, or how can people find out about it?
Hey teor,
thanks for following through.
The "meeting" was impromptu and IRL because we all happened to be at the same place. There is no next meeting and it's up to us (Tor/network team) to figure out what are the next steps here.
This week I did some digging to explore the various possible ways forward, also based on your email here:
https://www.mail-archive.com/tor-dev@lists.torproject.org/msg09912.html
Here are my findings:
Possibility a) Develop peerflow and deploy it in place of torflow
Peerflow is an exciting and secure bandwidth measurement system published in PETS 2017: https://ohmygodel.com/publications/peerflow-popets2017.pdf
Unfortunately, it seems quite complicated to develop from scratch and will probably require _significant_ engineering time to actually make it a deployed reality (understand, develop, test, deploy). This is probably the solution we would like to pursue if we had a grant and a dedicated developer.
Possibility b) Finalize bwscanner and deploy in place of torflow
bwscanner is a project by Aaron/David/Donncha and can be found here: https://github.com/TheTorProject/bwscanner
It seems to implement the torflow design (2-hop circs && buckets) but in a cleaner and better codebase. From what I understand, the main part of the project is done, but there has been minimal testing on the real network (there are unittests tho) and also the final output file with the bandwidth weights has not been completely finalized.
This project is not quite there yet, and will require some non-trivial engineering time, but it's probably a much easier task compared to peerflow due to the design being more understood and already coded. I think 2-3 weeks of developer time could be quite fruitful here. I also heard that some bw auth operators are eager to run bwscanner instead of torflow on their setup in January.
Possibility c) Adapt the bridge bw scanner that is currently being developed
Apparently isis and another developer are currently writing a bridge bandwidth scanner for bridgedb, that could in theory be extended to scan the whole network. They are currently writing some sort of Rust library that will be used by the scanner, and the project is ETA around March 2018. The whole development process is pretty opaque so I have no idea what's going on. Also, there probably needs to be considerable work to extend it from a simple bridge scanner to a real relay scanner, and the final result will probably look like bwscanner above.
Currently my intuition is to work on (b) above, while also preparing the ground for (a) which seems to be The Right Thing.
I'm not sure what's the right way forward here in terms of project management, since the network team seems overloaded and I haven't heard of anyone willing to take this on...
Ideally we would probably apply for some sort of grant on this work so that some actual developer time is allocated. I think this is definitely fundable work since it deeply impacts the *performance* and security of the Tor network, and basically the network has no chance of surviving in greater loads if the status quo persists.
I'll try to think more about this problem in the future, these are just my thoughts from a few hours of digging.
Cheers!
George Kadianakis desnacked@riseup.net writes:
The "meeting" was impromptu and IRL because we all happened to be at the same place. There is no next meeting and it's up to us (Tor/network team) to figure out what are the next steps here.
I want to help. Anyone please bug me on IRC for any Python etc help required to make bwauth/scanners better. I don't have enough volunteer cycles right now to "take over" bwscanner entirely though.
This project is not quite there yet, and will require some non-trivial engineering time, but it's probably a much easier task compared to peerflow due to the design being more understood and already coded.
I'm not convinced this part is completely accurate ;) because at TorDev MTL it seems to me the consensus was that nobody actually knows what torflow is doing and so answering the question "is bwscanner doing the same thing" is approximately NP-hard.
I think 2-3 weeks of developer time could be quite fruitful here. I also heard that some bw auth operators are eager to run bwscanner instead of torflow on their setup in January.
Wooo! (I think the best path to answering "does bwscanner do the same thing as torflow" is to Run It And See...) If any of these parties are having problems deploying bwscanner this is probably something I can help with.
Currently my intuition is to work on (b) above, while also preparing the ground for (a) which seems to be The Right Thing.
+1 I think the next step for a) isn't "implement it", but "write a spec for it" instead.
Ideally we would probably apply for some sort of grant on this work so that some actual developer time is allocated. I think this is definitely fundable work since it deeply impacts the *performance* and security of the Tor network [..]
+5
On 20 Dec 2017, at 06:06, meejah meejah@meejah.ca wrote:
This project is not quite there yet, and will require some non-trivial engineering time, but it's probably a much easier task compared to peerflow due to the design being more understood and already coded.
I'm not convinced this part is completely accurate ;) because at TorDev MTL it seems to me the consensus was that nobody actually knows what torflow is doing and so answering the question "is bwscanner doing the same thing" is approximately NP-hard.
I have some idea what torflow is doing, in a broad sense:
* launch 2 tor clients
* repeat as often as possible, running 9 different scanners:
  * split relays into buckets by bandwidth percentile
  * build two hop paths with a relay and exit from relays in the bucket
  * download a file from a bandwidth server, choose the size based on the bucket
  * measure how long it takes
  * store the results in a database
* aggregate the results hourly:
  * produce a consensus weight to advertised bandwidth ratio
  * using a decaying weighted average
  * and some form of feedback (PID) control
  * and dump it to a file
Then authorities read this file and include it in their votes.
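The scanning half of the steps above can be sketched in a few lines of Python. This is only an illustration of "split relays into buckets by bandwidth percentile" and "build two hop paths with a relay and exit from relays in the bucket"; the function names, the equal-sized slices and the pairing rule are assumptions made for the example, not torflow's or bwscanner's actual code.

```python
from typing import Dict, List, Set, Tuple


def split_into_buckets(relays: Dict[str, int], n_buckets: int) -> List[List[str]]:
    """Split relays into n_buckets slices by bandwidth rank, fastest first.

    `relays` maps a relay name to its bandwidth. Equal-sized slices are an
    assumption of this sketch.
    """
    ranked = sorted(relays, key=lambda name: relays[name], reverse=True)
    buckets = []
    for i in range(n_buckets):
        start = i * len(ranked) // n_buckets
        end = (i + 1) * len(ranked) // n_buckets
        buckets.append(ranked[start:end])
    return buckets


def two_hop_pairs(bucket: List[str], exits: Set[str]) -> List[Tuple[str, str]]:
    """Pair each relay with the first different exit found in the same bucket.

    Relays with no usable exit in their bucket are skipped (hypothetical rule).
    """
    bucket_exits = [r for r in bucket if r in exits]
    pairs = []
    for relay in bucket:
        for exit_relay in bucket_exits:
            if exit_relay != relay:
                pairs.append((relay, exit_relay))
                break
    return pairs
```

Each resulting pair would then be used for one timed download, with the file size chosen per bucket.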
I suspect that Mike Perry may remember more detail, or may want to correct my summary, as he wrote most of torflow (I think?)
My conclusion at the Montreal meeting was that we don't have a detailed spec (see below). So that makes it hard to tell if:
* torflow does what we want it to do
* the new bwauth project does what we want it to do
* they are similar enough for a staged or once-off transition
I think 2-3 weeks of developer time could be quite fruitful here. I also heard that some bw auth operators are eager to run bwscanner instead of torflow on their setup in January.
Wooo! (I think the best path to answering "does bwscanner do the same thing as torflow" is to Run It And See...) If any of these parties are having problems deploying bwscanner this is probably something I can help with.
It doesn't produce an output file in the same format as torflow, so we need to specify (see below) and implement that part first.
Otherwise, we would not have any results to compare.
Currently my intuition is to work on (b) above, while also preparing the ground for (a) which seems to be The Right Thing.
+1 I think the next step for a) isn't "implement it", but "write a spec for it" instead.
+1
Let's start by specifying what tor directory authorities expect from the file format.
T
teor:
On 20 Dec 2017, at 06:06, meejah meejah@meejah.ca wrote:
This project is not quite there yet, and will require some non-trivial engineering time, but it's probably a much easier task compared to peerflow due to the design being more understood and already coded.
I'm not convinced this part is completely accurate ;) because at TorDev MTL it seems to me the consensus was that nobody actually knows what torflow is doing and so answering the question "is bwscanner doing the same thing" is approximately NP-hard.
I have some idea what torflow is doing, in a broad sense:
- launch 2 tor clients
- repeat as often as possible, running 9 different scanners:
  - split relays into buckets by bandwidth percentile
  - build two hop paths with a relay and exit from relays in the bucket
  - download a file from a bandwidth server, choose the size based on the bucket
  - measure how long it takes
  - store the results in a database
- aggregate the results hourly:
  - produce a consensus weight to advertised bandwidth ratio
  - using a decaying weighted average
  - and some form of feedback (PID) control
  - and dump it to a file
Then authorities read this file and include it in their votes.
Yes, all of this is correct.
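For readers who have not seen the aggregation step, here is a rough Python sketch of "a decaying weighted average" feeding a ratio that scales advertised bandwidth. The smoothing factor `alpha`, the use of the network mean as the reference, and all function names are assumptions for illustration only; the thread does not give torflow's actual formulas.

```python
from typing import Dict, Optional


def decayed_average(previous: Optional[float], sample: float, alpha: float = 0.5) -> float:
    """Exponentially decaying average: newer samples count more than older ones."""
    if previous is None:
        return sample
    return alpha * sample + (1 - alpha) * previous


def speed_ratios(measured: Dict[str, float]) -> Dict[str, float]:
    """Ratio of each relay's measured speed to the mean over all relays."""
    mean = sum(measured.values()) / len(measured)
    return {relay: speed / mean for relay, speed in measured.items()}


def scaled_weights(advertised: Dict[str, int], ratios: Dict[str, float]) -> Dict[str, int]:
    """Scale each relay's advertised bandwidth by its ratio (no feedback term)."""
    return {relay: int(advertised[relay] * ratios[relay]) for relay in ratios}
```

A relay measured at 1.5 times the network mean would end up with 1.5 times its advertised bandwidth as its weight under these assumptions.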
Technically though full PID feedback is disabled right now. The PID-based implementation itself is enabled via bwauthpid=1 in the consensus, but the PID constants are currently set such that there is no actual feedback happening. See Section 3 of the Bw authority spec for more info: https://gitweb.torproject.org/torflow.git/tree/NetworkScanners/BwAuthority/R...
If feedback is enabled (via consensus parameters), it drives relays to other forms of resource exhaustion which we do not currently measure (primarily CPU exhaustion, which we could approximate by circuit failure, but potentially also memory pressure, which we have no signal for).
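To make the "PID enabled but no actual feedback" point concrete, below is a generic discrete PID step in Python. It is a textbook-style sketch, not torflow's code: the update form `bw * (1 + kp*e + ki*sum + kd*delta)`, the error definition and the gain values are all assumptions of this example, and the real constants are in Section 3 of the spec linked above. The thing to notice is that with only the proportional gain set, the output depends on the current measurement alone and nothing accumulates between rounds.

```python
from dataclasses import dataclass


@dataclass
class PidState:
    error_sum: float = 0.0   # integral term memory
    last_error: float = 0.0  # derivative term memory


def pid_step(bw: float, measured: float, network_avg: float,
             state: PidState, kp: float, ki: float, kd: float) -> float:
    """One PID update of a relay's bandwidth value (hypothetical form)."""
    error = (measured - network_avg) / network_avg
    state.error_sum += error
    delta = error - state.last_error
    state.last_error = error
    return bw * (1 + kp * error + ki * state.error_sum + kd * delta)
```

With `kp=1.0, ki=0.0, kd=0.0`, repeated calls with the same inputs return the same value, which reduces to `bw * measured / network_avg`. A non-zero `ki` makes the result drift upward for a relay that stays faster than average, which is the kind of feedback that could push relays into the resource exhaustion described above.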
I suspect that Mike Perry may remember more detail, or may want to correct my summary, as he wrote most of torflow (I think?)
Yes.
(I think the best path to answering "does bwscanner do the same thing as torflow" is to Run It And See...) If any of these parties are having problems deploying bwscanner this is probably something I can help with.
Karsten wrote some scripts that can produce CDF graphs of bw authority votes for all of the flag combinations. This was very useful for determining if different bw authorities were measuring the network similarly. It will also be useful to see how closely the bwscanner is coming to the bwauth votes: https://trac.torproject.org/projects/tor/ticket/2394
I am not sure what repo they are in, though.
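As a stand-in until those scripts are located, a comparison of this kind can be sketched in plain Python: compute an empirical CDF of the bandwidth values each scanner reports, and take the largest vertical gap between two CDFs as a rough similarity score. This is not Karsten's code; the function names and the choice of the maximum-gap measure are assumptions for illustration.

```python
from typing import List, Sequence, Tuple


def empirical_cdf(values: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (value, fraction of values <= value) points in ascending order."""
    ordered = sorted(values)
    n = len(ordered)
    return [(v, (i + 1) / n) for i, v in enumerate(ordered)]


def max_cdf_gap(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest vertical distance between the empirical CDFs of a and b."""
    def frac_le(values: Sequence[float], x: float) -> float:
        return sum(1 for v in values if v <= x) / len(values)

    points = sorted(set(a) | set(b))
    return max(abs(frac_le(a, x) - frac_le(b, x)) for x in points)
```

A gap near 0 would suggest two scanners see the network similarly; a gap near 1 would mean their distributions barely overlap.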
Let's start by specifying what tor directory authorities expect from the file format.
This format is already specified in Sections 2.4 and 3.4 of the bwauth spec itself: https://gitweb.torproject.org/torflow.git/tree/NetworkScanners/BwAuthority/R... https://gitweb.torproject.org/torflow.git/tree/NetworkScanners/BwAuthority/R...
(This output should not be confused with Section 1.6, which specifies the intermediate sub-process format before aggregating results).
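A small reader for such a file might look like the Python sketch below. The layout it assumes (a Unix timestamp on the first line, then one line per relay made of space-separated `key=value` pairs including `node_id` and `bw`) is my recollection and has not been checked against Sections 2.4 and 3.4, so treat it as an assumption and verify against the spec before relying on it. The sample fingerprints in the usage note are made up.

```python
from typing import Dict, Tuple


def parse_bw_file(text: str) -> Tuple[int, Dict[str, int]]:
    """Parse an assumed bandwidth-file layout into (timestamp, {node_id: bw}).

    Lines without both node_id and bw are skipped rather than treated as errors.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    timestamp = int(lines[0])
    relays: Dict[str, int] = {}
    for line in lines[1:]:
        fields = dict(item.split("=", 1) for item in line.split() if "=" in item)
        if "node_id" not in fields or "bw" not in fields:
            continue
        relays[fields["node_id"]] = int(fields["bw"])
    return timestamp, relays
```

For example, `parse_bw_file("1513728000\nnode_id=$AAAA bw=760 nick=relayA\n")` yields the timestamp and a one-entry mapping. Running the same reader over torflow and bwscanner output would give two mappings that can be compared relay by relay.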
Hi,
On 11/12/17 16:04, Mike Perry wrote:
https://trac.torproject.org/projects/tor/ticket/2394
I am not sure what repo they are in, though.
https://gitweb.torproject.org/metrics-tasks.git/tree/task-2394
Thanks, Iain.