On 29 Jan 2018, at 08:00, Florentin Rochet florentin.rochet@uclouvain.be wrote:
Hello,
On 28/01/18 11:52, teor wrote:
Hi,
I have some more questions:
Nice, thanks! I still have to answer your previous email and push an update to the proposal. I should do it this week, sorry for the late answers :)
See inline a few answers to your questions:
On 18 Jan 2018, at 11:03, teor teor2345@gmail.com wrote:
> Unanswered questions:
The Tor network has been experiencing excessive load on guards and middles since December 2017.
If I have correctly followed what was happening: around 1M tor2web clients appeared at OVH
Not just OVH, at least 3 different providers.
And not just Tor2web, either. There are onion services which are overloading the network as well, probably in response to these clients. The onion services are mostly overloading guard-weighted nodes.
and started to overload the network with circuit creation requests using the old and costly TAP handshake.
Not just TAP. The sheer number of entry connections, extend requests, and destroy cells is also creating overloads on some relays.
Tor2web clients make direct connections to the intro point and to the rendezvous point, right?
Yes.
And, looking into the code right now, it does not look like Tor2web clients make any distinction based on flags. So, basically, the Tor2web load is only weighted by consensus weight (bandwidth-weights have no impact) on the overall network (exits too).
This only applies if Tor2webRendezvousPoints is set. Otherwise, the nodes are middle-weighted.
Guess: shouldn't that be the reason why all exit logs are flooded with the message "[warn] Tried to establish rendezvous on non-OR circuit with purpose Acting as rendevous (pending)"? Those messages would be caused by tor2web clients picking exit relays as rendezvous nodes :/ I have seen them increase more and more since August 2017.
No, this is a different issue. Exit relays are allowed as rendezvous nodes.
So basically, I *think* we can drop the questions below because bandwidth-weights do not play any role in the excessive load that the network is handling with those tor2webs.
Guard weights are used by overloading onion services, and middle weights are used by overloading Tor2web clients.
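To illustrate the difference between the two cases discussed above (selection by consensus weight only, versus selection with position-specific bandwidth-weights applied), here is a minimal sketch. The relay names, consensus weights, and bandwidth-weight values are all made up for illustration; only the idea of scaling a relay's consensus weight by a per-position, per-flag factor out of 10000 follows how bandwidth-weights are generally applied. Relays with both Guard and Exit flags are left out to keep the sketch short.

```python
# Illustrative only: relay names, consensus weights, and the bandwidth-weight
# values below are invented, not taken from a real consensus.
WEIGHT_SCALE = 10000

relays = [
    {"name": "relayA", "flags": {"Guard"}, "cw": 6000},
    {"name": "relayB", "flags": {"Exit"}, "cw": 3000},
    {"name": "relayC", "flags": set(), "cw": 1000},
]

# Hypothetical weights for the middle position, keyed by flag class.
middle_weights = {"guard": 2000, "exit": 0, "plain": 10000}


def flag_class(relay):
    """Map a relay to a simplified flag class (Guard+Exit not modelled)."""
    if "Guard" in relay["flags"]:
        return "guard"
    if "Exit" in relay["flags"]:
        return "exit"
    return "plain"


def selection_probabilities(relays, position_weights=None):
    """Return each relay's selection probability.

    With position_weights=None, only consensus weight counts.
    Otherwise each consensus weight is scaled by the position weight
    for the relay's flag class.
    """
    scores = {}
    for relay in relays:
        if position_weights is None:
            factor = 1.0
        else:
            factor = position_weights[flag_class(relay)] / WEIGHT_SCALE
        scores[relay["name"]] = relay["cw"] * factor
    total = sum(scores.values())
    return {name: score / total for name, score in scores.items()}


by_consensus_weight = selection_probabilities(relays)
as_middle = selection_probabilities(relays, middle_weights)
```

With these invented numbers, the exit relay receives 30% of the load when only consensus weight is used, and none when the hypothetical middle weights are applied.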
Does the waterfilling proposal make excessive load on guards worse, by allocating more guard weight to lower capacity relays? Is the extra security worth the increased risk of failure?
We want to design a network that can handle different kinds of extra load. So these questions are important, even if they don't apply right now.
Does the waterfilling proposal make excessive load on middles better, by allocating more middle weight to higher capacity relays? Is there a cascading failure mode, where excess middle weight overwhelms our top relays one by one? (It seems unlikely.)
I'm going to re-ask these questions, in light of the extra middle load from Tor2web clients:
Does the waterfilling proposal make excessive load on middles worse, by allocating more middle weight to higher capacity relays?
In particular, connections are limited by file descriptors, and file descriptor limits typically don't scale with the bandwidth of the relay. As far as I can tell, waterfilling would have directed additional Tor2web traffic to large guards. It would have brought down my guards faster, and made it much harder for me to keep them up.
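To make the concern concrete, here is a toy sketch of a generic water-filling split. It is not the proposal's actual computation of bandwidth-weights: it assumes each guard contributes guard capacity up to a common "water level", and that everything above the level is used in the middle position. The function names and the example bandwidths are hypothetical.

```python
def water_level(bandwidths, guard_total, tol=1e-9):
    """Find level N such that sum(min(bw, N)) == guard_total, by bisection."""
    if not bandwidths:
        raise ValueError("need at least one relay")
    if not 0 <= guard_total <= sum(bandwidths):
        raise ValueError("guard_total must lie between 0 and total bandwidth")
    lo, hi = 0.0, float(max(bandwidths))
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if sum(min(bw, mid) for bw in bandwidths) < guard_total:
            lo = mid  # level too low: not enough guard capacity yet
        else:
            hi = mid  # level high enough: try lowering it
    return hi


def waterfill_split(bandwidths, guard_total):
    """Return a (guard_part, middle_part) pair for every relay."""
    level = water_level(bandwidths, guard_total)
    return [(min(bw, level), bw - min(bw, level)) for bw in bandwidths]


# Hypothetical guards with bandwidths 100, 50 and 10, needing 90 units
# of guard capacity in total. The water level solves 2N + 10 = 90.
split = waterfill_split([100, 50, 10], guard_total=90)
```

In this example the level settles at 40, so the largest relay keeps 60 units for the middle position while the smallest keeps none. Under these assumptions, middle-weighted load such as the Tor2web connections concentrates on the largest guards, independent of their file descriptor limits.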
If we had implemented waterfilling before this attack, would it have led to cascading failures on our top guards? They would have been carrying significantly more middle load, and mine barely managed to cope.
Can you redesign the proposal so there is some limit on the extra middle load assigned to a guard? Or does this ruin the security properties?
Is there a compelling argument for security over network robustness?
I also have another practical question:
We struggle to have time to maintain the current bandwidth authority system.
Is it a good idea to make it more complicated?
Hm, I don't see how Waterfilling plays any role with torflow or bwscanner? I mean, there is still this feedback loop thing but it has no impact on the design of the current torflow or bwscanner?
I can't really say. I look forward to your explanation of the feedback loop.
Could you be more specific about your concerns with the bandwidth authorities and this proposal?
It takes time and effort from Tor people to integrate and maintain the code and monitoring for a new proposal like this one.
We will need to take extra time on this proposal, because we already need more monitoring for the current bandwidth authority system. And only then would we have time to build monitoring specific to this proposal.
Also, when we change bandwidth measurement or allocation, we need to change one thing at a time, and then monitor the change. So depending on our priorities, this proposal may need to wait until after we implement and monitor other urgent fixes.
Who will maintain the new code we add to Tor to implement waterfilling?
I would volunteer for that.
Typically, experienced Core Tor team members review and maintain code.
And there's still a lot of development and testing work to be done before the code is ready to merge. Are you able to do this development?
How much help will you need to write a new consensus method? How much help will you need to write unit tests? (This help will come from existing team members.)
Does your current code pass:
* make check
* make test-network-all
* in particular, any new consensus method must pass the "mixed" network, with an unpatched Tor version in your path as "tor-stable"
Who will build the analysis tools to show that waterfilling benefits the network?
Volunteers or master students. I can definitely suggest this topic in my university.
Typically, experienced Tor Metrics team members write, review, and maintain monitoring systems. And they don't have a lot of extra capacity right now.
Even if students do this task, they would need help from existing team members.
Do the benefits of waterfilling justify this extra effort?
Question for the other Tor devs :) I am definitely biased towards the "yes"
It seems plausible, but I don't feel I have seen a compelling enough argument to prioritise it above fixing bandwidth authorities.
At the moment, reasonably fast guards in Eastern North America and Western Europe are overloaded with client traffic. And guards in the rest of the world are under-loaded. Reducing this bias is something we need to do.
And this proposal gets us better security if we fix this geographical bias first. Otherwise, adversaries can simply pick a location that massively increases their consensus weight, and get lots of client traffic.
And even if they do, should we focus on getting the bandwidth authorities in a maintainable state, before adding new features? (I just gave similar advice to another developer who has some great ideas about improving bandwidth measurement.)
Bandwidth-weights and measurements (consensus weights) are two different things that solve 2 different problems. So, we can work independently on improving measurements (like what is currently done with bwscanner) and improving Tor's balancing (bandwidth-weights) with this proposal.
I don't think this is realistic. There is always contention for shared resources.
Integrating and testing new code, and monitoring its effects, will take effort from the teams I mentioned above. This takes away from the urgent work of fixing the bandwidth authority system. Which also takes effort from the Core Tor and Metrics teams.
> What about the feedback loop between this new allocation system
> and the bandwidth authorities?
> I am sorry, I don't really understand why a feedback loop is needed. Measuring bandwidth and producing bandwidth-weights seems orthogonal to me.
You do not need to add a feedback loop, one already exists:
1. Consensus weights on guards and middles change
2. Client use of guards and middles changes
3. Bandwidth authority measurements of guards and middles change
4. Repeat from 1
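Whether a loop like the one above settles down or amplifies changes depends on how strongly each round reacts to the previous one. The following is a deliberately simplified toy model, not a model of the real bandwidth authorities: it assumes a single relay, client load equal to the relay's weight, and a hypothetical `gain` parameter describing how strongly the measured spare capacity feeds back into the next weight.

```python
def simulate_feedback(initial_weight, capacity, gain, rounds=20):
    """Iterate weight -> client load -> measurement -> new weight.

    Toy assumptions (not taken from Tor): load equals weight, the
    measurement is the spare capacity, and the weight is nudged by
    gain * spare capacity each round. Weights may go negative when
    the loop diverges, which is unphysical but shows the instability.
    """
    weight = initial_weight
    history = [weight]
    for _ in range(rounds):
        load = weight                      # clients follow the weight
        measured_spare = capacity - load   # measurement drops under load
        weight = weight + gain * measured_spare
        history.append(weight)
    return history


# The error (capacity - weight) is multiplied by (1 - gain) each round,
# so the toy loop converges only when 0 < gain < 2.
damped = simulate_feedback(initial_weight=0.0, capacity=100.0, gain=0.5)
unstable = simulate_feedback(initial_weight=0.0, capacity=100.0, gain=2.5)
```

In this toy model the `damped` run approaches the capacity, while the `unstable` run oscillates with growing amplitude. Whether the real loop behaves more like one or the other is exactly the open question.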
My question is:
How does this existing feedback loop affect your proposal? Does it increase or reduce the size of the guard and middle weight changes?
I have added those questions to the proposal. This looks difficult to know.
Can shadow simulate this?
I am still interested in this feedback loop. If it fails to converge, the system will break down.
Yup. Going to answer this on your previous email.
T