On 04/07/2019 12:46, George Kadianakis wrote:
> David Goulet <dgoulet@torproject.org> writes:
>> Overall, this rate limit feature does two things:
>> Reduce the overall network load.
>> Soaking the introduction requests at the intro point helps avoid the service creating pointless rendezvous circuits, which makes it "less" of an amplification attack.
> I think it would be really useful to get a baseline of how much we "Reduce the overall network load" here, given that this is the reason we are doing this.
> That is, it would be great to get a graph of how many rendezvous circuits and/or how much bandwidth attackers can induce on the network right now by attacking a service, and the same numbers if we deploy this feature with different parameters.
If you're going to do this comparison, I wonder if it would be worth including a third option in the comparison: dropping excess INTRODUCE2 cells at the service rather than NACKing them at the intro point.
In terms of network load, it seems like this would fall somewhere between the status quo and the intro point rate-limiting mechanism: excess INTRODUCE2 cells would be relayed from the intro point to the service (thus higher network load than intro point rate-limiting), but they wouldn't cause rendezvous circuits to be built (thus lower network load than the status quo).
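To make the service-side alternative concrete, here's a minimal sketch of what I have in mind, using a plain token bucket; the names (`handle_introduce2`, `build_rendezvous_circuit`) and the parameters are illustrative, not Tor's actual code or tunables:

```python
import time

class TokenBucket:
    """Illustrative token bucket: refills `rate` tokens per second, up to `burst`."""
    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)
        self.last = time.monotonic()

    def allow(self):
        # Refill based on elapsed time, then try to spend one token.
        now = time.monotonic()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

def handle_introduce2(cell, bucket, build_rendezvous_circuit):
    # Excess cells are silently dropped at the service: no rendezvous
    # circuit is built, and (unlike the intro-point mechanism) no NACK
    # reaches the client, so a legitimate client would see a rendezvous
    # timeout rather than an intro point failure.
    if bucket.allow():
        build_rendezvous_circuit(cell)
```

The point is just that the decision happens entirely at the service, after the cell has already crossed the network, which is why this saves less load than refusing it at the intro point.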
Unlike with intro point rate-limiting, a backlog of INTRODUCE2 cells would build up in the intro circuits if the attacker sent cells faster than the service could read and discard them, so I'd expect availability to be affected for some time after the attack stopped, until the service had drained the backlog.
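To put rough numbers on that drain time: if the attacker sends at a cells/s, the service reads at s cells/s, and the attack lasts T seconds, the backlog is (a - s) * T cells and takes (a - s) * T / s seconds to clear. The rates below are made up purely for illustration, and this simplification assumes the circuits buffer every excess cell (in practice flow control would bound the queue):

```python
def drain_time(attack_rate, service_rate, attack_duration):
    """Seconds needed to clear the INTRODUCE2 backlog after the attack stops.

    attack_rate, service_rate in cells/s; attack_duration in seconds.
    """
    backlog = max(0.0, (attack_rate - service_rate) * attack_duration)
    return backlog / service_rate

# Hypothetical numbers: attacker sends 500 cells/s, service reads 100 cells/s,
# attack lasts 60 s -> backlog of 24000 cells, 240 s (4 minutes) to drain.
print(drain_time(500, 100, 60))  # -> 240.0
```

So even a short burst at a few times the service's processing rate would leave legitimate clients queued behind attack traffic for several times the attack's duration.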
Excess INTRODUCE2 cells would be dropped rather than NACKed, so legitimate clients would see a rendezvous timeout rather than an intro point failure; I'm not sure if that's good or bad.
On the other hand there would be a couple of advantages vs intro point rate-limiting: services could deploy the mechanism immediately without waiting for intro points to upgrade, and services could adjust their rate-limiting parameters quickly in response to local conditions (e.g. CPU load), without needing to define consensus parameters or a way for services to send custom parameters to their intro points.
Previously I'd assumed these advantages would be outweighed by the better network load reduction of intro point rate-limiting, but if there's an opportunity to measure how much network load is actually saved by each mechanism then maybe it's worth including this mechanism in the evaluation to make sure that's true?
I may have missed parts of the discussion, so apologies if this has already been discussed and ruled out.
Cheers,
Michael