[tor-dev] Denial of service defences for onion services
desnacked at riseup.net
Tue Apr 30 17:12:09 UTC 2019
This thread summarizes and brainstorms denial of service defences for onion
services, following an in-depth discussion with David Goulet.
We've been thinking about denial of service defences for onion services
lately. This has been a recurrent topic that crops up every once in a
while: the last time we had to tackle this issue was back in early 2018, when
we had to design a DoS mitigation subsystem because the network was crumbling
Unfortunately, while the DoS mitigation subsystem improved the health of the
network and stopped the DoS attacks back then, it did not address the total
space of possible attacks, and onion services and the network are still open to
various attacks. The main DoS attack right now is the naive attack of flooding
the service with too many introduction requests, and this is the attack that
this post is gonna be dealing with.
We don't like DoS attacks because they cause two issues to Tor:
a) They damage the health of the Tor network impacting every user
b) They kill availability of legitimate onion services.
In this thread we will handle these two issues independently, as there is no
single solution that improves both areas at once. We have some pretty good
ideas on (a), but we would appreciate ideas on (b), so feel free to give us
== a) Minimizing the damage to the network caused by DoS attacks:
Most of the damage caused during DoS attacks is from the circuits created by
the attacker to introduce/rendezvous to the victim onion service, and also
by the circuits created by the victim onion service as it tries to
rendezvous with all those clients. An attacker can literally create tens of
thousands of introduction circuits in less than a minute, which get
amplified by the service launching that many rendezvous circuits. Not good.
Here are a few ways to reduce the damage to the network:
== 1) Rate limiting introduction circuits
There should be a way to rate-limit introductions so that services do not
get overwhelmed. There are various places where we can rate-limit: we
could rate-limit on the guard-layer, or on the intro-point layer or on
the service-layer.
We have already attempted rate-limiting on the guard-layer with
#24902, but it's hard to go deeper there because the guard does not know
if the circuit is a DoS attacker, or a busy onion service, or 150 Tor
users in an airport. We also think that rate-limiting on the
service-layer won't do much good since that's too far down the circuit,
and we are trying to reduce the operations it has to do so that it
doesn't get overwhelmed (see #15463 for various queue-management
approaches for rate-limiting on the service side).
So we've been thinking of rate-limiting on the introduction point layer,
since it's a nice soaking point that does not do much right now. See
#15516 (comment 28) for a concrete proposal by arma which results in far
less damage to the network (since evil traffic does not get carried
through to the service-side introduction circuit, and no extra rendezvous
circuits get launched), and also a swifter way for legit clients to know
that an onion-service circuit won't work.
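
To make intro-point rate limiting concrete, here is a minimal sketch using a
generic token bucket. This is not the proposal from #15516; the class name,
the function name, and the rate and burst values are all hypothetical and
only illustrate how an introduction point could drop excess INTRODUCE1 cells
before they reach the service.

```python
class TokenBucket:
    """Generic token bucket. Rate and burst values are illustrative assumptions."""

    def __init__(self, rate_per_sec, burst):
        self.rate = float(rate_per_sec)   # tokens added per second
        self.burst = float(burst)         # maximum number of stored tokens
        self.tokens = float(burst)        # start with a full bucket
        self.last = 0.0                   # time of the previous check

    def allow(self, now):
        """Return True if one introduction may be relayed at time `now`."""
        # Refill in proportion to the elapsed time, capped at the burst size.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def relay_introductions(bucket, arrival_times):
    """Count how many INTRODUCE1 arrivals would be forwarded or rejected.

    Hypothetical helper: a rejected cell is never relayed to the service,
    so the service launches no rendezvous circuit for it.
    """
    forwarded = rejected = 0
    for t in arrival_times:
        if bucket.allow(t):
            forwarded += 1
        else:
            rejected += 1
    return forwarded, rejected
```

With a rate of 10 per second and a burst of 20, a flood of 1000 introductions
arriving at the same instant results in 20 forwarded and 980 rejected at the
intro point, so the service only ever sees the first 20.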
== 2) Stop needless circuit rotation on service-side
Right now, services will rotate their introduction circuits after a
certain number of introductions (#26294). This means that during an
attack, the service not only needs to handle thousands of fake
introduction circuits, but also continuously tear down and recreate
introduction circuits and publish new descriptors. See comment 8 on that
ticket for a short-term proposal on how to improve the situation here,
by not continuously rotating introduction points.
== 3) Optimize CPU performance on the service-side
Right now, onion services during an attack are actually CPU bound. See
#30221 for various improvements we can do to improve the performance of
services. However, improving CPU performance might have the opposite effect,
since processing cells quicker means that the service will make even more
== 4) Make sure attackers don't take shortcuts around the protocol
We should make sure that attackers don't take shortcuts around the Tor
protocol to launch their attacks. Examples here involve requiring a
proof-of-rendezvous from clients (#25066), and not allowing single-hop
proxies to do introductions (#22689).
The above suggestions (maybe in priority order) are ways we can reduce the
damage dealt to the network by DoS attackers. But that still does not make
DoS attacks less effective. So here follows the section about improving
== b) Improve service availability during DoS attacks
Unfortunately, it's really hard to accurately stop DoS attacks in the Tor
protocol. There is just no good way to distinguish between innocent clients
trying to access content, and a bad actor trying to disable an onion service.
Here is the main way we've thought of addressing this issue:
== 1) Binding the application-layer with the Tor introduction-layer
We think that the Tor protocol layer might not be the right place for
handling DoS attacks. There are literally million-dollar companies trying
hard to tackle this issue on the application-layer, where it's easier
since you can do machine learning, give out captchas, zone out users,
etc. And that's why we think that the solution to this issue lies on the
application-layer and not on the Tor protocol layer.
In particular, a plausible solution here might involve the client
embedding application-layer information (e.g. a username/password) in its
INTRODUCE1 cell, which then gets passed to the service. The service can
then check whether the given username/password should be allowed to
connect (see "rendezvous approver" concept at #16059), and allow or reject
the connection as it wishes. This way onion service operators can have
complicated application-layer software that analyzes the activity of users
and decides whether users should be allowed in or not (based on the number
of introductions, or their application-layer (web) activity).
[Diagram: an INTRO2 cell travels from the Tor network to the Tor HS, which
consults the approver; a rendezvous circuit is launched only if approved.]
We think that this is a solution that could allow onion services to
continue existing under high-load scenarios, since no rendezvous circuits
would be established during DoS scenarios (and we know that rendezvous
circuits are what cause the most CPU/network/availability damage).
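
The approver flow described above can be sketched roughly as follows. No such
software exists, so everything here is an assumption: the class name, the
payload format (a dict carrying a username and a password), the
per-user introduction limit, and the callback that stands in for launching a
rendezvous circuit.

```python
import hashlib
import hmac


def _digest(password, salt):
    # Derive a password digest so plaintext passwords are not stored.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 10_000)


class RendezvousApprover:
    """Hypothetical approver: checks credentials and a per-user intro budget."""

    def __init__(self, max_intros_per_user):
        self.max_intros = max_intros_per_user
        self.users = {}          # username -> (salt, digest)
        self.intro_counts = {}   # username -> approved-credential intro count

    def add_user(self, username, password, salt=b"demo-salt"):
        # A fixed salt keeps the sketch short; a real system would use a
        # random salt per user.
        self.users[username] = (salt, _digest(password, salt))

    def approve(self, username, password):
        record = self.users.get(username)
        if record is None:
            return False
        salt, expected = record
        if not hmac.compare_digest(expected, _digest(password, salt)):
            return False
        # Valid credentials: count the introduction against the user's budget.
        count = self.intro_counts.get(username, 0) + 1
        self.intro_counts[username] = count
        return count <= self.max_intros


def handle_introduction(approver, payload, launch_rendezvous):
    """Launch a rendezvous circuit only if the approver accepts the payload."""
    ok = approver.approve(payload.get("username", ""), payload.get("password", ""))
    if ok:
        launch_rendezvous()
    return ok
```

The point of the design is that a rejected introduction costs the service one
credential check and no rendezvous circuit. Richer policies (for example,
based on application-layer activity) would replace the simple counter.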
However, this is a very complicated solution from an engineering
perspective given that it requires changes on the client-side (to enhance
INTRO1 cells with application-layer data), and also involves various
enhancements on the service-side (various control port commands to
interact with the (nonexistent) "rendezvous approver" software, which in
turn needs to interact with other application-layer software, e.g. SQL
databases to manage membership).
There are also serious UX concerns about how this would look on the
client-side. Also, how does this interact with client auth? And how does
this interact with intro-point-level rate limiting proposed above
(onions should be given the option to disable intro-layer rate limiting)?
How is this related to #17254?
All in all, we feel like we have pretty good options for reducing the
damage that DoS attacks cause on our network, but we are still lacking
easy and practical solutions for ensuring availability of onion services
that are under DoS. For the next months, we plan to focus on reducing
the damage on the network, since the damage on the network has a
cumulative effect as circuits fail and get endlessly retried, where
nothing ends up working right. At the same time, we will be thinking of
good solutions for keeping a high availability on services that receive
We would love your feedback and suggestions.