Hi folks,
As part of the hackweek projects ( https://gitlab.torproject.org/tpo/community/hackweek/ ), some of us are thinking about simple tweaks we can do to tune the network to better handle this month's traffic overload.
The long-term answer is to try out proposal 327: https://gitweb.torproject.org/torspec.git/tree/proposals/327-pow-over-intro.... because we think a lot of the overload has to do with people sending way too many intro cells to some onion services, and giving the onion services ways to defend themselves is the only real answer.
But while we're thinking about implementing that proposal, one of our earlier steps is to set the guard-n-primary-guards-to-use consensus parameter from 1 to 2.
Now that it's taken effect (you can watch the votes at https://consensus-health.torproject.org/#consensusparams ), this change means that clients will now choose between two guard relays by default (rather than just one) when building circuits.
This is potentially a big deal, since it puts us into a different point in the performance vs safety tradeoff space.
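For anyone who wants to check which value a given consensus carries, here is a minimal Python sketch. It assumes the consensus document contains a line starting with "params " followed by space-separated key=value pairs with integer values; the function name and the sample text are made up for illustration and are not part of Tor.

```python
def parse_consensus_params(consensus_text: str) -> dict[str, int]:
    """Return the key=value pairs from the first 'params' line, or {} if none."""
    for line in consensus_text.splitlines():
        if line.startswith("params "):
            params = {}
            for item in line.split()[1:]:
                key, sep, value = item.partition("=")
                if sep:  # skip anything that is not key=value
                    params[key] = int(value)
            return params
    return {}


# Hypothetical, heavily shortened consensus excerpt (illustration only).
SAMPLE_CONSENSUS = """network-status-version 3
vote-status consensus
params guard-n-primary-guards-to-use=2
"""

if __name__ == "__main__":
    params = parse_consensus_params(SAMPLE_CONSENSUS)
    print(params.get("guard-n-primary-guards-to-use"))
```

If the parameter is missing from the params line, the dictionary simply has no such key; this sketch does not try to guess what default a client would use in that case.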
Here is some reading for why we originally moved down to 1 guard by default:
https://blog.torproject.org/improving-tors-anonymity-changing-guard-paramete...
https://www-users.cse.umn.edu/~hoppernj/single_guard.pdf
But on the theory that some guards are way overloaded right now and some aren't, giving clients two bites at the apple might make a dramatic improvement in terms of reliable and consistent performance.
There is also some argument in favor of using two guards anyway. One reason (explained more in proposal 291) is that there are already some edge cases where clients use their second guard. And also, in the glorious future, we will want to be using multiple guards because we have switched to the multi-path Conflux design (proposal 329) -- though we're not there yet.
So: I am giving you all here some early warning, in case you see anything odd on the network when we make this change. Let us know if you do. :)
--Roger
On Mon, Jun 27, 2022 at 10:58:42PM -0400, Roger Dingledine wrote:
> this change means that clients will now choose between two guard relays by default (rather than just one) when building circuits.
One of the people on the forum asked if this change applies to bridge users (including pluggable transport users) too, and I believe the answer is yes-it-does.
So if you have two or more bridges configured, with this change your Tor will now pick two of them to use for your circuits.
More details here: https://forum.torproject.net/t/tor-is-much-slower-latterly-than-it-used-to-b...
--Roger
On Mon, Jun 27, 2022 at 10:58:42PM -0400, Roger Dingledine wrote:
> So: I am giving you all here some early warning, in case you see anything odd on the network when we make this change. Let us know if you do. :)
So far so good. Performance looks like it improved.
The "intro2 cell" overload also stopped over the weekend (yay): https://metrics.torproject.org/hidserv-rend-relayed-cells.html
But it was replaced with a new overload (boo), from way too many Tor clients running at a few cloud providers. The main result for relay operators is greatly increased file descriptor use, with a few IP addresses or /24's generating the majority of the new connections.
If your relay is bumping up against its file descriptor limits, or otherwise suffering (e.g. more memory usage than desired), one reasonable option for you might be to set some iptables-level connection limiting. More details in this ticket: https://gitlab.torproject.org/tpo/core/tor/-/issues/40636#note_2818529
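To see whether a few addresses or /24s dominate the connections on your own relay, something like the following Python sketch could help. It assumes you have already collected the remote IPv4 addresses of your current connections into a list (how you gather them is up to you); the function names and the sample addresses, taken from documentation ranges, are hypothetical.

```python
from collections import Counter
import ipaddress


def count_by_prefix(addresses, prefix_len=24):
    """Count connections per network of the given prefix length."""
    counts = Counter()
    for addr in addresses:
        # strict=False lets us pass a host address and get its enclosing network.
        net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        counts[str(net)] += 1
    return counts


def top_share(counts, n=3):
    """Fraction of all connections that come from the n busiest networks."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(c for _, c in counts.most_common(n)) / total


# Made-up sample: 100 connections from three documentation-range networks.
SAMPLE_ADDRESSES = (
    ["192.0.2.10"] * 40
    + ["192.0.2.11"] * 35
    + ["198.51.100.5"] * 20
    + ["203.0.113.9"] * 5
)

if __name__ == "__main__":
    counts = count_by_prefix(SAMPLE_ADDRESSES)
    for network, count in counts.most_common():
        print(network, count)
    print(top_share(counts, n=1))
```

Calling count_by_prefix with prefix_len=32 gives the per-address view instead of the per-/24 view.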
Some of the dir auths are suffering from this connection overload too -- longclaw and bastet in particular but I think all of us are feeling it.
As I mention in that ticket, ultimately it seems to me that we'll need to come up with a guide of recommended iptables rules for big relay operators to run alongside their Tor. It wouldn't be mandatory (Tor has some adequate-ish defenses here at the application layer) but it seems clear that it would do the job better at scale.
--Roger
On 2022-07-06 21:19, Roger Dingledine wrote:
> But it was replaced with a new overload (boo), from way too many Tor clients running at a few cloud providers. The main result for relay operators is greatly increased file descriptor use, with a few IP addresses or /24's generating the majority of the new connections.
> If your relay is bumping up against its file descriptor limits, or otherwise suffering (e.g. more memory usage than desired), one reasonable option for you might be to set some iptables-level connection limiting. More details in this ticket: https://gitlab.torproject.org/tpo/core/tor/-/issues/40636#note_2818529
I'm running the small non-exit 8F6A78B1EA917F2BF221E87D14361C050A70CCC3.
Since mid-May the relay has been under heavy load. I had to limit my bandwidth using "RelayBandwidthRate" in torrc to about 90% of my real BW to be able to use the internet for myself. This solved my laggy internet.
Since the 2nd of July the number of (non torrelay) tor connections to my relay skyrocketed from about 3500 to 20000. A week ago I implemented connection limits per Toralf's post:
    iptables -A INPUT -p tcp --destination-port 443 -m connlimit --connlimit-mask 32 --connlimit-above 30 -j DROP
This reduced the number of connections to about 10000.
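As a rough mental model of what that rule does, here is a small Python sketch. It is not how netfilter is implemented; it assumes the connection being evaluated counts toward the total, so that with "--connlimit-above 30" a source can hold at most 30 concurrent connections, and that "--connlimit-mask 32" means each individual IPv4 address is counted separately. The class and method names are hypothetical.

```python
from collections import Counter
import ipaddress


class ConnLimitModel:
    """Toy model of per-source concurrent connection limiting."""

    def __init__(self, limit=30, prefix_len=32):
        self.limit = limit            # like --connlimit-above
        self.prefix_len = prefix_len  # like --connlimit-mask
        self.open = Counter()         # open connections per masked source

    def _key(self, addr):
        # Group sources by the network of the configured prefix length.
        net = ipaddress.ip_network(f"{addr}/{self.prefix_len}", strict=False)
        return str(net)

    def try_open(self, addr):
        """Return True if the new connection is accepted, False if dropped."""
        key = self._key(addr)
        # Assumption: the new connection is included in the count.
        if self.open[key] + 1 > self.limit:
            return False
        self.open[key] += 1
        return True

    def close(self, addr):
        """Record that one connection from this source has ended."""
        key = self._key(addr)
        if self.open[key] > 0:
            self.open[key] -= 1


if __name__ == "__main__":
    model = ConnLimitModel()
    accepted = sum(model.try_open("192.0.2.7") for _ in range(35))
    print(accepted)  # connections accepted out of 35 attempts from one address
```

In this model, a lower limit or a shorter mask (for example 24, to count a whole /24 together) squeezes the heavy sources harder, at the cost of also affecting legitimate clients that share an address or network.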
I just now noticed that the relay is flagged as overloaded. What to do? Decrease the connection limit from 32 to .. what? Decrease my RelayBandwidthRate even more? Seems like giving in to the DoSer.
Logfile:
Jul 10 02:58:39.000 [warn] Your computer is too slow to handle this many circuit creation requests! Please consider using the MaxAdvertisedBandwidth config option or choosing a more restricted exit policy. [8169 similar message(s) suppressed in last 14820 seconds]
Jul 10 03:32:28.000 [notice] General overload -> Ntor dropped (220414) fraction 5.8677% is above threshold of 0.5000%
Metrics port:
tor_relay_load_onionskins_total{type="tap",action="processed"} 697956
tor_relay_load_onionskins_total{type="tap",action="dropped"} 0
tor_relay_load_onionskins_total{type="fast",action="processed"} 0
tor_relay_load_onionskins_total{type="fast",action="dropped"} 0
tor_relay_load_onionskins_total{type="ntor",action="processed"} 503071860
tor_relay_load_onionskins_total{type="ntor",action="dropped"} 323369
tor_relay_load_onionskins_total{type="ntor_v3",action="processed"} 503071860
tor_relay_load_onionskins_total{type="ntor_v3",action="dropped"} 323369
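For comparing the counters above with the percentage in the log line, a minimal Python sketch follows. It assumes the metrics text has one metric per line in the name{labels} value form shown here, and it computes dropped / (processed + dropped) over the counters' whole lifetime. That is my own choice of ratio, not necessarily what tor computes; the log line presumably covers a shorter recent window, which would explain why it reports a much higher fraction. The function names are hypothetical.

```python
import re

METRIC_LINE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\d+)$'
)
LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_onionskin_counters(text):
    """Map (type, action) -> counter value for the onionskin metric lines."""
    counters = {}
    for line in text.splitlines():
        match = METRIC_LINE.match(line.strip())
        if not match or match.group("name") != "tor_relay_load_onionskins_total":
            continue
        labels = dict(LABEL.findall(match.group("labels")))
        key = (labels.get("type"), labels.get("action"))
        counters[key] = int(match.group("value"))
    return counters


def dropped_fraction(counters, handshake):
    """Lifetime dropped / (processed + dropped); 0.0 if nothing was seen."""
    processed = counters.get((handshake, "processed"), 0)
    dropped = counters.get((handshake, "dropped"), 0)
    total = processed + dropped
    return dropped / total if total else 0.0


# The tap and ntor counters quoted in this message.
SAMPLE_METRICS = """\
tor_relay_load_onionskins_total{type="tap",action="processed"} 697956
tor_relay_load_onionskins_total{type="tap",action="dropped"} 0
tor_relay_load_onionskins_total{type="ntor",action="processed"} 503071860
tor_relay_load_onionskins_total{type="ntor",action="dropped"} 323369
"""

if __name__ == "__main__":
    counters = parse_onionskin_counters(SAMPLE_METRICS)
    print(f"{dropped_fraction(counters, 'ntor'):.4%}")
```

Sampling the counters twice and taking the difference between the two readings would give a ratio for just that interval, which is probably closer to what an overload check cares about.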
Logforme:
> On 2022-07-06 21:19, Roger Dingledine wrote:
>> But it was replaced with a new overload (boo), from way too many Tor clients running at a few cloud providers. The main result for relay operators is greatly increased file descriptor use, with a few IP addresses or /24's generating the majority of the new connections.
>> If your relay is bumping up against its file descriptor limits, or otherwise suffering (e.g. more memory usage than desired), one reasonable option for you might be to set some iptables-level connection limiting. More details in this ticket: https://gitlab.torproject.org/tpo/core/tor/-/issues/40636#note_2818529
> I'm running the small non-exit 8F6A78B1EA917F2BF221E87D14361C050A70CCC3.
> Since mid-May the relay has been under heavy load. I had to limit my bandwidth using "RelayBandwidthRate" in torrc to about 90% of my real BW to be able to use the internet for myself. This solved my laggy internet.
> Since the 2nd of July the number of (non torrelay) tor connections to my relay skyrocketed from about 3500 to 20000. A week ago I implemented connection limits per Toralf's post:
>     iptables -A INPUT -p tcp --destination-port 443 -m connlimit --connlimit-mask 32 --connlimit-above 30 -j DROP
> This reduced the number of connections to about 10000.
> I just now noticed that the relay is flagged as overloaded. What to do? Decrease the connection limit from 32 to .. what? Decrease my RelayBandwidthRate even more? Seems like giving in to the DoSer.
Seems the overload on your relay is gone again? We've seen a large spike in overloaded relays on the weekend but so far our indicators show this has been a temporary issue and not sustained overload.
Georg
[snip]
On 7/10/22 22:28, Logforme wrote:
> A week ago I implemented connection limits per Toralf's post:
>     iptables -A INPUT -p tcp --destination-port 443 -m connlimit --connlimit-mask 32 --connlimit-above 30 -j DROP
> This reduced the number of connections to about 10000.
> I just now noticed that the relay is flagged as overloaded. What to do? Decrease the connection limit from 32 to .. what? Decrease my RelayBandwidthRate even more? Seems like giving in to the DoSer.
There are still about 200-300 VPS systems DDoS'ing my 2 Tor relays. The iptables rule halves the pressure. I could nearly fully stop the DDoS by using [1].
[1] https://github.com/toralf/torutils/blob/master/ddos-inbound.sh
tor-relays@lists.torproject.org