Chris tor@wcbsecurity.com wrote:
<html> <head> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> </head> <body> <p><br> </p> <div class="moz-cite-prefix">On 11/10/2022 2:38 AM, Scott Bennett wrote:<br> </div> <blockquote type="cite" cite="mid:202211100738.2AA7cw7d026293@sdf.org"> <pre class="moz-quote-pre" wrap="">Toralf Förster <a class="moz-txt-link-rfc2396E" href="mailto:toralf.foerster@gmx.de"><toralf.foerster@gmx.de></a> wrote:
</pre> <blockquote type="cite"> <pre class="moz-quote-pre" wrap="">On 11/8/22 10:57, Chris wrote: </pre> <blockquote type="cite"> <pre class="moz-quote-pre" wrap="">The main reason is that a simple SYN flood can quickly fill up your conntrack table and then legitimate packets are quietly dropped and you won't see any problems thinking everything is perfect with your server unless you dig into your system logs. </pre> </blockquote> <pre class="moz-quote-pre" wrap=""> Hhm, my system log doesn't show any problems, maybe due to (or regardless of?): CONFIG_SYN_COOKIES=y ?
I surmise that the above is a LINUXism that is approximately equivalent to a pf rule using synproxy.
</pre> </blockquote> <pre class="moz-quote-pre" wrap=""> On FreeBSD 12.3 I use pf and have gone back to using synproxy on the "pass in" statements for the ORPort and DirPort, but I doubt it has actually made any difference </pre>
I should clarify my statement above: the SYN packets still have to be received from my ISP before the rule can be applied, so yes, a SYN flood can still tie up my Internet connection, but that does not appear to be the kind of attack that my relay was experiencing. Specifying synproxy on the "pass in" rules for tor means that the kernel simply drops any pending connection that fails to complete the TCP handshake (SYN, SYN-ACK, ACK) within a short time instead of passing it on to tor to deal with; IOW, no incoming connections are passed to tor unless they complete that handshake first.

The second reason I made that statement was that all the attacks I have seen in recent months have tied up my inbound (and sometimes outbound) data capacity for some time, and the next set of heartbeat messages from tor shows an increase in INTRODUCE2 rejections of 2,000 to 3,000 or occasionally more. I suspect the "occasionally more" cases occur when two of the bot attacks hit my relay at the same or overlapping times. All of the above was true before I began using synproxy again and appears to be the case still.

If you have seen SYN flood attacks, then that is grounds enough for me to leave synproxy in the rules for tor indefinitely. The cost to the system of using synproxy is too small to be detected, but the potential savings to tor appear to be significant.
</blockquote> <p><font size="-1"><font face="Arial">The quote about SYN Flood is actually from my post which went only to toralf and wasn't displayed on the group. My bad. To explain further, I didn't say the current attack includes SYN floods, what I meant was
Ah. I see.
whenever we have some conntrack rules in our iptables, it's prudent to have some rate limiting rules before it, because if the attacker knows we rely on conntrack and intends to do some
Not being a LINUX user, I am unaware of what "conntrack" does. pf has a "keep state" option that tells it to keep state for each connection, but many years ago pf was changed to keep state anyway, whether one tells it to or not, so nowadays the option is effectively a comment. I don't know of any method by which one can tell pf *not* to keep state.
damage, the attacker can easily fill our conntrack table with a SYN flood, and then we start dropping legitimate packets without notice. However, you're correct: currently there are no SYN floods.</font></font><br>
Understood. Thank you for the clarification.
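
To make the table-exhaustion argument concrete, here is a minimal Python sketch of a fixed-size connection-state table. It is a toy model under stated assumptions, not netfilter or pf code: the function name `legit_drops`, the capacity, the timeout, and the per-source limit are all made up for illustration, and the attacker is assumed to be a single source whose SYNs never complete the handshake.

```python
from collections import deque


def legit_drops(capacity, timeout, ticks, flood_per_tick, per_source_limit=None):
    """Count legitimate connections refused because the state table was full.

    Hypothetical model: every new connection occupies one table slot for
    `timeout` ticks, and a new connection is silently dropped when no slot
    is free.
    """
    table = deque()  # expiry tick of each tracked entry, oldest first
    dropped = 0
    for tick in range(ticks):
        # Expire entries whose timeout has elapsed.
        while table and table[0] <= tick:
            table.popleft()
        # Attacker SYNs (one source) arrive first; an optional rate limit
        # in front of the table caps how many of them are even considered.
        flood = flood_per_tick
        if per_source_limit is not None:
            flood = min(flood, per_source_limit)
        for _ in range(flood):
            if len(table) < capacity:
                table.append(tick + timeout)
        # One legitimate client, from a different source, connects per tick.
        if len(table) < capacity:
            table.append(tick + timeout)
        else:
            dropped += 1
    return dropped


unlimited = legit_drops(capacity=100, timeout=30, ticks=100, flood_per_tick=50)
limited = legit_drops(capacity=100, timeout=30, ticks=100, flood_per_tick=50,
                      per_source_limit=2)
print("legitimate drops without rate limit:", unlimited)
print("legitimate drops with rate limit:   ", limited)
```

With these assumed numbers the unlimited flood fills the table within two ticks and legitimate clients start being refused, while the rate-limited run never reaches capacity. Real state tables have more elaborate eviction and timeout behavior, so treat this only as an illustration of why the limiting rule goes in front of the state tracking.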
</p> <p><br> </p> <blockquote type="cite" cite="mid:202211100738.2AA7cw7d026293@sdf.org"> <pre class="moz-quote-pre" wrap="">because the only attacks I've seen so far were coming
via other relays and triggered tor's rejections of INTRODUCE2 cells by the thousands. Instead, what has been very effective has been to increase the NumCPUs count drastically. </pre> </blockquote> <p><font size="-1"><font face="Arial">You're correct yet again. The number of CPUs makes a huge difference. Tor automatically detects up to 16 CPUs if you have them. Anything above that, Tor can't see. I've never tried setting NumCPUs in my torrc, though; it might see more if you tell it to look for them.</font></font></p>
It only looks for the number of CPU threads actually available if you don't specify a value for NumCPUs. You can put any natural number there that you want, unless there's some upper limit I don't know about, e.g., 255.
<p><font size="-1"><font face="Arial">On my relays which are run on VMs, I simply added more CPUs to the VM and somewhere around 10 CPUs seemed to be the magic number when all the warning messages disappeared. They are currently happily running on 12.</font></font></p> <p><br> </p> <blockquote type="cite" cite="mid:202211100738.2AA7cw7d026293@sdf.org"> <pre class="moz-quote-pre" wrap="">On a non-hyperthreaded quad-core CPU I now have
it set as "NumCPUs 20". </pre> </blockquote> <p><font size="-1"><font face="Arial">OK, I'm confused now. Are you saying that it's possible to tell Tor to use nonexistent CPUs and it actually works? That would be really cool. Is it because Tor assigns multiple worker threads to the same CPU?<br>
Of course, it's possible. NumCPUs only tells tor how many worker threads to start. tor does not assign any CPU affinity, so everything gets handled by the OS's scheduler. When the main thread encounters an onionskin that must be decrypted, it places that onionskin onto a queue for some worker thread to pick up as soon as a worker becomes available. Apparently how fast that occurs determines whether tor begins dropping connections and issuing warning/error messages, so having a lot of workers means that one is usually available or becomes available very soon, and the timeout for decryption of that onionskin to begin never expires. IOW, the timeout seems to depend upon how long the queued onionskin waits for decryption to *begin*, not to *complete*.

Anytime I've seen lots of workers active in top(1), they've been showing less than 1% CPU usage apiece, so they usually have a higher priority than the main thread unless, of course, the main thread is waiting for a select(2) or some other I/O operation to be posted complete, in which case the main thread will have a priority in the single digits anyway, but isn't actually doing anything at the time. Given that they use less than 1% CPU, it is frankly rather difficult to find one actually running at any given instant with top(1). Instead they are usually in "kqueue" state or some similar waiting state. When tor is being assaulted with an INTRODUCE2 attack, the main thread is usually running at 8% to 15% CPU usage. (These are attacks coming via other relays, so naturally the synproxy condition is satisfied and has no effect.)

All that having been written, I would like to point out that greatly increasing NumCPUs does not *solve* the problem of the INTRODUCE2 attacks, nor do I have any suggestions for how this type of attack can be prevented/stopped. It is just a workaround that provides a way for a relay to survive them and keep running, though at the cost of many thousands of unnecessary, undesirable onionskin decryptions. On that scale, onionskin decryptions do become significantly expensive, and the more so the larger the capacity of the relay's Internet connection(s).
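
The begin-versus-complete distinction can be illustrated with a small Python sketch. This is a toy model of the behavior described above, not tor's actual code: the function `simulate`, the rule that a queued onionskin is dropped once it has waited longer than `max_wait` ticks, the even sharing of cores among busy workers, and all the numbers are assumptions chosen for illustration.

```python
from collections import deque
from fractions import Fraction


def simulate(num_workers, cores, arrivals, work, max_wait):
    """Return (started, dropped) for a toy onionskin queue.

    arrivals: sorted list of integer arrival ticks
    work:     CPU ticks one decryption needs on a dedicated core
    max_wait: ticks an onionskin may sit queued before it is dropped
    """
    pending = deque(arrivals)
    queue = deque()   # arrival tick of each queued onionskin
    running = []      # remaining work for each busy worker thread
    started = dropped = 0
    tick = 0
    while pending or queue or running:
        # 1. New onionskins reach the queue.
        while pending and pending[0] <= tick:
            queue.append(pending.popleft())
        # 2. Idle workers pick up queued onionskins; stale ones are dropped.
        while queue and len(running) < num_workers:
            arrived = queue.popleft()
            if tick - arrived > max_wait:
                dropped += 1
            else:
                running.append(Fraction(work))
                started += 1
        # 3. The scheduler shares the cores evenly among busy workers.
        if running:
            share = min(Fraction(1), Fraction(cores, len(running)))
            running = [r - share for r in running if r - share > 0]
        tick += 1
    return started, dropped


# Three bursts of 16 onionskins, 100 ticks apart, on a quad-core machine.
bursts = [b * 100 for b in range(3) for _ in range(16)]
print(simulate(4, 4, bursts, work=10, max_wait=5))   # (12, 36)
print(simulate(20, 4, bursts, work=10, max_wait=5))  # (48, 0)
```

In this model the total CPU work is identical in both runs. With 4 workers most of each burst waits in the queue past the deadline and is dropped; with 20 workers every onionskin is picked up immediately and merely finishes later, which matches the observation that the warnings stop even though the hardware has not changed.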
</font></font></p> <p><br> </p> <blockquote type="cite" cite="mid:202211100738.2AA7cw7d026293@sdf.org"> <pre class="moz-quote-pre" wrap="">Each worker thread uses almost no CPU time, but
having enough of them waiting to grab an onionskin off the queue instantly seems to stop all messages about cells, onionskins, or connections being dropped. During an attack I often saw all workers in top(1) screen updates with "NumCPUs 16", so I increased to 20 for the next restart, but I hadn't gotten any of the aforementioned error/warn messages at 16. Unfortunately, I have yet to see what happens at 20 because before the next restart Comcast made a change that blocks me from running a relay. :-( I intend to find out very soon whether I can afford to switch to their business network right away, so that I might resume running my relay or will have to wait until things happen next summer that should free up some of my limited income first.
BTW, it is generally poor practice to post HTML to mailing lists. I usually skip and delete HTML messages, but my eyeballs and brain are feeling fresher than usual this evening, and your Subject: line was one I had responded to previously, so I decided to wade through your message after all.
Scott Bennett, Comm. ASMELG, CFIAG
**********************************************************************
* Internet: bennett at sdf.org *xor* bennett at freeshell.org        *
*--------------------------------------------------------------------*
* "A well regulated and disciplined militia, is at all times a good  *
* objection to the introduction of that bane of all free governments *
* -- a standing army."                                               *
* -- Gov. John Hancock, New York Journal, 28 January 1790            *
**********************************************************************