Tor client performance (was Re: URGENT: patch needed ASAP for authority bug)
bennett at cs.niu.edu
Thu Apr 22 11:44:47 UTC 2010
On Thu, 22 Apr 2010 05:16:30 -0400 Roger Dingledine <arma at mit.edu> wrote:
>On Thu, Apr 22, 2010 at 03:30:06AM -0500, Scott Bennett wrote:
>> >Thanks for the brilliant patch.
>> Well, Roger's patch may provide some relief until the next tor release
>> comes out, but lest anyone get too excited, it would be well to keep in mind
>> that it is a patch to treat symptoms. It is not at all clear yet, AFAIK,
>> what the cause of the recent troubles has been. Wiping out connections that
>> might otherwise remain available for use, thereby making it necessary for
>> clients to make new SSL connections sooner than they might otherwise have
>> needed, does have a cost. At some point, I hope that cause will be found and
>> dealt with.
>Actually, you're in luck -- we do have a pretty good handle on at least
>part of the problem.
>A few years ago, we switched clients over to tunneling their directory
>requests via ordinary TLS (aka "OR") conns rather than asking them as http
>(in plaintext), so it was harder to censor them.
Yes, I remember that.
>But while clients use a small set of first hops (called "entry guards")
>when building three-hop circuits -- the ones they use for anonymity --
>they just pick any relay when they want to do a directory fetch. They
>weight these choices by relay bandwidth, so we give more attention to
>the relays that can handle more attention.
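The bandwidth-weighted choice described above can be sketched roughly as
follows. This is only an illustration, not tor's actual selection code: the
function name, the relay nicknames, and the bandwidth figures are all made up
for the example.

```python
import random

def pick_directory_mirror(relays, rng=random):
    """Pick one relay with probability proportional to its bandwidth.

    `relays` is a list of (nickname, bandwidth) pairs. Both the structure
    and the function name are hypothetical, chosen for this sketch.
    """
    names = [name for name, _ in relays]
    weights = [bandwidth for _, bandwidth in relays]
    # random.choices draws with replacement using the given relative weights.
    return rng.choices(names, weights=weights, k=1)[0]

# Hypothetical relays: one fast, one slow.
example_relays = [("fastrelay", 900), ("slowrelay", 100)]
chosen = pick_directory_mirror(example_relays, random.Random(0))
```

Because every client draws independently from the same weighted list, the
fast relays see most of the directory fetches, and a given client rarely
lands on the same relay twice in a row.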
>We've been meaning for a while to switch to a "directory guard" design,
>where we only ask our entry guards for directory information. This
>approach would be better for privacy, because it would be harder for
>a directory mirror to enumerate users (he can't learn what they do,
>but he can learn that they use Tor). It would be better for scalability
>too, since we wouldn't be spewing out so many new TLS connections (sound
>familiar?). But it would put additional load on just the entry guards,
>and worse, it would screw up all our user count statistics including the
>per-country graphs that are helping us understand where Tor is seeing use.
>So the problem in this thread was that Tor clients weren't hanging up
>quickly enough once they'd done a directory fetch. Unlike when they're
>using their entry guards, Tor clients are quite unlikely to get back to
>the same relay for the next directory fetch. So it's quite reasonable for
>them to hang up a lot faster than they do for TLS conns to their guards.
I see. I didn't realize the client's process for choice of directory
server would differ so much between the tunneled version and the direct,
plaintext method via a DirPort.
>We should change clients to do this faster hanging up. In the meantime
>(and in addition), we should teach directory mirrors to defend themselves
>from over-zealous clients.
>The approach in the patch is to close only the circuits which behave
>like directory fetches that are finished: they came from a client,
You check the IP addresses against the relay address list in the
directory? I guess that would leave connections made from
bridges still treated as client connections...hmmm...maybe that's not
a bad thing, though.
>don't extend anywhere else, don't have any streams on them right now,
>and haven't answered any queries in a while (in the patch, "a while" is
>60 seconds). Once we close those circuits, our earlier logic to close
>TLS conns that aren't in use (don't have any circuits) and have been
>idle "a while" kicks in. In this case "a while" used to be 15 minutes,
>and it's 1 minute in the patch here.
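As I read the description, the two-stage cleanup amounts to something like
the sketch below. It is a simplified model and not the patch itself: the
class names, field names, and the sweep() helper are hypothetical, and only
the two timeout values (60 seconds each, with 15 minutes as the old
connection timeout) come from the text above.

```python
from dataclasses import dataclass

CIRCUIT_IDLE_TIMEOUT = 60  # seconds; "a while" for circuits in the patch
CONN_IDLE_TIMEOUT = 60     # seconds; 1 minute in the patch, formerly 15 minutes

@dataclass
class Circuit:
    # All field names are hypothetical stand-ins for the criteria listed.
    from_client: bool           # came from a client rather than a relay
    extends_further: bool       # circuit extends to another hop
    open_streams: int           # streams attached right now
    last_query_answered: float  # timestamp of the last answered query

@dataclass
class Connection:
    circuits: list
    last_active: float          # timestamp of last activity on the conn

def circuit_looks_finished(circ, now, timeout=CIRCUIT_IDLE_TIMEOUT):
    """True if the circuit behaves like a directory fetch that is done."""
    return (circ.from_client
            and not circ.extends_further
            and circ.open_streams == 0
            and now - circ.last_query_answered >= timeout)

def sweep(conn, now):
    """Drop finished circuits, then report whether the conn may be closed."""
    conn.circuits = [c for c in conn.circuits
                     if not circuit_looks_finished(c, now)]
    # Existing logic: close conns with no circuits that have been idle.
    return not conn.circuits and now - conn.last_active >= CONN_IDLE_TIMEOUT
```

The point of the ordering is that closing the stale circuits is what makes
the connection eligible for the idle-connection check in the first place.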
>So that leaves us two questions.
>First: what timeout should we actually pick for these circuits and
>connections? I'm inclined to be pretty aggressive -- first because
That question is the primary reason I suggested having a configurable
>everybody is freaking out about this huge influx of connections, second
>because there *is* a huge influx of connections, and third because these
>circs and conns really are unlikely to be reused again if they aren't
>reused quite quickly.
>Second: why is there a huge influx of connections? I think there are
>three answers here. A) Mike's new bandwidth weightings are putting more
>attention on the fast relays than before. B) There actually are a lot of
>new Tor clients that can reach the Tor network, now that China opened
This second potential explanation is the kind of thing to give an
entertaining thrill to the paranoiacs among us because of the unanswered
questions: *why* did China open things up again? And *why* did China
allow access to only *some*, rather than all, relays in the directory?
When you first mentioned the reopening of access in connection with the
crisis in the tor network, I had to wonder, if only fleetingly, whether
they opened up enough to enable some attack that they had prepared for us.
>up a bit. And C) We had some other bugs lately that reduced the number
>of available relays, and people have been turning off their DirPort,
>leading to even more concentrated pain on the directory mirrors that
>remain. My intuition is that "B" is the heaviest factor.
>> Having a way to close idle OR connections based upon a timeout
>> specified by the authorities in the consensus, but overridable by a torrc
>> line by individual relay operators, looks to me like a good thing to have
>> henceforward. That way attentive relay operators can decrease or increase
>> the timeout period according to their needs, but the authorities would still
>> have a possibility of adjusting the timeout period on NORDO relays, i.e.,
>> all the other relays.
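The precedence I have in mind could look roughly like this. It is a sketch
under assumptions: "IdleORConnTimeout" is a hypothetical name used both for
the torrc option and for the consensus parameter, neither of which exists as
far as I know, and the 900-second fallback is simply the old 15-minute value
mentioned earlier in the thread.

```python
DEFAULT_IDLE_TIMEOUT = 900  # seconds; the old 15-minute value from the thread

def effective_idle_timeout(torrc, consensus_params,
                           default=DEFAULT_IDLE_TIMEOUT):
    """Return the idle timeout in seconds for OR connections.

    Precedence: the operator's torrc line, then the value published in the
    consensus, then the built-in default. Both inputs are plain dicts here,
    and the key name is hypothetical.
    """
    for source in (torrc, consensus_params):
        value = source.get("IdleORConnTimeout")
        if value is not None:
            return int(value)
    return default

# An attentive operator overrides the consensus; an unattended relay follows it.
attended = effective_idle_timeout({"IdleORConnTimeout": "120"},
                                  {"IdleORConnTimeout": 60})
unattended = effective_idle_timeout({}, {"IdleORConnTimeout": 60})
```

Relays whose operators set nothing would simply track whatever the
authorities publish, which is the whole point of the fallback order.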
>Yes, maybe. I'm not sure. The broader question here is how many
>connections are too many, and how aggressive should we be at killing them.
That, I think, is clearly a question for local relay operators to decide.
>It's a shame to be killing them at the server end at all -- in an ideal
Yes, it is, and it is kind of a sledgehammer where a ball peen hammer
might be better, but relay operators *do* need to be able to defend their
>world, clients should be realizing they won't need them and hanging up
Also true. However, we are all aware that client users tend not to
update frequently, and that this fact results in a population of somewhat
out-of-date clients. (That's not to say that there aren't some long out-
of-date relays, too, but relay operators as a group do tend to pay somewhat
closer attention to new releases.) Whenever something like this latest
situation crops up, even a fix release made available within the first hour
is unlikely to make much difference for some weeks at best. Meanwhile,
the number of relay operators is far smaller and, as just noted, they are
more likely to be aware of the problems and of new releases.
>early. But the parameters we picked for clients a while ago didn't take
>into account that there would be half a million Tor clients all clamoring
>for attention at once.
Yes, and I strongly suspect that you've only nailed down one of the
causes so far. But we can hope that's not the case. We shall see.
>> Thanks for making the patch available, Roger. Circuit build times have
>> climbed here from ~26 s a few days ago to 97 s at the moment. It will be
>> good to see those times fall again.
>Yeah, things sure have gone to hell lately:
>I should get this patch into git so more people can upgrade.
Thanks again for getting onto it so quickly. Perhaps the tor network
should be grateful for that volcano's activity. :)
>And then at some point think about whether to get something into a new
A client-side stable fix release ASAP would certainly help.
Scott Bennett, Comm. ASMELG, CFIAG
* Internet: bennett at cs.niu.edu *
* "A well regulated and disciplined militia, is at all times a good *
* objection to the introduction of that bane of all free governments *
* -- a standing army." *
* -- Gov. John Hancock, New York Journal, 28 January 1790 *