[tor-relays] Bandwidth Authority PID Feedback Experiment #2 Starting

Tim Wilde twilde at cymru.com
Wed Dec 14 17:00:30 UTC 2011


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 12/13/2011 10:34 PM, Mike Perry wrote:
> Thus spake Tim Wilde (twilde at cymru.com):
> 
>> We're not seeing source port exhaustion, but we are seeing two
>> warns, one of which I haven't been able to nail down:
>> 
>> 2011 Dec 13 20:22:07.000|[notice] We stalled too much while
>> trying to write 8542 bytes to address "[scrubbed]".  If this
>> happens a lot, either something is wrong with your network
>> connection, or something is wrong with theirs. (fd 409, type
>> Directory, state 2, marked at main.c:990).
> 
> Hrm.. Haven't seen this one before...

I've seen LOTS of them. :)  When I briefly turned off SafeLogging to
see what the scrubbed addresses were, they turned out to be the dir
auths, if that helps:

2011 Dec 14 16:53:17.000|[notice] We stalled too much while trying to
write 273 bytes to address "193.23.244.244".  If this happens a lot,
either something is wrong with your network connection, or something
is wrong with theirs. (fd 4108, type Directory, state 2, marked at
main.c:990).

I haven't seen any examples of that error message that were not to a
dir auth.
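
For anyone who wants to run the same check on their own relay, here is
a minimal sketch of how the stall notices could be tallied by
destination address.  It assumes each log entry sits on a single line
in the log file (the wrapping above comes from the mail client) and
that SafeLogging is off, otherwise every address shows up as
"[scrubbed]".  The function name count_stalls is mine, not anything
from Tor.

```python
import re
from collections import Counter

# Matches the notice text quoted above. Assumes one log entry per line
# and single spaces between words, as in the unwrapped log file.
STALL_RE = re.compile(
    r'We stalled too much while trying to write (\d+) bytes '
    r'to address "([^"]+)"'
)


def count_stalls(lines):
    """Return a Counter mapping destination address -> stall notices seen."""
    counts = Counter()
    for line in lines:
        match = STALL_RE.search(line)
        if match:
            # group(1) is the byte count, group(2) the quoted address.
            counts[match.group(2)] += 1
    return counts
```

Passing an open log file to count_stalls() and printing
counts.most_common() gives a per-address tally that can be compared
against the list of dir auth addresses.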

> 
>> 2011 Dec 13 22:26:45.000|[warn] Your computer is too slow to
>> handle this many circuit creation requests! Please consider using
>> the MaxAdvertisedBandwidth config option or choosing a more
>> restricted exit policy. [18 similar message(s) suppressed in last
>> 60 seconds]
> 
> Ah, we should be handling this issue with the fix for #1984: 
> https://trac.torproject.org/projects/tor/ticket/1984
> 
>> The second warn I figure I should be tuning myself with 
>> MaxAdvertisedBandwidth, and it's happening on BigBoy, the relay
>> on this box that's doing the majority of its bandwidth.  So I'm
>> not sure if it's anything that your feedback loop should be
>> involved in or not.
> 
> It's a shame this log message makes such a crazy recommendation
> wrt MaxAdvertisedBandwidth. But I guess some tweak is better than
> no tweak. Hopefully we can make this go away without you needing to
> lower it, though. Can you ping me on IRC if you keep getting these
> warns after leaving MaxAdvertisedBandwidth alone?

Will do.

> This sounds incredibly familiar. What ethernet card + driver
> version do you have? Some combos are pretty abysmal about IRQ
> load balancing and interrupt optimizations, or at least they were
> on old kernels (which may still apply if you are on CentOS).

Yeah, some further investigation today indicates that may be the case.
:|  I'm running Intel PRO/1000s with the latest E1000E driver
(1.6.3-NAPI), and I do in fact see what looks like potential interrupt
load issues.  I've split the relays across to another NIC to see if
that helps in the relatively short term.  Long term, it looks like a
migration away from CentOS is called for on this box (it's good for
some things, but not this :)).  I also rediscovered that the receive
packet/flow steering (RPS/RFS) in 2.6.35+ kernels is one of the
torservers.net optimizations I haven't done, and can't do on CentOS
without a manual kernel installation, because even 6.0 ships with a
2.6.32 kernel.  So it looks like Debian is in this box's future
(though I'll try to remember to keep my keys this time :)).
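
In case it is useful to others chasing the same symptoms, below is a
rough sketch of the two checks involved.  The first function takes the
text of /proc/interrupts and reports what share of a NIC's interrupts
each CPU handled; one CPU taking nearly all of them is the imbalance
described above.  The second builds the hexadecimal CPU bitmask that
receive packet steering reads from
/sys/class/net/<dev>/queues/rx-<n>/rps_cpus on 2.6.35+ kernels.  The
function names and the "eth0" device name are illustrative
assumptions, and the mask helper only covers the simple case of fewer
than 32 CPUs.

```python
def irq_share_by_cpu(interrupts_text, device):
    """Return each CPU's fraction of the interrupts counted for `device`.

    `interrupts_text` is the full content of /proc/interrupts.
    """
    lines = interrupts_text.splitlines()
    cpus = lines[0].split()  # header row: CPU0 CPU1 ...
    totals = [0] * len(cpus)
    for line in lines[1:]:
        fields = line.split()
        # Columns after the per-CPU counts hold the IRQ type and names.
        names = [name.rstrip(",") for name in fields[1 + len(cpus):]]
        # Accept "eth0" itself or per-queue names such as "eth0-rx-0".
        if not any(n == device or n.startswith(device + "-") for n in names):
            continue
        for i, value in enumerate(fields[1:1 + len(cpus)]):
            totals[i] += int(value)
    grand_total = sum(totals)
    if grand_total == 0:
        return {cpu: 0.0 for cpu in cpus}
    return {cpu: count / grand_total for cpu, count in zip(cpus, totals)}


def rps_cpu_mask(cpu_indexes):
    """Return a hex bitmask with one bit set per CPU index (under 32 CPUs)."""
    mask = 0
    for index in cpu_indexes:
        mask |= 1 << index
    return format(mask, "x")
```

Reading /proc/interrupts twice a few seconds apart and comparing the
shares gives a better picture than a single snapshot, since the
counters are cumulative since boot.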

Thanks for the help and suggestions.  If I can provide any more info
about the stalled writes to the dir auths, let me know.

Thanks,
Tim

- -- 
Tim Wilde, Senior Software Engineer, Team Cymru, Inc.
twilde at cymru.com | +1-847-378-3333 | http://www.team-cymru.org/
-----BEGIN PGP SIGNATURE-----

iEYEARECAAYFAk7o1i4ACgkQluRbRini9tggqwCeNVW/af9eNz/TGh7Pe40tQpVj
FQQAn1/UKmRHK/w3ctf0iERdgNVfPIM7
=Pq3p
-----END PGP SIGNATURE-----

