[tor-dev] exitmap modules that make *lots* of connections

Zack Weinberg zackw at panix.com
Fri May 13 12:18:24 UTC 2016


On 05/08/2016 03:49 AM, Roger Dingledine wrote:
> On Fri, Apr 22, 2016 at 04:22:48PM -0400, Zack Weinberg wrote:
>> I'm working on an exitmap module that wants to feed order of 5000
>> short-lived streams through each exit relay.  I think this is running
>> foul of some sort of upper limit (in STEM, or in Tor itself, not sure)
>> on the number of streams a circuit can be used for, or how long, or
>> something.
> 
> Tor has a built-in "if a stream has been waiting for a connected cell
> for 10 seconds, something has gone wrong" feature, where it disconnects
> from that circuit, and marks the circuit as not worth using for future
> streams. (Assuming the stream hasn't timed out, Tor would then say "oh
> hey, I need to find a circuit for this stream", and start the process
> over again.)
> 
> It looks from your logs like you're attaching a bunch of streams to
> a circuit, and one or more of the begin cells (heck, maybe even all of
> them) aren't getting any answer in the first 10 seconds.

Yeah, that's expected; the policy you describe is very, very wrong for
this experiment.  Is there any way to turn it off?

What's going on is that I have ~500 hosts in known locations worldwide,
and I'm measuring the round-trip latency to each.  Unfortunately, I
can't count on all of them to be up and responding to SYNs at any given
time.  The module asks each exit to make 10 connections to each host,
and it deliberately targets a *closed* port whenever possible, so I'm
expecting to get back mostly RELAY_END/REASON_CONNECTREFUSED replies,
and timeouts.
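
A minimal sketch of the timing idea, for illustration only: it makes the
connection directly with Python's standard socket module rather than through
an exit relay, so a refusal shows up here as ConnectionRefusedError instead of
a RELAY_END/REASON_CONNECTREFUSED cell.  The function name time_connect and
its timeout parameter are assumptions for this sketch, not part of the
exitmap module.

```python
import socket
import time


def time_connect(host, port, timeout=10.0):
    """Attempt one TCP connection and return (outcome, elapsed_seconds).

    outcome is one of:
      "refused"   - the port is closed and answered (about one round trip)
      "connected" - the port is open and answered (about one round trip)
      "timeout"   - no reply arrived within `timeout` seconds
      "error"     - any other socket error (e.g. host unreachable)
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    start = time.monotonic()
    try:
        sock.connect((host, port))
        outcome = "connected"
    except ConnectionRefusedError:
        outcome = "refused"
    except socket.timeout:
        outcome = "timeout"
    except OSError:
        outcome = "error"
    finally:
        # Stop the clock before cleanup so close() is not counted.
        elapsed = time.monotonic() - start
        sock.close()
    return outcome, elapsed
```

Both "refused" and "connected" yield a usable latency sample, since either
reply takes roughly one round trip; only "timeout" carries no timing
information.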

The module internally times out each *stream* after ten seconds, but it
wants to use the same circuit forever - this is both because of design
limitations in exitmap (see below), and because changing circuits in the
middle of this test will spoil at least one measurement with
circuit-creation overhead.  Changing the route to the exit is also bad,
as it will change the intra-Tor latency, which ideally would be the same
for all measurements.  (This last is mitigated by exitmap using two-hop
circuits and being able to pin the entry node, but one would like to be
certain that it won't happen.)

Frankly, assuming that something is wrong with a *circuit* when one of
its *streams* suffers a connection timeout seems inappropriate in
general.  That can happen for any number of reasons beyond Tor's
control, not least that the exit node's SYN to the requested
destination was simply dropped by a firewall.  What was the original
rationale for this policy?

>> 2016-04-22 16:07:54,115 [DEBUG]: Circuit status change: CIRC 6 BUILT
>> [fp],[fp] PURPOSE=GENERAL TIME_CREATED=2016-04-22T20:07:53.305851
> [...]
>> 2016-04-22 16:07:58,118 [DEBUG]: Attempting to attach stream 98 to circuit 6.
>> 2016-04-22 16:08:04,697 [DEBUG]: Circuit status change: CIRC 6 CLOSED
>> [fp],[fp] PURPOSE=PATH_BIAS_TESTING
>> TIME_CREATED=2016-04-22T20:07:53.305851 REASON=FINISHED
> 
> This is interesting. If the later streams were still attached to the
> circuit, the circuit shouldn't be closing here. Are you sure the streams
> actually get attached (and stay attached)?
> 
> It looks like you're only giving me exitmap logs, not Tor logs, so it's
> hard to tell for sure what's going on underneath.

Yeah, I'll try to get some Tor logs (that'll mean more changes to
exitmap core, yay).  I may not get to it till June, though.  The later
streams might be a red herring; the module closes each stream
immediately after the connection resolves and that may not be visible in
these logs.

I am not sure why the circuit is labeled PATH_BIAS_TESTING in the first
place; it's possible that exitmap has gotten mixed up about which
circuit it is supposed to be using.

>> You can see that circuit 6 is no longer available, but the module is
>> still trying to use it.
> 
> That does indeed sound like an exitmap bug.

exitmap core assumes that one circuit per exit is all you will ever
need, and has no way of notifying an experiment module that its circuit
has been destroyed.  It seems pretty clear that this is a design
error, but it's going to take a lot of work to change ...
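
To make the missing piece concrete, here is a rough sketch of the kind of
notification hook that would be needed.  Everything in it (the CircuitWatch
class and its watch and handle_circ_event methods) is hypothetical and does
not exist in exitmap; it only assumes that something in the core receives the
controller's CIRC status changes (circuit ID, status, reason) and forwards
them.  CLOSED and FAILED are the circuit statuses Tor reports when a circuit
goes away.

```python
from collections import defaultdict


class CircuitWatch:
    """Hypothetical registry that tells modules when their circuit dies."""

    # Statuses after which a circuit can no longer carry streams.
    TERMINAL = frozenset({"CLOSED", "FAILED"})

    def __init__(self):
        # circuit ID (as a string) -> callbacks to run when it goes away
        self._callbacks = defaultdict(list)

    def watch(self, circ_id, callback):
        """Register callback(circ_id, status, reason) for one circuit."""
        self._callbacks[str(circ_id)].append(callback)

    def handle_circ_event(self, circ_id, status, reason=None):
        """Feed one circuit status change; return how many callbacks ran."""
        if status not in self.TERMINAL:
            return 0
        # pop() so that each module is notified at most once per circuit.
        callbacks = self._callbacks.pop(str(circ_id), [])
        for callback in callbacks:
            callback(str(circ_id), status, reason)
        return len(callbacks)
```

A module would call watch() when it is handed its circuit, and on
notification either stop attaching streams or ask the core for a
replacement circuit.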

zw

