eliminating bogus port 43 exits

Scott Bennett bennett at cs.niu.edu
Sat Jun 13 17:13:43 UTC 2009


Hi Jon,
     On Sat, 13 Jun 2009 10:20:45 -0600 Jon <scream at nonvocalscream.com>
wrote:
>I've read the entire thread and I still have one persisting question in
>my mind...
>
>
>Why are "bogus port exits" bad, and why should I eliminate them from my
>exit policy?

     Okay, consider how tor works.  Each request to connect to TCP port 43 at
a destination IP address, together with all data traveling in both directions
over the resulting connection, is carried in its own tor stream.  There is
some overhead at each of several steps in the process: establishing the
stream, encrypting and decrypting the data, passing data in both directions,
and then destroying the stream.
     Each stream exists within a circuit.  If the client already has a circuit
available that can support the stream, then that is good.  Many streams can be
supported by a single circuit, so there is not a 1:1 correspondence between
the number of requests (and therefore streams) and the number of circuits.
That makes it difficult to know just what the impact of millions/billions of
extraneous streams happens to be, but for the sake of discussion, let's say
that the average circuit ends up having supported 10 streams by the time it is
shut down.
     If the client currently has no circuit available to support a stream,
then it must build one.  This is a slow and tedious process with lots of
overhead demand placed upon the relays selected for the route of the circuit
to be built.  It is an unfortunate fact that many circuits fail during the
construction phase and get destroyed without ever having been used.  All such
failed circuits amount to lost/wasted tor network capacity.
     Suppose a 100 KB/s tor exit node services 150,000 exits/week for https
(port 443).  If its exit policy also allows unrestricted exits for whois
(port 43), it may end up having to service 750,000 - 1,500,000 port 43 exits
in that same week.  If we average 10 streams/circuit, then that adds the
burden of 75,000 - 150,000 circuits to the other loads already on this exit
node.  It also distributes twice that burden across the rest of the tor
network, because (assuming routelen = 3) each of those circuits passes through
two other relays besides the exit.  If that traffic is legitimate, then that's
just life, and is part of the service the tor network was established to
provide.  However, if 95% or more of those port 43 streams and circuits are
bogus, then they represent lost/wasted capacity that is then not available to
provide the service that tor is intended to provide.
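     For anyone who wants to plug in their own numbers, here is a rough sketch
of that arithmetic in Python.  The function name and its parameters are made up
for illustration; the 10 streams/circuit average and routelen = 3 are the same
assumptions used in the paragraph above, not measured values.

```python
def extra_circuit_load(port43_exits, streams_per_circuit=10, route_len=3):
    """Estimate the circuit burden implied by a weekly count of port 43 exits.

    Returns a tuple:
      (circuits terminating at this exit node,
       circuit loads spread over the other relays in those circuits)
    Both defaults are discussion assumptions, not measurements.
    """
    circuits = port43_exits // streams_per_circuit
    # Each circuit also uses (route_len - 1) relays besides the exit.
    elsewhere = circuits * (route_len - 1)
    return circuits, elsewhere


for exits in (750_000, 1_500_000):
    here, elsewhere = extra_circuit_load(exits)
    print(f"{exits:>9,} port 43 exits -> {here:>7,} circuits at this exit, "
          f"{elsewhere:>7,} circuit loads on other relays")
```

     Changing streams_per_circuit shows how sensitive the estimate is to that
guess: a lower average means proportionally more circuits for the same number
of exits.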
     When you consider that the distribution of circuits across the set of
exit nodes that allow exits to a particular port is based upon the relative
data rate capacities of all exits in that set, you see that what one exit node
experiences should be similar to what all others in that set experience,
proportionally adjusted according to relative capacities.  In other words, if
a single 100 KB/s exit node is burdened with 10, 20, or maybe 30 times as many
bogus exits as the combined total of all the legitimate exit requests it gets,
then all the other exit nodes in that set are getting hammered in basically
the same way.  That amounts to a horrendous waste of tor resources that could
have been devoted to providing better service for the legitimate requests.
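     The proportionality argument can be illustrated with a small Python
sketch.  It assumes a deliberately simplified model in which exits to a port
are split strictly in proportion to each node's capacity; the node names,
capacities, and traffic totals are all hypothetical, and real tor path
selection involves more than this one weighting.

```python
def split_by_capacity(total_exits, capacities):
    """Split total_exits among exit nodes in proportion to capacity (KB/s)."""
    total_capacity = sum(capacities.values())
    return {name: total_exits * cap / total_capacity
            for name, cap in capacities.items()}


# Hypothetical set of exit nodes that all allow the port in question.
capacities = {"exitA": 100, "exitB": 300, "exitC": 600}

legit = split_by_capacity(1_000_000, capacities)    # made-up weekly total
bogus = split_by_capacity(20_000_000, capacities)   # made-up weekly total

for name in capacities:
    ratio = bogus[name] / legit[name]
    print(f"{name}: {legit[name]:>9,.0f} legitimate, "
          f"{bogus[name]:>10,.0f} bogus, ratio {ratio:.0f}:1")
```

     Under that model the bogus:legitimate ratio comes out the same at every
node regardless of its capacity, which is why figures from one exit node can
say something about the whole set.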
     Again, it is important to keep in mind that the volume of payload data
traversing the tor network is only one part of the consumption of tor network
resources.  Circuit construction and tear-down are another big piece of the
picture.  Remember that circuit construction means lots of very slow asymmetric
key exchanges, as well as lots of overhead at the TCP and IP levels.  Each tor
hop in a circuit might itself cross 10 - 30 physical hops through the Internet.
>
>*if* I want to keep the type of traffic somewhat also anonymous
>(assuming the operator is not looking at the content) then I might use a
>separate port to communicate my information.  I don't know if I totally
>feel comfortable in this, most especially when we start talking about
>peering into the content.  And even looking to see what the protocol
>actually is, is peering.  That should be private, as an ethical
>consideration for all operators.
>
     Yeah, looking at the content is a really noxious idea.  I don't understand
why anyone would ever suggest it if they've bothered to read the documentation
and other materials available on the torproject.org web site.  And if someone
has *not* read those materials, they really shouldn't be running tor nodes.
     I would still very much like to hear from exit node operators who allow
exits to ports 80 (http) and 443 (https) to find out what the ratio of http
exits to https exits might be.  If those same nodes also allow exits to port 43
(whois) and/or port 4321 (rwhois), then those figures might be helpful, too,
although just the ratio of http:https exits should be enough to clarify the
magnitude of the potentially bogus traffic burden pretty well, because I can
calculate the other ratios from the data I already have.


                                  Scott Bennett, Comm. ASMELG, CFIAG
**********************************************************************
* Internet:       bennett at cs.niu.edu                              *
*--------------------------------------------------------------------*
* "A well regulated and disciplined militia, is at all times a good  *
* objection to the introduction of that bane of all free governments *
* -- a standing army."                                               *
*    -- Gov. John Hancock, New York Journal, 28 January 1790         *
**********************************************************************


