proposal 141: download server descriptors on demand

Nick Mathewson nickm at freehaven.net
Thu Jul 17 06:38:56 UTC 2008


On Tue, Jul 15, 2008 at 05:29:45PM -0400, Nick Mathewson wrote:
> On Fri, Jul 11, 2008 at 08:22:55PM +0200, Peter Palfrader wrote:
>  [...] 
> > Theory: Most routers use one of a very small set of different exit policies (if
> > we think of a router's own IP address in its exit policy as a single token
> > @@IP@@ or whatever).
> > 
> > Maybe the consensus document should include a hash of the (normalized,
> > i.e. the router's IP replaced with a token) exit policy.
 [...]
> Hmmm.  I need to think about this part more.  I'm in particular
> curious whether we can do better than hashed policies with a
> ports/addresses list, but I think one of us will need to actually
> build a prototype to see how well this works or doesn't.


I tried to dig deeper to evaluate this, based on post-processing the
contents of my client's cached-consensus file.  Here's what I got:

Out of 1057 servers, there were only 564 distinct policies.  If I
ignored policy lines that referred only to a single IP (these were
mostly "reject myip:*"), then I got only 174 distinct policies.
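
A count like this could be reproduced along the following lines.  This
is only a minimal sketch, not the analysis code used for the numbers
above: the function names are made up for illustration, each policy is
assumed to be a list of "accept"/"reject" lines already extracted from
a descriptor, and only IPv4-style "addr:port" targets are handled.

```python
import ipaddress

def is_single_ip_rule(line):
    """True if a line such as 'reject 1.2.3.4:*' names exactly one address."""
    _action, target = line.split()
    addr, _, _ports = target.rpartition(":")
    try:
        net = ipaddress.ip_network(addr, strict=False)
    except ValueError:
        # '*', 'private', or anything else that is not an address or subnet.
        return False
    return net.num_addresses == 1

def normalize(policy):
    """Drop single-IP lines; return a hashable tuple of the remaining lines."""
    return tuple(l.strip() for l in policy if not is_single_ip_rule(l))

def count_distinct(policies):
    """Return (distinct raw policies, distinct policies ignoring single-IP lines)."""
    raw = {tuple(l.strip() for l in p) for p in policies}
    normalized = {normalize(p) for p in policies}
    return len(raw), len(normalized)
```

Two policies that differ only in their "reject myip:*" line collapse to
the same normalized tuple, which is what drives the count down.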

The good news is that the top 10 policies (taken ignoring single-IP
exceptions) cover most of the descriptors (about 82%).  The bad news
is that the long tail is pretty long: of the 213 descriptors whose
exit policies aren't in the top 10, nearly all of them have a policy
unique to themselves.
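
The coverage figure and the shape of the tail can be measured with a
simple frequency count.  The sketch below assumes each policy has
already been reduced to a hashable value (for instance a tuple of
lines with single-IP exceptions dropped); top_n_coverage is a
hypothetical name, not something from my analysis code.

```python
from collections import Counter

def top_n_coverage(policies, n=10):
    """Return (fraction of descriptors covered by the n most common policies,
    number of descriptors outside the top n,
    number of those descriptors whose policy occurs exactly once)."""
    if not policies:
        return 0.0, 0, 0
    counts = Counter(policies)
    ranked = counts.most_common()
    covered = sum(c for _, c in ranked[:n])
    tail = ranked[n:]
    tail_descriptors = sum(c for _, c in tail)
    unique_in_tail = sum(1 for _, c in tail if c == 1)
    return covered / len(policies), tail_descriptors, unique_in_tail
```

A long tail shows up as unique_in_tail being close to
tail_descriptors.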

Another good thing: The overwhelming majority of descriptors (96%)
have exit policies that are of the form

    maybe accept some port ranges and reject others.
    maybe reject private:*
    accept some port ranges, reject others.
    accept *:* or reject *:*

So these exit policies could be compressed to a list of port ranges that
the server accepts or denies.
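
To make the compression concrete, here is a rough sketch of how a
policy of the form above could be turned into a list of accepted port
ranges.  The names are hypothetical, and it makes several assumptions:
rules are applied first-match-wins, lines addressed to "private" are
skipped because the summary only describes non-private destinations,
and any line with another specific address is refused rather than
approximated.

```python
def parse_ports(spec):
    """Parse '*', '80', or '20-80' into an inclusive (low, high) pair."""
    if spec == "*":
        return 1, 65535
    lo, _, hi = spec.partition("-")
    return int(lo), int(hi or lo)

def accepted_port_ranges(policy):
    """Summarize a simple exit policy as a list of accepted (low, high) ranges."""
    verdict = [None] * 65536  # index is the port; None means undecided so far
    for line in policy:
        action, target = line.split()
        addr, _, ports = target.rpartition(":")
        if addr == "private":
            continue  # only non-private destinations are summarized
        if addr != "*":
            raise ValueError("not a simple policy line: %r" % line)
        lo, hi = parse_ports(ports)
        for port in range(lo, hi + 1):
            if verdict[port] is None:  # first matching rule wins
                verdict[port] = (action == "accept")
    # Collapse the per-port verdicts into ranges.  Ports that no rule
    # matched are left out; the form above always ends in a catch-all
    # line, so that case should not arise.
    ranges, start = [], None
    for port in range(1, 65537):
        ok = port <= 65535 and verdict[port]
        if ok and start is None:
            start = port
        elif not ok and start is not None:
            ranges.append((start, port - 1))
            start = None
    return ranges
```

For example, a policy that rejects port 25, rejects private:*, accepts
20-80, and then rejects everything else comes out as the two ranges
20-24 and 26-80.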

For the servers that have exit policies more complex than this, I'm
looking into the distribution of policy lines that don't fall under
this format.  Such lines (which I'll call 'bogons' since that's what my
analysis code calls them) seem mostly to be network- or IP-specific
exceptions to broader rules.  Broadly, these bogons fall into two
categories: single-IP exceptions and subnet exceptions.  To a first
approximation, they can be ignored when you're trying to figure out
what a router supports.
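
The two categories can be told apart mechanically.  The following is
an illustrative sketch only (it is not the analysis code mentioned
above, and the names are invented): it treats any line whose address
is neither "*" nor "private" as a bogon and classifies it by how many
addresses it covers.

```python
import ipaddress

def classify_line(line):
    """Return 'simple', 'single-ip', or 'subnet' for one exit-policy line."""
    _action, target = line.split()
    addr, _, _ports = target.rpartition(":")
    if addr in ("*", "private"):
        return "simple"
    net = ipaddress.ip_network(addr, strict=False)
    return "single-ip" if net.num_addresses == 1 else "subnet"

def bogon_counts(policy):
    """Count the bogon lines of each kind in a policy."""
    counts = {"single-ip": 0, "subnet": 0}
    for line in policy:
        kind = classify_line(line)
        if kind != "simple":
            counts[kind] += 1
    return counts
```

Summing the two counts per router gives the number of bogons for that
router.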

(There are a few exceptions: there are exactly 7 routers that have
more than 9 bogons, including the winner, che, which also wins the
"most complex exit policy" award.)

What this implies for proposal 141, I'm afraid I've forgotten.
Perhaps, "Maybe most of our exit policies can be summarized as a list of
port-ranges for which connections to most non-private IPs are
allowed."

yrs,
-- 
Nick


