Future areas for Tor research

Fri Aug 22 22:34:27 UTC 2008

Here are some of my thoughts.

On Thu, Aug 21, 2008 at 03:31:57PM -0400, Roger Dingledine wrote:
> By end of 2008:
> - Paul's NRL project to evaluate path selection under various trust
>   distributions. The idea is to figure out safer/better ways to build
>   paths if we assume some users trust some relays more than others.

I talked to the folks involved in this at PETS, so I know roughly the
direction they planned. It sounded like this is on track, but I'd be
happy to hear how things are going.

> - Peter's proposal 141. How to trade off descriptor fetching overhead
>   with circuit-building overhead. Are there even better ways?

One option I think was discussed is to remove the public key of the
server from the directory descriptors. Instead the client would extend
to the server and fetch the key from there, signed by the directory
authorities. This way the client only gets the keys it needs, but
doesn't leak information. As always the devil will be in the detail.

> By end of 2009:
> - Understand the risks from letting people relay traffic through your
>   Tor while you're also being a Tor client. Compare risks from being a
>   bridge relay to risks from being a 'full' relay. Come up with practical
>   ways to mitigate.

The defenses in this paper looked promising:
 http://www-users.cs.umn.edu/~hopper/circuit_clogging_fc08.pdf

> - Take Roger's incentive.pdf design, flesh it out further, and see if we
>   can find solutions to the long-term intersection attack that arises
>   from attackers being able to correlate "that relay is online everytime
>   this anonymous high-priority user does an action." (I need to clean up
>   incentive.pdf and send it to this list.)
> 
> By end of 2010:
> - Better load balancing algorithms, path selection choices, etc.
>   Building on Mike Perry's work and Steven's PETS 2008 paper. Do we
>   do simulations? analysis? How to compare them? Are there cases when
>   we can switch to 2-hop paths, or the variable-hop paths?

The obvious extension to my PETS paper is to optimize the expression I
used for path selection impact on latency. Nothing neat fell out of
doing the optimization, but it should still be possible to come up
with a numerical answer. 

Then the question would be whether these are still the right path
selection parameters when the simplifying assumptions are dropped. Tor
is sufficiently complicated that I think simulation is the only
reliable answer. This would be a big chunk of work, but very useful in
evaluating candidate load balancing algorithms.
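
To illustrate what "a numerical answer" could look like, here is a
minimal sketch. It is not the expression from the PETS paper. It
assumes each relay is an independent M/D/1 queue, that clients pick a
relay with probability proportional to a per-relay weight, and that a
simple random hill-climb is good enough to search the weights. The
function names, capacities and load are all made up for illustration.

```python
import random


def mean_latency(weights, capacities, total_load):
    """Expected per-hop latency when relay i is picked with probability
    proportional to weights[i] and behaves as an M/D/1 queue."""
    total_weight = sum(weights)
    latency = 0.0
    for weight, mu in zip(weights, capacities):
        prob = weight / total_weight
        if prob == 0.0:
            continue  # relay is never selected, so it adds no latency
        rho = prob * total_load / mu  # utilisation of this relay
        if rho >= 1.0:
            return float("inf")  # relay is overloaded, queue grows forever
        # Service time plus Pollaczek-Khinchine mean wait for M/D/1.
        sojourn = 1.0 / mu + rho / (2.0 * mu * (1.0 - rho))
        latency += prob * sojourn
    return latency


def optimize_weights(capacities, total_load, steps=5000, seed=0):
    """Random hill-climb over selection weights, starting from the
    capacity-proportional weights. Only improvements are accepted."""
    rng = random.Random(seed)
    weights = list(capacities)
    best = mean_latency(weights, capacities, total_load)
    scale = 0.05 * sum(capacities) / len(capacities)
    for _ in range(steps):
        candidate = list(weights)
        i = rng.randrange(len(candidate))
        candidate[i] = max(0.0, candidate[i] + rng.gauss(0.0, scale))
        if sum(candidate) == 0.0:
            continue  # no relay selectable, skip this candidate
        cost = mean_latency(candidate, capacities, total_load)
        if cost < best:
            weights, best = candidate, cost
    return weights, best


if __name__ == "__main__":
    capacities = [100.0, 50.0, 10.0, 5.0]  # hypothetical cells per second
    load = 80.0                            # hypothetical total arrival rate
    print("proportional:", mean_latency(capacities, capacities, load))
    print("optimized:   ", optimize_weights(capacities, load)[1])
```

A toy like this only answers the question under its own simplifying
assumptions. Whether the resulting weights still hold up on the real
network is what a full simulation would have to show.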

> - Evaluate the latency and clogging attacks that are coming out, figure
>   out if they actually work, and produce countermeasures.

Indeed, but even simple end-to-end traffic analysis is useful to look
at. I feel it will still work great for the common case, but there are
special-case proposals out there which are intended to mix like
traffic with other streams, while not introducing huge overhead.

The problem here is that it's really hard to compare schemes, or to
even give realistic estimates of anonymity and overhead. There's quite
a bit of benchmarking required here.

> - Tor network scalability, the easy version: use several parallel
>   networkstatus documents, have algorithms for clients to pick which to
>   use, for relays to get assigned to one, and make sure new designs like
>   Peter's proposal 141 will be compatible with this.

This paper presents interesting results:
 http://research.microsoft.com/users/gdane/papers/bridge.pdf

The issues are not simple, but it does suggest that partitioning is
the right idea.
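
As a very rough sketch of the "relays get assigned to one, clients
pick which to use" part, the following assumes relays are assigned to
a partition by hashing their identity fingerprint and that a client
picks a partition uniformly at random. Neither rule comes from a Tor
proposal or from the paper above. The function names and the
fingerprint strings are hypothetical.

```python
import hashlib
import random


def relay_partition(fingerprint, num_partitions):
    """Map a relay fingerprint to a partition index in a stable way."""
    digest = hashlib.sha256(fingerprint.encode("ascii")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def partition_relays(fingerprints, num_partitions):
    """Group relays into num_partitions lists, one per networkstatus."""
    partitions = [[] for _ in range(num_partitions)]
    for fingerprint in fingerprints:
        partitions[relay_partition(fingerprint, num_partitions)].append(fingerprint)
    return partitions


def client_pick_partition(num_partitions, rng=None):
    """Pick which partition a client uses (uniformly at random here)."""
    rng = rng or random.SystemRandom()
    return rng.randrange(num_partitions)


if __name__ == "__main__":
    relays = ["RELAY%04d" % i for i in range(20)]  # made-up fingerprints
    for index, members in enumerate(partition_relays(relays, 3)):
        print(index, len(members))
```

Hashing ignores relay bandwidth, so the partitions can end up with
very different capacities, and splitting the client population also
splits the anonymity set. A real design would have to deal with both.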

> - There's a vulnerability right now where you can enumerate bridges by
>   running a non-guard Tor server and seeing who connects that isn't
>   a known relay. One solution is to use two layers of guards, meaning
>   bridge users use 4-hop paths. Is this the best option we've got? They
>   don't want to be slowed down like that.
> - How many bridges do you need to know to maintain reachability? We
>   should measure the churn in our bridges. If there is lots of churn,
>   are there ways to keep bridge users more likely to stay connected?

I think that the results at the moment will be deceptive, since I
expect the bridge pool to change quite a bit once we are actively
recruiting.

> - Related, more bridge address distribution strategies: Steven and I
>   were talking about a ``bridge loop'' design where bridge identities form
>   a ``loop'' at the bridgeDB , and if you know any bridge in the loop you
>   can learn all the others. This approach will allow Tor clients who know
>   a few bridges to be updated with new bridges as their old ones rotate,
>   without opening up the list to full enumeration.

One other idea (that came from John Gilmore) was how to manage who is
a bridge. Currently it is up to the node operator, so if we realize we
need lots of bridges, we have to ask them and it will take a while.
Another option is for the directory authorities to pick -- that way
there can be a central switch to pull.
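
One way such a switch might work, purely as a sketch: the authorities
publish a target fraction, and a node is designated a bridge when a
salted hash of its fingerprint falls below that fraction. The salt,
the function name and the fingerprints below are all hypothetical, and
nothing like this exists in Tor as far as this note is concerned.

```python
import hashlib


def should_be_bridge(fingerprint, bridge_fraction, salt=b"hypothetical-salt"):
    """Return True if this node falls inside the bridge fraction.

    The salt stands in for a secret held by the authorities, so that
    outsiders cannot recompute the assignment from fingerprints alone.
    """
    digest = hashlib.sha256(salt + fingerprint.encode("ascii")).digest()
    position = int.from_bytes(digest[:8], "big")  # uniform in [0, 2**64)
    # Compare as integers to avoid float rounding at the top of the range.
    return position < int(bridge_fraction * 2**64)


if __name__ == "__main__":
    nodes = ["NODE%04d" % i for i in range(1000)]  # made-up fingerprints
    for fraction in (0.1, 0.3):
        count = sum(should_be_bridge(n, fraction) for n in nodes)
        print(fraction, count)
```

Because each node's hash position is fixed, raising the fraction only
adds bridges and never reshuffles the existing ones, which is the
property you would want from a central switch.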

Steven.

-- 
w: http://www.cl.cam.ac.uk/users/sjm217/