0.0.8pre1 works: now what?

Marc Rennhard rennhard at tik.ee.ethz.ch
Wed Jul 28 18:36:05 UTC 2004


> In my eyes, there are three big issues remaining. We need to:
> 1) let clients describe how comfortable they are using unverified servers
> in various positions in their paths --- and recommend good defaults.
> 2) detect which servers are suitable for given streams.
> 3) reduce dirserver bottlenecks.
> 
> ---------------------------------------------------------------------
> Part 1: configuring clients; and good defaults.
> 
> Nodes should have an 'AllowUnverified' config option, which takes
> combinations of entry/exit/middle, or any.

I agree it's difficult to decide on a reasonable default value. Using 
clients only as exit nodes increases a user's risk just slightly, because 
the exit node can't learn much without colluding with a server node that 
may be picked as the first hop. Using clients as exit plus middle nodes 
is also quite safe. Picking clients as entries is risky, as the client may 
own/observe the web (or whatever) server. Using clients as both entry 
and exit in a path is the highest risk, because the adversary no longer 
needs to own/observe the web server.

A quite paranoid default could therefore be "use clients as exit 
and/or middle nodes in a path"; a less paranoid one is "use clients as 
entry/middle/exit, but never as entry and exit in the same path". Other 
choices do not make much sense in my opinion.

What's funny is that the really paranoid user wants others to use his 
node as an entry node, but picking clients as entries is relatively high 
risk. Similarly, most users would use clients as exits, but only a few 
clients will be willing to act as exits. Could this mean that the 
potential to "unload" traffic onto the clients is quite small? Or, to put 
it differently, can we somehow shift the incentive for clients from entry 
(which currently gives you better anonymity) to exit (which currently 
could give you trouble)? I don't have an answer, but I believe this is a 
key problem to be solved on the way towards a hybrid Tor where a 
significant portion (in fact most) of the traffic is handled by clients.


> ---------------------------------------------------------------------
> Part 2: choosing suitable servers.
> 
> If we want to maintain the high quality of the Tor network, we need a
> way to determine and indicate bandwidth (aka latency) and reliability
> properties for each server.
> 
> Approach one: ask people to only sign up if they're high-quality nodes,
> and also require them to send us an explanation in email so we can approve
> their server. This works quite well, but if we take the required email
> out of the picture, bad servers might start popping out of the woodwork.
> (It's amazing how many people don't follow instructions.)

Maybe not a bad choice to start with, until you get very many e-mails a day.

> Approach two: nodes track their own uptime, and estimate their max
> bandwidth. The way they track their max bandwidth right now is by
> recording whenever bytes go in or out, and remembering a rolling average
> over the past ten seconds, and then also the maximum rolling-average
> observed in the past 12 hours. Then the estimated bandwidth is the smaller
> of the in-max and the out-max. They report this in the descriptor they
> upload, rounding it down to the nearest 10KB, and capping anything over
> 100KB to 100KB. Clients could be more likely to choose nodes with higher
> bandwidth entries (maybe from a linear distribution, maybe something
> else -- thoughts?).

Sounds reasonable. Simply picking clients at random 
(bandwidth-dependent) in a circuit may not be the best option, though. 
Better would be classifying clients according to their usefulness for 
specific applications. E.g., a 10KB node (or even slower) that is online 
23+ hours a day is a great choice for remote logins, and a 100KB 
client that is usually online just one hour a day is just fine for web 
browsing. The disadvantage is that this requires the circuit setup to be 
application-aware.
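For reference, the self-measurement scheme quoted above could be sketched roughly as follows. This is my reading of the description, not the actual implementation; the class and method names are invented, and the 12-hour window for the max is simplified to a running maximum:

```python
from collections import deque

KB = 1024

class BandwidthEstimator:
    """Sketch of the quoted scheme: a 10-second rolling average of
    bytes/sec per direction, the max of that average seen so far, then
    min(in-max, out-max), rounded down to 10KB and capped at 100KB."""

    def __init__(self, window=10):
        self.in_samples = deque(maxlen=window)   # bytes read, per second
        self.out_samples = deque(maxlen=window)  # bytes written, per second
        self.in_max = 0.0
        self.out_max = 0.0

    def tick(self, bytes_in, bytes_out):
        """Call once per second with the bytes moved in that second."""
        self.in_samples.append(bytes_in)
        self.out_samples.append(bytes_out)
        self.in_max = max(self.in_max, sum(self.in_samples) / len(self.in_samples))
        self.out_max = max(self.out_max, sum(self.out_samples) / len(self.out_samples))

    def advertised_bandwidth(self):
        est = min(self.in_max, self.out_max)     # smaller of the two directions
        est = min(est, 100 * KB)                 # cap anything over 100KB/s
        return int(est // (10 * KB)) * 10 * KB   # round down to nearest 10KB

est = BandwidthEstimator()
for _ in range(10):
    est.tick(bytes_in=55 * KB, bytes_out=120 * KB)
print(est.advertised_bandwidth())  # 51200, i.e. 50KB/s, limited by the in-direction
```

Note how the min() of the two directions keeps a node from advertising capacity it can only sustain one way.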

> Since uptime is published too, some streams (such as irc or aim) prefer
> reliability to latency. Maybe we should prefer latency by default,
> and have a Config option StreamPrefersReliability to specify by port
> (or by addr:port, or anything exit-policy-style), that looks at uptime
> rather than advertised bandwidth.

OK, that's about what I meant above :-)
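As a toy illustration of the by-port idea: the config could hold a list of exit-policy-style port patterns, and node selection would weight uptime instead of bandwidth for matching streams. The rule format and names here are invented for the sketch, not actual Tor syntax:

```python
# Hypothetical StreamPrefersReliability rules: single ports or ranges,
# matched against a stream's target port (example values only).
PREFER_RELIABILITY = ["6660-6669", "5190", "22"]  # irc, aim, ssh

def prefers_reliability(port, rules=PREFER_RELIABILITY):
    for rule in rules:
        if rule == "*":
            return True
        if "-" in rule:                      # port range, e.g. "6660-6669"
            lo, hi = map(int, rule.split("-"))
            if lo <= port <= hi:
                return True
        elif int(rule) == port:
            return True
    return False

def selection_key(node, port):
    """Sort key for node selection: uptime for reliability-sensitive
    streams, advertised bandwidth otherwise (higher is better)."""
    return node["uptime"] if prefers_reliability(port) else node["bandwidth"]

nodes = [{"name": "a", "uptime": 86000, "bandwidth": 20480},
         {"name": "b", "uptime": 3600, "bandwidth": 102400}]
# An irc stream (port 6667) favors the long-lived node...
print(max(nodes, key=lambda n: selection_key(n, 6667))["name"])  # a
# ...while a web stream (port 80) favors the fast one.
print(max(nodes, key=lambda n: selection_key(n, 80))["name"])    # b
```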

> And of course, notice that we're trusting the servers to not lie. We
> could do spot-checking by the dirservers, but I'm not sure what we would
> do if one dirserver thought something was up, and the others were fine.
> At least for jerks the dirservers can agree about, maybe we could
> configure the dirservers to blacklist descriptors from certain IP spaces.

In general -- although I fully agree that making use of the clients is a 
necessary step -- including clients is extremely risky. First of all, 
new users may soon get frustrated and leave if performance is poor 
because of a few unreliable clients (I agree that we have the same 
problem if the Tor core, i.e. the servers, gets overloaded). This means 
that the clients offering service should offer really good service, 
which will be extremely hard to guarantee. The other issue is that some 
users may find it funny to disrupt the QoS. I'm not so much concerned 
about adversaries running many nodes (not until Tor really gets big), 
but more about clients that randomly drop the circuits of others. This 
would easily reduce Tor to its core again, because nobody will use other 
clients.

In any case, all your questions are extremely difficult to answer. I 
believe the right way to go is to come up with a reasonable design (it 
won't be perfect; only time and experience will tell) to start with and 
give it a try -- the same way Tor has done since its first public 
release. Falling back to Tor-core only is always an option. What follows 
depends on the popularity of Tor. If 100 clients act as relays, not many 
problems should arise (despite poor service for the other 99900 
clients... I'm very curious about how many of the clients will relay 
data for others). Node discovery must likely change if 1000s of clients 
are offering to relay data for others. Maybe there will be too many 
clients offering poor service and Tor will perform poorly. It may then 
be time to think about a reputation scheme again...

--Marc
