On Tue, Oct 8, 2013 at 1:52 AM, Christopher Baines cbaines8@gmail.com wrote:
I have been looking at doing some work on Tor as part of my degree, and more specifically at Hidden Services. One issue where I believe I might be able to make some progress is the Hidden Service Scaling issue, as described here [1].
So, before I start trying to implement a prototype, I thought I would set out my ideas here to check they are reasonable (I have also been discussing this a bit on #tor-dev). The goal is twofold: to reduce the probability of failure of a hidden service, and to increase hidden service scalability.
I think what I am planning distils down to two main changes. The first concerns how an OP initialises a hidden service. Currently, if you start a hidden service using an existing keypair and address, the new OP's introduction points replace the existing introduction points [2]. This does provide some redundancy (if slow), but no load balancing.
My current plan is to change this such that if the OP has an existing public/private keypair and address, it would attempt to look up the existing introduction points (probably over a Tor circuit). If found, it then establishes introduction circuits to those Tor servers.
Then comes the second problem: following the above, the introduction point would disconnect from any other connected OP using the same public key (I am unsure why, as the rend-spec does not give a reason). This would need to change so that an introduction point can talk to more than one instance of the hidden service.
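To make the second change concrete, here is a toy model of an introduction point's bookkeeping. It is only a sketch and is not Tor code: the class `ToyIntroPoint`, its methods, and the `allow_multiple_instances` flag are all hypothetical names, and the round-robin choice between instances is an assumption made purely to illustrate load balancing.

```python
from collections import defaultdict


class ToyIntroPoint:
    """Toy model (not Tor code) of an intro point's per-service circuit table."""

    def __init__(self, allow_multiple_instances):
        # False models the behaviour described above: a new instance with the
        # same public key displaces the previously connected one.
        self.allow_multiple_instances = allow_multiple_instances
        self.circuits = defaultdict(list)     # service key -> circuit IDs
        self._next_index = defaultdict(int)   # service key -> round-robin cursor

    def establish_intro(self, service_key, circuit_id):
        """Register an introduction circuit from one service instance."""
        if not self.allow_multiple_instances:
            self.circuits[service_key].clear()  # drop the older instance
        self.circuits[service_key].append(circuit_id)

    def pick_circuit(self, service_key):
        """Choose a circuit for an incoming introduction (assumed round-robin)."""
        circuits = self.circuits.get(service_key)
        if not circuits:
            return None
        index = self._next_index[service_key] % len(circuits)
        self._next_index[service_key] += 1
        return circuits[index]
```

With `allow_multiple_instances=False`, a second `establish_intro` call for the same key leaves only the newer circuit. With `True`, both circuits stay registered and successive introductions alternate between them.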
So, let's figure out all our possibilities before we pick one, and talk about requirements a little.
Alternative 1: Multiple hidden service descriptors.
Each instance of a hidden service picks its own introduction points, and uploads a separate hidden service descriptor to a subset of the HSDir nodes handling that service.
Alternative 2: Combined hidden service descriptors in the network.
Each instance of a hidden service picks its own introduction points, and uploads something to every appropriate HSDir node. The HSDir nodes combine those somethings, somehow, into a hidden service descriptor.
Alternative 3: Single hidden service descriptor, one service instance per intro point.
Each instance of a hidden service picks its introduction points, and somehow they coordinate so that they, together, get a single unified list of all their introduction points. They use this list to make a single signed hidden service descriptor, and upload that to the appropriate HSDirs.
Alternative 4: Single hidden service descriptor, multiple service instances per intro point.
This is your design above, where there's one descriptor chosen by a single hidden service instance (or possibly made collaboratively?), and the rest of the service instances fetch it, learn which intro points they're supposed to be at, and parasitically establish fallback introduction circuits there.
There are probably other alternatives too; let's see if we can think of some more.
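As one way to picture the "somehow they coordinate" step in Alternative 3, here is a minimal sketch of merging each instance's introduction points into a single list. Everything in it is hypothetical: the function name `merge_intro_points`, the instance labels, and the relay names are made up for illustration. How the instances actually exchange their lists, and how the result gets signed, are left open.

```python
def merge_intro_points(per_instance_points):
    """Merge per-instance intro point lists into one sorted list.

    Alternative 3 assumes one service instance per intro point, so an intro
    point claimed by two different instances is treated as an error here.
    """
    owner = {}  # intro point -> instance that claimed it
    for instance, points in per_instance_points.items():
        for point in points:
            if point in owner and owner[point] != instance:
                raise ValueError(
                    f"{point} claimed by both {owner[point]} and {instance}"
                )
            owner[point] = instance
    # Sort so the result is deterministic and its order does not group
    # intro points by the instance that contributed them.
    return sorted(owner)


# Hypothetical example: two instances, three intro points in total.
unified = merge_intro_points({
    "instance-a": ["relay3", "relay1"],
    "instance-b": ["relay2"],
})
```

The unified list is what would go into the single signed descriptor. Note that its length still depends on how many instances contributed, which may matter for any goal about obscuring the instance count.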
Here are some possible desirable things. I don't know if they're all important, or all worth it. Let's discuss!
Goal 1) Obscure number of hidden service instances.
Goal 2) No "master" hidden service instance.
Goal 3) If there is a "master" hidden service instance, clean fail-over from one master to the next, undetectable by the network.
Goal 4) Obscure which instances are up and which are down.
What other goals should we have in this kind of design?