[tor-dev] [Draft Proposal] Scalable Hidden Services

Matthew Finkel matthew.finkel at gmail.com
Wed Oct 30 07:52:39 UTC 2013


On Mon, Oct 28, 2013 at 07:40:12PM +0000, Christopher Baines wrote:
> On 28/10/13 13:19, Matthew Finkel wrote:
> > This is a proposal I wrote to implement scalable hidden services. It's
> > by no means finished (there are some slight inconsistencies which I will
> > be correcting later today or tomorrow) but I want to make it public in
> > the meantime. I'm also working on some additional security measures that
> > can be used, but those haven't been written yet.
> 
> Great, I will try to link this in to the earlier thread for some continuity.
> 

Sounds good. For those just joining this discussion, the previous thread
can be found at
https://lists.torproject.org/pipermail/tor-dev/2013-October/005556.html

> It seems to me that this is a description of "Alternative 3" in Nick's
> email. Multiple instances, with multiple sets of introduction points,
> somehow combined in to one service descriptor?

Yes, mostly. The proposal describes Nick's "Alternative 3", but there is
no technical reason why the instances cannot coordinate their IPs and
share some subset of them, assuming some additional modifications to the
HS design. This obviously was not included in the proposal, but it would
be easy to extend the protocol to include it.

> I haven't managed to
> fully comprehend your proposal yet, but I thought I would try to
> continue the earlier discussion.
 
That's fine, we can work through it, but you seem to understand it
pretty well.

>
> So, going back to the goals, this "alternative" can have master nodes,
> but you can also just have this "captain" role dynamically
> self-assigned.

It's not really self-assigned; it's more like "assigned by the operator,
but as a result of previous node failures". In this scenario, I think of
it as an "awareness" rather than an "assignment".

> Why did you include an alternative here, do you see these
> being used differently? It seems like the initial mode does not fulfil
> goal 2 or 3?

Yes, I see the Master-Slave design as an alternative to the Peer-Nodes
design. I think they do satisfy Nick's goal 3, but they obviously don't
satisfy goal 2.

> 
> One of the differences between the alternatives that keeps coming up, is
> who (if anyone) can determine the number of nodes. Alternative 3 can
> keep this secret to the service operator by publishing a combined
> descriptor. I also discussed in the earlier thread how you could do this
> in the "Alternative 4: Single hidden service descriptor, multiple
> service instances per intro point." design, by having the instances
> connect to each introduction point one or more times, and possibly only
> connecting to a subset of the introduction points (I possibly didn't
> consider this in the earlier thread).

Out of the four designs in the proposal, I think section 4.2, the
Homogeneous shared-key peer-nodes design, is the best and the most
versatile (but, as a result, also the most complex). So, technically, our
two proposals can be merged without much difficulty. However, there
are still some issues that I'm having trouble solving in a sane way.
Making the introduction point the focal point comes with some tradeoffs,
and I'm still not sure these are the right tradeoffs to make just to
disguise the size of the hidden service.

> 
> Another recurring point for comparison is whether anyone can determine
> if a particular service instance is down.

Absolutely. This is a problem I hope we can solve.

> Alternative 4 can get around this
> by hiding the instances behind the introduction points, and to keep the
> information from the introduction points, each instance (as described
> above) can keep multiple connections open, occasionally dropping some to
> keep the introduction point guessing. I think this would work, providing
> that the introduction point cannot work out what connections correspond
> with what instances.

True, but this sounds extremely risky and error prone. I really hope we
can do better than this to solve the problem.

> If each instance has a disjoint set of introduction
> points, of which some subset (possibly total) is listed in the
> descriptor, it would be possible to work out both if an instance goes
> down, and what introduction points correspond to that instance, just by
> repeatedly trying to connect through all the introduction points? If you
> start failing to connect for a particular subset of the introduction
> points, this could suggest an instance failure. Correlating this with
> power or network outages could give away the location of that instance?

Sure, but as the proposal describes, the security of the multi-node
hidden service is reduced to the security of the single-node hidden
service. As such, the location-via-correlation attack you describe
(there's probably a real/better name for it) is a result of the design,
and I decided not to fix it for fear of introducing another, more
dangerous, attack.

> 
> Also, compared to the setup of other alternatives, this seems more
> complex for the hidden service operator. Both in terms of understanding
> what they are doing, and debugging failures?

It is more complex, no argument there, but I don't think that it is
unfair to impose this on the operator (nothing in the world is free).
If an op wants to set up a multi-node hidden service, then it will not
be much more effort than setting up a single-node service. If the
application running behind the hidden service can support scaling, then
its configuration will likely be a lot more complicated than configuring
a multi-node HS. The way the proposal describes it, it's very
repetitive. So, on the bright side, the operator will be very familiar
with the torrc config lines when they're done.  :)

> I think it would be good to
> partition the goals (as there are quite a lot (not inherently bad)). In
> particular, one subset of goals would be as follows:
> 
> Operator (the person, or people controlling the service) Usability
>  - Simple Initial Setup

From the proposal, every node will contain two hidden services, one for
the public hidden service, and one used for inter-node communication.
I don't think this will be a barrier for entry as the operator will
likely be following step-by-step directions, in any case.
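
For concreteness, a single node's torrc might look roughly like the
following. The directories, ports, and onion addresses are placeholders,
and the HiddenServicePeers option and its exact syntax come from the
draft proposal rather than from a released Tor; only HiddenServiceDir
and HiddenServicePort exist today:

    # Public-facing hidden service (existing torrc options)
    HiddenServiceDir /var/lib/tor/public_hs/
    HiddenServicePort 80 127.0.0.1:8080

    # Management hidden service, used only for inter-node coordination
    HiddenServiceDir /var/lib/tor/mgmt_hs/
    HiddenServicePort 6446 127.0.0.1:6446

    # Proposed option from the draft: the management addresses of the
    # other peer nodes
    HiddenServicePeers bnodemgmtaddress.onion cnodemgmtaddress.onion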

>  - Simple Addition of new Instances

As soon as the management hidden service has been created on the new
node, the operator simply appends its address to the
HiddenServicePeers line on every existing node. I agree that this will
require more work than most
people want to spend on something like this, but there are ways we can
solve this, with a little more complexity in the protocol. For example,
we can allow peers to sync configs with each other and then rewrite
portions of the torrc. This is something I am very hesitant to do,
though.
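
As a sketch (placeholder addresses again, and the HiddenServicePeers
syntax taken from the draft), adding a third node C to an existing pair
A and B would mean editing each torrc roughly as follows:

    # On node A (previously only listed B):
    HiddenServicePeers bnodemgmtaddress.onion cnodemgmtaddress.onion

    # On node B (previously only listed A):
    HiddenServicePeers anodemgmtaddress.onion cnodemgmtaddress.onion

    # On the new node C:
    HiddenServicePeers anodemgmtaddress.onion bnodemgmtaddress.onion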

>  - Simple Removal of Instances

The opposite of addition: just remove the node's management hidden
service address from the HiddenServicePeers line on each remaining node.

>  - Graceful behaviour regarding instance failure (with respect to the
> operator)
>  - Well defined failure modes
>    - If something doesn’t work, it should be possible for the operator
> to work out what, and how to resolve it.

This is tricky but doable within the scope of this proposal. The tricky
part arises when returning any type of error code/message in the event
of an authentication/validation failure. But we can still produce useful
information that an operator can use to troubleshoot the configuration.

As an example, say we have a multi-node configuration that contains
nodes A and B. The HiddenServicePeers line in A's torrc contains the
management hidden service for B, but the HiddenServicePeers line in B's
torrc does not contain A; as a result, when A tries to connect to B, the
circuit is destroyed during authentication. The two possible failure
reasons here are that B doesn't have A's hidden service descriptor, or
that A is not in B's torrc. It is very easy for an operator to rule out
the latter case, and log messages from B should allow the operator to
determine any other problem.
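
In torrc terms (placeholder addresses, same caveats as above), the
misconfiguration looks something like this:

    # Node A's torrc: A lists B's management hidden service
    HiddenServicePeers bnodemgmtaddress.onion

    # Node B's torrc: the corresponding line
    #   HiddenServicePeers anodemgmtaddress.onion
    # is missing, so B does not recognize A as a peer and destroys the
    # circuit during authentication.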

I think it's a given that this configuration is much more complex than
the one you proposed, but the failure mode is not much worse
because in both proposals *some node*, somewhere, will always publish a
descriptor. It will certainly be difficult for the operator to
determine which node actually published it (without adding a new
mechanism to Tor for this) and the load balancing may not work as
expected until the operator investigates and fixes it. But, as far as I
can tell, neither of them fails closed, for better or worse. I'm
reanalyzing my design to see if I missed a case that should require it.

> 
> Now, obviously, these are minor compared to the more technical goals,
> but I think they are worth considering, as we have a few technically
> workable proposals on the table.
> 
> As for what I am doing on this currently, I have been reading lots of
> the related, and not so related papers. I hope to begin doing some more
> Hidden Service related Chutney stuff this or next week, such that I have
> something to test with when I start implementing something (note: what I
> implement, might not be adopted/adoptable by the project, but I am doing
> it anyway as part of my degree). I am available on #tor-dev as cbaines
> for any and all discussion.

Awesome! Are you planning on revising your proposal and sending it to
tor-dev again? I know I am interested in seeing what changes you decided
to make.

Thanks for your feedback and thoughts on information leakage!

