Hi all,
Apologies for top-posting. I'm making a brief comment about general concerns in this design. Additional apologies if something already said obviates my comments. I have purposely not looked at most of the developments in HS design changes because they are _too_ interesting to me and I have had to make sure to keep my focus on some other features of Tor where progress is needed.
The biggest concern is that no matter how you handle the commitment and the size of the flexible set, you make it fairly easy for a HS simply following this protocol precisely and with just the resource of a handful of other nodes (n) in the network to identify the client guard with certainty each time. (If he owns less than n, it becomes a likelihood rather than a certainty each time. Alternatively if he owns a few ASes or IXPs he could accomplish similar results with a judicious choice of the network location of all n.) Given the push elsewhere to use single guards for long periods, this makes guard nodes all the more likely to face subpoena or other forms of attack since the argument that this is a successful strategy to locate clients of interest is greatly strengthened. Since the HS can choose two hops that it does not object to, the client should be similarly protected, i.e., four relays in the circuit overall.
The other big concern is that this looks like there are many places to DoS or resource deplete the hidden service. Earlier designs kept per introduction connection state for the HS to a minimum. There may be ways to reduce that, but it is an important consideration.
aloha, Paul
On Tue, Feb 11, 2014 at 10:07:20PM -0500, Zack Weinberg wrote:
On 02/11/2014 11:55 AM, Qingping Hou wrote:
Hi all,
We are a university research group and following is a proposal for possible optimization on Tor's rendezvous circuit design. It's not written as a formal Tor proposal yet because I want to have a discussion with the community first, get some feedbacks and make sure it does not introduce new vulnerabilities.
I am working with Qingping on this design, and we actually think we might be able to get the hop count all the way down to three. Five is solid (assuming we can solve the problems that have come up already, and assuming there are no further problems) but with some slightly cleverer crypto and a more complicated handshake, it's possible for each side to prove to the other that it is *not* cheating on the connection chain, which is the primary reason we need extra hops.
First off, recall that a single anonymizing proxy --
Client -- Rendezvous -- Service
still guarantees that Client can't learn Service's IP and vice versa, and no single-point eavesdropper can learn both Client and Service's IPs. The reason we don't use single-hop circuits in Tor is that the rendezvous itself might be malicious (or subverted).
Now, here is the usual three-hop Tor circuit, but for a hidden service:
Client -- Client_Guard -- Rendezvous -- Service_Guard -- Service
The additional guard hops protect both Client and Service from a malicious third-party Rendezvous, but - please correct me if I'm wrong about this - I believe that is *all* they are doing relative to a single proxy. Therefore, if Client could be assured that Service had set up its half of the chain honestly - was *not* deliberately forfeiting (some of) its own anonymity in order to learn more about Client - and vice versa, a three-hop circuit would be sufficient.
Here is a potential way to accomplish this. It also has the nice property of *not* needing Client and Service to agree on the complete list of available relays. One unusual cryptographic primitive is required: we need a cipher that is *deterministic* and *commutative* but retains all the other desiderata of a message cipher. These special properties are defined as follows:
Deterministic: E_k(M1) == E_k(M2) iff M1 == M2 Commutative: D_q(E_k(E_q(M))) == E_k(M)
The Massey-Omura cryptosystem https://en.wikipedia.org/wiki/Three-pass_protocol#Massey-Omura_cryptosystem has these properties, but there may be others.
In what follows, I will use E_k() and D_k() to refer to this cryptosystem specifically. I will use curly braces { ... } to indicate a message which is, as a whole, signed by its sender; these messages may in practice also be encrypted, but I *think* this is not actually required in any case.
In what follows, Fx (for some x) always refers to the fingerprint of some Tor node, and
(0) In advance, Client and Service each pick a set of guards that they will use for all communication, as in the present system.
(1) Client and Service somehow agree on a shared secret S. Client and Service also each independently pick a secret which is *not* shared with the counterparty: Client's secret will be referred to as A, and Service's secret will be referred to as B.
(2) Service picks a guard that it will use for this connection, and chooses K rendezvous candidates from its list of all available relays, such that the fingerprint of each candidate satisfies a condition based on S, that the client can verify (something like FP(c) mod S < Q, where Q is a constant determined in advance). This step gives the service a certain amount of flexibility in its chosen set of rendezvous candidates: the idea is to allow an honest service to avoid selecting any of its own guards, or itself. K needs to be large enough that a *malicious* service can't reasonably control all of the candidates, but *small* enough that a malicious *client* can't practically enumerate a service's guards by repeatedly connecting to the service.
(3) Service transmits to Client:
{ E_b(Fs), E_b(Fg), E_b(F1), ..., E_b(Fn), F1, ..., Fn }_s
where Fs is the fingerprint of the service itself, Fg the fingerprint of the service's selected guard, and F1...Fn the fingerprints of the rendezvous candidates.
(4) Client verifies that none of the encrypted fingerprints are equal, and selects a rendezvous point from the list of candidates. Client transmits to Server:
{ E_a(E_b(Fs)), E_a(E_b(Fg)), E_a(Fr), E_a(Ff), E_a(Fc), Fr }_c
where Fr is the fingerprint of the selected rendezvous point, Ff is the fingerprint of the *client's* guard, and Fc is the fingerprint of the client itself. Client connects to its guard, extends a circuit to the selected rendezvous point, and notifies the RP that it expects a hidden service rendezvous from a peer with concealed fingerprint E_b(Fg).
(5) Server computes E_a(Fs) = D_b(E_a(E_b(Fs)) and E_a(Fg) = D_b(E_a(E_b(Fg))), then verifies that none of the encrypted fingerprints are equal. Server connects to its guard, extends a circuit to the selected rendezvous point, and notifies the RP that it expects a hidden service rendezvous from a peer with concealed fingerprint E_a(Ff).
(6) Client and Server each reveal their respective secrets (A, B) to the rendezvous point, enabling the RP to verify that each connection has the fingerprint that its peer expects. If this checks out, the RP links the circuits together.
This last step is necessary to ensure that neither side connects directly to the RP. Revealing A and B does not allow the RP to do anything it couldn't otherwise do, because we're done encrypting things with A and B at this stage, and a malicious RP could reveal the identity of the next hop anyway.
I *think* the way Tor defines identity fingerprints guarantees that neither side can lie about its identity (claiming to be Fs when it is *itself* Fg, for instance): as long as all the messages involved are signed as Tor does already, then only the node that really is Fs could generate those signatures.
I also think that this protocol is immune to abort-if-you-don't-like-the-answer provided that each side imposes a substantial delay between connection attempts. If there is a better way to do that I am all ears.
zw
tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev