[tor-dev] RFC on obfs3 pluggable transport

Wed Dec 12 02:02:49 UTC 2012

On Wed, Dec 12, 2012 at 03:13:59AM +0200, George Kadianakis wrote:
> Ian Goldberg <iang at cs.uwaterloo.ca> writes:
> 
> > [Should we not be copying tor-dev on this thread?]
> >
> 
> We definitely should.
> 
> Is it OK if I forward the whole thread to tor-dev (including this mail
> and your reply)? Feel free to do it yourself too if you want.

I'll copy tor-dev on this email, and not snip things in order that
context is mostly preserved.  If you think some past snippage should be
restored, feel free to forward that as well.

> BTW, in my first email, I had tor-assistants CCed, but you stripped it
> off when you replied, so I didn't add it again :)

Oops; that was unintentional.  My bad.  But tor-dev is better, anyway.

> > On Fri, Dec 07, 2012 at 02:09:21AM +0200, George Kadianakis wrote:
> > <snip>
> >
> >> > You'll also need to be careful about the exact protocol to select
> >> > whether to use the curve or the twist; I remember we discussed this at
> >> > PETS, sitting outside the wine tasting.
> >> >
> >> 
> >> Ugh. I admit that even though I remember our discussion at PETS, I
> >> don't recall that part of our discussion. What do you mean by "the
> >> exact protocol to select whether to use the curve or the twist"?
> >> 
> >> My plan would be to choose randomly between the base point of the
> >> curve and the base point of its twist, make a public key using the
> >> randomly chosen base point, and finally send the public key to the
> >> other party. Looking at the Telex paper, this is similar to the
> >> creation of a_0/a_1 in the 'Setup' phase of appendix 'Tagging Details'.
> >> 
> >> IIUC the other party shouldn't care whether the public key was created
> >> using the curve or its twist, and can just do another point
> >> multiplication to derive the shared secret.
> >> 
> >> Is this the protocol you are talking about, or am I getting my stuff
> >> wrong?
> >
> > That works if clients know a pair of public keys for the bridge in
> > advance.  It was unclear to me whether this would be the case.  Is it?
> > Can we put public keys in with the bridge descriptors?  In this case,
> > you'd need 42 octets for the public key.  The spec I see in gitweb
> > doesn't assume that.  Rather, it has all operations being performed on a
> > common curve.
> >
> 
> Correct. Let's assume that clients _don't_ have any shared-secrets
> beforehand. It makes our transport more deployable and usable, and it
> doesn't rely on non-implemented pluggable transport features.
> 
> (FWIW, adding public keys (or other info) to bridge descriptors is
> supported by the pluggable transport spec (DECLARE option), but it's
> not implemented. Support for passing public keys (or other info) to
> pluggable transport proxies is implemented in ticket #3594 but it's
> not ready for merge yet.)
> 
> > The trick is that both sides need to use the *same* curve: either the
> > main curve or the twist.  So if I watch a host I suspect is an
> > obfs3proxy responder and find that the first 21 bytes it outputs is
> > always on the same curve as the first 21 bytes it received, that's good
> > evidence for my suspicion.
> >
> 
> I see.
> 
> Let me formalize the protocol I tried to describe in my previous mail.
> 
> Alice randomly selects a bit 'b' \in {0,1}. Similar to Telex, the bit
> 'b' selects whether Alice will use the normal curve E or its twist E'). 
> Depending on the bit 'b', Alice generates a public key on E or E'.
> 
> Now, Alice sends to Bob her public key along with padding, as
> described in the obfs3 spec.
> 
> Bob receives Alice's message and parses the first 21 bytes, which he
> considers her public key. Bob checks whether the public key fits E or
> E'. Depending on the curve, Bob loads up the correct domain
> parameters, does a point multiplication with Alice's public key and
> his private key, and derives the shared secret.
> 
> Now, Bob does the same dance with Alice, so that Alice can also derive
> the shared secret.
> 
> What's the problem with this protocol?

When Bob "does a point multiplication with Alice's public key", you mean
"Bob picks a private key y, and multiplies Alice's public key by y to
yield the shared secret".  Note that the resulting shared secret (y
times Alice's public key) will be in the same group (E or E') that
Alice's public key is in.

Now what does Bob send to Alice?  Bob should be sending y times the
generator of the group.  Which group?  If Bob and Alice are to agree on
the shared secret, it has to be the same group Alice used.  Otherwise,
Alice's computation will necessarily end up different from Bob's (since
they'd be in different groups) and the two secrets won't match.

So in order for this protocol to work, Bob has to always choose the same
group he received as a public key from Alice.  But that's now a
distinguisher for the protocol: if I'm watching someone I think is
running an obfsproxy3 bridge, and I notice that the first 21 bytes he
receives on any TCP connection always lies in the same group as the
first 21 bytes he sends in response, then I can be pretty confident in
my suspicion.

> I think it's similar to how Telex generates and inspects tags: Looking
> at the algorithm of the "Telex tag inspection" section of the Telex
> paper, it seems to me that the Telex station doesn't know whether the
> tag is on the curve or its twist either.

The big difference is that Telex uses *static* DH (and gains forward
secrecy by rotating keys): the client comes pre-loaded with (rP, rP')
where P and P' are generators of E and E'.  The Telex client then
randomly picks Q to be either P or P', and randomly picks s, and sends
sQ.  The Telex station then can tell which group sQ is in, and multiply
sQ by r.  Conversely, the client uses whichever of rP and rP'
corresponds to its choice of Q, and multiplies that by s.  Then the
secrets match.

The important distinction is that in Telex, the client *already knows
one public key in each group*.  In the above obfsproxy3 proposal, it
does not.

> I tried to find how the Telex station code does it, but I didn't
> manage to find the telex-relay code on the Internet :(

Indeed, I don't believe the UMich people have (yet?) published it.

> > Are we married to using elliptic curves?  Is performance a serious
> > concern at the moment?  If not, it may be easier to use Z_p, while still
> > using a trick similar to the "twist":
> >
> 
> No, we are not married to elliptic curves. It was just that the only
> trick I knew on hiding DH public keys was the Telex trick, which uses
> ECC.
> 
> > Let p = 3 mod 4 be prime, with q=(p-1)/2 also prime, and p is at least
> > 1536 bits.  (2048 if there's room.)  [Use group 5 or group 14 from RFC
> > 3526.]  Let g be a generator of the order-q subgroup of Z_p^* (g=2 for
> > the two above groups from the RFC.)
> >
> > To pick a private key, pick a random 1536-bit (or 2048-bit) number, and
> > force the low bit to 0.  So there are 1535 (2047) bits of randomness,
> > which is the size of q.  Let x be that private key.  Let X = g^x mod p.
> >
> > Here's the trick:  When you send the public key, randomly decide to send
> > either X or p-X.  That will make the public key part a uniform 1536-bit
> > (2048-bit) string (well, negligibly different from uniform).
> >
> > The other side constructs y and Y=g^y mod p in the same way, and sends
> > either Y or p-Y.
> >
> > Note that both (p-Y)^x = Y^x mod p since x is even, and similarly
> > (p-X)^y = X^y mod p, so key derivation goes through unchanged.
> >
> > The downside of the larger public keys is that it puts a lower bound on
> > the size of the data sent by the initiator before the responder answers.
> >
> 
> Ha. The Z_p stunt you describe is nifty! I will seriously consider it,
> if the telex-curve protocol doesn't work out without a pre-shared
> public key.

Note that the above protocol assumes that p is very close to a power of
256, but that's true of the RFC 3526 groups 5 and 14 I pointed to.

> > On a side note, in Philipp Winter's "put the hash there" trick, I think
> > it would be better to use HMAC(SHARED_SECRET, "obfsproxy3 delimiter
> > string") rather than h(SHARED_SECRET).
> >
> 
> True.
> 
> As a matter of fact, we should probably have two HMACs, one for the
> initiator and another for the responder. Otherwise, hashing the
> SHARED_SECRET with a common key, will result in HASH_LEN bytes that
> appear both in the initiator's and in the responder's traffic stream.

Yes indeed.

   - Ian