After discussion with John Schanck and Trevor Perrin over the last month, we've decided to make some alterations to the specification for hybrid handshakes in Tor proposal #269.
It seems that John, Trevor, and I are in agreement about most of the construction.
First, I'll try to summarise a list of changes and the reasoning/concerns which provoked them. For what it's worth, these concerns mostly involve highly theoretical problems surrounding whether we model a hash function as a random oracle or try to make some stronger claims about its properties, and aren't due to any real-world attacks (assuming that hash functions do what you'd expect them to do and aren't something crazy like a NULL op or a pineapple slicing machine).
1. [NTOR] Inputs to HKDF-extract(SALT, SECRET) which are not secret (e.g. server identity ID, and public keys A, X, Y) are now removed from SECRET and instead placed in the SALT.
Reasoning: *Only* secret data should be placed into the HKDF extractor, and public data should not be mixed into whatever entropic material is used for key generation. This eliminates a theoretical attack in which the server chooses its public material in such a way as to bias the entropy extraction. This isn't reasonably assumed to be possible in a "hash functions aren't probabilistically pineapple slicers" world.
Previously, and also in NTor, we were adding the transcript of the handshake(s) and other public material (e.g. ID, A, X, Y, PROTOID) directly into the secret portion of an HMAC call, the output of which is eventually used to derive the key material. The SALT for HKDF (as specified in RFC5869) can be anything, even a static string, but if we're going to be adding transcript material into the handshake, it shouldn't be in the entropy extraction phase.
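As a rough illustration of change 1, the sketch below keeps only secret values in the SECRET argument and moves the public values into the SALT. SHA-256, the length-prefixed encoding in `encode_public`, the function names, and the placeholder byte strings are all assumptions made for the example; none of them are taken from the proposal text.

```python
import hashlib
import hmac


def hkdf_extract(salt: bytes, secret: bytes) -> bytes:
    """RFC 5869 extract step: PRK = HMAC-Hash(salt, IKM)."""
    return hmac.new(salt, secret, hashlib.sha256).digest()


def encode_public(*fields: bytes) -> bytes:
    """Hypothetical encoding: length-prefix each public field so the
    concatenation is unambiguous. The proposal may encode these differently."""
    return b"".join(len(f).to_bytes(2, "big") + f for f in fields)


# Placeholder values standing in for the real handshake data.
PROTOID = b"hypothetical-protoid"
ID, A, X, Y = b"\x01" * 20, b"\x02" * 32, b"\x03" * 32, b"\x04" * 32
dh_secrets = b"\x05" * 64  # stands in for the secret DH outputs

salt = encode_public(PROTOID, ID, A, X, Y)  # public data only
prk = hkdf_extract(salt, dh_secrets)        # secret data only
```

The point is just the argument placement: everything an attacker can see or choose goes in the HMAC key position, and only the shared secrets go in the message position.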
2. [NTOR] The authentication of transcript data, i.e. (ID, A, X, Y, EPK, C) where EPK is the public and ephemeral portion which the client sends in the post-quantum KEM, and C is the reconciliation data, is now distinctly separate from the production of SALT for the extractor, and is first HMACed separately (where the key is derived from the same HKDF extraction which produces the seed) before being included within the expansion phase.
Reasoning: The idea is to avoid attempting to do context-binding (of the transcript, in this case) and entropy extraction at the same time, in order to have a stronger argument that the shared key used for authenticating the context is secure, whereas (before, in NTor) things were a bit murky.
The use of auth_input in ntor was designed to prevent a certain type of collision attack (see [Zav12, SZW16]). However the auth_input countermeasure is unnecessary if the authentication tag is of length 2*LAMBDA. A collision attack on a random function of output length 2*LAMBDA has cost 2^LAMBDA. This change additionally avoids the collision attack.
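Here is a minimal sketch of how change 2 might look, assuming SHA-256 (so the tag is 256 bits, i.e. 2*LAMBDA for LAMBDA = 128, and a birthday-style collision search costs roughly 2^128). The info labels, the `derive_keys` name, and the default output length are hypothetical and only there to make the example run.

```python
import hashlib
import hmac

HASH = hashlib.sha256
HASH_LEN = 32  # bytes; 256-bit tag = 2*LAMBDA for LAMBDA = 128


def hkdf_extract(salt: bytes, secret: bytes) -> bytes:
    """RFC 5869 extract step: PRK = HMAC-Hash(salt, IKM)."""
    return hmac.new(salt, secret, HASH).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """RFC 5869 expand step: T(i) = HMAC(PRK, T(i-1) | info | i)."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output is too long for HKDF")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), HASH).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_keys(salt: bytes, secret: bytes, transcript: bytes, key_len: int = 72):
    """Extract once, authenticate the transcript under its own key,
    then feed the tag into the expansion that yields the key material."""
    prk = hkdf_extract(salt, secret)
    # Hypothetical labels; the real specification would fix these strings.
    auth_key = hkdf_expand(prk, b"hypothetical:auth-key", HASH_LEN)
    auth_tag = hmac.new(auth_key, transcript, HASH).digest()
    key_material = hkdf_expand(prk, b"hypothetical:key-expand" + auth_tag, key_len)
    return auth_tag, key_material
```

The receiving side would recompute `auth_tag` over its own view of (ID, A, X, Y, EPK, C) and compare it with `hmac.compare_digest`, so that context binding never shares a step with entropy extraction.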
3. The initial hashing of the SALT has been removed.
(Or, alternatively, it's still there, assuming you're following the specification for HKDF extraction in RFC5869 to the T and hashing any incoming SALT longer than the underlying hash function's block size, e.g. 64 bytes for SHA-256. If you're using something like SHAKE-256 or similar… well, it's not exactly clear or specified yet how to use SHAKE as a drop-in replacement for a KDF. Joan is apparently writing something.)
Reasoning: It was originally included due to concerns that a malicious adversary could potentially choose some SALT such that when passed into the HKDF-extractor, it would "nullify" the SECRET input rather than extracting from it. However, the HKDF-extractor is HMAC(SALT, SECRET) and we assume the HMAC's underlying hash function is not a machine which factors discrete logarithms or slices pineapples. Were you to replace your own hash function with a pineapple slicer, you'd simply be creating a vulnerability for yourself, and thus it doesn't really make sense in Tor's case to hash the SALT first because we're not living in the horrible world where hash functions can turn out to be pineapple slicers. (And even if it were possible for them to be pineapple slicers, it personally still doesn't make sense to me why you'd want to protect against potential pineapple slicers by putting your data through a pineapple slicer twice.)
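To make the "it's still there" remark concrete, here is a small check, assuming HMAC-SHA256 as the extractor. HMAC as defined in RFC 2104 already hashes a key that is longer than the hash's block size, so an explicit pre-hash of a long SALT changes nothing, while a pre-hash of a short SALT does change the output. The byte strings are placeholders.

```python
import hashlib
import hmac

BLOCK_SIZE = hashlib.sha256().block_size  # 64 bytes for SHA-256


def hkdf_extract(salt: bytes, secret: bytes) -> bytes:
    """RFC 5869 extract step: PRK = HMAC-Hash(salt, IKM)."""
    return hmac.new(salt, secret, hashlib.sha256).digest()


secret = b"\x05" * 64                    # placeholder for the secret input
long_salt = b"\xaa" * (BLOCK_SIZE + 1)   # longer than one hash block
short_salt = b"\xaa" * 33                # longer than the digest, shorter than a block

# HMAC replaces an over-long key with its hash, so pre-hashing is a no-op here.
long_salt_is_prehashed = hkdf_extract(long_salt, secret) == hkdf_extract(
    hashlib.sha256(long_salt).digest(), secret
)

# A short key is used as-is (zero-padded), so pre-hashing gives a different PRK.
short_salt_is_prehashed = hkdf_extract(short_salt, secret) == hkdf_extract(
    hashlib.sha256(short_salt).digest(), secret
)
```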
Best regards,