Hi,
A few of us discussed this proposal in the #tor-dev IRC earlier this week.
There are still a few things in the proposal that we found confusing. (Some of these things are our mistakes in tor-spec, which were copied to the proposal. We’re working on fixing them in tor-spec.)
Once the proposal has been revised, we would like to: * try condensing the proposal into an algorithm in pseudocode * estimate how it will perform compared to Tor’s current relay crypto * write a tor-spec patch based on the proposal
Here’s a summary of our feedback:
Changing the Proposal Structure
We’re confused by 3.1 and 3.2, which seem to be duplicate sections using different notation.
We think the following structure would help us understand the proposal better: * 3.1. Describe key derivation for a single hop circuit, like tor-spec section 5.2.2 at: https://github.com/torproject/torspec/blob/master/tor-spec.txt#L1210 * 3.2. Describe encryption and decryption for a single-hop circuit, like tor-spec sections 0.3 and 0.4 at: https://github.com/torproject/torspec/blob/master/tor-spec.txt#L77 https://github.com/torproject/torspec/blob/master/tor-spec.txt#L124 * 3.3. Describe encryption and decryption for multi-hop circuits, based on the latest tor-spec section 5.5 at: https://github.com/torproject/torspec/pull/27/files#diff-390fa89f5c74e769e6e...
Making Re-Keying and Tweaking more Explicit
Tor’s current cell crypto keeps the AES-CTR counters synchronized and incrementing, while this proposal sems to derive them from the cell contents.
In this proposal, we had trouble working out: * when the AES is re-keyed vs tweaked * when the GHASH is re-keyed vs updated
So we wondered if the proposed scheme is not IND-CPA. i.e., if the client repeats a plaintext cell, it will produce identical ciphertext cells, because the AES-CTR nonce is based solely on the message and keys.
We think the new structure will be clearer, because the re-keying should all be in section 3.1, and the tweaks/updates should be in section 3.2.
Generalising to Different Numbers of Hops
The proposal assumes that all circuits are 3-hop circuits, but Tor typically builds 1, 3, 4, and 5-hop circuits. Also, Tor currently generates the same number of keys for each hop. There’s no way it can just generate kf_M, kf_E, kb_M, and kb_E for the final hop, because sometimes the final hop changes. (Due to circuit cannibalisation, or failed intro extension.)
Please generalise the proposal so that: * all references to “3-hop circuit” are changed to "N-hop circuit". * all references to kf_1,2,3, kb_1,2,3, k_fM, k_fE, k_bM and k_bE; are changed to kf_N, kb_N, kfM_N, kfE_N, kbM_N and kbE_N.
Do we really need 6 keys per hop? It’s ok if we do, let’s make sure there are no redundant keys.
Explaining which Hops do Integrity Checks
Are kfM_N, kfE_N, kbM_N and kbE_N only used for the innermost encryption and decryption? (That is, when N is the client’s target hop, or the cell is sent from hop N to the client.) If so, please explain that the integrity checks only apply to the lowest layer, and how this is secure. If we can do the integrity checks for each layer, perhaps we should, because then we are protecting against tagging attacks on some nodes, even when the exit doesn’t support the protocol.
For example, in a standard 3-hop circuit: C = G = M - E Where: = means end-to-end integrity checks - means no integrity checks If C and M support the proposed protocol, can they protect against tagging between G and E, even if G and E don’t support the proposed protocol? (Perhaps this could be a good example for the proposal?)
Protecting Onion Services
When onion services communicate with clients, we want to make sure that they aren’t vulnerable to tagging attacks.
For example, in a standard onion service rendezvous, two circuits are spliced at the rendezvous point R: C = G = M1 = R ~ M2 ~ M3 ~ G ~ S Where: = means end-to-end integrity checks between C and R ~ means end-to-end integrity checks between S and R - means no integrity checks
We think that the cells from C to S are protected from tagging, if the C to R and S to R circuits are protected from tagging, because tagging requires two unprotected nodes, one to add the tag, and one to remove it. (Also, like the scenario above with the exit circuit, if some of the hops support the protocol, then any hops closer to the client or service can’t add or remove tags.)
If onion services are protected, let’s explain that in the proposal.
Minor Stylistic Comments
We’re not used to the notation that you’re using, so it’s hard for us to follow some sections.
In particular: * Using the notation X||Y to represent a tuple of arguments to an operation is confusing, because || usually means concatenation. (e.g., H(K||X) should probably be H(K, X) because GHASH doesn't do a simple concatentation of K and X) * The capitalization of "K" vs “k” is inconsistent, so it’s hard to tell if keys are the same or different.
Here are some of our notes for our future reference:
Impact on Relay Performance
We would like to estimate how the proposed encryption will perform, compared to Tor’s current relay crypto.
We think the proposed scheme could be faster, because: * If we can throw out SHA1, we're saving a lot in practice: SHA1 is expensive compared to AES_CTR on current cpus * We should figure this out with the assumption that we do/don't have aesni and/or carryless-multiply * If we have a no-carry multiply (*CLMUL*), fast ghash is easy * If not, it's a bit of a dark art to get it without timing sidechannels
It looks like we trade 1 SHA1 for N+1 GHASHes, where N is the number of hops. (GHASH is much faster than SHA1, we think more than 3x.)
Here are some other minor corrections or questions, inline on the proposal:
On 5 Jul 2018, at 06:42, Nick Mathewson nickm@torproject.org wrote:
Here is (I hope) the correct version of this proposal, reformatted. Filename: 295-relay-crypto-with-atl.txt Title: Using ADL-GCM for relay cryptography (solving the crypto-tagging attack) Author: Tomer Ashur Created: 09 Apr 2018 Status: Open
- Context
Although Crypto Tagging Attacks were identified already in the original Tor design, it was not before the rise of the Procyonidae in 2012 that their severity was fully realized. In Proposal 202 (Two improved relay encryption protocols for Tor cells) Nick Mathewson discussed two approaches to stymie tagging attacks and generally improve Tor's cryptography. In Proposal 261 (AEZ for relay cryptography) Mathewson puts forward a concrete approach which uses the tweakable wide-block cipher AEZ.
- Motivation
For motivations, see proposal 202.
- Summary and preliminaries
This proposal suggests an alternative approach to Proposal 261 using the notion of Release (of) Unverified Plaintexts (RUP-security). It describes an improved algorithm for circuit encryption based on CTR-mode which is already used in Tor, and an additional component for hashing. When this additional component is GHash, a GCM-based solution can be used from the OpenSSL library.
Incidentally, and similar to Proposal 261, this proposal employs the ENCODE-then-ENCIPHER approach thus it improves Tor's E2E integrity by using (sufficiently enough) redundancy.
For more information about the scheme and a security proof for its RUP-security see
Tomer Ashur, Orr Dunkelman, Atul Luykx: Boosting Authenticated Encryption Robustness with Minimal Modifications. CRYPTO (3) 2017: 3-33 available online at https://eprint.iacr.org/2017/239 .
2.1. Cryptographic algorithms
As an encryption primitive we use AES. We denote one call to the primitive by E(·), e.g.,
Z = E_k(X)
where k is a K-bit key, X is the input to AES and Z is the output. We omit k where it is irrelevant or is clear from context. While we propose to set K=256 to ensure long-term post-quantum resistance, we stress that the solution also works for other key lengths.
To encrypt, we use AES is counter mode and a nonce N to produce a random bit stream, e.g.,
CTR(N) = E(N)||E(N+1)||...||E(N+l) = z_1||z_2||...||Z_{n+1} = Z
and XOR Z with the message M to produce the ciphertext C, e.g.,
C = Z ^ M
For authentication, we use the universal hash function GHASH and a key k, e.g.,
T = H(K||X) = H_k(X)
where K is a 256-bit key, X an arbitrary length string, and T a unique authentication tag for X. While we propose to set K=256 to ensure long-term post-quantum resistance, we stress that the solution works for any universal hash function as well as for keys of length K!=256.
- The proposed solution
3.1 An overview of the proposed solution
Noting that in CTR-mode the nonce is essential for the encryption/decryption operations, the proposed solution is based on linking the nonce and the transmitted data in such a way that any change to the data (e.g., through a tagging attack) randomizes the nonce and results in a random value Z' != Z for CTR(N' != N).
3.1.1. 'Forward' direction (same direction as CREATE):
When preparing a message to be sent in the forward direction, the client is responsible for encrypting all layers. The client starts by digesting the message to create a synthetic tweak T_0 = H_{k_fM}(M) and uses this tweak to encrypt an all-zero string to create a synthetic IV: S_0 = T_0 ^ E_{k_fE}(0 ^ T_0).
The message and the synthetic IV (i.e., M||S_0) are passed as input to the first encryption layer which uses S_0 to produce the random stream Z = CTR_k1(S_0) and the ciphertext C = Z ^ M. Then, a tweak T = H_k2(C) is generated by digesting the ciphertext of this layer. Finally, S_0 is encrypted using AES and the tweak T to produce an encrypted nonce S = T ^ E_k3(N ^ T) which is appended to the the
Duplicate word "the the”.
Also the N should probably be S_0.
ciphertext and is used as the nonce for the next encryption layer.
After encrypting 3 times, the client sends C||S to the guard. Whereas
Some circuits are not 3 hops, see above.
Entry nodes can be bridges or guards or a random node (when UseEntryGuards is 0). So let’s say “entry”, not “guard".
the protocol previously required that the nonce be kept in sync between each node and the client, this is no longer required as it can now be recovered from C||S. To decrypt, the node first computes T = H_k2(C) then uses it to obtain S = D_k3(S ^ T) ^ T. Finally, CTR(S)
should be S_0 = D_k3(S ^ T) ^ T and CTR_k1(S_0)?
is used to decrypt C into M. The once-decrypted nonce is appended to the data and M||S is sent to the next node which repeats this process.
The exit verifies the integrity of the message by digesting it and
End nodes can be exits or directories or intro or rend nodes. This is an existing problem in tor-spec, see: https://trac.torproject.org/projects/tor/ticket/26885
using the resulting T to decrypt S_0 into N. If N=0, the message is authenticated. Otherwise, if the message has been changed by an adversary, N!=0 and the circuit is destroyed.
3.1.2 'Back' direction (opposite direction from CREATE):
When sending back a message from the exit to the client, each node adds an additional layer of encryption. The exit starts by digesting the message to create a synthetic tweak T_0 = H_{k_bM}(M) and uses this tweak to encrypt an all-zero string to create a synthetic IV: S_0 = T_0 ^ E_{k_bE}(0 ^ T_0). The synthetic IV is then used to produce the random stream Z = CTR_k1(S_0) and uses that to encrypt the plaintext through C = Z ^ M. The nonce is appended to the ciphertext and C||S_0 is sent to the next node.
Receiving a message from the exit, the middle starts by producing a tweak T = H_k2(C). The tweak is then used to encrypt the nonce through S = T ^ E_k3(N ^ T) which is then used to produce the random stream Z' = CTR_k1(S). Finally, the message is encrypted via C' = Z' ^ C and the encrypted nonce is appended to C'. The message C'||S is sent to the next node which repeats this procedure.
Some circuits are not 3 hops, see above.
When the message arrives at the client, the client peels each layer using the nonce that is appended at the end of the message. The decrypted layer is used to compute the tweak T = H_k2(P), which is used to decrypt this layer's nonce into the next layer's nonce.
During our review, we noticed that the decryption order in tor-spec is wrong, so we fixed it: https://trac.torproject.org/projects/tor/ticket/26860
It looks like this proposal has the correct order, but we think it will be clearer if it is re-structured. (See above.)
The client verifies the integrity of the message by digesting it and using the resulting T to decrypt S_0 into N. If N=0, the message is authenticated. Otherwise, if the message has been changed by an adversary, N!=0 and the circuit is destroyed.
3.2 Circuit management
3.2.1 Creating circuits
The general process of creating a circuit remains the same (see tor-spec.txt), except for step 5 where instead of creating a pair of keys (kf_1,kb_1) for forward and backward communication, respectively, 2 triplets of keys (kf_1,kf_2,kf_3) and (kb_1,kb_2,kb_3) are created for communication in the forward and backward directions, respectively.
Some circuits are not 3 hops, see above.
Two additional pairs of "authentication keys" (k_fM,k_fE) and (k_bM,k_bE) are created between the client and the exit in the forward and backward directions,
We think we might need these keys for each node, not just the exit. (See above.)
respectively.
note: the keys can be created by running KDF-RFC5869 3 times in each direction, or by running it once in each direction to create a master-key and derive the rest as subkeys from this master key
Let’s use the same construction as tor-spec to derive the keys, but generate more data when we need more keys.
3.2.2 Routing relay cells
When an OR receives a RELAY or RELAY_EARLY cell, it checks the cell's circID and determines whether it has a corresponding circuit along that connection. If not, the OR drops the cell.
This paragraph is copied from tor-spec.txt. I think we can drop it from the proposal, and reference tor-spec instead.
Otherwise, if the OR is not at the OP edge of the circuit (that is, either an 'exit node' or a non-edge node), it de/encrypts the payload with the stream cipher,
This sentence is copied from tor-spec.txt. It’s confusing by itself.
The corresponding section in tor-spec.txt contains some errors. And it’s also hard to read.
We opened a ticket for this issue: https://trac.torproject.org/projects/tor/ticket/26859
And then we fixed it in this ticket, because the modified the same sections: https://trac.torproject.org/projects/tor/ticket/26860
We’d like to see the proposal revised so it matches the new tor-spec structure: * forward encryption at the client * forward decryption at ORs * backward encryption at the end (exit) * backward decryption at the client
as follows (the message structure is C||S):
'Forward' relay cell (same direction as CREATE) (input structure is: C||S): 1. T = H_{kf_2}(T'||C); digest ciphertext (T' is a running digest of all previous cells, i.e., last round's T) 2. N = T ^ D_{kf_3}(S ^ T); decrypt nonce 3. M = CTR_{kf_1}(N) ^ C; decrypt message (output structure is M||N) 'Back' relay cell (opposite direction from CREATE) (input structure is: M||N): 1. T = H_{kb_2}(T'||M); digest ciphertext (T' is a running digest of all previous cells, i.e., last round's T) 2. S = T ^ E_{kb_3}(N ^ T); encrypt nonce 3. C = CTR_{kb_1}(S) ^ M; encrypt message (output structure is C||S)
Is T’ the previous T (tweak) held over as a piece of state?
Because the "running digest” doesn’t seem to be the internal state of a digest algorithm.
Note that in counter mode, decrypt (D) and encrypt (E) are the same operation.
This paragraph is copied from tor-spec.txt. In this context, it’s incorrect.
The OR then decides whether it recognizes the relay cell, by inspecting the payload as described in Section 4.1 below. If the OR recognizes the cell, it processes the contents of the relay cell. Otherwise, it passes the decrypted relay cell along the circuit if the circuit continues. If the OR at the end of the circuit encounters an unrecognized relay cell, an error has occurred: the OR sends a DESTROY cell to tear down the circuit.
This paragraph is copied from tor-spec.txt. I think we can drop it from the proposal, and reference tor-spec instead.
- Application connections and stream management
4.1 Relay cells
Within a circuit, the OP and the exit node use the contents of RELAY packets to tunnel end-to-end commands and TCP connections ("Streams") across circuits. End-to-end commands can be initiated by either edge; streams are initiated by the OP.
The payload of each unencrypted RELAY cell consists of: Relay command [1 byte] 'Recognized' [2 bytes] StreamID [2 bytes] Length [2 bytes] Data [PAYLOAD_LEN-23 bytes]
The 'recognized' field is used for a simple indication if the cell still encrypted or not. When sending cells, the unencrypted 'recognized' MUST be set to zero.
We revised this paragraph in tor-spec, see below.
Maybe the proposal can just reference tor-spec, rather than copying this text?
When receiving and decrypting cells the 'recognized' will always be zero if we're the endpoint that the cell is destined for. For cells that we should relay, the 'recognized' field will usually be nonzero, but will accidentally be zero with P=2^-32.
2 bytes, P=2^-16
This error seems to be copied from tor-spec.txt section 6.1.
For a fix, see: https://trac.torproject.org/projects/tor/ticket/26860
The old Digest field is removed, since sufficient information for authentication is now included in the all-zero string used to create the nonce for the first encryption layer.
The 'Length' field of a relay cell contains the number of bytes in the relay payload which contain real payload data. The remainder of the payload is padded with NUL bytes.
We have open proposals and tickets on padding, so I suggest that we use this wording:
"The remainder of the payload is padded with padding bytes. See tor-spec Section 3 for details."
For those who are interested, here's the list of upcoming changes:
Clarify/determine specification for padding https://trac.torproject.org/projects/tor/ticket/26228
prop289: Implement authenticated SENDME https://trac.torproject.org/projects/tor/ticket/26288
In particular:
prop289: Leave unused random bytes in relay cell payload https://trac.torproject.org/projects/tor/ticket/26846
prop289: randomize the unused part of relay payloads https://trac.torproject.org/projects/tor/ticket/26871
4.2 Appending the encrypted nonce
When a cell is prepared by the client to be sent in the 'forward' direction, as explained in Section 3.2.2, the encrypted nonce S is appended to the encrypted cell (occupying the last 16 bytes of the cell).
Tor’s current digest is in the cell header. But this proposal moves the integrity check from the header to end of the cell. Is there any reason why we can’t put S in the cell header?
If we did, we’d need a specification like the existing tor-spec digest:
“The nonce S is generated based on the content of the entire cell, with the nonce field filled with NUL bytes.”
We should decide if it’s better to: * have non-data fields at the start and end of the cell, or * fill a field with NUL bytes, calculate the nonce, then replace the NUL bytes with the nonce.
If the cell is prepared to be sent to a node supporting the new protocol, S is used as the nonce for the next round's encryption. Otherwise, if the next node only supports the old protocol, S is still appended to the encrypted cell, but a synchronized nonce (as per the old protocol) is used for encryption.
Mixed version networks will be easier to implement if we derive keys pairwise with each node based on the handshake. (See above.)
We should probably say something like: "Both the client and the node need to support the new protocol in this proposal, otherwise we fall back to the old one."
When a cell is sent along the circuit in the 'backward' direction, nodes supporting the new protocol always assume that the last 16 bytes of the input are the nonce used by the previous node, which they process as per Section 3.2.2. If the previous node also supports the new protocol, these cells are indeed the nonce. If the previous node only supports the old protocol, these bytes are either encrypted NUL bytes or encrypted data.
“encrypted padding bytes", see above.
4.3 Authenticating the message
End-to-end authentication is performed in both directions in the same way. Once the final layer of encryption is removed, the message-to-be-authenticated is digested and this digest is used to decrypt the nonce used in this layer. If the decrypted nonce is all-zero, the message is authentic. Otherwise, if a change was made to the ciphertext or the encrypted nonce somewhere along the circuit, the nonce decrypts into a random value and the circuit is destroyed.
We think every hop should authenticate, at least in the transition period. (See above.)
T
-- teor
Please reply @torproject.org New subkeys 1 July 2018 PGP C855 6CED 5D90 A0C5 29F6 4D43 450C BA7F 968F 094B