commit bd6da728510d4fe3bf79d59905a599661c5d7e1f Author: Nick Mathewson nickm@torproject.org Date: Mon Dec 28 17:37:18 2015 -0500
Add proposal 261 and 262, for AEZ and rekeying --- proposals/000-index.txt | 4 + proposals/261-aez-crypto.txt | 282 ++++++++++++++++++++++++++++++++++++++ proposals/262-rekey-circuits.txt | 130 ++++++++++++++++++ 3 files changed, 416 insertions(+)
diff --git a/proposals/000-index.txt b/proposals/000-index.txt index 8dcbe26..e098c72 100644 --- a/proposals/000-index.txt +++ b/proposals/000-index.txt @@ -181,6 +181,8 @@ Proposals by number: 258 Denial-of-service resistance for directory authorities [OPEN] 259 New Guard Selection Behaviour [DRAFT] 260 Rendezvous Single Onion Services [DRAFT] +261 AEZ for relay cryptography [OPEN] +262 Re-keying live circuits with new cryptographic material [OPEN]
Proposals by status: @@ -234,6 +236,8 @@ Proposals by status: 246 Merging Hidden Service Directories and Introduction Points 256 Key revocation for relays and authorities 258 Denial-of-service resistance for directory authorities + 261 AEZ for relay cryptography + 262 Re-keying live circuits with new cryptographic material ACCEPTED: 140 Provide diffs between consensuses 172 GETINFO controller option for circuit information diff --git a/proposals/261-aez-crypto.txt b/proposals/261-aez-crypto.txt new file mode 100644 index 0000000..14435e7 --- /dev/null +++ b/proposals/261-aez-crypto.txt @@ -0,0 +1,282 @@ +Filename: 261-aez-crypto.txt +Title: AEZ for relay cryptography +Author: Nick Mathewson +Created: 28 Oct 2015 +Status: Open + +0. History + + I wrote the first draft of this around October. This draft takes a + more concrete approach to the open questions from last time around. + +1. Summary and preliminaries + + This proposal describes an improved algorithm for circuit + encryption, based on the wide-block SPRP AEZ. I also describe the + attendant bookkeeping, including CREATE cells, and several + variants of the proposal. + + For more information about AEZ, see + http://web.cs.ucdavis.edu/~rogaway/aez/ + + For motivations, see proposal 202. + +2. Specifications + +2.1. New CREATE cell types. + + We add a new CREATE cell type that behaves as an ntor cell but which + specifies that the circuit will be created to use this mode of + encryption. + + [TODO: Can/should we make this unobservable?] + + The ntor handshake is performed as usual, but a different PROTOID is + used: + "ntor-curve25519-sha256-aez-1" + + To derive keys under this handshake, we use SHAKE256 to derive the + following output: + + struct shake_output { + u8 aez_key[48]; + u8 chain_key[32]; + u8 chain_val_forward[16]; + u8 chain_val_backward[16]; + }; + + The first two two fields are constant for the lifetime of the + circuit. + +2.2. New relay cell payload + + We specify the following relay cell payload format, to be used when + the exit node circuit hop was created with the CREATE format in 2.1 + above: + + struct relay_cell_payload { + u32 zero_1; + u16 zero_2; + u16 stream_id; + u16 length IN [0..498]; + u8 command; + u8 data[498]; // payload_len - 11 + }; + + Note that the payload length is unchanged. The fields are now + rearranged to be aligned. The 'recognized' and 'length' fields are + replaced with zero_1, zero_2, and the high 7 bits of length, for a + minimum of 55 bits of unambigious verification. (Additional + verification can be done by checking the other fields for + correctness; AEZ users can exploit plaintext redundancy for + additional cryptographic checking.) + + When encrypting a cell for a hop that was created using one of these + circuits, clients and relays encrypt them using the AEZ algorithm + with the following parameters: + + Let Chain denote chain_val_forward if this is a forward cell + or chain_val_backward otherwise. + + tau = 0 + + # We set tau=0 because want no per-hop ciphertext expansion. Instead + # we use redundancy in the plaintext to authenticate the data. + + Nonce = + struct { + u64 cell_number; + u8 is_forward; + u8 is_early; + } + + # The cell number is the number of relay cells that have + # traveled in this direction on this circuit before this cell. + # ie, it's zero for the first cell, two for the second, etc. + # + # is_forward is 1 for outbound cells, 0 for inbound cells. + # is_early is 1 for cells packaged as RELAY_EARLY, 0 for + # cells packaged as RELAY. + # + # Technically these two values would be more at home in AD + # than in Nonce; but AEZ doesn't actually distinguish N and AD + # internally. + + Define CELL_CHAIN_BYTES = 32 + + AD = [ XOR(prev_plaintext[:CELL_CHAIN_BYTES], + prev_ciphertext[:CELL_CHAIN_BYTES]), + Chain ] + + # Using the previous cell's plaintext/ciphertext as additional data + # guarantees that any corrupt ciphertext received will corrupt the + # plaintext, which will corrupt all future plaintexts. + + Set Chain = AES(chain_key, Chain) xor Chain. + + # This 'chain' construction is meant to provide forward + # secrecy. Each chain value is replaced after each cell with a + # (hopefully!) hard-to-reverse construction. + + This instantiates a wide-block cipher, tweaked based on the cell + index and direction. It authenticates part of the previous cell's + plaintext, thereby ensuring that if the previous cell was corrupted, + this cell will be unrecoverable. + +3. Design considerations + +3.1. Wide-block pros and cons? + + See proposal 202, section 4. + +3.2. Given wide-block, why AEZ? + + It's a reasonably fast probably secure wide-block cipher. In + particular, it's performance-competitive with AES_CTR, and far better + than what we're doing now. See performance appendix. + + It seems secure-ish too. Several cryptographers I know seem to + think it's likely secure enough, and almost surely at least as + good as AES. + + [There are many other competing wide-block SPRP constructions if + you like. Many require blocks be an integer number of blocks, or + aren't tweakable. Some are slow. Do you know a good one?] + +3.3. Why _not_ AEZ? + + There are also some reasons to consider avoiding AEZ, even if we do + decide to use a wide-block cipher. + + FIRST it is complicated to implement. As the specification says, + "The easiness claim for AEZ is with respect to ease and versatility + of use, not implementation." + + SECOND, it's still more complicated to implement well (fast, + side-channel-free) on systems without AES acceleration. We'll need + to pull the round functions out of fast assembly AES, which is + everybody's favorite hobby. + + THIRD, it's really horrible to try to do it in hardware. + + FOURTH, it is comparatively new. Although several cryptographers + like it, and it is closely related to a system with a security proof, + you never know. + + FIFTH, something better may come along. + + +4. Alternative designs + +4.1. Two keys. + + We could have a separate AEZ key for forward and backward encryption. + This would use more space, however. + +4.2. Authenticating things differently + + In computing the AD, we could replace xor with concat. + + In computing the AD, we could replace CELL_CHAIN_BYTES with 16, or + 509. + + (Another thing we might dislike about the current proposal is + that it appears to requires us to remember 32 bytes of plaintext + until we get another cell. But that part is fixable: note that + in the structure of AEZ, the AD is processed in the AEZ-hash() + function, and then no longer used. We can compute the AEZ-hash() + to be used for the next cell after each cell is en/de crypted.) + +4.3. Other hashes. + + We could update the ntor definition used in this to use a better hash + than SHA256 inside. + +4.4. Less predictable plaintext. + + A positively silly option would be to reserve the last X bytes of + each relay cell's plaintext for random bytes, if they are not used + for payload. This might help a little, in a really doofy way. + + +A.1. Performance notes: memory requirements + + Let's ignore Tor overhead here, but not openssl overhead. + + IN THE CURRENT PROTOCOL, the total memory required at each relay is: 2 + sha1 states, 2 aes states. + + Each sha1 state uses 96 bytes. Each aes state uses 244 bytes. (Plus + 32 bytes counter-mode overhead.) This works out to 704 bytes at each + hop. + + IN THE PROTOCOL ABOVE, using an optimized AEZ implementation, we'll + need 128 bytes for the expanded AEZ key schedule. We'll need another + 244 bytes for the AES key schedule for the chain key. And there's 32 + bytes of chaining values. This gives us 404 bytes at each hop, for a + savings of 42%. + + If we used separate AES and AEZ keys in each direction, we would be + looking at 776 bytes, for a space increase of 10%. + +A.2. Performance notes: CPU requirements on AESNI hosts + + The cell_ops benchmark in bench.c purports to tell us how long it + takes to encrypt a tor cell. But it wasn't really telling the truth, + since it only did one SHA1 operation every 2^16 cells, when + entries and exits really do one SHA1 operation every end-to-end cell. + + I expanded it to consider the slow (SHA1) case as well. I ran this on + my friendly OSX laptop (2.9 GHz Intel Core i5) with AESNI support: + + Inbound cells: 169.72 ns per cell. + Outbound cells: 175.74 ns per cell. + Inbound cells, slow case: 934.42 ns per cell. + Outbound cells, slow case: 928.23 ns per cell. + + + Note that So for an n-hop circuit, each cell does the slow case and + (n-1) fast cases at the entry; the slow case at the exit, and the fast + case at each middle relay. + + So for 3 hop circuits, the total current load on the network is + roughly 425 ns per hop, concentrated at the exit. + + Then I started messing around with AEZ benchmarks, using the + aesni-optimized version of AEZ on the AEZ website. (Further + optimizations are probably possible.) For the AES256, I + used the usual aesni-based aes construction. + + Assuming xor is free in comparison to other operations, and + CELL_CHAIN_BYTES=32, I get roughly 270 ns per cell for the entire + operation. + + If we were to pick CELL_CHAIN_BYTES=509, we'd be looking at around 303 + ns per cell. + + If we were to pick CELL_CHAIN_BYTES=509 and replace XOR with CONCAT, + it would be around 355 ns per cell. + + If we were to keep CELL_CHAIN_BYTES=32, and remove the + AES256-chaining, I see values around 260 ns per cell. + + (This is all very spotty measurements, with some xors left off, and + not much effort done at optimization beyond what the default + optimized AEZ does today.) + +A.3. Performance notes: what if we don't have AESNI? + + Here I'll test on a host with sse2 and ssse3, but no aesni instructions. + From Tor's benchmarks I see: + + Inbound cells: 1218.96 ns per cell. + Outbound cells: 1230.12 ns per cell. + Inbound cells, slow case: 2099.97 ns per cell. + Outbound cells, slow case: 2107.45 ns per cell. + + For a 3-hop circuit, that gives on average 1520 ns per cell. + + [XXXX Do a real benchmark with a fast AEZ backend. First, write one. + Preliminary results are a bit disappointing, though, so I'll + need to invetigate alternatives as well.] + diff --git a/proposals/262-rekey-circuits.txt b/proposals/262-rekey-circuits.txt new file mode 100644 index 0000000..ca8ad01 --- /dev/null +++ b/proposals/262-rekey-circuits.txt @@ -0,0 +1,130 @@ +Filename: 262-rekey-circuits.txt +Title: Re-keying live circuits with new cryptographic material +Author: Nick Mathewson +Created: 28 Dec 2015 +Status: Open + +1. Summary and Motivation + + Cryptographic primitives have an upper limit of how much data should + be encrypted with the same key. But currently Tor circuits have no + upper limit of how much data they will deliver. + + While the upper limits of our AES-CTR crypto is ridiculously high + (on the order of hundreds of exabytes), the AEZ crypto we're + considering suggests we should rekey after the equivalent in cells + after around 280 TB. 280 TB is still high, but not ridiculously + high. + + So in this proposal I explain a general mechanism for rekeying a + circuit. We shouldn't actually build this unless we settle on + +2. RELAY_REKEY cell operation + + To rekey, the circuit initiator ("client") can send a new RELAY_REKEY cell + type: + + struct relay_rekey { + u16 rekey_method IN [0, 1]; + u8 rekey_data[]; + } + + const REKEY_METHOD_ACK = 0; + const REKEY_METHOD_SHAKE128_CLIENT = 1; + + This cell means "I am changing the key." The new key material will be + derived from SHAKE128 of the aez_key concatenated with the rekey_data + field, to fill a new shake_output structure. The client should set + rekey_data at random. + + After sending one of these RELAY_REKEY cells, the client uses the new + aez_key to encrypt all of its data to this hop, but retains the old + aez_key for decrypting the data coming back from the relay. + + When the relay receives a RELAY_REKEY cell, it sends a RELAY_REKEY + cell back towards the client, with empty rekey_data, and + relay_method==0, and then updates its own key material for all + additional data it sends and receives to the client. + + When the client receives this reply, it can discard the old AEZ key, and + begin decrypting subsequent inbound cells with the new key. + + + So in summary: the client sends a series of cells encrypted with the + old key, and then sends a REKEY cell, followed by relay cells + encrypted with the new key: + + OldKey[data data data ... data rekey] NewKey[data data data...] + + And after the server receives the REKEY cell, it stops sending relay + cells encrypted with the old keys, sends its own REKEY cell with the + ACK method, and starts sending cells encrypted with the new key. + + REKEY arrives somewhere in here + I + V + OldKey[data data data data rekey-ack] NewKey[data data data ...] + +2.1. Supporting other cryptography types + + Each relay cipher must specify its own behavior in the presence of a + REKEY cell of each type that it supports. In general, the behavior + of method 1 ("shake128-client") is "regenerate keys as if we were + calling the original KDF after a CREATE handshake, using SHAKE128 on + our current static key material and on a 32-byte random input." + + The behavior of any unsupported REKEY method must be to close the + circuit with an error. + + The AES-CTR relay cell crypto doesn't support rekeying. See 3.2 below + if you disagree. + +2.2. How often to re-key? + + Clients should follow a deterministic algorithm in deciding when to + re-key, so as not to leak client differences. This algorithm should + be type-specific. For AEZ, I recommend that clients conservatively + rekey every 2**32 cells (about 2 TB). And to make sure that this + code actually works, the schedule should be after 2**15 cells, and + then every 2**32 cells thereafter. + + It may be beneficial to randomize these numbers. If so, let's try + subtracting between 0 and 25% at random. + +2.3. How often to allow re-keying? + + We could define a lower bound to prevent too-frequent rekeying. I'm + not sure I see the point here; the process described above is not + that costly. + +3. Alternative designs + +3.1. Should we add some public key cryptography here? + + We could change the body of a REKEY cell and its ack to be more like + CREATE/CREATED. Then we'd have to add a third step from the client + to the server to acknowledge receipt of the 'CREATED' cell and + changing of the key. + + So, what would this added complexity and computational load buy us? + It would defend against the case where an adversary had compromised + the shared key material for a circuit, but was not able to compromise + the rekey process. I'm not sure that this is reasonable; the + likeliest cases I can think of here seem to be "get compromised, stay + compromised" for a circuit. + +3.2. Hey, could we use this for forward secrecy with AES-CTR? + + We could, but the best solution to AES-CTR's limitations right now is + to stop using our AES-CTR setup. Anything that supports REKEY will + also presumably support AEZ or something better. + +3.3. We could upgrade ciphers with this! + + Yes we could. We could define this not only to change the key, but + to upgrade to a better ciphersuite. For example, we could start by + negotiating AES-CTR, and then "secretly" upgrade to AEZ. I'm not + sure that's worth the complexity, or that it would really be secret + in the presence of traffic analysis. + +
tor-commits@lists.torproject.org