commit 875fdaecd544bd477edfa3ee98ec7458f53583ed Author: Isis Lovecruft isis@torproject.org Date: Fri Jul 22 12:03:10 2016 +0000
Assign the RebelAlliance hybrid handshake proposal a number. --- proposals/000-index.txt | 2 + proposals/270-newhope-hybrid-handshake.txt | 764 +++++++++++++++++++++++++++++ proposals/XXX-newhope-hybrid-handshake.txt | 764 ----------------------------- proposals/proposal-status.txt | 6 + 4 files changed, 772 insertions(+), 764 deletions(-)
diff --git a/proposals/000-index.txt b/proposals/000-index.txt index c794db7..4af78b9 100644 --- a/proposals/000-index.txt +++ b/proposals/000-index.txt @@ -190,6 +190,7 @@ Proposals by number: 267 Tor Consensus Transparency [DRAFT] 268 New Guard Selection Behaviour [DRAFT] 269 Transitionally secure hybrid handshakes [DRAFT] +270 RebelAlliance: A Post-Quantum Secure Hybrid Handshake Based on NewHope [DRAFT]
Proposals by status: @@ -217,6 +218,7 @@ Proposals by status: 267 Tor Consensus Transparency 268 New Guard Selection Behaviour 269 Transitionally secure hybrid handshakes + 270 RebelAlliance: A Post-Quantum Secure Hybrid Handshake Based on NewHope NEEDS-REVISION: 190 Bridge Client Authorization Based on a Shared Secret NEEDS-RESEARCH: diff --git a/proposals/270-newhope-hybrid-handshake.txt b/proposals/270-newhope-hybrid-handshake.txt new file mode 100644 index 0000000..ccf3390 --- /dev/null +++ b/proposals/270-newhope-hybrid-handshake.txt @@ -0,0 +1,764 @@ +Filename: 270-newhope-hybrid-handshake.txt +Title: RebelAlliance: A Post-Quantum Secure Hybrid Handshake Based on NewHope +Author: Isis Lovecruft, Peter Schwabe +Created: 16 Apr 2016 +Updated: 22 Jul 2016 +Status: Draft +Depends: prop#220 prop#249 prop#264 prop#270 + +§0. Introduction + + RebelAlliance is a post-quantum secure hybrid handshake, comprised of an + alliance between the X25519 and NewHope key exchanges. + + NewHope is a post-quantum-secure lattice-based key-exchange protocol based + on the ring-learning-with-errors (Ring-LWE) problem. We propose a hybrid + handshake for Tor, based on a combination of Tor's current NTor handshake + and a shared key derived through a NewHope ephemeral key exchange. + + For further details on the NewHope key exchange, the reader is referred to + "Post-quantum key exchange - a new hope" by Alkim, Ducas, Pöppelmann, and + Schwabe [0][1]. + + For the purposes of brevity, we consider that NTor is currently the only + handshake protocol in Tor; the older TAP protocol is ignored completely, due + to the fact that it is currently deprecated and nearly entirely unused. + + +§1. Motivation + + An attacker currently monitoring and storing circuit-layer NTor handshakes + who later has the ability to run Shor's algorithm on a quantum computer will + be able to break Tor's current handshake protocol and decrypt previous + communications. + + It is unclear if and when such attackers equipped with large quantum + computers will exist, but various estimates by researchers in quantum + physics and quantum engineering give estimates of only 1 to 2 decades. + Clearly, the security requirements of many Tor users include secrecy of + their messages beyond this time span, which means that Tor needs to update + the key exchange to protect against such attackers as soon as possible. + + +§2. Design + + An initiator and responder, in parallel, conduct two handshakes: + + - An X25519 key exchange, as described in the description of the NTor + handshake in Tor proposal #216. + - A NewHope key exchange. + + The shared keys derived from these two handshakes are then concatenated and + used as input to the SHAKE-256 extendable output function (XOF), as described + in FIPS-PUB-202 [2], in order to produce a shared key of the desired length. + The testvectors in §C assume that this key has a length of 32 bytes, but the + use of a XOF allows arbitrary lengths to easily support future updates of + the symmetric primitives using the key. See also §3.3.1. + + +§3. Specification + +§3.1. Notation + + Let `a || b` be the concatenation of a with b. + + Let `a^b` denote the exponentiation of a to the bth power. + + Let `a == b` denote the equality of a with b, and vice versa. + + Let `a := b` be the assignment of the value of b to the variable a. + + Let `H(x)` be 32-bytes of output of the SHAKE-256 XOF (as described in + FIPS-PUB-202) applied to message x. + + Let X25519 refer to the curve25519-based key agreement protocol described + in RFC7748 §6.1. [3] + + Let `EXP(a, b) == X25519(., b, a)` with `g == 9`. Let X25519_KEYGEN() do + the appropriate manipulations when generating the secret key (clearing the + low bits, twidding the high bits). Additionally, EXP() MUST include the + check for all-zero output due to the input point being of small + order (cf. RFC7748 §6). + + Let `X25519_KEYID(B) == B` where B is a valid X25519 public key. + + When representing an element of the Curve25519 subgroup as a byte string, + use the standard (32-byte, little-endian, x-coordinate-only) representation + for Curve25519 points. + + Let `ID` be a router's identity key taken from the router microdescriptor. + In the case for relays possessing Ed25519 identity keys (cf. Tor proposal + #220), this is a 32-byte string representing the public Ed25519 identity key. + For backwards and forwards compatibility with routers which do not possess + Ed25519 identity keys, this is a 32-byte string created via the output of + H(ID). + + We refer to the router as the handshake "responder", and the client (which + may be an OR or an OP) as the "initiator". + + + ID_LENGTH [32 bytes] + H_LENGTH [32 bytes] + G_LENGTH [32 bytes] + + PROTOID := "pqtor-x25519-newhope-shake256-1" + T_MAC := PROTOID || ":mac" + T_KEY := PROTOID || ":key_extract" + T_VERIFY := PROTOID || ":verify" + + (X25519_SK, X25519_PK) := X25519_KEYGEN() + + +§3.2. Protocol + + ======================================================================================== + | | + | Fig. 1: The NewHope-X25519 Hybrid Handshake. | + | | + | Before the handshake the Initiator is assumed to know Z, a public X25519 key for | + | the Responder, as well as the Responder's ID. | + ---------------------------------------------------------------------------------------- + | | + | Initiator Responder | + | | + | SEED := H(randombytes(32)) | + | x, X := X25519_KEYGEN() | + | a, A := NEWHOPE_KEYGEN(SEED) | + | CLIENT_HDATA := ID || Z || X || A | + | | + | --- CLIENT_HDATA ---> | + | | + | y, Y := X25519_KEYGEN() | + | NTOR_KEY, AUTH := NTOR_SHAREDB(X,y,Y,z,Z,ID,B) | + | M, NEWHOPE_KEY := NEWHOPE_SHAREDB(A) | + | SERVER_HDATA := Y || AUTH || M | + | sk := SHAKE-256(NTOR_KEY || NEWHOPE_KEY) | + | | + | <-- SERVER_HDATA ---- | + | | + | NTOR_KEY := NTOR_SHAREDA(x, X, Y, Z, ID, AUTH) | + | NEWHOPE_KEY := NEWHOPE_SHAREDA(M, a) | + | sk := SHAKE-256(NTOR_KEY || NEWHOPE_KEY) | + | | + ======================================================================================== + + +§3.2.1. The NTor Handshake + +§3.2.1.1. Prologue + + Take a router with identity ID. As setup, the router generates a secret key z, + and a public onion key Z with: + + z, Z := X25519_KEYGEN() + + The router publishes Z in its server descriptor in the "ntor-onion-key" entry. + Henceforward, we refer to this router as the "responder". + + +§3.2.1.2. Initiator + + To send a create cell, the initiator generates a keypair: + + x, X := X25519_KEYGEN() + + and creates the NTor portion of a CREATE2V cell's HDATA section: + + CLIENT_NTOR := ID || Z || X [96 bytes] + + The initiator includes the responder's ID and Z in the CLIENT_NTOR so that, in + the event the responder OR has recently rotated keys, the responder can + determine which keypair to use. + + The initiator then concatenates CLIENT_NTOR with CLIENT_NEWHOPE (see §3.2.2), + to create CLIENT_HDATA, and creates and sends a CREATE2V cell (see §A.1) + to the responder. + + CLIENT_NEWHOPE [1824 bytes] (see §3.2.2) + CLIENT_HDATA := CLIENT_NTOR || CLIENT_NEWHOPE [1920 bytes] + + If the responder does not respond with a CREATED2V cell, the initiator SHOULD + NOT attempt to extend the circuit through the responder by sending fragmented + EXTEND2 cells, since the responder's lack of support for CREATE2V cells is + assumed to imply the responder also lacks support for fragmented EXTEND2 + cells. Alternatively, for initiators with a sufficiently late consensus + method, the initiator MUST check that "proto" line in the responder's + descriptor (cf. Tor proposal #264) advertises support for the "Relay" + subprotocol version 3 (see §5). + + +§3.2.1.3. Responder + + The responder generates a keypair of y, Y = X25519_KEYGEN(), and does + NTOR_SHAREDB() as follows: + + (NTOR_KEY, AUTH) ← NTOR_SHAREDB(X, y, Y, z, Z, ID, B): + secret_input := EXP(X, y) || EXP(X, z) || ID || B || Z || Y || PROTOID + NTOR_KEY := H(secret_input, T_KEY) + verify := H(secret_input, T_VERIFY) + auth_input := verify || ID || Z || Y || X || PROTOID || "Server" + AUTH := H(auth_input, T_MAC) + + The responder sends a CREATED2V cell containing: + + SERVER_NTOR := Y || AUTH [64 bytes] + SERVER_NEWHOPE [2048 bytes] (see §3.2.2) + SERVER_HDATA := SERVER_NTOR || SERVER_NEWHOPE [2112 bytes] + + and sends this to the initiator. + + +§3.2.1.4. Finalisation + + The initiator then checks Y is in G^* [see NOTE below], and does + NTOR_SHAREDA() as follows: + + (NTOR_KEY) ← NTOR_SHAREDA(x, X, Y, Z, ID, AUTH) + secret_input := EXP(Y, x) || EXP(Z, x) || ID || Z || X || Y || PROTOID + NTOR_KEY := H(secret_input, T_KEY) + verify := H(secret_input, T_VERIFY) + auth_input := verify || ID || Z || Y || X || PROTOID || "Server" + if AUTH == H(auth_input, T_MAC) + return NTOR_KEY + + Both parties now have a shared value for NTOR_KEY. They expand this into + the keys needed for the Tor relay protocol. + + [XXX We think we want to omit the final hashing in the production of NTOR_KEY + here, and instead put all the inputs through SHAKE-256. --isis, peter] + + [XXX We probably want to remove ID and B from the input to the shared key + material, since they serve for authentication but, as pre-established + "prologue" material to the handshake, they should not be used in attempts to + strengthen the cryptographic suitability of the shared key. Also, their + inclusion is implicit in the DH exponentiations. I should probably ask Ian + about the reasoning for the original design choice. --isis] + + +§3.2.2. The NewHope Handshake + +§3.2.2.1. Parameters & Mathematical Structures + + Let ℤ be the ring of rational integers. Let ℤq, for q ≥ 1, denote the quotient + ring ℤ/qℤ. We define R = ℤ[X]/((X^n)+1) as the ring of integer polynomials + modulo ((X^n)+1), and Rq = ℤq[X]/((X^n)+1) as the ring of integer polynomials + modulo ((X^n)+1) where each coefficient is reduced modulo q. When we refer to + a polynomial, we mean an element of Rq. + + n := 1024 + q := 12289 + + SEED [32 Bytes] + NEWHOPE_POLY [1792 Bytes] + NEWHOPE_REC [256 Bytes] + NEWHOPE_KEY [32 Bytes] + + NEWHOPE_MSGA := (NEWHOPE_POLY || SEED) + NEWHOPE_MSGB := (NEWHOPE_POLY || NEWHOPE_REC) + + +§3.2.2.2. High-level Description of Newhope API Functions + + For a description of internal functions, see §B. + + (NEWHOPE_POLY, NEWHOPE_MSGA) ← NEWHOPE_KEYGEN(SEED): + â := gen_a(seed) + s := poly_getnoise() + e := poly_getnoise() + ŝ := poly_ntt(s) + ê := poly_ntt(e) + b̂ := pointwise(â, ŝ) + ê + sp := poly_tobytes(ŝ) + bp := poly_tobytes(b̂) + return (sp, (bp || seed)) + + (NEWHOPE_MSGB, NEWHOPE_KEY) ← NEWHOPE_SHAREDB(NEWHOPE_MSGA): + s' := poly_getnoise() + e' := poly_getnoise() + e" := poly_getnoise() + b̂ := poly_frombytes(bp) + â := gen_a(seed) + ŝ' := poly_ntt(s') + ê' := poly_ntt(e') + û := poly_pointwise(â, ŝ') + ê' + v := poly_invntt(poly_pointwise(b̂,ŝ')) + e" + r := helprec(v) + up := poly_tobytes(û) + k := rec(v, r) + return ((up || r), k) + + NEWHOPE_KEY ← NEWHOPE_SHAREDA(NEWHOPE_MSGB, NEWHOPE_POLY): + û := poly_frombytes(up) + ŝ := poly_frombytes(sp) + v' := poly_invntt(poly_pointwise(û, ŝ)) + k := rec(v', r) + return k + + When a client uses a SEED within a CREATE2V cell, the client SHOULD NOT use + that SEED in any other CREATE2V or EXTEND2 cells. See §4 for further + discussion. + + +§3.3. Key Expansion + + The client and server derive a shared key, SHARED, by: + + HKDFID := "THESE ARENT THE DROIDS YOURE LOOKING FOR" + SHARED := SHAKE_256(HKDFID || NTorKey || NewHopeKey) + + +§3.3.1. Note on the Design Choice + + The reader may wonder why one would use SHAKE-256 to produce a 256-bit + output, since the security strength in bits for SHAKE-256 is min(d/2,256) + for collision resistance and min(d,256) for first- and second-order + preimages, where d is the output length. + + The reasoning is that we should be aiming for 256-bit security for all of + our symmetric cryptography. One could then argue that we should just use + SHA3-256 for the KDF. We choose SHAKE-256 instead in order to provide an + easy way to derive longer shared secrets in the future without requiring a + new handshake. The construction is odd, but the future is bright. + As we are already using SHAKE-256 for the 32-byte output hash, we are also + using it for all other 32-byte hashes involved in the protocol. Note that + the only difference between SHA3-256 and SHAKE-256 with 32-byte output is + one domain-separation byte. + + [XXX why would you want 256-bit security for the symmetric side? Are you + talking pre- or post-quantum security? --peter] + + +§4. Security & Anonymity Implications + + This handshake protocol is one-way authenticated. That is, the server is + authenticated, while the client remains anonymous. + + The client MUST NOT cache and reuse SEED. Doing so gives non-trivial + adversarial advantages w.r.t. all-for-the-price-of-one attacks during the + caching period. More importantly, if the SEED used to generate NEWHOPE_MSGA + is reused for handshakes along the same circuit or multiple different + circuits, an adversary conducting a sybil attack somewhere along the path(s) + will be able to correlate the identity of the client across circuits or + hops. + + +§5. Compatibility + + Because our proposal requires both the client and server to send more than + the 505 bytes possible within a CREATE2 cell's HDATA section, it depends + upon the implementation of a mechanism for allowing larger CREATE cells + (cf. Tor proposal #249). + + We reserve the following handshake type for use in CREATE2V/CREATED2V and + EXTEND2V/EXTENDED2V cells: + + 0x0003 [NEWHOPE + X25519 HYBRID HANDSHAKE] + + We introduce a new sub-protocol number, "Relay=3", (cf. Tor proposal #264 + §5.3) to signify support this handshake, and hence for the CREATE2V and + fragmented EXTEND2 cells which it requires. + + There are no additional entries or changes required within either router + descriptors or microdescriptors to support this handshake method, due to the + NewHope keys being ephemeral and derived on-the-fly, and due to the NTor X25519 + public keys already being included within the "ntor-onion-key" entry. + + Add a "UseNewHopeKEX" configuration option and a corresponding consensus + parameter to control whether clients prefer using this NewHope hybrid + handshake or some previous handshake protocol. If the configuration option + is "auto", clients SHOULD obey the consensus parameter. The default + configuration SHOULD be "auto" and the consensus value SHOULD initially be "0". + + +§6. Implementation + + The paper by Alkim, Ducas, Pöppelmann and Schwabe describes two software + implementations of NewHope, one C reference implementation and an optimized + implementation using AVX2 vector instructions. Those implementations are + available at [1]. + + Additionally, there are implementations in Go by Yawning Angel, available + from [4] and in Rust by Isis Lovecruft, available from [5]. + + The software used to generate the test vectors in §C is based on the C + reference implementation and available from: + + https://code.ciph.re/isis/newhope-tor-testvectors + https://github.com/isislovecruft/newhope-tor-testvectors + + +§7. Performance & Scalability + + The computationally expensive part in the current NTor handshake is the + X25519 key-pair generation and the X25519 shared-key computation. The + current implementation in Tor is a wrapper to support various highly optimized + implementations on different architectures. On Intel Haswell processors, the + fastest implementation of X25519, as reported by the eBACS benchmarking + project [6], takes 169920 cycles for key-pair generation and 161648 cycles + for shared-key computation; these add up to a total of 331568 cycles on each + side (initiator and responder). + + The C reference implementation of NewHope, also benchmarked on Intel + Haswell, takes 358234 cycles for the initiator and 402058 cycles for the + Responder. The core computation of the proposed combination of NewHope and + X25519 will thus mean a slowdown of about a factor of 2.1 for the Initiator + and a slowdown by a factor of 2.2 for the Responder compared to the current + NTor handshake. These numbers assume a fully optimized implementation of the + NTor handshake and a C reference implementation of NewHope. With optimized + implementations of NewHope, such as the one for Intel Haswell described in + [0], the computational slowdown will be considerably smaller than a factor + of 2. + + +§8. References + +[0]: https://cryptojedi.org/papers/newhope-20160328.pdf +[1]: https://cryptojedi.org/crypto/#newhope +[2]: http://www.nist.gov/customcf/get_pdf.cfm?pub_id=919061 +[3]: https://tools.ietf.org/html/rfc7748#section-6.1 +[4]: https://github.com/Yawning/newhope +[5]: https://code.ciph.re/isis/newhopers +[6]: http://bench.cr.yp.to + + +§A. Cell Formats + +§A.1. CREATE2V Cells + + The client portion of the handshake should send CLIENT_HDATA, formatted + into a CREATE2V cell as follows: + + CREATE2V { [2114 bytes] + HTYPE := 0x0003 [2 bytes] + HLEN := 0x0780 [2 bytes] + HDATA := CLIENT_HDATA [1920 bytes] + IGNORED := 0x00 [194 bytes] + } + + [XXX do we really want to pad with IGNORED to make CLIENT_HDATA the + same number of bytes as SERVER_HDATA? --isis] + +§A.2. CREATED2V Cells + + The server responds to the client's CREATE2V cell with SERVER_HDATA, + formatted into a CREATED2V cell as follows: + + CREATED2V { [2114 bytes] + HLEN := 0x0800 [2 bytes] + HDATA := SERVER_HDATA [2112 bytes] + IGNORED := 0x00 [0 bytes] + } + +§A.3. Fragmented EXTEND2 Cells + + When the client wishes to extend a circuit, the client should fragment + CLIENT_HDATA into four EXTEND2 cells: + + EXTEND2 { + NSPEC := 0x02 { [1 byte] + LINK_ID_SERVER [22 bytes] XXX + LINK_ADDRESS_SERVER [8 bytes] XXX + } + HTYPE := 0x0003 [2 bytes] + HLEN := 0x0780 [2 bytes] + HDATA := CLIENT_HDATA[0,461] [462 bytes] + } + EXTEND2 { + NSPEC := 0x00 [1 byte] + HTYPE := 0xFFFF [2 bytes] + HLEN := 0x0000 [2 bytes] + HDATA := CLIENT_HDATA[462,954] [492 bytes] + } + EXTEND2 { + NSPEC := 0x00 [1 byte] + HTYPE := 0xFFFF [2 bytes] + HLEN := 0x0000 [2 bytes] + HDATA := CLIENT_HDATA[955,1447] [492 bytes] + } + EXTEND2 { + NSPEC := 0x00 [1 byte] + HTYPE := 0xFFFF [2 bytes] + HLEN := 0x0000 [2 bytes] + HDATA := CLIENT_HDATA[1448,1919] || 0x00[20] [492 bytes] + } + EXTEND2 { + NSPEC := 0x00 [1 byte] + HTYPE := 0xFFFF [2 bytes] + HLEN := 0x0000 [2 bytes] + HDATA := 0x00[172] [172 bytes] + } + + The client sends this to the server to extend the circuit from, and that + server should format the fragmented EXTEND2 cells into a CREATE2V cell, as + described in §A.1. + +§A.4. Fragmented EXTENDED2 Cells + + EXTENDED2 { + NSPEC := 0x02 { [1 byte] + LINK_ID_SERVER [22 bytes] XXX + LINK_ADDRESS_SERVER [8 bytes] XXX + } + HTYPE := 0x0003 [2 bytes] + HLEN := 0x0800 [2 bytes] + HDATA := SERVER_HDATA[0,461] [462 bytes] + } + EXTENDED2 { + NSPEC := 0x00 [1 byte] + HTYPE := 0xFFFF [2 bytes] + HLEN := 0x0000 [2 bytes] + HDATA := SERVER_HDATA[462,954] [492 bytes] + } + EXTEND2 { + NSPEC := 0x00 [1 byte] + HTYPE := 0xFFFF [2 bytes] + HLEN := 0x0000 [2 bytes] + HDATA := SERVER_HDATA[955,1447] [492 bytes] + } + EXTEND2 { + NSPEC := 0x00 [1 byte] + HTYPE := 0xFFFF [2 bytes] + HLEN := 0x0000 [2 bytes] + HDATA := SERVER_HDATA[1448,1939] [492 bytes] + } + EXTEND2 { + NSPEC := 0x00 [1 byte] + HTYPE := 0xFFFF [2 bytes] + HLEN := 0x0000 [2 bytes] + HDATA := SERVER_HDATA[1940,2112] [172 bytes] + } + + +§B. NewHope Internal Functions + + gen_a(SEED): returns a uniformly random poly + poly_getnoise(): returns a poly sampled from a centered binomial + poly_ntt(poly): number-theoretic transform; returns a poly + poly_invntt(poly): inverse number-theoretic transform; returns a poly + poly_pointwise(poly, poly): pointwise multiplication; returns a poly + poly_tobytes(poly): packs a poly to a NEWHOPE_POLY byte array + poly_frombytes(NEWHOPE_POLY): unpacks a NEWHOPE_POLY byte array to a poly + + helprec(poly): returns a NEWHOPE_REC byte array + rec(poly, NEWHOPE_REC): returns a NEWHOPE_KEY + + + --- Description of the Newhope internal functions --- + + gen_a(SEED seed) receives as input a 32-byte (public) seed. It expands + this seed through SHAKE-128 from the FIPS202 standard. The output of SHAKE-128 + is considered a sequence of 16-bit little-endian integers. This sequence is + used to initialize the coefficients of the returned polynomial from the least + significant (coefficient of X^0) to the most significant (coefficient of + X^1023) coefficient. For each of the 16-bit integers first eliminate the + highest two bits (to make it a 14-bit integer) and then use it as the next + coefficient if it is smaller than q=12289. + Note that the amount of output required from SHAKE to initialize all 1024 + coefficients of the polynomial varies depending on the input seed. + Note further that this function does not process any secret data and thus does + not need any timing-attack protection. + + + poly_getnoise() first generates 4096 bytes of uniformly random data. This can + be done by reading these bytes from the system's RNG; efficient + implementations will typically only read a 32-byte seed from the system's RNG + and expand it through some fast PRG (for example, ChaCha20 or AES-256 in CTR + mode). The output of the PRG is considered an array of 2048 16-bit integers + r[0],...,r[2047]. The coefficients of the output polynomial are computed as + HW(r[0])-HW(r[1]), HW(r[2])-HW(r[3]),...,HW(r[2046])-HW(r[2047]), where HW + stands for Hamming weight. + Note that the choice of RNG is a local decision; different implementations are + free to use different RNGs. + Note further that the output of this function is secret; the PRG (and the + computation of HW) need to be protected against timing attacks. + + + poly_ntt(poly f): For a mathematical description of poly_ntt see the [0]; a + pseudocode description of a very naive in-place transformation of an input + polynomial f = f[0] + f[1]*X + f[2]*X^2 + ... + f[1023]*X^1023 is the + following code (all arithmetic on coefficients performed modulo q): + + psi = 7 + omega = 49 + + for i in range(0,n): + t[i] = f[i] * psi^i + + for i in range(0,n): + f[i] = 0 + for j in range(0,n): + f[i] += t[j] * omega^((i*j)%n) + + Note that this is not how poly_ntt should be implemented if performance is + an issue; in particular, efficient algorithms for the number-theoretic + transform take time O(n*log(n)) and not O(n^2) + Note further that all arithmetic in poly_ntt has to be protected against + timing attacks. + + + poly_invntt(poly f): For a mathematical description of poly_invntt see the + [0]; a pseudocode description of a very naive in-place transformation of an + input polynomial f = f[0] + f[1]*X + f[2]*X^2 + ... + f[1023]*X^1023 is the + following code (all arithmetic on coefficients performed modulo q): + + invpsi = 8778; + invomega = 1254; + invn = 12277; + + for i in range(0,n): + t[i] = f[i]; + + for i in range(0,n): + f[i]=0; + for j in range(0,n): + f[i] += t[j] * invomega^((i*j)%n) + f[i] *= invpsi^i + f[i] *= invn + + Note that this is not how poly_invntt should be implemented if performance + is an issue; in particular, efficient algorithms for the inverse + number-theoretic transform take time O(n*log(n)) and not O(n^2) + Note further that all arithmetic in poly_invntt has to be protected against + timing attacks. + + + poly_pointwise(poly f, poly g) performs pointwise multiplication of the two + polynomials. This means that for f = (f0 + f1*X + f2*X^2 + ... + + f1023*X^1023) and g = (g0 + g1*X + g2*X^2 + ... + g1023*X^1023) it computes + and returns h = (h0 + h1*X + h2*X^2 + ... + h1023*X^1023) with h0 = f0*g0, + h1 = f1*g1,..., h1023 = f1023*g1023. + + + poly_tobytes(poly f) first reduces all coefficents of f modulo q, i.e., + brings them to the interval [0,q-1]. Denote these reduced coefficients as + f0,..., f1023; note that they all fit into 14 bits. The function then packs + those coefficients into an array of 1792 bytes r[0],..., r[1792] in "packed + little-endian representation", i.e., + r[0] = f[0] & 0xff; + r[1] = (f[0] >> 8) & ((f[1] & 0x03) << 6) + r[2] = (f[1] >> 2) & 0xff; + r[3] = (f[1] >> 10) & ((f[2] & 0x0f) << 4) + . + . + . + r[1790] = (f[1022]) >> 12) & ((f[1023] & 0x3f) << 2) + r[1791] = f[1023] >> 6 + Note that this function needs to be protected against timing attacks. In + particular, avoid non-constant-time conditional subtractions (or other + non-constant-time expressions) in the reduction modulo q of the coefficients. + + + poly_frombytes(NEWHOPE_POLY b) is the inverse of poly_tobytes; it receives + as input an array of 1792 bytes and coverts it into the internal + representation of a poly. Note that poly_frombytes does not need to check + whether the coefficients are reduced modulo q or reduce coefficients modulo + q. Note further that the function must not leak any information about its + inputs through timing information, as it is also applied to the secret key + of the initiator. + + + helprec(poly f) computes 256 bytes of reconciliation information from the + input poly f. Internally, one byte of reconciliation information is computed + from four coefficients of f by a function helprec4. Let the input polynomial f + = (f0 + f1*X + f2*X^2 + ... + f1023*X^1023); let the output byte array be + r[0],...r[256]. This output byte array is computed as + r[0] = helprec4(f0,f256,f512,f768) + r[1] = helprec4(f1,f257,f513,f769) + r[2] = helprec4(f2,f258,f514,f770) + . + . + . + r[255] = helprec4(f255,f511,f767,f1023), where helprec4 does the following: + + helprec4(x0,x1,x2,x3): + b = randombit() + r0,r1,r2,r3 = CVPD4(8*x0+4*b,8*x1+4*b,8*x2+4*b,8*x3+4*b) + r = (r0 & 0x03) | ((r1 & 0x03) << 2) | ((r2 & 0x03) << 4) | ((r3 & 0x03) << 6) + return r + + The function CVPD4 does the following: + + CVPD4(y0,y1,y2,y3): + v00 = round(y0/2q) + v01 = round(y1/2q) + v02 = round(y2/2q) + v03 = round(y3/2q) + v10 = round((y0-1)/2q) + v11 = round((y1-1)/2q) + v12 = round((y2-1)/2q) + v13 = round((y3-1)/2q) + t = abs(y0 - 2q*v00) + t += abs(y1 - 2q*v01) + t += abs(y2 - 2q*v02) + t += abs(y3 - 2q*v03) + if(t < 2q): + v0 = v00 + v1 = v01 + v2 = v02 + v3 = v03 + k = 0 + else + v0 = v10 + v1 = v11 + v2 = v12 + v3 = v13 + r = 1 + return (v0-v3,v1-v3,v2-v3,k+2*v3) + + In this description, round(x) is defined as ⌊x + 0.5⌋, where ⌊x⌋ rounds to + the largest integer that does not exceed x; abs() returns the absolute + value. + Note that all computations involved in helprec operate on secret data and must + be protected against timing attacks. + + + rec(poly f, NEWHOPE_REC r) computes the pre-hash (see paper) Newhope key from + f and r. Specifically, it computes one bit of key from 4 coefficients of f and + one byte of r. Let f = f0 + f1*X + f2*X^2 + ... + f1023*X^1023 and let r = + r[0],r[1],...,r[255]. Let the bytes of the output by k[0],...,k[31] and let + the bits of the output by k0,...,k255, where + k0 = k[0] & 0x01 + k1 = (k[0] >> 1) & 0x01 + k2 = (k[0] >> 2) & 0x01 + . + . + . + k8 = k[1] & 0x01 + k9 = (k[1] >> 1) & 0x01 + . + . + . + k255 = (k[32] >> 7) + The function rec computes k0,...,k255 as + k0 = rec4(f0,f256,f512,f768,r[0]) + k1 = rec4(f1,f257,f513,f769,r[1]) + . + . + . + k255 = rec4(f255,f511,f767,f1023,r[255]) + + The function rec4 does the following: + + rec4(y0,y1,y2,y3,r): + r0 = r & 0x03 + r1 = (r >> 2) & 0x03 + r2 = (r >> 4) & 0x03 + r3 = (r >> 6) & 0x03 + Decode(8*y0-2q*r0, 8*y1-2q*r1, 8*y2-2q*r2, 8*y3-q*r3) + + The function Decode does the following: + + Decode(v0,v1,v2,v3): + t0 = round(v0/8q) + t1 = round(v1/8q) + t2 = round(v2/8q) + t3 = round(v3/8q) + t = abs(v0 - 8q*t0) + t += abs(v0 - 8q*t0) + t += abs(v0 - 8q*t0) + t += abs(v0 - 8q*t0) + if(t > 1) return 1 + else return 0 + + +§C. Test Vectors diff --git a/proposals/XXX-newhope-hybrid-handshake.txt b/proposals/XXX-newhope-hybrid-handshake.txt deleted file mode 100644 index 9db9a96..0000000 --- a/proposals/XXX-newhope-hybrid-handshake.txt +++ /dev/null @@ -1,764 +0,0 @@ -Filename: XXX-newhope-hybrid-handshake.txt -Title: RebelAlliance: A Post-Quantum Secure Hybrid Handshake Based on NewHope -Author: Isis Lovecruft, Peter Schwabe -Created: 16 Apr 2016 -Updated: 4 May 2016 -Status: Draft -Depends: prop#220 prop#249 prop#264 - -§0. Introduction - - RebelAlliance is a post-quantum secure hybrid handshake, comprised of an - alliance between the X25519 and NewHope key exchanges. - - NewHope is a post-quantum-secure lattice-based key-exchange protocol based - on the ring-learning-with-errors (Ring-LWE) problem. We propose a hybrid - handshake for Tor, based on a combination of Tor's current NTor handshake - and a shared key derived through a NewHope ephemeral key exchange. - - For further details on the NewHope key exchange, the reader is referred to - "Post-quantum key exchange - a new hope" by Alkim, Ducas, Pöppelmann, and - Schwabe [0][1]. - - For the purposes of brevity, we consider that NTor is currently the only - handshake protocol in Tor; the older TAP protocol is ignored completely, due - to the fact that it is currently deprecated and nearly entirely unused. - - -§1. Motivation - - An attacker currently monitoring and storing circuit-layer NTor handshakes - who later has the ability to run Shor's algorithm on a quantum computer will - be able to break Tor's current handshake protocol and decrypt previous - communications. - - It is unclear if and when such attackers equipped with large quantum - computers will exist, but various estimates by researchers in quantum - physics and quantum engineering give estimates of only 1 to 2 decades. - Clearly, the security requirements of many Tor users include secrecy of - their messages beyond this time span, which means that Tor needs to update - the key exchange to protect against such attackers as soon as possible. - - -§2. Design - - An initiator and responder, in parallel, conduct two handshakes: - - - An X25519 key exchange, as described in the description of the NTor - handshake in Tor proposal #216. - - A NewHope key exchange. - - The shared keys derived from these two handshakes are then concatenated and - used as input to the SHAKE-256 extendable output function (XOF), as described - in FIPS-PUB-202 [2], in order to produce a shared key of the desired length. - The testvectors in §C assume that this key has a length of 32 bytes, but the - use of a XOF allows arbitrary lengths to easily support future updates of - the symmetric primitives using the key. See also §3.3.1. - - -§3. Specification - -§3.1. Notation - - Let `a || b` be the concatenation of a with b. - - Let `a^b` denote the exponentiation of a to the bth power. - - Let `a == b` denote the equality of a with b, and vice versa. - - Let `a := b` be the assignment of the value of b to the variable a. - - Let `H(x)` be 32-bytes of output of the SHAKE-256 XOF (as described in - FIPS-PUB-202) applied to message x. - - Let X25519 refer to the curve25519-based key agreement protocol described - in RFC7748 §6.1. [3] - - Let `EXP(a, b) == X25519(., b, a)` with `g == 9`. Let X25519_KEYGEN() do - the appropriate manipulations when generating the secret key (clearing the - low bits, twidding the high bits). Additionally, EXP() MUST include the - check for all-zero output due to the input point being of small - order (cf. RFC7748 §6). - - Let `X25519_KEYID(B) == B` where B is a valid X25519 public key. - - When representing an element of the Curve25519 subgroup as a byte string, - use the standard (32-byte, little-endian, x-coordinate-only) representation - for Curve25519 points. - - Let `ID` be a router's identity key taken from the router microdescriptor. - In the case for relays possessing Ed25519 identity keys (cf. Tor proposal - #220), this is a 32-byte string representing the public Ed25519 identity key. - For backwards and forwards compatibility with routers which do not possess - Ed25519 identity keys, this is a 32-byte string created via the output of - H(ID). - - We refer to the router as the handshake "responder", and the client (which - may be an OR or an OP) as the "initiator". - - - ID_LENGTH [32 bytes] - H_LENGTH [32 bytes] - G_LENGTH [32 bytes] - - PROTOID := "pqtor-x25519-newhope-shake256-1" - T_MAC := PROTOID || ":mac" - T_KEY := PROTOID || ":key_extract" - T_VERIFY := PROTOID || ":verify" - - (X25519_SK, X25519_PK) := X25519_KEYGEN() - - -§3.2. Protocol - - ======================================================================================== - | | - | Fig. 1: The NewHope-X25519 Hybrid Handshake. | - | | - | Before the handshake the Initiator is assumed to know Z, a public X25519 key for | - | the Responder, as well as the Responder's ID. | - ---------------------------------------------------------------------------------------- - | | - | Initiator Responder | - | | - | SEED := H(randombytes(32)) | - | x, X := X25519_KEYGEN() | - | a, A := NEWHOPE_KEYGEN(SEED) | - | CLIENT_HDATA := ID || Z || X || A | - | | - | --- CLIENT_HDATA ---> | - | | - | y, Y := X25519_KEYGEN() | - | NTOR_KEY, AUTH := NTOR_SHAREDB(X,y,Y,z,Z,ID,B) | - | M, NEWHOPE_KEY := NEWHOPE_SHAREDB(A) | - | SERVER_HDATA := Y || AUTH || M | - | sk := SHAKE-256(NTOR_KEY || NEWHOPE_KEY) | - | | - | <-- SERVER_HDATA ---- | - | | - | NTOR_KEY := NTOR_SHAREDA(x, X, Y, Z, ID, AUTH) | - | NEWHOPE_KEY := NEWHOPE_SHAREDA(M, a) | - | sk := SHAKE-256(NTOR_KEY || NEWHOPE_KEY) | - | | - ======================================================================================== - - -§3.2.1. The NTor Handshake - -§3.2.1.1. Prologue - - Take a router with identity ID. As setup, the router generates a secret key z, - and a public onion key Z with: - - z, Z := X25519_KEYGEN() - - The router publishes Z in its server descriptor in the "ntor-onion-key" entry. - Henceforward, we refer to this router as the "responder". - - -§3.2.1.2. Initiator - - To send a create cell, the initiator generates a keypair: - - x, X := X25519_KEYGEN() - - and creates the NTor portion of a CREATE2V cell's HDATA section: - - CLIENT_NTOR := ID || Z || X [96 bytes] - - The initiator includes the responder's ID and Z in the CLIENT_NTOR so that, in - the event the responder OR has recently rotated keys, the responder can - determine which keypair to use. - - The initiator then concatenates CLIENT_NTOR with CLIENT_NEWHOPE (see §3.2.2), - to create CLIENT_HDATA, and creates and sends a CREATE2V cell (see §A.1) - to the responder. - - CLIENT_NEWHOPE [1824 bytes] (see §3.2.2) - CLIENT_HDATA := CLIENT_NTOR || CLIENT_NEWHOPE [1920 bytes] - - If the responder does not respond with a CREATED2V cell, the initiator SHOULD - NOT attempt to extend the circuit through the responder by sending fragmented - EXTEND2 cells, since the responder's lack of support for CREATE2V cells is - assumed to imply the responder also lacks support for fragmented EXTEND2 - cells. Alternatively, for initiators with a sufficiently late consensus - method, the initiator MUST check that "proto" line in the responder's - descriptor (cf. Tor proposal #264) advertises support for the "Relay" - subprotocol version 3 (see §5). - - -§3.2.1.3. Responder - - The responder generates a keypair of y, Y = X25519_KEYGEN(), and does - NTOR_SHAREDB() as follows: - - (NTOR_KEY, AUTH) ← NTOR_SHAREDB(X, y, Y, z, Z, ID, B): - secret_input := EXP(X, y) || EXP(X, z) || ID || B || Z || Y || PROTOID - NTOR_KEY := H(secret_input, T_KEY) - verify := H(secret_input, T_VERIFY) - auth_input := verify || ID || Z || Y || X || PROTOID || "Server" - AUTH := H(auth_input, T_MAC) - - The responder sends a CREATED2V cell containing: - - SERVER_NTOR := Y || AUTH [64 bytes] - SERVER_NEWHOPE [2048 bytes] (see §3.2.2) - SERVER_HDATA := SERVER_NTOR || SERVER_NEWHOPE [2112 bytes] - - and sends this to the initiator. - - -§3.2.1.4. Finalisation - - The initiator then checks Y is in G^* [see NOTE below], and does - NTOR_SHAREDA() as follows: - - (NTOR_KEY) ← NTOR_SHAREDA(x, X, Y, Z, ID, AUTH) - secret_input := EXP(Y, x) || EXP(Z, x) || ID || Z || X || Y || PROTOID - NTOR_KEY := H(secret_input, T_KEY) - verify := H(secret_input, T_VERIFY) - auth_input := verify || ID || Z || Y || X || PROTOID || "Server" - if AUTH == H(auth_input, T_MAC) - return NTOR_KEY - - Both parties now have a shared value for NTOR_KEY. They expand this into - the keys needed for the Tor relay protocol. - - [XXX We think we want to omit the final hashing in the production of NTOR_KEY - here, and instead put all the inputs through SHAKE-256. --isis, peter] - - [XXX We probably want to remove ID and B from the input to the shared key - material, since they serve for authentication but, as pre-established - "prologue" material to the handshake, they should not be used in attempts to - strengthen the cryptographic suitability of the shared key. Also, their - inclusion is implicit in the DH exponentiations. I should probably ask Ian - about the reasoning for the original design choice. --isis] - - -§3.2.2. The NewHope Handshake - -§3.2.2.1. Parameters & Mathematical Structures - - Let ℤ be the ring of rational integers. Let ℤq, for q ≥ 1, denote the quotient - ring ℤ/qℤ. We define R = ℤ[X]/((X^n)+1) as the ring of integer polynomials - modulo ((X^n)+1), and Rq = ℤq[X]/((X^n)+1) as the ring of integer polynomials - modulo ((X^n)+1) where each coefficient is reduced modulo q. When we refer to - a polynomial, we mean an element of Rq. - - n := 1024 - q := 12289 - - SEED [32 Bytes] - NEWHOPE_POLY [1792 Bytes] - NEWHOPE_REC [256 Bytes] - NEWHOPE_KEY [32 Bytes] - - NEWHOPE_MSGA := (NEWHOPE_POLY || SEED) - NEWHOPE_MSGB := (NEWHOPE_POLY || NEWHOPE_REC) - - -§3.2.2.2. High-level Description of Newhope API Functions - - For a description of internal functions, see §B. - - (NEWHOPE_POLY, NEWHOPE_MSGA) ← NEWHOPE_KEYGEN(SEED): - â := gen_a(seed) - s := poly_getnoise() - e := poly_getnoise() - ŝ := poly_ntt(s) - ê := poly_ntt(e) - b̂ := pointwise(â, ŝ) + ê - sp := poly_tobytes(ŝ) - bp := poly_tobytes(b̂) - return (sp, (bp || seed)) - - (NEWHOPE_MSGB, NEWHOPE_KEY) ← NEWHOPE_SHAREDB(NEWHOPE_MSGA): - s' := poly_getnoise() - e' := poly_getnoise() - e" := poly_getnoise() - b̂ := poly_frombytes(bp) - â := gen_a(seed) - ŝ' := poly_ntt(s') - ê' := poly_ntt(e') - û := poly_pointwise(â, ŝ') + ê' - v := poly_invntt(poly_pointwise(b̂,ŝ')) + e" - r := helprec(v) - up := poly_tobytes(û) - k := rec(v, r) - return ((up || r), k) - - NEWHOPE_KEY ← NEWHOPE_SHAREDA(NEWHOPE_MSGB, NEWHOPE_POLY): - û := poly_frombytes(up) - ŝ := poly_frombytes(sp) - v' := poly_invntt(poly_pointwise(û, ŝ)) - k := rec(v', r) - return k - - When a client uses a SEED within a CREATE2V cell, the client SHOULD NOT use - that SEED in any other CREATE2V or EXTEND2 cells. See §4 for further - discussion. - - -§3.3. Key Expansion - - The client and server derive a shared key, SHARED, by: - - HKDFID := "THESE ARENT THE DROIDS YOURE LOOKING FOR" - SHARED := SHAKE_256(HKDFID || NTorKey || NewHopeKey) - - -§3.3.1. Note on the Design Choice - - The reader may wonder why one would use SHAKE-256 to produce a 256-bit - output, since the security strength in bits for SHAKE-256 is min(d/2,256) - for collision resistance and min(d,256) for first- and second-order - preimages, where d is the output length. - - The reasoning is that we should be aiming for 256-bit security for all of - our symmetric cryptography. One could then argue that we should just use - SHA3-256 for the KDF. We choose SHAKE-256 instead in order to provide an - easy way to derive longer shared secrets in the future without requiring a - new handshake. The construction is odd, but the future is bright. - As we are already using SHAKE-256 for the 32-byte output hash, we are also - using it for all other 32-byte hashes involved in the protocol. Note that - the only difference between SHA3-256 and SHAKE-256 with 32-byte output is - one domain-separation byte. - - [XXX why would you want 256-bit security for the symmetric side? Are you - talking pre- or post-quantum security? --peter] - - -§4. Security & Anonymity Implications - - This handshake protocol is one-way authenticated. That is, the server is - authenticated, while the client remains anonymous. - - The client MUST NOT cache and reuse SEED. Doing so gives non-trivial - adversarial advantages w.r.t. all-for-the-price-of-one attacks during the - caching period. More importantly, if the SEED used to generate NEWHOPE_MSGA - is reused for handshakes along the same circuit or multiple different - circuits, an adversary conducting a sybil attack somewhere along the path(s) - will be able to correlate the identity of the client across circuits or - hops. - - -§5. Compatibility - - Because our proposal requires both the client and server to send more than - the 505 bytes possible within a CREATE2 cell's HDATA section, it depends - upon the implementation of a mechanism for allowing larger CREATE cells - (cf. Tor proposal #249). - - We reserve the following handshake type for use in CREATE2V/CREATED2V and - EXTEND2V/EXTENDED2V cells: - - 0x0003 [NEWHOPE + X25519 HYBRID HANDSHAKE] - - We introduce a new sub-protocol number, "Relay=3", (cf. Tor proposal #264 - §5.3) to signify support this handshake, and hence for the CREATE2V and - fragmented EXTEND2 cells which it requires. - - There are no additional entries or changes required within either router - descriptors or microdescriptors to support this handshake method, due to the - NewHope keys being ephemeral and derived on-the-fly, and due to the NTor X25519 - public keys already being included within the "ntor-onion-key" entry. - - Add a "UseNewHopeKEX" configuration option and a corresponding consensus - parameter to control whether clients prefer using this NewHope hybrid - handshake or some previous handshake protocol. If the configuration option - is "auto", clients SHOULD obey the consensus parameter. The default - configuration SHOULD be "auto" and the consensus value SHOULD initially be "0". - - -§6. Implementation - - The paper by Alkim, Ducas, Pöppelmann and Schwabe describes two software - implementations of NewHope, one C reference implementation and an optimized - implementation using AVX2 vector instructions. Those implementations are - available at [1]. - - Additionally, there are implementations in Go by Yawning Angel, available - from [4] and in Rust by Isis Lovecruft, available from [5]. - - The software used to generate the test vectors in §C is based on the C - reference implementation and available from: - - https://code.ciph.re/isis/newhope-tor-testvectors - https://github.com/isislovecruft/newhope-tor-testvectors - - -§7. Performance & Scalability - - The computationally expensive part in the current NTor handshake is the - X25519 key-pair generation and the X25519 shared-key computation. The - current implementation in Tor is a wrapper to support various highly optimized - implementations on different architectures. On Intel Haswell processors, the - fastest implementation of X25519, as reported by the eBACS benchmarking - project [6], takes 169920 cycles for key-pair generation and 161648 cycles - for shared-key computation; these add up to a total of 331568 cycles on each - side (initiator and responder). - - The C reference implementation of NewHope, also benchmarked on Intel - Haswell, takes 358234 cycles for the initiator and 402058 cycles for the - Responder. The core computation of the proposed combination of NewHope and - X25519 will thus mean a slowdown of about a factor of 2.1 for the Initiator - and a slowdown by a factor of 2.2 for the Responder compared to the current - NTor handshake. These numbers assume a fully optimized implementation of the - NTor handshake and a C reference implementation of NewHope. With optimized - implementations of NewHope, such as the one for Intel Haswell described in - [0], the computational slowdown will be considerably smaller than a factor - of 2. - - -§8. References - -[0]: https://cryptojedi.org/papers/newhope-20160328.pdf -[1]: https://cryptojedi.org/crypto/#newhope -[2]: http://www.nist.gov/customcf/get_pdf.cfm?pub_id=919061 -[3]: https://tools.ietf.org/html/rfc7748#section-6.1 -[4]: https://github.com/Yawning/newhope -[5]: https://code.ciph.re/isis/newhopers -[6]: http://bench.cr.yp.to - - -§A. Cell Formats - -§A.1. CREATE2V Cells - - The client portion of the handshake should send CLIENT_HDATA, formatted - into a CREATE2V cell as follows: - - CREATE2V { [2114 bytes] - HTYPE := 0x0003 [2 bytes] - HLEN := 0x0780 [2 bytes] - HDATA := CLIENT_HDATA [1920 bytes] - IGNORED := 0x00 [194 bytes] - } - - [XXX do we really want to pad with IGNORED to make CLIENT_HDATA the - same number of bytes as SERVER_HDATA? --isis] - -§A.2. CREATED2V Cells - - The server responds to the client's CREATE2V cell with SERVER_HDATA, - formatted into a CREATED2V cell as follows: - - CREATED2V { [2114 bytes] - HLEN := 0x0800 [2 bytes] - HDATA := SERVER_HDATA [2112 bytes] - IGNORED := 0x00 [0 bytes] - } - -§A.3. Fragmented EXTEND2 Cells - - When the client wishes to extend a circuit, the client should fragment - CLIENT_HDATA into four EXTEND2 cells: - - EXTEND2 { - NSPEC := 0x02 { [1 byte] - LINK_ID_SERVER [22 bytes] XXX - LINK_ADDRESS_SERVER [8 bytes] XXX - } - HTYPE := 0x0003 [2 bytes] - HLEN := 0x0780 [2 bytes] - HDATA := CLIENT_HDATA[0,461] [462 bytes] - } - EXTEND2 { - NSPEC := 0x00 [1 byte] - HTYPE := 0xFFFF [2 bytes] - HLEN := 0x0000 [2 bytes] - HDATA := CLIENT_HDATA[462,954] [492 bytes] - } - EXTEND2 { - NSPEC := 0x00 [1 byte] - HTYPE := 0xFFFF [2 bytes] - HLEN := 0x0000 [2 bytes] - HDATA := CLIENT_HDATA[955,1447] [492 bytes] - } - EXTEND2 { - NSPEC := 0x00 [1 byte] - HTYPE := 0xFFFF [2 bytes] - HLEN := 0x0000 [2 bytes] - HDATA := CLIENT_HDATA[1448,1919] || 0x00[20] [492 bytes] - } - EXTEND2 { - NSPEC := 0x00 [1 byte] - HTYPE := 0xFFFF [2 bytes] - HLEN := 0x0000 [2 bytes] - HDATA := 0x00[172] [172 bytes] - } - - The client sends this to the server to extend the circuit from, and that - server should format the fragmented EXTEND2 cells into a CREATE2V cell, as - described in §A.1. - -§A.4. Fragmented EXTENDED2 Cells - - EXTENDED2 { - NSPEC := 0x02 { [1 byte] - LINK_ID_SERVER [22 bytes] XXX - LINK_ADDRESS_SERVER [8 bytes] XXX - } - HTYPE := 0x0003 [2 bytes] - HLEN := 0x0800 [2 bytes] - HDATA := SERVER_HDATA[0,461] [462 bytes] - } - EXTENDED2 { - NSPEC := 0x00 [1 byte] - HTYPE := 0xFFFF [2 bytes] - HLEN := 0x0000 [2 bytes] - HDATA := SERVER_HDATA[462,954] [492 bytes] - } - EXTEND2 { - NSPEC := 0x00 [1 byte] - HTYPE := 0xFFFF [2 bytes] - HLEN := 0x0000 [2 bytes] - HDATA := SERVER_HDATA[955,1447] [492 bytes] - } - EXTEND2 { - NSPEC := 0x00 [1 byte] - HTYPE := 0xFFFF [2 bytes] - HLEN := 0x0000 [2 bytes] - HDATA := SERVER_HDATA[1448,1939] [492 bytes] - } - EXTEND2 { - NSPEC := 0x00 [1 byte] - HTYPE := 0xFFFF [2 bytes] - HLEN := 0x0000 [2 bytes] - HDATA := SERVER_HDATA[1940,2112] [172 bytes] - } - - -§B. NewHope Internal Functions - - gen_a(SEED): returns a uniformly random poly - poly_getnoise(): returns a poly sampled from a centered binomial - poly_ntt(poly): number-theoretic transform; returns a poly - poly_invntt(poly): inverse number-theoretic transform; returns a poly - poly_pointwise(poly, poly): pointwise multiplication; returns a poly - poly_tobytes(poly): packs a poly to a NEWHOPE_POLY byte array - poly_frombytes(NEWHOPE_POLY): unpacks a NEWHOPE_POLY byte array to a poly - - helprec(poly): returns a NEWHOPE_REC byte array - rec(poly, NEWHOPE_REC): returns a NEWHOPE_KEY - - - --- Description of the Newhope internal functions --- - - gen_a(SEED seed) receives as input a 32-byte (public) seed. It expands - this seed through SHAKE-128 from the FIPS202 standard. The output of SHAKE-128 - is considered a sequence of 16-bit little-endian integers. This sequence is - used to initialize the coefficients of the returned polynomial from the least - significant (coefficient of X^0) to the most significant (coefficient of - X^1023) coefficient. For each of the 16-bit integers first eliminate the - highest two bits (to make it a 14-bit integer) and then use it as the next - coefficient if it is smaller than q=12289. - Note that the amount of output required from SHAKE to initialize all 1024 - coefficients of the polynomial varies depending on the input seed. - Note further that this function does not process any secret data and thus does - not need any timing-attack protection. - - - poly_getnoise() first generates 4096 bytes of uniformly random data. This can - be done by reading these bytes from the system's RNG; efficient - implementations will typically only read a 32-byte seed from the system's RNG - and expand it through some fast PRG (for example, ChaCha20 or AES-256 in CTR - mode). The output of the PRG is considered an array of 2048 16-bit integers - r[0],...,r[2047]. The coefficients of the output polynomial are computed as - HW(r[0])-HW(r[1]), HW(r[2])-HW(r[3]),...,HW(r[2046])-HW(r[2047]), where HW - stands for Hamming weight. - Note that the choice of RNG is a local decision; different implementations are - free to use different RNGs. - Note further that the output of this function is secret; the PRG (and the - computation of HW) need to be protected against timing attacks. - - - poly_ntt(poly f): For a mathematical description of poly_ntt see the [0]; a - pseudocode description of a very naive in-place transformation of an input - polynomial f = f[0] + f[1]*X + f[2]*X^2 + ... + f[1023]*X^1023 is the - following code (all arithmetic on coefficients performed modulo q): - - psi = 7 - omega = 49 - - for i in range(0,n): - t[i] = f[i] * psi^i - - for i in range(0,n): - f[i] = 0 - for j in range(0,n): - f[i] += t[j] * omega^((i*j)%n) - - Note that this is not how poly_ntt should be implemented if performance is - an issue; in particular, efficient algorithms for the number-theoretic - transform take time O(n*log(n)) and not O(n^2) - Note further that all arithmetic in poly_ntt has to be protected against - timing attacks. - - - poly_invntt(poly f): For a mathematical description of poly_invntt see the - [0]; a pseudocode description of a very naive in-place transformation of an - input polynomial f = f[0] + f[1]*X + f[2]*X^2 + ... + f[1023]*X^1023 is the - following code (all arithmetic on coefficients performed modulo q): - - invpsi = 8778; - invomega = 1254; - invn = 12277; - - for i in range(0,n): - t[i] = f[i]; - - for i in range(0,n): - f[i]=0; - for j in range(0,n): - f[i] += t[j] * invomega^((i*j)%n) - f[i] *= invpsi^i - f[i] *= invn - - Note that this is not how poly_invntt should be implemented if performance - is an issue; in particular, efficient algorithms for the inverse - number-theoretic transform take time O(n*log(n)) and not O(n^2) - Note further that all arithmetic in poly_invntt has to be protected against - timing attacks. - - - poly_pointwise(poly f, poly g) performs pointwise multiplication of the two - polynomials. This means that for f = (f0 + f1*X + f2*X^2 + ... + - f1023*X^1023) and g = (g0 + g1*X + g2*X^2 + ... + g1023*X^1023) it computes - and returns h = (h0 + h1*X + h2*X^2 + ... + h1023*X^1023) with h0 = f0*g0, - h1 = f1*g1,..., h1023 = f1023*g1023. - - - poly_tobytes(poly f) first reduces all coefficents of f modulo q, i.e., - brings them to the interval [0,q-1]. Denote these reduced coefficients as - f0,..., f1023; note that they all fit into 14 bits. The function then packs - those coefficients into an array of 1792 bytes r[0],..., r[1792] in "packed - little-endian representation", i.e., - r[0] = f[0] & 0xff; - r[1] = (f[0] >> 8) & ((f[1] & 0x03) << 6) - r[2] = (f[1] >> 2) & 0xff; - r[3] = (f[1] >> 10) & ((f[2] & 0x0f) << 4) - . - . - . - r[1790] = (f[1022]) >> 12) & ((f[1023] & 0x3f) << 2) - r[1791] = f[1023] >> 6 - Note that this function needs to be protected against timing attacks. In - particular, avoid non-constant-time conditional subtractions (or other - non-constant-time expressions) in the reduction modulo q of the coefficients. - - - poly_frombytes(NEWHOPE_POLY b) is the inverse of poly_tobytes; it receives - as input an array of 1792 bytes and coverts it into the internal - representation of a poly. Note that poly_frombytes does not need to check - whether the coefficients are reduced modulo q or reduce coefficients modulo - q. Note further that the function must not leak any information about its - inputs through timing information, as it is also applied to the secret key - of the initiator. - - - helprec(poly f) computes 256 bytes of reconciliation information from the - input poly f. Internally, one byte of reconciliation information is computed - from four coefficients of f by a function helprec4. Let the input polynomial f - = (f0 + f1*X + f2*X^2 + ... + f1023*X^1023); let the output byte array be - r[0],...r[256]. This output byte array is computed as - r[0] = helprec4(f0,f256,f512,f768) - r[1] = helprec4(f1,f257,f513,f769) - r[2] = helprec4(f2,f258,f514,f770) - . - . - . - r[255] = helprec4(f255,f511,f767,f1023), where helprec4 does the following: - - helprec4(x0,x1,x2,x3): - b = randombit() - r0,r1,r2,r3 = CVPD4(8*x0+4*b,8*x1+4*b,8*x2+4*b,8*x3+4*b) - r = (r0 & 0x03) | ((r1 & 0x03) << 2) | ((r2 & 0x03) << 4) | ((r3 & 0x03) << 6) - return r - - The function CVPD4 does the following: - - CVPD4(y0,y1,y2,y3): - v00 = round(y0/2q) - v01 = round(y1/2q) - v02 = round(y2/2q) - v03 = round(y3/2q) - v10 = round((y0-1)/2q) - v11 = round((y1-1)/2q) - v12 = round((y2-1)/2q) - v13 = round((y3-1)/2q) - t = abs(y0 - 2q*v00) - t += abs(y1 - 2q*v01) - t += abs(y2 - 2q*v02) - t += abs(y3 - 2q*v03) - if(t < 2q): - v0 = v00 - v1 = v01 - v2 = v02 - v3 = v03 - k = 0 - else - v0 = v10 - v1 = v11 - v2 = v12 - v3 = v13 - r = 1 - return (v0-v3,v1-v3,v2-v3,k+2*v3) - - In this description, round(x) is defined as ⌊x + 0.5⌋, where ⌊x⌋ rounds to - the largest integer that does not exceed x; abs() returns the absolute - value. - Note that all computations involved in helprec operate on secret data and must - be protected against timing attacks. - - - rec(poly f, NEWHOPE_REC r) computes the pre-hash (see paper) Newhope key from - f and r. Specifically, it computes one bit of key from 4 coefficients of f and - one byte of r. Let f = f0 + f1*X + f2*X^2 + ... + f1023*X^1023 and let r = - r[0],r[1],...,r[255]. Let the bytes of the output by k[0],...,k[31] and let - the bits of the output by k0,...,k255, where - k0 = k[0] & 0x01 - k1 = (k[0] >> 1) & 0x01 - k2 = (k[0] >> 2) & 0x01 - . - . - . - k8 = k[1] & 0x01 - k9 = (k[1] >> 1) & 0x01 - . - . - . - k255 = (k[32] >> 7) - The function rec computes k0,...,k255 as - k0 = rec4(f0,f256,f512,f768,r[0]) - k1 = rec4(f1,f257,f513,f769,r[1]) - . - . - . - k255 = rec4(f255,f511,f767,f1023,r[255]) - - The function rec4 does the following: - - rec4(y0,y1,y2,y3,r): - r0 = r & 0x03 - r1 = (r >> 2) & 0x03 - r2 = (r >> 4) & 0x03 - r3 = (r >> 6) & 0x03 - Decode(8*y0-2q*r0, 8*y1-2q*r1, 8*y2-2q*r2, 8*y3-q*r3) - - The function Decode does the following: - - Decode(v0,v1,v2,v3): - t0 = round(v0/8q) - t1 = round(v1/8q) - t2 = round(v2/8q) - t3 = round(v3/8q) - t = abs(v0 - 8q*t0) - t += abs(v0 - 8q*t0) - t += abs(v0 - 8q*t0) - t += abs(v0 - 8q*t0) - if(t > 1) return 1 - else return 0 - - -§C. Test Vectors diff --git a/proposals/proposal-status.txt b/proposals/proposal-status.txt index 088ce54..dc0b332 100644 --- a/proposals/proposal-status.txt +++ b/proposals/proposal-status.txt @@ -442,3 +442,9 @@ again to remind me!
Describes a generalised protocol for composing X25519 key exchanges with post-quantum ones. + +270 RebelAlliance: A Post-Quantum Secure Hybrid Handshake Based on NewHope [DRAFT] + + Describes a hybrid handshake based on the ntor handshake and the + NewHope post-quantum key exchange. Currently needs revision to + specify how this proposal depends upon prop#269.