commit 8ab02c54456fa28acc8e51ef304c65ef95b53304 Author: Nick Mathewson nickm@torproject.org Date: Tue Jun 19 12:14:37 2012 -0400
Add proposal 202: Two improved relay encryption protocols for Tor cells --- proposals/000-index.txt | 2 + proposals/202-improved-relay-crypto.txt | 816 +++++++++++++++++++++++++++++++ 2 files changed, 818 insertions(+), 0 deletions(-)
diff --git a/proposals/000-index.txt b/proposals/000-index.txt index 0293b44..c4a2ef9 100644 --- a/proposals/000-index.txt +++ b/proposals/000-index.txt @@ -122,6 +122,7 @@ Proposals by number: 199 Integration of BridgeFinder and BridgeFinderHelper [OPEN] 200 Adding new, extensible CREATE, EXTEND, and related cells [OPEN] 201 Make bridges report statistics on daily v3 network status requests [OPEN] +202 Two improved relay encryption protocols for Tor cells [OPEN]
Proposals by status: @@ -160,6 +161,7 @@ Proposals by status: 199 Integration of BridgeFinder and BridgeFinderHelper [for 0.2.4.x+] 200 Adding new, extensible CREATE, EXTEND, and related cells 201 Make bridges report statistics on daily v3 network status requests [for 0.2.4.x] + 202 Two improved relay encryption protocols for Tor cells ACCEPTED: 117 IPv6 exits [for 0.2.4.x] 140 Provide diffs between consensuses diff --git a/proposals/202-improved-relay-crypto.txt b/proposals/202-improved-relay-crypto.txt new file mode 100644 index 0000000..dafafdd --- /dev/null +++ b/proposals/202-improved-relay-crypto.txt @@ -0,0 +1,816 @@ +Filename: 202-improved-relay-crypto.txt +Title: Two improved relay encryption protocols for Tor cells +Author: Nick Mathewson +Created: 19 Jun 2012 +Status: Open + + +Overview: + + This document describes two candidate designs for a better Tor + relay encryption/decryption protocol, designed to stymie tagging + attacks and better accord with best practices in protocol design. + + My hope is that readers will examine these protocols, evaluate their + security and practicality, improve on them, and help to pick one for + implementation in Tor. + + In section 1, I describe Tor's current relay crypto protocol, its + drawbacks, how it fits in with the rest of Tor, and some + requirements/desiderata for a replacement. In sections 2 and 3, I + propose two alternative replacements for this protocol. In section + 4, I discuss their pros and cons. + +1. Background and Motivation + +1.0. A short overview of Tor's current protocols + + The core pieces of the Tor protocol are the link protocol, + the circuit extend protocol, the relay protocol, and the stream + protocol. All are documented in [TorSpec]. + + Briefly: + + - The link protocol handles all direct pairwise communication + between nodes. Everything else is transmitted over it. It + uses TLS. + + - The circuit extend protocol uses public-key crypto to set up + multi-node virtual tunnels, called "circuits", from a client + through one or more nodes. + + *** The relay protocol uses established circuits to communicate + from a client to a node on a circuit. That's the one we'll + be talking about here. *** + + - The stream protocol is tunneled over relay protocol; clients + use it to tell servers to open anonymous TCP connections, to + send data, and so forth. Servers use it to report success or + failure opening anonymous TCP connections, to send data from + TCP connections back to clients, and so forth. + + In more detail: The link protocol's job is to connect two + computers with an encrypted, authenticated stream, to + authenticate one or both of them to the other, and to provide a + channel for passing "cells" between them. The circuit extend + protocol's job is to set up circuits: persistent tunnels that + connect a Tor client to an exit node through a series of + (typically three) hops, each of which knows only the previous and + next hop, and each of which has a set of keys that they share + only with the client. Finally, the relay protocol's job is to + allow a client to communicate bidirectionally with the node(s) on + the circuit, once their shared keys have been established. + + (We'll ignore the stream protocol for the rest of this document.) + + Note on terminology: Tor nodes are sometimes called "servers", + "relays", or "routers". I'll use all these terms more or less + interchangeably. For simplicity's sake, I will call the party + who is constructing and using a circuit "the client" or "Alice", + even though nodes sometimes construct circuits too. + + Tor's internal packets are called "cells". All the cells we deal + with here are 512 bytes long. + + The nodes in a circuit are called its "hops"; most circuits are 3 + hops long. This doesn't include the client: when Alice builds a + circuit through nodes Bob_1, Bob_2, and Bob_3, the first hop is + "Bob_1" and the last hop is "Bob_3". + +1.1. The current relay protocol and its drawbacks + + [This section describes Tor's current relay protocol. It is not a + proposal; rather it is what we do now. Sections 2 and 3 have my + proposed replacements for it.] + + A "relay cell" is a cell that is generated by the client to send + to a node, or by a node to send to the client. It's called a + "relay" cell because a node that receives one may need to relay + it to the next or previous node in the circuit (or to handle the + cell itself). + + A relay cell moving towards the client is called "inbound"; a + cell moving away is called "outbound". + + When a relay cell is constructed by the client, the client adds one + layer of crypto for each node that will process the cell, and gives + the cell to the first node in the circuit. Each node in turn then + removes one layer of crypto, and either forwards the cell to the next + node in the circuit or acts on that cell itself. + + When a relay cell is constructed by a node, it encrypts it and sends + it to the preceding node in the circuit. Each node between the + originating node and the client then encrypts the cell and passes it + back to the preceding node. When the client receives the cell, it + removes layers of crypto until it has an unencrypted cell, which it + then acts on. + + In the current protocol, the body of each relay cell contains, in + its unencrypted form: + + Relay command [1 byte] + Zeros [2 bytes] + StreamID [2 bytes] + Digest [4 bytes] + Length [2 bytes] + Data [498 bytes] + + (This only adds up to 509 bytes. That's because the Tor link + protocol transfers 512-byte cells, and has a 3 byte overhead per + cell. Not how we would do it if we were starting over at this + point.) + + At every node of a circuit, the node relaying a cell + encrypts/decrypts it with AES128-CTR. If the cell is outbound + and the "zeros" field is set to all-zeros, the node additionally + checks whether 'digest' is correct. A correct digest is the + first 4 bytes of the running SHA1 digest of: a shared secret, + concatenated with all the relay cells destined for this node on + this circuit so far, including this cell. If _that's_ true, then + the node accepts this cell. (See section 6 of [TorSpec] for full + detail; see section A.1 for a more rigorous description.) + + The current approach has some actual problems. Notably: + + * It permits tagging attacks. Because AES_CTR is an XOR-based + stream cipher, an adversary who controls the first node in a + circuit can XOR anything he likes into the relay cell, and + then see whether he/she encounters an correspondingly + defaced cell at some exit that he also controls. + + That is, the attacker picks some pattern P, and when he + would transmit some outbound relay cell C at hop 1, he + instead transmits C xor P. If an honest exit receives the + cell, the digest will probably be wrong, and the honest exit + will reject it. But if the attacker controls the exit, he + will notice that he has received a cell C' where the digest + doesn't match, but where C' xor P _does_ have a good digest. + The attacker will then know that his two nodes are on the + same circuit, and thereby be able to link the user (whom the + first node sees) to her activities (which the last node sees). + + Back when we did the Tor design, this didn't seem like a + big deal, since an adversary who controls both the first and + last node in a circuit is presumed to win already based on + traffic correlation attacks. This attack seemed strictly + worse than that, since it was trivially detectable in the + case where the attacker _didn't_ control both ends. See + section 4.4 of the Tor paper [TorDesign] for our early + thoughts here; see Xinwen Fu et al's 2009 work for a more + full explanation of the in-circuit tagging attack [XF]; and + see "The 23 Raccoons'" March 2012 "Analysis of the Relative + Severity of Tagging Attacks" mail on tor-dev (and the + ensuing thread) for a discussion of why we may want to care + after all, due to attacks that use tagging to amplify route + capture. [23R] + + It also has some less practical issues. + + * The digest portion is too short. Yes, if you're an attacker + trying to (say) change an "ls *" into an "rm *", you can + only expect to get one altered cell out of 2^32 accepted -- + and all future cells on the circuit will be rejected with + similar probability due to the change in the running hash + -- but 1/2^32 is a pretty high success rate for crypto attacks. + + * It does MAC-then-encrypt. That makes smart people cringe. + + * Its approach to MAC is H(Key | Msg), which is vulnerable to + length extension attack if you've got a Merkle-Damgard hash + (which we do). This isn't relevant in practice right now, + since the only parties who see the digest are the two + parties that rely on it (because of the MAC-then-encrypt). + + +1.2. Some feature requirements + + Relay cells can originate at the client or at a relay. Relay cells + that originate at the client are given to the first node in the + circuit, and constructed so that they will be decrypted and forwarded + by the first M-1 nodes in the circuit, and finally decrypted and + processed by the Mth node, where the client chooses M. (Usually, the + Mth node is the the last one, which will be an exit node.) Relay + cells that originate at a hop in the circuit are given to the + preceding node, and eventually delivered to the client. + + Tor provides a so called "leaky pipe" circuit topology + [TorDesign]: a client can send a cell to any node in the circuit, + not just the last node. I'll try to keep that property, although + historically we haven't really made use of it. + + In order to implement telescoping circuit construction (where the + circuit is built by iteratively asking the last node in the + circuit to extend the circuit one hop more), it's vital that the + last hop of the circuit be able to change. + + Proposal 188 [Prop188] suggests that we implement a "bridge guards" + feature: making some (non-public) nodes insert an extra hop into + the circuit after themselves, in a way that will make it harder + for other nodes in the network to enumerate them. We + therefore want our circuits to be one-hop re-extensible: when the + client extends a circuit from Bob1 to Bob2, we want Bob1 to be + able to introduce a new node "Bob1.5" into the circuit such that + the circuit runs Alice->Bob1->Bob1.5->Bob2. (This feature has + been called "loose source routing".) + + Any new approach should be able to coexist on a circuit + with the old approach. That is, if Alice wants to build a + circuit through Bob1, Bob2, and Bob3, and only Bob2 supports a + revised relay protocol, then Alice should be able to build a + circuit such that she can have Bob1 and Bob3 process each cell + with the current protocol, and Bob2 process it with a revised + protocol. (Why? Because if all nodes in a circuit needed to use + the same relay protocol, then each node could learn information + about the other nodes in the circuit from which relay protocol + was chosen. For example, if Bob1 supports the new protocol, and + sees that the old relay protocol is in use, and knows that Bob2 + supports the new one, then Bob1 has learned that Bob3 is some + node that does not support the new relay protocol.) + + Cell length needs to be constant as cells move through the + network. For historical reasons mentioned above in section 1.1, + the length of the encrypted part of a relay cell needs to be 509 + bytes. + +1.3. Some security requirements + + Two adjacent nodes on a circuit can trivially tell that they are + on the same circuit, and the first node can trivially tell who + the client is. Other than that, we'd like any attacker who + controls a node on the circuit not to have a good way to learn + any other nodes, even if he/she controls those nodes. [*] + + Relay cells should not be malleable: no relay should be able to + alter a cell between an honest sender and an honest recipient in + a way that they cannot detect. + + Relay cells should be secret: nobody but the sender and recipient + of a relay cell should be able to learn what it says. + + Circuits should resist transparent, recoverable tagging attacks: + if an attacker controls one node in a circuit and alters a relay + cell there, no non-adjacent point in the circuit should be able + to recover the relay cell as it would have received it had the + attacker not altered it. + + The above properties should apply to sequences of cells too: + an attacker shouldn't be able to change what sequence of cells + arrives at a destination (for example, by removing, replaying, or + reordering one or more cells) without being detected. + + (Feel free to substitute whatever formalization of the above + requirements makes you happiest, and add whatever caveats are + necessary to make you comfortable. I probably missed at least + two critical properties.) + + [*] Of course, an attacker who controls two points on a circuit + can probably confirm this through traffic correlation. But + we'd prefer that the cryptography not create other, easier + ways for them to do this. + +1.4. A note on algorithms + + This document is deliberately agnostic concerning the choice of + cryptographic primitives -- not because I have no opinions about + good ciphers, MACs, and modes of operation -- but because + experience has taught me that mentioning any particular + cryptographic primitive will prevent discussion of anything else. + + Please DO NOT suggest algorithms to use in implementing these + protocols yet. It will distract! There will be time later! + + If somebody _else_ suggests algorithms to use, for goodness' sake + DON'T ARGUE WITH THEM! There will be time later! + + +2. Design 1: Large-block encryption + + In this approach, we use a tweakable large-block cipher for + encryption and decryption, and a tweak-chaining function TC. + +2.1. Chained large-block what now? + + We assume the existence of a primitive that provides the desired + properties of a tweakable[Tweak] block cipher, taking blocks of any + desired size. (In our case, the block size is 509 bytes[*].) + + It also takes a Key, and a per-block "tweak" parameter that plays + the same role that an IV plays in CBC, or that the counter plays + in counter mode. + + The Tweak-chaining function TC takes as input a previous tweak, a + tweak chaining key, and a cell; it outputs a new tweak. Its + purpose is to make future cells undecryptable unless you have + received all previous cells. It could probably be something like + a MAC of the old tweak and the cell using the tweak chaining key + as the MAC key. + + (If the initial tweak is secret, I am not sure that TC needs to + be keyed.) + + [*] Some large-block cipher constructions use blocks whose size is + the multiple of some underlying cipher's block size. If we wind + up having to use one of those, we can use 496-byte blocks instead + at the cost of 2.5% wasted space. + +2.2. The protocol + +2.2.1. Setup phase + + The circuit construction algorithm needs to produce forward and + backward keys Kf and Kb, the forward and backward tweak chaining + keys TCKf and TCKb, as well as initial tweak values Tf and Tb. + +2.2.2. The cell format + + We replace the "Digest" and "Zeros" fields of the cell with a + single Z-byte "Zeros" field to determine when the cell is + recognized and correctly decrypted; its length is a security + parameter. + +2.2.3. The decryption operations + + For a relay to handle an inbound RELAY cell, it sets Tb_next to + TC(TCKb, Tb, Cell). Then it encrypts the cell using the large + block cipher keyed with Kb and tweaked with Tb. Then it sets Tb + to Tb_next. + + For a relay to handle an outbound RELAY cell, it sets Tf_next to + TC(TCKf, Tf, Cell). Then it decrypts the cell using the large + block cipher keyed with Kf and tweaked with Tf. Then it sets Tf + to Tf_next. Then it checks the 'Zeros' field on the cell; + if that field is all [00] bytes, the cell is for us. + +2.3. Security discussion + + This approach is fairly simple (at least, no more complex than + its primitives) and achieves some of our security goals. Because + of the large block cipher approach, any change to a cell will + render that cell undecryptable, and indistinguishable from random + junk. Because of the tweak chaining approach, if even one cell + is missed or corrupted or reordered, future cells will also + decrypt into random junk. + + The tagging attack in this case is turned into a circuit-junking + attack: an adversary who tries to mount it can probably confirm + that he was indeed first and last node on a target circuit + (assuming that circuits which turn to junk in this way are rare), + but cannot recover the circuit after that point. + + As a neat side observation, note that this approach improves upon + Tor's current forward secrecy, by providing forward secrecy while + circuits are still operational, since each change to the tweak + should make previous cells undecryptable if the old tweak value + isn't recoverable. + + The length of Zeros is a parameter for what fraction of "random + junk" cells will potentially be accepted by a router or client. + If Zeros is Z bytes long, then junk cells will be accepted with + P < 2^-(8*Z + 7). (The '+7' is there because the top 7 bits of + the Length field must also be 0.) + + There's no trouble using this protocol in a mixed circuit, where + some nodes speak the old protocol and some speak the + large-block-encryption protocol. + +3. Design 2: short-MAC-and-pad + + In this design, we behave more similarly to a mix-net design + (such as Mixmaster or Mixminion's headers). Each node checks a + MAC, and then re-pads the cell to its chosen length before + decoding the cell. + + This design uses as a primitive a MAC and a stream cipher. It + might also be possible to use an authenticating cipher mode, + if we can find one that works like a stream cipher and allows us + to efficiently output authenticators for the stream so far. + + NOTE TO AE/AEAD FANS: The encrypt-and-MAC model here could be + replaced with an authenticated encryption mode without too much + loss of generality. + +3.1. The protocol + +3.1.1 Setup phase + + The circuit construction algorithm needs to produce forward and + backward keys Kf and Kb, forward and backward stream cipher IVs + IVf and IVb, and forward and backward MAC keys Mf and Mb. + + Additionally, the circuit construction algorithm needs a way for + the client to securely (and secretly? XXX) tell each hop in the + circuit a value 'bf' for the number of bytes of MAC it should + expect on outbound cells and 'bb' for the number of bytes it + should use on cells it's generating. Each node can get a + different 'bf' and 'bb'. These values can be 0: if a node's bf + is 0, it doesn't authenticate cells; if its bb is 0, it doesn't + originate them. + + The circuit construction algorithm also needs a way to tell each + the client to securely (and secretly? XXX) tell each hop in the + circuit whether it is allowed to be the final destination for + relay cells. + + Set the stream Sf and the stream Sb to empty values. + +3.1.2. The cell format + + The Zeros and Digest field of the cell format are removed. + +3.1.3. The relay operations + + Upon receiving an outbound cell, a node removes the first b bytes + of the cell, and puts them aside as 'M'. The node then computes + between 0 and 2 MACs of the stream consisting of all previously + MAC'd data, plus the remainder of the cell: + + If b>0 and there is a next hop, the node computes M_relay. + + If this node was told to deliver traffic, or it is the last + node in the circuit so far, the node computes M_receive. + + M_relay is computed as MAC(stream | "relay"); M_receive is + computed as MAC(stream | "receive"). + + If M = M_receive, this cell is for the node; it should process + it. + + If M = M_relay, or if b == 0, this cell should be relayed. + + If a MAC was computed and neither of the above cases was met, + then the cell is bogus; the node should discard it and destroy + the circuit. + + The node then removes the first bf bytes of the cell, and pads the + cell at the end with bf zero bytes. Finally, the node decrypts + the whole remaining padded cell with the stream cipher. + + To handle an inbound cell, the node simply does a stream cipher + with no checking. + +3.1.4. Generating inbound cells. + + To generate an inbound cell, a node makes a 509-bb byte RELAY + cell C, encrypts that cell with Kb, appends the encrypted cell to + Sb, and prepends M_receive(Sb) to the cell. + +3.1.5. Generating outbound cells + + Generating an outbound cell is harder, since we need to know what + padding the earlier nodes will generate in order to know what + padding the later nodes will receive and compute their MACs, but + we also need to know what MACs we'll send to the later nodes in + order to compute which MACs we'll send to the earlier ones. + + Mixnet clients have needed to do this for ages, though, so the + algorithms are pretty well settled. I'll give one below in A.3. + +3.2. Security discussion + + This approach is also simple and (given good parameter choices) + can achieve our goals. The trickiest part is the algorithm that + clients must follow to package cells, but that's no so bad. + + It's not necessary to check MACs on inbound traffic, because + nobody but the client can tell scrambled messages from good ones, + and the client can be trusted to keep the client's own secrets. + + With this protocol, if the attacker tries to do a tagging attack, + the circuit should get destroyed by the next node in the circuit + that has a nonzero "bf" value, with probability == 1-2^-(8*bf). + (If there are further intervening honest nodes, they also have a + chance to detect the attack.) + + Similarly, any attempt to replay, or reorder outbound cells + should fail similarly. + + The "bf" values could reveal to each node its position in the + circuit and the client preferences, depending on how we set them; + on the other hand, having a fixed bf value would reveal to the + last node the length of the circuit. Neither option seems great. + + This protocol doesn't provide any additional forward secrecy + beyond what Tor has today. We could fix that by changing our use + of the stream cipher so that instead of running in counter mode + between cells, we use a tweaked stream cipher and change the + tweak with each cell (possibly based on the unused portion of the + MAC). + + This protocol does support loose source routing, so long as the + no padding bytes are added by any router-added nodes. + + In a circuit, every node has at least one relay cell sent to it: + even non-exit nodes get a RELAY_EXTEND cell. + +4. Discussion + + I'm not currently seeing a reason to strongly prefer one of these + approaches over another. + + In favor of large-block encryption: + - The space overhead seems smaller: we need to use up fewer + bytes in order to get equivalent looking security. + + (For example, if we want to have P < 2^64 that a cell altered + by hop 1 could be accepted by hop 2 or hop 3, *and* we want P + < 2^64 that a cell altered by hop 2 could be accepted by hop + 3, the large-block approach needs about 8 bytes for the Zeros + field, whereas the short-MAC-and-pad approach needs 16 bytes + worth of MACs.) + + - We get forward secrecy pretty easily. + + - The format doesn't leak anything about the length of the + circuit, or limit it. + + - We don't need complicated logic to set the 'b' parameters. + + - It doesn't need tricky padding code. + + In the favor of short-MAC-and-pad: + - All of the primitives used are much better analyzed and + understood. There's nothing esoteric there. The format + itself is similar to older, well-analyzed formats. + + - Most of the constructions for the large-block-cipher approach + seem less efficient in CPU cycles than a good stream cipher + and a MAC. (But I don't want to discuss that now; see section + 1.4 above!) + + Unclear: + + - Suppose that an attacker controls the first and last hop of a + circuit, and tries an end-to-end tagging attack. With + large-block encryption, the tagged cell and all future cells + on the circuit turn to junk after the tagging attack, with + P~~100%. With short-MAC-and-pad, the circuit is destroyed at + the second hop, with P ~~ 1- 2^(-8*bf). Is one of these + approaches materially worse for the attacker? + + - Can we do better than the "compute two MACs" approach for + distinguishing the relay and receive cases of the + short-MAC-and-pad protocol? + + - To what extent can we improve these protocols? + + - If we do short-MAC-and-pad, should we apply the forward + security hack alluded to in section 3.2? + +5. Acknowledgments + + Thanks to the many reviewers of the initial drafts of this + proposal. If you can make any sense of what I'm saying, they + deserve much of the credit. + +A. Formal description + + Note that in all these cases, more efficient descriptions exist. + +A.1. The current Tor relay protocol. + + Relay cell format: + + Relay command [1 byte] + Zeros [2 bytes] + StreamID [2 bytes] + Digest [4 bytes] + Length [2 bytes] + Data [498 bytes] + + Circuit setup: + + (Specified elsewhere; the client negotiates with each router in + a circuit the secret AES keys Kf, Kb, and the secret 'digest + keys' Df, and Db. They initialize AES counters Cf and Cb to + 0. They initialize the digest stream Sf to Df, and Sb to Db.) + + HELPER FUNCTION: CHECK(Cell [in], Stream [in,out]): + + 1. If the Zeros field of Cell is not [00 00], return False. + 2. Let Cell' = Cell with its Digest field set to [00 00 00 00]. + 3. Let S' = Stream | Cell'. + 4. If SHA1(S') = the Digest field of Cell, set Stream to S', + and return True. + 5. Otherwise return False. + + HELPER FUNCTION: CONSTRUCT(Cell' [in], Stream [in,out]) + + 1. Set the Zeros and Digest field of Cell' to [00] bytes. + 2. Set Stream to Stream | Cell'. + 3. Construct Cell from Cell' by setting the Digest field to + SHA1(Stream), and taking all other fields as-is. + 4. Return Cell. + + HELPER_FUNCTION: ENCRYPT(Cell [in,out], Key [in], Ctr [in,out]) + 1. Encrypt Cell's contents using AES128_CTR, with key 'Key' and + counter 'Ctr'. Increment 'Ctr' by the length of the cell. + + HELPER_FUNCTION: DECRYPT(Cell [in,out], Key [in], Ctr [in,out]) + 1. Same as ENCRYPT. + + + Router operation, upon receiving an inbound cell -- that is, one + sent towards the client. + + 1. ENCRYPT(cell, Kb, Cb) + 2. Send the decrypted contents towards the client. + + Router operation, upon receiving an outbound cell -- that is, one + sent away from the client. + + 1. DECRYPT(cell, Kf, Cf) + 2. If CHECK(Cell, Sf) is true, this cell is for us. Do not + relay the cell. + 3. Otherwise, this cell is not for us. Send the decrypted cell + to the next hop on the circuit, or discard it if there is no + next hop. + + Router operation, to create a relay cell that will be delivered + to the client. + + 1. Construct a Relay cell Cell' with the relay command, length, + stream ID, and body fields set as appropriate. + 2. Let Cell = CONSTRUCT(Cell', Sb). + 3. ENCRYPT(Cell, Kb, Cb) + 4. Send the encrypted cell towards the client. + + Client operation, receiving an inbound cell. + + For each hop H in a circuit, starting with the first hop and + ending (possibly) with the last: + + 1. DECRYPT(Cell, Kb_H, Cb_H) + + 2. If CHECK(Cell, Sb_H) is true, this cell was sent from hop + H. Exit the loop, and return the cell in its current + form. + + If we reach the end of the loop without finding the right hop, + the cell is bogus or corrupted. + + Client operation, sending an outbound cell to hop H. + + 1. Construct a Relay cell Cell' with the relay command, length, + stream ID, and body fields set as appropriate. + 2. Let Cell = CONSTRUCT(Cell', Sf_H) + 3. For i = H..1: + A. ENCRYPT(Cell, Sf_i, Cf_i) + 4. Deliver Cell to the first hop in the circuit. + +A.2. The large-block-cipher protocol + + Same as A.1, except for the following changes. + + The cell format is now: + Zeros [Z bytes] + Length [2 bytes] + StreamID [2 bytes] + Relay command [1 byte] + Data [504-Z bytes] + + Ctr is no longer a counter, but a Tweak value. + + Each key is now a tuple of (Key_Crypt, Key_TC) + + Streams are no longer used. + + HELPER FUNCTION: CHECK(Cell [in], Stream [in,out]) + 1. If the Zeros field of Cell contains only [00] bytes, + return True. + 2. Otherwise return false. + + HELPER FUNCTION: CONSTRUCT(Cell' [in], Stream [in,out]) + 1. Let Cell be Cell', with its "Zeros" field set to Z [00] + bytes. + 2. Return Cell'. + + HELPER FUNCTION: ENCRYPT(Cell [in,out], Key [in], Tweak [in,out]) + 2. Encrypt Cell using Key and Tweak + 1. Let Tweak' = TC(Key_TC, Tweak, Cell) + 3. Set Tweak to Tweak'. + + HELPER FUNCTION: DECRYPT(Cell [in,out], Key [in], Tweak [in,out]) + 1. Let Tweak' = TC(Key_TC, Tweak, Cell) + 2. Decrypt Cell using Key and Tweak + 3. Set Tweak to Tweak'. + +A.3. The short-MAC-and-pad protocol. + + Define M_relay(K,S) as MAC(K, S|"relay"). + Define M_receive(K,S) as MAC(K, S|"receive"). + Define Z(n) as a series of n [00] bytes. + Define BODY_LEN as 509 + + The cell message format is now: + + Relay command [1 byte] + StreamID [2 bytes] + Length [2 bytes] + Data [variable bytes] + + Helper function: CHECK(Cell [in], b [in], K [in], S [in,out]): + Let M = Cell[0:b] + Let Rest = Cell[b:...] + If b == 0: + Return (nil, Rest) + Let S' = S | Rest + If M == M_relay(K,S')[0:b]: + Set S = S' + Return ("relay", Rest) + If M == M_receive(K,S')[0:b]: + Set S = S' + Return ("receive", Rest) + Return ("BAD", nil) + + HELPER_FUNCTION: ENCRYPT(Cell [in,out], Key [in], Ctr [in,out]) + 1. Encrypt Cell's contents using AES128_CTR, with key 'Key' and + counter 'Ctr'. Increment 'Ctr' by the length of the cell. + + HELPER_FUNCTION: DECRYPT(Cell [in,out], Key [in], Ctr [in,out]) + 1. Same as ENCRYPT. + + Router operation, upon receiving an inbound cell: + 1. ENCRYPT(cell, Kb, Cb) + 2. Send the decrypted contents towards the client. + + Router operation, upon receiving an outbound cell: + 1. Let Status, Msg = CHECK(Cell, bf, Mf, Sf) + 2. If Status == "BAD", drop the cell and destroy the circuit. + 3. Let Cell' = Msg | Z(BODY_LEN - len(Msg)) + 4. DECRYPT(Cell', Kf, Cf) [*] + 5. If Status == "receive" or (Status == nil and there is no + next hop), Cell' is for us: process it. + 6. Otherwise, send Cell' to the next node. + + Router operation, sending a cell towards the client: + 1. Let Body = a relay cell body of BODY_LEN-bb_i bytes. + 2. Let Cell' = ENCRYPT(Body, Kb, Cb) + 3. Let Sb = Sb | Cell' + 4. Let M = M_receive(Mb, Sb)[0:b] + 5. Send the cell M | Cell' back towards the client. + + Client operation, upon receiving an inbound cell: + + For each hop H in the circuit, from first to last: + + 1. Let Status, Msg = CHECK(Cell, bb_i, Mb_i, Sb_i) + 2. If Status = "relay", drop the cell and destroy + the circuit. (BAD is okay; it means that this hop didn't + originate the cell.) + 3. DECRYPT(Msg, Kb_i, Cb_i) + 4. If Status = "receive", this cell is from hop i; process + it. + 5. Otherwise, set Cell = Msg. + + Client operation, sending an outbound cell: + + Let BF = the total of all bf_i values. + + 1. Construct a relay cell body Msg of BODY_LEN-BF bytes. + 2. For each hop i, let Stream_i = ENCRYPT(Kf_i,Z(CELL_LEN),Cf_i) + 3. Let Pad_0 = "". + 4. For i in range 1..N, where N is the number of hops: + Let Pad_i = Pad_{i-1} | Z(bf_i) + Let S_last = the last len(Pad_i) bytes of Stream_i. + Let Pad_i = Pad_i xor S_last + Now Pad_i is the padding as it will stand after node i + has processed it. + + 5. For i in range N..1, where N is the number of hops: + If this is the last hop, let M_* = M_receive. Else let + M_* = M_relay. + + Let Body = Msg xor the first len(Msg) bytes of Stream_i + + Let M = M_*(Mf, Body | Pad_(i-1)) + + Set Msg = M[:bf_i] | Body + + 6. Send Msg outbound to the first relay in the circuit. + + + [*] Strictly speaking, we could omit the pad-and-decrypt + operation once we know we're the final hop. + + + +R. References + +[Prop188] Tor Proposal 188: Bridge Guards and other anti-enumeration defenses + https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/188-bridge-gu... + +[TorSpec] The Tor Protocol Specification + https://gitweb.torproject.org/torspec.git?a=blob_plain;hb=HEAD;f=tor-spec.tx... + +[TorDesign] Dingledine et al, "Tor: The Second Generation Onion + Router", + https://svn.torproject.org/svn/projects/design-paper/tor-design.pdf + +[Tweak] Liskov et al, "Tweakable Block Ciphers", + http://www.cs.berkeley.edu/~daw/papers/tweak-crypto02.pdf + +[XF] Xinwen Fu et al, "One Cell is Enough to Break Tor's Anonymity" + +[23R] The 23 Raccoons, "Analysis of the Relative Severity of Tagging + Attacks" http://archives.seul.org/or/dev/Mar-2012/msg00019.html + (You'll want to read the rest of the thread too.)