On TLS Empty Record Insertion ----------------------------- I was asked by the Tor project whether it was safe to disable the insertion of empty records, at least in the case of Tor. I claim that it is safe for a variety of applications, in fact, and, in practice, is also so for Tor. Background ---------- OpenSSL prepends an empty record to each outgoing TLS record, in order to avoid an adaptive plaintext attack. See http://www.openssl.org/~bodo/tls-cbc.txt. Since that is a rather convoluted document, I reproduce the attack, which is due to Rogaway, slightly rephrased for clarity, below: "Phil Rogaway observed that CBC mode is not secure against chosen-plaintext attack if the IV is known or can be predicted by the attacker before he choses his plaintext [1]. Similarly, CBC mode is not secure if the attacker can observe the last ciphertext block before choosing the next block of plaintext, because the last block of ciphertext essentially serves as the IV for the rest of the message. The attack itself is very simple. Remember that in CBC mode, each plaintext block is XOR'ed with the last ciphertext block and then encrypted to produce the next ciphertext block. Suppose the attacker suspects that plaintext block P_i might be x, and wants to test whether that's the case, he would choose some later plaintext block P_j to be x XOR C_(i-1) XOR C_(j-1). If his guess is correct, then C_j = Encrypt(P_j XOR C_(j-1)) = Encrypt(P_i XOR C_(i-1)) = C_i, and so he can confirm his guess by looking at whether C_j = C_i." Analysis -------- For this attack to work, the attacker must be able to control a plaintext block for which he knows the preceding ciphertext block, C_(j-1). SSL and TLS open up this problem because the IV for each packet is the previous packet's final ciphertext block[1]. Thus, if the attacker can choose plaintext for a TLS packet that immediately follows a TLS packet he has observed, he can take a guess at the plaintext of any packet he has seen in that ciphertext stream and test whether the guess is correct. Not many applications exist that allow the attacker to do this. In particular, if an attacker has an ability to inject plaintext into a TLS connection that contains data unknown to him, then the connection must contain data not supplied by him. Therefore any protocol running over TLS that contains data entirely controlled by a single party is immune. Note, though, that this does not include protocols like POP3 or IMAP, where the payloads are the result of many different people sending emails. Nor does it include Tor. It does, however, include static web pages. The next important requirement is that the very first block of the plaintext for some TLS packet must be controlled by the attacker, which means that the protocol wrapped in TLS must contain no preamble at the start of each TLS packet. So, any protocol that has a preamble for any attacker-controlled data it sends is also immune. This does include POP3 and IMAP. Web pages, even with dynamic content, might appear to be almost certain to be immune, too, because the preamble is the HTTP header block, or various bits of HTML, but care should be taken because of protocols like AJAX that can use long-lived HTTP connections. Beware, though - if the preamble is short, there may be a statistical attack that relies on the preamble happening to come out right. Also beware that buffering might cause natural breaks in the protocol to not correspond to TLS packet boundaries. The risk from this seems low, since the exact boundaries of the buffers would seem hard to control by the attacker, and the buffer would have to be empty at exactly the right moment. Tor may be vulnerable to such a statistical attack (this attack is due to Steven Murdoch). If padding blocks are enabled, then an attacker can determine if a block is a padding block with (2^128-2^16)/2^128 certainty. The attack works because padding blocks are all zeroes. So, if the attacker thinks block C_i travelling between nodes A and B may be a padding block, he can test for this as follows. The attacker first creates a circuit to A, then extends it to B. He then waits for a TLS packet with an IV s.t. IV xor C_(i-1) has 0x03 as the third byte. This is then used as the plaintext for a block the attacker sends to A over the circuit. Since this block will be a valid relay cell, when node A receives the block it will modify it by mapping the first two bytes (the circuit ID) of the incoming packet to a different value via a lookup table unknown to the attacker. If the lookup happens to leave these two bytes unaltered, then A's encryption of the packet will become the "test" packet described above and the attacker will see C_j = C_i and can conclude that the target packet was, indeed, a padding packet (with high probability). This means that the attacker has a 1 in 2^16 chance of detecting the padding packet at each attempt. Because only one byte (the third byte) of the plaintext is fixed, the attacker can attempt this attack on 1 in 2^8 TLS packets. Beacuse the attacker cannot be certain of the value of the first two plaintext bytes, a plaintext that happens to be all zero except the first two bytes could trigger a false positive, with probablity 2^-112, so therefore the attacker knows the block is a padding block with probability (2^128-2^16)/2^128. Note that because A modifies the circuit ID, the attacker can choose any value he likes in the packet he sends to A. This means that after a successful identification of a padding packet the attacker now knows one entry in the lookup table. If the desired value of the circuit ID (i.e. the first two bytes of IV xor C_(i-1)) happens to match in a future test, then the attacker can use his known value for his ciruit ID and will then be certain of the value in the plaintext. After each successful attack with a different circuit ID, the attacker may learn another entry in this table, so after 2^16 successful attacks the difficulty of an attack falls from 1 in 2^16 to something closer to 1 in 2. Because Tor does not currently actually send padding packets, this attack is not currently applicable. Furthermore, if Tor modified padding to be random instead of all zeroes, this would counter the attack. Finally, the attacker must be able to react in a timely way. Having observed a TLS packet, they must be able to react quickly enough to construct their new packet and have it be the payload for the next TLS packet. This means that any multiplexed protocol that is reasonably heavily loaded is immune. This, in practice, includes Tor. Other Countermeasures --------------------- A Nagle-like algorithm would also make attacks harder in general. Conclusion ---------- Not only is Tor likely to be immune to the attack, in practice, without the need for record insertion, a variety of other protocols are, too. The most widely used protocol, HTTPS, in particular, is immune in most cases. If Tor enables padding, then an attacker could detect padding as described above, but with such low probability it is unclear whether this attack would actually be useful. In any case, randomizing padding is a simple countermeasure. Care should be taken to ensure that a protocol and any particular implementation of that protocol are actually immune as described above before disabling empty packet insertion. [1] The defence of including an empty record before each "real" record works because the attacker is rendered unable to predict the IV for the record he controls.