Watson Ladd:
(HMAC is a bad idea anyway: quadratic security bounds are not the best possible, and we have to use nonces anyway to prevent replay attacks, so Wegman-Carter is a better idea on both counts: faster and more secure. GCM would be an example of this.)
GCM has quadratic security bounds, not only in the proof but in practice too. Furthermore a 128-bit GCM MAC offers much less than 128-bit security; in this sense it is certainly weaker than, say, MD5-HMAC! And there are the implementation caveats already mentioned in the list, including Joux's "forbidden attack" which may indeed be very real in many implementations.
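For readers unfamiliar with the Wegman-Carter construction behind GCM and Poly1305, here is a toy Python sketch (over a prime field rather than GF(2^128), with made-up key values): the message is hashed with a secret polynomial key r and masked with a per-nonce pad s. It also hints at why nonce reuse is fatal (the "forbidden attack"): two tags under the same pad leak the difference of the polynomial hashes.

```python
# Toy Wegman-Carter MAC: polynomial hash keyed by r, masked by a
# per-nonce one-time pad s. Real schemes (GCM, Poly1305) work over
# GF(2^128) or mod 2^130-5; this illustration uses a Mersenne prime.
P = 2**127 - 1

def poly_hash(r, blocks):
    acc = 0
    for b in blocks:
        acc = (acc + b) * r % P
    return acc

def wc_mac(r, s, blocks):
    # s must be fresh for every message (derived from the nonce)
    return (poly_hash(r, blocks) + s) % P

# Nonce reuse leaks information about r: with the same pad s, it
# cancels out of the difference of two tags.
r, s = 1234567, 7654321
t1 = wc_mac(r, s, [1, 2, 3])
t2 = wc_mac(r, s, [1, 2, 4])
assert (t1 - t2) % P == (poly_hash(r, [1, 2, 3]) - poly_hash(r, [1, 2, 4])) % P
```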
GCM also tends to be slow in software. I've been told that GCM was pushed through the NIST standardization process mainly by Cisco, who were concerned about the performance of their hardware crypto accelerators (many hash functions are tricky to implement in hardware).
There's also the issue of weak keys etc. See
http://eprint.iacr.org/2011/202.pdf
This came out in ECRYPT II Hash Workshop 2011, 19-20 May 2011, Tallinn, Estonia. The current ECRYPT recommendations take this into account, see section 8.5.5 of "ECRYPT II Yearly Report on Algorithms and Keysizes (2010-2011)" for a brief discussion:
http://www.ecrypt.eu.org/documents/D.SPA.17.pdf
I should be expanding this work, as Niels Ferguson noted at the Redmond Crypto Retreat workshop last August that there's a trivial extension of this attack that allows arbitrary message modification. Also a significantly faster method for identifying weak AES-GCM keys has been discovered.
As a hash function researcher I would personally select SHA-512 with digest truncated to required number of bits as an interim solution. SHA-512/256 tends to be faster than SHA-256 in software. Also it makes sense to implement a single hash core rather than two (SHA-256 and SHA-512 are fundamentally different). This is also in the standards, see:
http://csrc.nist.gov/publications/drafts/fips180-4/Draft-FIPS180-4_Feb2011.p...
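As a concrete illustration of the truncation idea, here is a minimal Python sketch. Note that the standardized SHA-512/256 also uses a distinct initial hash value (per FIPS 180-4), so plain truncation of SHA-512, shown here, is not bit-for-bit the standard function; it only illustrates the construction. (Some OpenSSL-backed Python builds also expose the real thing via hashlib.new('sha512_256').)

```python
import hashlib

def sha512_trunc(data: bytes, nbytes: int = 32) -> bytes:
    # Plain truncation of SHA-512 to nbytes. The standardized
    # SHA-512/256 additionally uses a distinct IV, so this is an
    # approximation for illustration, not FIPS 180-4's function.
    return hashlib.sha512(data).digest()[:nbytes]

digest = sha512_trunc(b"hello")
assert len(digest) == 32
```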
Cheers, - markku
Dr. Markku-Juhani O. Saarinen mjos@reveresecurity.com
For what it is worth, I would probably prefer Poly1305-AES over HMAC if I were needing message integrity. I don't know if I would prefer Poly1305-AES over using an integrated-integrity mode like GCM.
On Wed, Nov 2, 2011 at 2:20 AM, Markku-Juhani O. Saarinen mjos@reveresecurity.com wrote:
As a hash function researcher I would personally select SHA-512 with digest truncated to required number of bits as an interim solution. SHA-512/256 tends to be faster than SHA-256 in software.
I like this suggestion because it seems very safe.
However, it isn't the full story to say that SHA-512 tends to be faster than SHA-256 in software. That's true for 64-bit chips, but untrue for 32-bit.
According to [1], while SHA-512 requires only about 2/3 as many CPU cycles as SHA-256 on a powerful Sandy Bridge server chip ("sandy0"), it requires 4 times as many CPU cycles on a 32-bit ARM ("gcc33"). As I've argued recently on this list, it might not matter whether hashing your 4096-byte packet on one core of a powerful server (sandy0) takes 15 μsec (SHA-512) or 22 μsec (SHA-256), but it might matter whether hashing it on a cheap, power-efficient embedded chip (gcc33) takes 120μsec (SHA-256) or 481 μsec (SHA-512).
On the other hand, maybe ~500 μsec time spent hashing per packet is good enough on Freedom Boxes, smart phones, and ARM servers [2], and the added safety of SHA-512/256 vs. SHA-256 would be worth it.
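A quick way to reproduce this kind of comparison on one's own hardware is a micro-benchmark like the following sketch. Absolute numbers will differ from the cited ones; this just times whatever hashlib implementations the local machine has, on a 4096-byte packet as in the example above.

```python
import hashlib
import timeit

packet = b"\x00" * 4096  # a 4096-byte cell/packet, as in the example

def bench(name, reps=2000):
    h = getattr(hashlib, name)
    # average seconds per packet hashed
    return timeit.timeit(lambda: h(packet).digest(), number=reps) / reps

# Relative speed depends heavily on whether the CPU is 32- or 64-bit;
# on 64-bit machines sha512 is often faster per byte than sha256.
for name in ("sha256", "sha512"):
    print(name, "%.1f usec/packet" % (bench(name) * 1e6))
```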
Regards,
Zooko
[1] http://bench.cr.yp.to/results-hash.html [2] http://www.pcworld.com/article/242946/calxedas_chip_boosts_arms_server_fight...
Zooko forwarded a hash question over to the SHA-3 competition mailing list, and mentioned the discussion that has been going on here. He's going to forward over comments that I made and John Kelsey made. Nonetheless, I'd like to offer some comments on what I've read in a larger context.
I don't see any discussions of requirements, goals, or anything like that. I see discussions of what hash function to use, ECC, symmetric modes, and so on, and yet no reasons why you would do anything. If you don't have requirements, you aren't going to get a good answer. Any of those proposed solutions needs to have an answer to the question "What is this trying to solve?" at the very least. The next question should be, "How does it solve it?"
Protocols are almost never broken because of the crypto. That's not the problem. Tor as a protocol and as a *service* has many things it needs to worry about long before you should worry about the crypto. Don't lose sight of the real issues. Trust me, they aren't the crypto. You need to stick to well-understood ciphers. Do not be frightened by warts on a function. The property of being well-understood means that it will have warts.
Additionally, you should stick to well-recognized constructions, too. The main reason is that warts and all, we know most about these things. The alternatives frequently either have different warts, or have unknown warts that will bite you in the ass when they're finally found. Moreover, Tor is an important protocol, service, and system. If you use a weird-ass construction, then a flaw in that will be publicized as a flaw in Tor. If someone finds a non-important problem in AES or SHA256, you'll be shielded by ubiquity. If you use the Whizzy2^n-1 whatever, then when a problem is found in it, that will hit the press as "FLAW FOUND IN TOR! EVERYONE RUN FOR THEIR LIVES!" and you'll be left defending it yourself. The crypto elders will also cluck their tongues and say, "Yeah, they really should have used what NIST documented in SP 800-9999." Lastly on that, Tor is so important that you'll be a target. People *will* target Tor because a teeny, tiny, unimportant break in Tor is a guaranteed paper acceptance in Usenix Security or Eurocrypt. Anonymity loves crowds; so do cryptographers.
Here are some specifics:
* Hash functions. Zooko will forward my full response, the summary of which is to use SHA-2 in whatever length or form. Should you not want to do that, why? What are you after? Speed? How much speed? Again, I don't see what you're trying to solve, but I'll recommend using Skein if you don't like SHA-2. I'm a co-designer of it, for full disclosure.
Skein is secure, fast (under 7 clocks per byte on x64, and at 15.5 on armv7), and has lots of Swiss Army Knife features. You can use it as a KDF, PRNG, and so on. It has a one-pass MAC that has a proof of security on it, so it is a valuable alternative to HMAC. ZRTP is using Skein's MAC to speed up VOIP encryption as an alternative to HMAC (more full disclosure, I'm a co-author of ZRTP, too, but Phil Z had to talk me into it, and he had real performance goals to meet).
However, the safe, sane thing to do is use SHA-256.
* ECC. I don't see a reason why you're looking at it; my remark isn't code for thinking it's a bad idea, it means I don't see *reasons*, and yeah, yeah, I keep harping on that.
ECC can be faster than RSA. But sometimes it isn't. I have seen RSA 1280 running at 10x the speed of P-256. Note that P-256 has the security properties of RSA 3072, so it's not comparing apples to apples. But I'll bet that Tor could run for the next five years on RSA 1280 or RSA 1536, and punt the problem into the future. It is my gut feel that if you migrate to 256 bit ECC from RSA 1024, you will see a performance drop. Nonetheless, you need to move from RSA 1024 to *something*. I am just noting that from a performance standpoint, you're not at the place where ECC is clearly better yet. Yet.
Now on the other hand, ECC is the wave of the future and why not bite the bullet now? The patent issues are still thorny and that's another reason to punt the decision into the future. The major ECC patents started expiring this last summer, and if you delay the decision for three to five years, it'll be pretty much a non-issue. On the other, other hand, I don't think the Tor project has a lot to worry about. It is unlikely that RIM/Certicom will come after Tor. Heck, they are as likely to donate a license to the project as anything else. The IP issues shouldn't be much of a concern, but if they are -- wait. I see no requirements that say why you need ECC, and as I've said, you might find that ECC is slower than RSA 1280 or 1536.
Should you go to ECC, use P-256; do not use some variant of 25519. You want to be in a crowd. You want to use something that lots of smart people have looked at. That means P-256. For *any* counter argument, please supply the requirement or goal that makes P-256 the inferior choice, preferably with a metric. Absent such, P-256 is your baby.
* Stream ciphers. For a stream cipher, you should be doing AES-CTR. Really. Stop worrying about the warts. There are hardware implementations with AES-NI, or attached hardware that don't have problems. ARMv8 has AES in hardware along with SHA256, which is another reason to stick to the well-understood. AES-NI on an x64 processor runs at under 1 clock per byte. That's all goodness.
If, for some reason, you really, really want to go off on a limb, then look seriously at counter mode with either Twofish or Serpent, as they're likely the next best-understood algorithms there are. However, there's not a lot of glory in breaking them, so they haven't seen nearly as much work. Ditto for any eStream cipher. If you must be trendy, then heck, please look at Threefish, which runs twice as fast as AES and is inherently tweakable (note again, that I'm a co-designer). But really, just use AES-CTR.
GCM is an alternative, but I'd use CTR and an HMAC, myself, especially since you don't seem to have performance goals that are driving it. As others have noted, it's really tetchy to use correctly, and there are limits on the security it gives you. There is a 128-bit hardware multiply in the same Intel processors that have AES-NI to speed it up, but I'd stick to the tried-and-true HMAC, absent reasons with metrics.
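The CTR-plus-HMAC composition described here is encrypt-then-MAC. A minimal Python sketch of the pattern follows; to stay self-contained it builds the counter-mode keystream from HMAC-SHA256 as the PRF instead of AES, so it illustrates the composition, not any production cipher. Note the separate encryption and MAC keys, the nonce bound into the tag, and the constant-time tag comparison.

```python
import hashlib
import hmac

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Counter-mode keystream using HMAC-SHA256 as the PRF -- a
    # stand-in for AES-CTR so this sketch needs only the stdlib.
    out = b""
    ctr = 0
    while len(out) < length:
        out += hmac.new(key, nonce + ctr.to_bytes(8, "big"), hashlib.sha256).digest()
        ctr += 1
    return out[:length]

def seal(enc_key: bytes, mac_key: bytes, nonce: bytes, plaintext: bytes):
    ct = bytes(a ^ b for a, b in zip(plaintext, keystream(enc_key, nonce, len(plaintext))))
    # encrypt-then-MAC: authenticate nonce and ciphertext together
    tag = hmac.new(mac_key, nonce + ct, hashlib.sha256).digest()
    return ct, tag

def open_(enc_key: bytes, mac_key: bytes, nonce: bytes, ct: bytes, tag: bytes) -> bytes:
    expected = hmac.new(mac_key, nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time compare
        raise ValueError("bad tag")
    return bytes(a ^ b for a, b in zip(ct, keystream(enc_key, nonce, len(ct))))
```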
Whatever you do, please do not do something as gosh-darned daft as XOR two keystreams together. The crypto is not the problem, and every time someone tries to do something like that to strengthen the crypto it all ends up with someone's eye out.
* Random number generators. You're bikeshedding. Any of those are great things to do, and there's no reason why continuing with what you've been doing (yeah, throw in more entropy sources, it never hurts) is wrong or suboptimal. Whatever. Do what you want, but I think you have bigger fish to fry.
* Fingerprints. Also a reasonable discussion of what color the shed should be. My sole comment is that the gods punish hubris, and coding your hash function into two bits is hubris. One of the major lessons of the obsessive bit-conservation that PGP did is that you write more code and make more bugs saving those bits than the bits were worth. If you use something smaller than a byte, you will regret it.
* Other comments. In the relay crypto comments, there was something about counter "in a less scary mode" -- just use what they describe in NIST SP800-whatever.
Okay, that's it for me.
Regards, Jon
On 2011-11-03, Jon Callas joncallas@me.com wrote:
Zooko forwarded a hash question over to the SHA-3 competition mailing list, and mentioned the discussion that has been going on here. He's going to forward over comments that I made and John Kelsey made. Nonetheless, I'd like to offer some comments on what I've read in a larger context.
I don't see any discussions of requirements, goals, or anything like that. I see discussions of what hash function to use, ECC, symmetric modes, and so on, and yet no reasons why you would do anything. If you don't have requirements, you aren't going to get a good answer. Any of those proposed solutions needs to have an answer to the question "What is this trying to solve?" at the very least. The next question should be, "How does it solve it?"
For ‘What is this trying to solve?’, see:
* https://www.torproject.org/docs/documentation#DesignDoc
* https://svn.torproject.org/svn/projects/design-paper/tor-design.html
* https://svn.torproject.org/svn/projects/design-paper/challenges.pdf
* https://blog.torproject.org/blog/one-cell-enough
See also the tor-dev mailing-list threads started with these documents:
* https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/ideas/xxx-cry...
* https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/ideas/xxx-cry...
But as you may have seen, the tor-dev Cryptographic Bikeshed Design Task Force has ignored that part because it's more fun to talk endlessly about the least important details. (Based on the rest of your message, I assume they didn't even tell you about that part before inviting you into the discussion.)
Protocols are almost never broken because of the crypto. That's not the problem. Tor as a protocol and as a *service* has many things it needs to worry about long before you should worry about the crypto. Don't lose sight of the real issues. Trust me, they aren't the crypto. You need to stick to well-understood ciphers. Do not be frightened by warts on a function. The property of being well-understood means that it will have warts.
Additionally, you should stick to well-recognized constructions, too. The main reason is that warts and all, we know most about these things. The alternatives frequently either have different warts, or have unknown warts that will bite you in the ass when they're finally found. Moreover, Tor is an important protocol, service, and system. If you use a weird-ass construction, then a flaw in that will be publicized as a flaw in Tor. If someone finds a non-important problem in AES or SHA256, you'll be shielded by ubiquity. If you use the Whizzy2^n-1 whatever, then when a problem is found in it, that will hit the press as "FLAW FOUND IN TOR! EVERYONE RUN FOR THEIR LIVES!" and you'll be left defending it yourself. The crypto elders will also cluck their tongues and say, "Yeah, they really should have used what NIST documented in SP 800-9999." Lastly on that, Tor is so important that you'll be a target. People *will* target Tor because a teeny, tiny, unimportant break in Tor is a guaranteed paper acceptance in Usenix Security or Eurocrypt. Anonymity loves crowds; so do cryptographers.
Tor's use of well-recognized constructions hasn't stopped the recent wave of ‘semi-technical bullshit attacks’ and plagiarisms of old, not-particularly-effective attacks from winning the attention economy.
(One of the claims in the latest anti-Tor publicity campaign is that using AES in CTR mode is a ‘bugdoor’, and that we knew that we should have used CBC mode instead. This is even after the news that use of block ciphers in CBC mode breaks SSL's security. Yes, people *lie* in order to get publicity for an alleged attack on Tor.)
Here are some specifics:
- Hash functions. Zooko will forward my full response, the summary of which
is to use SHA-2 in whatever length or form. Should you not want to do that, why? What are you after? Speed? How much speed? Again, I don't see what you're trying to solve, but I'll recommend using Skein if you don't like SHA-2. I'm a co-designer of it, for full disclosure.
Skein is secure, fast (under 7 clocks per byte on x64, and at 15.5 on armv7), and has lots of Swiss Army Knife features. You can use it as a KDF, PRNG, and so on. It has a one-pass MAC that has a proof of security on it, so it is a valuable alternative to HMAC. ZRTP is using Skein's MAC to speed up VOIP encryption as an alternative to HMAC (more full disclosure, I'm a co-author of ZRTP, too, but Phil Z had to talk me into it, and he had real performance goals to meet).
Skein sucks on 32-bit processors, and on AMD64 processors in 32-bit mode. All of our Windows packages are 32-bit.
And I'm not a fan of using a cryptographic primitive with as many features as Skein has. Implementation complexity sucks, especially in crypto software.
However, the safe, sane thing to do is use SHA-256.
SHA-256 sucks unnecessarily on 64-bit processors. Our fast relays are 64-bit.
- ECC. I don't see a reason why you're looking at it; my remark isn't code
for thinking it's a bad idea, it means I don't see *reasons*, and yeah, yeah, I keep harping on that.
* RSA public keys are long. Every client needs to download at least one public key (the circuit-extension handshake authentication key (called an ‘onion key’ in our current design documents)) for every relay. (Clients also need enough information about each relay to open and authenticate a ‘link’ to it, but a hash of the relay's identity key suffices in our current (TLS-based) link protocols.)
* RSA private-key operations are slow. Every time a client extends a circuit to a relay, the relay must perform one private-key operation using its circuit-extension handshake key. (See also https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/ideas/xxx-nto... for one thing that we might replace our circuit-extension handshake with. We might use an EC-based signature system for circuit-extension handshake authentication instead, so that more of the relay-side scalar-multiply operations can be performed in advance.)
* We want forward secrecy for circuits, so we use Diffie-Hellman, too. For every circuit extended to a relay, the relay must perform two scalar-multiply operations. Currently, we use 320-bit exponents in a 1024-bit mod-p group published in the 1990s. ECDH using Curve25519 is likely to be faster *and* more secure.
* With ECC, we can use whole public keys instead of fingerprints. This actually matters in the new hidden service protocol, so we will use an EC-based signature system there.
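The forward-secrecy handshake in the list above is ordinary finite-field Diffie-Hellman. A toy Python sketch of the pattern, with a deliberately tiny 127-bit prime and 64-bit exponents; a real deployment uses the 1024-bit group and 320-bit exponents mentioned above, and this group choice is purely illustrative:

```python
import secrets

# Toy finite-field Diffie-Hellman. The prime here is far too small
# for real use; it only demonstrates the short-exponent pattern.
p = 2**127 - 1   # a Mersenne prime, stand-in for the 1024-bit group
g = 5

a = secrets.randbits(64) | 1   # client's ephemeral exponent
b = secrets.randbits(64) | 1   # relay's ephemeral exponent
A = pow(g, a, p)               # client sends A
B = pow(g, b, p)               # relay sends B

# both sides derive the same shared secret
assert pow(B, a, p) == pow(A, b, p)
```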
ECC can be faster than RSA. But sometimes it isn't. I have seen RSA 1280 running at 10x the speed of P-256. Note that P-256 has the security properties of RSA 3072, so it's not comparing apples to apples. But I'll bet that Tor could run for the next five years on RSA 1280 or RSA 1536, and punt the problem into the future. It is my gut feel that if you migrate to 256 bit ECC from RSA 1024, you will see a performance drop. Nonetheless, you need to move from RSA 1024 to *something*. I am just noting that from a performance standpoint, you're not at the place where ECC is clearly better yet. Yet.
We are. (Yes, I know no one told you that.)
Now on the other hand, ECC is the wave of the future and why not bite the bullet now? The patent issues are still thorny and that's another reason to punt the decision into the future. The major ECC patents started expiring this last summer, and if you delay the decision for three to five years, it'll be pretty much a non-issue. On the other, other hand, I don't think the Tor project has a lot to worry about. It is unlikely that RIM/Certicom will come after Tor. Heck, they are as likely to donate a license to the project as anything else. The IP issues shouldn't be much of a concern, but if they are -- wait. I see no requirements that say why you need ECC, and as I've said, you might find that ECC is slower than RSA 1280 or 1536.
See above for the requirements no one told you about.
Should you go to ECC, use P-256; do not use some variant of 25519. You want to be in a crowd. You want to use something that lots of smart people have looked at. That means P-256. For *any* counter argument, please supply the requirement or goal that makes P-256 the inferior choice, preferably with a metric. Absent such, P-256 is your baby.
In the proposed ‘ntor’ handshake, the relay performs a scalar-multiply operation on a user-specified group element, using a long-term secret key. Curve25519 has a secure ‘twist’, so the relay can skip the group-membership test which might be required with other EC groups. Because of that (potentially significant) performance advantage, we have put some effort into improving at least one Curve25519 implementation. (I should note here that relays do become overloaded with circuit-extension crypto in the deployed Tor network, so we really do need to care about performance.)
I have also seen parameters for an Edwards curve equivalent to Curve25519; we will need the Edwards-curve parameters in order to implement point addition efficiently in constant time for our EC signature scheme. I would rather not have to fuss around with finding the/an Edwards curve corresponding to NIST P-256.
We can represent a Curve25519 public key in 51 base32 characters; we would need one additional character for a base32-encoded P-256 public key (assuming point reduction), and we wouldn't gain much (if any) security.
Any one of these reasons should be sufficient to outweigh ‘NSA told us to use X’.
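One way to check the key-length arithmetic above: unpadded base32 carries 5 bits per character, a Curve25519 public key fits in 255 bits, and a compressed P-256 point needs a 256-bit x coordinate plus one sign bit.

```python
import math

def base32_len(bits: int) -> int:
    # Unpadded base32 packs 5 bits into each character.
    return math.ceil(bits / 5)

# Curve25519 public key: x coordinate < 2^255
assert base32_len(255) == 51
# Compressed P-256 point: 256-bit x coordinate plus a sign bit
assert base32_len(257) == 52
```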
- Stream ciphers. For a stream cipher, you should be doing AES-CTR.
Really. Stop worrying about the warts. There are hardware implementations with AES-NI, or attached hardware that don't have problems. ARMv8 has AES in hardware along with SHA256, which is another reason to stick to the well-understood. AES-NI on an x64 processor runs at under 1 clock per byte. That's all goodness.
I don't have one of those. Why should I have to use a cipher which no one has implemented securely *and* efficiently on my computer? Why do you want to continue to inflict side-channel leaks on our users in oppressed countries who have even older computers?
If, for some reason, you really, really want to go off on a limb, then look seriously at counter mode with either Twofish or Serpent, as they're likely the next best-understood algorithms there are. However, there's not a lot of glory in breaking them, so they haven't seen nearly as much work. Ditto for any eStream cipher. If you must be trendy, then heck, please look at Threefish, which runs twice as fast as AES and is inherently tweakable (note again, that I'm a co-designer). But really, just use AES-CTR.
Twofish is designed to be implemented using S-boxes, just like AES. Using Twofish gives us all of the side-channel attack risk and no speed benefit.
Serpent is ridiculously slow.
Salsa20 looks like a Good Thing. Salsa20/8 would be even better.
In general, if we don't have to write a paragraph or two explaining why the published attacks on an algorithm used in our relay crypto are bogus or irrelevant to Tor, we're using too many rounds of it. The relay crypto only needs to make cryptographic attacks sufficiently harder than the end-to-end timing correlation attack that no one will ever benefit by implementing an attack on our crypto. Anything we are willing to use will far exceed that goal.
GCM is an alternative, but I'd use CTR and an HMAC, myself, especially since you don't seem to have performance goals that are driving it. As others have noted, it's really tetchy to use correctly, and there are limits on the security it gives you. There is a 128-bit hardware multiply in the same Intel processors that have AES-NI to speed it up, but I'd stick to the tried-and-true HMAC, absent reasons with metrics.
Whatever you do, please do not do something as gosh-darned daft as XOR two keystreams together. The crypto is not the problem, and every time someone tries to do something like that to strengthen the crypto it all ends up with someone's eye out.
- Random number generators. You're bikeshedding. Any of those are great
things to do, and there's no reason why continuing with what you've been doing (yeah, throw in more entropy sources, it never hurts) is wrong or suboptimal. Whatever. Do what you want, but I think you have bigger fish to fry.
- Fingerprints. Also a reasonable discussion of what color the shed should
be. My sole comment is that the gods punish hubris, and coding your hash function into two bits is hubris. One of the major lessons of the obsessive bit-conservation that PGP did is that you write more code and make more bugs saving those bits than the bits were worth. If you use something smaller than a byte, you will regret it.
I agree here. We currently use a leading "$" to distinguish relay identity-key fingerprints from relay nicknames; we could distinguish new-style fingerprints by placing an algorithm identifier before the "$".
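A sketch of how such tagged fingerprints might be parsed. The "alg$HEX" syntax and the "sha1" default below are hypothetical, for illustration only, not Tor's actual format.

```python
def parse_fingerprint(s: str):
    # Hypothetical format: "<alg>$<hex>", e.g. "sha256$AB12...".
    # A bare "$<hex>" is treated as an old-style (SHA-1) fingerprint.
    if "$" not in s:
        raise ValueError("not a fingerprint")
    alg, _, hexpart = s.rpartition("$")
    return (alg or "sha1", bytes.fromhex(hexpart))
```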
- Other comments. In the relay crypto comments, there was something about
counter "in a less scary mode" -- just use what they describe in NIST SP800-whatever.
This is ‘less scary’ from the point of view of resisting side-channel attacks. NSA was warned that array lookups with secret indices would allow side-channel attacks. NSA told us to not worry about the side-channel attacks. Now we know that we should have been worrying about side-channel attacks all along, instead of just using the cipher NSA told us to.
Robert Ransom
On Fri, Nov 04, 2011 at 01:01:09PM +0000, Robert Ransom wrote:
I have also seen parameters for an Edwards curve equivalent to Curve25519; we will need the Edwards-curve parameters in order to implement point addition efficiently in constant time for our EC signature scheme.
Hmm? curve25519 _is_ an Edwards curve (that's why it has that slightly annoying non-prime order), and djb's implementation, at least, is supposed to be constant-time.
- Ian
On Fri, Nov 4, 2011 at 9:24 AM, Ian Goldberg iang@cs.uwaterloo.ca wrote:
On Fri, Nov 04, 2011 at 01:01:09PM +0000, Robert Ransom wrote:
I have also seen parameters for an Edwards curve equivalent to Curve25519; we will need the Edwards-curve parameters in order to implement point addition efficiently in constant time for our EC signature scheme.
Hmm? curve25519 _is_ an Edwards curve (that's why it has that slightly annoying non-prime order), and djb's implementation, at least, is supposed to be constant-time.
Dear all,
curve25519 is birationally equivalent to ed25519. Point addition isn't defined for curve25519 public keys because they are only the x coordinate and so do not encode the sign. This is to take advantage of the special Montgomery form y^2 = x^3 + a_2x^2 + x. Until recently this was the fastest scalar multiplication available, at the cost of making addition impossible.
ed25519 supports point addition, and point compression without patents, because Edwards curves have never been discussed in Certicom patents. ed25519 is also faster than curve25519, due to new algorithms. DJB has indicated that in the future he will convert curve25519 into Edwards form for calculation. But signing requires ed25519, because addition is not defined on packed curve25519 keys.
P-256 sadly does not support point compression without infringing on patents. So keys will have to be 64 bytes long.
Edwards curves always exist over the algebraic closure. The problem is that an Edwards form over the base field only exists when the curve order is divisible by 4. Twisted Edwards curves have points of order 2. P-256 could only be put into Edwards form over extension fields, and extension fields are slow.
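The standard birational equivalence behind this discussion maps a Montgomery curve B*v^2 = u^3 + A*u^2 + u to a twisted Edwards curve a*x^2 + y^2 = 1 + d*x^2*y^2 with a = (A+2)/B, d = (A-2)/B, via x = u/v, y = (u-1)/(u+1). It can be checked numerically over a toy field; the prime and coefficient below are illustrative, not Curve25519's parameters.

```python
p = 1009   # small toy prime field, not a cryptographic one
A = 6      # Montgomery coefficient, with B = 1
a, d = (A + 2) % p, (A - 2) % p   # twisted Edwards coefficients

def inv(x):
    return pow(x, p - 2, p)   # inverse mod a prime via Fermat

# Find a Montgomery point (u, v) with v^2 = u^3 + A*u^2 + u,
# avoiding the exceptional points v = 0 and u = -1.
u, v = next((u, v) for u in range(2, p) for v in range(1, p)
            if (v * v - (u**3 + A * u * u + u)) % p == 0 and (u + 1) % p != 0)

# Birational map to twisted Edwards coordinates.
x = u * inv(v) % p
y = (u - 1) * inv(u + 1) % p

# The image satisfies the twisted Edwards equation.
assert (a * x * x + y * y) % p == (1 + d * x * x * y * y) % p
```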
If we go with curve25519 we should not implement it ourselves. DJB has written an implementation that is quite nice to use in the form of NaCl. Signing is implemented along with batch signature verification (not in NaCl yet, but written). NaCl is also a lot nicer to use than OpenSSL, and is very fast (and always selects the fastest implementation available).
Sincerely, Watson Ladd
On 11/04/2011 08:01 AM, Robert Ransom wrote:
On 2011-11-03, Jon Callasjoncallas@me.com wrote:
However, the safe, sane thing to do is use SHA-256.
SHA-256 sucks unnecessarily on 64-bit processors. Our fast relays are 64-bit.
It may be worth mentioning the newly-standardized SHA-512/256 here. This is not a new function; it's "SHA-2". I.e., it's SHA-512 with a unique IV and output truncated to 256 (or 224) bits.
http://csrc.nist.gov/publications/drafts/fips180-4/FRN_Draft-FIPS180-4.pdf
SHA-512 is based on 64 bit integer operations and seems to run a bit faster than SHA-256 on 64 bit processors. It looks quite competitive with even the SHA-3 candidates and no less conservative for security.
Of course, whether or not it's better to be faster on 32-bit CPUs or 64-bit CPUs is another interesting discussion. Given the complex cache and bus organization on modern chips, my guess is that a design decision like CELL_LEN=512 is likely to have as much of an effect on overall throughput as a difference of a half-dozen clocks per byte in the hash function.
- Marsh
Hi, Robert! Hi, Jon!
As usual, please take me not as being "That fellow who is a pompous ass and says things that aren't true" but rather as "that fellow who knows that he is probably wrong about some stuff, and doesn't know a better way to find out what he's wrong about than getting corrected."
IOW, this is what I think now, but if it's not reasonable, I want to think something else.

On Fri, Nov 4, 2011 at 9:01 AM, Robert Ransom rransom.8774@gmail.com wrote:
On 2011-11-03, Jon Callas joncallas@me.com wrote:
Zooko forwarded a hash question over to the SHA-3 competition mailing list, and mentioned the discussion that has been going on here. He's going to forward over comments that I made and John Kelsey made. Nonetheless, I'd like to offer some comments on what I've read in a larger context.
I don't see any discussions of requirements, goals, or anything like that. I see discussions of what hash function to use, ECC, symmetric modes, and so on, and yet no reasons why you would do anything. If you don't have requirements, you aren't going to get a good answer. Any of those proposed solutions needs to have an answer to the question "What is this trying to solve?" at the very least. The next question should be, "How does it solve it?"
For ‘What is this trying to solve?’, see:
On this last point about cell-tagging attacks, see also https://lists.torproject.org/pipermail/tor-dev/2011-November/003024.html for my current thinking.
See also the tor-dev mailing-list threads started with these documents:
Right; and keep in mind that the requirements themselves are also in an evolving draft status.
I think that one of the main problems with the draft I circulated the other week is that it tried to start discussion on three questions:
* Should we keep using our current primitive operations (SHA1, AES-CTR starting at IV 0, 1024-bit DH, 1024-bit RSA, OpenSSL PRNG) or switch to something else?
* Regardless of what primitive operations we're using, should we keep using them in the same ways in our protocol, or are there protocol improvements we could reasonably make? (That's the main point of Section 6 for relay crypto, and of the Goldberg-Stebila-Ustaoglu "ntor" design for circuit extension.)
* Assuming that we pick any primitives and protocols other than the ones we currently have, how can we best migrate?
It should probably have been predictable that the first one would have gotten all the attention. Still, I'm not going to call "bike shed" too loudly here, since it helps people get involved in design, *and* the end result is actually important.
Let me try to motivate this and explain goals a bit better. We want to choose our crypto primitives and protocols for a few properties. Some of the more justifiable ones from an engineering perspective are:
[Please treat the following as a draft and tentative set of goals, motivations, and requirements. If you think any of these are stupid and wrongheaded, or just unmotivated, please tell me so!]
A) They should be strong enough that attacking the crypto primitives and protocols is not the best attack on Tor. (We have this property today, I believe, and will continue to have it so long as nobody in the research community comes up with a solution to the low-latency traffic correlation attack.)
B) Further, crypto attacks should not be even _close_ to the best attacks on Tor. (Except for the cell tagging attack, I think we have this property today. With cell tagging, however, the attack is IMO only a little worse than the end-to-end timing attacks that a smart attacker would be doing instead. Moreover, to the extent that it's feasible, I'd like to reach a situation where every crypto- or protocol-based attack on Tor is (and remains!) many orders of magnitude more expensive than the traffic correlation attacks that the research community doesn't know how to solve. Ideally, I'd like every crypto- or protocol-based attack to be hideously expensive on an absolute scale, if not downright impossible.)
C) We need good performance. Our original design goal was "commodity hardware should be able to saturate commodity bandwidth." Right now, though, some of the busier servers do seem to be getting CPU-bound. (If I'm remembering what I've seen from profiles, the time sink seems to be to some extent AES and SHA1, and in larger part exponentiation for RSA and DH. But I'd want to double-check that to be sure. See comments below on performance requirements.) We also need our cell-crypto protocol to be space-efficient: bandwidth is at a premium on the Tor network.
D) We'd like a large security margin in our protocol designs. For example, if we have a choice between two protocols, where one of them requires that SHA256 be collision-resistant while the other one only requires preimage-resistance, it would be nicer (ceteris paribus!) to use the second. Or if we can use our cipher in two modes, one of which requires that it resist adaptive chosen plaintext attacks, and the other one only requires that it resist known-plaintext attacks, the second one would seem (again, all other things being equal) better.
E) Because our crypto aims to provide forward secrecy, the parts that do so need to have a higher life expectancy than the rest of the system. For example, it's no big deal if somebody in the future manages to forge signatures from a key that's no longer valid: that doesn't open up new attacks. But if they break a stream cipher or a DH-based key agreement scheme that we are no longer using, that would be poor. That argues (I think) for higher security margins in our key agreement, key derivation, and encryption than in signatures and integrity.
F) We ought to look for DOS opportunities and resist or mitigate them as possible. (This isn't limited to crypto stuff, but it does show up here a bit.)
And here are some implementation concerns:
G) As much as possible, we shouldn't have to be the primary maintainers of our crypto primitives.
H) We need portable implementations of our primitives.
I) We need high-quality implementations. To me, right now, this suggests that I'd like categorical rather than empirical arguments for non-viability of timing attacks. (IOW, I would much rather say "This implementation cannot have a timing side channel" than "Any timing side-channel that this implementation might have is probably tiny, infrequent, and hard to exploit.")
And here are some more fluffy concerns:
J) Nothing should be embarrassing, even if it's safe. It undermines confidence and provokes uncertainty when there turns out to be a weakness in a primitive or protocol... even if, as in D above, we give ourselves enough margin to tolerate unforeseen weaknesses in our protocols or primitives, and even if, as in B above, we ensure that crypto attacks remain a deeply stupid way to attack Tor.
K) It would be nice not to be the biggest or most prominent user of anything if we can possibly help it. This ensures that a wider variety of cryptographers are looking at it,
[...]
Protocols are almost never broken because of the crypto. That's not the problem. Tor as a protocol and as a *service* has many things it needs to worry about long before you should worry about the crypto. Don't lose sight of the real issues. Trust me, they aren't the crypto.
Believe me, we know. Just because this is the Tor development topic that you're seeing first does not mean that it is the only one, or even the highest-priority one on my plate. :)
You need to stick to well-understood ciphers. Do not be frightened by warts on a function. The property of being well-understood means that it will have warts.
I think I either agree or disagree with you here depending on what you mean by 'warts'. I agree with you when it means stuff that's undesirable but easy to work around (like the length-extension properties of Merkle-Damgård hashes), or that is undesirable but impossible to exploit (like a 2^126.5 attack on a 128-bit system).
But I am less sanguine about stuff that weakens a primitive to a degree that I don't understand. For example, consider the AES256 key schedule problems, which as I understand them are interesting but as yet unexploitable, since (again, if I understand right) you need a ridiculous number of plaintexts encrypted with related keys, and any decent KDF will ensure that you don't have such related keys. But these attacks do seem to get better every year, and they make me nervous about AES256. (We're using AES128 these days, as noted elsewhere.)
You're right that it's a major cognitive error to think that a system with fewer known problems is necessarily better than one with more known problems, unless you know how much attention both systems have received.
Additionally, you should stick to well-recognized constructions, too. The main reason is that warts and all, we know most about these things. The alternatives frequently either have different warts, or have unknown warts that will bite you in the ass when they're finally found.
Agreed; we shouldn't be rolling our own MACs, for example.
This is an area BTW where our existing protocols are not so great: there are some places in the protocol where we use H(x||y) when HMAC would be a better choice. Our ersatz KDF is as-yet-unbroken but kinda idiosyncratic. [ H(secret || [0]) || H(secret || [1]) || ... ]. We made some roll-our-own choices when we were getting started, and I'd like to move towards standard and well-analyzed stuff instead.
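The idiosyncratic counter-based construction quoted above can be sketched in a few lines of Python. This is an illustration only, assuming H is SHA-1 (as in Tor's legacy KDF); the function name and output length are made up for the example:

```python
import hashlib

def kdf_tor(secret: bytes, length: int) -> bytes:
    """Expand `secret` into `length` bytes of key material using the
    counter-based construction H(secret||[0]) || H(secret||[1]) || ...
    SHA-1 stands in for H here, matching the legacy KDF described above."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha1(secret + bytes([counter])).digest()
        counter += 1
    return out[:length]

# e.g. derive 72 bytes of key material from a circuit's shared secret
key_material = kdf_tor(b"shared secret", 72)
```

The point of moving to a standard KDF (such as an HMAC-based extract-and-expand design) is that the standard construction has a security proof and years of analysis, whereas this one merely happens to be unbroken.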
Also, this is an area where we don't have off-the-shelf solutions for everything. For better or worse, there are not a good set of highly analyzed protocols for some of the things we need to build a working onion routing network these days. Most notably, the circuit establishment protocol and the cell relay protocol are ones where we had to invent stuff as we began, since there didn't seem to be a preexisting solution for either. In both cases, we wound up with something kind of inefficient with some well-known but not-so-bad issues.
And in both cases we wound up with protocols where I believe we could do better today, given care and attention.
Moreover, Tor is an important protocol, service, and system. If you use a weird-ass construction, then a flaw in that will be publicized as a flaw in Tor. If someone finds a non-important problem in AES or SHA256, you'll be shielded by ubiquity. If you use the Whizzy2^n-1 whatever, then when a problem is found in it, that will hit the press as "FLAW FOUND IN TOR! EVERYONE RUN FOR THEIR LIVES!" and you'll be left defending it yourself. The crypto elders will also cluck their tongues and say, "Yeah, they really should have used what NIST documented in SP 800-9999." Lastly on that, Tor is so important that you'll be a target. People *will* target Tor because a teenie, tiny, unimportant break in Tor is a guaranteed paper acceptance in Usenix Security or Eurocrypt. Anonymity loves crowds; so do cryptographers.
And how!
Tor's use of well-recognized constructions hasn't stopped the recent wave of ‘semi-technical bullshit attacks’ and plagiarisms of old, not-particularly-effective attacks from winning the attention economy.
(One of the claims in the latest anti-Tor publicity campaign is that using AES in CTR mode is a ‘bugdoor’, and that we knew that we should have used CBC mode instead. This is even after the news that use of block ciphers in CBC mode breaks SSL's security. Yes, people *lie* in order to get publicity for an alleged attack on Tor.)
I don't think that's a lie -- I think that's a combination of a probably-independent rediscovery of the cell-tagging attack with an inadequate understanding of Tor's threat model and the problems with CBC for our use.[*] Never attribute to malice etc.
(And never attribute to plagiarism what can be explained by an unfamiliarity with parts of the literature, or a buried-but-still-present credit. The more serious the charge, the more caution and forbearance is warranted, especially when people are already ticked off with one another.)
[*] The problems with "let's use CBC" in summary: If you do CBC wrong and don't do a new IV per cell, you're exposing yourself to something like the Rizzo-Duong attack on SSL. If you do CBC right, you need one IV per layer per cell, which is a pretty high overhead for Tor. CBC doesn't get particularly good tagging resistance: it just lowers the number of bits you can tag and your ability to recover the stream afterwards. So if you're willing to burn one block per hop per cell, you're better off using those bytes for a real MAC or GCM or something than for a CBC IV. [And if you are hoping to avoid transmitting IVs by having each cell get a fresh set of IV derived from a keyed PRNG or something, you aren't really doing orthodox CBC any more.]
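The "pretty high overhead" claim is easy to quantify with back-of-the-envelope arithmetic. The numbers below assume 512-byte Tor cells, a typical three-hop circuit, and one 16-byte AES-CBC IV per layer per cell:

```python
CELL_SIZE = 512   # a Tor cell is 512 bytes on the wire
HOPS = 3          # typical circuit length
IV_SIZE = 16      # one AES-CBC IV per layer per cell

iv_overhead = HOPS * IV_SIZE
overhead_pct = 100 * iv_overhead / CELL_SIZE
print(f"{iv_overhead} bytes of IVs per cell ({overhead_pct:.1f}% of each cell)")
```

That is roughly 9% of every cell spent on IVs alone, which is why spending those bytes on a real MAC instead looks like the better trade.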
Here are some specifics:
- Hash functions.
[...]
Skipping this part for now. FWIW, I like Skein myself, and actually like its "pile of features" approach. (To my mind, if you don't provide a construction in a hash function for doing X (where X is personalization, MAC, PRNG, or whatever), people will go and do it badly. I know I have!) But you're right that we ought to write up speed requirements first.
I think that for all our speed discussions, we ought to stop, instrument some busy servers, and look at how many cell encryptions, circuit extension handshakes, connection handshakes, and miscellaneous "other" crypto operations they actually spend their time on. Then we can talk categorically about which parts of the system matter how much for performance.
We do need to be reasonably good on weak clients, but it's more important to be fast on servers IMO.
[...]
- ECC. I don't see a reason why you're looking at it; my remark isn't code for thinking it's a bad idea, it means I don't see *reasons*, and yeah, yeah, I keep harping on that.
For me, the big reason is this one:
- We want forward secrecy for circuits, so we use Diffie-Hellman, too.
For every circuit extended to a relay, the relay must perform two modular exponentiations. Currently, we use 320-bit exponents in a 1024-bit mod-p group published in the 1990s.
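The two exponentiations per extension can be sketched with toy parameters. The prime below is a tiny 61-bit Mersenne prime chosen purely to make the structure visible; a real deployment uses a 1024-bit (or larger) group:

```python
import secrets

# Toy parameters only -- NOT secure. Real DH uses a 1024-bit+ safe prime.
p = 2305843009213693951   # Mersenne prime 2^61 - 1
g = 3

def dh_keypair():
    x = secrets.randbelow(p - 2) + 1   # private exponent
    return x, pow(g, x, p)             # relay exponentiation #1: g^x mod p

# One circuit extension, from the relay's point of view:
client_priv, client_pub = dh_keypair()
relay_priv, relay_pub = dh_keypair()
shared = pow(client_pub, relay_priv, p)   # relay exponentiation #2

# Both sides derive the same shared secret g^(xy) mod p.
assert shared == pow(relay_pub, client_priv, p)
```

With ECDH the same two-operation structure holds, but each operation is a scalar multiplication on a much smaller curve, which is where the speed and key-size advantages come from.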
Because of reasons C (gotta be fast), D (gotta have security margins), and E (forward secrecy needs the biggest margins) from above, I think our DH operations are the ones where we should be most conservative about key size, and that plus performance requirements suggests a move to ECDH to me. Whether it's next year, in two years, or in three years is going to depend mostly on what the lawyers say.
[...]
Should you go to ECC, use P-256; do not use some variant of 25519. You want to be in a crowd. You want to use something that lots of smart people have looked at. That means P-256. For *any* counter argument, please supply the requirement or goal that makes P-256 the inferior choice, preferably with a metric. Absent such, P-256 is your baby.
Possible! I want to talk to a bunch of ECC people here.
Robert: from your list of properties, I think some of them are desirable, but not flatly compelling on their own. Nothing in ntor will _fail_ if we need to do group membership tests. Smaller keys and point representations are nice-to-have, but not so critical on their own.
The overall speed advantage is what has been most attractive to me, but before we can decide whether it's compelling, we need more analysis and data.
I think we can reach the best conclusions here by avoiding getting too married to any particular solution.
[...]
- Stream ciphers. For a stream cipher, you should be doing AES-CTR. Really. Stop worrying about the warts. There are hardware implementations with AES-NI, or attached hardware, that don't have these problems. ARMv8 has AES in hardware along with SHA-256, which is another reason to stick to the well-understood. AES-NI on an x64 processor runs at under 1 clock per byte. That's all goodness.
I don't have one of those. Why should I have to use a cipher which no one has implemented securely *and* efficiently on my computer? Why do you want to continue to inflict side-channel leaks on our users in oppressed countries who have even older computers?
I doubt Jon wants to inflict anything on anybody.
At their best, hardware-based AES implementations can be pretty good these days. We do however need to see whether we can find better software-based implementations than openssl's, which is not making me thrilled at the moment. (Its solutions for avoiding cache-based timing attacks seem kludgier than I would like, if I am understanding them correctly.)
For software AES, some of what I've been reading about bit-sliced implementations of AES is pretty impressive. Assuming that we use as much of any given counter stream as I believe we typically do, we can easily tolerate the requirement of having to encrypt 1-4k of stream material in one go.
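The reason bit-sliced implementations fit this usage pattern is that CTR mode generates keystream blocks independently, so many blocks can be computed in one batch and XORed against a whole cell's worth of data at once. A minimal sketch of that structure, with SHA-256 standing in for the block cipher purely for illustration (a real implementation would use AES):

```python
import hashlib

def keystream(key: bytes, start_block: int, nblocks: int) -> bytes:
    """Generate CTR-mode keystream: one pseudorandom block per counter
    value. SHA-256(key || counter) is a stand-in PRF for illustration;
    real CTR mode computes AES_k(counter)."""
    return b"".join(
        hashlib.sha256(key + (start_block + i).to_bytes(16, "big")).digest()
        for i in range(nblocks)
    )

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

key = b"\x00" * 16
plaintext = b"relay cell payload" * 100          # ~1.8 KB, handled in one go
nblocks = -(-len(plaintext) // 32)               # ceil-divide by block size
ks = keystream(key, 0, nblocks)                  # whole batch at once
ciphertext = xor(plaintext, ks)
```

Because encryption and decryption are the same XOR, a bit-sliced cipher that amortizes its transposition cost over a batch of blocks pays that cost once per 1-4k chunk rather than once per block.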
Of course, there could be issues here that I don't at all know about! More knowledge and analysis is needed.
[...]
In general, if we don't have to write a paragraph or two explaining why the published attacks on an algorithm used in our relay crypto are bogus or irrelevant to Tor, we're using too many rounds of it. The relay crypto only needs to make cryptographic attacks sufficiently harder than the end-to-end timing correlation attack that no one will ever benefit by implementing an attack on our crypto. Anything we are willing to use will far exceed that goal.
I think some of the goals I list above suggest that we want more of a security margin than you suggest here. In particular, consider that the security of our cipher affects our forward secrecy. (Not that you're necessarily wrong, but it is not 100% clear-cut to me.)
[...]
- Other comments. In the relay crypto comments, there was something about counter "in a less scary mode" -- just use what they describe in NIST SP800-whatever.
Guessing that you mean SP800-38A appendix B? There's more than one thing that looks like counter mode or a variation thereof in the SP800-... series.
Assuming we're still on counter mode, yeah, what you said. Right now we use counter mode with a fixed initial IV, which isn't *bad* necessarily, but random-IV would be better.
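For the record, the danger with a fixed IV is not the fixed value itself but keystream reuse: in Tor each circuit gets fresh keys, so IV 0 is not automatically a bug, but if a (key, IV) pair is ever repeated, the XOR of two ciphertexts leaks the XOR of the plaintexts. A toy demonstration, again with SHA-256 as a stand-in block cipher for illustration:

```python
import hashlib

def ctr_keystream(key: bytes, iv: int, length: int) -> bytes:
    """CTR keystream with SHA-256(key || counter) standing in for AES."""
    out, block = b"", iv
    while len(out) < length:
        out += hashlib.sha256(key + block.to_bytes(16, "big")).digest()
        block += 1
    return out[:length]

def encrypt(key: bytes, iv: int, msg: bytes) -> bytes:
    return bytes(m ^ k for m, k in zip(msg, ctr_keystream(key, iv, len(msg))))

key = b"\x01" * 16
p1, p2 = b"attack at dawn!!", b"retreat at noon!"
c1 = encrypt(key, 0, p1)   # same key, same IV...
c2 = encrypt(key, 0, p2)   # ...so the same keystream is reused

# The keystream cancels: ciphertext XOR equals plaintext XOR.
leak = bytes(a ^ b for a, b in zip(c1, c2))
```

This is why random (or otherwise never-repeating) IVs buy extra safety margin even when key management already prevents reuse in principle.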
Ugh, what a long mail! I hope we are closer to asking the right questions soon.
cheers,
Here is the letter I wrote to the SHA-3 mailing list, followed by replies from Jon Callas and John Kelsey.
------- From: Zooko O'Whielacronx

Folks:
You might be interested in this discussion on the tor-dev mailing list about a new crypto protocol for Tor:
https://lists.torproject.org/pipermail/tor-dev/2011-November/date.html
One of the issues brought up by Tor developer Nick Mathewson was whether to switch from SHA-1 to SHA-2 or instead wait for SHA-3. I argued that SHA-2 is better than SHA-3 (at least for a few years).
I feel slightly traitorous saying it, because I've enjoyed reading this mailing list for years, and many of the SHA-3 candidates are beautiful constructions, and I want them to be widely studied and used, but there you have it: I can't think of good arguments for the Tor developers to jump to SHA-3. Perhaps you can!
Regards,
Zooko -------
------- From: Jon Callas
Why do you feel traitorous saying it? In my opinion it's the right decision (and frankly, having read the notes, some of the others are sub-optimal), and I'm a co-author of a finalist.
Engineering is the art of applying science within constraints. Those constraints include but are not limited to the laws of physics, budgets, schedules, performance, and so on. The worst decision is no decision.
If you are unhappy with SHA-1 for security reasons, then your first choice should be using SHA-256. Your second choice should be SHA-512, possibly in one of the narrow-width constructions that NIST recently published. Those are the best-understood secure hash functions we have.
SHA-3 candidates, no matter which ones they are, are each and all slightly less well understood than SHA-2, for all the obvious reasons.
If you are unhappy with the above assessment for *performance* reasons, well, that's a different kettle of fish entirely. Further discussion is needed, but it's easier to find a performance metric and select along it than it is to find a security metric. It can be done, obviously, but the debate is a lot more entertaining.
The hardest thing of all, of course, is to try to minimax the two considerations, security and performance. That debate would be really entertaining.
All of this, however, presupposes that Tor can't reserve an algorithm identifier for SHA-3 and design things so SHA-256 is used now and SHA-3 when it's finally available. And yes, that's easier said than done, but heck, that's what engineering is, too.
Jon -------
------- From: John Kelsey
For whatever it's worth, if someone asked me the same question, I'd give the same advice--go ahead and switch to SHA256 or SHA512, but make sure you build algorithm identifiers and versioning into your protocol, so that it's possible to change hash functions (and everything else) when you need to.
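The "algorithm identifiers and versioning" advice can be made concrete with a small sketch. The identifier values and function name below are hypothetical, just to show the shape of a wire format that can grow a SHA-3 entry later:

```python
import hashlib

# Hypothetical algorithm-identifier table: reserve IDs now so SHA-3 can
# be slotted in once it is standardized.
HASHES = {
    0x01: hashlib.sha1,     # legacy
    0x02: hashlib.sha256,   # current recommendation
    # 0x03: reserved for SHA-3
}

def tagged_digest(alg_id: int, data: bytes) -> bytes:
    """Prefix the digest with its algorithm identifier so old and new
    hashes can coexist on the wire during a migration."""
    return bytes([alg_id]) + HASHES[alg_id](data).digest()

d = tagged_digest(0x02, b"hello")
```

A receiver dispatches on the first byte, so rolling out a new hash needs only a table entry and a flag-day-free upgrade path, which is exactly the property both replies above are asking for.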
SHA1 has a claimed collision attack (which looks like it should be right, but has never been verified experimentally), and also has a certain collision attack at 2^{80} work, so if you are worried about collisions in your application, you should definitely switch, and I think SHA2 is pretty much the only good choice right now. Similarly, if you're in a position to change hash functions now, but it will be a big pain to do it in two or three years, switching to SHA2 makes good sense. On the other hand, if you're only using SHA1 with HMAC to do MACing and key derivation and random number generation, there doesn't appear to be any great urgency in switching over, as there are no results I'm aware of that come anywhere close to really threatening those applications. There's no reason to keep SHA1 around in those applications if you don't need to, but there is also no enormous urgency switching over to SHA2 or SHA3 for HMAC.
Finally, if you're only worried about preimage and second preimage resistance, SHA1 still seems okay. But be careful assuming that you don't care about collisions--there are a lot of weird and surprising things you can do to hash functions if you can find intermediate collisions. For example, if a single colliding message is no big deal, but a billion messages with the same hash is a disaster for your application, it turns out you care very much about the difficulty of finding one collision at a time in your Merkle-Damgård hash function. It's much safer to just use hash functions whose cryptographic strength and width makes collisions impractical to ever find--that is, hash functions like SHA2 instead of SHA1.
--John -------