Re: [tor-dev] SHA-3 isn't looking so hot to me

4 Nov 2011

      On 2011-11-03, Jon Callas <joncallas@me.com> wrote:
...
Zooko forwarded a hash question over to the SHA-3 competition mailing list,
and mentioned the discussion that has been going on here. He's going to
forward over comments that I made and John Kelsey made. Nonetheless, I'd
like to offer some comments on what I've read in a larger context.
I don't see any discussions of requirements, goals, or anything like that. I
see discussions of what hash function to use, ECC, symmetric modes, and so
on, and yet no reasons why you would do anything. If you don't have
requirements, you aren't going to get a good answer. Any of those proposed
solutions needs to have an answer to the question "What is this trying to
solve?" at the very least. The next question should be, "How does it solve
it?"
For ‘What is this trying to solve?’, see:

* https://www.torproject.org/docs/documentation#DesignDoc

* https://svn.torproject.org/svn/projects/design-paper/tor-design.html

* https://svn.torproject.org/svn/projects/design-paper/challenges.pdf

* https://blog.torproject.org/blog/one-cell-enough

See also the tor-dev mailing-list threads started with these
documents:

* https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/ideas/xxx-cry...

* https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/ideas/xxx-cry...

But as you may have seen, the tor-dev Cryptographic Bikeshed Design
Task Force has ignored that part because it's more fun to talk
endlessly about the least important details.  (Based on the rest of
your message, I assume they didn't even tell you about that part
before inviting you into the discussion.)
...
Protocols are almost never broken because of the crypto. That's not the
problem. Tor as a protocol and as a *service* has many things it needs to
worry about long before you should worry about the crypto. Don't lose sight
of the real issues. Trust me, they aren't the crypto. You need to stick to
well-understood ciphers. Do not be frightened by warts on a function. The
property of being well-understood means that it will have warts.
Additionally, you should stick to well-recognized constructions, too. The
main reason is that warts and all, we know most about these things. The
alternatives frequently either have different warts, or have unknown warts
that will bite you in the ass when they're finally found. Moreover, Tor is
an important protocol, service, and system. If you use a weird-ass
construction, then a flaw in that will be publicized as a flaw in Tor. If
someone finds a non-important problem in AES or SHA256, you'll be shielded
by ubiquity. If you use the Whizzy2^n-1 whatever, then when a problem is
found in it, that will hit the press as "FLAW FOUND IN TOR! EVERYONE RUN FOR
THEIR LIVES!" and you'll be left defending it yourself. The crypto elders
will also cluck their tongues and say, "Yeah, they really should have used
what NIST documented in SP 800-9999." Lastly on that, Tor is no important
that you'll be a target. People *will* target Tor because a teenie, tiny,
unimportant break in Tor is a g
 uaranteed paper acceptance in Usenix Security or Eurocrypt. Anonymity loves
crowds; so do cryptographers.
Tor's use of well-recognized constructions hasn't stopped the recent
wave of ‘semi-technical bullshit attacks’ and plagiarisms of old,
not-particularly-effective attacks from winning the attention economy.

(One of the claims in the latest anti-Tor publicity campaign is that
using AES in CTR mode is a ‘bugdoor’, and that we knew that we should
have used CBC mode instead.  This is even after the news that use of
block ciphers in CBC mode breaks SSL's security.  Yes, people *lie* in
order to get publicity for an alleged attack on Tor.)
...
Here are some specifics:
* Hash functions. Zooko will forward my full response, the summary of which
is to use SHA-2 in whatever length or form. Should you not want to do that,
why? What are you after? Speed? How much speed? Again, I don't see what
you're trying to solve, but I'll recommend using Skein if you don't like
SHA-2. I'm a co-designer of it, for full disclosure.
Skein is secure, fast (under 7 clocks per byte on x64, and at 15.5 on
armv7), and has lots of Swiss Army Knife features. You can use it as a KDF,
PRNG, and so on. It has a one-pass MAC that has a proof of security on it,
so it is a valuable alternative to HMAC. ZRTP is using Skein's MAC to speed
up VOIP encryption as an alternative to HMAC (more full disclosure, I'm a
co-author of ZRTP, too, but Phil Z had to talk me into it, and he had real
performance goals to meet).
Skein sucks on 32-bit processors, and on AMD64 processors in 32-bit
mode.  All of our Windows packages are 32-bit.

And I'm not a fan of using a cryptographic primitive with as many
features as Skein has.  Implementation complexity sucks, especially in
crypto software.
...
However, the safe, sane thing to do is use SHA-256.
SHA-256 sucks unnecessarily on 64-bit processors.  Our fast relays are
64-bit.
...
* ECC. I don't see a reason why you're looking at it; my remark isn't code
for thinking it's a bad idea, it means I don't see *reasons*, and yeah,
yeah, I keep harping on that.
* RSA public keys are long.  Every client needs to download at least
  one public key (the circuit-extension handshake authentication key
  (called an ‘onion key’ in our current design documents)) for every
  relay.  (Clients also need enough information about each relay to
  open and authenticate a ‘link’ to it, but a hash of the relay's
  identity key suffices in our current (TLS-based) link protocols.)

* RSA private-key operations are slow.  Every time a client extends a
  circuit to a relay, the relay must perform one private-key operation
  using its circuit-extension handshake key.  (See also
  https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/ideas/xxx-nto...
  for one thing that we might replace our circuit-extension handshake
  with.  We might use an EC-based signature system for
  circuit-extension handshake authentication instead, so that more of
  the relay-side scalar-multiply operations can be performed in
  advance.)

* We want forward secrecy for circuits, so we use Diffie-Hellman, too.
  For every circuit extended to a relay, the relay must perform two
  scalar-multiply operations.  Currently, we use 320-bit exponents in
  a 1024-bit mod-p group published in the 1990s.  ECDH using
  Curve25519 is likely to be faster *and* more secure.

* With ECC, we can use whole public keys instead of fingerprints.
  This actually matters in the new hidden service protocol, so we will
  use an EC-based signature system there.
...
ECC can be faster than RSA. But sometimes it isn't. I have seen RSA 1280
running at 10x the speed of P-256. Note that P-256 has the security
properties of RSA 3072, so it's not comparing apples to apples. But I'll bet
that Tor could run for the next five years on RSA 1280 or RSA 1536, and punt
the problem into the future. It is my gut feel that if you migrate to 256
bit ECC from RSA 1024, you will see a performance drop. Nonetheless, you
need to move from RSA 1024 to *something*. I am just noting that from a
performance standpoint, you're not at the place where ECC is clearly better
yet. Yet.
We are.  (Yes, I know no one told you that.)
...
Now on the other hand, ECC is the wave of the future and why not bite the
bullet now? The patent issues are still thorny and that's another reason to
punt the decision into the future. The major ECC patents started expiring
this last summer, and if you delay the decision for three to five years,
it'll be pretty much a non-issue. On the other, other hand, I don't think
the Tor project has a lot to worry about. It is unlikely that RIM/Certicom
will come after Tor. Heck, they are as likely to donate a license to the
project as anything else. The IP issues shouldn't be much of a concern, but
if they are -- wait. I see no requirements that say why you need ECC, and as
I've said, you might find that ECC is slower than RSA 1280 or 1536.
See above for the requirements no one told you about.
...
Should you go to ECC, use P-256; do not use some variant of 25519. You want
to be in a crowd. You want to use something that lots of smart people have
looked at. That means P-256. For *any* counter argument, please supply the
requirement or goal that makes P-256 the inferior choice, preferably with a
metric. Absent such, P-256 is your baby.
In the proposed ‘ntor’ handshake, the relay performs a scalar-multiply
operation on a user-specified group element, using a long-term secret
key.  Curve25519 has a secure ‘twist’, so the relay can skip the
group-membership test which might be required with other EC groups.
Because of that (potentially significant) performance advantage, we
have put some effort into improving at least one Curve25519
implementation.  (I should note here that relays do become overloaded
with circuit-extension crypto in the deployed Tor network, so we
really do need to care about performance.)

I have also seen parameters for an Edwards curve equivalent to
Curve25519; we will need the Edwards-curve parameters in order to
implement point addition efficiently in constant time for our EC
signature scheme.  I would rather not have to fuss around with finding
the/an Edwards curve corresponding to NIST P-256.

We can represent a Curve25519 public key in 51 base32 characters; we
would need one additional character for a base32-encoded P-256 public
key (assuming point reduction), and we wouldn't gain much (if any)
security.

Any one of these reasons should be sufficient to outweigh ‘NSA told us
to use X’.
...
* Stream ciphers. For a stream cipher, you should be doing at AES-CTR.
Really. Stop worrying about the warts. There are hardware implementations
with AES-NI, or attached hardware that don't have problems. ArmV8 has AES in
hardware along with SHA256, which another reason to stick to the
well-understood. AES-NI on an x64 processor runs at under 1 clock per byte.
That's all goodness.
I don't have one of those.  Why should I have to use a cipher which no
one has implemented securely *and* efficiently on my computer?  Why do
you want to continue to inflict side-channel leaks on our users in
oppressed countries who have even older computers?
...
If, for some reason, you really, really want to go off on a limb, then look
seriously at counter mode with either Twofish or Serpent, as they're likely
the next best-understood algorithms there are. However, there's not a lot of
glory in breaking them, so they haven't seen nearly as much work. Ditto for
any eStream cipher. If you must be trendy, then heck, please look at
Threefish, which runs twice as fast as AES and is inherently tweakable (note
again, that I'm a co-designer). But really, just use AES-CTR.
Twofish is designed to be implemented using S-boxes, just like AES.
Using Twofish gives us all of the side-channel attack risk and no
speed benefit.

Serpent is ridiculously slow.

Salsa20 looks like a Good Thing.  Salsa20/8 would be even better.

In general, if we don't have to write a paragraph or two explaining
why the published attacks on an algorithm used in our relay crypto are
bogus or irrelevant to Tor, we're using too many rounds of it.  The
relay crypto only needs to make cryptographic attacks harder enough
than the end-to-end timing correlation attack that no one will ever
benefit by implementing an attack on our crypto.  Anything we are
willing to use will far exceed that goal.
...
GCM is an alternative, but I'd use CTR and an HMAC, myself, especially since
you don't seem to have performance goals that are driving it. As others have
noted, it's really tetchy to use correctly, and there are limits on the
security it gives you. There is a 128-bit hardware multiply in the same
Intel processors that have AES-NI to speed it up, but I'd stick to the
tried-and-true HMAC, absent reasons with metrics.
Whatever you do, please do not do something as gosh-darned daft as XOR two
keystreams together. The crypto is not the problem, and every time someone
tries to do something like that to strengthen the crypto it all ends up with
someone's eye out.
* Random number generators. You're bikeshedding. Any of those are great
things to do, and there's no reason why continuing with what you've been
doing (yeah, throw in more entropy sources, it never hurts) is wrong or
suboptimal. Whatever. Do what you want, but I think you have bigger fish to
fry.
* Fingerprints. Also a reasonable discussion of what color the shed should
be. My sole comment is that the gods punish hubris, and coding your hash
function into two bits is hubris. One of the major lessons of the obsessive
bit-conservation that PGP did is that you write more code and make more bugs
saving those bits than the bits were worth. If you use something smaller
than a byte, you will regret it.
I agree here.  We currently use a leading "$" to distinguish relay
identity-key fingerprints from relay nicknames; we should distinguish
new-style fingerprints by placing an algorithm identifier before the
"$".
...
* Other comments. In the relay crypto comments, there was something about
counter "in a less scary mode" -- just use what they describe in NIST
SP800-whatever.
This is ‘less scary’ from the point of view of resisting side-channel
attacks.  NSA was warned that array lookups with secret indices would
allow side-channel attacks.  NSA told us to not worry about the
side-channel attacks.  Now we know that we should have been worrying
about side-channel attacks all along, instead of just using the cipher
NSA told us to.

Robert Ransom