Hi, Robert! Hi, Jon!
As usual, please take me not as being "That fellow who is a pompous ass and says things that aren't true" but rather as "that fellow who knows that he is probably wrong about some stuff, and doesn't know a better way to find out what he's wrong about than getting corrected."
IOW, this is what I think now, but if it's not reasonable, I want to think something else.

On Fri, Nov 4, 2011 at 9:01 AM, Robert Ransom rransom.8774@gmail.com wrote:
On 2011-11-03, Jon Callas joncallas@me.com wrote:
Zooko forwarded a hash question over to the SHA-3 competition mailing list, and mentioned the discussion that has been going on here. He's going to forward over comments that I made and John Kelsey made. Nonetheless, I'd like to offer some comments on what I've read in a larger context.
I don't see any discussions of requirements, goals, or anything like that. I see discussions of what hash function to use, ECC, symmetric modes, and so on, and yet no reasons why you would do anything. If you don't have requirements, you aren't going to get a good answer. Any of those proposed solutions needs to have an answer to the question "What is this trying to solve?" at the very least. The next question should be, "How does it solve it?"
For ‘What is this trying to solve?’, see:
On this last point about cell-tagging attacks, see also https://lists.torproject.org/pipermail/tor-dev/2011-November/003024.html for my current thinking.
See also the tor-dev mailing-list threads started with these documents:
Right; and keep in mind that the requirements themselves are also in an evolving draft status.
I think that one of the main problems with the draft I circulated the other week is that it tried to start discussion on three questions:
* Should we keep using our current primitive operations (SHA1, AES-CTR starting at IV 0, 1024-bit DH, 1024-bit RSA, OpenSSL PRNG) or switch to something else?
* Regardless of what primitive operations we're using, should we keep using them in the same ways in our protocol, or are there protocol improvements we could reasonably make? (That's the main point of Section 6 for relay crypto, and of the Goldberg-Stebila-Ustaoglu "ntor" design for circuit extension.)
* Assuming that we pick any primitives and protocols other than the ones we currently have, how can we best migrate?
It should probably have been predictable that the first one would have gotten all the attention. Still, I'm not going to call "bike shed" too loudly here, since it helps people get involved in design, *and* the end result is actually important.
Let me try to motivate this and explain goals a bit better. We want to choose our crypto primitives and protocols for a few properties. Some of the more justifiable ones from an engineering perspective are:
[Please treat the above as a draft and tentative set of goals, motivations, and requirements. If you think any of these are stupid and wrongheaded, or just unmotivated, please tell me so!]
A) They should be strong enough that attacking the crypto primitives and protocols is not the best attack on Tor. (We have this property today, I believe, and will continue to have it so long as nobody in the research community comes up with a solution to the low-latency traffic correlation attack.)
B) Further, crypto attacks should not be even _close_ to the best attacks on Tor. (Except for the cell tagging attack, I think we have this property today. With cell tagging, however, the attack is IMO only a little worse than the end-to-end timing-attacks that a smart attacker would be doing instead. Moreover, to the extent that it's feasible, I'd like to reach a situation where every crypto- or protocol-based attack on Tor is (and remains!) many orders of magnitude more expensive than the traffic correlation attacks that the research community doesn't know how to solve. Ideally, I'd like every crypto- or protocol-based attack to be hideously expensive on an absolute scale, if not downright impossible.)
C) We need good performance. Our original design goal was "commodity hardware should be able to saturate commodity bandwidth." Right now, though, some of the busier servers do seem to be getting CPU-bound. (If I'm remembering what I've seen from profiles, the time sink seems to be to some extent AES and SHA1, and in larger part exponentiation for RSA and DH. But I'd want to double-check that to be sure. See comments below on performance requirements.) We also need our cell-crypto protocol to be space-efficient: bandwidth is at a premium on the Tor network.
D) We'd like a large security margin in our protocol designs. For example, if we have a choice between two protocols, where one of them requires that SHA256 be collision-resistant while the other one only requires preimage-resistance, it would be nicer (ceteris paribus!) to use the second. Or if we can use our cipher in two modes, one of which requires that it resist adaptive chosen plaintext attacks, and the other one only requires that it resist known-plaintext attacks, the second one would seem (again, all other things being equal) better.
E) Because our crypto aims to provide forward secrecy, the parts that do so need to have higher life expectancy than the rest of the system. For example, it's no big deal if somebody in the future manages to forge signatures from a key that's no longer valid: that doesn't open up new attacks. But if they break a stream cipher or a DH-based key agreement scheme that we are no longer using, that would be poor. That argues (I think) for higher security margins in our key agreement, key derivation, and encryption than in signatures and integrity.
F) We ought to look for DOS opportunities and resist or mitigate them where possible. (This isn't limited to crypto stuff, but it does show up here a bit.)
And here are some implementation concerns:

G) As much as possible, we shouldn't have to be the primary maintainers of our crypto primitives.
H) We need portable implementations of our primitives.
I) We need high-quality implementations. To me, right now, this suggests that I'd like categorical rather than empirical arguments for non-viability of timing attacks. (IOW, I would much rather say "This implementation cannot have a timing side channel" than "Any timing side-channel that this implementation might have is probably tiny, infrequent, and hard to exploit.")
And here are some more fluffy concerns:
J) Nothing should be embarrassing, even if it's safe. It undermines confidence and breeds uncertainty when there turns out to be a weakness in a primitive or protocol... even if as in D above we give ourselves enough margin to tolerate unforeseen weaknesses in our protocols or primitives, and even if as in B above we ensure that crypto attacks remain a deeply stupid way to attack Tor.
K) It would be nice not to be the biggest or most prominent user of anything if we can possibly help it. This ensures that a wider variety of cryptographers are looking at it.
[...]
Protocols are almost never broken because of the crypto. That's not the problem. Tor as a protocol and as a *service* has many things it needs to worry about long before you should worry about the crypto. Don't lose sight of the real issues. Trust me, they aren't the crypto.
Believe me, we know. Just because this is the Tor development topic that you're seeing first does not mean that it is the only one, or even the highest-priority one on my plate. :)
You need to stick to well-understood ciphers. Do not be frightened by warts on a function. The property of being well-understood means that it will have warts.
I think I either agree or disagree with you here depending on what you mean by 'warts'. I agree with you when it means stuff that's undesirable but easy to work around (like the length-extension properties of Merkle-Damgård hashes), or that is undesirable but impossible to exploit (like a 2^126.5 attack on a 128-bit system).
But I am less sanguine about stuff that weakens a primitive to a degree that I don't understand. For example, consider the AES256 key schedule problems, which as I understand them are interesting but as yet unexploitable, since (again, if I understand right) you need a ridiculous number of plaintexts encrypted with related keys, and any decent KDF will ensure that you don't have such related keys. But these attacks do seem to get better every year, and they make me nervous about AES256. (We're using AES128 these days, as noted elsewhere.)
You're right that it's a major cognitive error to think that a system with fewer known problems is necessarily better than one with more known problems, unless you know how much attention both systems have received.
Additionally, you should stick to well-recognized constructions, too. The main reason is that warts and all, we know most about these things. The alternatives frequently either have different warts, or have unknown warts that will bite you in the ass when they're finally found.
Agreed; we shouldn't be rolling our own MACs, for example.
This is an area BTW where our existing protocols are not so great: there are some places in the protocol where we use H(x||y) when HMAC would be a better choice. Our ersatz KDF is as-yet-unbroken but kinda idiosyncratic. [ H(secret || [0]) || H(secret || [1]) || ... ]. We made some roll-our-own choices when we were getting started, and I'd like to move towards standard and well-analyzed stuff instead.
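For concreteness, the ersatz KDF above can be sketched like this (illustrative Python, not Tor's actual code; the HMAC-based function is a generic RFC 5869-style expand step, shown only for contrast with a standard, well-analyzed construction):

```python
import hashlib
import hmac

def legacy_kdf(secret: bytes, length: int) -> bytes:
    """Sketch of the ersatz KDF described above:
    K = H(secret || [0]) || H(secret || [1]) || ...  (H = SHA-1 here)."""
    out = b""
    i = 0
    while len(out) < length:
        out += hashlib.sha1(secret + bytes([i])).digest()
        i += 1
    return out[:length]

def hmac_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """A standard alternative: the HKDF-Expand step from RFC 5869,
    built on HMAC-SHA256."""
    out, t, i = b"", b"", 1
    while len(out) < length:
        t = hmac.new(prk, t + info + bytes([i]), hashlib.sha256).digest()
        out += t
        i += 1
    return out[:length]
```

Both derive arbitrary-length key material from a shared secret; the difference is that HMAC-based expansion has a well-studied security reduction, while ad-hoc H(secret || counter) is merely as-yet-unbroken.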
Also, this is an area where we don't have off-the-shelf solutions for everything. For better or worse, there isn't a good set of highly analyzed protocols for some of the things we need to build a working onion routing network these days. Most notably, the circuit establishment protocol and the cell relay protocol are ones where we had to invent stuff as we began, since there didn't seem to be a preexisting solution for either. In both cases, we wound up with something kind of inefficient with some well-known but not-so-bad issues.
And in both cases we wound up with protocols where I believe we could do better today, given care and attention.
Moreover, Tor is an important protocol, service, and system. If you use a weird-ass construction, then a flaw in that will be publicized as a flaw in Tor. If someone finds a non-important problem in AES or SHA256, you'll be shielded by ubiquity. If you use the Whizzy2^n-1 whatever, then when a problem is found in it, that will hit the press as "FLAW FOUND IN TOR! EVERYONE RUN FOR THEIR LIVES!" and you'll be left defending it yourself. The crypto elders will also cluck their tongues and say, "Yeah, they really should have used what NIST documented in SP 800-9999." Lastly on that, Tor is so important that you'll be a target. People *will* target Tor because a teenie, tiny, unimportant break in Tor is a guaranteed paper acceptance in Usenix Security or Eurocrypt. Anonymity loves crowds; so do cryptographers.
And how!
Tor's use of well-recognized constructions hasn't stopped the recent wave of ‘semi-technical bullshit attacks’ and plagiarisms of old, not-particularly-effective attacks from winning the attention economy.
(One of the claims in the latest anti-Tor publicity campaign is that using AES in CTR mode is a ‘bugdoor’, and that we knew that we should have used CBC mode instead. This is even after the news that use of block ciphers in CBC mode breaks SSL's security. Yes, people *lie* in order to get publicity for an alleged attack on Tor.)
I don't think that's a lie -- I think that's a combination of a probably-independent rediscovery of the cell-tagging attack with an inadequate understanding of Tor's threat model and the problems with CBC for our use.[*] Never attribute to malice etc.
(And never attribute to plagiarism what can be explained by an unfamiliarity with parts of the literature, or a buried-but-still-present credit. The more serious the charge, the more caution and forbearance is warranted, especially when people are already ticked off with one another.)
[*] The problems with "let's use CBC" in summary: If you do CBC wrong and don't use a new IV per cell, you're exposing yourself to something like the Rizzo-Duong attack on SSL. If you do CBC right, you need one IV per layer per cell, which is a pretty high overhead for Tor. And CBC doesn't get you particularly good tagging resistance: it just lowers the number of bits you can tag and your ability to recover the stream afterwards. So if you're willing to burn one block per hop per cell, you're better off using those bytes for a real MAC or GCM or something than for a CBC IV. [And if you are hoping to avoid transmitting IVs by having each cell get a fresh set of IVs derived from a keyed PRNG or something, you aren't really doing orthodox CBC any more.]
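For anyone following along, the malleability behind the cell-tagging attack is easy to demonstrate. Here's a toy sketch (NOT real crypto: a SHA-256-based keystream stands in for AES-CTR, and none of the keys or payloads are Tor's): without a per-hop MAC, XORing a pattern into the ciphertext XORs the same pattern into the decrypted plaintext.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    # Counter-mode keystream with SHA-256 as a stand-in PRF for AES.
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

key = b"illustrative relay-layer key"
cell = b"RELAY cell payload: hello from the client"
ct = xor(cell, keystream(key, len(cell)))

# A tagging adversary XORs a chosen pattern into the ciphertext...
tag = b"\x00" * 6 + b"TAG" + b"\x00" * (len(ct) - 9)
tampered = xor(ct, tag)

# ...and honest decryption yields the plaintext with exactly that
# pattern XORed in at the same positions (no MAC, so nothing notices):
pt = xor(tampered, keystream(key, len(cell)))
assert pt == xor(cell, tag)
```

CBC would merely constrain which bits an attacker can usefully flip; spending the same bytes on a real MAC stops the tampering outright.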
Here are some specifics:
- Hash functions.
[...]
Skipping this part for now. FWIW, I like Skein myself, and actually like its "pile of features" approach. (To my mind, if you don't provide a construction in a hash function for doing X (where X is personalization, MAC, PRNG, or whatever), people will go and do it badly. I know I have!) But you're right that we ought to write up speed requirements first.
I think that for all our speed discussions, we ought to stop, instrument some busy servers, and look at how many cell encryptions, circuit extension handshakes, connection handshakes, and miscellaneous "other" crypto operations they actually spend their time on. Then we can talk categorically about which parts of the system matter how much for performance.
We do need to be reasonably good on weak clients, but it's more important to be fast on servers IMO.
[...]
- ECC. I don't see a reason why you're looking at it; my remark isn't code for thinking it's a bad idea, it means I don't see *reasons*, and yeah, yeah, I keep harping on that.
For me, the big reason is this one:
- We want forward secrecy for circuits, so we use Diffie-Hellman, too.
For every circuit extended to a relay, the relay must perform two scalar-multiply operations. Currently, we use 320-bit exponents in a 1024-bit mod-p group published in the 1990s.
Because of reasons C (gotta be fast), D (gotta have security margins), and E (forward secrecy needs the biggest margins) from above, I think our DH operations are the ones where we should be most conservative about key size, and that plus performance requirements suggests a move to ECDH to me. Whether it's next year, in two years, or in three years is going to depend mostly on what the lawyers say.
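For scale, the handshake cost Robert describes can be sketched like this (toy Python only: the modulus below is a tiny Mersenne prime and the exponents are short, purely to show the shape of the computation; real Tor uses a fixed 1024-bit mod-p group with 320-bit exponents):

```python
import secrets

p = 2**127 - 1   # toy prime modulus -- NOT a secure DH group
g = 3            # toy generator

# Each side picks a secret exponent (Tor: 320-bit exponents).
a = 2 + secrets.randbelow(2**64)
b = 2 + secrets.randbelow(2**64)

# One modular exponentiation per side produces the public value...
A = pow(g, a, p)
B = pow(g, b, p)

# ...and one more per side derives the shared secret. A relay pays
# this exponentiation cost for every circuit extended through it,
# which is why the choice of group and key size dominates CPU load.
assert pow(B, a, p) == pow(A, b, p)
```

The appeal of ECDH is that a comparable or better security level costs far fewer bits and far less CPU per handshake than scaling the mod-p group up.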
[...]
Should you go to ECC, use P-256; do not use some variant of 25519. You want to be in a crowd. You want to use something that lots of smart people have looked at. That means P-256. For *any* counter argument, please supply the requirement or goal that makes P-256 the inferior choice, preferably with a metric. Absent such, P-256 is your baby.
Possible! I want to talk to a bunch of ECC people here.
Robert: from your list of properties, I think some of them are desirable, but not flatly compelling on their own. Nothing in ntor will _fail_ if we need to do group membership tests. Smaller keys and point representations are nice-to-have, but not so critical on their own.
The overall speed advantage is what has been most attractive to me, but before we can decide whether it's compelling, we need more analysis and data.
I think we can reach the best conclusions here by avoiding getting too married to any particular solution.
[...]
- Stream ciphers. For a stream cipher, you should be doing AES-CTR.
Really. Stop worrying about the warts. There are hardware implementations with AES-NI, or attached hardware, that don't have problems. ARMv8 has AES in hardware along with SHA256, which is another reason to stick to the well-understood. AES-NI on an x64 processor runs at under 1 clock per byte. That's all goodness.
I don't have one of those. Why should I have to use a cipher which no one has implemented securely *and* efficiently on my computer? Why do you want to continue to inflict side-channel leaks on our users in oppressed countries who have even older computers?
I doubt Jon wants to inflict anything on anybody.
At their best, hardware-based AES implementations can be pretty good these days. We do however need to see whether we can find better software-based implementations than openssl's, which is not making me thrilled at the moment. (Its solutions for avoiding cache-based timing seem kludgier than I would like, if I am understanding them correctly.)
For software AES, some of what I've been reading about bit-slicing implementations of AES is pretty impressive. Assuming that we use as much of any given counter stream as I believe we typically do, we can easily tolerate the requirement of having to encrypt 1-4k of stream material in one go.
Of course, there could be issues here that I don't at all know about! More knowledge and analysis is needed.
[...]
In general, if we don't have to write a paragraph or two explaining why the published attacks on an algorithm used in our relay crypto are bogus or irrelevant to Tor, we're using too many rounds of it. The relay crypto only needs to make cryptographic attacks sufficiently harder than the end-to-end timing correlation attack that no one will ever benefit by implementing an attack on our crypto. Anything we are willing to use will far exceed that goal.
I think some of the goals I list above suggest that we want more of a security margin than you suggest here. In particular, consider that the security of our cipher affects our forward secrecy. (Not that you're necessarily wrong, but it is not 100% clear-cut to me.)
[...]
- Other comments. In the relay crypto comments, there was something about counter "in a less scary mode" -- just use what they describe in NIST SP800-whatever.
Guessing that you mean SP800-38A appendix B? There's more than one thing that looks like counter mode or a variation thereof in the SP800-... series.
Assuming we're still on counter mode, yeah, what you said. Right now we use counter mode with a fixed initial IV, which isn't *bad* necessarily, but random-IV would be better.
Ugh, what a long mail! I hope we are closer to asking the right questions soon.
cheers,