Re: [tor-dev] Even more notes on relay-crypto constructions

19 Oct 2012

On Thu, Oct 18, 2012 at 11:18 PM, Mike Perry <mikeperry@torproject.org> wrote:
...
Thus spake Nick Mathewson (nickm@alum.mit.edu):
...
On Thu, Oct 18, 2012 at 6:10 PM, Mike Perry <mikeperry@torproject.org> wrote:
 [...]
...
...
There are modes that are supposed to prevent this, and applying them
to a decent wide-block cipher might solve the issue. IGE is one of
them [IGE], but it turns out to be broken by an attacker who knows
some plaintext.  The Accumulated Block Chaining [ABC] construction is
supposed to fix that; I'm not too sure whether it's correct or
efficient.
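
For readers who have not seen IGE written out, here is a minimal sketch of
the mode and of the forward corruption it is meant to provide. It is not
from the thread. The 4-round Feistel "cipher" built on SHA-256 is a toy
stand-in for AES so the example needs only the standard library, and all
function names are made up for illustration. It does not demonstrate the
known-plaintext repair discussed in this thread.

```python
import hashlib

BLOCK = 16  # bytes per block, same size as an AES block


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Toy round function. NOT a real cipher, only a stand-in permutation.
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, xor(left, _round(key, i, right))
    return left + right


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = xor(right, _round(key, i, left)), left
    return left + right


def ige_encrypt(key: bytes, iv_c: bytes, iv_p: bytes, plaintext: bytes) -> bytes:
    # c_i = E(p_i XOR c_{i-1}) XOR p_{i-1}
    prev_c, prev_p = iv_c, iv_p
    out = []
    for i in range(0, len(plaintext), BLOCK):
        p = plaintext[i:i + BLOCK]
        c = xor(toy_encrypt_block(key, xor(p, prev_c)), prev_p)
        out.append(c)
        prev_c, prev_p = c, p
    return b"".join(out)


def ige_decrypt(key: bytes, iv_c: bytes, iv_p: bytes, ciphertext: bytes) -> bytes:
    # p_i = D(c_i XOR p_{i-1}) XOR c_{i-1}
    prev_c, prev_p = iv_c, iv_p
    out = []
    for i in range(0, len(ciphertext), BLOCK):
        c = ciphertext[i:i + BLOCK]
        p = xor(toy_decrypt_block(key, xor(c, prev_p)), prev_c)
        out.append(p)
        prev_c, prev_p = c, p
    return b"".join(out)


def corrupted_blocks(original: bytes, tampered: bytes) -> list:
    # Indices of the blocks that differ between two equal-length strings.
    return [i // BLOCK for i in range(0, len(original), BLOCK)
            if original[i:i + BLOCK] != tampered[i:i + BLOCK]]


def demo_propagation() -> list:
    key, iv_c, iv_p = b"k" * 16, bytes(16), bytes([1]) * 16
    plaintext = b"A" * 16 + b"B" * 16 + b"C" * 16 + b"D" * 16
    ciphertext = bytearray(ige_encrypt(key, iv_c, iv_p, plaintext))
    ciphertext[BLOCK] ^= 0x01  # flip one bit in ciphertext block 1
    garbled = ige_decrypt(key, iv_c, iv_p, bytes(ciphertext))
    return corrupted_blocks(plaintext, garbled)
```

Because each plaintext block feeds into the decryption of the next, the
single flipped bit garbles every block from the modified one onward,
as long as nobody repairs the chain.
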
Am I crazy to think we might try to stop the bleeding of tagging attacks
by figuring out a way to use ABC or IGE mode as a stopgap until people
can code and evaluate new constructions for performance and timing
side-channels? ABC/IGE would "only" involve a mode change, rather than
an entire relay protocol upgrade and new cipher coding..
ABC or IGE wouldn't help us much on their own without a wide-block
cipher, and IGE is just plain broken. (See explanation below.)
Remember, in the document I originally sent, I was talking about using
ABC or some other corruption-propagation mode at a block level.  That
requires a wide-block cipher, though.  And it turns out we can do
better if the corruption-propagation is part of the wide-block idea.
We'd also burn our performance on platforms without AES acceleration, I think.
...
IGE might also actually exist in OpenSSL:
http://www.links.org/?p=137
It also sounds like IGE is only broken if we try to use it for
authentication.. We don't really need that property, do we? What we
really want is the plaintext corruption property at the middle node upon
ciphertext modification..
That _is_ a kind of authentication, or an analogue to it.  And the
point is that an adversary can repair a hole in the stream, and *stop*
the plaintext corruption.  So IGE does not deliver the property we
would want for it, even if we could use it.
I am still wondering if it is possible to eliminate enough consecutive
regions of known plaintext to make this acceptable for the short-term,
until we figure out the wide-block thing for real. From the attack here:
http://www.mail-archive.com/cryptography@metzdowd.com/msg06599.html it
looks as though, as long as we can avoid 32 consecutive bytes of known
plaintext (two consecutive 128-bit cipher blocks), we can prevent
hole-closing.
But 32 consecutive bytes of known plaintext are pretty much
inevitable, right?  That's how protocols work.

Also, that is *one* attack on IGE.  That one needs a given amount of
known plaintext.  I have no idea which others exist.  Generally once
there's one known attack on a system, it gets less attention from
cryptographers, since who'd bother going around enumerating all the
other circumstances under which a broken thing is broken.  So when you
read that the system isn't secure because it breaks if you do X, you
can't usually infer that it's okay so long as you _don't_ do X -- you
need to look for actual proofs or whatever that it is safe under the
circumstance you propose.
...
If you want to know why I'm crazy enough to still be wondering this,
see subsequent paragraphs.
...
Check out this thread, and the stuff it references:
   http://www.mail-archive.com/cryptography@metzdowd.com/msg06599.html
...
We could also remove a lot of known plaintext by replacing zero-fill
with random fill in RELAY_RESOLVE, RELAY_BEGIN, and other short relay
cells. That should only be expensive at the client...
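
A minimal sketch of the random-fill idea, not taken from the Tor source:
the 498-byte data size is an assumption about the relay cell layout, and
pad_relay_data is a hypothetical helper name. The receiver would still
rely on the length field in the relay header to find the end of the real
data.

```python
import os

RELAY_DATA_LEN = 498  # assumed usable data bytes in one relay cell


def pad_relay_data(data: bytes, random_fill: bool = True) -> bytes:
    """Pad a short relay payload to full length (hypothetical helper)."""
    if len(data) > RELAY_DATA_LEN:
        raise ValueError("payload too long for one relay cell")
    pad_len = RELAY_DATA_LEN - len(data)
    # Zero fill gives an observer long runs of known plaintext;
    # random fill leaves only the real data as guessable.
    padding = os.urandom(pad_len) if random_fill else bytes(pad_len)
    return data + padding
```
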
So long as there is a block's worth of known or guessed plaintext, IGE
fails to ensure that changes propagate forward.  Like, 16 bytes worth
of guessable HTTP in a payload (if you're thinking about this in a
non-wide-block scenario).
Hrmm.. I think that failures after the stream is established are way
less dangerous than ways you can tag and cause failures *before* the
stream is established. In the pre-established case, Tor keeps retrying
transparently behind the user's back until it gets a compromised exit.
In the post-established case, the user is completely unable to use Tor
80% or 90% of the time, because the circuit is torn down *after* their
user agent has begun sending data.. In other words, at least we would
fail closed.
So could one workaround be to time out after fewer exits,
and/or notice differential stream failure rates between different
guards?  That would be a pretty neat thing to do; I wonder if it would
work.
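
A rough sketch of what "notice differential stream failure rates between
different guards" could look like. Nothing here is an existing Tor
mechanism: the function name, the thresholds, and the shape of the stats
dictionary are all assumptions chosen for illustration.

```python
def suspicious_guards(stats: dict, min_attempts: int = 20,
                      factor: float = 3.0) -> list:
    """Flag guards whose stream failure rate is far above the rest.

    stats maps a guard name to an (attempts, failures) pair.
    """
    flagged = []
    for guard, (attempts, failures) in stats.items():
        if attempts < min_attempts:
            continue  # too little data to judge this guard
        other_attempts = sum(a for g, (a, f) in stats.items() if g != guard)
        other_failures = sum(f for g, (a, f) in stats.items() if g != guard)
        if other_attempts == 0:
            continue  # nothing to compare against
        rate = failures / attempts
        baseline = other_failures / other_attempts
        # Floor the baseline so a failure-free network does not make
        # a single failure look suspicious.
        if rate > factor * max(baseline, 0.01):
            flagged.append(guard)
    return flagged
```

Whether any fixed threshold like this would separate an attacking guard
from an unlucky one is exactly the open question.
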

Of course we'd need to work out the details and get it implemented solidly.
...
This reminds me of something I also wanted to ask about. Technically for
the tagging attack, all we need to authenticate is circuit construction
and RELAY_RESOLVE and RELAY_BEGIN. Might there be ways to get this
without the expense and complications of either truncated MACs or
wide-block ciphers? Or at least remove known-plaintext from *those*
cells?
I don't think those are the only attack opportunities here.  Again,
they're ones that have been explained and proposed, but there's surely
more stuff too.
...
...
Two general process thoughts:
* I may be saying this from an overabundance of caution, but: I don't
think we should use cryptographic primitives and constructions with
known flaws, even if we can't see a way for them to hurt us right now,
and even if we can come up with a solid-seeming argument for how those
flaws can't hurt us.  That's how we got into our AES-CTR mess in the
first place.
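
For context, a minimal sketch of why a counter-mode stream cipher allows
tagging: flipping a ciphertext bit flips exactly the same plaintext bit,
and nothing else changes. The SHA-256 keystream below is a toy stand-in
for AES-CTR, and every name is hypothetical.

```python
import hashlib


def toy_keystream(key: bytes, n: int) -> bytes:
    # Hash a counter to get keystream blocks; a stand-in for AES-CTR.
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]


def stream_xor(key: bytes, data: bytes) -> bytes:
    # Encryption and decryption are the same XOR with the keystream.
    return bytes(d ^ k for d, k in zip(data, toy_keystream(key, len(data))))


def tag(ciphertext: bytes, offset: int, mask: int) -> bytes:
    # What a tagging relay does: XOR a chosen mask into one byte.
    out = bytearray(ciphertext)
    out[offset] ^= mask
    return bytes(out)
```

A colluding relay further along the circuit can apply the same mask
again to undo the tag, so the cell decrypts cleanly at the far end.
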
I would argue that where we *really* need an overabundance of caution is
to ensure we provide the agility to change the cipher mode/construction
for this scheme in a very short period of time. I don't think our *real*
woes are because we didn't think hard enough about cryptography or the
security properties of AES_CTR. They're because we fixed the cipher and
mode at "AES_CTR", and now we're going to be stuck with vulnerability to
a very dangerous attack for years.. "If you're typing the letters AES
into your code, you're doing it wrong."
Well, keep in mind that we didn't, and still don't, have a drop-in
replacement that's any good.  Our design right now has a place where
you plug in a stream cipher.  Sure, we could have made it so you can
drop in RC4 or 3DES-OFB or whatever craziness we would have come up
with in 2004.  But dropping in something *good* instead of AES_CTR
requires that it not be a stream cipher.  And we don't have a
non-stream-cipher mode that works here.
...
Based on this idea, I'm wondering if we should spend more of our time
thinking hard about making the relay protocol be able to support
changing the construction/primitive so we can support a readily
available but non-ideal mode for 0.2.4.x, but then upgrade to something
stronger for 0.2.5.x. (And when *that* construction/implementation turns
out to be flawed or have side-channels, we can switch again in 0.2.6.x).
If we spend time on ensuring this agility instead of pondering the deep
magic of wide-block ciphers, we might be able to roll out AES_IGE +
eliminate consecutive regions of pre-established relay cell known
plaintext for 0.2.4.x, and then save the deep magic for 0.2.5.x or
beyond.
I looked through Proposal 202, and I don't see any mechanism for
switching constructions/cipher choices in there?
That'd be semi-implicit in proposal 200, where you use the create cell
type to select the crypto you want.
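
A sketch of how that kind of selection might look from the client side.
The numeric type values and construction names are invented for
illustration and are not taken from proposal 200 or 202.

```python
# Hypothetical create-cell type values and the relay crypto each implies.
RELAY_CRYPTO_BY_CREATE_TYPE = {
    0: "aes-ctr",      # the existing stream-cipher construction
    1: "wide-block",   # placeholder for a future construction
}


def choose_create_type(relay_supported: set, preference: list) -> int:
    """Pick the first preferred create type that the relay supports."""
    for create_type in preference:
        if create_type in relay_supported:
            return create_type
    raise ValueError("no mutually supported relay crypto")


def relay_crypto_for(create_type: int) -> str:
    """Map a chosen create type to its construction name."""
    if create_type not in RELAY_CRYPTO_BY_CREATE_TYPE:
        raise ValueError("unknown create type")
    return RELAY_CRYPTO_BY_CREATE_TYPE[create_type]
```

In this picture, turning off the old construction amounts to removing
its entry from the preference list, which only works once every relay
a client might pick supports something newer.
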

Migrating to a new algorithm here will be kind of fun -- or rather,
disabling the old one will be -- since the only way to turn off the
old one entirely is to stop allowing servers who don't support the new
one on the network.  That could use a better writeup and thought
process than we've given it yet.
...
...
* I know everybody wants our crypto problems to get solved, but it's
critical to get this stuff right.  I think that the way to do right by
our users is by taking the time we will need to design the right thing
properly, rather than jumping into something halfcocked.  We all
acknowledge that it's easy for people and organizations to screw this
stuff up: so let's take our time and actually come up with something
solid.  Against the pain and badness of our current system, we
must weigh the potential harm of jumping precipitously into something
that turns out to be broken because we didn't think about it hard
enough.
Will I ever be able to convince you of the value of "jumping early and
often?" ;)
Only by having it pay off.   :)

yrs,
-- 
Nick