Re: [tor-dev] memcmp() & co. timing info disclosures?

7 May 2011

      On Fri, 6 May 2011 23:14:58 -0400
Nick Mathewson <nickm@freehaven.net> wrote:
...
On Fri, May 6, 2011 at 10:40 PM, Marsh Ray <marsh@extendedsubset.com> wrote:
[...]
...
...
but the problem in general is worrisome and we
should indeed replace (nearly) all of our memcmps with
data-independent variants.
Maybe some of the str*cmps too? I grep 681 of them.
Yeah; most of these are not parsing anything secret, but we should
audit all of them to look for worrisome cases.  It's even less clear
what data-independence means for strcmp: possibly it means that the
run-time should depend only on the length of the shorter string, or
the longer string, or on one of the arguments arbitrarily designated
as the "non-secret" string, or such.  All of these could be reasonable
under some circumstances, but we should figure out what we actually
mean.
...
We should also look for other cases where any data or padding might be
checked, decompressed, or otherwise operated on without being as obvious as
calling memcmp. Lots of error conditions can disclose timing information.
Yeah.  This is going to be a reasonably big job.
...
...
(Pedantic nit-pick: we should be saying "data-independent," not
"constant-time."  We want a memcmp(a,b,c) that takes the same number
of cycles for a given value of c no matter what a and b are.  That's
data-independence.  A constant-time version would be one that took the
same number of cycles no matter what c is.)
That's a good point. In most of the code I glanced at, the length was fixed
at compile-time. I suppose a proper "constant-time" function would have to
take as much time as a 2GB comparison (on 32) :-).
...
int mem_neq(const void *m1, const void *m2, size_t n)
{
 const uint8_t *b1 = m1, *b2 = m2;
 uint8_t diff = 0;
 while (n--)
   diff |= *b1++ ^ *b2++;
 return diff != 0;
}
#define mem_eq(m1, m2, n) (!mem_neq((m1), (m2),(n)))
Looks good to me.
What if n is 0? Is 'equals' or 'neq' a more conservative default ?
If n is 0, then "equals" is the answer: all empty strings are equal, right? :)
...
Would it make sense to die in a well-defined way if m1 or m2 is NULL?
Adding a tor_assert(m1 && m2) would be fine.
...
Also, if the MSB of n is set it's an invalid condition, the kind that could
result from a conversion from a signed value.
Adding a tor_assert(n < SIZE_T_CEILING) is our usual way of handling this.
Also, as I said on the bug, doing a memcmp in constant time is harder
than doing eq/neq.  I *think* that all of the cases where we care
about memcmp returning a tristate -1/0/1 result, we don't need
data-independence... but in case we *do* need one, we'll have to do
some malarkey like
int memcmp(const void *m1, const void *m2, size_t n)
{
/*XXX I don't know if this is even right; I haven't tested it at all */
  const uint8_t *b1 = m1, *b2 = m2;
  int retval = 0;
while (n--) {
    const uint8_t v1 = b1[n], v2 = b2[n];
    int diff = (int)v1 - (int)v2;
    retval = (v1 == v2) * retval + diff;
  }
return retval;
}
GCC is likely to turn (v1 == v2) into a backdoor.  Also, we would need
to make sure sign extension is constant-time; it *probably* is on IA-32
and AMD64, but we may need to disassemble the compiler's output to
verify that on ARM.

Other than that, it looks correct.  We *can* fix the dependence on ==
and make the multiply unnecessary at the same time, though.

I've attached my optimized constant-time comparison functions for
16-byte and 32-byte values to this message.  They're packaged in the
format for a submission to SUPERCOP and/or NaCl, but for some reason I
never actually submitted them.

Robert Ransom