On Fri, 6 May 2011 23:14:58 -0400 Nick Mathewson nickm@freehaven.net wrote:
On Fri, May 6, 2011 at 10:40 PM, Marsh Ray marsh@extendedsubset.com wrote: [...]
but the problem in general is worrisome and we should indeed replace (nearly) all of our memcmps with data-independent variants.
Maybe some of the str*cmps too? I grep 681 of them.
Yeah; most of these are not parsing anything secret, but we should audit all of them to look for worrisome cases. It's even less clear what data-independence means for strcmp: possibly it means that the run-time should depend only on the length of the shorter string, or the longer string, or on one of the arguments arbitrarily designated as the "non-secret" string, or such. All of these could be reasonable under some circumstances, but we should figure out what we actually mean.
We should also look for other cases where any data or padding might be checked, decompressed, or otherwise operated on without being as obvious as calling memcmp. Lots of error conditions can disclose timing information.
Yeah. This is going to be a reasonably big job.
(Pedantic nit-pick: we should be saying "data-independent," not "constant-time." We want a memcmp(a,b,c) that takes the same number of cycles for a given value of c no matter what a and b are. That's data-independence. A constant-time version would be one that took the same number of cycles no matter what c is.)
That's a good point. In most of the code I glanced at, the length was fixed at compile-time. I suppose a proper "constant-time" function would have to take as much time as a 2GB comparison (on 32) :-).
int mem_neq(const void *m1, const void *m2, size_t n) { const uint8_t *b1 = m1, *b2 = m2; uint8_t diff = 0; while (n--) diff |= *b1++ ^ *b2++; return diff != 0; } #define mem_eq(m1, m2, n) (!mem_neq((m1), (m2),(n)))
Looks good to me.
What if n is 0? Is 'equals' or 'neq' a more conservative default ?
If n is 0, then "equals" is the answer: all empty strings are equal, right? :)
Would it make sense to die in a well-defined way if m1 or m2 is NULL?
Adding a tor_assert(m1 && m2) would be fine.
Also, if the MSB of n is set it's an invalid condition, the kind that could result from a conversion from a signed value.
Adding a tor_assert(n < SIZE_T_CEILING) is our usual way of handling this.
Also, as I said on the bug, doing a memcmp in constant time is harder than doing eq/neq. I *think* that all of the cases where we care about memcmp returning a tristate -1/0/1 result, we don't need data-independence... but in case we *do* need one, we'll have to do some malarkey like
int memcmp(const void *m1, const void *m2, size_t n) { /*XXX I don't know if this is even right; I haven't tested it at all */ const uint8_t *b1 = m1, *b2 = m2; int retval = 0;
while (n--) { const uint8_t v1 = b1[n], v2 = b2[n]; int diff = (int)v1 - (int)v2; retval = (v1 == v2) * retval + diff; }
return retval; }
GCC is likely to turn (v1 == v2) into a backdoor. Also, we would need to make sure sign extension is constant-time; it *probably* is on IA-32 and AMD64, but we may need to disassemble the compiler's output to verify that on ARM.
Other than that, it looks correct. We *can* fix the dependence on == and make the multiply unnecessary at the same time, though.
I've attached my optimized constant-time comparison functions for 16-byte and 32-byte values to this message. They're packaged in the format for a submission to SUPERCOP and/or NaCl, but for some reason I never actually submitted them.
Robert Ransom