[tor-bugs] #23061 [Core Tor/Tor]: crypto_rand_double() should produce all possible outputs on platforms with 32-bit int

Thu Oct 25 02:58:00 UTC 2018

#23061: crypto_rand_double() should produce all possible outputs on platforms with
32-bit int
-------------------------------------------------+-------------------------
 Reporter:  teor                                 |          Owner:  teor
     Type:  defect                               |         Status:
                                                 |  assigned
 Priority:  Medium                               |      Milestone:  Tor:
                                                 |  0.3.6.x-final
Component:  Core Tor/Tor                         |        Version:  Tor:
                                                 |  0.2.2.14-alpha
 Severity:  Normal                               |     Resolution:
 Keywords:  tor-relay, security-low, privcount,  |  Actual Points:  0.5
  029-backport, review-group-22,                 |
  034-triage-20180328, 034-removed-20180328,     |
  031-unreached-backport, 035-roadmap-subtask,   |
  035-triaged-in-20180711                        |
Parent ID:  #26637                               |         Points:  0.1
 Reviewer:                                       |        Sponsor:
                                                 |  SponsorV-can
-------------------------------------------------+-------------------------

Comment (by riastradh):

 If you take the uniform distribution on [0,1] or (0,1), i.e. the Lebesgue
 measure, and round it to the nearest floating-point number, the
 distribution on the result is the natural notion of the uniform
 distribution on floating-point numbers in [0,1].

 Note that even if you round a sample from the open interval (0,1) of real
 numbers, the result might be rounded to 0 or 1, if it is in (0, 2^emin - p
 - 1^] or [1 - eps/4, 1):

 - The interval near 0 is of negligible size in any format larger than
 binary16 -- less than 2^-126^ in binary32 even if you use flush-to-zero,
 less than 2^-1022^ in binary64 -- so you don't need to worry about it and
 can safely compute, e.g., 1/x or log(x).

 - However, the probability of getting a real number in the interval near 1
 is 2^-54^ in binary64, which is small but not negligible, so I recommend
 not rejecting it unless you must avoid 1, for instance because you need to
 compute log1p(-x).

 In infinite-precision floating-point, the exponent is geometrically
 distributed in {-1, -2, -3, ...}, and the fractional part of the
 significand is a uniformly distributed infinite bit string.  If we round
 this to finite precision, the exponent remains the same (unless rounded to
 subnormal or zero) and the significand is rounded to p bits.  The outcome
 of rounding is not actually affected by more than a finite number of
 significand bits, with one tricky exception, so you don't need to sample
 an infinite stream of bits and round it; to get this, you can:

 1. flip a coin until you get one heads, to pick a geometrically
 distributed exponent;

 2. flip a coin >p+1 times to choose the uniformly distributed fractional
 part of a significand, and round it to p bits with a trick to avoid a bias
 toward `even'.

 If you drew p bits for the fractional part of the significand, there would
 be no rounding, and you would exclude from the possible significands the
 interval (2 - eps/2, 2] of real numbers that are rounded to 2 -- small but
 not nonnegligible.  If you draw >p bits and round it, then no real number
 in [1, 2] is excluded.

 However, there is a slight bias toward `even' outcomes -- that is,
 outcomes whose least significant bit is zero -- because there's a nonzero
 probability that the finite significand you choose is a tie, exactly
 between two floating-point numbers, which is broken by choosing the even
 option.  This event has probability zero in Lebesgue measure.  You can
 simply break the ties by drawing >p+1 bits and forcing the least
 significant bit to be 1.

 The attached file uniform01.c implements this algorithm in the function
 `sample_uniform_01`, with parameters chosen for IEEE 754 binary64
 ('double-precision') floating-point.

 As it happens, the variant algorithm that chooses exactly p bits and
 ''doesn't'' round them gives a uniform distribution on [0, 1 - eps/4),
 which incidentally effectively never returns zero, and so yields what you
 might think of 'the uniform distribution on floating-point numbers in (0,
 1)'.  I named it `sample_uniform_01_open`, and added some paranoia to
 clamp the result at the smallest subnormal in case your PRNG is broken.
 But it also works to just do rejection sampling on `sample_uniform_01` and
 reduce the amount of code to worry about.

 Note: This code does not run in constant time -- the time it takes is
 proportional to the magnitude of the exponent.  (In principle, it may also
 take variable time for subnormal arithmetic, but the probability of that
 is negligible for any format larger than binary16.)

 You could make it run in constant time by always flipping the maximum
 number of coins for an exponent, and counting the leading zero bits.  To
 improve performance, you can safely truncate the exponent at, say, 128 --
 if you flip 128 coins and you get tails every time, either your coin
 flipper is broken or there's a glitch in the matrix.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/23061#comment:69>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online