[tor-bugs] #16943 [Tor]: Implement prop250 (Random Number Generation During Tor Voting)

Tue Feb 9 15:44:36 UTC 2016

#16943: Implement prop250 (Random Number Generation During Tor Voting)
-------------------------+------------------------------------
 Reporter:  asn          |          Owner:
     Type:  enhancement  |         Status:  needs_review
 Priority:  High         |      Milestone:  Tor: 0.2.8.x-final
Component:  Tor          |        Version:
 Severity:  Normal       |     Resolution:
 Keywords:  tor-hs       |  Actual Points:
Parent ID:  #8244        |         Points:  large
  Sponsor:  SponsorR     |
-------------------------+------------------------------------

Comment (by dgoulet):

 Replying to [comment:22 teor]:

 Thanks for that teor!

 New branch in `prop250_final_v4`.

 >
 > Every other multi-word file name in src/or uses underscores.

 Fixed. Extra commit here since this applies to almost all commits.

 >
 > If a function moves data from `(dtype* dst, const stype* src)`, can the
 arguments be in that order, and can the src argument be const?
 >
 > There are a lot of missed opportunities for const pointer arguments.

 Fixed most of those. See "fixup" commits

 >
 > e282227ebc03ae71f1cd6490a2adf2058140c66e:
 >
 > If you use `#ifdef WORDS_BIGENDIAN` (from orconfig.h), rather than a
 runtime check against htonl, some compilers will produce better code.

 Fixed!

 >
 > 72baec6805aaab8d62f2deed8fbc658c82a7d086:
 >
 > shared-random-state.c:
 >
 > If we use an asymmetric magic number, we can detect when the two halves
 are swapped.
 > {{{
 > #define SR_DISK_STATE_MAGIC 42424242
 > }}}
 >

 Fixed

 > A hard-coded `SHARED_RANDOM_N_ROUNDS` is going to make it really hard to
 test hidden services quickly using chutney. (We'll always be testing them
 using the default initial shared random value.) Can we make this
 configurable in test networks?
 > {{{
 > #define SHARED_RANDOM_N_ROUNDS 12
 > }}}

 The part I do not like about changing this value for testing network is
 that we do NOT get the real behavior of the protocol... I'm not against
 for a testing value but I would do that after merge in a separate ticket.

 >
 > get_start_time_of_current_round() doesn't take
 TestingV3AuthVotingStartOffset into account. Why not use:
 > {{{
 > time_t next_start = voting_schedule.interval_starts;
 > time_t curr_start = dirvote_get_start_of_next_interval(next_start -
 interval - 1,interval,
 >                       options->TestingV3AuthVotingStartOffset);
 > }}}
 >
 > get_next_valid_after_time() duplicates code in
 dirvote_recalculate_timing().
 > Why not use `voting_schedule.interval_starts`?
 > (You'd need to make sure dirvote_recalculate_timing() had been called
 before either of these functions used `voting_schedule.interval_starts`.)

 Changing this part of the code after the extensive testing we did could be
 very delicate. asn is looking at possible solution.

 >
 > Nitpick: why 2? Can it be a named constant like
 `SHARED_RANDOM_N_PHASES`?
 > get_state_valid_until_time, get_sr_protocol_phase
 > {{{
 > int total_rounds = SHARED_RANDOM_N_ROUNDS * 2;
 > }}}

 Definitely better.

 >
 > get_state_valid_until_time() also ignores
 TestingV3AuthVotingStartOffset. I think this should fix it:
 > {{{
 > current_round = (beginning_of_current_round / voting_interval) %
 total_rounds;
 > }}}
 >
 > get_sr_protocol_phase() also ignores TestingV3AuthVotingStartOffset.
 Perhaps passing around the time isn't the best way to do this? It seems
 like if we can make this same mistake in multiple places, then we should
 refactor the round number calculation so it's in a single function.
 >
 > Does get_sr_protocol_phase() need to check it's getting an actual
 valid_after time, rather than now?
 No, valid_after time that is now is valid which is what we set when you
 start up since we have no valid_after time until first consensus.
 >
 > Why do we have disk_state_validate() and disk_state_validate_cb()?

 We must have a callback for the "state file" API, NULL is not accepted.
 Furthermore, if the callback return an error, right now tor does a
 wonderful `assert(0)`... so it has been ignored.

 > Should one call the other? Should they check _magic?
 This is done with CONFIG_CHECK() which is called multiple time when you
 use `config_assign()`.

 > Should they check ValidAfter < ValidUntil?

 Fixed! (But the correct check is >= here :)

 >
 > disk_state_parse_commits() fails to `SMARTLIST_FOREACH(args, char *, cp,
 tor_free(cp));` if commit is NULL.
 >
 Fixed
 > disk_state_parse_sr_values() contains duplicate code in the if/else
 block. The function comment should say it parses both
 SharedRandPreviousValue and SharedRandCurrentValue.
 >
 Fixed
 > I think disk_state_parse_sr_values() might be able to leak memory if
 disk_state_parse_srv() fails. At the very least, it leaves an allocated
 but zero pointer, which could confuse the rest of the code.
 >
 Fixed with the commit above.
 > disk_state_parse_sr_values() can assert its arguments are not NULL.
 >
 Fixed with commit above
 > disk_state_put_commit_line() can have commit as a const pointer, and in
 the opposite order for consistency.
 >
 Fixed.
 > In disk_state_put_commit_line(), do we need to memwipe reveal_str?
 > Or are we about to make it public anyway?
 >
 Indeed. I think you are right. We do put _our_ commit on disk during the
 commit phase so better to cleanup memory here to be safe.

 > disk_state_put_srv_line() can have srv as a const pointer, and in the
 opposite order for consistency.
 >
 Fixed.
 > I think disk_state_reset() is the moral equivalent of disk_state_free()
 followed by disk_state_new(). Why are they different functions? (Apart
 from the memory thing.)
 >
 Not much. We avoid alloc/free every time we update the state which in a
 voting period happens often. (probably not enough to be noticeable but
 still)

 > We don't seem to check for errors in disk_state_update(). I think this
 is ok, but I'm not sure exactly why.
 >
 We assume that everything in our memory state is coherent so disk update
 can only use value that are usable. I think it's fine, since if the memory
 state has crazy stuff even if we don't put on disk or not, thing will go
 bad.

 > In disk_state_save_to_disk() and children (and other functions), should
 we memwipe any temporary memory that contains the reveal value, or the
 encoded reveal value? (before it is revealed)
 > We do this some places, but not others.
 >
 I've added a needed memwipe in `reveal_encode` since we do encode _our_
 commitment during the commit phase so the reveal value should only be in
 the commit object memory and not on stack.

 > Can data in state_query_get_() be a const pointer?
 > Some of the sr_state_set_* functions can take const pointers.
 the `get_` yes, I fixed it but not for the set since digestmap doesn't
 handle a const.
 >
 > shared-random-state.h:
 >
 > I think this comment is in the wrong place, shouldn't it correspond to
 `SHARED_RANDOM_STATE_PRIVATE`?
 > {{{
 > /* Private methods (only used by shared-random.c): */
 > }}}
 Ah, by that we mean methods that are _ONLY_ used by shared_random.c so
 "private" to the subsystems but not static because they need to be
 visible.
 >
 > shared-random.c:
 >
 > {{{
 > char b64_decoded[SR_COMMIT_LEN + 2];
 > }}}
 > Can we either fix #17868 as part of this branch, or add a #define for
 the rounding-up that has to occur when the base64 value isn't padded?
 > {{{
 > #define B64_DECODE_PAD(len) ((len) + ((len) % 3))
 > char b64_decoded[B64_DECODE_PAD(SR_COMMIT_LEN)];
 > }}}

 I would really want to fix #17868 instead of a temporary workaround macro.
 You'll notice that there are still XXX: at that place and it's to show the
 obvious bad thing of `+2`.

 >
 > Is TIMESTAMP first, or H(REVEAL)?
 > commit_decode says one thing in the comments, and another in the code.
 >
 Hrm both the code and comment do the same or am I missing something?
 > Can we assert that `decoded_len == SR_COMMIT_LEN`? Is that useful?
 >
 We prefer not since this is data from the network so potentially
 untrusted.
 > reveal_decode() duplicates commit_decode(). Can we refactor to avoid
 that?
 >
 Plausible. I would do that in a later iteration though. Having both
 separated gives us more room for better logging and if one of those change
 in some way, we can adapt easily.
 > In sr_parse_srv(), why do we use base16 for the SRV, but base64 for the
 commits?
 >
 No reasons... We'll make it base64.
 > In sr_parse_commit(), we ignore extra arguments after the fourth. Is
 this intentional?
 > A comment either way would help.
 We do and the top commit of the function details the order of args. What
 extra commit you have in mind?
 >
 > I have 5 more commits to review. (Please add fixups to the end of the
 branch so I don't lose my place.)

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/16943#comment:23>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online