[tor-bugs] #16943 [Tor]: Implement prop250 (Random Number Generation During Tor Voting)

Tor Bug Tracker & Wiki blackhole at torproject.org
Wed Feb 10 03:09:17 UTC 2016


#16943: Implement prop250 (Random Number Generation During Tor Voting)
-------------------------+------------------------------------
 Reporter:  asn          |          Owner:
     Type:  enhancement  |         Status:  needs_review
 Priority:  High         |      Milestone:  Tor: 0.2.8.x-final
Component:  Tor          |        Version:
 Severity:  Normal       |     Resolution:
 Keywords:  tor-hs       |  Actual Points:
Parent ID:  #8244        |         Points:  large
  Sponsor:  SponsorR     |
-------------------------+------------------------------------

Comment (by teor):

 Replying to [comment:23 dgoulet]:
 > Replying to [comment:22 teor]:
 >
 > Thanks for that teor!
 >
 > New branch in `prop250_final_v4`.
 >
 > > A hard-coded `SHARED_RANDOM_N_ROUNDS` is going to make it really hard
 to test hidden services quickly using chutney. (We'll always be testing
 them using the default initial shared random value.) Can we make this
 configurable in test networks?
 > > {{{
 > > #define SHARED_RANDOM_N_ROUNDS 12
 > > }}}
 >
 > The part I do not like about changing this value for a testing network is
 that we do NOT get the real behavior of the protocol... I'm not against a
 testing value, but I would do that after merge in a separate ticket.

 Not necessarily - 12 can still be the default number of rounds.
 (I'm pretty sure the value 12 isn't hard-coded anywhere outside this macro,
 so it should be easy to replace the macro with a function that reads the
 number of rounds from an option.)

 Opened #18295 for this.
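
 For illustration, here's roughly what I have in mind (the option name
 TestingSharedRandomNRounds is made up for this sketch; get_options() and
 TestingTorNetwork are the usual ones):
 {{{
 /* Default number of commit/reveal rounds in the shared randomness
  * protocol. */
 #define SHARED_RANDOM_N_ROUNDS_DEFAULT 12

 /* Return the number of rounds to use. In a testing network, let a
  * (hypothetical) TestingSharedRandomNRounds option override the default,
  * so chutney can exercise the protocol quickly. */
 static unsigned int
 get_shared_random_n_rounds(void)
 {
   const or_options_t *options = get_options();
   if (options->TestingTorNetwork &&
       options->TestingSharedRandomNRounds > 0) {
     return (unsigned int) options->TestingSharedRandomNRounds;
   }
   return SHARED_RANDOM_N_ROUNDS_DEFAULT;
 }
 }}}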

 > > get_state_valid_until_time() also ignores
 TestingV3AuthVotingStartOffset. I think this should fix it:
 > > {{{
 > > current_round = (beginning_of_current_round / voting_interval) %
 total_rounds;
 > > }}}
 > >
 > > get_sr_protocol_phase() also ignores TestingV3AuthVotingStartOffset.
 Perhaps passing around the time isn't the best way to do this? It seems
 like if we can make this same mistake in multiple places, then we should
 refactor the round number calculation so it's in a single function.

 Can we refactor the round number / time calculations so they're in one
 place?
 I feel nervous we'll break them if we try to fix them later, and I'd
 rather do that now when we're not live.
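
 Something like this is the shape I mean (function and argument names are
 made up; the offset handling would need to match however
 TestingV3AuthVotingStartOffset is applied to the rest of the voting
 schedule):
 {{{
 /* Return the index of the protocol round that <b>now</b> falls in, taking
  * the voting start offset into account. Every place that needs a round
  * number or phase should call this one helper. */
 static int
 sr_get_current_round(time_t now, int voting_interval, int n_rounds,
                      int voting_start_offset)
 {
   time_t adjusted = now - voting_start_offset;
   time_t beginning_of_current_round =
     adjusted - (adjusted % voting_interval);
   return (int)((beginning_of_current_round / voting_interval) % n_rounds);
 }
 }}}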

 > > shared-random.c:
 > >
 > > {{{
 > > char b64_decoded[SR_COMMIT_LEN + 2];
 > > }}}
 > > Can we either fix #17868 as part of this branch, or add a #define for
 the rounding-up that has to occur when the base64 value isn't padded?
 > > {{{
 > > #define B64_DECODE_PAD(len) ((len) + ((len) % 3))
 > > char b64_decoded[B64_DECODE_PAD(SR_COMMIT_LEN)];
 > > }}}
 >
 > I would really prefer to fix #17868 rather than add a temporary
 workaround macro. You'll notice that there are still XXX: comments at that
 place, and they're there to flag the obvious badness of the `+2`.

 Made #17868 a child ticket of this ticket.

 > > Is TIMESTAMP first, or H(REVEAL)?
 > > commit_decode says one thing in the comments, and another in the code.
 > >
 > Hrm, both the code and comment say the same thing, or am I missing
 something?

 The commit_decode() function header comment says:
 {{{
 base64-encode( H(REVEAL) || TIMESTAMP )
 }}}

 The code says:
 {{{
   /* First is the timestamp (8 bytes). */
   commit->commit_ts = (time_t) tor_ntohll(get_uint64(b64_decoded));
   offset += sizeof(uint64_t);
   /* Next is hashed reveal. */
   memcpy(commit->hashed_reveal, b64_decoded + offset,
          sizeof(commit->hashed_reveal));
 }}}
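
 If the code's order (TIMESTAMP first, then H(REVEAL)) is the intended wire
 format, then presumably the header comment should instead read:
 {{{
 base64-encode( TIMESTAMP || H(REVEAL) )
 }}}
 Either way, one of the two needs updating so they agree.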

 > > Can we assert that `decoded_len == SR_COMMIT_LEN`? Is that useful?
 > >
 > We prefer not since this is data from the network so potentially
 untrusted.

 OK, then we're deliberately ignoring any characters after SR_COMMIT_LEN?
 Can we make that explicit in a comment?
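
 Something as small as this would do (reusing the names from the snippet
 above; the exact wording is up to you):
 {{{
   /* decoded_len can legitimately be larger than SR_COMMIT_LEN: this data
    * comes from the network, so we only read the first SR_COMMIT_LEN bytes
    * and deliberately ignore anything after them. */
   if (decoded_len < SR_COMMIT_LEN) {
     goto error;
   }
 }}}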

 > > reveal_decode() duplicates commit_decode(). Can we refactor to avoid
 that?
 > >
 > Plausible. I would do that in a later iteration though. Having them
 separated gives us more room for better logging, and if one of them changes
 in some way, we can adapt easily.

 The risk of leaving them duplicated is that bugs get fixed in one and not
 the other.
 Or they fall out of sync accidentally. This is a real mess to clean up
 later - see router_pick_directory_server_impl() and
 router_pick_trusteddirserver_impl() for an example.

 This has already happened to these two functions:
 * commit_decode() uses offset, while reveal_decode() uses a hard-coded 8.
 * the "too small" warning in reveal_decode() is less detailed, but could
 be identical to the one in commit_decode().

 So I have a more radical suggestion for you:
 * you have two groups of 3 struct fields that have the same types, with
 the same operations performed on them;
 * abstract those 3 fields into their own struct type (ts, hashed, encoded);
 * add operations for encoding and decoding this type (and perhaps for
 turning this authority's reveal into its commit);
 * refactor the existing code to call these operations.

 That way, the interface stays the same, you can add any extra logging you
 need in the commit- or reveal-specific functions, but you get the benefit
 of using the same code for encoding/decoding commits and reveals.
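
 A rough sketch of the shape I'm suggesting (all names and field sizes here
 are illustrative, not a concrete API):
 {{{
 /* Wire representation shared by a commit and a reveal: a timestamp, a
  * hashed value, and its base64 encoding. */
 typedef struct sr_encoded_value_t {
   time_t ts;
   uint8_t hashed[DIGEST256_LEN];
   char encoded[SR_COMMIT_BASE64_LEN + 1];
 } sr_encoded_value_t;

 /* Encode <b>value</b> into its base64 wire form, filling in encoded. */
 int sr_encoded_value_encode(sr_encoded_value_t *value);
 /* Decode <b>encoded</b> (untrusted, from a vote) into <b>value</b>. */
 int sr_encoded_value_decode(const char *encoded, sr_encoded_value_t *value);
 /* Derive this authority's commit from its reveal. */
 void sr_commit_from_reveal(const sr_encoded_value_t *reveal,
                            sr_encoded_value_t *commit);
 }}}
 Then commit_decode() and reveal_decode() become thin wrappers that call
 sr_encoded_value_decode() and add their own logging.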

 > > In sr_parse_commit(), we ignore extra arguments after the fourth. Is
 this intentional?
 > > A comment either way would help.
 > We do, and the comment at the top of the function details the order of
 args. What extra argument do you have in mind?

 If an authority supplies 5 arguments in its vote, we will ignore the
 fifth.
 There's no comment documenting whether this is intentional or not.
 And should we ignore the entire line instead?
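
 For example, something along these lines (purely to show what documenting
 the choice might look like, assuming the args arrive as a smartlist):
 {{{
   /* A commit line may carry extra fields added by future versions. We
    * deliberately ignore anything after the fourth argument so that adding
    * fields later stays backward-compatible. If extra arguments should
    * instead be treated as malformed, reject the whole line here. */
   if (smartlist_len(args) > 4) {
     log_debug(LD_DIR, "Extra arguments on shared random commit line; "
                       "ignoring them.");
   }
 }}}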

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/16943#comment:25>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

