[tor-dev] Draft Proposal: Random Number Generation During Tor Voting

George Kadianakis desnacked at riseup.net
Tue Aug 4 12:00:39 UTC 2015


teor <teor2345 at gmail.com> writes:

>> On 4 Aug 2015, at 00:03 , George Kadianakis <desnacked at riseup.net> wrote:
>>>
>> 3.1.2. Shared Random Document During Commitment Phase [SRDOCCOMMIT]
>

Hello,

and thanks for the comments.

I uploaded a new version of the proposal that addresses some of your feedback.

You can find it here: https://gitweb.torproject.org/user/asn/torspec.git/log/?h=rng-draft-v4-asn

Here are some comments:

>>  Also, an authority should not be able to register a commitment value for a
>>  different authority. Hence, an authority X should only vote and place in
>>  the SR doc commitments by authority Y, iff authority Y included that
>>  commitment in its vote.
>
> Should this paragraph apply to reveal values as well?
> This requirement seems to be implicit in [REBOOT], but we should make it explicit.
>

You are right, I think this paragraph should apply to reveal values as well.

I haven't made the change in my branch yet, so that I can hear feedback from
David as well.
 

>> 3.3.1. Shared Randomness Calculation [SRCALC]
>> 
>>  An authority that wants to derive the shared random value V, should use the
>>  appropriate reveal values for that time period and calculate V as follows:
>> 
>>     V = H(ID_a | R_a | ID_b | R_b | ...)
>> 
>>  where the ID_k value is the identity fingerprint of directory authority k
>>  and R_k is its corresponding reveal value of that authority for the current
>>  period. H is sha256 for protocol version 1.
>
> When an ordering is required, we typically use numeric / lexicographical order of ID.
> Can we make that explicit?
>

OK, I think we fixed this. The new wording is:

"ID_a, R_a pairs are ordered based on the identity fingerprint of the authority in ascending order."

>> <snip>
>>
>> 3.5. Rebooting Directory Authorities [REBOOT]
>> 
>>  The shared randomness protocol must be able to support directory
>>  authorities who leave or join in the middle of the protocol execution.
>> 
>>  An authority that commits in the Commitment Phase and then leaves SHOULD
>>  store its reveal value on disk so that it continues participating in the
>>  protocol if it returns before or during the Reveal Phase. The reveal value
>>  MUST be stored timestamped to avoid sending it on wrong protocol runs.
>> 
>>  For this reason, other authorities should carry the commitment values of
>>  absent authorities in the shared randomness document until the end of the
>>  protocol. The shared randomness document can be used to verify that the
>>  commitment values are carried properly.
>
> It's unclear how this paragraph interacts with [SRDOCCOMMIT].
> In particular, is it enough that an authority commits to its own value in any one of the consensuses in the commitment phase?
> If this is true, then let's make the carrying forward explicit in [SRDOCCOMMIT].
>

Very good point. I think I fixed this on the top commit of my branch.

The new wording is:

"  Hence, an authority X should only
   vote and place in the SR doc commitments by authority Y, iff authority Y
   included that commitment in its vote, or that commitment was included on the
   previous SR doc and assigned to Y. "
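
In pseudo-Python, the acceptance rule I have in mind looks roughly like
this (the vote and SR doc accessors are made-up names for illustration):

    def should_carry_commitment(commit, authority_y, y_vote, prev_sr_doc):
        # Authority X carries a commitment attributed to authority Y iff
        # Y itself voted that commitment, or the commitment was already
        # assigned to Y in the previous SR doc (covering authorities that
        # are temporarily absent, as in [REBOOT]).
        if y_vote is not None and commit in y_vote.own_commitments:
            return True
        return (prev_sr_doc is not None and
                prev_sr_doc.commitment_for(authority_y) == commit)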

>>>> 
>> 3.6. How we define majority [MAJORITY]
>> 
>>  The shared randomness protocol must be able to support directory
>>  authorities who participate in the consensus protocol but not in the shared
>>  randomness protocol. It must also be able to tolerate authorities who drop
>>  or join in the middle of the protocol.
>> 
>>  The security of this proposal strongly relies on forming majority opinion
>>  so it's important for the number of participants to always be well defined:
>> 
>>  In the voting session right before creating the SR doc, we define the
>>  number of active participants to be the number of directory authorities
>>  that included commit/reveal values in their votes.
>
> Given [SRDOCCOMMIT] and [REBOOT], it should be made explicit that this is the number of authorities who have included their *own* value in either:
> * at least one of their votes, or
> * their latest vote.
>

Good point! Fixed this as well. New wording:

"
   In the voting session right before creating the SR doc, we define the number
   of active participants to be the number of directory authorities that
   included their *own* commit/reveal values in their votes.
"


>>  XXX The number of active participants is dynamic as authorities leave and
>>      join the protocol. Since the number of active participants is dynamic,
>>      an attacker could trick some authorities into believing there are N
>>      participants and some others into believing there are N-1 participants,
>>      by sending different votes to different auths. Should we worry? [asn]
>
> One scenario in which this matters is when the difference between N and N-1 is
> enough to make the majority vote on the shared random value either succeed or
> fail. This could split the authorities into two groups with different views of
> the protocol.
>
> Furthermore, X colluding attackers could create this scenario when the difference between N and N-X is enough to produce the above scenario.
>
> Does this imply a minimum number of non-colluding authorities for this protocol to be secure from X colluding authorities?
> If so, can we make that minimum explicit?
> Do we need to check for this minimum in the implementation before voting on the document?
> (We'd need to choose a number of colluding authorities we'd want to protect against.)
>
> But, from the security analysis:
>
>> "the protocol is insecure to the extent that an
>>  adversary who controls b of the authorities gets to choose among 2^b
>>  outcomes for the result of the protocol"
>
> So the dynamic numbering appears to be (at most) as insecure as the rest of the protocol, as it allows a single attacker to cause half the authorities to believe failure, in certain circumstances.
>
>>      A way to avoid a dynamic number of participants could be to set the
>>      number of participants to be the number of auths who committed during the
>>      very first commitment phase round.
>
> What if (too many) extra authorities arrive later in the commit phase?
> Does this break the majority 50% + 1 property?
>
> Why not make it the minimum? average? maximum? of the first and second-last rounds?
>
> This would make the value harder to manipulate in the last round, while taking
> into account authorities which dropped out during the phase. (Authorities which
> dropped out or arrived in the last round wouldn't make a difference to the set
> of commitments, as they come from the second-last round.) It's still somewhat
> dynamic, though, so this could be over-complicating things for very little
> benefit.
>

Hmm, need to think more about this.

>> 3.7. Shared Randomness Disaster Recovery [SRDISASTER]
>> 
>>  If the consensus at 12:00UTC fails to be created, then there will be no new
>>  shared random value for the day.
>> 
>>  Directory authorities should keep including the previous shared random
>>  values in the consensus till the next 12:00UTC commit-and-reveal session.
>>  The time period needs to be updated to reflect the current time period even
>>  if the random value stays the same.
>> 
>>  Clients should keep on using these shared random values.
>
> (If the timestamp is updated, clients wouldn't even know that the protocol had failed, even if they downloaded the shared randomness document, unless they specifically checked for failure.)
>
> We could instead run an instance of the hash with an empty set of reveals like so:
>
> V = H("TOR SHARED RANDOMNESS PROTOCOL VERSION 1" | DATE | PREVIOUS_SHARED_RANDOMNESS_VALUE)
>
> This would gracefully degrade the protocol, and provide a predictable but different value each day, based on the previous value and the date, very much like the current HSDir hash ring.
>
> But does this gain us anything?
> (The HSDirs would at least move each day, albeit predictably. We might want this property.)
>

I think the end result here would effectively be the same either way.

However, I agree that making it hard to detect failure is a bad idea. We should
think about how to make failure more obvious.
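
For reference, teor's suggested fallback would look roughly like the
sketch below (the exact serialization of DATE and the lack of a separator
between the fields are my assumptions):

    import hashlib

    def fallback_shared_random(date_str, previous_value):
        # Degraded-mode value for when the 12:00UTC consensus fails:
        # hash the protocol tag, the date and the previous shared random
        # value, so the output still changes every day, albeit predictably.
        data = (b"TOR SHARED RANDOMNESS PROTOCOL VERSION 1" +
                date_str.encode("ascii") + previous_value)
        return hashlib.sha256(data).digest()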

>>>
>> 4.1.1 Encoding the authority's own commit/reveal value
>> 
>>  An authority that wants to commit (or reveal) a value during a vote, should
>>  generate a random 256-bit value REVEAL, and include its commitment COMMIT
>>  in its 12:00UTC vote as follows:
>> 
>>     "shared-rand-commitment" SP algname SP COMMIT [SP REVEAL] NL
>
> Is COMMIT encoded using base64-encode as well?
>

I think so. Renamed 'C' to 'COMMIT' in my branch (oops).

>>  During the Reveal Phase, an authority can also optionally reveal the value
>>  REVEAL. The "algname" is the hash algorithm that should be used to compute
>>  COMMIT and REVEAL if any. It should be "sha256" for this version.
>> 
>>  The commitment value COMMIT is constructed as follows:
>> 
>>     C = base64-encode( SHA256(REVEAL) )
>
> What happens to reveals during the commit phase?
> Is the whole commit/reveal line ignored?
> Or is only the reveal ignored?
> Or are the commit and reveal included in other authorities' votes as received?
>

Hm, this has not been specified.

We should either ignore reveals or ignore the whole vote if it includes reveals
during the commit phase.

Reveals during the commit phase are useless and a well-behaved dirauth
shouldn't send them, so I think it's fine to ignore the whole vote in that case.
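
As a sanity check, here is a minimal sketch of the 4.1.1 construction
(whether REVEAL itself is base64-encoded on the wire is still open):

    import base64
    import hashlib
    import os

    def new_commit_pair():
        # Generate a random 256-bit REVEAL and derive COMMIT from it as
        # COMMIT = base64-encode( SHA256(REVEAL) ).
        reveal = os.urandom(32)
        commit = base64.b64encode(hashlib.sha256(reveal).digest())
        return commit, reveal

During the commit phase the vote line would then carry only COMMIT
("shared-rand-commitment sha256 <COMMIT>"), and the optional REVEAL field
would appear only during the reveal phase.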

>> 4.1.2 Encoding commit/reveal values received by other authorities [COMMITOTHER]
>> 
>>  An authority puts in its vote the commitments and reveals it has seen from
>>  the other authorities. To do so, it includes the following in its votes:
>> 
>>     "shared-rand-received-commitment" SP identity SP algname SP COMMIT [SP REVEAL] NL
>> 
>>  where "identity" is the hex-encoded fingerprint of the commitment's
>>  authority and
>>  COMMIT is the received commitment value. Authorities can also optionally
>>  include the reveal value REVEAL. There MUST be only one line per authority
>>  else the vote is considered invalid. Finally, the "algname" is the hash
>>  algorithm that should be used to compute COMMIT and REVEAL if any.
>
> This section could be interpreted to mean that an authority could drop a reveal produced by another authority. Let's not permit that.
>
>> 4.1.3. Shared Random value
>> 
>> Authorities include a shared random value in their votes using the following
>> encoding for the previous and current value respectively:
>> 
>>    "shared-rand-previous-value" SP value NL
>>    "shared-rand-current-value" SP value NL
>> 
>> where "value" is the actual shared random value. It's computed as specified
>> in the section [SRCALC].
>
> Is "value" encoded using base64-encode as well?
>

Hm, we should define this.

BTW, do we prefer base64 or base32? I would vote for base32 for readability.

Also, the few extra bytes can't be that much overhead.
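
To put numbers on it: a 256-bit value is 44 characters in base64 and 56
in base32, e.g.:

    import base64

    value = b"\x00" * 32                  # a 256-bit shared random value
    print(len(base64.b64encode(value)))   # 44, mixed case plus '+', '/'
    print(len(base64.b32encode(value)))   # 56, upper case and digits only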

>>>> 
>> 4.2.1 Format [SRFORMAT]
>> 
>>>
>>  The following details the commitment and reveal section.
>> 
>>    "shared-rand-commitment" SP algname SP identity SP commitment-value
>>                             [SP revealed-value] NL
>> 
>>       [Exactly once per authority]
>> 
>>       This is the commitment and/or reveal value agreed upon by the majority
>>       from one authority. The algname is always "sha256" in version 1. The
>>       "identity" is the hex-encoded digest of the identity key of the
>>       signing authority that produced the values.
>>       Finally, "{commitment|revealed}-value" is the value as specified in
>>       section [SPEC].
>
> These values need to be sorted in order of identity key, in order for each authority to produce the same document.
>

Need to fix this.

>>>> 
>>  Finally, the footer of the document:
>> 
>>    "shared-random-footer" NL
>> 
>>       [Exactly once]
>> 
>>       It contains one subsection, the authority signatures.
>> 
>>          "authority-signature" SP algname SP identity SP signing-key-digest
>>                                NL Signature NL
>> 
>>             [Exactly once per authority]
>> 
>>             The "identity" is the hex-encoded digest of the authority
>>             identity key and the "signing-key-digest" is the hex-encoded
>>             digest of the current authority signing key of the signing
>>             authority.
>> 
>>             The "algname" item is the algorithm used to compute the hash of
>>             the document before signing it. As proposal 162 proposed,
>>             "sha256" should be used. The authority-signature entry MUST be
>>             ignored if "algname" is unrecognized.
>> 
>>             See dir-spec.txt for a specification for the Signature item.
>
> While I don't think it's strictly required, as the signatures themselves are not being signed:
> For consistency, the signatures should probably be sorted in order of identity key, in order for each authority to produce exactly the same document.
>
>> 4.3. Shared Random Value in Consensus [SRCONSENSUS]
>> 
>>  Authorities insert the two shared random values in the consensus following
>>  the same encoding format as in [SRFORMAT].
>
> Which consensus?
> The standard consensus?
> The microdescriptor consensus?
> Both?
> Does this require a new consensus version, as well as a new consensus flavour?
>

We meant the microdescriptor consensus here. We need to fix this.

> Do we want to allow clients to use the shared random values in old consensuses for up to 24 hours, just like they use other values in old consensuses?
> Or should we make (encourage) clients download the latest consensus before using any shared randomness values?
>

I think prop224 suggests that hidden services start using the new randomness at
00:00 of the next day, i.e. 12 hours after the randomness has been generated.

So the randomness will have been included in 12 consensuses before clients need
to use it, which means that most clients will have an appropriate consensus.

I think if a client has a very old consensus and needs to use the very latest
random value, we should add some logic to the client that fetches a newer
consensus.


>>>> 
>> 5.2. Is there a need for a final agreement phase?
>> 
>>  Commit-and-reveal protocols usually also end with an agreement phase,
>>  during which participants agree on which reveal values should be used to
>>  make the shared random value.
>> 
>>  An agreement phase is needed, because if the protocol ended with the reveal
>>  phase, an evil authority could wait until the last reveal round, and reveal
>>  its value to half of the authorities. That would partition the authorities
>>  into two sets: the ones who think that the shared random value should
>>  contain this new reveal, and the rest who don't know about it. This would
>>  result in a tie and two different SR docs.
>> 
>>  However, we believe that an agreement phase is not necessary in our
>>  protocol since reveal values are transcribed in the SR document only if
>>  the majority agrees. Hence, a tie is not enough to confuse the authorities
>>  since it's not a majority and the offending value would just be discarded.
>
>>  That said, an attack that could still work here would be if an authority
>>  can make half of the authorities believe that the value should be
>>  discarded, and make the other half of the authorities believe that the
>>  value should be included. That could be achieved if the attacker could
>>  force honest authorities to send different votes to different authorities.
>>  We believe this should not be the case currently, but we should look more
>>  into this.
>
> Having a dynamic number of authorities would appear to allow this attack (see
> above), at least to the extent that a certain number of colluding authorities
> could cause some of the authorities to believe the protocol had failed, and
> others to believe it had succeeded.
>>>
>> 5.3. Predicting the shared random value during reveal phase
>> 
>>  The reveal phase lasts 12 hours, and most authorities will send their
>>  reveal value on the first round of the reveal phase. This means that an
>>  attacker can predict the final shared random value about 12 hours before
>>  it's generated.
>> 
>>  This does not pose a problem for the HSDir hash ring, since we impose an
>>  uptime restriction on HSDir nodes, so 12 hours predictability is not an
>>  issue.
>
> This may be a question for the next-generation hidden services proposal, rather than this one.
>
> Is it possible to:
> 0. Occupy evenly-spaced sites on the hash ring with your nodes
> 1. Predict the shared random value 12 hours in advance
> 2. Expand the space your nodes occupy on the hash ring, during that 12 hours, by
> eliminating (via DoS or otherwise) the nodes which occupy the spaces you wish to
> occupy (and perhaps you can also use the 24-48 hours the shared random value is
> active)
>
> I think this attack either requires a large number of evenly-spaced nodes, or
> has a small probability of success on any one day. But these may not be
> sufficient conditions to protect certain hidden services - because you could
> wait and try the attack on the days where your node(s) are near enough to the
> site(s) you wish to occupy.
>
> Of course, having 6 HSDir replicas reduces the probability you could occupy more than 1 or 2 of their locations - but is occupying 1 or 2 locations enough to deanonymise clients?
>
>> 
>>  Any other protocols using the shared random value from this system should
>>  be aware of this property.
>
> We could reduce this timeframe to around 1-2 hours by having authorities choose a random consensus between first and second-last (or third-last) to start revealing their commitments.
>
> Or we could determine reveal order based on anticipated fingerprint order,
> modulo the number of consensuses in a phase (which is only relevant for > 12
> authorities or < 12 consensuses per phase). So an authority which anticipates
> having the first fingerprint of N fingerprints, would reveal in consensus 12 - N
> - 1. Then the next would reveal in 12 - N - 2. And the last would reveal in
> consensus 12 - 1 (the second-last consensus). Or, again, we could do the final
> reveals in the third-last consensus.
>
> Or we could use a hybrid of these schemes, and weight the random reveal consensus by anticipated fingerprint order.
>
> This would somewhat increase the risk that authorities which leave the protocol during the reveal phase, wouldn't have revealed their votes yet.
>
> I don't think there's a need for reducing the predictability from 12 hours to 1-2 hours, but we could keep it in mind if a future version needs to reduce the amount of time the shared random value is predictable.
>
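
If we ever do want the fingerprint-ordered schedule, here is one consistent
reading of the arithmetic above (my assumptions: ranks are 1-based by
anticipated fingerprint order, and the last authority reveals in the
second-last of the 12 consensuses):

    def reveal_round(rank, n_auth, rounds=12):
        # The authority ranked last reveals in round rounds - 1 (the
        # second-last consensus); each earlier rank reveals one round
        # sooner. The modulo handles more authorities than rounds.
        return ((rounds - 1) - (n_auth - rank)) % rounds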

