[tor-dev] Draft Proposal: Random Number Generation During Tor Voting

teor teor2345 at gmail.com
Mon Aug 3 16:42:20 UTC 2015

> On 4 Aug 2015, at 00:03 , George Kadianakis <desnacked at riseup.net> wrote:
> 3.1.2. Shared Random Document During Commitment Phase [SRDOCCOMMIT]
>  Also, an authority should not be able to register a commitment value for a
>  different authority. Hence, an authority X should only vote and place in
>  the SR doc commitments by authority Y, iff authority Y included that
>  commitment in its vote.

Should this paragraph apply to reveal values as well?
This requirement seems to be implicit in [REBOOT], but we should make it explicit.

> 3.3.1. Shared Randomness Calculation [SRCALC]
>  An authority that wants to derive the shared random value V, should use the
>  appropriate reveal values for that time period and calculate V as follows:
>     V = H(ID_a | R_a | ID_b | R_b | ...)
>  where the ID_k value is the identity fingerprint of directory authority k
>  and R_k is its corresponding reveal value of that authority for the current
>  period. H is sha256 for protocol version 1.

When an ordering is required, we typically use numeric / lexicographical order of ID.
Can we make that explicit?

> 3.4. Bootstrapping Procedure
>  As described in [MDCONS], two shared random values are required for the
>  HSDir overlay periods to work properly as specified in proposal 224. Hence
>  clients MUST NOT use the randomness of this system till it has bootstrapped
>  completely; that is, until two shared random values are included in a
>  consensus. This should happen after three 12:00UTC consensuses have been
>  produced, which takes 48 hours.

While it's not directly relevant to the proposal, I'd like to highlight the following implementation and testing issues:

When we test the shared randomness protocol, we're likely to want to run it at higher speeds in test networks. One potential implementation could specify the number of rounds per shared random value as a torrc option (which can only be modified in testing tor networks). We'd need minimum (3?) [See section 6.2], default (24), and maximum (100?) values for this option. If we made 3 the minimum number of rounds per random value, we'd have to avoid assuming too much in the implementation - for example, that both phases have the same number of periods.

Currently, chutney configures a consensus interval of 2 seconds (yes, seconds), and can bootstrap a testing tor network with hidden services in around 30 seconds. I'd like to keep it running on that kind of timescale - it encourages people to run tests with every build.

If hidden services (and other shared randomness clients) require two random values to bootstrap, 2 rounds of 3 consensuses of 2 seconds would add 12 seconds to the bootstrap time. This is a significant bootstrap time increase, which is somewhat mitigated by the fact that we're waiting for 1-2 of those consensuses anyway.

> 3.5. Rebooting Directory Authorities [REBOOT]
>  The shared randomness protocol must be able to support directory
>  authorities who leave or join in the middle of the protocol execution.
>  An authority that commits in the Commitment Phase and then leaves SHOULD
>  store its reveal value on disk so that it continues participating in the
>  protocol if it returns before or during the Reveal Phase. The reveal value
>  MUST be stored timestamped to avoid sending it on wrong protocol runs.
>  For this reason, other authorities should carry the commitment values of
>  absent authorities in the shared randomness document until the end of the
>  protocol. The shared randomness document can be used to verify that the
>  commitment values are carried properly.

It's unclear how this paragraph interacts with [SRDOCCOMMIT].
In particular, is it enough that an authority commits to its own value in any one of the consensuses in the commitment phase?
If this is true, then let's make the carrying forward explicit in [SRDOCCOMMIT].

> 3.6. How we define majority [MAJORITY]
>  The shared randomness protocol must be able to support directory
>  authorities who participate in the consensus protocol but not in the shared
>  randomness protocol. It must also be able to tolerate authorities who drop
>  or join in the middle of the protocol.
>  The security of this proposal strongly relies on forming majority opinion
>  so it's important for the number of participants to always be well defined:
>  In the voting session right before creating the SR doc, we define the
>  number of active participants to be the number of directory authorities
>  that included commit/reveal values in their votes.

Given [SRDOCCOMMIT] and [REBOOT], it should be made explicit that this is the number of authorities who have included their *own* value in either:
* at least one of their votes, or
* their latest vote.

>  XXX The number of active participants is dynamic as authorities leave and
>      join the protocol. Since the number of active participants is dynamic ,
>      an attacker could trick some authorities believing there are N
>      participants and some others believing there are N-1 participants, by
>      sending different votes to different auths. Should we worry? [asn]

One scenario in which this matters is when the difference between N and N-1 is enough to make the majority vote on the shared random value either succeed or fail. This could split the authorities into two groups with different views of the protocol.

Furthermore, X colluding attackers could create this scenario when the difference between N and N-X is enough to produce the above scenario.

Does this imply a minimum number of non-colluding authorities for this protocol to be secure from X colluding authorities?
If so, can we make that minimum explicit?
Do we need to check for this minimum in the implementation before voting on the document?
(We'd need to choose a number of colluding authorities we'd want to protect against.)

But, from the security analysis:

> "the protocol is insecure to the extent that an
>  adversary who controls b of the authorities gets to choose among 2^b
>  outcomes for the result of the protocol"

So the dynamic numbering appears to be (at most) as insecure as the rest of the protocol, as it allows a single attacker to cause half the authorities to believe failure, in certain circumstances.

>      A way to avoid a dynamic number of participants could be to set the
>      number of participants to be the number of auths who committed during the
>      very first commitment phase round.

What if (too many) extra authorities arrive later in the commit phase?
Does this break the majority 50% + 1 property?

Why not make it the minimum? average? maximum? of the first and second-last rounds?

This would make the value harder to manipulate in the last round, while taking into account authorities which dropped out during the phase. (Authorities which dropped out or arrived in the last round wouldn't make a difference to the set of commitments, as they come from the second-last round.) It's still somewhat dynamic, though, so this could be over-complicating things for very little benefit.

> 3.7. Shared Randomness Disaster Recovery [SRDISASTER]
>  If the consensus at 12:00UTC fails to be created, then there will be no new
>  shared random value for the day.
>  Directory authorities should keep including the previous shared random
>  values in the consensus till the next 12:00UTC commit-and-reveal session.
>  The time period needs to be updated to reflect the current time period even
>  if the random value stays the same.
>  Clients should keep on using this shared random values.

(If the timestamp is updated, clients wouldn't even know that the protocol had failed, even if they downloaded the shared randomness document, unless they specifically checked for failure.)

We could instead run an instance of the hash with an empty set of reveals like so:


This would gracefully degrade the protocol, and provide a predictable but different value each day, based on the previous value and the date, very much like the current HSDir hash ring.

But does this gain us anything?
(The HSDirs would at least move each day, albeit predictably. We might want this property.)

> 4.1.1 Encoding the authority's own commit/reveal value
>  An authority that wants to commit (or reveal) a value during a vote, should
>  generate a random 256-bit value REVEAL, and include its commitment COMMIT
>  in its 12:00UTC vote as follows:
>     "shared-rand-commitment" SP algname SP COMMIT [SP REVEAL] NL

Is COMMIT encoded using base64-encode as well?

>  During the Reveal Phase, an authority can also optionally reveal the value
>  REVEAL. The "algname" is the hash algorithm that should be used to compute
>  COMMIT and REVEAL if any. It should be "sha256" for this version.
>  The commitment value COMMIT is constructed as follows:
>     C = base64-encode( SHA256(REVEAL) )

What happens to reveals during the commit phase?
Is the whole commit/reveal line ignored?
Or is only the reveal ignored?
Or are the commit and reveal included in other authorities' votes as received?

> 4.1.2 Encoding commit/reveal values received by other authorities [COMMITOTHER]
>  An authority puts in its vote the commitments and reveals it has seen from
>  the other authorities. To do so, it includes the following in its votes:
>     "shared-rand-received-commitment" SP identity SP algname SP COMMIT [SP REVEAL] NL
>  when "identity" is the hex-encoded commitment's authority fingerprint and
>  COMMIT is the received commitment value. Authorities can also optionally
>  include the reveal value REVEAL. There MUST be only one line per authority
>  else the vote is considered invalid. Finally, the "algname" is the hash
>  algorithm that should be used to compute COMMIT and REVEAL if any.

This section could be interpreted to mean that an authority could drop a reveal produced by another authority. Let's not permit that.

> 4.1.3. Shared Random value
> Authorities include a shared random value in their votes using the following
> encoding for the previous and current value respectively:
>    "shared-rand-previous-value" SP value NL
>    "shared-rand-current-value" SP value NL
> where "value" is the actual shared random value. It's computed as specified
> in the section [SRCALC].

Is "value" encoded using base64-encode as well?

> 4.2.1 Format [SRFORMAT]
>  The following details the commitment and reveal section.
>    "shared-rand-commitment" SP algname SP identity SP commitment-value
>                             [SP revealed-value] NL
>       [Exactly once per authority]
>       This is the commitment or/and reveal value agreed upon by the majority
>       from one authority. The algname is always "sha256" in version 1. The
>       "identity" is the authority hex-encoded digest of the authority
>       identity key of the signing authority from which the values are from.
>       Finally, "{commitment|revealed}-value" is the value as specified in
>       section [SPEC].

These values need to be sorted in order of identity key, in order for each authority to produce the same document.

>  Finally, the footer of the document:
>    "shared-random-footer" NL
>       [Exactly once]
>       It contains one subsection, the authority signatures.
>          "authority-signature" SP algname SP identity SP signing-key-digest
>                                NL Signature NL
>             [Exactly once per authority]
>             The "identity" is the hex-encoded digest of the authority
>             identity key and the "signing-key-digest" is the hex-encoded
>             digest of the current authority signing key of the signing
>             authority.
>             The "algname" item is the algorithm used to compute the hash of
>             the document before signing it. As proposal 162 proposed,
>             "sha256" should be used. The authority-signature entry MUST be
>             ignored if "algname" is unrecognized.
>             See dir-spec.txt for a specification for the Signature item.

While I don't think it's strictly required, as the signatures themselves are not being signed:
For consistency, the signatures should probably be sorted in order of identity key, in order for each authority to produce exactly the same document.

> 4.3. Shared Random Value in Consensus [SRCONSENSUS]
>  Authorities insert the two shared random values in the consensus following
>  the same encoding format as in [SRFORMAT].

Which consensus?
The standard consensus?
The microdescriptor consensus?
Does this require a new consensus version, as well as a new consensus flavour?

Do we want to allow clients to use the shared random values in old consensuses for up to 24 hours, just like they use other values in old consensuses?
Or should we make (encourage) clients download the latest consensus before using any shared randomness values?

> 5.2. Is there a need for a final agreement phase?
>  Commit-and-reveal protocols usually also end with an agreement phase,
>  during which participants agree on which reveal values should be used to
>  make the shared random value.
>  An agreement phase is needed, because if the protocol ended with the reveal
>  phase, an evil authority could wait until the last reveal round, and reveal
>  its value to half of the authorities. That would partition the authorities
>  into two sets: the ones who think that the shared random value should
>  contain this new reveal, and the rest who don't know about it. This would
>  result in a tie and two different SR docs.
>  However, we believe that an agreement phase is not necessary in our
>  protocol since reveal values are transcribed in the SR document if only if
>  the majority agrees. Hence, a tie is not enough to confuse the authorities
>  since it's not majority and the offending value would just be discarded.

>  That said, an attack that could still work here would be if an authority
>  can make half of the authorities believe that the value should be
>  discarded, and make the other half of the authorities believe that the
>  value should be included. That could be achieved if the attacker could
>  force honest authorities to send different votes to different authorities.
>  We believe this should not be the case currently, but we should look more
>  into this.

Having a dynamic number of authorities would appear to allow this attack (see above), at least to the extent that a certain number of colluding authorities could cause some of the authorities to believe the protocol had failed, and others to believe it had succeeded.
> 5.3. Predicting the shared random value during reveal phase
>  The reveal phase lasts 12 hours, and most authorities will send their
>  reveal value on the first round of the reveal phase. This means that an
>  attacker can predict the final shared random value about 12 hours before
>  it's generated.
>  This does not pose a problem for the HSDir hash ring, since we impose an
>  uptime restriction on HSDir nodes, so 12 hours predictability is not an
>  issue.

This may be a question for the next-generation hidden services proposal, rather than this one.

Is it possible to:
0. Occupy evenly-spaced sites on the hash ring with your nodes
1. Predict the shared random value 12 hours in advance
2. Expand the space your nodes occupy on the hash ring, during that 12 hours, by eliminating (via DoS or otherwise) the nodes which occupy the spaces you wish to occupy (and perhaps you can also use the 24-48 hours the shared random value is active)

I think this attack either requires a large number of evenly-spaced nodes, or has a small probability of success on any one day. But these may not be sufficient conditions to protect certain hidden services - because you could wait and try the attack on the days where your node(s) are near enough to the site(s) you wish to occupy.

Of course, having 6 HSDir replicas reduces the probability you could occupy more than 1 or 2 of their locations - but is occupying 1 or 2 locations enough to deanonymise clients?

>  Any other protocols using the shared random value from this system should
>  be aware of this property.

We could reduce this timeframe to around 1-2 hours by having authorities choose a random consensus between first and second-last (or third-last) to start revealing their commitments.

Or we could determine reveal order based on anticipated fingerprint order, modulo the number of consensuses in a phase (which is only relevant for > 12 authorities or < 12 consensuses per phase). So an authority which anticipates having the first fingerprint of N fingerprints, would reveal in consensus 12 - N - 1. Then the next would reveal in 12 - N - 2. And the last would reveal in consensus 12 - 1 (the second-last consensus). Or, again, we could do the final reveals in the third-last consensus.

Or we could use a hybrid of these schemes, and weight the random reveal consensus by anticipated fingerprint order.

This would somewhat increase the risk that authorities which leave the protocol during the reveal phase, wouldn't have revealed their votes yet.

I don't think there's a need for reducing the predictability from 12 hours to 1-2 hours, but we could keep it in mind if a future version needs to reduce the amount of time the shared random value is predictable.


Tim (teor)

Tim Wilson-Brown (teor)

teor2345 at gmail dot com

teor at blah dot im
OTR D5BE4EC2 255D7585 F3874930 DB130265 7C9EBBC7

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 832 bytes
Desc: Message signed with OpenPGP using GPGMail
URL: <http://lists.torproject.org/pipermail/tor-dev/attachments/20150804/c539a51e/attachment-0001.sig>

More information about the tor-dev mailing list