Hello. I would like to suggest this as an alternative. It's not complete, but the basic idea is there.
Filename: - Title: VSS shared rng Author: Kang Created: 2013-12-03 Status: Draft, Needs-Research
1. Overview & Motivation
This protocol and document are works in progress.
A method of generating a shared random value between a set of directory authorities. The purpose of the shared random value is to act as a salt during HSDir (Hidden Service Directory) selection/lookup calculations. A new shared random would be calculated periodically and served by the directory authorities.
The protocol is designed around VSS (Verifiable Secret Sharing). Appendix [SECRET_SHARING] gives a very basic overview of what secret sharing is. Secret sharing is used to prevent backing out, where ``backing out'' means changing the final result by refusing to participate any further in the protocol [for instance withholding your contribution]. Verifiable secret sharing must be used because it allows participants to verify that shares they receive are valid and consistent. That is, it ensures that each participant commits to a single value which they can't change, and that dishonest participants can't modify or lie about shares they've received.
The motivation for the shared random solution is that it allows solving of the problem in ticket #8244 without a complete redesign of the HSDir system. https://trac.torproject.org/projects/tor/ticket/8244 Details of how the shared random is actually used to solve that problem are not covered here, only calculation of the shared random value.
While this protocol is designed specifically for the Tor project it should be applicable for calculating shared random values in almost any situation. It is still rather rough around the edges, so feel free to point out any flaws or mistakes, or suggest an improvement or better method.
Please note that ``participant'' and ``(directory) authority'' will be used interchangeably. ``secret'' and ``contribution'' will also be used interchangeably, although I will try to stick to ``contribution''. n will be taken to mean the number of participants (directory authorities) taking part in the protocol.
2. Design
The basic idea behind the protocol is as follows. Each participant (authority) generates a secret contribution. These contributions are then distributed simultaneously among all participants using a modified VSS scheme and a threshold of t = floor((n-1)/2) + 1. The shared random is then calculated as some function of all those contributions, for instance the XOR of each contribution together.
The threshold t used in the VSS scheme is chosen such that no adversarial participant (or adversarial group of less than 50% of all participants) can calculate the secret early. If a group of adversarial participants were able to calculate the secret earlier than the honest participants they would be able to alter the final result either by withholding their shares (backing out) or by causing the protocol to restart (and hence a completely new value calculated). Appendix [THRESHOLD_CHOICE] discusses the choice of threshold.
The exact VSS scheme to be used is not set in stone. The one referenced while designing the basic protocol is Feldman's VSS, which is described in the paper: Feldman, Paul. "A practical scheme for non-interactive verifiable secret sharing." Foundations of Computer Science, 1987., 28th Annual Symposium on. IEEE, 1987.
The protocol is split into several phases. Each phase can be summarised as follows:
--INIT phase: The initial phase. Just wait until we either reach our scheduled start time or we receive a DISTRIBUTE message, triggering the shared random calculation. There should probably be some sort of time bound here, since we don't want it to be possible to trigger creation of a new random an arbitrary amount of times; an adversary could easily abuse that by calculating random values until a desirable one is created. An example time bound could be ``if a message is received within 10 minutes of our scheduled start time then begin the protocol, else drop the message [or try to get the authorities to resync their schedules]''.
--DISTRIBUTE phase: Perform the sharing phase for each of the n instances of the chosen VSS scheme. Create our shares and commitments/proofs, then distribute those among the other participants. Each participant should be sent a single share, plus any information they need to validate that share. This phase doesn't end until we've both distributed all n-1 of our shares and we've received a share from each participant (received n-1) [or error / timeout of course]. Whenever a participant receives a DISTRIBUTE message they must validate the share it contains (using the methods provided by the VSS scheme). If validation fails the participant must notify the other participants of the error and then the protocol restarts from the beginning.
--SYNC phase: When a participant transitions into the SYNC phase they send a SYNC message to all other participants; this contains no shares, however may contain validation information. When a participant has received n-1 SYNC messages they may transition into the REVEAL phase; additionally we may transition into the REVEAL phase if we receive t commitments and then the phase time-out timer goes off [This requires more analysis. It _might_ be possible to exploit such extra behaviour].
The purpose of the SYNC phase is to ensure that all participants have successfully completed the DISTRIBUTE phase before anybody moves to the REVEAL phase. We need to ensure that everybody has n-1 valid shares; 1 share from each secret (not counting their own secret). This is very important for preventing backing-out; it prevents anybody from REVEALing prematurely. Additionally this phase can be used for some parts of the validation (depending on the VSS method selected), specifically comparing commitment/proof information.
To calculate the result prematurely here an adversary would require control over t or more of the total participants (a majority). As long as there is an honest majority there is no danger in restarting the protocol from the beginning during this phase or earlier [other than delaying calculations].
--REVEAL phase: At this point everybody distributes the list of n-1 shares they received during the DISTRIBUTE phase [they _don't_ redistribute their own shares]. When a participant has enough shares (has received t-1 REVEAL messages) they reconstruct each of the secret contributions. When each secret contribution has been reconstructed by the participant they XOR'd them together to produce the shared random [alternative combination methods are possible too]. If an error occurs during this phase (or later) we should avoid restarting the protocol if at all possible; restarting opens up the potential for backing out.
--CONCLUSION phase: In this phase we prepare the random secret for publishing. That is, gather information so any Tor client can verify that the shared random was agreed upon by some majority of the authorities. This likely won't require any additional exchanges between participants as you can probably just publish the shares and validation data used during the calculation.
3. Specification
TODO: A more detailed specification will go here when I have a sufficiently ironed out one.
4. Compatibility & Implementation
This protocol could be implemented as part of the Tor software or could be implemented as a completely separate application (provided input and output are well defined).
5. Performance and scalability notes
Each participant sends and receives approximately n messages during each phase. The total number of messages sent during execution of the protocol should be O(n^2). This should be fine considering it's intended to be used among < 20 or so authorities.
Appendix A. Choice of threshold t [THRESHOLD_CHOICE]
For the threshold used in secret sharing we chose a value of t = floor((n-1)/2) + 1. The purpose of this section is to explain and justify this choice.
Notation: t = secret sharing threshold = number of shares required to reconstruct a secret. n = number of participants (participating directory authorities). d = number of collaborating adversary participants. n-d = the number of honest participants.
It is possible for a single adversary to cause changes in the result blindly; but doing so is pointless because they're merely exchanging one unknown value with another. For instance, they could refuse to share their contribution, causing either the protocol to restart (and an entirely new result calculated) or for them to be ignored (and the result is calculated without their input). Additionally they could cause a fatal error which results in a full restart (and full recalculation).
If an adversary has control over a large enough subset d of the participants, however, it becomes possible to prematurely calculate the result. This is where things get dangerous; the collaborating adversaries could make informed decisions when modifying the result. For that reason careful selection of the secret sharing threshold t is important. We want to prevent all these kinds of attacks while also maximising the number of adversaries we can defend against.
There are two main points where premature calculation of the result can be done; the first is during the DISTRIBUTE or SYNC phase, and the second is during the REVEAL phase.
--In the DISTRIBUTE or SYNC phase During the DISTRIBUTE phase each participant generates a secret, splits it into n-1 shares, and then distributes one share to each other participant.
Assume the set of d adversaries simply waits at the start of this phase; they wait and collect shares received from the honest participants. When the honest participants have all completed distributing their shares the set of adversaries can pool all their received shares and get a list of d*(n-d) honest shares. That is, the adversaries have a list of d shares from each honest participant's secret. If the number of adversaries d is larger than the threshold t then they can use these shares to reconstruct all of the honest contributions and prematurely calculate the result.
Therefore to prevent this we require that t > d, that the threshold is larger than the number of adversaries.
--In the REVEAL phase Assume that the whole first half of the protocol completes without issue; the adversaries behave themselves and we reach the REVEAL phase.
At the start of the reveal phase every participant should have received n-1 shares; 1 share from each other participant's secret. If there are d adversarial participants then they have collected a total of d*(n-d) honest shares; d shares from each of the honest contributions. The honest participants (if they pooled their shares) have collected a total of (n-d)*d shares from the adversaries; (n-d) shares from each of the adversary contributions.
Since the honest participants are honest they will abide by the protocol and each REVEAL all (n-d) shares they have learned to every other participant. The adversarial participants just wait and collect shares as they did in the previous method. After all honest participants have REVEALed the adversaries pool together what they know to get (n-1)*(n-d); every share from every honest contribution. The adversaries can now reconstruct all honest contributions and calculate the final result.
If n-d < t then the honest participants can't reconstruct any of the secret contributions without at least one adversary REVEALing. Since the adversaries know the final result but the honest participants don't they can alter it by all refusing to REVEAL. Therefore to prevent this we require that n-d >= t, that the number of honest participants is larger than the threshold.
--Protecting against both To protect against both attacks we need to satisfy: t > d AND n-d >= t
We can change that to: n > 2d AND d < t <= n - d
n > 2d, so n/2 > d, and since we're dealing with integers here floor((n-1)/2) >= d. That is, d = floor((n-1)/2) = the largest number of adversaries we can handle while defending against both attacks.
This gives us: floor((n-1)/2) < t <= n - floor((n-1)/2)
Example: n = 10 d = floor((10 - 1)/2) = 4 4 < t <= 6 t in {5, 6}
n = 11 d = floor((11 - 1)/2) = 5 5 < t <= 6 t in {6}
n = 12 d = floor((12 - 1)/2) = 5 5 < t <= 7 t in {6, 7}
n = 13 d = floor((13 - 1)/2) = 6 6 < t <= 7 t in {7}
...
We'll choose the smaller value for t when more than one is possible, giving us: t = d + 1 = floor((n-1)/2) + 1
[TODO: ensure the smaller value is indeed the better choice]
Appendix B. Very basic overview of secret sharing [SECRET_SHARING]
Secret sharing algorithms are algorithms that take some secret value as input, then split that into a set of _shares_. On their own shares are useless, however, if you combine them you can reconstruct the original secret value.
Secret sharing algorithms are often combined with a threshold scheme, where the original secret value can be reconstructed from any t shares. These tend to be denoted as (t, n)-threshold schemes; where t is the minimum number of shares required for reconstruction, and n is the total number of shares created.
The creator of the shares is often referred to as the _dealer_. Generally the dealer splits their secret into n shares which are then distributed among n _participants_. Any t of those participants can then work together to reconstruct the secret.
In regular secret sharing it is possible to create shares such that different combinations of t shares will produce different reconstructions of the secret. Verifiable secret sharing (VSS) is a more secure type of secret sharing. Verifiable secret sharing methods allow holders of shares to verify that their shares are valid and consistent; they protect against dishonest dealers creating invalid or inconsistent shares, and against participants modifying or lying about their shares.