[tor-dev] Shared random value calculation edge cases (proposal 250)

David Goulet dgoulet at ev0ke.net
Fri Nov 20 16:24:24 UTC 2015


On 20 Nov (10:19:07), David Goulet wrote:
> On 20 Nov (16:06:59), George Kadianakis wrote:
> > s7r <s7r at sky-ip.org> writes:
> > 
> > > Hello,
> > >
> > > <snip>
> > >
> > > The idea of adding flags in the votes so each dirauth can advertise if
> > > it is participating (has an opinion for the <current> SR or not) is
> > > great and helps us build more defenses, probably make it easier in the
> > > future too if we decide to change anything.
> > >
> > > What if the consensus for SR calculation would define majority based
> > > on dirauths actually participating (and advertising so with a flag in
> > > the vote). Also, the participating or not participating flag should be
> > > used per vote/consensus and split into:
> > >
> > > a) we know current SR value for today so we vote it
> > > or
> > > we know previous SR value and we know for sure if we should follow the
> > > disaster protocol or not (in case we are about to vote at 01:00 UTC).
> > > so
> > > We participate in the vote for <current SR>.
> > >
> > > b) we are able to participate in this protocol run which will
> > > calculate the SR value for next day (after 00:00 UTC) so we send our
> > > commits/reveals.
> > >
> > > This is useful in case we are a dirauth that joined at 00:30 UTC and
> > > we couldn't get the _latest_ consensus (to find out if the 00:00 UTC
> > > consensus was created, and if not, previous SR value so we can follow
> > > the disaster procedure) we will not have an opinion for the <current>
> > > SR value at 01:00 UTC, but we can start participating in the protocol
> > > run for the next day - send our commit values. Once we decided on a
> > > <current> SR value for that day we save it and vote normally next time.
> > >
> > > So, if we have 5 dirauths running/signing consensus in total, out of
> > > which only 4 participate in the shared randomness protocol, the 4
> > > participating ones should be able to create a valid consensus
> > > themselves with the assurance that the 5th one won't break consensus.
> > >
> > > One way to do this is: the dirauth which is not participating will
> > > take the SR value voted by the majority of the participating dirauths
> > > and include that in its consensus and sign. We need at least 3
> > > dirauths agreeing on a SR value in order to accept it.
> > >
> > > Is this crazy? It shouldn't open the door to new attacks, since this
> > > doesn't allow a single actor to game it; only the majority could game it.
> > >
> > 
> > Thanks for the suggestions.
> > 
> > Let me try to suggest a procedure here based on your ideas and some other ideas.
> > 
> > [Notation: SRV = shared random value]
> > 
> > The goal here is to minimize the edge cases during SRV calculation and disaster
> > SRV calculation. The edge cases here appear because there is no clear view on
> > whether other dirauths know the current or previous SRVs, or whether the SRV for
> > this period was ever created. The disaster recovery scenario is especially
> > annoying here. 
> > 
> > Here are some edge cases for example:
> > 
> > * Dirauth boots up at 02:00UTC and the 01:00UTC consensus does not contain any
> >   SR information (maybe because not enough SR-enabled dirauths participated at
> >   that time).
> > 
> >   Should the dirauth do the disaster recovery procedure, or just play it cool
> >   and put no SR information on the consensus? If it has to do disaster recovery,
> >   then what previous SRV does it use (the 01:00UTC consensus did not contain
> >   such info)?
> > 
> >   This type of edge case is my main concern, since with dirauths upgrading and
> >   going offline at random times, it's likely that we will eventually create a
> >   consensus without SR info in the middle of the protocol run.
> > 
> > * Dirauth boots up at 23:55UTC without having a previous consensus. It is
> >   supposed to vote and form a 00:00UTC consensus without knowing any previous
> >   SRVs. How does it figure out whether all the other dirauths are also
> >   bootstrapping, or whether the other dirauths actually know the previous SRVs?
> > 
> > Here are some prerequisites for the logic I'm going to suggest. The two first
> > suggestions are useful in any case, I think:
> > 
> > - First of all, we treat consensuses as trusted, so dirauths MUST learn
> >   previous/current SRVs they didn't know about from any consensus they fetch.
> 
> Yes ++.
> 
> A dirauth booting up should always try to learn the current and previous
> SRV from the latest consensus (< 24h) and update the disk state
> accordingly _even_ if the disk state has SRV values in there. We should
> trust the consensus voted by majority before our disk state imo.
> 
> > 
> > - We are also going to assume that we have some sort of "SR flag" on votes to
> >   denote whether the dirauth participates in the protocol.
> 
> The requirements for having that flag "on" or "off" elude me a bit. I
> assume that this SR flag is "on" when a dirauth thinks it's able to
> compute a valid SRV, that is, it has enough commits. We can't make "having
> the previous SRV" a requirement, else we will never bootstrap properly,
> since in the first runs the previous SRV is always 0.
> 
> > 
> > - We are introducing another 'status' for shared random values (see [SRVOTE]).  
> > 
> >   Specifically, if a dirauth witnessed a 00:00UTC consensus failing to be
> >   created, or it did not contain any SRV information, then it sets the status of
> >   "shared-rand-current-value" in its votes to "none".
> 
> Hrm... I'm wondering here if the "none" status is instead the same as having
> the dirauth put the disaster value in the vote?
> 
> I would argue that a dirauth should _always_ try to compute a SR value
> at any time (even when booting). Between having no line because we
> couldn't compute due to lack of commits and having a line that is the
> disaster value, I think the latter simplifies things, that is, "We always
> have a SRV value at 00:00 from a dirauth".
> 
> Here is what I propose:
> 
> 1) Like you said, we always try to learn SRV values from previous
> consensus. If we can't get one from a consensus or our state, compute
> disaster at 00:00.
> 
> 2) We do _not_ use a SR flag for vote, a dirauth always tries to compute
> a current SRV value at 00:00. If it's 13:00, it simply doesn't put
> anything in the vote since no SRV data.
> 
> The big issue I see here is when a dirauth does _not_ have the previous
> SRV. Without it, both disaster and current SRV computation will end up
> not matching the majority. So let's explore this binary state:
> 
> - I have the previous SRV (either I can get it from my state or
>   consensus).
> 
>   This seems a non issue, dirauth will use it, carry it in the vote and
>   compute the current SRV if enough commits have been seen. At 00:00, if
>   majority wasn't reached for the current SRV value, we have a consensus
>   without it which is a valid use case but we at least have the previous
>   one in the consensus flagged as non-fresh so it can be used for
>   disaster or SRV computation at the next protocol run.
> 
> - I don't have a previous SRV (couldn't get it in the latest consensus
>   and I have _no_ disk state).
> 
>   For instance, if this happens at 23:30, a dirauth won't have the
>   chance to participate in the SR protocol because it didn't see any
>   commits to compute the SRV value, so disaster mode engages.
> 
>   If that dirauth was the 5th one out of 9 (meaning the majority
>   breaker), we end up with no SRV values at all at 00:00 (not even the
>   previous SRV since only 4/9 had it). So a new protocol run starts,
>   we'll end up with a valid current SRV in 24 hours and both values in
>   48 hours. Essentially, we end up back in bootstrap mode.
> 
>   IMPORTANT note here: I think a dirauth MUST NOT keep any SRV values
>   that aren't in the 00:00 consensus.
> 
>   Question is, how big of a deal is it that, if we don't reach
>   majority because a dirauth rebooted and for some magical reason
>   didn't get the latest consensus to at least learn the previous SRV,
>   we return to bootstrap mode?

Ok more thoughts on this, asn asked that I make an algorithm so here is
a try at that:

-- Booting up Conditions --

    - A dirauth MUST try to acquire both previous and current SRV from
      the last consensus. If it can't, get it from disk state. If
      nothing is available, none exists for this protocol run.
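
To make the boot-up rule concrete, here is a rough Python sketch. The
names (`SRVState`, `learn_srvs`) are made up for illustration and are not
from the proposal or the tor code. It also assumes one reading of the
rule: we fall back to the disk state only when no consensus could be
fetched at all, which matches the "discard what is not in the consensus"
rule in the algorithm.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SRVState:
    """Hypothetical container for the two SRVs a dirauth tracks."""
    previous: Optional[bytes] = None
    current: Optional[bytes] = None


def learn_srvs(consensus: Optional[SRVState],
               disk: Optional[SRVState]) -> SRVState:
    """Pick SRVs at boot: last consensus first, then disk state, else none.

    `consensus` is None when no recent consensus could be fetched;
    `disk` is None when there is no disk state.
    """
    if consensus is not None:
        # Assumption: the consensus wins even if it carries no SRV,
        # because the majority's view is trusted over our own disk state.
        return SRVState(consensus.previous, consensus.current)
    if disk is not None:
        return SRVState(disk.previous, disk.current)
    # Nothing available: no SRV exists for this protocol run.
    return SRVState()
```

For example, `learn_srvs(None, None)` yields an empty state, which is the
bootstrap situation discussed further down.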

-- Algorithm --

At 16:00 (arbitrary time which is _not_ the current SRV calculation):

    # Voting
    if dirauth has previous SRV:
        put it in vote
    if dirauth has current SRV:
        put it in vote

    Output: Consensus is created

    # Consensus
    (This goes for both previous and current SRV)
    if SRV in consensus:
        dirauth MUST keep it even if the one they have doesn't match.
        Majority has decided what should be used.
    else:
        dirauth MUST discard the SRV it has.

At 00:00:

    # Voting
    if current SRV can't be computed because of a lack of commits:
        current SRV = disaster value (with or without a previous SRV)
    else:
        compute current SRV (with or without a previous SRV)

    (Proceed like the 16:00 period)

    # Consensus
    if majority agrees on SR value(s):
        put in consensus
    else:
        No SR value(s) in consensus

    Output: consensus is created

    (Proceed like the 16:00 period)
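
The consensus step of the algorithm above could be sketched like this in
Python. This is only an illustration under assumptions: the function
names are hypothetical, "majority" is taken to mean a strict majority of
all dirauths (5 out of 9, for instance), and each vote is reduced to the
single SRV value it carries, or None if it carries none.

```python
from collections import Counter
from typing import List, Optional


def consensus_srv(voted: List[Optional[bytes]],
                  n_dirauths: int) -> Optional[bytes]:
    """Return the SRV voted by a strict majority of dirauths, else None."""
    counts = Counter(v for v in voted if v is not None)
    if not counts:
        return None
    value, n = counts.most_common(1)[0]
    # Assumption: majority is counted against all dirauths, not only
    # against those that put an SRV in their vote.
    return value if n > n_dirauths // 2 else None


def adopt(own: Optional[bytes],
          from_consensus: Optional[bytes]) -> Optional[bytes]:
    """Apply the keep/discard rule after a consensus is created.

    Whatever the dirauth held before (`own`) is ignored on purpose: it
    keeps the consensus value even if its own differed, and discards its
    own value when the consensus has none.
    """
    return from_consensus
```

With 9 dirauths, five matching votes give a value, while four matching
votes plus one dissenting vote give None, and every dirauth then drops
what it had.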


A side effect of only keeping SRVs that are in the consensus: if one
voting round goes bad for whatever reason and the consensus ends up with
no SRV, we are back in bootstrapping mode, that is, neither a previous
nor a current SRV in the consensus. This is problematic because for 48
hours we won't have a previous SRV, which is the one used by everyone.
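
The 48 hour gap can be seen with a toy simulation in Python. This is a
sketch under simplifying assumptions: one entry per 00:00 consensus, a
failed round wipes both values (per the discard rule), and the SRV
labels are placeholders. The `simulate` name is made up for this
example.

```python
from typing import List, Optional, Tuple

State = Tuple[Optional[str], Optional[str]]  # (previous SRV, current SRV)


def simulate(rounds_ok: List[bool]) -> List[State]:
    """Return the (previous, current) SRVs after each 00:00 consensus."""
    state: State = (None, None)
    history: List[State] = []
    for day, ok in enumerate(rounds_ok):
        if ok:
            # The old current SRV becomes previous; a new current is agreed.
            state = (state[1], f"srv{day}")
        else:
            # No SRV in the consensus: everyone discards what they had.
            state = (None, None)
        history.append(state)
    return history


# Day 2 is the round that goes bad.
for day, (prev, cur) in enumerate(simulate([True, True, False, True, True])):
    print(day, prev, cur)
```

In this run the previous SRV is missing on day 2 and day 3, and only
comes back on day 4, two days after the failed round.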

I don't see a way out of this because the consensus is decided
deterministically from the votes, so if not enough dirauths vote for SR
values, we'll end up with a consensus with none. This is why I think
clients/HSes have to fall back to a disaster value by themselves, which
can NOT be based on the previous SRV.

David

> 
> Cheers!
> David
> 
> > 
> > Now we are going to use the SRV lines in the votes as an indicator on how the
> > consensus creation should play out.
> > 
> > ================================================================================
> > 
> > Here is some logic for the consensus at 00:00UTC:
> > 
> >      if majority of votes have disabled the SR flag:
> >                then don't write anything SR-related to consensus 
> >                exit
> > 
> >      If the majority of votes contain the previous SRV value:
> >                then calculate SRV as detailed in section [SRCALC]  
> >      else:
> >                then calculate SRV as in [SRCALC] but with previous_SRV set to 0
> > 
> > And here is some logic for all the other consensuses (to figure out when
> > dirauths should perform the disaster procedure):
> > 
> >      if majority of votes have disabled the SR flag:
> >                then don't write anything SR-related to consensus 
> >                exit
> > 
> >      if we know the current SRV:
> >                then write it on the consensus
> >                exit
> > 
> >      if the majority of votes have the current SRV status as "none" _AND_
> >         those votes also contain the previous SRV value:
> >                then do the disaster SRV procedure
> >      else:
> >                then don't write anything SR-related to consensus             
> > 
> > ================================================================================
> > 
> > This _might_ work for fixing a good bunch of edge cases. But is it far too complex?
> > 
> > Should we just assume that these things will never happen on the real network
> > and avoid baking additional complexity? What do you think? :/
> > 
> > 
> > _______________________________________________
> > tor-dev mailing list
> > tor-dev at lists.torproject.org
> > https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev





