[tor-bugs] #12595 [Tor]: Think of better data structures for guard nodes

Mon Aug 4 11:34:21 UTC 2014

#12595: Think of better data structures for guard nodes
------------------------+--------------------------------
     Reporter:  asn     |      Owner:
         Type:  defect  |     Status:  new
     Priority:  normal  |  Milestone:  Tor: 0.2.6.x-final
    Component:  Tor     |    Version:
   Resolution:          |   Keywords:  tor-guard
Actual Points:          |  Parent ID:
       Points:          |
------------------------+--------------------------------

Comment (by asn):

 Replying to [comment:7 nickm]:
 > Replying to [comment:3 asn]:
 > > Some more thoughts on the new data structures:
 > >
 > > - Since we want our circuit guards to also be our directory guards (if
 > >   possible), we should probably use a single entry guard list (like we
 > >   do currently), instead of using separate lists for circuit guards
 > >   and directory guards. Reasons for this can be found here:
 > >   https://lists.torproject.org/pipermail/tor-dev/2014-May/006824.html
 >
 > Seems plausible to me.
 >
 > > - Furthermore, we want to make sure that if our directory guard claims
 > >   that it doesn't have a microdescriptor, we will go ahead and ask
 > >   other directory caches too:
 > >   https://lists.torproject.org/pipermail/tor-dev/2014-May/006820.html
 > >
 > >   I wonder if this behavior is any different from the corresponding
 > >   behavior of circuit guards: "If our circuit guard fails our circuit,
 > >   we have to go ahead and ask the next circuit guard"
 > >
 > >   If not, maybe we could just switch NumDirectoryGuards to 1 too, and
 > >   just make sure that if a microdescriptor gets denied we move to the
 > >   next directory guard, till we have enough microdescriptors to be
 > >   happy?
 > >   (logic similar to compute_frac_paths_available()).
 >
 > Watch out there.  "compute_frac_paths_available()" still allows some
 > epistemic bias.  It only insists that we have a large proportion of
 > possible microdescriptors before we're happy enough to build circuits,
 > not that we have all of them.  That's not such a big deal with multiple
 > noncolluding directory guards, but with only one guard, it's possibly
 > trouble.
 >

 Yep, we should be careful.

 I was thinking of an algorithm like this:
 {{{
 Fetch all mds from the first dirguard
 If all mds got fetched successfully:
    Done!
 else:
    # If any mds got denied, try the next dirguard
    # (even if we already have enough dir info).
    while True:
       # Keep asking other dirguards until we have enough dir info.
       Get missing mds from the next dirguard
       if have_minimum_dir_info():
          Done!
 }}}

 This would ensure that we try at least two dirguards whenever the first
 one did not serve us all the microdescriptors we were looking for. Once
 we have asked two dirguards, we stop as soon as we have enough directory
 info.
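
 As a rough illustration, the pseudocode above could be sketched in Python
 as follows. This is not Tor code: the function name
 fetch_mds_sequentially(), the modelling of a dirguard as a callable, and
 passing have_minimum_dir_info in as a predicate are all assumptions made
 for the example. Unlike the pseudocode, the sketch also stops when the
 guard list is exhausted.

```python
from typing import Callable, List, Set, Tuple

# Hypothetical model: a dirguard is a function that receives the set of
# requested microdescriptor digests and returns the subset it served.
DirGuard = Callable[[Set[str]], Set[str]]


def fetch_mds_sequentially(
    wanted: Set[str],
    dirguards: List[DirGuard],
    have_minimum_dir_info: Callable[[Set[str]], bool],
) -> Tuple[Set[str], int]:
    """Return (mds fetched, number of dirguards asked)."""
    if not dirguards:
        return set(), 0

    # Fetch everything from the first dirguard.
    have = dirguards[0](wanted) & wanted
    asked = 1
    if have == wanted:
        return have, asked

    # Something was denied: always consult at least one more dirguard,
    # even if we already have enough dir info, then stop once we do.
    for guard in dirguards[1:]:
        missing = wanted - have
        have |= guard(missing) & missing
        asked += 1
        if have_minimum_dir_info(have):
            break
    return have, asked


# Toy usage: three guards that each serve only part of the request, and a
# stand-in "enough dir info" rule of at least 3 of the 4 mds.
wanted = {"md1", "md2", "md3", "md4"}
guards = [
    lambda req: req & {"md1", "md2"},
    lambda req: req & {"md3"},
    lambda req: req & {"md4"},
]
have, asked = fetch_mds_sequentially(wanted, guards, lambda h: len(h) >= 3)
print(sorted(have), asked)
```

 In the toy run the first guard denies two mds, so the second guard is
 consulted, and the loop stops there because the stand-in threshold is
 met.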

 I guess the idea is that the first dirguard might lie, but the probability
 of both dirguards lying is lower. The algorithm can also be adapted to ask
 the first three dirguards (if we prefer the number 'three' to 'two').
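
 To make the intuition concrete, here is a back-of-the-envelope estimate.
 It assumes (this is an assumption, not something measured) that each
 dirguard lies independently with the same probability p, which only
 makes sense for noncolluding guards:

```latex
\Pr[\text{first } k \text{ dirguards all lie}] = p^{k},
\qquad\text{so for } 0 < p < 1:\quad p^{3} < p^{2} < p .
% Example with an assumed p = 0.1:
%   k = 1: 0.1,   k = 2: 0.01,   k = 3: 0.001
```

 If the guards collude, the independence assumption fails and asking more
 of them buys much less.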

 I think this kind of algorithm ("Try guards sequentially till you get what
 you want") is a better approach than "Choose between the 3 top currently
 active guards in your list", since it avoids bugs like #12466.

 A big engineering problem with this idea is that Tor's networking logic
 is probably not ready to support this feature and will require
 non-trivial changes. For example, we will need some kind of logic that
 marks dirguards as skippable if they failed to deliver a microdescriptor,
 so that the next dirguard query uses the next guard. I can imagine the
 implementation of this feature getting quite complicated.
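
 A minimal sketch of the "skippable" bookkeeping, again in Python and
 again purely illustrative: GuardList, mark_skippable(), next_dirguard()
 and clear_skippable() are hypothetical names, not existing Tor
 functions, and the real difficulty (wiring this into the directory
 request code) is not shown.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class GuardList:
    """Ordered guard list with a per-fetch 'skippable' mark (hypothetical)."""

    guards: List[str]
    skippable: Set[str] = field(default_factory=set)

    def mark_skippable(self, guard: str) -> None:
        # Called when a dirguard failed to deliver a microdescriptor.
        self.skippable.add(guard)

    def next_dirguard(self) -> Optional[str]:
        # First guard in list order that has not been marked skippable.
        for guard in self.guards:
            if guard not in self.skippable:
                return guard
        return None

    def clear_skippable(self) -> None:
        # Reset the marks, e.g. once the fetch round is over.
        self.skippable.clear()


gl = GuardList(["guardA", "guardB", "guardC"])
first = gl.next_dirguard()
gl.mark_skippable(first)
second = gl.next_dirguard()
print(first, second)
```

 Keeping the mark separate from the guard list itself means the guard
 ordering is never changed by a failed fetch; only the next query is
 redirected.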

 To better understand whether such an algorithm would work (and how often
 it would need to ask more dirguards), we need to check how frequent
 microdescriptor fetch failures are in the wild right now.

 Also, maybe it makes sense to write a proposal with the ideas from:
 https://lists.torproject.org/pipermail/tor-dev/2014-June/006944.html
 https://lists.torproject.org/pipermail/tor-dev/2014-June/006945.html
 and then try to implement it along with #12538, so that we eventually get
 rid of this issue:
 {{{
 ...
 If the above is the only case, we can fix it by making sure that
 directory servers start serving a consensus _only after_ they have
 downloaded the descriptors of all the routers mentioned in that
 consensus.
 ...
 }}}

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/12595#comment:8>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online