[tor-dev] Clarifications on guard-spec.txt

George Kadianakis desnacked at riseup.net
Fri May 19 11:30:59 UTC 2017

Roger Dingledine <arma at mit.edu> writes:

> Hi folks!
> I'm catching up on my proposals, and I really like guard-spec.txt.
> Nicely done!
> I made some fixes that I hope are straightforward:
> https://lists.torproject.org/pipermail/tor-commits/2017-May/122942.html
> https://lists.torproject.org/pipermail/tor-commits/2017-May/122948.html
> https://lists.torproject.org/pipermail/tor-commits/2017-May/122986.html


all three commits above LGTM. Thanks for cleaning up the proposal!

> And I have four remaining questions/topics that could use some feedback
> from the spec authors/implementors rather than just having me bust in
> and change things.
> ---------------------------------------------------------------
> A) We don't really say what we mean by "adding" a guard to an ordered
> list. In particular:
>>  We add new members to {CONFIRMED_GUARDS} when we mark a circuit
>>  built through a guard as "for user traffic."
> When we say "add", do we mean "append" here?
>>  To compute primary guards, take the ordered intersection of
>>  {CONFIRMED_GUARDS} and {FILTERED_GUARDS}, and take the first
>>  {N_PRIMARY_GUARDS} elements.  If there are fewer than
>>  {N_PRIMARY_GUARDS} elements, add additional elements to
>>  PRIMARY_GUARDS chosen _uniformly_ at random from
> Similarly, does "add" mean "append"?
>>  Once an element has been added to {PRIMARY_GUARDS}, we do not remove it
>>  until it is replaced by some element from {CONFIRMED_GUARDS}. Confirmed
>>  elements always proceed unconfirmed ones in the {PRIMARY_GUARDS} list.
> This one looks like a bug as currently stated: by "proceed", do we mean
> "precede"?

All the suggestions above seem correct.
We should open a ticket and make a patch.

> ---------------------------------------------------------------
> B) In Sec 4.10, whenever we get a new consensus, we:
>>  For every guard in {SAMPLED_GUARDS}, we update {IS_LISTED} and
> In the old design, we also believed the Running flag in the new
> consensus, that is, we marked the guards as reachable again. It looks
> from sampled_guards_update_from_consensus() like we no longer do that.
> Is that old behavior considered a bug and we intentionally stopped,
> or did we not consider it?
> I am ok with "that was stupid behavior, so we stopped", but if we didn't
> know that we used to do it, maybe we should decide whether it is stupid
> behavior. :)

Hmm, I also don't see us currently doing this, but it might be worth doing.

I imagine it's not going to do much in most cases since primary guards
have a pretty short retry period, but still it might be a good idea
since it will keep our guard list more up to date.

Perhaps we should consider adding this behavior to the spec and code.

> ---------------------------------------------------------------
> C) In Sec A.4, in the state file we have
>>         "bridge_addr" -- If the guard is a bridge, its configured
>>         address and OR port. Optional.
> How does this play with bridges that have pluggable transports? In my
> "bridge obfs2" line, the ORPort of the bridge is
> not listed.
> It looks from
> https://trac.torproject.org/projects/tor/ticket/21027
> like we did some fixing here, but the spec didn't get an update?

Hmm, seems like that part of the spec is not really accurate indeed!

Maybe we can just update the spec and say that `bridge_addr` contains
the configured address and port of the bridge which can be either the
ORPort or PT port?

> ---------------------------------------------------------------
> D) In Sec 4.8, when a circuit succeeds:
>>      * If this circuit was <usable_if_no_better_guard>, it is now
>>        <waiting_for retry>.  You may not yet attach streams to it.
>>        Then check whether the {last_time_on_internet} is more than
>>        {INTERNET_LIKELY_DOWN_INTERVAL} seconds ago:
>>           * If it is, then mark all {PRIMARY_GUARDS} as "maybe"
>>             reachable.
>>           * If it is not, update the list of waiting circuits. (See
>>             [UPDATE_WAITING] below)
> [Where INTERNET_LIKELY_DOWN_INTERVAL has been picked as 10 minutes.]
> To make sure I understand this one, consider the following scenario:
> * We go offline, and our current circuits fail.
> * We try to make new circuits through each of our primary guards,
>   failing for each (because we're offline) and marking them down.
> * Then we move to the remaining confirmed+usable ones, and mark those
>   down too.
> * Then we work through the rest of USABLE_FILTERED_GUARDS, marking them
>   down too. As we mark them down, USABLE_FILTERED_GUARDS shrinks,
>   causing us to add new elements to SAMPLED_GUARDS to replenish
> * After we've brought SAMPLED GUARDS to 60, we stop adding new ones.
>   At this point, we are out of tries:

Seems like a reasonable description. I haven't looked at this part of
the code in a while to be 100% sure of what you described being true.

>>  * Otherwise, if USABLE_FILTERED_GUARDS is empty, we have exhausted
>>      all the sampled guards.  In this case we proceed by marking all guards
>>      as <maybe> reachable so that we can keep on sampling.
> Does this mean that we mark our *already-sampled* guards as reachable,
> thus making USABLE_FILTERED_GUARDS big again, and we loop through the
> above steps again, and we keep looping until we come back online?
> The above "so that we can keep on sampling" phrase confuses me, because it
> seems like there's no way that we would arrive at "USABLE_FILTERED_GUARDS
> is empty" without also having SAMPLED_GUARDS at its maximum size:

Hmm, I think "we can keep on sampling" actually means keep on sampling
new guards for circuits, and not "keep on sampling new
SAMPLED_GUARDS". I think we overloaded the term "sample" there; we
should fix it.

(FWIW, that functionality was added in #21052.)

>>     Whenever we are going to sample from {USABLE_FILTERED_GUARDS},
>>     and it contains fewer than {MIN_FILTERED_SAMPLE} elements, we
>>     add new elements to {SAMPLED_GUARDS} until one of the following
>>     is true:
>>       * {USABLE_FILTERED_GUARDS} is large enough,
>>     OR
>>       * {SAMPLED_GUARDS} is at its maximum size.
> So, perhaps we mean "so that we can keep on trying circuits"?


> Or, oh! Do we mean "sampling from USABLE_FILTERED_GUARDS", not "sampling
> from GUARDS"?
> Ok, proceeding with my scenario: let's further say that we're offline
> in such a way that we get quick failures for the connect attempts,
> so we burn through each connect attempt at one per second, meaning we
> blow through our CONFIRMED set within the first 10 minutes of going
> offline. Let's also say we come back online within those 10 minutes. A
> circuit (to some guard in SAMPLED_GUARDS) succeeds, and the circuit is
> now of type <waiting_for_better_guard>.
> Since last_time_on_internet isn't very long ago, and there aren't any
> other circuits open, the new circuit becomes <complete>, and we mark
> this new guard as CONFIRMED, and stick it on the end of the CONFIRMED set.
> We keep using the newly CONFIRMED guard (since it's the only one
> we think is reachable) for about a half hour, at which point the
> PRIMARY_GUARDS_RETRY_SCHED schedule makes us forgive our primary guards,
> and we switch back to our favorite primary guard.
> Do I have it right?

I think so yes.

More information about the tor-dev mailing list