[tor-dev] Clarifications on guard-spec.txt

George Kadianakis desnacked at riseup.net
Fri May 19 11:30:59 UTC 2017


Roger Dingledine <arma at mit.edu> writes:

> Hi folks!
>
> I'm catching up on my proposals, and I really like guard-spec.txt.
> Nicely done!
>
> I made some fixes that I hope are straightforward:
> https://lists.torproject.org/pipermail/tor-commits/2017-May/122942.html
> https://lists.torproject.org/pipermail/tor-commits/2017-May/122948.html
> https://lists.torproject.org/pipermail/tor-commits/2017-May/122986.html
>

Hello,

all three commits above LGTM. Thanks for cleaning up the proposal!

> And I have four remaining questions/topics that could use some feedback
> from the spec authors/implementors rather than just having me bust in
> and change things.
>
> ---------------------------------------------------------------
>
> A) We don't really say what we mean by "adding" a guard to an ordered
> list. In particular:
>
>>  We add new members to {CONFIRMED_GUARDS} when we mark a circuit
>>  built through a guard as "for user traffic."
>
> When we say "add", do we mean "append" here?
>
>>  To compute primary guards, take the ordered intersection of
>>  {CONFIRMED_GUARDS} and {FILTERED_GUARDS}, and take the first
>>  {N_PRIMARY_GUARDS} elements.  If there are fewer than
>>  {N_PRIMARY_GUARDS} elements, add additional elements to
>>  PRIMARY_GUARDS chosen _uniformly_ at random from
>>  ({FILTERED_GUARDS} - {CONFIRMED_GUARDS}).
>
> Similarly, does "add" mean "append"?
>
>>  Once an element has been added to {PRIMARY_GUARDS}, we do not remove it
>>  until it is replaced by some element from {CONFIRMED_GUARDS}. Confirmed
>>  elements always proceed unconfirmed ones in the {PRIMARY_GUARDS} list.
>
> This one looks like a bug as currently stated: by "proceed", do we mean
> "precede"?
>

All the suggestions above seem correct.
We should open a ticket and make a patch.

> ---------------------------------------------------------------
>
> B) In Sec 4.10, whenever we get a new consensus, we:
>
>>  For every guard in {SAMPLED_GUARDS}, we update {IS_LISTED} and
>>  {FIRST_UNLISTED_AT}.
>
> In the old design, we also believed the Running flag in the new
> consensus, that is, we marked the guards as reachable again. It looks
> from sampled_guards_update_from_consensus() like we no longer do that.
>
> Is that old behavior considered a bug and we intentionally stopped,
> or did we not consider it?
>
> I am ok with "that was stupid behavior, so we stopped", but if we didn't
> know that we used to do it, maybe we should decide whether it is stupid
> behavior. :)
>

Hmm, I also don't see us currently doing this, but it might be worth doing.

I imagine it's not going to do much in most cases since primary guards
have a pretty short retry period, but still it might be a good idea
since it will keep our guard list more up to date.

Perhaps we should consider adding this behavior to the spec and code.

> ---------------------------------------------------------------
>
> C) In Sec A.4, in the state file we have
>
>>         "bridge_addr" -- If the guard is a bridge, its configured
>>         address and OR port. Optional.
>
> How does this play with bridges that have pluggable transports? In my
> "bridge obfs2 128.31.0.34:51715" line, the ORPort of the bridge is
> not listed.
>
> It looks from
> https://trac.torproject.org/projects/tor/ticket/21027
> like we did some fixing here, but the spec didn't get an update?
>

Hmm, seems like that part of the spec is not really accurate indeed!

Maybe we can just update the spec and say that `bridge_addr` contains
the configured address and port of the bridge which can be either the
ORPort or PT port?

> ---------------------------------------------------------------
>
> D) In Sec 4.8, when a circuit succeeds:
>
>>      * If this circuit was <usable_if_no_better_guard>, it is now
>>        <waiting_for retry>.  You may not yet attach streams to it.
>>        Then check whether the {last_time_on_internet} is more than
>>        {INTERNET_LIKELY_DOWN_INTERVAL} seconds ago:
>>
>>           * If it is, then mark all {PRIMARY_GUARDS} as "maybe"
>>             reachable.
>>
>>           * If it is not, update the list of waiting circuits. (See
>>             [UPDATE_WAITING] below)
>
> [Where INTERNET_LIKELY_DOWN_INTERVAL has been picked as 10 minutes.]
>
> To make sure I understand this one, consider the following scenario:
>
> * We go offline, and our current circuits fail.
> * We try to make new circuits through each of our primary guards,
>   failing for each (because we're offline) and marking them down.
> * Then we move to the remaining confirmed+usable ones, and mark those
>   down too.
> * Then we work through the rest of USABLE_FILTERED_GUARDS, marking them
>   down too. As we mark them down, USABLE_FILTERED_GUARDS shrinks,
>   causing us to add new elements to SAMPLED_GUARDS to replenish
>   USABLE_FILTERED_GUARDS.
> * After we've brought SAMPLED GUARDS to 60, we stop adding new ones.
>   At this point, we are out of tries:
>

Seems like a reasonable description. I haven't looked at this part of
the code in a while to be 100% sure of what you described being true.

>>  * Otherwise, if USABLE_FILTERED_GUARDS is empty, we have exhausted
>>      all the sampled guards.  In this case we proceed by marking all guards
>>      as <maybe> reachable so that we can keep on sampling.
>
> Does this mean that we mark our *already-sampled* guards as reachable,
> thus making USABLE_FILTERED_GUARDS big again, and we loop through the
> above steps again, and we keep looping until we come back online?
>
> The above "so that we can keep on sampling" phrase confuses me, because it
> seems like there's no way that we would arrive at "USABLE_FILTERED_GUARDS
> is empty" without also having SAMPLED_GUARDS at its maximum size:
>

Hmm, I think "we can keep on sampling" actually means keep on sampling
new guards for circuits, and not "keep on sampling new
SAMPLED_GUARDS". I think we overloaded the term "sample" there; we
should fix it.

(FWIW, that functionality was added in #21052.)

>>     Whenever we are going to sample from {USABLE_FILTERED_GUARDS},
>>     and it contains fewer than {MIN_FILTERED_SAMPLE} elements, we
>>     add new elements to {SAMPLED_GUARDS} until one of the following
>>     is true:
>>
>>       * {USABLE_FILTERED_GUARDS} is large enough,
>>     OR
>>       * {SAMPLED_GUARDS} is at its maximum size.
>
> So, perhaps we mean "so that we can keep on trying circuits"?
>

Yep.

> Or, oh! Do we mean "sampling from USABLE_FILTERED_GUARDS", not "sampling
> from GUARDS"?
>
> Ok, proceeding with my scenario: let's further say that we're offline
> in such a way that we get quick failures for the connect attempts,
> so we burn through each connect attempt at one per second, meaning we
> blow through our CONFIRMED set within the first 10 minutes of going
> offline. Let's also say we come back online within those 10 minutes. A
> circuit (to some guard in SAMPLED_GUARDS) succeeds, and the circuit is
> now of type <waiting_for_better_guard>.
>
> Since last_time_on_internet isn't very long ago, and there aren't any
> other circuits open, the new circuit becomes <complete>, and we mark
> this new guard as CONFIRMED, and stick it on the end of the CONFIRMED set.
>
> We keep using the newly CONFIRMED guard (since it's the only one
> we think is reachable) for about a half hour, at which point the
> PRIMARY_GUARDS_RETRY_SCHED schedule makes us forgive our primary guards,
> and we switch back to our favorite primary guard.
>
> Do I have it right?
>

I think so yes.


More information about the tor-dev mailing list