Clarifications on guard-spec.txt

19 May 2017

      Hi folks!

I'm catching up on my proposals, and I really like guard-spec.txt.
Nicely done!

I made some fixes that I hope are straightforward:
https://lists.torproject.org/pipermail/tor-commits/2017-May/122942.html
https://lists.torproject.org/pipermail/tor-commits/2017-May/122948.html
https://lists.torproject.org/pipermail/tor-commits/2017-May/122986.html

And I have four remaining questions/topics that could use some feedback
from the spec authors/implementors rather than just having me bust in
and change things.

---------------------------------------------------------------

A) We don't really say what we mean by "adding" a guard to an ordered
list. In particular:
...
We add new members to {CONFIRMED_GUARDS} when we mark a circuit
 built through a guard as "for user traffic."
When we say "add", do we mean "append" here?
...
To compute primary guards, take the ordered intersection of
 {CONFIRMED_GUARDS} and {FILTERED_GUARDS}, and take the first
 {N_PRIMARY_GUARDS} elements.  If there are fewer than
 {N_PRIMARY_GUARDS} elements, add additional elements to
 PRIMARY_GUARDS chosen _uniformly_ at random from
 ({FILTERED_GUARDS} - {CONFIRMED_GUARDS}).
Similarly, does "add" mean "append"?
...
Once an element has been added to {PRIMARY_GUARDS}, we do not remove it
 until it is replaced by some element from {CONFIRMED_GUARDS}. Confirmed
 elements always proceed unconfirmed ones in the {PRIMARY_GUARDS} list.
This one looks like a bug as currently stated: by "proceed", do we mean
"precede"?

---------------------------------------------------------------

B) In Sec 4.10, whenever we get a new consensus, we:
...
For every guard in {SAMPLED_GUARDS}, we update {IS_LISTED} and
 {FIRST_UNLISTED_AT}.
In the old design, we also believed the Running flag in the new
consensus, that is, we marked the guards as reachable again. It looks
from sampled_guards_update_from_consensus() like we no longer do that.

Is that old behavior considered a bug and we intentionally stopped,
or did we not consider it?

I am ok with "that was stupid behavior, so we stopped", but if we didn't
know that we used to do it, maybe we should decide whether it is stupid
behavior. :)

---------------------------------------------------------------

C) In Sec A.4, in the state file we have
...
"bridge_addr" -- If the guard is a bridge, its configured
        address and OR port. Optional.
How does this play with bridges that have pluggable transports? In my
"bridge obfs2 128.31.0.34:51715" line, the ORPort of the bridge is
not listed.

It looks from
https://trac.torproject.org/projects/tor/ticket/21027
like we did some fixing here, but the spec didn't get an update?

---------------------------------------------------------------

D) In Sec 4.8, when a circuit succeeds:
...
* If this circuit was <usable_if_no_better_guard>, it is now
       <waiting_for retry>.  You may not yet attach streams to it.
       Then check whether the {last_time_on_internet} is more than
       {INTERNET_LIKELY_DOWN_INTERVAL} seconds ago:
* If it is, then mark all {PRIMARY_GUARDS} as "maybe"
            reachable.
* If it is not, update the list of waiting circuits. (See
            [UPDATE_WAITING] below)
[Where INTERNET_LIKELY_DOWN_INTERVAL has been picked as 10 minutes.]

To make sure I understand this one, consider the following scenario:

* We go offline, and our current circuits fail.
* We try to make new circuits through each of our primary guards,
  failing for each (because we're offline) and marking them down.
* Then we move to the remaining confirmed+usable ones, and mark those
  down too.
* Then we work through the rest of USABLE_FILTERED_GUARDS, marking them
  down too. As we mark them down, USABLE_FILTERED_GUARDS shrinks,
  causing us to add new elements to SAMPLED_GUARDS to replenish
  USABLE_FILTERED_GUARDS.
* After we've brought SAMPLED GUARDS to 60, we stop adding new ones.
  At this point, we are out of tries:
...
* Otherwise, if USABLE_FILTERED_GUARDS is empty, we have exhausted
     all the sampled guards.  In this case we proceed by marking all guards
     as <maybe> reachable so that we can keep on sampling.
Does this mean that we mark our *already-sampled* guards as reachable,
thus making USABLE_FILTERED_GUARDS big again, and we loop through the
above steps again, and we keep looping until we come back online?

The above "so that we can keep on sampling" phrase confuses me, because it
seems like there's no way that we would arrive at "USABLE_FILTERED_GUARDS
is empty" without also having SAMPLED_GUARDS at its maximum size:
...
Whenever we are going to sample from {USABLE_FILTERED_GUARDS},
    and it contains fewer than {MIN_FILTERED_SAMPLE} elements, we
    add new elements to {SAMPLED_GUARDS} until one of the following
    is true:
* {USABLE_FILTERED_GUARDS} is large enough,
    OR
      * {SAMPLED_GUARDS} is at its maximum size.
So, perhaps we mean "so that we can keep on trying circuits"?

Or, oh! Do we mean "sampling from USABLE_FILTERED_GUARDS", not "sampling
from GUARDS"?

Ok, proceeding with my scenario: let's further say that we're offline
in such a way that we get quick failures for the connect attempts,
so we burn through each connect attempt at one per second, meaning we
blow through our CONFIRMED set within the first 10 minutes of going
offline. Let's also say we come back online within those 10 minutes. A
circuit (to some guard in SAMPLED_GUARDS) succeeds, and the circuit is
now of type <waiting_for_better_guard>.

Since last_time_on_internet isn't very long ago, and there aren't any
other circuits open, the new circuit becomes <complete>, and we mark
this new guard as CONFIRMED, and stick it on the end of the CONFIRMED set.

We keep using the newly CONFIRMED guard (since it's the only one
we think is reachable) for about a half hour, at which point the
PRIMARY_GUARDS_RETRY_SCHED schedule makes us forgive our primary guards,
and we switch back to our favorite primary guard.

Do I have it right?

Thanks!
--Roger

Roger Dingledine

George Kadianakis

George Kadianakis

Nick Mathewson

tags

participants (3)