Clarifications on guard-spec.txt

Hi folks! I'm catching up on my proposals, and I really like guard-spec.txt. Nicely done! I made some fixes that I hope are straightforward: https://lists.torproject.org/pipermail/tor-commits/2017-May/122942.html https://lists.torproject.org/pipermail/tor-commits/2017-May/122948.html https://lists.torproject.org/pipermail/tor-commits/2017-May/122986.html And I have four remaining questions/topics that could use some feedback from the spec authors/implementors rather than just having me bust in and change things. --------------------------------------------------------------- A) We don't really say what we mean by "adding" a guard to an ordered list. In particular:
We add new members to {CONFIRMED_GUARDS} when we mark a circuit built through a guard as "for user traffic."
When we say "add", do we mean "append" here?
Similarly, does "add" mean "append"?
This one looks like a bug as currently stated: by "proceed", do we mean "precede"? --------------------------------------------------------------- B) In Sec 4.10, whenever we get a new consensus, we:
For every guard in {SAMPLED_GUARDS}, we update {IS_LISTED} and {FIRST_UNLISTED_AT}.
In the old design, we also believed the Running flag in the new consensus, that is, we marked the guards as reachable again. It looks from sampled_guards_update_from_consensus() like we no longer do that. Is that old behavior considered a bug and we intentionally stopped, or did we not consider it? I am ok with "that was stupid behavior, so we stopped", but if we didn't know that we used to do it, maybe we should decide whether it is stupid behavior. :) --------------------------------------------------------------- C) In Sec A.4, in the state file we have
"bridge_addr" -- If the guard is a bridge, its configured address and OR port. Optional.
How does this play with bridges that have pluggable transports? In my "bridge obfs2 128.31.0.34:51715" line, the ORPort of the bridge is not listed. It looks from https://trac.torproject.org/projects/tor/ticket/21027 like we did some fixing here, but the spec didn't get an update? --------------------------------------------------------------- D) In Sec 4.8, when a circuit succeeds:
[Where INTERNET_LIKELY_DOWN_INTERVAL has been picked as 10 minutes.] To make sure I understand this one, consider the following scenario: * We go offline, and our current circuits fail. * We try to make new circuits through each of our primary guards, failing for each (because we're offline) and marking them down. * Then we move to the remaining confirmed+usable ones, and mark those down too. * Then we work through the rest of USABLE_FILTERED_GUARDS, marking them down too. As we mark them down, USABLE_FILTERED_GUARDS shrinks, causing us to add new elements to SAMPLED_GUARDS to replenish USABLE_FILTERED_GUARDS. * After we've brought SAMPLED GUARDS to 60, we stop adding new ones. At this point, we are out of tries:
Does this mean that we mark our *already-sampled* guards as reachable, thus making USABLE_FILTERED_GUARDS big again, and we loop through the above steps again, and we keep looping until we come back online? The above "so that we can keep on sampling" phrase confuses me, because it seems like there's no way that we would arrive at "USABLE_FILTERED_GUARDS is empty" without also having SAMPLED_GUARDS at its maximum size:
So, perhaps we mean "so that we can keep on trying circuits"? Or, oh! Do we mean "sampling from USABLE_FILTERED_GUARDS", not "sampling from GUARDS"? Ok, proceeding with my scenario: let's further say that we're offline in such a way that we get quick failures for the connect attempts, so we burn through each connect attempt at one per second, meaning we blow through our CONFIRMED set within the first 10 minutes of going offline. Let's also say we come back online within those 10 minutes. A circuit (to some guard in SAMPLED_GUARDS) succeeds, and the circuit is now of type <waiting_for_better_guard>. Since last_time_on_internet isn't very long ago, and there aren't any other circuits open, the new circuit becomes <complete>, and we mark this new guard as CONFIRMED, and stick it on the end of the CONFIRMED set. We keep using the newly CONFIRMED guard (since it's the only one we think is reachable) for about a half hour, at which point the PRIMARY_GUARDS_RETRY_SCHED schedule makes us forgive our primary guards, and we switch back to our favorite primary guard. Do I have it right? Thanks! --Roger

Roger Dingledine <arma@mit.edu> writes:
Hello, all three commits above LGTM. Thanks for cleaning up the proposal!
All the suggestions above seem correct. We should open a ticket and make a patch.
Hmm, I also don't see us currently doing this, but it might be worth doing. I imagine it's not going to do much in most cases since primary guards have a pretty short retry period, but still it might be a good idea since it will keep our guard list more up to date. Perhaps we should consider adding this behavior to the spec and code.
Hmm, seems like that part of the spec is not really accurate indeed! Maybe we can just update the spec and say that `bridge_addr` contains the configured address and port of the bridge which can be either the ORPort or PT port?
Seems like a reasonable description. I haven't looked at this part of the code in a while to be 100% sure of what you described being true.
Hmm, I think "we can keep on sampling" actually means keep on sampling new guards for circuits, and not "keep on sampling new SAMPLED_GUARDS". I think we overloaded the term "sample" there; we should fix it. (FWIW, that functionality was added in #21052.)
Yep.
I think so yes.

Roger Dingledine <arma@mit.edu> writes:
Second email: Please check branch `guard_spec_roger` in my torspec repo for fixes to all the issues below (I think), apart from issue (B) which probably deserves its own ticket. Feel free to merge, and let me know if I missed something.
participants (3)
-
George Kadianakis
-
Nick Mathewson
-
Roger Dingledine