Hi folks!
I'm catching up on my proposals, and I really like guard-spec.txt. Nicely done!
I made some fixes that I hope are straightforward: https://lists.torproject.org/pipermail/tor-commits/2017-May/122942.html https://lists.torproject.org/pipermail/tor-commits/2017-May/122948.html https://lists.torproject.org/pipermail/tor-commits/2017-May/122986.html
And I have four remaining questions/topics that could use some feedback from the spec authors/implementors rather than just having me bust in and change things.
---------------------------------------------------------------
A) We don't really say what we mean by "adding" a guard to an ordered list. In particular:
We add new members to {CONFIRMED_GUARDS} when we mark a circuit built through a guard as "for user traffic."
When we say "add", do we mean "append" here?
To compute primary guards, take the ordered intersection of {CONFIRMED_GUARDS} and {FILTERED_GUARDS}, and take the first {N_PRIMARY_GUARDS} elements. If there are fewer than {N_PRIMARY_GUARDS} elements, add additional elements to PRIMARY_GUARDS chosen _uniformly_ at random from ({FILTERED_GUARDS} - {CONFIRMED_GUARDS}).
Similarly, does "add" mean "append"?
Once an element has been added to {PRIMARY_GUARDS}, we do not remove it until it is replaced by some element from {CONFIRMED_GUARDS}. Confirmed elements always proceed unconfirmed ones in the {PRIMARY_GUARDS} list.
This one looks like a bug as currently stated: by "proceed", do we mean "precede"?
---------------------------------------------------------------
B) In Sec 4.10, whenever we get a new consensus, we:
For every guard in {SAMPLED_GUARDS}, we update {IS_LISTED} and {FIRST_UNLISTED_AT}.
In the old design, we also believed the Running flag in the new consensus, that is, we marked the guards as reachable again. It looks from sampled_guards_update_from_consensus() like we no longer do that.
Is that old behavior considered a bug and we intentionally stopped, or did we not consider it?
I am ok with "that was stupid behavior, so we stopped", but if we didn't know that we used to do it, maybe we should decide whether it is stupid behavior. :)
---------------------------------------------------------------
C) In Sec A.4, in the state file we have
"bridge_addr" -- If the guard is a bridge, its configured address and OR port. Optional.
How does this play with bridges that have pluggable transports? In my "bridge obfs2 128.31.0.34:51715" line, the ORPort of the bridge is not listed.
It looks from https://trac.torproject.org/projects/tor/ticket/21027 like we did some fixing here, but the spec didn't get an update?
---------------------------------------------------------------
D) In Sec 4.8, when a circuit succeeds:
* If this circuit was <usable_if_no_better_guard>, it is now <waiting_for retry>. You may not yet attach streams to it. Then check whether the {last_time_on_internet} is more than {INTERNET_LIKELY_DOWN_INTERVAL} seconds ago: * If it is, then mark all {PRIMARY_GUARDS} as "maybe" reachable. * If it is not, update the list of waiting circuits. (See [UPDATE_WAITING] below)
[Where INTERNET_LIKELY_DOWN_INTERVAL has been picked as 10 minutes.]
To make sure I understand this one, consider the following scenario:
* We go offline, and our current circuits fail. * We try to make new circuits through each of our primary guards, failing for each (because we're offline) and marking them down. * Then we move to the remaining confirmed+usable ones, and mark those down too. * Then we work through the rest of USABLE_FILTERED_GUARDS, marking them down too. As we mark them down, USABLE_FILTERED_GUARDS shrinks, causing us to add new elements to SAMPLED_GUARDS to replenish USABLE_FILTERED_GUARDS. * After we've brought SAMPLED GUARDS to 60, we stop adding new ones. At this point, we are out of tries:
- Otherwise, if USABLE_FILTERED_GUARDS is empty, we have exhausted all the sampled guards. In this case we proceed by marking all guards as <maybe> reachable so that we can keep on sampling.
Does this mean that we mark our *already-sampled* guards as reachable, thus making USABLE_FILTERED_GUARDS big again, and we loop through the above steps again, and we keep looping until we come back online?
The above "so that we can keep on sampling" phrase confuses me, because it seems like there's no way that we would arrive at "USABLE_FILTERED_GUARDS is empty" without also having SAMPLED_GUARDS at its maximum size:
Whenever we are going to sample from {USABLE_FILTERED_GUARDS}, and it contains fewer than {MIN_FILTERED_SAMPLE} elements, we add new elements to {SAMPLED_GUARDS} until one of the following is true: * {USABLE_FILTERED_GUARDS} is large enough, OR * {SAMPLED_GUARDS} is at its maximum size.
So, perhaps we mean "so that we can keep on trying circuits"?
Or, oh! Do we mean "sampling from USABLE_FILTERED_GUARDS", not "sampling from GUARDS"?
Ok, proceeding with my scenario: let's further say that we're offline in such a way that we get quick failures for the connect attempts, so we burn through each connect attempt at one per second, meaning we blow through our CONFIRMED set within the first 10 minutes of going offline. Let's also say we come back online within those 10 minutes. A circuit (to some guard in SAMPLED_GUARDS) succeeds, and the circuit is now of type <waiting_for_better_guard>.
Since last_time_on_internet isn't very long ago, and there aren't any other circuits open, the new circuit becomes <complete>, and we mark this new guard as CONFIRMED, and stick it on the end of the CONFIRMED set.
We keep using the newly CONFIRMED guard (since it's the only one we think is reachable) for about a half hour, at which point the PRIMARY_GUARDS_RETRY_SCHED schedule makes us forgive our primary guards, and we switch back to our favorite primary guard.
Do I have it right?
Thanks! --Roger
Roger Dingledine arma@mit.edu writes:
Hi folks!
I'm catching up on my proposals, and I really like guard-spec.txt. Nicely done!
I made some fixes that I hope are straightforward: https://lists.torproject.org/pipermail/tor-commits/2017-May/122942.html https://lists.torproject.org/pipermail/tor-commits/2017-May/122948.html https://lists.torproject.org/pipermail/tor-commits/2017-May/122986.html
Hello,
all three commits above LGTM. Thanks for cleaning up the proposal!
And I have four remaining questions/topics that could use some feedback from the spec authors/implementors rather than just having me bust in and change things.
A) We don't really say what we mean by "adding" a guard to an ordered list. In particular:
We add new members to {CONFIRMED_GUARDS} when we mark a circuit built through a guard as "for user traffic."
When we say "add", do we mean "append" here?
To compute primary guards, take the ordered intersection of {CONFIRMED_GUARDS} and {FILTERED_GUARDS}, and take the first {N_PRIMARY_GUARDS} elements. If there are fewer than {N_PRIMARY_GUARDS} elements, add additional elements to PRIMARY_GUARDS chosen _uniformly_ at random from ({FILTERED_GUARDS} - {CONFIRMED_GUARDS}).
Similarly, does "add" mean "append"?
Once an element has been added to {PRIMARY_GUARDS}, we do not remove it until it is replaced by some element from {CONFIRMED_GUARDS}. Confirmed elements always proceed unconfirmed ones in the {PRIMARY_GUARDS} list.
This one looks like a bug as currently stated: by "proceed", do we mean "precede"?
All the suggestions above seem correct. We should open a ticket and make a patch.
B) In Sec 4.10, whenever we get a new consensus, we:
For every guard in {SAMPLED_GUARDS}, we update {IS_LISTED} and {FIRST_UNLISTED_AT}.
In the old design, we also believed the Running flag in the new consensus, that is, we marked the guards as reachable again. It looks from sampled_guards_update_from_consensus() like we no longer do that.
Is that old behavior considered a bug and we intentionally stopped, or did we not consider it?
I am ok with "that was stupid behavior, so we stopped", but if we didn't know that we used to do it, maybe we should decide whether it is stupid behavior. :)
Hmm, I also don't see us currently doing this, but it might be worth doing.
I imagine it's not going to do much in most cases since primary guards have a pretty short retry period, but still it might be a good idea since it will keep our guard list more up to date.
Perhaps we should consider adding this behavior to the spec and code.
C) In Sec A.4, in the state file we have
"bridge_addr" -- If the guard is a bridge, its configured address and OR port. Optional.
How does this play with bridges that have pluggable transports? In my "bridge obfs2 128.31.0.34:51715" line, the ORPort of the bridge is not listed.
It looks from https://trac.torproject.org/projects/tor/ticket/21027 like we did some fixing here, but the spec didn't get an update?
Hmm, seems like that part of the spec is not really accurate indeed!
Maybe we can just update the spec and say that `bridge_addr` contains the configured address and port of the bridge which can be either the ORPort or PT port?
D) In Sec 4.8, when a circuit succeeds:
* If this circuit was <usable_if_no_better_guard>, it is now <waiting_for retry>. You may not yet attach streams to it. Then check whether the {last_time_on_internet} is more than {INTERNET_LIKELY_DOWN_INTERVAL} seconds ago: * If it is, then mark all {PRIMARY_GUARDS} as "maybe" reachable. * If it is not, update the list of waiting circuits. (See [UPDATE_WAITING] below)
[Where INTERNET_LIKELY_DOWN_INTERVAL has been picked as 10 minutes.]
To make sure I understand this one, consider the following scenario:
- We go offline, and our current circuits fail.
- We try to make new circuits through each of our primary guards, failing for each (because we're offline) and marking them down.
- Then we move to the remaining confirmed+usable ones, and mark those down too.
- Then we work through the rest of USABLE_FILTERED_GUARDS, marking them down too. As we mark them down, USABLE_FILTERED_GUARDS shrinks, causing us to add new elements to SAMPLED_GUARDS to replenish USABLE_FILTERED_GUARDS.
- After we've brought SAMPLED GUARDS to 60, we stop adding new ones. At this point, we are out of tries:
Seems like a reasonable description. I haven't looked at this part of the code in a while to be 100% sure of what you described being true.
- Otherwise, if USABLE_FILTERED_GUARDS is empty, we have exhausted all the sampled guards. In this case we proceed by marking all guards as <maybe> reachable so that we can keep on sampling.
Does this mean that we mark our *already-sampled* guards as reachable, thus making USABLE_FILTERED_GUARDS big again, and we loop through the above steps again, and we keep looping until we come back online?
The above "so that we can keep on sampling" phrase confuses me, because it seems like there's no way that we would arrive at "USABLE_FILTERED_GUARDS is empty" without also having SAMPLED_GUARDS at its maximum size:
Hmm, I think "we can keep on sampling" actually means keep on sampling new guards for circuits, and not "keep on sampling new SAMPLED_GUARDS". I think we overloaded the term "sample" there; we should fix it.
(FWIW, that functionality was added in #21052.)
Whenever we are going to sample from {USABLE_FILTERED_GUARDS}, and it contains fewer than {MIN_FILTERED_SAMPLE} elements, we add new elements to {SAMPLED_GUARDS} until one of the following is true: * {USABLE_FILTERED_GUARDS} is large enough, OR * {SAMPLED_GUARDS} is at its maximum size.
So, perhaps we mean "so that we can keep on trying circuits"?
Yep.
Or, oh! Do we mean "sampling from USABLE_FILTERED_GUARDS", not "sampling from GUARDS"?
Ok, proceeding with my scenario: let's further say that we're offline in such a way that we get quick failures for the connect attempts, so we burn through each connect attempt at one per second, meaning we blow through our CONFIRMED set within the first 10 minutes of going offline. Let's also say we come back online within those 10 minutes. A circuit (to some guard in SAMPLED_GUARDS) succeeds, and the circuit is now of type <waiting_for_better_guard>.
Since last_time_on_internet isn't very long ago, and there aren't any other circuits open, the new circuit becomes <complete>, and we mark this new guard as CONFIRMED, and stick it on the end of the CONFIRMED set.
We keep using the newly CONFIRMED guard (since it's the only one we think is reachable) for about a half hour, at which point the PRIMARY_GUARDS_RETRY_SCHED schedule makes us forgive our primary guards, and we switch back to our favorite primary guard.
Do I have it right?
I think so yes.
Roger Dingledine arma@mit.edu writes:
Hi folks!
I'm catching up on my proposals, and I really like guard-spec.txt. Nicely done!
I made some fixes that I hope are straightforward: https://lists.torproject.org/pipermail/tor-commits/2017-May/122942.html https://lists.torproject.org/pipermail/tor-commits/2017-May/122948.html https://lists.torproject.org/pipermail/tor-commits/2017-May/122986.html
And I have four remaining questions/topics that could use some feedback from the spec authors/implementors rather than just having me bust in and change things.
Second email:
Please check branch `guard_spec_roger` in my torspec repo for fixes to all the issues below (I think), apart from issue (B) which probably deserves its own ticket.
Feel free to merge, and let me know if I missed something.
A) We don't really say what we mean by "adding" a guard to an ordered list. In particular:
We add new members to {CONFIRMED_GUARDS} when we mark a circuit built through a guard as "for user traffic."
When we say "add", do we mean "append" here?
To compute primary guards, take the ordered intersection of {CONFIRMED_GUARDS} and {FILTERED_GUARDS}, and take the first {N_PRIMARY_GUARDS} elements. If there are fewer than {N_PRIMARY_GUARDS} elements, add additional elements to PRIMARY_GUARDS chosen _uniformly_ at random from ({FILTERED_GUARDS} - {CONFIRMED_GUARDS}).
Similarly, does "add" mean "append"?
Once an element has been added to {PRIMARY_GUARDS}, we do not remove it until it is replaced by some element from {CONFIRMED_GUARDS}. Confirmed elements always proceed unconfirmed ones in the {PRIMARY_GUARDS} list.
This one looks like a bug as currently stated: by "proceed", do we mean "precede"?
B) In Sec 4.10, whenever we get a new consensus, we:
For every guard in {SAMPLED_GUARDS}, we update {IS_LISTED} and {FIRST_UNLISTED_AT}.
In the old design, we also believed the Running flag in the new consensus, that is, we marked the guards as reachable again. It looks from sampled_guards_update_from_consensus() like we no longer do that.
Is that old behavior considered a bug and we intentionally stopped, or did we not consider it?
I am ok with "that was stupid behavior, so we stopped", but if we didn't know that we used to do it, maybe we should decide whether it is stupid behavior. :)
C) In Sec A.4, in the state file we have
"bridge_addr" -- If the guard is a bridge, its configured address and OR port. Optional.
How does this play with bridges that have pluggable transports? In my "bridge obfs2 128.31.0.34:51715" line, the ORPort of the bridge is not listed.
It looks from https://trac.torproject.org/projects/tor/ticket/21027 like we did some fixing here, but the spec didn't get an update?
D) In Sec 4.8, when a circuit succeeds:
* If this circuit was <usable_if_no_better_guard>, it is now <waiting_for retry>. You may not yet attach streams to it. Then check whether the {last_time_on_internet} is more than {INTERNET_LIKELY_DOWN_INTERVAL} seconds ago: * If it is, then mark all {PRIMARY_GUARDS} as "maybe" reachable. * If it is not, update the list of waiting circuits. (See [UPDATE_WAITING] below)
[Where INTERNET_LIKELY_DOWN_INTERVAL has been picked as 10 minutes.]
To make sure I understand this one, consider the following scenario:
- We go offline, and our current circuits fail.
- We try to make new circuits through each of our primary guards, failing for each (because we're offline) and marking them down.
- Then we move to the remaining confirmed+usable ones, and mark those down too.
- Then we work through the rest of USABLE_FILTERED_GUARDS, marking them down too. As we mark them down, USABLE_FILTERED_GUARDS shrinks, causing us to add new elements to SAMPLED_GUARDS to replenish USABLE_FILTERED_GUARDS.
- After we've brought SAMPLED GUARDS to 60, we stop adding new ones. At this point, we are out of tries:
- Otherwise, if USABLE_FILTERED_GUARDS is empty, we have exhausted all the sampled guards. In this case we proceed by marking all guards as <maybe> reachable so that we can keep on sampling.
Does this mean that we mark our *already-sampled* guards as reachable, thus making USABLE_FILTERED_GUARDS big again, and we loop through the above steps again, and we keep looping until we come back online?
The above "so that we can keep on sampling" phrase confuses me, because it seems like there's no way that we would arrive at "USABLE_FILTERED_GUARDS is empty" without also having SAMPLED_GUARDS at its maximum size:
Whenever we are going to sample from {USABLE_FILTERED_GUARDS}, and it contains fewer than {MIN_FILTERED_SAMPLE} elements, we add new elements to {SAMPLED_GUARDS} until one of the following is true: * {USABLE_FILTERED_GUARDS} is large enough, OR * {SAMPLED_GUARDS} is at its maximum size.
So, perhaps we mean "so that we can keep on trying circuits"?
Or, oh! Do we mean "sampling from USABLE_FILTERED_GUARDS", not "sampling from GUARDS"?
Ok, proceeding with my scenario: let's further say that we're offline in such a way that we get quick failures for the connect attempts, so we burn through each connect attempt at one per second, meaning we blow through our CONFIRMED set within the first 10 minutes of going offline. Let's also say we come back online within those 10 minutes. A circuit (to some guard in SAMPLED_GUARDS) succeeds, and the circuit is now of type <waiting_for_better_guard>.
Since last_time_on_internet isn't very long ago, and there aren't any other circuits open, the new circuit becomes <complete>, and we mark this new guard as CONFIRMED, and stick it on the end of the CONFIRMED set.
We keep using the newly CONFIRMED guard (since it's the only one we think is reachable) for about a half hour, at which point the PRIMARY_GUARDS_RETRY_SCHED schedule makes us forgive our primary guards, and we switch back to our favorite primary guard.
Do I have it right?
Thanks! --Roger
tor-dev mailing list tor-dev@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
On Fri, May 19, 2017 at 8:18 AM, George Kadianakis desnacked@riseup.net wrote:
Roger Dingledine arma@mit.edu writes:
Hi folks!
I'm catching up on my proposals, and I really like guard-spec.txt. Nicely done!
I made some fixes that I hope are straightforward: https://lists.torproject.org/pipermail/tor-commits/2017-May/122942.html https://lists.torproject.org/pipermail/tor-commits/2017-May/122948.html https://lists.torproject.org/pipermail/tor-commits/2017-May/122986.html
And I have four remaining questions/topics that could use some feedback from the spec authors/implementors rather than just having me bust in and change things.
Second email:
Please check branch `guard_spec_roger` in my torspec repo for fixes to all the issues below (I think), apart from issue (B) which probably deserves its own ticket.
Feel free to merge, and let me know if I missed something.
Looks good -- merged that to torspec.git. Roger, please let us know is there's something more to do.
cheers,