[tor-dev] Proposal 259: New Guard Selection Behaviour

Fri Mar 25 11:26:35 UTC 2016

Tim Wilson-Brown - teor <teor2345 at gmail.com> writes:

>
>> On 25 Mar 2016, at 00:31, George Kadianakis <desnacked at riseup.net> wrote:
>> 
>> Tim Wilson-Brown - teor <teor2345 at gmail.com> writes:
>> 
>>> 
>>>> On 24 Mar 2016, at 22:55, George Kadianakis <desnacked at riseup.net> wrote:
>>>> 
>>>>
>>>> <snip>
>>>>
>> I think Reinaldo et al. were also thinking of incorporating the
>> ReachableAddresses logic in there, so that DYSTOPIC_GUARDS changes based on the
>> reachability settings of the client. I'm not sure exactly how that would work,
>> especially when the user can change ReachableAddresses at any moment. I think
>> we should go for the simplest thing possible here, and improve our heuristics
>> in the future based on testing.
>
> I suggest that we compose the set of UTOPIC guards based on addresses that are reachable and preferred (or, if there are no guards with preferred addresses, those guards that are reachable). I suggest that we use the same mechanism with DYSTOPIC guards, but add a port restriction to 80 & 443 to all the other restrictions. (This may result in the empty set.)
>

Alright, this seems like a good process here. We should do it like that.
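
To make the suggestion above concrete, here is a minimal sketch of how the
two sets could be composed. It is only an illustration: the Guard fields and
function names are hypothetical, and it assumes the port restriction is
applied before the fallback from "preferred" to "reachable".

```python
from dataclasses import dataclass

DYSTOPIC_PORTS = {80, 443}


@dataclass(frozen=True)
class Guard:
    name: str
    port: int
    reachable: bool  # hypothetical flag: allowed by ReachableAddresses
    preferred: bool  # hypothetical flag: address is a preferred one


def usable_guards(guards, ports=None):
    """Return reachable and preferred guards, else all reachable ones."""
    candidates = [
        g for g in guards
        if g.reachable and (ports is None or g.port in ports)
    ]
    preferred = [g for g in candidates if g.preferred]
    # Fall back to merely reachable guards if none are preferred.
    return preferred if preferred else candidates


def utopic_guards(guards):
    return usable_guards(guards)


def dystopic_guards(guards):
    # Same mechanism, plus the 80/443 restriction; may return an empty list.
    return usable_guards(guards, ports=DYSTOPIC_PORTS)
```

Since both sets are recomputed from the current settings, a change to
ReachableAddresses would simply produce different sets on the next call.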

What happens if a utopic guard suddenly is not included in the
ReachableAddresses anymore? Maybe we mark it as 'bad' (the same way we mark
relays that leave the consensus).

>>
>> <snip>
>>
>> I think the current proposal tries to balance this, by enabling this heuristic
>> only after Alice exhausts her utopic guardlist. Also, keep in mind that the
>> utopic guardlist might contain 80/443 guards as well. So if Alice is lucky, she
>> got an 80/443 guard in her utopic guard list, and she will still bootstrap
>> before the dystopic heuristic triggers.
>> 
>> There are various ways to make this heuristic more "intelligent", but I would
>> like to maintain simplicity in our design (both simple to understand and to
>> implement). For example, we could imagine that we always put some 80/443 guards
>> as our primary guards, or in the utopic guardlist. Or, that we reduce the 2%
>> requirement so that we trigger the dystopic heuristic faster.
>
> Or that tor can get a hint about which ports it can access based on which ports it used to bootstrap.
> (See below for details.)
>

Yes, could be.

How would that work though?
And what happens if the network changes? How does the hint work then?

>> Currently, I'm hoping that we will understand the value of this heuristic
>> better when we implement it, and test it on real networks...
>> 
>> Any suggestions?
>
> There's a whole lot of my thoughts below.
>
> Why such a large list of guards?
>
> Apart from the fingerprinting issue (which I think gets worse with a larger list, at least if it's tried in order), I wonder why we bother trying such a large UTOPIC guardlist.
> Surely after you've tried 10 guards, the chance that the 11th is going to connect is vanishingly small.
> (Unless it's on a different port or netblock, I guess.)
> And if our packets are reaching the guard, and being dropped on the way back, we have to consider the load this places on the network.
>

Indeed, I also feel that 80 guards is a lot of guards to try before switching to dystopic mode.

I would be up for reducing it. I wonder what's the right number here.

My fear with having a small number of sampled guards in a guardlist is that if
all of them go down at the same time, then that guardlist is useless.

Also, this reminds me that the proposal does not precisely specify what happens
when guards in SAMPLED_UTOPIC_GUARDS become bad (they drop out of the
consensus).  Do we keep them on the list but marked as bad? What happens if
lots of them become bad? When do we add new guards? Currently the proposal only
says:

      It will be filled in by the algorithm if it's empty, or if it contains
      less than SAMPLE_SET_THRESHOLD guards after winnowing out older
      guards. It should be filled by using NEXT_BY_BANDWIDTH with UTOPIC_GUARDS
      as an argument.

I think we should be more specific here.
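
As a starting point for that discussion, here is one possible reading of the
quoted rule as a sketch. Everything in it is an assumption: it drops guards
that left the utopic set (rather than keeping them marked as bad), it treats
SAMPLE_SET_THRESHOLD as a plain count, it refills up to that count, and it
models NEXT_BY_BANDWIDTH as bandwidth-weighted random choice.

```python
import random


def next_by_bandwidth(candidates, rng):
    """Pick one guard with probability proportional to its bandwidth.

    `candidates` maps guard name -> bandwidth (assumed positive).
    """
    names = list(candidates)
    weights = [candidates[n] for n in names]
    return rng.choices(names, weights=weights, k=1)[0]


def refill_sampled(sampled, utopic, threshold, rng=None):
    """Top up `sampled` (list of names) from `utopic` (name -> bandwidth)."""
    rng = rng or random.Random()
    # Winnow: drop guards that are no longer in the utopic set.
    sampled = [g for g in sampled if g in utopic]
    remaining = {g: bw for g, bw in utopic.items() if g not in sampled}
    # Refill only while we are below the threshold and have candidates.
    while len(sampled) < threshold and remaining:
        pick = next_by_bandwidth(remaining, rng)
        sampled.append(pick)
        del remaining[pick]
    return sampled
```

Keeping bad guards on the list would only change the winnowing step, which is
exactly the part the proposal leaves open.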

> Client Bootstrap
>
> The proposal ignores client bootstrap.
>
> There are a limited number of hard-coded authorities and fallback directories available during client bootstrap.
> The client doesn't select guards until it has bootstrapped from one of the 9 authorities or 20-200 fallback directories.
>

What do you think should be mentioned here?

> Bootstrap / Launch Time
>
> The proposal calculates bootstrap and launch time incorrectly.
>
> The proposal assumes that Tor attempts to connect to each guard and waits for failure before trying another. But this isn't how Tor actually works - it sometimes tries multiple connections simultaneously. So summing the times for individual connection attempts to each guard doesn't provide an accurate picture of the actual connection time.
>
> When bootstrapping in 0.2.7 and earlier, tor will try an authority, wait up to 10 seconds for it to fail, then try another.
> Then there's a 60 second wait before the third authority, but at that point the user has likely lost interest.
>
> In 0.2.8, tor connects to authorities and fallbacks concurrently. It will try 3 fallbacks and 1 authority in the first 10 seconds, and download from whichever one connects first. So 0.2.8 is far more likely to connect within a few seconds.
>
> In all current versions, tor then downloads the consensus (~1.5MB, could take 10 seconds or more), and chooses directory guards.
> Then it simultaneously connects to 3 directory guards to download certificates and descriptors.
> Tor works out whether a connection to one directory guard has succeeded while the timeouts for the other directory guards are still running.
>
> So under this proposal, it would really take tor:
> 10 seconds for initial bootstrap
> 20 seconds (or more) to download the consensus
> 600 seconds / 3 directory guards = 200 seconds to exhaust its UTOPIC guardlist

Where does the "600 seconds" figure come from here?
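
For reference, the arithmetic in the quoted estimate works out as below. The
600 second figure is taken as given here; the variable names are only for
illustration.

```python
INITIAL_BOOTSTRAP_S = 10
CONSENSUS_DOWNLOAD_S = 20          # "20 seconds (or more)" in the estimate
UTOPIC_TOTAL_TIMEOUT_S = 600       # taken as given from the quoted estimate
PARALLEL_DIRECTORY_GUARDS = 3

# Three guards are tried at once, so the list is exhausted three times faster.
exhaust_s = UTOPIC_TOTAL_TIMEOUT_S / PARALLEL_DIRECTORY_GUARDS
total_s = INITIAL_BOOTSTRAP_S + CONSENSUS_DOWNLOAD_S + exhaust_s

print(exhaust_s, total_s)  # 200.0 230.0
```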

> (tor skips the first two phases if it has a live consensus)
>
> Can we revise the proposal to take this into account?
>

Are you talking about section 4? Yes, that could be rewritten a bit.

However, I think that section does not specifically talk about bootstrap as you
seem to be doing.

So, if you have Tor running and you move your laptop to a network with
FascistFirewall, you will not be bootstrapping again with 3 directory
guards. Instead, you are going to be walking over the guard list with a single
guard. So in that case section 4 will be more accurate.
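
A small model of the difference between the two cases, as I understand them.
The guard count and per-guard timeout below are made-up numbers, not values
from the proposal, and the model assumes every attempt runs to its full
timeout.

```python
import math


def seconds_to_exhaust(num_guards, per_guard_timeout_s, concurrency=1):
    """Worst-case time to try every guard when all attempts time out."""
    # Guards are tried in rounds of `concurrency` simultaneous attempts.
    rounds = math.ceil(num_guards / concurrency)
    return rounds * per_guard_timeout_s


# Hypothetical numbers: 30 guards, 10 second timeout each.
print(seconds_to_exhaust(30, 10))                 # single guard at a time: 300
print(seconds_to_exhaust(30, 10, concurrency=3))  # three at a time: 100
```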

Or am I wrong?