[tor-dev] Proposal 259: New Guard Selection Behaviour

Sat Mar 26 18:42:29 UTC 2016

Hello,

teor, asn, see comments inline.

On 3/24/2016 5:00 PM, Tim Wilson-Brown - teor wrote:
[snip]
>>> The number of directory guards will increase when 0.2.8-stable is
>>> released and relays and clients upgrade.
>>> In 0.2.8, relays accept tunnelled directory connections even if they
>>> do not have an open DirPort.
>>>
>>
>> Indeed, soon enough all guards will be directory guards.
> 
> Almost all guards will be directory guards. AccountingMax can disable
> tunnelled directory fetches, as can DirCache 0.

I guess the guards that won't be accepting tunneled BEGIN_DIR
connections because of AccountingMax or DirCache 0 will also advertise
this in their descriptors, so these relays will not get a `V2Dir` flag.
Can you confirm whether this is actually true? I assume the code has to
do this; otherwise, how can a client know whether it can initiate a
tunneled BEGIN_DIR connection with a relay?

The simplest thing is to make the guard also be the directory guard. As
Roger suggested in the "Notes from the prop259 proposal reading group"
thread, we might make the authorities assign the 'Guard' flag only to
relays that also have the 'V2Dir' flag, in addition to the existing
requirements. Until then, we would require a relay to have a DirPort set
in order to be a Guard, but this will add extra complexity (what if the
ORPort is on 443 and the DirPort is on 9030 - is this guard utopic or
dystopic?).

>>> [snip]
>>> Feedback on specific sections:
>>>
>>>> Under dystopic conditions (when a firewall is in place that blocks
>>>> all ports except for potentially port 80 and 443), this algorithm
>>>> will try to connect to 2% of all guards before switching modes to try
>>>> dystopic guards. Currently, that means trying to connect to circa 40
>>>> guards before getting a successful connection. If we assume a
>>>> connection try will take maximum 10 seconds, that means it will take
>>>> up to 6 minutes to get a working connection.
>>>
>>> This seems far too long for most users. Usability studies have
>>> demonstrated that users give up after approximately 30 seconds.
>>>
>>> Can we design an algorithm that will automatically choose a dystopic
>>> guard and bootstrap within 30 seconds?
>>> What are the security tradeoffs if we do?
>>
>> OK, let's assume that a connection failed timeout might take up to 10
>> seconds.
>>
>> If Alice is behind a FascistFirewall and we want her to bootstrap
>> within 30 seconds, this means that she always needs to have an 80/443
>> guard in her top three choices. This means, that we would heavily
>> prioritize 80/443 guards over the rest, and an adversary who sets up
>> 80/443 guards will attract more clients.
> 
> This isn't how Tor works - it tries multiple guards simultaneously.
> (See below for details.)
> Can we rework this calculation to take that into account?
> 
>> I think the current proposal tries to balance this, by enabling this
>> heuristic only after Alice exhausts her utopic guardlist. Also, keep in
>> mind that the utopic guardlist might contain 80/443 guards as well. So
>> if Alice is lucky, she got an 80/443 guard in her utopic guard list, and
>> she will still bootstrap before the dystopic heuristic triggers.
>>
>> There are various ways to make this heuristic more "intelligent", but I
>> would like to maintain simplicity in our design (both simple to
>> understand and to implement). For example, we could imagine that we
>> always put some 80/443 guards as our primary guards, or in the utopic
>> guardlist. Or, that we reduce the 2% requirement so that we go trigger
>> the dystopic heuristic faster.
> 

I agree that the maximum total time at client side to get a working
connection is probably too much. However, I think asn's arguments
about _ensuring_ we keep at least n dystopic guards in our
PRIMARY_GUARDS list:
a) overloading 80/443 (dystopic guards);
b) creating incentives for attackers to run 80/443 (dystopic guards)
that will give them unfair probabilities to be picked by clients;

are very important, and it could be worth making this tradeoff and
accepting a longer maximum time to get a working connection at the
client side.
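
To make the numbers in this discussion concrete, here is a rough
back-of-the-envelope sketch in Python. It assumes the figures quoted
above (circa 40 guards, 10 seconds per failed attempt) and a deliberately
simplified model in which attempts run in fixed-size batches that each
wait out the full timeout. The function name and its parameters are
hypothetical, made up for illustration; this is not how Tor schedules
connections.

```python
import math

def worst_case_seconds(guards_to_try, timeout_s=10, parallel=1):
    """Upper bound on the time until the last guard in the list has failed.

    Simplified model (an assumption, not Tor's real scheduler): guards are
    tried in batches of `parallel`, and every batch waits the full timeout.
    """
    if parallel < 1:
        raise ValueError("parallel must be at least 1")
    batches = math.ceil(guards_to_try / parallel)
    return batches * timeout_s

# One guard at a time, as the proposal text assumes.
print(worst_case_seconds(40))              # 400 seconds

# Three attempts at a time, as teor describes for directory guards.
print(worst_case_seconds(40, parallel=3))  # 140 seconds
```

Under this model, 40 sequential attempts come to 400 seconds, slightly
more than the "up to 6 minutes" in the proposal text, and either figure
is far beyond the 30 seconds teor mentions.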

As I understand it, the utopic guard list _can_ also contain dystopic
guards, so a client behind a FascistFirewall might be lucky and not have
to wait until the utopic guard list is entirely exhausted. This is
better, but I still think it would be simpler if, instead of 2 guard lists:
- SAMPLED_UTOPIC_GUARDS
- SAMPLED_DYSTOPIC_GUARDS

we create a single SAMPLED_GUARDS list, and make the selection by
taking into account the ratio of utopic and dystopic guards based on
their weights from the last consensus. I suggested a simple example
of this a few months ago in this post:

https://lists.torproject.org/pipermail/tor-dev/2015-November/009871.html

If we compute the guard list like this, load balancing shouldn't be
affected in any way (we use the weights to build the list, not the
number of relays). I see the algorithm has improved a lot and now
covers many aspects we didn't consider initially, but I still don't
understand why we need two separate lists of utopic guards and dystopic
guards when we can create a single list.

This will also allow us to safely reduce, by a little, the total number
of guards we are willing to try, while making sure that clients behind
FascistFirewalls still get a chance and also taking into account teor's
concern about not making this list too big.

> [snip]
> Client Bootstrap
> 
> The proposal ignores client bootstrap.
> 
> There are a limited number of hard-coded authorities and fallback
> directories available during client bootstrap.
> The client doesn't select guards until it has bootstrapped from one of
> the 9 authorities or 20-200 fallback directories.
> 

I think this step happens before prop#259 does its magic, since prop#259
first needs a consensus before it can work. Let's call this initial
(genesis) bootstrap Step 0: only after a client has bootstrapped
(either from an authority or from a fallback directory) will it initiate
prop#259 to pick a guard.

> Bootstrap / Launch Time
> 
> The proposal calculates bootstrap and launch time incorrectly.
> 
> The proposal assumes that Tor attempts to connect to each guard, waits
> for failure before trying another. But this isn't how Tor actually works
> - it sometimes tries multiple connections simultaneously. So summing the
> times for individual connection attempts to each guard doesn't provide
> an accurate picture of the actual connection time.
> 
> When bootstrapping in 0.2.7 and earlier, tor will try an authority, wait
> up to 10 seconds for it to fail, then try another.
> Then there's a 60 second wait before the third authority, but at that
> point the user has likely lost interest. 
> 
> In 0.2.8, tor connects to authorities and fallbacks concurrently. It
> will try 3 fallbacks and 1 authority in the first 10 seconds, and
> download from whichever one connects first. So 0.2.8 is far more likely
> to connect within a few seconds.
> 
> In all current versions, tor then downloads the consensus (~1.5MB, could
> take 10 seconds or more), and chooses directory guards.
> Then it simultaneously connects to 3 directory guards to download
> certificates and descriptors.
> The time it takes tor to work out if a connection to a directory guard
> has succeeded happens simultaneously with other directory guard timeouts.
> 

Hmm. This requires some thinking. So Tor connects to the directory
guards immediately after it gets a consensus, to get the certificates
and descriptors. Plausible. I assume it does this via HTTP fetch on the
DirPort, since it has _no_ certificates and descriptors for routers.
Doesn't Tor need these certificates and descriptors to initiate tunneled
BEGIN_DIR requests with certain relays?

How will this work once DirPort is deprecated entirely? Or is removing
DirPort from relays not part of the plan?

Under prop#259 we can, for example, initiate 3 simultaneous tunneled
BEGIN_DIR connections with the top 3 guards in PRIMARY_GUARDS to fetch
the certificates and descriptors. Until all relays update to recent
enough Tor versions, we initiate 3 HTTP GET connections to the DirPorts
of the top 3 guards in PRIMARY_GUARDS. This shouldn't affect the client's
anonymity or expose it too much: on the one hand, it's just for the
certificates and descriptors, and on the other hand, a client is exposed
to the guards in its PRIMARY_GUARDS list all the time anyway, for reasons
of usability, performance, overloading of some guards, etc.
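
The "try the top 3 guards at once and use whichever answers first"
idea can be sketched with Python's asyncio. This is only an illustration
of the concurrency pattern: `first_reachable_guard` and the `connect`
callback are hypothetical names, the callback stands in for whatever
actually opens the connection (tunneled BEGIN_DIR or DirPort fetch), and
none of this reflects Tor's real implementation.

```python
import asyncio

async def first_reachable_guard(primary_guards, connect, timeout=10.0, parallel=3):
    """Try the top `parallel` guards at once; return the first that connects.

    `connect` is an async callable taking a guard and raising on failure
    (a placeholder for the real connection logic). Returns None if every
    attempt fails or times out.
    """
    tasks = {
        asyncio.create_task(asyncio.wait_for(connect(guard), timeout)): guard
        for guard in primary_guards[:parallel]
    }
    pending = set(tasks)
    winner = None
    try:
        while pending and winner is None:
            done, pending = await asyncio.wait(
                pending, return_when=asyncio.FIRST_COMPLETED
            )
            for task in done:
                # exception() is None only when the attempt succeeded.
                if task.exception() is None and winner is None:
                    winner = tasks[task]
    finally:
        for task in pending:
            task.cancel()  # stop attempts we no longer need
    return winner
```

With this pattern a failed or slow guard only costs time if all of the
parallel attempts fail, so the worst case per batch is one timeout rather
than three.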

> 
> Other Considerations
> 
> We're considering increasing the 10 second stream attach timeout to
> support users on slow and unreliable network connections (#16844). We
> should think about the impact of that on this proposal - I'd hate to
> double the time it takes tor to exhaust its UTOPIC guardlist.
> 
This is correct.

Also, about the FascistFirewall torrc option: prop#259 sounds like it
will take care of users behind FascistFirewalls by default, so should we
eliminate the option entirely for simplicity? Or should FascistFirewall 1
tell prop#259 to populate the SAMPLED_GUARDS list only with dystopic
guards, OR to use only a SAMPLED_DYSTOPIC_GUARDS list if we choose to
keep the two lists disjoint?
