Re: [tor-dev] Update on 259

6 Apr 2016

Hey,

> > - OrPort vs DirPort
> > ORPort is used for regular circuits, while DirPort is used when getting directory information. We need to interpret reachable stuff
> > differently depending on the purpose.
> >
>
> I'm not actually sure what the comment means here.
This was more for our own benefit. The OrPort vs DirPort distinction
has been a bit complicated so far. The comment basically means, when
we are looking up directory information, we should use the DirPort to
decide reachability and so on instead, correct?
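
To make sure we read it the same way, here is a minimal sketch of that
interpretation in Python. Everything in it (Guard, port_for_purpose,
is_reachable, the purpose strings) is a made-up name for illustration and
not actual tor code, and it assumes the reading above is the intended one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Guard:
    """Hypothetical guard record, not tor's real data structure."""
    nickname: str
    or_port: int
    dir_port: Optional[int] = None  # None: no DirPort advertised


def port_for_purpose(guard, purpose):
    """Return the port that matters for the given purpose, or None."""
    if purpose == "circuit":
        return guard.or_port
    if purpose == "directory":
        return guard.dir_port
    raise ValueError(f"unknown purpose: {purpose!r}")


def is_reachable(guard, purpose, reachable_ports):
    """Judge reachability on the port relevant to this purpose only."""
    port = port_for_purpose(guard, purpose)
    return port is not None and port in reachable_ports
```

With reachable ports {80, 443}, a guard with ORPort 9001 and DirPort 80
would count as reachable for directory fetches but not for circuits.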

> Ensuring a min percentage of dirguards in our sampled set could work. Then,
> when we need a directory guard, we could filter the sampled set and only
> examine guards that can do directory requests.
Yeah, we talked about this yesterday and our current thinking is to
have a single sampled set that contains every kind of guard, and then
filter it dynamically based on config and so on during START.
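
Roughly, the filtering could look like the sketch below. This is only an
illustration under assumptions: Guard, Config, require_ipv6 and
filter_sampled are hypothetical names, and the two config flags merely
stand in for an IPv6 restriction and for FascistFirewall.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Guard:
    """Hypothetical guard record for illustration."""
    nickname: str
    or_port: int
    has_ipv6: bool = False
    is_dirguard: bool = False


@dataclass(frozen=True)
class Config:
    """Hypothetical stand-ins for the relevant client settings."""
    require_ipv6: bool = False       # assumed IPv6 restriction
    fascist_firewall: bool = False   # stands in for FascistFirewall
    firewall_ports: frozenset = frozenset({80, 443})


def filter_sampled(sampled, config, need_directory=False):
    """Return the guards from one mixed sampled set that fit the config."""
    result = []
    for g in sampled:
        if config.require_ipv6 and not g.has_ipv6:
            continue
        if config.fascist_firewall and g.or_port not in config.firewall_ports:
            continue
        if need_directory and not g.is_dirguard:
            continue
        result.append(g)
    return result
```

The sampled set itself is never changed by the filter, so switching an
option off again gives back the full set.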

> Hm, are you talking about the guardlists here? What's the question?
>
> BTW, if we have the ability to do "ensure a min percentage of X in our sampled
> set", couldn't we just ensure a min percentage of dystopic guards in our
> sampled set?  And forget about the two separate guardlists?
>
> In this case we can have the percentage value be the actual portion of the
> network that is dystopic guards. So if 20% of the total guard bandwidth is
> dystopic, we could ensure that at least 20% of our sampled set is
> dystopic.
Well, the problem is really that the idea of dystopic doesn't
necessarily make sense, since it's so dependent on the current network
position of the client. Our current thinking is to do away with that
concept as well. =)

> > - DYSTOPIC - is there value in trying 80 and 443?
> > Probably not.
> >
>
> What does "trying" mean in this case?
Falling back to guards that listen on ports 80 and 443.

> Restart pending guard selection algorithms on a SIGHUP? Plausible, but I don't
> know how hard it would be to implement this.
Well, the alternative is to just finish the running guard selections
with the old settings, but use the new settings for new algorithm instances.
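
A minimal sketch of that alternative, assuming each algorithm instance
takes a snapshot of the settings when it starts. GuardSelection,
SelectionManager and on_sighup are hypothetical names, not existing tor
functions.

```python
import copy


class GuardSelection:
    """One running instance of the selection algorithm (hypothetical)."""

    def __init__(self, settings):
        # Snapshot the settings so later reloads cannot change this run.
        self.settings = copy.deepcopy(settings)


class SelectionManager:
    """Tracks current settings and hands them to new instances."""

    def __init__(self, settings):
        self.current = copy.deepcopy(settings)
        self.running = []

    def start_selection(self):
        selection = GuardSelection(self.current)
        self.running.append(selection)
        return selection

    def on_sighup(self, new_settings):
        # Running selections keep their snapshot; only new ones see this.
        self.current = copy.deepcopy(new_settings)
```

This avoids having to restart anything on a SIGHUP, at the cost of a
short period where old and new settings are both in use.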

> That's not very nice because the USED_GUARDS set that was created when
> ClientUseIPv6 or FascistFirewall were on will have reduced diversity. Then
> even if we switch off those options, we are still stuck with reduced diversity.
>
> I'm not sure what's the right way to do this here!
>
> We could imagine having multiple USED_GUARDS sets, where we make a new set for
> each possible filter. This might be worth considering, but I imagine there will
> be technical difficulties. e.g. when a guard goes down, you need to update its
> state in all the USED_GUARDS sets that it's in. Also, a person who toggles the
> FascistFirewall option frequently, will end up using two different sets of
> guards all the time which is suboptimal.
Well, one thing you could do is hash the settings (and maybe also
reachable ports) and use that as a key to differentiate the different
USED_GUARDS. That would solve the problem, but might lead to a single
client using lots of different guards in different locations. Might
that be OK?
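
Something like the following, as a rough sketch only. The choice of
SHA-256 over a canonical JSON encoding is just an assumption for the
example, and settings_key and UsedGuards are hypothetical names.

```python
import hashlib
import json


def settings_key(settings, reachable_ports=()):
    """Derive a stable key from the settings and reachable ports."""
    canonical = json.dumps(
        {"settings": settings, "ports": sorted(set(reachable_ports))},
        sort_keys=True,  # same settings in any order give the same key
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class UsedGuards:
    """Keeps one USED_GUARDS list per distinct settings key."""

    def __init__(self):
        self._sets = {}

    def for_settings(self, settings, reachable_ports=()):
        key = settings_key(settings, reachable_ports)
        return self._sets.setdefault(key, [])
```

Toggling an option back and forth would then alternate between two
stable lists instead of polluting a single one.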

> > - Can we make the lists smaller?
> > Probably. Maybe a sampled set of 30 guards? Or 1.5%?
> >
>
> Plausible. However, if we take the filtering approach but use a small sampled
> guards list, it could happen that the list is not able to satisfy some of our
> filtering restrictions.
>
> e.g. maybe in our 30 guards there are no IPv6 guards at all, and the user just
>      turned on ClientUseIPv6. What to do now?
>
> This is important to understand, because currently there is no mechanism to add
> stuff to the sampled guards list if a restriction cannot be satisfied. So what
> will Tor do, if a user enables ClientUseIPv6 _and_ FascistFirewall but there
> are no IPv6+80/443 guards in our sampled guards list?
Yeah, we talked about that yesterday. Our suggestion is to do
something like this:
- If the filtered/reduced sample set contains fewer than X (5?) guards,
expand the SAMPLED guards using the regular process.
- If the SAMPLED guards reach SAMPLED_MAX (50?) in size, we fail closed with
an error saying something like "your current network settings make it
impossible for us to safely choose an entry guard. If you really need
to connect under these circumstances, consider explicitly setting the
EntryGuards configuration option".
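
The two steps above could be sketched like this. It is a simplification
under assumptions: MIN_FILTERED, NoUsableGuards and ensure_filtered are
made-up names, uniform random choice stands in for "the regular
process", and running out of candidates is also treated as a failure.

```python
import random

MIN_FILTERED = 5   # the "X (5?)" threshold
SAMPLED_MAX = 50   # the "SAMPLED_MAX (50?)" cap


class NoUsableGuards(Exception):
    """Raised when we fail closed instead of picking an unsafe guard."""


def ensure_filtered(sampled, candidates, is_usable, rng=random):
    """Grow `sampled` in place until enough guards pass `is_usable`."""
    remaining = [g for g in candidates if g not in sampled]
    rng.shuffle(remaining)  # stand-in for the regular sampling process
    while True:
        filtered = [g for g in sampled if is_usable(g)]
        if len(filtered) >= MIN_FILTERED:
            return filtered
        if len(sampled) >= SAMPLED_MAX or not remaining:
            raise NoUsableGuards(
                "current network settings make it impossible to "
                "safely choose an entry guard"
            )
        sampled.append(remaining.pop())
```

Note that the expansion is not biased towards guards that pass the
filter, so the sampled set keeps reflecting the network as a whole.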

> I think it asks "What happens when guards in our sampled set drop out of the
> consensus and get marked as bad?" (see bad_since in entrynodes.c) .
>
> This is also a great question. Especially when combined with the planned
> "ensure a min percentage of X in our sampled set" logic.
>
> Like, what happens if suddenly most of our sampled IPv6 guards drop out of the
> consensus when we have ClientUseIPv6 on? Should we replace them? And if yes,
> don't we need to replace them with other IPv6 guards to maintain the minimum
> percentage?
Well, I think if we replace, we should just replace randomly, just like
we always expand the sampled set. If most IPv6 guards drop out, we
will have fewer IPv6 guards in our sampled set, but that also reflects
the Tor network.

I suspect a plausible thing to do is to wait a few consensus rounds
before expanding the sampled set to replace "bad" guards - they might
come back, and under most circumstances we shouldn't need to use the
sampled set anyway.
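
As a sketch of the waiting idea: count how many consecutive consensuses
a guard has been missing from, and only report it for replacement after
a threshold. The counter, REPLACE_AFTER_ROUNDS (3 is an arbitrary
value) and the function name are assumptions for illustration; the
existing code tracks a bad_since timestamp instead.

```python
REPLACE_AFTER_ROUNDS = 3  # hypothetical value for "a few rounds"


class SampledGuard:
    """Hypothetical sampled-set entry with a missing-rounds counter."""

    def __init__(self, nickname):
        self.nickname = nickname
        self.bad_rounds = 0


def on_new_consensus(sampled, listed_nicknames):
    """Update counters and return guards that are due for replacement."""
    to_replace = []
    for guard in sampled:
        if guard.nickname in listed_nicknames:
            guard.bad_rounds = 0  # it came back, so forget the absence
        else:
            guard.bad_rounds += 1
            if guard.bad_rounds >= REPLACE_AFTER_ROUNDS:
                to_replace.append(guard)
    return to_replace
```

A guard that reappears before the threshold is never replaced, which
keeps the sampled set stable across short outages.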

> The current proposal says the following about SAMPLED_UTOPIC_GUARDS:
>
>       It will be filled in by the algorithm if it's empty, or if it contains
>       less than SAMPLE_SET_THRESHOLD guards after winnowing out older guards.
>
> which I think is a good suggestion. However, what should we do if we end up
> going with the "ensure minimum percentage" logic?
Yeah, my suggestion is to not have a minimum percentage of X type guards -
instead just fill it randomly, and use the expanding process if we
can't find enough guards for a specific purpose.

> > - EntryNodes
> > If this is set, never use the algorithm for regular circuits - we should still use it for directory server connections though.
>
> If this is set we should not use our algorithm, but we should instead pick one
> of the guards in the EntryNodes list. This is for people who want to hardcode
> their guard.  It's used a lot by people currently.
Yeah. Is the guard picked randomly from this list, or using something
more complicated?

> > - UseEntryGuardsAsDirGuards
> > I don't understand exactly what this setting does.
>
> I'm not sure either. I'd just let it keep the exact same semantics it currently has.
Yeah, except we don't exactly understand what those semantics are, or
whether we need to change something in our code to match them. =)

Thanks for all the feedback! Hopefully we're getting close to the
final iteration of this spec. =)

Cheers
-- 
 Ola Bini (https://olabini.se)

 "Yields falsehood when quined" yields falsehood when quined.