[tor-dev] Next version of the algorithm
obini at thoughtworks.com
Fri Mar 4 08:47:26 UTC 2016
> > This algorithm keeps track of the unreachability status for guards
> > in state private to the algorithm - this is re-initialized every time
> > START is called.
> Hmm, didn't we decide to persist the unreachability status across runs?
> Or not?
Yeah, I think we did decide to persist it between runs, but not more
permanently. I've changed it now.
> > SAMPLED_UTOPIC_GUARDS
> > This is a set that contains all guards that should be considered
> > for connection under utopic conditions. This set should be
> > persisted between runs. It will be filled in by the algorithm if
> > it's empty, or if it contains fewer than SAMPLE_SET_THRESHOLD
> > guards after winnowing out older guards. It should be filled by
> > using NEXT_BY_BANDWIDTH with UTOPIC_GUARDS as an argument.
> Should we use UTOPIC_GUARDS or REMAINING_UTOPIC_GUARDS as the
It should be UTOPIC_GUARDS, since REMAINING_UTOPIC_GUARDS will always
be a subset of SAMPLED_UTOPIC_GUARDS.
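
To make the filling step concrete, here is a rough Python sketch of how I
read it. The helper names, the dict-of-bandwidths representation, and the
choice to top up exactly to the threshold are my assumptions for
illustration; none of them come from the proposal text or from the tor
source.

```python
import random

def next_by_bandwidth(candidates, rng):
    """Pick one guard name from `candidates` (name -> bandwidth).

    Hypothetical stand-in for NEXT_BY_BANDWIDTH: the probability of a
    guard being picked is proportional to its (positive) bandwidth.
    """
    names = sorted(candidates)
    weights = [candidates[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]

def fill_sampled(sampled, utopic_guards, threshold, rng):
    """Top up `sampled` from `utopic_guards` until it holds `threshold`
    guards or no candidates are left.

    Assumption: we draw from all of UTOPIC_GUARDS and only skip guards
    that are already in the sampled set.
    """
    pool = {n: bw for n, bw in utopic_guards.items() if n not in sampled}
    while len(sampled) < threshold and pool:
        pick = next_by_bandwidth(pool, rng)
        sampled.append(pick)
        del pool[pick]
    return sampled
```

Passing the random generator in explicitly is only there so that a
simulation can seed it and get repeatable runs.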
> I guess you mean SAMPLED_DYSTOPIC_GUARDS.
Yep, thanks. Fixed.
> > REMAINING_UTOPIC_GUARDS
> > This is a running set of the utopic guards we have not yet tried
> > to connect to. It should be initialized to be SAMPLED_UTOPIC_GUARDS
> > without USED_GUARDS.
> Maybe here we should also mention that we will reinsert guards that we have not
> tried in a long time (GUARDS_RETRY_TIME) as specified by 2.2.2?
Yep, good clarification. I've added that.
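
For reference, this is roughly how I picture the initialization plus the
retry rule. It is only a sketch: the function name, the
`unreachable_since` mapping (guard -> timestamp in seconds), and the use
of plain lists are assumptions of mine, not part of the spec.

```python
def remaining_utopic_guards(sampled, used, unreachable_since, now, retry_minutes):
    """Build REMAINING_UTOPIC_GUARDS for one run.

    Start from SAMPLED_UTOPIC_GUARDS without USED_GUARDS, then add back
    any sampled guard that was marked unreachable more than
    `retry_minutes` (GUARDS_RETRY_TIME) ago, so it gets another chance.
    """
    remaining = [g for g in sampled if g not in used]
    for g in sampled:
        marked_at = unreachable_since.get(g)  # None if never marked unreachable
        if marked_at is None or g in remaining:
            continue
        if now - marked_at > retry_minutes * 60:
            remaining.append(g)
    return remaining
```

Here reinserted guards go to the end of the list; whether they should
keep their earlier position is not something the text decides.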
> > [XXX defining "was not possible to connect" as "entry is not live" according
> > to current definition of "live entry guard" in tor source code, seems
> > to improve success rate on the flaky network scenario.
> > See: https://github.com/twstrike/tor_guardsim/issues/1#issuecomment-187374942]
> Hmm, I'm not sure what this XXX means exactly. I believe we should actually try
> to _connect_ to those primary guards and not just check if we think
> they are live.
Yeah, I don't know where it comes from either - @rjunior, care to
expand on it?
> > §2.2.2. The STATE_TRY_UTOPIC state
> > In order to give guards that have been marked as unreachable a
> > chance to come back, add all entries in TRIED_GUARDS that were
> > marked as unreachable more than GUARDS_RETRY_TIME minutes ago back
> > to REMAINING_UTOPIC_GUARDS.
> I'm a bit puzzled by this mechanism. Maybe its benefits can be explained a bit
> more clearly?
> When we add guards back to REMAINING_UTOPIC_GUARDS, do we also remove them from
Well, TRIED_GUARDS doesn't really do much at the moment. In fact, it
might be easier to just remove it. I've done that and it simplifies
things as well.
> Now that we have persistent SAMPLED_UTOPIC_GUARDS is this still useful? Won't
> we have fully populated our SAMPLED_*_GUARDS structures by the point this rule
Agree, I've removed it. Much nicer and neater now! =D
> > §2.2.5. ON_NEW_CONSENSUS
> > First, ensure that all guard profiles are updated with information
> > about whether they were in the newest consensus or not. If not, the
> > guard is considered bad.
> Maybe instead of "If not" we could say "If a guard is not included in the newest
> consensus" to make it a bit clearer.
Good clarification, done.
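
In code, the first step of ON_NEW_CONSENSUS is about as small as it
sounds. A sketch, assuming guard profiles are dicts keyed by guard name
and the consensus is a set of names (both representations are mine):

```python
def on_new_consensus(guard_profiles, consensus):
    """Mark each known guard as bad if it is not included in the newest
    consensus, and clear the flag if it is included again."""
    for name, profile in guard_profiles.items():
        profile["bad"] = name not in consensus
```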
> > [XXX Does "add it back in the place it should have been in PRIMARY_GUARDS
> > if it had been non-bad" imply keeping original order?]
> If I understand correctly, I think the answer to this XXX is
> "Ideally, yes.".
Yes, that is definitely the answer.
> I'm curious to see how this mechanism will be implemented because it's important
> and it would be nice if it's done cleanly.
I can see a few different ways to do it easily. One of them would be
to just rerun the original primary guard selection algorithm until we
find the guard we want to insert.
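
A sketch of that idea, under the assumption that the order in which the
original selection visited candidates can be reproduced (for example
because it is stored). The names here are hypothetical and this is not
how the current tor code does it:

```python
def primary_guards(selection_order, bad, n_primary):
    """Re-derive PRIMARY_GUARDS by walking the original selection order
    and skipping guards that are currently bad.

    Because the list is recomputed from the same order every time, a
    guard that stops being bad lands back in the position it would have
    had if it had never been bad.
    """
    primary = []
    for guard in selection_order:
        if guard in bad:
            continue
        primary.append(guard)
        if len(primary) == n_primary:
            break
    return primary
```

With the order g1, g2, g3, g4 and three primary guards, marking g2 bad
gives g1, g3, g4, and clearing the flag gives g1, g2, g3 again.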
> Also, we should be careful about when we count 'bad' guards. After a few weeks
> of operation, the USED_GUARDS list can accumulate multiple bad guards, and we
> should make sure we don't count them when we do our threshold
> Just a reminder that we also discussed adding the "Retry primary guards if we
> have looped over the whole guardlist" heuristic somewhere here. Because in many
> cases the network can go down and then back up in less than a minute.
Actually, that retry heuristic is there. Or maybe I misunderstand the point.
> IIUC, if the guard is not in USED_GUARDS it should be added *last* (that is,
> with lowest priority).
Yep, added that.
> We should decide if we want to actually use a dynamic percentage here, or just
> set the threshold to a constant value.
> A dynamic percentage might give us better security and reachability as the
> network evolves, but might also cause unpredictable behaviors if we suddenly
> get too many guards or too many of them disappear.
> I don't have a strong opinion here.
Me neither. I think a percentage is a good starting point - it feels
easier to tweak in different ways.
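
One way to keep a percentage while limiting the surprises you mention
would be to clamp it. This is only an illustration: the function, the
clamping itself, and all the numbers in the example are made up, not
values from the proposal.

```python
def percentage_threshold(n_guards, percent, minimum, maximum):
    """Return `percent` percent of `n_guards`, rounded up, clamped to
    the range [minimum, maximum].

    Integer arithmetic avoids floating-point rounding surprises.
    """
    raw = (n_guards * percent + 99) // 100  # ceiling of n_guards * percent / 100
    return max(minimum, min(maximum, raw))
```

For example, with 3 percent and bounds of 15 and 100, a network of 2000
guards gives 60, a network of 10 guards gives the floor of 15, and a
network of 100000 guards gives the cap of 100.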
> It seems to me that the value 20 here could get reduced to something like 5 or
> even less. Of course 5 is also an arbitrary value and to actually find out the
> "best" number here we should test the algorithm ourselves in various network
Arbitrarily changed to 5. =)
Ola Bini (https://olabini.se)
"Yields falsehood when quined" yields falsehood when quined.