Re: [tor-dev] Latest state of the guard algorithm proposal (prop259) (April 2016)

18 Apr 2016


Hi,
The last mail got rejected by tor-dev. To keep track of this thread I'm
sending it again; sorry for the disturbance.
...
On Fri, Apr 15, 2016 at 5:37 AM, George Kadianakis <desnacked@riseup.net>
wrote:
...
Fan Jiang <fanjiang@thoughtworks.com> writes:
...
Thanks for the insights.
...
It seems like the latest version of prop259 was posted a few weeks
ago:
https://lists.torproject.org/pipermail/tor-dev/2016-March/010625.html
...
<snip>
...
A few things:
a) Are there still proposal design decisions that need to be taken and we are
   unclear on? I admit I'm a bit lost in the latest [tor-dev] thread, so maybe
   I can be of help somehow here?
There are still some issues. For example, for_directory may lead to
maintaining two sampled_set instances independently, which is not yet clearly
defined in the proposal.
Hm, how come this is happening?
I would think that for_directory would now be just another filter (like the
ipv6 one etc.) that can be applied on top of the sampled list.
The problem here is that for_directory is a parameter of the function call,
which means we can't filter before the call happens. Right now the filtering
only happens at the START stage.
One solution might be to check the selected guard before we return it (and,
if it is not valid, continue picking a new one).
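
The check-before-return idea above could be sketched roughly as follows. This is only an illustration under assumptions: sketch_guard_t, guard_is_usable_for() and pick_guard() are hypothetical names, not the structures or functions in prop259.c, and the candidate list stands in for whatever the algorithm would try next.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified guard record; NOT Tor's real guard structure. */
typedef struct {
  const char *nickname;
  bool is_dir_cache;  /* assumed flag: guard can answer directory requests */
  bool is_reachable;  /* assumed flag: guard is currently believed usable */
} sketch_guard_t;

/* Per-call check: is this guard acceptable for this particular request? */
static bool
guard_is_usable_for(const sketch_guard_t *g, bool for_directory)
{
  if (!g->is_reachable)
    return false;
  if (for_directory && !g->is_dir_cache)
    return false;
  return true;
}

/* Walk the candidates in order and return the first one that passes the
 * per-call check, or NULL if none does. */
static const sketch_guard_t *
pick_guard(const sketch_guard_t *candidates, size_t n, bool for_directory)
{
  assert(candidates != NULL || n == 0);
  for (size_t i = 0; i < n; ++i) {
    if (guard_is_usable_for(&candidates[i], for_directory))
      return &candidates[i];
    /* Not valid for this request: continue with the next candidate. */
  }
  return NULL;
}
```

Because the check runs at return time, it does not depend on whatever filter was fixed at START; the cost is that a directory request may skip guards that a non-directory request would have accepted.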
Hello,
hm, that's an interesting problem...
So we learn whether a circuit is for_directory when
choose_random_entry_prop259() is called, but at that point the prop259
algorithm has already STARTed and its filtering parameters have been
determined?
So if some part of tor calls choose_random_entry_prop259() quickly twice,
first with for_directory set, and the second time with for_directory unset,
the guard algorithm will proceed in both cases with for_directory being set?
Because the context has been set in the first call to
choose_random_entry_prop259()?
This seems like a problem to me, and I'm not sure how to solve it well...
One way could be to use the 'cpath_build_state_t *state' in
choose_random_entry_prop259() to be able to understand whether each call is
about a different circuit or not. Then you could start a separate algorithm
invocation (so new START) for each new circuit you see.
But then I'm not sure how to do this without each separate algorithm call
wrecking the sampled_guards and used_guards lists... In the dev meeting in
Valencia, we discussed with Ola and Reinaldo about using locking and blocking
for this, but I'm not sure how much that would impact the performance...
Do you guys have any plans here?
We can have two pending_guard entries, one for directory use and one for any
usage (which will be picked by the NEXT algo, so they can be the same node),
and each can be checked against the directory flag before it is returned.
Locking may not work, because this algo should at least return something
before it continues to another call, say when Dir and non-Dir requests are
both in the main loop.
If you have any ideas, please let me know.
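
One possible reading of the two-pending_guard idea, as a rough sketch only: the type and function names below (sketch_pending_t, pending_set(), pending_get()) are hypothetical, and the rule that a directory-capable guard fills both slots is an assumption made for illustration, not something the proposal defines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified guard record; NOT Tor's real guard structure. */
typedef struct {
  const char *nickname;
  bool is_dir_cache;  /* assumed flag: guard can answer directory requests */
} sketch_guard_t;

/* Hypothetical state: one pending guard per kind of usage. */
typedef struct {
  const sketch_guard_t *pending_dir_guard;  /* for directory requests */
  const sketch_guard_t *pending_guard;      /* for any usage */
} sketch_pending_t;

/* Record a guard picked by NEXT. Assumption: a directory-capable guard may
 * fill both slots, so both can point at the same node. */
static void
pending_set(sketch_pending_t *p, const sketch_guard_t *g)
{
  assert(p && g);
  p->pending_guard = g;
  if (g->is_dir_cache)
    p->pending_dir_guard = g;
}

/* Return the pending guard for this call, re-checking the directory flag
 * right before returning. NULL means the caller has to keep picking. */
static const sketch_guard_t *
pending_get(const sketch_pending_t *p, bool for_directory)
{
  const sketch_guard_t *g =
    for_directory ? p->pending_dir_guard : p->pending_guard;
  if (g && for_directory && !g->is_dir_cache)
    return NULL;
  return g;
}
```

With two slots, a directory call and a non-directory call that arrive back to back each read their own pending entry instead of sharing one context, and no lock is held across calls.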
...
---
BTW, I also noticed that you have various global structures in prop259.c,
like entry_guard_selection and used_guards and sampled_guards. I see you've
been working hard on un-globalizing the entry_guards list, but I'm not sure
if there is value to that if all those other structures are global. Do you
have plans for making those structures local to each specific algorithm
instance?
I've been working on moving these other structures into a guard_selection_t,
and I hope that helps make this algo instantiable.
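
To illustrate what moving the lists into a per-instance structure might look like, here is a minimal sketch. Everything in it is an assumption: sketch_guard_selection_t is a stand-in and not the actual guard_selection_t layout, and fixed-size arrays of nicknames replace whatever list type the real code uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_MAX_GUARDS 8

/* Hypothetical per-instance state: the lists that were global now live
 * inside one object, so each algorithm instance owns its own copies. */
typedef struct {
  const char *sampled_guards[SKETCH_MAX_GUARDS];
  size_t n_sampled;
  const char *used_guards[SKETCH_MAX_GUARDS];
  size_t n_used;
} sketch_guard_selection_t;

/* Allocate a zeroed instance (empty lists); returns NULL on failure. */
static sketch_guard_selection_t *
sketch_guard_selection_new(void)
{
  return calloc(1, sizeof(sketch_guard_selection_t));
}

/* Release an instance; the nicknames themselves are not owned by it. */
static void
sketch_guard_selection_free(sketch_guard_selection_t *gs)
{
  free(gs);
}

/* Append a nickname to this instance's used list.
 * Returns 0 on success, -1 if the list is full. */
static int
sketch_mark_used(sketch_guard_selection_t *gs, const char *nickname)
{
  assert(gs != NULL);
  if (gs->n_used >= SKETCH_MAX_GUARDS)
    return -1;
  gs->used_guards[gs->n_used++] = nickname;
  return 0;
}
```

Two instances created this way do not see each other's used_guards, which is the property that would let separate algorithm invocations run without touching shared globals.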
...
Cheers!