Hi,
My last mail was rejected by tor-dev, so I'm sending it again to keep this thread on record. Sorry for the noise.


On Fri, Apr 15, 2016 at 5:37 AM, George Kadianakis <desnacked@riseup.net> wrote:
Fan Jiang <fanjiang@thoughtworks.com> writes:

> Thanks for the insights.
>
>
>> >> It seems like the latest version of prop259 was posted a few weeks ago:
>> >>
>> https://lists.torproject.org/pipermail/tor-dev/2016-March/010625.html
>> >>
>> > <snip>
>> >
>> >> A few things:
>> >>
>> >> a) Are there still proposal design decisions that need to be taken and
>> >>    we are unclear on? I admit I'm a bit lost in the latest [tor-dev]
>> >>    thread, so maybe I can be of help somehow here?
>> >>
>> > There are still some issues: for example, for_directory may lead to
>> > maintaining two sampled_set instances independently, which is not yet
>> > clearly defined in the proposal.
>> >
>>
>> Hm, how come this is happening?
>>
>> I would think that for_directory would now be just another filter (like the
>> ipv6 one etc.) that can be applied on top of the sampled list.
>>
> The problem is that for_directory is a parameter of the function call, which
> means we can't filter before the call happens. Currently, filtering only
> happens at the START stage. One solution might be to check the selected guard
> before returning it and, if it is not valid, continue picking a new one.
>

Hello,

hm, that's an interesting problem...

So we learn whether a circuit is for_directory when
choose_random_entry_prop259() is called, but at that point the prop259
algorithm has already STARTed and its filtering parameters have been
determined?

So if some part of tor calls choose_random_entry_prop259() quickly twice, first
with for_directory set, and the second time with for_directory unset, the guard
algorithm will proceed in both cases with for_directory being set? Because the
context has been set in the first call to choose_random_entry_prop259()?

This seems like a problem to me, and I'm not sure how to solve it well...

One way could be to use the 'cpath_build_state_t *state' in
choose_random_entry_prop259() to be able to understand whether each call is
about a different circuit or not. Then you could start a separate algorithm
invocation (so new START) for each new circuit you see.

But then I'm not sure how to do this without each separate algorithm call
wrecking up the sampled_guards and used_guards lists... In the dev meeting in
Valencia, we discussed with Ola and Reinaldo about using locking and blocking
for this, but I'm not sure how much that would impact the performance...

Do you guys have any plans here?

We can keep two pending_guard entries, one for directory use and one for any use.
Both would be picked by the NEXT algorithm, so they may be the same node, and each
can be checked against the directory flag before it is returned.
Locking may not work, because the algorithm has to return something before it moves
on to the next call, for example when directory and non-directory requests are both
in the main loop.
If you have any other ideas, please let me know.
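
To make the two pending_guard idea concrete, here is a rough sketch. It is not code
from prop259.c: guard_t, pending_guards_t, is_dir_cache and pending_guard_for() are
all hypothetical names, and I'm assuming "valid for directory" boils down to a single
flag on the guard.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; these are not tor's real types. */
typedef struct guard_t {
  const char *nickname;
  bool is_dir_cache;      /* assumption: guard can serve directory requests */
} guard_t;

typedef struct pending_guards_t {
  const guard_t *for_any; /* picked by NEXT for general use */
  const guard_t *for_dir; /* picked by NEXT for directory use */
} pending_guards_t;

/* Return the pending guard that fits this request, or NULL when the
 * caller should keep picking (i.e. run NEXT again). */
static const guard_t *
pending_guard_for(const pending_guards_t *p, bool for_directory)
{
  assert(p != NULL);
  const guard_t *g = for_directory ? p->for_dir : p->for_any;
  if (g == NULL)
    return NULL;
  /* Check the directory flag right before returning. */
  if (for_directory && !g->is_dir_cache)
    return NULL;
  return g;
}
```

Both slots can point at the same node. A NULL return would be the "continue picking
a new one" case mentioned earlier in the thread.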
---



BTW, I also noticed that you have various global structures in prop259.c, like
entry_guard_selection and used_guards and sampled_guards. I see you've been
working hard on un-globalizing the entry_guards list, but I'm not sure if there
is value to that if all those other structures are global. Do you have plans
for making those structures local to each specific algorithm instance?

I've been working on moving these other structures into a guard_selection_t,
which I hope will make the algorithm instantiable.
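
Roughly the shape I have in mind, as a sketch only: the field layout, the fixed-size
arrays, MAX_GUARDS and the helper names below are assumptions for illustration (the
real code would presumably keep using tor's list types), and only guard_selection_t,
sampled_guards and used_guards are names from this thread.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_GUARDS 8  /* arbitrary bound, only for this sketch */

/* One instance per algorithm invocation, instead of file-level globals. */
typedef struct guard_selection_t {
  const char *sampled_guards[MAX_GUARDS];
  size_t n_sampled;
  const char *used_guards[MAX_GUARDS];
  size_t n_used;
  int for_directory;  /* filter fixed when the instance is created */
} guard_selection_t;

/* Allocate a zeroed instance; returns NULL on allocation failure. */
static guard_selection_t *
guard_selection_new(int for_directory)
{
  guard_selection_t *gs = calloc(1, sizeof(*gs));
  if (gs)
    gs->for_directory = for_directory;
  return gs;
}

/* Add a guard to this instance's sample; returns 0 on success, -1 if full. */
static int
guard_selection_add_sampled(guard_selection_t *gs, const char *nickname)
{
  assert(gs != NULL);
  if (gs->n_sampled >= MAX_GUARDS)
    return -1;
  gs->sampled_guards[gs->n_sampled++] = nickname;
  return 0;
}

static void
guard_selection_free(guard_selection_t *gs)
{
  free(gs);
}
```

With this, a directory request and a non-directory request could each hold their own
instance, so one call no longer overwrites the other's state.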
 
Cheers!