[tor-dev] Hashring understanding

George Kadianakis desnacked at riseup.net
Thu Feb 4 21:55:51 UTC 2016

Ola Bini <obini at thoughtworks.com> writes:

> Hi,
> Sorry for the string of emails!
> Hopefully a simple question:
> The current proposal contains logic for keeping track of network
> up/down and setting timeouts for exponential backoff to test the
> network again. But if I understand correctly, this proposal is
> basically about replacing the algorithm used for
> choose_good_entry_server() - correct? So it seems like keeping track
> of network status doesn't really belong inside this algorithm at
> all. Wouldn't it make sense to return a specific failure to the caller
> and let the caller be in charge of when to retry?


It's quite hard (if not impossible) for a user-land application like Tor to
actually keep track of whether the network is up or down in a multiplatform,
secure and scalable manner (e.g. without using the dirauths as an oracle). So
instead all we do is connect to Tor nodes and check whether we can reach them
or not.

I think currently if a relay fails to answer a CREATE cell, we treat the relay
as unreachable and we mark it as such in our guardlist (see
entry_guard_register_connect_status()). This logic is considered suboptimal and
it sometimes ends up marking good relays as unreachable. The retry logic in
prop259 tries to compensate for this, by eventually retrying nodes that might
have been marked as offline by a shaky network.
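
To make the retry idea concrete, here is a minimal Python sketch. It is not
Tor's code and not the schedule from prop259: the Guard class, the
RETRY_INTERVAL value and both function names are made up for illustration
(register_connect_status is only loosely modeled on the role of
entry_guard_register_connect_status()).

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical value for illustration; not taken from prop259 or from Tor.
RETRY_INTERVAL = 600  # seconds


@dataclass
class Guard:
    nickname: str
    # None means "believed reachable"; otherwise the time of the first failure.
    unreachable_since: Optional[float] = None


def register_connect_status(guard: Guard, succeeded: bool, now: float) -> None:
    """Record the outcome of one connection attempt to this guard."""
    if succeeded:
        guard.unreachable_since = None
    elif guard.unreachable_since is None:
        guard.unreachable_since = now


def should_retry(guard: Guard, now: float) -> bool:
    """True if the guard was marked unreachable long enough ago to retry it."""
    if guard.unreachable_since is None:
        return False  # not marked unreachable, so there is nothing to retry
    return now - guard.unreachable_since >= RETRY_INTERVAL
```

The point of the sketch is only that a guard marked unreachable during a
shaky-network period becomes eligible again after some time, instead of
staying marked until it is dropped.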

I agree it would be nice to decouple the above behavior from the guard picking
behavior. However, the two behaviors are certainly linked with each other,
since the only nodes that a Tor client connects to are its guard nodes. I'm not
sure what separating these two behaviors would look like, but if you guys think
it would simplify things I'd definitely be interested in hearing about it :)

WRT current code, if you look at choose_random_entry_impl() (which is called by
choose_good_entry_server()), you will notice that the status of relays is very
central to this function, since populate_live_entry_guards() will remove any
guards that are inactive or previously found unreachable from the guardlist.
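
Roughly, the filtering step looks like the Python sketch below. This is a
simplification and not the actual C code: the Guard fields and both function
names are invented here, and the real function does more than pick the first
live guard.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Guard:
    nickname: str
    is_active: bool = True     # hypothetical field: guard is still usable
    unreachable: bool = False  # hypothetical field: a past connection failed


def live_guards(guardlist: List[Guard]) -> List[Guard]:
    """Keep only guards that are active and not marked unreachable."""
    return [g for g in guardlist if g.is_active and not g.unreachable]


def choose_entry(guardlist: List[Guard]) -> Optional[Guard]:
    """Pick the first live guard, or None if the filter removed them all."""
    live = live_guards(guardlist)
    return live[0] if live else None
```

Note that a guard wrongly marked unreachable simply disappears from the
candidates here, which is why the reachability status matters so much for the
selection.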

choose_random_entry_impl() does not contain logic about the network itself
being up and down, because currently Tor does not have such logic. Instead, Tor
will endlessly keep on cycling through guards till the network comes up.
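
As a sketch of that cycling behavior in Python (again not Tor's code): the
function name, the try_connect callback and the max_attempts parameter are all
assumptions for illustration, and max_attempts exists only so the example can
terminate.

```python
from itertools import cycle
from typing import Callable, Optional, Sequence


def connect_via_any_guard(
    guards: Sequence[str],
    try_connect: Callable[[str], bool],
    max_attempts: Optional[int] = None,
) -> Optional[str]:
    """Cycle through the guards until one connection attempt succeeds.

    There is no notion of "the network is down" here: a failure just moves
    on to the next guard. With max_attempts=None this loops forever while
    nothing is reachable.
    """
    attempts = 0
    for guard in cycle(guards):
        if max_attempts is not None and attempts >= max_attempts:
            return None
        attempts += 1
        if try_connect(guard):
            return guard
    return None  # only reached when the guard list is empty
```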

Hopefully I answered the right question here.

BTW, if you guys want we can have another meeting next Tuesday at the same
time, if that would be helpful to you.

