[tor-dev] Hashring understanding

George Kadianakis desnacked at riseup.net
Thu Feb 4 15:06:04 UTC 2016


Ola Bini <obini at thoughtworks.com> writes:

> Hi,
>
> Sorry for this - had some questions about the hashring component of
> #259 that I haven't been able to figure out myself. I'm sure it's just
> me being unused to the Tor code base and how you write your proposals,
> but it would be super helpful if I can get a quick answer to a few
> questions.
>
> First, my simplified understanding of the data structures is this:
> GUARDS are all the guards in the consensus
>
> UTOPIC_GUARDS is all utopic guards from GUARDS
> DYSTOPIC_GUARDS is all dystopic guards from GUARDS
>
> UTOPIC_GUARDLIST is a subset of UTOPIC_GUARDS
> DYSTOPIC_GUARDLIST is a subset of DYSTOPIC_GUARDS
>
> We will only ever choose guards to use from UTOPIC/DYSTOPIC_GUARDLIST
>
> The idea of the hashring is to be a structure to make it possible for us to
> go from UTOPIC/DYSTOPIC_GUARDS to UTOPIC/DYSTOPIC_GUARDLIST.
> The reason we want to avoid shifting is because we want the hashring to be a
> long lived data structure that can be cheaply updated over
> several consensuses.
>
> Here are my questions:
> - Is the above understanding correct?

Yes, that's also my understanding.

I think what prop259 tries to do with that hashring construction is to provide
a deterministic way to sample guards, with minimum shifting as time passes. The
aim here is to make it harder to track and fingerprint clients based on their
guard lists. The idea is that when you are on the "home" WiFi you always have
one set of guards, and when you move to the "cafe" WiFi you have a different
set of guards. See #10969 for some background and check
http://meetbot.debian.net/tor-dev/2016/tor-dev.2016-01-19-16.03.log.html
starting from 16:21:58 for some recent discussion.
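
To make "deterministic sampling with minimum shifting" concrete, here is a
minimal sketch of a generic consistent-hashing ring. It is not the prop259
construction (it ignores bandwidth weights entirely), and the names
ring_position, build_ring and sample_guards, as well as the per-network seed,
are assumptions made up for illustration.

```python
import hashlib
from bisect import bisect_left

def ring_position(data: bytes) -> int:
    """Map arbitrary bytes to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

def build_ring(guard_ids):
    """Place every guard at a position derived only from its identity."""
    return sorted((ring_position(g.encode()), g) for g in guard_ids)

def sample_guards(ring, seed: bytes, count: int):
    """Deterministically pick up to `count` distinct guards for `seed`."""
    if not ring:
        return []
    positions = [pos for pos, _ in ring]
    chosen = []
    counter = 0
    while len(chosen) < min(count, len(ring)):
        # Hypothetical: derive successive ring points from seed and a counter.
        point = ring_position(seed + counter.to_bytes(4, "big"))
        idx = bisect_left(positions, point) % len(ring)
        # Walk clockwise past guards that were already chosen.
        while ring[idx][1] in chosen:
            idx = (idx + 1) % len(ring)
        chosen.append(ring[idx][1])
        counter += 1
    return chosen
```

In this sketch, a guard that was not sampled can leave the consensus without
changing the sample for a given seed, which is the "minimum shifting"
property. How to derive the seed safely from the network is exactly the part
that remains unsolved.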

Personally, I think that the hashring is kind of a red herring in the sense
that it makes prop259 more complex than it needs to be, and it also tries to
tackle a problem that we don't really know how to solve. I mean even if we
have the
hashring construction, how do we use it? Do we just blend in the MAC of the
network gateway when we sample guards? What happens if an evil gateway changes
its MAC constantly, till we sample the guards it wants?

I think solving the guard fingerprinting issue should be the subject of a
separate proposal. Isis suggested this during the meeting but I was too dumb at
that point.

Until we have a proper solution here I'd suggest we stick to the KISS thing,
which probably is to sample guards from a simple list weighted by their
bandwidth like Tor is currently doing (see node_sl_choose_by_bandwidth()).
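
As a rough illustration of what "sample from a simple list weighted by
bandwidth" means, here is a minimal sketch. It is not Tor's
node_sl_choose_by_bandwidth() (which is C and handles many more cases); the
function name choose_by_bandwidth and the (name, bandwidth) tuple layout are
assumptions for this example.

```python
import random

def choose_by_bandwidth(guards, rng=random):
    """Pick one guard with probability proportional to its bandwidth.

    `guards` is a list of (name, bandwidth) pairs with non-negative
    integer bandwidths (a hypothetical layout for this sketch).
    """
    total = sum(bw for _, bw in guards)
    if total <= 0:
        raise ValueError("no guard with positive bandwidth")
    # Pick a point in [0, total) and find the guard whose slice contains it.
    point = rng.randrange(total)
    acc = 0
    for name, bw in guards:
        acc += bw
        if point < acc:
            return name
    raise AssertionError("unreachable: point is always below total")
```

A guard with three times the bandwidth of another is picked about three times
as often, and a guard with zero bandwidth is never picked.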

Of course, if you guys are interested in this and want to simulate the hashring
idea (or any other similar idea) that's fine as well. That's the good part of a
simulation; that we don't need to commit to any particular construction.

> - At what point do we add/update/change guards in UTOPIC/DYSTOPIC_GUARDLIST?

I think prop259 suggests that we populate the guardlists the first time Tor
attempts to connect to a network. This is different to how Tor currently does
it, where it starts with a minimal guardlist (one guard), and adds new guards
to it on demand (when the first guard fails).

Both approaches have their merits. I'm not sure which one is better.

We add more guards to guardlists if all the previous guards don't work
anymore. This can be seen in step 4 of §2.

We also need to mark (update) guards as unreachable when they get removed from
the consensus, or when we fail to connect to them.

Indeed the proposal should be updated to better specify when these actions
happen and how they work.
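
For a simulation, the three actions above (populate, extend when everything
has failed, mark as unreachable) could be sketched roughly like this. The
class GuardList, its method names and the "take the next candidate in order"
policy are all assumptions of this sketch, not something prop259 specifies.

```python
class GuardList:
    """Toy guardlist: an ordered list of guards plus an unreachable set."""

    def __init__(self, candidates, initial_size):
        self.candidates = list(candidates)  # assumed sampling order
        self.guards = []
        self.unreachable = set()
        self.extend(initial_size)           # populate on first use

    def extend(self, n):
        """Add up to n candidates that are not yet in the guardlist."""
        added = 0
        for g in self.candidates:
            if added == n:
                break
            if g not in self.guards:
                self.guards.append(g)
                added += 1
        return added

    def mark_unreachable(self, guard):
        """Record a failed connection attempt to one of our guards."""
        if guard in self.guards:
            self.unreachable.add(guard)

    def on_consensus(self, listed):
        """Mark guards that dropped out of the consensus as unreachable."""
        for g in self.guards:
            if g not in listed:
                self.unreachable.add(g)

    def usable(self):
        return [g for g in self.guards if g not in self.unreachable]

    def pick(self, extra=1):
        """Return the first usable guard, extending the list if none work."""
        if not self.usable():
            self.extend(extra)
        usable = self.usable()
        return usable[0] if usable else None
```

Setting initial_size to 1 approximates the current on-demand behaviour, while
a larger value approximates populating the guardlist up front.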

> - Since each key in the hashring structure depends acutely on bw_weight_total
> and the BW of the guard, how will it be possible to update it
>   efficiently? My assumption is that the bw_weight_total will change on every
> consensus, and at that point all the old keys are invalid?

