Ola Bini <obini@thoughtworks.com> writes:
Hi,
Sorry for this - had some questions about the hashring component of #259 that I haven't been able to figure out myself. I'm sure it's just me being unused to the Tor code base and how you write your proposals, but it would be super helpful if I can get a quick answer to a few questions.
First, my simplified understanding of the data structures is this:
GUARDS are all the guards in the consensus
UTOPIC_GUARDS is all utopic guards from GUARDS
DYSTOPIC_GUARDS is all dystopic guards from GUARDS
UTOPIC_GUARDLIST is a subset of UTOPIC_GUARDS
DYSTOPIC_GUARDLIST is a subset of DYSTOPIC_GUARDS
We will only ever choose guards to use from UTOPIC/DYSTOPIC_GUARDLIST
The idea of the hashring is to be a structure to make it possible for us to go from UTOPIC/DYSTOPIC_GUARDS to UTOPIC/DYSTOPIC_GUARDLIST. The reason we want to avoid shifting is because we want the hashring to be a long lived data structure that can be cheaply updated over several consensuses.
Here are my questions:
- Is the above understanding correct?
Yes, that's also my understanding.
I think what prop259 tries to do with that hashring construction is to provide a deterministic way to sample guards, with minimum shifting as time passes. The aim here is to make it harder to track and fingerprint clients based on their guard lists. The idea is that when you are on the "home" WiFi you always have one set of guards, and when you move to the "cafe" WiFi you have a different set of guards. See #10969 for some background and check http://meetbot.debian.net/tor-dev/2016/tor-dev.2016-01-19-16.03.log.html starting from 16:21:58 for some recent discussion.
Personally, I think that the hashring is kind of a red herring in the sense that it makes prop259 more complex than it needs to be, and it also tries to tackle a problem that we don't really know how to solve. I mean even if we have the hashring construction, how do we use it? Do we just blend in the MAC of the network gateway when we sample guards? What happens if an evil gateway changes its MAC constantly, till we sample the guards it wants?
I think solving the guard fingerprinting issue should be the subject of a separate proposal. Isis suggested this during the meeting but I was too dumb at that point.
Until we have a proper solution here I'd suggest we stick to the KISS thing, which probably is to sample guards from a simple list weighted by their bandwidth like Tor is currently doing (see node_sl_choose_by_bandwidth()).
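To make the "simple weighted list" idea concrete for a simulation, here is a rough Python sketch of bandwidth-weighted sampling without replacement. It is not a port of node_sl_choose_by_bandwidth() (which is C and also applies consensus weights); the function names, the (name, bandwidth) tuples and the rng parameter are all made up for illustration.

```python
import random

def choose_by_bandwidth(guards, rng):
    """Pick one guard name with probability proportional to its bandwidth.

    `guards` is a list of (name, bandwidth) pairs with positive bandwidth.
    This is an illustrative helper, not Tor's actual function.
    """
    total = sum(bw for _, bw in guards)
    if total <= 0:
        raise ValueError("no usable bandwidth")
    point = rng.random() * total  # uniform point in [0, total)
    running = 0
    for name, bw in guards:
        running += bw
        if point < running:
            return name
    return guards[-1][0]  # only reachable through floating-point rounding

def sample_guardlist(guards, n, rng=None):
    """Sample up to n distinct guards, each draw weighted by bandwidth."""
    rng = rng or random.Random()
    # Guards with zero bandwidth can never be picked, so drop them up front.
    remaining = [(g, bw) for g, bw in guards if bw > 0]
    chosen = []
    while remaining and len(chosen) < n:
        pick = choose_by_bandwidth(remaining, rng)
        chosen.append(pick)
        remaining = [(g, bw) for g, bw in remaining if g != pick]
    return chosen
```

Calling sample_guardlist(guards, 3) once at startup would model the "populate the whole guardlist up front" behaviour, while calling it with n=1 each time a guard fails would be closer to the on-demand behaviour.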
Of course, if you guys are interested in this and want to simulate the hashring idea (or any other similar idea) that's fine as well. That's the good part of a simulation: we don't need to commit to any particular construction.
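If it helps as a starting point for such a simulation, below is a toy hash ring in Python. To be clear, this is not the construction from prop259: positions come from a SHA-256 hash of a guard identifier rather than from bandwidth weights, and the GuardRing class, ring_position() and the per-network seed are assumptions made up for this sketch. It only illustrates the "deterministic sampling with little shifting" property.

```python
import bisect
import hashlib

def ring_position(data: bytes) -> int:
    """Map arbitrary bytes to a point on a 64-bit ring."""
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

class GuardRing:
    """Toy hash ring: guards sit at hashed positions, kept in sorted order."""

    def __init__(self, guard_ids):
        self._ring = sorted((ring_position(g.encode()), g) for g in guard_ids)

    def add(self, guard_id):
        bisect.insort(self._ring, (ring_position(guard_id.encode()), guard_id))

    def remove(self, guard_id):
        self._ring.remove((ring_position(guard_id.encode()), guard_id))

    def sample(self, seed: bytes, n: int):
        """Return the n guards found walking clockwise from hash(seed)."""
        if not self._ring:
            return []
        start = bisect.bisect_left(self._ring, (ring_position(seed),))
        count = min(n, len(self._ring))
        size = len(self._ring)
        return [self._ring[(start + i) % size][1] for i in range(count)]
```

With this layout, adding or removing a guard only affects samples whose clockwise walk passes over that guard; every other sample stays the same. It does nothing about the evil-gateway problem mentioned above, since whoever controls the seed controls where the walk starts.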
- At what point do we add/update/change guards in UTOPIC/DYSTOPIC_GUARDLIST?
I think prop259 suggests that we populate the guardlists the first time Tor attempts to connect to a network. This is different to how Tor currently does it, where it starts with a minimal guardlist (one guard), and adds new guards to it on demand (when the first guard fails).
Both approaches have their merits. I'm not sure which one is better.
We add more guards to guardlists if all the previous guards don't work anymore. This can be seen in step 4 of §2.
We also need to mark (update) guards as unreachable when they get removed from the consensus, or when we fail to connect to them.
Indeed the proposal should be updated to better specify when these actions happen and how they work.
- Since each key in the hashring structure depends acutely on bw_weight_total and the BW of the guard, how will it be possible to update it efficiently? My assumption is that the bw_weight_total will change on every consensus, and at that point all the old keys are invalid?