On Tue, Feb 02, 2016 at 02:29:22PM -0500, Ola Bini wrote:
Hi,
We have now started looking at the proposal and the existing code. Our current plan is to first of all code more of the simulations referred to in stuff-to-test.txt. We also noticed that the hashring implementation for choosing the [DYSTOPIC/UTOPIC]_GUARDLIST hasn't been implemented in the simulation code - in fact, in the code it seems the [DYSTOPIC/UTOPIC]_GUARDS is used as the guardlist. We are planning on implementing the two different varieties of the guardlist selection algorithm as well.
Sounds good.
Another thing to note is that during our last meeting, we decided to not have the utopic/dystopic guardlists be disjoint (mainly for load balancing reasons): https://lists.torproject.org/pipermail/tor-dev/2016-January/010265.html Check the meeting logs if you care for more info.
Unfortunately, I think prop259 has not been updated to specify this new behavior.
This might be too researchy for what you are trying to do, but finding a nice behavior here would be very helpful.
Here is an example idea for doing the 80/443 fascist firewall detection heuristic without two disjoint guard pools:
Alice initializes a single guard list with all the guard nodes. Then she does steps 1 to 4 from §2 of prop259. Then in step 5, if Alice has tried more than GUARDLIST_FAILOVER_THRESHOLD guards from her guard list, she goes into "dystopic firewall" mode. During this mode, Alice only picks 80/443 nodes as guards (maybe from a separate dystopic guardlist). If those don't work either (she tries GUARDLIST_FAILOVER_THRESHOLD of them), then Alice "should make a note to herself that the network has potentially gone down" as suggested by step 5.b.
So in the above idea, there are two guardlists. Guardlist GUARDS has all the guards in the network, and then there is a DYSTOPIC_GUARDS guard list (which is a subset of GUARDS) which is only used during "dystopic firewall" mode. I think this has better load balancing and anonymity properties. But there might be even better behaviors. Feel free to come up with your own and test them!
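To make the idea above a bit more concrete, here is a rough Python sketch. Everything in it is made up for this email: the Guard class, pick_guard(), the can_connect callback and the threshold value of 3 are assumptions, not something from prop259 or the Tor code. It also assumes that guards already tried in normal mode are not retried in dystopic mode, which the idea above does not specify.

```python
from dataclasses import dataclass

# Hypothetical value: prop259 names this parameter, the number is invented here.
GUARDLIST_FAILOVER_THRESHOLD = 3

# Ports that a "fascist firewall" is assumed to let through.
DYSTOPIC_PORTS = {80, 443}


@dataclass(frozen=True)
class Guard:
    name: str
    orport: int


def pick_guard(guards, can_connect, threshold=GUARDLIST_FAILOVER_THRESHOLD):
    """Return (guard, mode); guard is None if the network looks down.

    `guards` is the full guard list (GUARDS) in the order Alice would try it,
    and `can_connect` is a caller-supplied reachability test.
    """
    # Normal mode: try up to `threshold` guards from the full list.
    tried = []
    for guard in guards[:threshold]:
        tried.append(guard)
        if can_connect(guard):
            return guard, "normal"

    # "Dystopic firewall" mode: DYSTOPIC_GUARDS is the 80/443 subset of GUARDS.
    # Skipping already-tried guards is a choice made for this sketch.
    dystopic_guards = [
        g for g in guards if g.orport in DYSTOPIC_PORTS and g not in tried
    ]
    for guard in dystopic_guards[:threshold]:
        if can_connect(guard):
            return guard, "dystopic"

    # Step 5.b: note that the network has potentially gone down.
    return None, "network-down"
```

Note that the sketch switches modes after exactly `threshold` failures, while the text says "more than"; the exact boundary is one of the things worth pinning down in simulation.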
After that, we are going to start looking at where it fits in the main Tor code base.
Sounds good. You might enjoy choose_random_entry_impl() as a starting point.
Feel free to ask us any questions you have about the Tor code base. Either here or on IRC.
The only thing I'm a bit unclear about from the specification is the idea of primary guards, and what the procedure is when no primary guards are possible - 259.§2.3 talks about "all available and fitting entry guards" - is this from the list of primary guards or the guardlist?
IIUC it's from the whole guardlist, but this should only happen if the primary guards are unreachable.
The idea with primary guards is that in an ideal world, a Tor client would always only connect to the top guard of its guardlist, to expose itself minimally to the network. Unfortunately, the network is fiddly so this is not possible, because the top guard will eventually go down. The concept of primary guards tries to compensate for that by going to extra lengths to ensure that you at least always connect to one of the top N=3 guards in your guard list. It does this by periodically checking the reachability of those top N=3 guards, and marking them online if they are (see step 2 of §2). So, even if you or your guards have reachability issues and you drift to your 12th guard or something, you will eventually come back to one of your primary guards when they are found online again.
Also, for the above heuristic to work, nodes that are not listed in the latest consensus should not be considered primary guards.
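As a rough illustration of the primary guards behavior described above, here is a small Python sketch. The function names, the is_reachable callback and the way the consensus is represented (a plain set of guard names) are all assumptions for this email. In particular, filtering out unlisted nodes before taking the top N is just my reading of the sentence above, not something the proposal spells out.

```python
PRIMARY_GUARDS_N = 3  # N=3, as in the explanation above


def primary_guards(guardlist, consensus, n=PRIMARY_GUARDS_N):
    """Top n guards of the guardlist that are listed in the latest consensus."""
    listed = [g for g in guardlist if g in consensus]
    return listed[:n]


def choose_guard(guardlist, consensus, is_reachable, n=PRIMARY_GUARDS_N):
    """Prefer a reachable primary guard, else fall back to the rest of the list.

    Calling this again later stands in for the periodic reachability check:
    once a primary guard is reachable again, it is chosen again.
    """
    primaries = primary_guards(guardlist, consensus, n)
    for guard in primaries:
        if is_reachable(guard):
            return guard

    # No primary guard is reachable: drift down the rest of the guardlist.
    for guard in guardlist:
        if guard in consensus and guard not in primaries and is_reachable(guard):
            return guard
    return None
```

For example, with a 12-entry guardlist where only the last guard is reachable, choose_guard() returns that 12th guard; as soon as one of the primaries answers again, the next call returns the primary instead.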
259.§2.4 says "adds a new entry guard" - is that adding it to the list of primary guards or something else?
Good question. I'm not sure what the proposal means there. Maybe isis can clarify this further?
Here is an attempt to help. Hope I don't confuse you further.
The currently implemented Tor guard algorithm keeps a list USED_GUARDS of the guards it has already connected to (it's also saved on disk and is the guard list you see in your Tor state file). Every time Tor tries a new guard node, it adds it to USED_GUARDS. The top N guards of USED_GUARDS are the primary guards. If you have exhausted all of the guards in USED_GUARDS and you still can't connect, then you need to add a new node to USED_GUARDS and attempt to connect to it. I imagine that the "list of all available and fitting entry guards" referenced in step 3 is something like USED_GUARDS.
(It's actually not called USED_GUARDS in the code. I just named it like this for this email.)
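In case it helps, the USED_GUARDS behavior described above could be sketched like this in Python. This is not how the Tor code is written: connect_via_guard(), the candidates argument and the can_connect callback are names invented for this email, and the sketch takes new guards in whatever order the caller supplies instead of modelling how Tor actually picks them.

```python
def connect_via_guard(used_guards, candidates, can_connect):
    """Try USED_GUARDS in order; add a new guard only when all of them fail.

    `used_guards` is mutated in place (it stands in for the list saved in the
    state file). Returns the guard connected to, or None if nothing worked.
    """
    # First exhaust the guards we have already used, top of the list first.
    for guard in used_guards:
        if can_connect(guard):
            return guard

    # All used guards failed: add new nodes one at a time and try each.
    for guard in candidates:
        if guard in used_guards:
            continue
        used_guards.append(guard)  # recorded as soon as it is tried
        if can_connect(guard):
            return guard
    return None
```

The point of the sketch is that USED_GUARDS only grows when everything already in it has failed, so the top N entries (the primary guards) stay stable over time.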