Reinaldo de Souza Jr <rjunior@thoughtworks.com> writes:
Thank you.
Another thing I'm interested in is how the proposed algorithm structure fits into current tor code. The proposed algorithm is:
OPEN_CIRCUIT:
  context = ALGO_CHOOSE_ENTRY_GUARD_START(...)
  while True:
    entryGuard = ALGO_CHOOSE_ENTRY_GUARD_NEXT(context)
    circuit = composeCircuitAndConnect(entryGuard)
    if not SHOULD_CONTINUE(isSuccessful(circuit)):
      ALGO_CHOOSE_ENTRY_GUARD_END(context, entryGuard)
      return circuit
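To step through the control flow, the pseudocode above can be made runnable as the Python sketch below. Apart from the ALGO_* and SHOULD_CONTINUE names (lower-cased here), everything is a hypothetical stand-in: the guard list, the `reachable` set, the try-in-order selection, and the stubbed compose_circuit_and_connect() are assumptions for illustration, not Tor code and not the proposal's actual definitions.

```python
# Hypothetical, synchronous model of the proposed loop. Not Tor code.

def algo_choose_entry_guard_start(guards):
    # The context holds the candidate list and a cursor into it.
    return {"guards": list(guards), "next": 0, "used": None}

def algo_choose_entry_guard_next(context):
    # Assumption: candidates are simply tried in order.
    if context["next"] >= len(context["guards"]):
        return None
    guard = context["guards"][context["next"]]
    context["next"] += 1
    return guard

def algo_choose_entry_guard_end(context, guard):
    # Record which guard the algorithm settled on.
    context["used"] = guard

def should_continue(success):
    # Keep looping only while the last attempt failed.
    return not success

def compose_circuit_and_connect(guard, reachable):
    # Stub: a "circuit" is a dict; it succeeds if the guard is reachable.
    return {"guard": guard, "ok": guard in reachable}

def is_successful(circuit):
    return circuit["ok"]

def open_circuit(guards, reachable):
    context = algo_choose_entry_guard_start(guards)
    while True:
        guard = algo_choose_entry_guard_next(context)
        if guard is None:
            return None  # ran out of candidates (not in the pseudocode)
        circuit = compose_circuit_and_connect(guard, reachable)
        if not should_continue(is_successful(circuit)):
            algo_choose_entry_guard_end(context, guard)
            return circuit
```

For example, open_circuit(["g1", "g2", "g3"], {"g2"}) tries g1, fails, and then returns a circuit through g2.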
I'd like to have ideas of current tor functions with similar purposes. This is the correlation I was able to find by reading the source code:
Hmm, yes, finding the right interface here is very important! We might find that we need to change the structure of the proposed algorithm slightly to fit into Tor's networking logic.
Here is a quick reply:
a) OPEN_CIRCUIT() Seems to be equivalent to circuit_establish_circuit()
Seems to be the case.
b) while True: Seems to be equivalent to onion_populate_cpath(). It even has a "timeout" after 32 tries!
I'm not actually sure if this is the loop you are looking for. I don't think any networking happens in onion_populate_cpath() at all. I think that loop is there just to make sure that the final yet-to-be-created circuit will have at least one node that supports ntor (for crypto/security reasons). However no networking has taken place yet; it's just doing checks on the hypothetical future circuit.
Because of the asynchronous networking logic of Tor, I'm not sure if you will find a while loop that does precisely what you want here.
When a circuit fails in Tor, there is some retry logic to make a new one to carry out its job. This retry logic might be the loop you are looking for, but I'm not sure if the logic is somewhere centralized, or if it's special for each different type of cell/circuit.
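To illustrate why there may be no literal while loop to find, here is a rough Python sketch of the same retry behaviour written in an event-driven style: each attempt returns immediately, and the "loop" exists only as a failure callback that launches the next attempt. The event queue and every function name here are hypothetical and do not correspond to Tor functions.

```python
from collections import deque

def run_event_driven(guards, reachable):
    """Model retries as queued events instead of a while loop."""
    events = deque()          # stand-in for the networking event loop
    remaining = deque(guards)
    result = {"guard": None, "attempts": []}

    def launch_attempt():
        # Starts a connection and returns at once; the outcome arrives later.
        if not remaining:
            return
        guard = remaining.popleft()
        result["attempts"].append(guard)
        outcome = on_open if guard in reachable else on_failed
        events.append((outcome, guard))

    def on_open(guard):
        result["guard"] = guard

    def on_failed(guard):
        # The retry lives here, far away from the code that launched it.
        launch_attempt()

    launch_attempt()
    while events:             # the only loop is the generic dispatcher
        callback, guard = events.popleft()
        callback(guard)
    return result
```

In this model the retry decision is made inside on_failed(), which is why reading the launching function alone would not reveal any loop.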
Sorry for not being more helpful here, but I have to leave now for the weekend. I'd suggest you do some runtime analysis of Tor with plenty of logs added, to find the right place. I will try to have more feedback for you on Wednesday.
c) ALGO_CHOOSE_ENTRY_GUARD_NEXT(context) Seems to be equivalent to choose_good_entry_server() as used in onion_extend_cpath().
Seems to be the case, yes.
d) composeCircuitAndConnect(entryGuard) This is the most uncertain to me. It seems to be circuit_handle_first_hop().
Indeed, circuit_handle_first_hop() seems to be the function opening the initial connection to the guard, after the circuit has been constructed. The function channel_connect_for_circuit() seems to be the one actually doing the dirty networking work here: setting up the channel and calling connection_or_connect().
The issue is: we rely on `unreachable_since` being updated in case we fail to connect to the guard before the next call to (c). In current tor code, this is done by entry_guard_register_connect_status() but I got lost tracking when it happens.
Hmm, this is the part where the asynchronous networking logic of tor takes over. A good way to comprehend this IMO is the good ol' "put log statements everywhere, run tor under various scenarios and check the code flow".
In any case, here is an attempt at untangling the code:
If something goes bad connecting to the guard, I think circuit_build_failed() will be called eventually, which is one of the places that sets unreachable_since via entry_guard_register_connect_status().
Then, since that circuit failed, the retry logic of Tor (the exact details here depend on the type of circuit, etc.) will try to create a new circuit to complete the job that the previous circuit was supposed to do. During this second circuit creation, the first entry guard will already have been marked by circuit_build_failed() so unreachable_since will have been set and the first entry guard will be skipped.
I think this is approximately how it works, but I'd suggest you add log statements in all the functions calling entry_guard_register_connect_status() and run tor, to see it yourself because I might be wrong.
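A toy model of the flow described above, as a Python sketch: a guard is marked on failure and skipped on the next pick. Only the name unreachable_since comes from tor; the Guard class and the three functions are hypothetical simplifications and do not reflect the real signatures or behaviour of entry_guard_register_connect_status() or choose_good_entry_server().

```python
import time

class Guard:
    def __init__(self, name):
        self.name = name
        self.unreachable_since = None  # None means "believed reachable"

def register_connect_status(guard, succeeded, now=None):
    # Simplified model: record the first failure time, clear it on success.
    if succeeded:
        guard.unreachable_since = None
    elif guard.unreachable_since is None:
        guard.unreachable_since = time.time() if now is None else now

def choose_entry_guard(guards):
    # Skip any guard currently marked unreachable.
    for guard in guards:
        if guard.unreachable_since is None:
            return guard
    return None

def build_with_retry(guards, reachable, max_tries=3):
    # Each iteration stands for one circuit launched by the retry logic.
    for _ in range(max_tries):
        guard = choose_entry_guard(guards)
        if guard is None:
            return None
        ok = guard.name in reachable
        register_connect_status(guard, ok)
        if ok:
            return guard
    return None
```

The model only works if the failure is registered before the next guard is chosen, which is exactly the ordering worth confirming with log statements in the real code.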
circuit_mark_for_close_() seems to be called when circuit_handle_first_hop() fails but it's unclear to me if entry_guard_register_connect_status() will ever be called as part of circuit_mark_for_close_() - and if there is any guarantee it will be called before the next invocation of onion_extend_cpath().
e) SHOULD_CONTINUE(isSuccessful(circuit)) This is also tricky. If onion_extend_cpath() is our loop, this is supposed to break it in some case.
It is very similar to how entry_guard_register_connect_status() is used in channel_do_open_actions(). But similarly to entry_guard_register_connect_status(), I'm also unsure if channel_do_open_actions() is called as part of onion_extend_cpath().
f) ALGO_CHOOSE_ENTRY_GUARD_END() It could be entry_guard_register_connect_status().
Plausible, yes.
Sorry for such a dense email. I'm trying to make an informed decision before following my gut and seeing how it breaks.
If there is any technique or call graph tool you find useful to get such information, it would be much appreciated.
printf or log_warn debugging is the technique I would suggest here. Make sure you run tor on various types of networks to see what happens.
Unfortunately, I didn't have time to answer all your questions. Will try to have some smart thoughts on this for Wednesday.
[Email doubly sent, because I forgot to CC tor-dev originally]