George Kadianakis desnacked@riseup.net writes:
Roger Dingledine arma@mit.edu writes:
On Mon, Jun 13, 2016 at 03:48:39PM +0300, George Kadianakis wrote:
The main issue for me right now is that I can't recall how this helps with clock skewed clients, even though that was a big part of our discussion in Montreal.
Specifically, I think that clients (and HSes) should determine the set of responsible HSDirs (i.e. the current time period) based on the "valid-after" of their latest consensus, instead of using their local clock. This way, as long as the client's skewed clock is good enough to verify the latest consensus, the client will have a consistent view of the network and SRV (assuming an honest/updated dirguard). I tried to clarify this a bit in commit 465156d, so please let me know if it's not a good idea.
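To make the valid-after idea concrete, here is a minimal sketch in C. It is not the prop224 code: the 24-hour period length, the consensus_sketch_t struct and both function names are assumptions made up for illustration, and any time period offset is left out.

```c
#include <stdint.h>
#include <time.h>

/* Assumed for illustration: one time period lasts 24 hours. */
#define TIME_PERIOD_LENGTH_SECS (24 * 60 * 60)

/* Hypothetical stand-in for a parsed consensus; only the field used here. */
typedef struct {
  time_t valid_after;
} consensus_sketch_t;

/* Map a timestamp (seconds since the epoch) to a time period number. */
static uint64_t
time_period_num_at(time_t t)
{
  return (uint64_t)t / TIME_PERIOD_LENGTH_SECS;
}

/* Derive the time period from the consensus instead of from time(NULL),
 * so the local clock never enters the calculation. */
static uint64_t
time_period_num_from_consensus(const consensus_sketch_t *c)
{
  return time_period_num_at(c->valid_after);
}
```

With this shape, two clients holding the same consensus end up with the same time period number no matter what their local clocks say, which is the consistency property argued for above.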
Interesting idea! I think I like it. You're right that in Montreal we were thinking in terms of client clocks, and we might be able to reduce the problem (both in frequency and in magnitude) by considering the time in the last consensus we have.
Another argument in favor of using the last consensus is that we will be picking the "relays that are closest to the right location in the hash ring" out of our last consensus already. (That is not a strong argument in favor though, I think, since in theory there won't be so much churn in a day that all of the relays in our last consensus will become wrong.)
All of this said, it seems like you are basing your arguments on some expectations about how clients handle consensuses that have surprising dates in them (surprising either because the client's clock is skewed, or because their directory guard gave them the wrong consensus). How *do* clients handle these situations? If we could get the intended / expected behavior written down, then we would have a better chance of identifying bugs in it that we can then fix.
I agree that we should get the intended/expected behavior written down!
A few days have passed and I still feel that the latest consensus valid_after time is a more robust basis for deciding how to perform the HS protocol than the local client clock. After all, the whole Tor protocol relies on having a good consensus (for HSDirs, SRV, etc.), so you can't go very far with a bad consensus anyway.
On the subject of clock skewed clients, I opened ticket #19460 with a few suggestions for improving the handling of consensuses with surprising dates. In general, I feel that with the #19460 suggestion implemented, the system can be accommodating towards slightly clock skewed clients both in the forward and backwards directions.
Also, we have logs in place to warn people that their clocks are ultra skewed (based on the received consensus date). We also have mechanisms in place to ensure that we refetch a consensus when the current consensus date is too far off (see update_consensus_networkstatus_downloads()). Now whether all these mechanisms and logs work properly in all cases is something we need to test extensively.
Of course, using the consensus valid_after time is not bulletproof either: there are various edge cases where this can have bad results. For example, imagine a world where the real time is 07:00 UTC, but Alice is a client whose clock is skewed 10 hours backwards, so her local time is 21:00 UTC of the previous day. Imagine that Alice starts up Tor with an old consensus with valid_after 20:00 UTC (because her dirguard lied, or because Alice had that consensus cached). In this case, Alice will not realize that the consensus is hella old, and will try to use it. She will then compute the wrong set of HSDirs, and fail the HS protocol. This case is plausible in theory but also quite hard to protect against, since both Alice and her consensus had wrong but convenient times.
All in all, I feel that using the consensus valid_after time for time period related calculations seems reasonable at this point, but we should do more testing (ideally automated) as we implement the relevant parts.
Here is an initial attempt at figuring out the current Tor behavior when handling consensuses with surprising dates. More work is required here.
For example, do I as a client just ignore and discard a consensus from 6 hours in the future? I don't remember the answer, so I can't do a good job at analyzing your proposed change.
In general, the relevant time checks seem to happen in networkstatus_get_reasonably_live_consensus() and not during consensus parsing. That function is then called by router_have_minimum_dir_info() during bootstrapping. If that function returns NULL, then Tor will get stuck at "Bootstrapped 25%: Loading networkstatus consensus".
Here is the basic logic of networkstatus_get_reasonably_live_consensus():
  #define REASONABLY_LIVE_TIME (24*60*60)

  if (consensus && consensus->valid_after <= now &&
      now <= consensus->valid_until+REASONABLY_LIVE_TIME)
    return consensus;
  else
    return NULL;
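For anyone who wants to poke at this check outside of Tor, the same logic can be wrapped into a small standalone sketch. The struct and the function name below are hypothetical stand-ins, not the real Tor types or signatures; only the condition itself is taken from the snippet above.

```c
#include <stddef.h>
#include <time.h>

#define REASONABLY_LIVE_TIME (24*60*60)

/* Hypothetical minimal consensus: just the two timestamps the check reads. */
typedef struct {
  time_t valid_after;
  time_t valid_until;
} consensus_sketch_t;

/* Return the consensus if it passes the liveness window for `now`,
 * otherwise NULL. `now` is a parameter so that skewed clocks can be
 * simulated without touching the system time. */
static const consensus_sketch_t *
reasonably_live_or_null(const consensus_sketch_t *consensus, time_t now)
{
  if (consensus &&
      consensus->valid_after <= now &&
      now <= consensus->valid_until + REASONABLY_LIVE_TIME)
    return consensus;
  return NULL;
}
```

Passing `now` explicitly is a design choice for testability: feeding in a value one second before valid_after, or one second past valid_until plus 24 hours, exercises the two rejection paths directly.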
And here are the scenarios:
Case #1: Handling consensuses with old dates
If a client receives a consensus with an old date (i.e. the client's clock is skewed forward), the consensus will get verified just fine and Tor won't even log about the skew (XXX maybe we should fix this?). However, when networkstatus_get_reasonably_live_consensus() gets reached, Tor will refuse to handle any consensus whose valid_until date expired more than 24 hours ago.
Case #2: Handling consensuses with future dates
If a client receives a consensus with a valid_after in the future (i.e. the client's clock is skewed backwards), the consensus will get verified fine and a log will appear about the skew ("Our clock is N hours behind the time published in the consensus yada yada..."). However, when networkstatus_get_reasonably_live_consensus() gets reached, Tor will refuse to handle any consensus whose valid_after date is in the future.
We see that while Tor consensus handling is quite flexible towards forward skewed clocks (case #1), it's actually quite strict towards backward skewed clocks (case #2). We might want to rethink how this should work, if we are serious about supporting clock skewed clients. After all, handling consensuses with future dates is safer than handling consensuses with older dates (which are replayable).
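The asymmetry can be put into numbers with a bit of arithmetic on the liveness condition. The sketch below is only an illustration: the two helper names are made up, and the example figures assume a consensus whose valid_until is 3 hours after its valid_after.

```c
#include <time.h>

#define REASONABLY_LIVE_TIME (24*60*60)

/* Largest forward clock error in seconds (local clock ahead of real time)
 * that still satisfies now <= valid_until + REASONABLY_LIVE_TIME,
 * where now = real_now + skew. */
static long
max_forward_skew(time_t valid_until, time_t real_now)
{
  return (long)(valid_until + REASONABLY_LIVE_TIME - real_now);
}

/* Largest backward clock error in seconds (local clock behind real time)
 * that still satisfies valid_after <= now, where now = real_now - skew. */
static long
max_backward_skew(time_t valid_after, time_t real_now)
{
  return (long)(real_now - valid_after);
}
```

Under those assumptions, a client that fetches a consensus 30 minutes after its valid_after can be up to 26.5 hours ahead (95400 seconds) and still pass, but only 30 minutes behind (1800 seconds).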
I also wonder if we can consider the above problem orthogonal wrt prop224. After all, the problem here is on the consensus handling layer, and affects all current clients and not just HS clients. We should first figure out exactly how well the current Tor behavior works with the suggested prop224 changes.
BTW, the analysis above does not consider situations where the dirguard gives us the wrong consensus (by caching accident or malice), or when the clock gets skewed in the middle of Tor's runtime. Or any other weird scenarios I didn't think about.
I will try to think more about this RSN. Till then, feedback is welcome :)