On Wed, May 30, 2018 at 05:19:26PM -0700, Nick Mathewson wrote:
In proposal 275, we give reasons for dropping the published-on field from consensus documents, to improve the performance of consensus diffs. We've already changed Tor (as of 0.2.9.11) to allow us to set those fields far in the future -- but unfortunately, there is still one use case that requires them: relays use the published-on field to tell if they are about to fall out of the consensus and need to make new descriptors.
Makes sense. I agree this is worth fixing.
Mechanism One: The StaleDesc flag
Authorities should begin voting on a new StaleDesc flag.
When authorities vote, if the most recent published_on date for a descriptor is over DESC_IS_STALE_INTERVAL in the past, the authorities should vote to give the StaleDesc flag to that relay.
If any relay sees that it has the StaleDesc flag, it should upload some time in the first half of the voting interval. (Implementors should take care not to re-upload over and over, though: Relays won't lose the flag until the next voting interval is reached.)
(Define DESC_IS_STALE_INTERVAL as equal to FORCE_REGENERATE_DESCRIPTOR_INTERVAL.)
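To make the authority-side rule concrete, here is a minimal sketch of the staleness check. It is not Tor's code: the function name is hypothetical, and the 18-hour value is only a placeholder for whatever FORCE_REGENERATE_DESCRIPTOR_INTERVAL is set to.

```python
from datetime import datetime, timedelta, timezone

# Placeholder value (assumption): the proposal only says this equals
# FORCE_REGENERATE_DESCRIPTOR_INTERVAL.
DESC_IS_STALE_INTERVAL = timedelta(hours=18)


def should_vote_stale_desc(latest_published_on, now,
                           stale_interval=DESC_IS_STALE_INTERVAL):
    """Hypothetical helper: True if the relay's newest descriptor was
    published more than stale_interval before `now`."""
    return now - latest_published_on > stale_interval


vote_time = datetime(2018, 5, 31, 0, 0, tzinfo=timezone.utc)
fresh_desc = vote_time - timedelta(hours=2)
stale_desc = vote_time - timedelta(hours=19)

for label, published in (("fresh", fresh_desc), ("stale", stale_desc)):
    print(label, should_vote_stale_desc(published, vote_time))
```

The comparison is strict ("over" the interval), so a descriptor exactly DESC_IS_STALE_INTERVAL old would not yet get the flag under this reading.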
I think this is the mechanism we should pick.
But I think it needs a name that is more than just a tiny typo away from Stable. :) Maybe "OldDesc"?
Also, I think you don't mean "the first half of the voting interval". Maybe you mean the first half hour after that consensus document appears (not sure what we call that time period), but actually the relay won't necessarily even *get* the new consensus until partway through that period, and maybe not even until later depending on what sort of update schedule it's on. But I agree with the notion that the relay should try to upload a new descriptor sufficiently before the next vote starts, and also it should limit the number of times it tries in reaction to any given consensus it sees.
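One way to limit the number of uploads per consensus is to key the retry counter on the consensus itself. The sketch below is only an illustration under assumptions: the class and parameter names are hypothetical, the consensus is identified by its valid-after time, and the flag name is left as a constant since it may end up being "OldDesc" instead.

```python
STALE_FLAG = "StaleDesc"  # assumption: the final flag name is still undecided


class StaleFlagReactor:
    """Hypothetical relay-side helper: allow at most a fixed number of
    descriptor uploads in reaction to any single consensus."""

    def __init__(self, max_uploads_per_consensus=1):
        self.max_uploads = max_uploads_per_consensus
        self._consensus_id = None
        self._uploads = 0

    def should_upload(self, consensus_id, my_flags):
        # A new consensus resets the per-consensus upload budget.
        if consensus_id != self._consensus_id:
            self._consensus_id = consensus_id
            self._uploads = 0
        if STALE_FLAG not in my_flags:
            return False
        if self._uploads >= self.max_uploads:
            return False
        self._uploads += 1
        return True


reactor = StaleFlagReactor()
flags = {"Running", "Valid", STALE_FLAG}
print(reactor.should_upload("2018-05-31 00:00:00", flags))  # first reaction
print(reactor.should_upload("2018-05-31 00:00:00", flags))  # same consensus again
```

This deliberately says nothing about *when* within the period to upload; it only ensures the relay does not re-upload over and over while it waits for the flag to clear.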
Mechanism Two: Uploading more frequently when rejected.
Tor relays should remember the last time at which they uploaded a descriptor that was accepted by a majority of dirauths. If this time is more than FAST_RETRY_DESCRIPTOR_INTERVAL in the past, we mark our descriptor as dirty from mark_my_descriptor_dirty_if_too_old().
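For comparison, the relay-side state that Mechanism Two needs could look roughly like this. It is a sketch, not the Tor implementation: the class and method names are hypothetical, the interval value is a placeholder, and treating "never accepted by a majority" as dirty is an assumption the proposal text does not spell out.

```python
# Placeholder value in seconds (assumption): the proposal text quoted here
# does not give a number for FAST_RETRY_DESCRIPTOR_INTERVAL.
FAST_RETRY_DESCRIPTOR_INTERVAL = 90 * 60


class UploadState:
    """Hypothetical record of the last majority-accepted upload."""

    def __init__(self):
        self.last_majority_accept = None  # seconds since epoch, or None

    def record_upload(self, now, n_accepted, n_authorities):
        # "Majority" here means strictly more than half of the dirauths.
        if n_accepted > n_authorities // 2:
            self.last_majority_accept = now

    def descriptor_is_dirty(self, now):
        # Assumption: no majority-accepted upload yet counts as dirty.
        if self.last_majority_accept is None:
            return True
        return now - self.last_majority_accept > FAST_RETRY_DESCRIPTOR_INTERVAL


state = UploadState()
state.record_upload(now=1000, n_accepted=5, n_authorities=9)
print(state.descriptor_is_dirty(now=1000 + FAST_RETRY_DESCRIPTOR_INTERVAL + 1))
```

Note that `n_accepted` only reflects what the relay saw at upload time, which is exactly the piece of state that is hard to keep accurate.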
The reason I prefer the OldDesc mechanism is that trying to keep state on the relay is going to be hard, given that directory authorities share descriptors with each other. That is, if you upload your descriptor to nine dir auths, and eight of them reject it but one of them accepts it, then it might or might not be the case that all nine of them happily have the new descriptor by the next voting period -- and if one of them does get your descriptor from another and decide to like it that way, then it's not straightforward for the relay to know that that's happened.
(An example of possible pathology: a relay uploads a descriptor D1 to an authority and it's rejected, and then the authority learns about D1 from another authority's vote and fetches it and likes it, and then the relay makes a descriptor D2 because it thinks D1 wasn't liked, but then D2 isn't sufficiently new compared to D1 so the authority rejects D2.)
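The D1/D2 sequence can be replayed against a toy authority. Everything here is an assumption made for illustration: the class, its methods, the `reject` switch standing in for whatever caused the first rejection, and the minimum-age rule standing in for "not sufficiently new". It does not model how real authorities compare descriptors.

```python
# Hypothetical threshold in seconds: a replacement must be at least this
# much newer than the descriptor the authority already holds.
MIN_REPLACE_INTERVAL = 2 * 60 * 60


class ToyAuthority:
    def __init__(self):
        self.current = None  # (name, published) of the descriptor we hold

    def _too_similar(self, published):
        return (self.current is not None
                and published - self.current[1] < MIN_REPLACE_INTERVAL)

    def upload(self, name, published, reject=False):
        """Direct upload from the relay; `reject` simulates the unexplained
        rejection of D1 in the example."""
        if reject or self._too_similar(published):
            return False
        self.current = (name, published)
        return True

    def fetch_from_peer(self, name, published):
        """The authority learns the descriptor from another authority's vote."""
        if not self._too_similar(published):
            self.current = (name, published)


auth = ToyAuthority()
d1_ok = auth.upload("D1", published=0, reject=True)  # relay sees a rejection
auth.fetch_from_peer("D1", published=0)              # authority gets D1 anyway
d2_ok = auth.upload("D2", published=600)             # relay retries too soon
print(d1_ok, d2_ok, auth.current)
```

From the relay's point of view both uploads failed, even though the authority is holding D1 and will vote on it.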
Implications for proposal 275
Once most relays are running versions that support the features above, and once authorities are generating consensuses with the StaleDesc flag, there will no longer be a need to keep the published time in consensus documents accurate -- we can start setting it to some time in the distant future, per proposal 275.
Sounds like a good plan.
--Roger