Hi all,
Here's a summary of the current state of fallback directory mirrors.
Overall Design
This repository contains a list of potential fallback directory mirrors (a fallback "offer list"), and a script that checks each mirror for speed and reliability: https://gitweb.torproject.org/fallback-scripts.git/
There is a "Fallback Scripts" component for tickets: https://trac.torproject.org/projects/tor/query?status=!closed&component=...
The fallback system is designed to gracefully degrade as fallback directory mirrors fail. Failures shift load to directory authorities, and cause brief delays during client bootstrap.
We expect the system to operate well, even if all the fallbacks have failed. But we try to keep the fallback failure rate below 20-30%. When the failure rate gets too high, we rebuild the fallback list.
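As a rough illustration of the rebuild rule above, here is a minimal Python sketch. The function names, the `results` mapping, and the 25% default threshold are assumptions picked for the example (the threshold simply sits inside the 20-30% band); none of them come from the fallback-scripts repository.

```python
def fallback_failure_rate(results):
    """Return the fraction of fallbacks that failed their last check.

    `results` is a hypothetical mapping from relay fingerprint to
    True (fallback is working) or False (fallback has failed).
    """
    if not results:
        raise ValueError("no fallbacks to evaluate")
    failed = sum(1 for ok in results.values() if not ok)
    return failed / len(results)


def needs_rebuild(results, threshold=0.25):
    """Return True when the failure rate reaches the chosen threshold.

    The default threshold is an assumption for this sketch, not a value
    used by any real script.
    """
    return fallback_failure_rate(results) >= threshold
```

With 150 fallbacks, 30 failures give a rate of 20%, which stays under the assumed 25% threshold, while 45 failures give 30% and would trigger a rebuild.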
Regular Tasks
This ticket is the parent ticket for the next fallback rebuild: https://trac.torproject.org/projects/tor/ticket/30971
This ticket contains the "offer list" changes that relay operators have requested. I usually commit them all at once, but you should feel free to do them incrementally: https://trac.torproject.org/projects/tor/ticket/30972
Sometimes, we don't have enough relays on the offer list, and we have to ask relay operators to opt in to the list. Ideally, we want at least 100 fallbacks; we usually have between 120 and 160.
Future Work
It's hard to verify changes to the offer list. Changes are usually sent by email or through trac tickets. There's no reliable trust path from the relay key to the email or ticket.
The opt-in process is also manual, and it can be time-consuming.
To resolve these issues, I had planned to add a signed fallback offer line to relay descriptors: https://trac.torproject.org/projects/tor/ticket/24839
Instead of checking the list in the fallback-scripts repository, the script can check relay descriptors. (Or it can check both during the transition period.)
Unresolved Issues
Fallbacks eventually see the entire set of clients. Clients that are active all the time may only ever contact one fallback. (Clients re-use the same fallback for authority keys, and then switch to the consensus as soon as possible.) But clients whose consensuses have expired will choose new fallbacks at random.
Ideally, clients should select fallback (and maybe authority) guards. That is, they should retry previously-selected fallbacks. There are some tradeoffs here: a bad fallback guard can continue to manipulate its client's view of the network. We can avoid this issue by selecting multiple fallback guards.
Clients will need persistent state to remember their guards, so transient systems like TAILS won't benefit from this change.
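To make the fallback guard idea concrete, here is a minimal Python sketch of selecting several guards and reusing them across restarts. It is not how tor behaves today: the function name, the JSON state file, and the default of three guards are all assumptions made for illustration.

```python
import json
import random
from pathlib import Path


def load_or_pick_fallback_guards(fallbacks, state_path, num_guards=3, rng=random):
    """Return a stable list of fallback guards, persisted in state_path.

    `fallbacks` is the current list of fallback fingerprints. Guards that
    were chosen earlier and are still listed are kept; new ones are only
    picked at random to fill the gaps. All names here are hypothetical.
    """
    path = Path(state_path)
    saved = []
    if path.exists():
        try:
            loaded = json.loads(path.read_text())
            if isinstance(loaded, list):
                saved = loaded
        except (OSError, ValueError):
            # Unreadable or corrupt state: fall back to a fresh selection.
            saved = []

    # Retry previously selected guards that are still on the fallback list.
    current = set(fallbacks)
    guards = [fp for fp in saved if fp in current][:num_guards]

    # Top up with random fallbacks that are not already guards.
    candidates = [fp for fp in fallbacks if fp not in guards]
    missing = num_guards - len(guards)
    if missing > 0:
        guards.extend(rng.sample(candidates, min(missing, len(candidates))))

    # Persistent state is what makes the selection sticky across restarts.
    path.write_text(json.dumps(guards))
    return guards
```

A client without persistent state would hit the "fresh selection" path on every start, which is why a transient system gets no benefit from this scheme.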
T
On Wed, Apr 22, 2020 at 11:56:54AM +1000, teor wrote:
> a bad fallback guard can continue to manipulate its client's view of the network
This is only true to the extent that the fallback guard can choose which of three still-valid consensuses to give to the client, right?
On 22 Apr 2020, at 12:27, Ian Goldberg iang@uwaterloo.ca wrote:
> On Wed, Apr 22, 2020 at 11:56:54AM +1000, teor wrote:
>> a bad fallback guard can continue to manipulate its client's view of the network
> This is only true to the extent that the fallback guard can choose which of three still-valid consensuses to give to the client, right?
Not quite.
Clients tolerate recently-expired consensuses for some operations, up to 72 hours in some cases.
When I last checked, TAILS set its system clock off the date in the consensus it receives.
Clients also download authority certificates from fallback directory mirrors. I think that's the whole trust path from the hard-coded authority fingerprints, to the certificates, and then a valid consensus.
Since clients use an ORPort connection to download consensuses, a malicious fallback directory mirror can also provide them with:
* the wrong date (triggering a clock skew warning)
* the wrong external IP address (not used for much)
* malicious directory documents
  * note that decompression and some parsing happen before the signature checks
* slow transfer speeds (like slowloris)
Using multiple fallbacks mitigates most of these issues.
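As one example of how multiple fallbacks can help, here is a minimal Python sketch that flags a fallback whose reported date disagrees with the others. This is an illustration only, not tor's clock skew logic: the function name, the `reported_skews` mapping, and the one-hour tolerance are assumptions.

```python
from statistics import median


def outlier_fallbacks(reported_skews, tolerance=3600):
    """Return fingerprints whose reported clock skew disagrees with the rest.

    `reported_skews` is a hypothetical mapping from fallback fingerprint to
    the skew in seconds (the fallback's date minus the local clock).
    A fallback is flagged when its skew is more than `tolerance` seconds
    away from the median of all reports.
    """
    if len(reported_skews) < 3:
        raise ValueError("need at least three fallbacks to compare")
    mid = median(reported_skews.values())
    return sorted(
        fp for fp, skew in reported_skews.items() if abs(skew - mid) > tolerance
    )
```

With a single fallback there is nothing to compare against, so a wrong date cannot be told apart from a wrong local clock. With three or more, one lying fallback stands out, as long as most of the others are honest.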
T