Fallback Directory Handover

22 Apr 2020

Hi all,

Here's a summary of the current state of fallback directory mirrors.

Overall Design

This repository contains a list of potential fallback directory mirrors
(a fallback "offer list"), and a script that checks each mirror for speed
and reliability:
https://gitweb.torproject.org/fallback-scripts.git/

There is a "Fallback Scripts" component for tickets:
https://trac.torproject.org/projects/tor/query?status=!closed&component=Core...

The fallback system is designed to gracefully degrade as fallback
directory mirrors fail. Failures shift load to directory authorities,
and cause brief delays during client bootstrap.

We expect the system to operate well, even if all the fallbacks have
failed. But we try to keep the fallback failure rate below 20-30%.
When the failure rate gets too high, we rebuild the fallback list.
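
As a rough illustration of the rebuild rule above, here is a minimal
Python sketch. The function names, the input format, and the 25% default
threshold (picked from the middle of the 20-30% range) are assumptions
for illustration. They are not taken from the fallback-scripts repository.

```python
def failure_rate(fallback_up):
    """Return the fraction of fallbacks that are currently failing.

    fallback_up is a hypothetical mapping from a fallback identifier
    to True (reachable) or False (failed).
    """
    if not fallback_up:
        raise ValueError("no fallbacks to check")
    failed = sum(1 for up in fallback_up.values() if not up)
    return failed / len(fallback_up)


def needs_rebuild(fallback_up, threshold=0.25):
    """Return True once the failure rate reaches the assumed threshold."""
    return failure_rate(fallback_up) >= threshold
```

The threshold is a parameter because the acceptable failure rate is a
range, not a single fixed number.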

Regular Tasks

This ticket is the parent ticket for the next fallback rebuild:
https://trac.torproject.org/projects/tor/ticket/30971

This ticket contains the "offer list" changes that relay operators have
requested. I usually commit them all at once, but you should feel free
to do them incrementally:
https://trac.torproject.org/projects/tor/ticket/30972

Sometimes, we don't have enough relays on the offer list, and we have
to ask relay operators to opt in to the list. Ideally, we want at least 100
fallbacks. We usually have between 120 and 160.

Future Work

It's hard to verify changes to the offer list. Changes are usually sent by
email or through trac tickets. There's no reliable trust path from the
relay key to the email or ticket.

The opt-in process is also manual, and it can be time-consuming.

To resolve these issues, I had planned to add a signed fallback offer line
to relay descriptors:
https://trac.torproject.org/projects/tor/ticket/24839

Instead of checking the list in the fallback-scripts repository, the script
can check relay descriptors. (Or it can check both during the transition
period.)
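
One way to picture the transition period is a merge of the two offer
sources. This is only a sketch: the descriptor offer line is not specified
here, so the code assumes some earlier step has already parsed and verified
the offers and reduced each source to a list of relay fingerprints. The
function name and the "repo" / "descriptor" labels are hypothetical.

```python
def merge_offers(repo_offers, descriptor_offers):
    """Map each relay fingerprint to the set of sources that offer it.

    Both arguments are iterables of fingerprint strings. Parsing and
    signature checking are assumed to have happened already.
    """
    sources = {}
    for fingerprint in repo_offers:
        sources.setdefault(fingerprint, set()).add("repo")
    for fingerprint in descriptor_offers:
        sources.setdefault(fingerprint, set()).add("descriptor")
    return sources
```

Keeping the source labels makes it easy to see which relays are still
listed only in the repository before that list is retired.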

Unresolved Issues

Fallbacks eventually see the entire set of clients. Clients that are active 
all the time may only ever contact one fallback. (Clients re-use the same
fallback for authority keys, and then switch to the consensus as soon as
possible.) But clients whose consensuses have expired will choose new
fallbacks at random.

Ideally, clients should select fallback (and maybe authority) guards.
That is, they should retry previously-selected fallbacks. There are some
tradeoffs here: a bad fallback guard can continue to manipulate its
client's view of the network. We can avoid this issue by selecting multiple
fallback guards.

Clients will need persistent state to remember their guards, so transient
systems like Tails won't benefit from this change.
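
The fallback guard idea can be sketched in a few lines of Python. This is
not how Tor selects fallbacks today, and it is not a proposed design. The
function name, the JSON state file, and the default of three guards are
all assumptions for illustration.

```python
import json
import os
import random


def load_or_pick_guards(fallbacks, state_path, num_guards=3, rng=random):
    """Return a stable list of fallback guards, persisting it to disk."""
    guards = []
    if os.path.exists(state_path):
        with open(state_path) as f:
            saved = json.load(f)
        # Keep only saved guards that are still on the fallback list.
        guards = [g for g in saved if g in fallbacks]
    # Top up with random fallbacks if we have too few guards.
    candidates = [fb for fb in fallbacks if fb not in guards]
    missing = num_guards - len(guards)
    if missing > 0:
        guards += rng.sample(candidates, min(missing, len(candidates)))
    with open(state_path, "w") as f:
        json.dump(guards, f)
    return guards
```

Choosing several guards limits how much a single bad fallback can shape a
client's view of the network. Without a writable state file, every run
picks fresh guards, which is the case for transient systems.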

T

-- 
teor