[tor-bugs] #33018 [Core Tor/Tor]: Dir auths using an unsustainable 400+ mbit/s, need to diagnose and fix

Tor Bug Tracker & Wiki blackhole at torproject.org
Wed Jan 22 22:44:12 UTC 2020


#33018: Dir auths using an unsustainable 400+ mbit/s, need to diagnose and fix
----------------------------+------------------------
 Reporter:  arma            |          Owner:  (none)
     Type:  defect          |         Status:  new
 Priority:  Medium          |      Milestone:
Component:  Core Tor/Tor    |        Version:
 Severity:  Normal          |     Resolution:
 Keywords:  network-health  |  Actual Points:
Parent ID:                  |         Points:
 Reviewer:                  |        Sponsor:
----------------------------+------------------------

Comment (by teor):

 Replying to [comment:5 arma]:
 > Possible next steps beyond the above branch which I think would be worth
 > taking:
 >
 > 1. Whitelist (i.e. never send 503's) IP addresses of relays in the
 > consensus too. Or maybe it's better to consider relays in our descriptor
 > list (i.e. if we vote about it, whitelist it). I have a commented-out
 > function conn_addr_is_relay() in the above branch which somebody would
 > need to write, and it will need to be fast fast fast or the lookup won't
 > be worth it. ahf sketched out that function as "if we extend routerlist_t
 > to have a map from addr to a routerinfo_t and from the v6 address, then I
 > think you can do it fast."

 It's better to consider the descriptor list; otherwise new relays will
 have trouble joining the consensus. It's important to whitelist both IPv4
 and IPv6 addresses, because Sponsor 55 will add IPv6 directory requests.
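
 As a rough illustration of the lookup ahf describes, here is a minimal
 Python sketch (not Tor code) of a map from IPv4 and IPv6 addresses to the
 descriptors we hold. The class RelayAddressIndex and its methods are
 hypothetical; only the name conn_addr_is_relay() is borrowed from the
 commented-out function mentioned above, and the addresses are documentation
 examples.

```python
import ipaddress
from collections import defaultdict


class RelayAddressIndex:
    """Hypothetical address -> relay map, keyed on parsed IP addresses."""

    def __init__(self):
        # Several relays may share one address, so keep a set per address.
        self._by_addr = defaultdict(set)

    def add_descriptor(self, nickname, ipv4=None, ipv6=None):
        # Index whichever addresses the descriptor lists.
        for text in (ipv4, ipv6):
            if text is not None:
                self._by_addr[ipaddress.ip_address(text)].add(nickname)

    def remove_descriptor(self, nickname):
        # Linear scan: removal is rare compared with lookups.
        for addr in list(self._by_addr):
            self._by_addr[addr].discard(nickname)
            if not self._by_addr[addr]:
                del self._by_addr[addr]

    def conn_addr_is_relay(self, text):
        # Average-case constant-time hash lookup; parsing normalises
        # different textual spellings of the same IPv6 address.
        return ipaddress.ip_address(text) in self._by_addr


index = RelayAddressIndex()
index.add_descriptor("relayA", ipv4="192.0.2.10", ipv6="2001:db8::1")
print(index.conn_addr_is_relay("192.0.2.10"))
print(index.conn_addr_is_relay("198.51.100.7"))
```

 The point of the sketch is only that the check on the hot path is a
 single hash lookup per connection, for either address family.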

 > 2. Whitelist the IP address for the consensus health checker (I think
 > that might be carinatum.tpo) so it stops yelling and thinking we're down.
 > :)
 >
 > 3. Consider giving higher priority to microdesc-consensus and microdesc
 > replies. That is, I would rather have relays successfully cache and mirror
 > the microdesc flavored stuff, if I have to choose.
 >
 > 4. Make a change to the Tor code so relays remain on the client fetch
 > schedule (i.e. fetch from relays and fallback dirs) until they publish
 > their descriptor. That way we remove one variable from the mystery, i.e.
 > "maybe these Tors that are mobbing me are all configured as relays but
 > haven't found themselves reachable so that's why I don't know about them."
 > (I recognize we'll need to wait some years until everybody has upgraded.
 > No time like the present to get started then.)

 Relays which don't know their address will fetch from authorities, so they
 get their address from a trusted source. But address discovery only needs
 one successful fetch (or two, once relays are trying to guess their IPv6
 address). After that, relays can use the client fetch targets.

 The client schedule is a slightly different thing: it only affects fetch
 timing. It's ok for relays to use the client fetch timing until they
 publish their descriptor. Relays on bad links might bootstrap a bit more
 slowly or unreliably, but those relays were never going to be good relays
 anyway.

 And then there's the client and relay fetch method. Should relays use
 ORPorts for fetches until they publish their descriptor? It probably
 wouldn't hurt, and it would make address detection more secure.
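
 To make the separation between fetch targets and fetch timing concrete,
 here is a minimal Python sketch (not Tor code) of the decision described
 above. The function name and the returned labels are hypothetical, and
 "relay_targets" / "relay_timing" simply stand for whatever relays do after
 publishing.

```python
def choose_fetch_plan(address_known, descriptor_published):
    """Return a (targets, timing) pair for a relay's directory fetches."""
    if descriptor_published:
        # Once published, the relay uses its normal relay behaviour.
        return ("relay_targets", "relay_timing")
    if not address_known:
        # Address discovery should come from a trusted source.
        return ("authorities", "client_timing")
    # Address known but not yet published: behave like a client.
    return ("client_targets", "client_timing")


print(choose_fetch_plan(address_known=False, descriptor_published=False))
print(choose_fetch_plan(address_known=True, descriptor_published=False))
print(choose_fetch_plan(address_known=True, descriptor_published=True))
```

 In this sketch the timing stays on the client schedule in both
 pre-publication states; only the targets change once the address is known.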

 > 5. Look for patterns in the non-relay IP addresses that are bombing us
 > with consensus fetch attempts. How often do they come back asking for
 > another one? Does that timing pattern make us think they are a well
 > behaving Tor that somehow thinks the dir auths' dirports are the best
 > places to ask?

 We should also check the HTTP headers sent as part of the requests. They
 will tell us a lot about the tor version (or other program) that's sending
 the requests.
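
 One possible way to do both analyses at once (re-fetch timing per address
 and header patterns across requests) is sketched below in Python. It
 assumes we already have a log of (ip, unix_time, headers) records; that
 record shape, the function name, and the sample values are made up for
 illustration and are not taken from real traffic.

```python
from collections import Counter, defaultdict


def summarize_requests(requests):
    """Group requests by header fingerprint and by client address.

    requests: iterable of (ip, unix_time, headers_dict) tuples.
    Returns (fingerprint_counts, intervals_by_ip).
    """
    fingerprints = Counter()
    times = defaultdict(list)
    for ip, when, headers in requests:
        # Header names are case-insensitive, so lower-case them.
        fp = tuple(sorted((k.lower(), v) for k, v in headers.items()))
        fingerprints[fp] += 1
        times[ip].append(when)
    intervals = {}
    for ip, stamps in times.items():
        stamps.sort()
        # Seconds between consecutive requests from the same address.
        intervals[ip] = [b - a for a, b in zip(stamps, stamps[1:])]
    return fingerprints, intervals


sample = [
    ("192.0.2.1", 1000, {"Host": "example", "Accept-Encoding": "gzip"}),
    ("192.0.2.1", 4600, {"host": "example", "accept-encoding": "gzip"}),
    ("198.51.100.2", 1200, {"Host": "example"}),
]
counts, gaps = summarize_requests(sample)
print(counts.most_common(1))
print(gaps["192.0.2.1"])
```

 A small number of dominant fingerprints, or very regular intervals, would
 hint at one misbehaving implementation rather than many unrelated clients.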

 > 6. Consider a design for a more aggressive load shedding plan. Right now
 > we send the 503 if we don't have the space left in our global write
 > bucket, or we ran out of global write bucket the previous second. For
 > vanilla-flavored dirport consensus responses to non-relay IP addresses, I
 > could imagine something much more aggressive, like "could I serve ten of
 > these? No? Then 503." with the goal of actually leaving some room to serve
 > the more important ones rather than always being full or nearly full.
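
 The "could I serve ten of these?" rule in item 6 can be sketched as
 follows. This is a minimal Python illustration, not Tor's implementation:
 the function name, its parameters, and the byte counts are hypothetical,
 and the default headroom of 10 just mirrors the example in the quote.

```python
def should_send_503(bucket_bytes, response_bytes, exhausted_last_second,
                    low_priority, headroom=10):
    """Decide whether to refuse a directory request with a 503."""
    if exhausted_last_second:
        # Existing rule: we ran out of write bucket the previous second.
        return True
    # Low-priority responses must leave room for `headroom` copies;
    # everything else only needs room for itself (the existing rule).
    needed = response_bytes * (headroom if low_priority else 1)
    return bucket_bytes < needed


# A bucket with room for five responses: refuse the low-priority request,
# but still serve the more important one.
print(should_send_503(5000, 1000, False, low_priority=True))
print(should_send_503(5000, 1000, False, low_priority=False))
```

 Here "low priority" would mean a vanilla-flavored dirport consensus
 response to a non-relay address, per the quote above.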

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/33018#comment:9>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

