[tor-bugs] #24667 [Core Tor/Tor]: OOM needs to consider the DESTROY queued cells

Fri Dec 22 04:10:12 UTC 2017

#24667: OOM needs to consider the DESTROY queued cells
----------------------------------------+----------------------------------
 Reporter:  dgoulet                     |          Owner:  (none)
     Type:  defect                      |         Status:  new
 Priority:  Medium                      |      Milestone:  Tor:
                                        |  0.3.3.x-final
Component:  Core Tor/Tor                |        Version:
 Severity:  Normal                      |     Resolution:
 Keywords:  tor-cell, tor-circuit, oom  |  Actual Points:
Parent ID:                              |         Points:
 Reviewer:                              |        Sponsor:
----------------------------------------+----------------------------------

Comment (by arma):

 Replying to [ticket:24667 dgoulet]:
 > But also not sending those will affect other relays hanging on dead
 circuits.

 Yeah, this is an ugly one. I was first thinking about the case where a
 relay doesn't send back a destroy cell towards the client, so the client
 ends up with an out-of-sync idea of what the circuit looks like. But in
 that case, eventually the client might still try to close the circuit, and
 things will take care of themselves.

 Where it gets really ugly is if the relay doesn't send a destroy *forward*
 on a circuit. Then the circuit essentially lives forever on the later
 relays: the next relay will notice only when the orconn that would have
 carried the destroy cell dies.

 (If some other orconn on the dangling circuit dies, it could still trigger
 splintered dangling circuits: the relay on the client side of the broken
 orconn will send a truncated cell towards the client, which will just
 be ignored since there's no circuit that it corresponds to. And then the
 splintered dangling circuit will live forever because nobody will ever
 know to tell it to go away.)

 So, silently dropping destroy cells seems really bad, and we should try
 hard to avoid it.

 One option is to queue them somewhere, using the more efficient queue that
 we put in with #24666, and then send them over the next "little while".
 That is, it's not critical to send them immediately, so long as they are
 sent sometime.
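
 To make the "send them over the next little while" idea concrete, here is
 a minimal sketch in Python rather than Tor's C. Everything in it is
 hypothetical (the DestroyQueue class, the per_tick_budget parameter, the
 send callback); it is not the queue from #24666, just an illustration of
 keeping only (circid, reason) pairs and trickling them out a bounded
 number per tick.

```python
from collections import deque


class DestroyQueue:
    """Hypothetical compact queue of pending destroys for one connection."""

    def __init__(self, per_tick_budget):
        if per_tick_budget <= 0:
            raise ValueError("per_tick_budget must be positive")
        self.per_tick_budget = per_tick_budget
        # Only (circ_id, reason) pairs are stored, not full cells.
        self.pending = deque()

    def enqueue(self, circ_id, reason):
        self.pending.append((circ_id, reason))

    def flush_tick(self, send):
        """Send at most per_tick_budget destroys; return how many went out."""
        sent = 0
        while self.pending and sent < self.per_tick_budget:
            circ_id, reason = self.pending.popleft()
            send(circ_id, reason)
            sent += 1
        return sent


def drain(queue, send):
    """Run ticks until the queue is empty; return the number of ticks used."""
    ticks = 0
    while queue.pending:
        queue.flush_tick(send)
        ticks += 1
    return ticks
```

 The point of the sketch is only that nothing is dropped: every queued
 destroy eventually goes out, in order, even if it takes several ticks.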

 Another option would be to rotate the long-term orconn once an event has
 happened that caused us to drop destroy requests. That is, try to work
 towards closing the orconn, which will trigger destruction of the
 remaining circuits. But if even one long-lived circuit remains, that
 option is not so great, since that circuit could keep the orconn open for
 days or even weeks.

 What do we know about the pattern of destroys when we are reacting to an
 oom case? For example, do we end up making decisions like "close all the
 circuits to that relay"? In that case we could close the entire orconn,
 right there, rather than sending thousands of destroy cells. We'd probably
 want to mark it for flush for a little while so its current contents have
 a chance to go out, but that approach seems workable *if* that's the
 pattern of destroys that we want to make.
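
 If that is the pattern, the decision could look roughly like the sketch
 below. It is Python for brevity and entirely hypothetical: plan_teardown
 and close_fraction are made-up names, and whether anything short of "all
 circuits" should trigger a close is an open question, so the default only
 closes the orconn when every circuit on it is being destroyed.

```python
def plan_teardown(circuits_on_conn, to_destroy, close_fraction=1.0):
    """Decide between closing the whole connection and sending destroys.

    circuits_on_conn: set of circuit IDs currently on this connection.
    to_destroy: set of circuit IDs the OOM handler wants to kill.
    close_fraction: hypothetical threshold; 1.0 means "only if all of them".
    """
    doomed = set(to_destroy) & set(circuits_on_conn)
    if circuits_on_conn and len(doomed) >= close_fraction * len(circuits_on_conn):
        # Closing the connection tears down every circuit on it at once.
        return ("close_conn", [])
    return ("send_destroys", sorted(doomed))
```

 A real version would also need the "mark it for flush for a little while"
 step before the close, which this sketch leaves out.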

 Another option would be to make multidestroy cells that give you a huge
 pile of circids+reasons in a single cell -- basically extend the notion of
 the destroy queue into something that you can transport wholesale to a
 neighbor relay.
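
 As a rough feel for how much a multidestroy cell could carry, here is a
 sketch of one possible payload layout, again in Python. The layout is an
 assumption, not a proposal from the spec: a 2-byte count followed by
 (4-byte circid, 1-byte reason) entries, zero-padded to the 509-byte
 payload of a fixed-length cell. With that layout one cell holds 101
 destroys.

```python
import struct

PAYLOAD_LEN = 509                  # payload bytes in a fixed-length cell
COUNT = struct.Struct("!H")        # assumed: 2-byte entry count
ENTRY = struct.Struct("!IB")       # assumed: 4-byte circid + 1-byte reason
MAX_ENTRIES = (PAYLOAD_LEN - COUNT.size) // ENTRY.size


def pack_multidestroy(entries):
    """Pack (circ_id, reason) pairs into as few payloads as possible."""
    payloads = []
    for start in range(0, len(entries), MAX_ENTRIES):
        chunk = entries[start:start + MAX_ENTRIES]
        body = COUNT.pack(len(chunk))
        body += b"".join(ENTRY.pack(circ_id, reason) for circ_id, reason in chunk)
        # Zero-pad so every payload has the fixed cell payload length.
        payloads.append(body.ljust(PAYLOAD_LEN, b"\x00"))
    return payloads


def unpack_multidestroy(payload):
    """Recover the (circ_id, reason) pairs from one payload."""
    (count,) = COUNT.unpack_from(payload, 0)
    return [
        ENTRY.unpack_from(payload, COUNT.size + i * ENTRY.size)
        for i in range(count)
    ]
```

 So "thousands of destroy cells" would shrink by about two orders of
 magnitude under this assumed encoding.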

 Another option would be to make a destroy-except cell, where if you want
 to close a big pile of circids but leave a few open, you send over the
 ones *not* to destroy.
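
 A sketch of how a sender might pick between a plain destroy list and a
 destroy-except list, and how the receiver would apply either one. The
 message names and both functions are hypothetical, and the sketch assumes
 both sides agree on which circuits exist on the connection, which a real
 design would have to deal with (for example circuits created while the
 message is in flight).

```python
def encode_bulk_close(circuits_on_conn, to_destroy):
    """Pick whichever list is shorter: circuits to close, or circuits to keep."""
    doomed = set(to_destroy) & set(circuits_on_conn)
    survivors = set(circuits_on_conn) - doomed
    if len(survivors) < len(doomed):
        return ("destroy_except", sorted(survivors))
    return ("destroy", sorted(doomed))


def apply_bulk_close(circuits_known, message):
    """Return the set of circuits still open after applying the message."""
    kind, ids = message
    if kind == "destroy_except":
        # Everything not explicitly listed gets closed.
        return set(circuits_known) & set(ids)
    return set(circuits_known) - set(ids)
```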

 While we're at it, we might want to get rid of the "send a truncated cell
 toward the client, and then let the client actually destroy the circuit"
 design. We built Tor that way so that clients could choose to have some
 smarter reaction in the future, like re-extending the circuit to some
 different next hop. But in practice we haven't figured out a smarter
 reaction that doesn't draw in a lot of complexity in terms of anonymity
 analysis, so maybe we should opt to simplify the design (and thus reduce
 network load).

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/24667#comment:1>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online