[tor-bugs] #32388 [Core Tor/Tor]: sched: When OR connection opens, always indicate the channel is ready for cells

Tor Bug Tracker & Wiki blackhole at torproject.org
Tue Nov 5 19:58:38 UTC 2019


#32388: sched: When OR connection opens, always indicate the channel is ready for
cells
------------------------------+--------------------------------
     Reporter:  dgoulet       |      Owner:  dgoulet
         Type:  defect        |     Status:  assigned
     Priority:  Medium        |  Milestone:  Tor: 0.4.3.x-final
    Component:  Core Tor/Tor  |    Version:
     Severity:  Normal        |   Keywords:  tor-sched
Actual Points:                |  Parent ID:
       Points:  0.1           |   Reviewer:
      Sponsor:                |
------------------------------+--------------------------------
 == Problem

 In `channel_tls_handle_state_change_on_orconn()`, when called when the OR
 connection becomes open, we have this snippet of code for when the new
 state is OPEN:

 {{{
     channel_change_state_open(base_chan);
     if (connection_or_num_cells_writeable(conn) > 0) {
       scheduler_channel_wants_writes(base_chan);
     }
 }}}

 So basically, the above means we only indicate the scheduler that the
 channel wants to write if we already have cells on the `outbuf`.

 It basically means that if the channel *scheduler* state is `IDLE`
 (initial opening state), it then ends up in state
 `SCHED_CHAN_WAITING_FOR_CELLS` which then means the scheduler will process
 the channel when a cell is queued on it. But ONLY if the channel had cells
 when it was opened.

 This on its own, as a design, is problematic because then what can make
 the channel transition into a state that allows the scheduler to recognize
 the channel as ready to be processed for cells?
 (`SCHED_CHAN_WAITING_FOR_CELLS`).

 Tor currently has 2 callsites that tells the scheduler that a channel
 "wants to write" data on the wire (`scheduler_channel_wants_writes()`,
 remember that function transition the scheduler state of the channel to
 "waiting for cells"):

 1. It is mentioned above that is when the channel becomes `OPEN`.

 2. The other one is when data is _flushed_ from the outbuf, in
 `connection_or_flushed_some()`.

 So once the channel is opened, we only become in the "wants to write"
 state if we previously had cells in the outbuf, which I can assure you is
 not always the case. And the other way is when we just wrote bytes on the
 network but then how can we do that in the first place?

 == What Is Saving Us

 One question is then: Maybe tor code flow makes it that we always have a
 cell in the outbuf when the channel opens?

 I conducted an experiment which made an Entry node of a client to only
 send 1 cell at a time per mainloop round. This made it that the
 `VERSIONs`, `CERTS` and `NETINFO` cell were sent in 3 different mainloop
 round and thus the client received them with a "delay".

 That was enough for the client's channel to have _nothing_ in the outbuf
 when the channel became `OPEN` that is when the `NETINFO` cell is received
 from the relay which skipped the scheduler state change and thus the
 client channel was stuck there in scheduler `IDLE` mode even though the
 channel was in `OPEN` state.

 What is saving us is because the very first thing we write on the wire,
 `VERSIONS` cell, calls the (2) callsite that tells the scheduler to go in
 "waiting for cells" state. And from there, the channel stays in that
 state.

 For now, this seems to be "fine" but any future changes like for instance
 where we wanted to have everything that writes on the `outbuf` to go
 through the scheduler so we can have proper KIST prioritization, will fail
 due to this design.

 == Short-Term Solution

 When the channel opens, we should simply put it in `wants to write` state
 all the time. So even bouncing from `MAINT` to `OPEN` state and vice versa
 will never make some cells stuck in the channel until something is
 explicitly written on the outbuf.

 Furthermore, it should be done in `channel_change_state_()` since this
 affects `channel_t`, it is NOT specific to `channeltls_t`. So in a world
 where we end up with multiple channel type, this whole situation explodes.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/32388>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online


More information about the tor-bugs mailing list