[tor-bugs] #24700 [Core Tor/Tor]: sched: With KIST, a channel can be added more than once in the pending list
Tor Bug Tracker & Wiki
blackhole at torproject.org
Thu Dec 21 21:34:32 UTC 2017
#24700: sched: With KIST, a channel can be added more than once in the pending list
------------------------------+--------------------------------
Reporter: dgoulet | Owner: dgoulet
Type: defect | Status: assigned
Priority: Medium | Milestone: Tor: 0.3.2.x-final
Component: Core Tor/Tor | Version: Tor: 0.3.2.1-alpha
Severity: Normal | Keywords: tor-sched, kist
Actual Points: | Parent ID:
Points: | Reviewer: #23993
Sponsor: |
------------------------------+--------------------------------
Here is the flow for this to happen (and it has been observed):
The scheduler flushes 100 bytes onto the outbuf of a channel. Then the
channel it put in the re-add list (which is to add back the channel in the
pending list at the _end_ of the scheduler loop) because it can't write
anymore. Its state is set to `waiting_to_write`.
We then handle the next channel in the main loop but before that we'll try
to write to kernel the outbuf of the previous channel which is in the re-
add list.
If the outbuf is fully drained, `scheduler_channel_wants_writes()` is
called which will add the channel back in the pending list *IF* it is in
the `waiting_to_write` state which it is because it was set to that state
just before being added to the re-add list.
Scheduler loop ends and we end up with twice the channel in the pending
list. This can go on for a while resulting in the same channel being added
many more times.
There are two ways to fix that, one quick and one more logical.
1. The quick one is to let the state of the channel in PENDING mode (which
is what Vanilla does) before adding it to the re-add list. That way, it
won't get added back in the pending list by any callbacks.
2. Originally, the assumption was that KIST takes care of the write on the
socket and only KIST. But the reality is different, KIST will trigger
writes to the kernel but anything after that, any errors or retry is
handled by libevent `write_event` (#24448, and #24449).
So, it doesn't make sense for KIST to reschedule the channel as pending
if it is waiting to write on the socket because from that point on, it
will be the job of libevent to actually flush it with its `poll(POLLOUT)`
feature. Thus the fix is to never add back a channel that is waiting to
write.
I personally would like to have (2) for two reasons. First, we would save
CPU time and useless syscalls in this fast path. Second, adding a channel
that is waiting to write back into the pending list is not really good in
terms of priority. It should not get considered for another round in the
loop, it should simply wait until the socket has been written since the
assumption in (2) is false in the first place.
--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/24700>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online
More information about the tor-bugs
mailing list