commit 207dba02d6adcf794b19e00b330548348c950878 Author: Damian Johnson atagar@torproject.org Date: Tue Mar 18 09:48:50 2014 -0700
Lowering logging about undelivered messages to the info level
Our BaseController's test_msg_repeatedly integ test hammers the holy bejezus out of our controller socket in the hope of triggering exotic concurrency issues that might lead to issues like deadlock. This has been an impressively great test at ferreting out exotic issues and this is no different. Our Jenkins tests frequently log...
03/12/2014 15:37:57 [NOTICE] BUG: the msg() function failed to deliver a response: Authentication required.
After a couple hours of looking into this I gotta admit that I'm frustrated by the delicacy of our message handling. It works great in practice, but in theory issues such as socket errors or edge cases can cause undelivered messages.
In this case rapidly connecting and closing the socket while simultaneously communicating on it causes the socket's recv() to occasionally raise exceptions. These exceptions are reported to callers, leaving enqueued messages that end up being undelivered (logging this notice).
After much head scratching and digging into our socket class I've come to the realization that this is about as good as it's gonna get. If we use our recv queue to communicate *both* messages and exceptions then it is to be expected that exceptions can sometimes result in undelivered messages.
Lowering the severity of the log message appropriately. --- stem/control.py | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/stem/control.py b/stem/control.py index 4d7f5b3..0d91587 100644 --- a/stem/control.py +++ b/stem/control.py @@ -346,9 +346,20 @@ class BaseController(object): # # - This is a leftover response for a msg() call. We can't tell who an # exception was earmarked for, so we only know that this was the case - # if it's a ControlMessage. This should not be possible and indicates - # a stem bug. This deserves a NOTICE level log message since it - # indicates that one of our callers didn't get their reply. + # if it's a ControlMessage. + # + # This is the most concerning situation since it indicates that one of + # our callers didn't get their reply. However, this is still a + # perfectly viable use case. For instance... + # + # 1. We send a request. + # 2. The reader thread encounters an exception, for instance a socket + # error. We enqueue the exception. + # 3. The reader thread receives the reply. + # 4. We raise the socket error, and have an undelivered message. + # + # Thankfully this only seems to arise in edge cases around rapidly + # closing/reconnecting the socket.
while not self._reply_queue.empty(): try: @@ -361,7 +372,7 @@ class BaseController(object): elif isinstance(response, stem.ControllerError): log.info("Socket experienced a problem (%s)" % response) elif isinstance(response, stem.response.ControlMessage): - log.notice("BUG: the msg() function failed to deliver a response: %s" % response) + log.info("Failed to deliver a response: %s" % response) except Queue.Empty: # the empty() method is documented to not be fully reliable so this # isn't entirely surprising