From ivan at equalit.ie Thu Aug 31 09:04:02 2017
From: ivan at equalit.ie (Ivan Vilata-i-Balaguer)
Date: Thu, 31 Aug 2017 11:04:02 +0200
Subject: [ooni-dev] Timeout in process+TCP exchange nettest
Message-ID: <20170831090402.GO5857@sax.terramar.selidor.net>

Hi list,

I'm working on an experimental nettest to collect the addresses of other
peer probes participating in a given P2P-like experiment. We have a
[helper][] which listens for probe connections on a TCP port and a
[nettest][] which connects to it. The nettest starts an HTTP server
process on a given port, and if the process doesn't exit within a few
seconds, it assumes that running the process was successful. In either
case, it then connects to the helper and reports the HTTP server port (or
none). The helper saves it into a file and replies with another entry
from the file, which the nettest saves into a local file.

[helper]: https://github.com/equalitie/ooni-backend/blob/eq-testbed/oonib/testhelpers/peer_locator_helpers.py
[nettest]: https://github.com/equalitie/ooni-probe/blob/eq-testbed/ooni/nettests/experimental/peer_locator_test.py

The nettest derives from ``TCPTest``, and its main test method returns a
subprocess deferred. A timeout function is set up to cancel the deferred
(assuming that the subprocess kept on running), in which case the
subprocess errback catches the cancellation error and proceeds to the
normal callback. The normal callback (which also handles subprocess exit
codes) either retries running the subprocess from the beginning or, if
the subprocess was successful, starts the TCP exchange with the helper by
sending it some payload and setting handlers for error and reply.

The nettest behaves oddly: although the TCP exchange completes in less
than a second (including closing both ends of the connection), the
nettest's TCP response handler only runs after the test times out
(``TCPTest.timeout``). On some occasions (especially when running the
subprocess is retried), regardless of whether the TCP exchange succeeded,
the ``Measurement`` task itself times out, the TCP response handler
doesn't run at all, and the test is cancelled and run again.

I'd like to ask for some advice on how to handle this situation, ideally
so that the test finishes as soon as the TCP exchange completes and the
timeouts don't trigger. If the explanations are unclear, I can try to
provide simplified versions of the code. Please note that my experience
with Twisted/OONI development is very limited, so I'm trying my best. ;)

Thank you very much for your help!

--
Ivan Vilata i Balaguer
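
For illustration only, the start-up check described above ("if the
process doesn't exit within a few seconds, assume it started fine, else
retry from the beginning") can be sketched in a standalone form. This is
not the nettest's code: it uses the standard library's asyncio rather
than Twisted so that it runs on its own, and the names
``server_started``, ``start_with_retries``, ``grace`` and ``attempts``
are hypothetical.

```python
import asyncio
import sys


async def server_started(argv, grace=3.0):
    """Run argv and report whether it is still alive after `grace` seconds."""
    proc = await asyncio.create_subprocess_exec(*argv)
    try:
        # If the process exits within the grace period, the start failed.
        await asyncio.wait_for(proc.wait(), timeout=grace)
    except asyncio.TimeoutError:
        # Still running after the grace period: assume the server is up.
        return proc, True
    return proc, False


async def start_with_retries(argv, attempts=3, grace=3.0):
    """Retry the start from the beginning; return the live process or None."""
    for _ in range(attempts):
        proc, ok = await server_started(argv, grace)
        if ok:
            return proc
    # No attempt survived the grace period (the "or none" case above).
    return None
```

With this shape, a caller would await ``start_with_retries(...)`` and
only then begin the TCP exchange with the helper, reporting either the
server port or none. The Twisted version differs in mechanics (deferred
cancellation and errbacks instead of ``wait_for``), so this only mirrors
the intended control flow.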