On Fri, Mar 07, 2014 at 02:39:18PM +0000, infinity0@torproject.org wrote:
commit 05b9c101ba9afe4653d1eff6f5414f90f22ef042 Author: Ximin Luo infinity0@torproject.org Date: Fri Mar 7 13:39:31 2014 +0000
remove failed connections from proxy_pairs as well - this is a pretty major fix, as the previous behaviour effectively disabled a proxy after 5 failed connections (= max_num_proxy_pairs / 2)
Can you say some more about the circumstances of this fix? What behavior changed for you after making it? I can't reproduce what's described in the commit message.
With 9dea0c6c90dedc868dfaa84add2bfa19a2039281, I set up a phony facilitator with ncat -v -lk 7000 --sh-exec "./facilitator.py" where facilitator.py has the contents
#!/usr/bin/env python import random import sys port = random.randint(10000,11000) print "HTTP/1.0 200 OK\r" print "Content-Type: application/x-www-form-urlencoded\r" print "Cache-Control: no-cache\r" print "Access-Control-Allow-Origin: *\r" print "\r" sys.stdout.write("check-back-in=10&client=127.0.0.1:"+str(port)+"&relay=173.255.221.44:9901")
I then browse to file:///home/david/flashproxy/proxy/embed.html?debug&facilitator=http://127.0.0.1:7000&initial_facilitator_poll_interval=1&unsafe_log... All the "client" connections are going to closed ports, but the "Client: error" lines still appear and clients don't accumulate in the request URL. The same happens if I use an open port with a Ncat listener that just closes the connection. The same happens if I change to a non-listening Internet address like 1.1.1.1.
2014-03-10T16:28:48.173Z | Starting; will contact facilitator in 1 seconds. 2014-03-10T16:28:49.179Z | Facilitator: connecting to http://127.0.0.1:7000?r=1&transport=websocket. 2014-03-10T16:28:49.203Z | Next check in 10 seconds. 2014-03-10T16:28:49.204Z | Facilitator: got client:{ host: "127.0.0.1", port: 10782 } relay:{ host: "173.255.221.44", port: 9901 }. 2014-03-10T16:28:49.205Z | 127.0.0.1:10782|173.255.221.44:9901 : Client: connecting. 2014-03-10T16:28:49.206Z | 127.0.0.1:10782|173.255.221.44:9901 : Client: connecting. 2014-03-10T16:28:49.207Z | 127.0.0.1:10782|173.255.221.44:9901 : Client: error. 2014-03-10T16:28:49.208Z | 127.0.0.1:10782|173.255.221.44:9901 : Client: closed. 2014-03-10T16:28:49.209Z | Complete. 2014-03-10T16:28:49.209Z | 127.0.0.1:10782|173.255.221.44:9901 : Client: error. 2014-03-10T16:28:49.209Z | 127.0.0.1:10782|173.255.221.44:9901 : Client: closed. 2014-03-10T16:28:49.210Z | Complete. 2014-03-10T16:28:59.205Z | Facilitator: connecting to http://127.0.0.1:7000?r=1&transport=websocket. 2014-03-10T16:28:59.226Z | Next check in 10 seconds. 2014-03-10T16:28:59.227Z | Facilitator: got client:{ host: "127.0.0.1", port: 10221 } relay:{ host: "173.255.221.44", port: 9901 }. 2014-03-10T16:28:59.227Z | 127.0.0.1:10221|173.255.221.44:9901 : Client: connecting. 2014-03-10T16:28:59.227Z | 127.0.0.1:10221|173.255.221.44:9901 : Client: connecting. 2014-03-10T16:28:59.229Z | 127.0.0.1:10221|173.255.221.44:9901 : Client: error. 2014-03-10T16:28:59.229Z | 127.0.0.1:10221|173.255.221.44:9901 : Client: closed. 2014-03-10T16:28:59.230Z | Complete. 2014-03-10T16:28:59.230Z | 127.0.0.1:10221|173.255.221.44:9901 : Client: error. 2014-03-10T16:28:59.230Z | 127.0.0.1:10221|173.255.221.44:9901 : Client: closed. 2014-03-10T16:28:59.231Z | Complete. 2014-03-10T16:29:09.227Z | Facilitator: connecting to http://127.0.0.1:7000?r=1&transport=websocket. 2014-03-10T16:29:09.249Z | Next check in 10 seconds. 2014-03-10T16:29:09.250Z | Facilitator: got client:{ host: "127.0.0.1", port: 10069 } relay:{ host: "173.255.221.44", port: 9901 }.
Now what *does* seem to be happening, and what had me confused for a while, is that Firefox seems to be artificially delaying repeated WebSocket connections to the same port with some kind of exponential delay. I originally had a hardcoded port in facilitator.py, and the delay made it appear that something like what the commit message describes was happening. Since WebSocket connections eventually start waiting a long time to fail, it looks like they are not being closed. https://people.torproject.org/~dcf/graphs/firefox-exponential-delay.png is a screenshot showing what looks like exponential growth in the Network panel. Chromium doesn't appear to have the same issue; the requests are linearly spaced in the network pane.
Are you able to confirm any of the above? When you discovered the bug, were you testing with Firefox or Chromium? How did you find the bug, and what did you do (differently from the above) to reproduce it? In any case, what's going on must be different than what's described in the commit message, because even before the fix the proxy continues running for me with max_clients=2.
David