[tor-dev] Tor-stem in python script - HTTP requests issue

Damian Johnson atagar at torproject.org
Fri Aug 2 17:23:19 UTC 2013


Hi Eduard. At first glance I'm not aware of any resource that would be
exhausted by this after 200 iterations. Does the issue reproduce if you
just do 400 GETs or 400 POSTs, or does it only arise when you
do a combination of 200 of each? Have you tried running netstat or
another connection resolver when it gets stuck to see whether you have 400
open connections (that is to say, checking that the script is terminating
its connections as expected)?
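
The netstat check suggested above could be scripted roughly as follows. This is only a sketch under assumptions: it expects the column layout of Linux `netstat -tn`, and the helper names `count_states` and `snapshot` are made up for illustration, not part of stem or tor.

```python
import subprocess
from collections import Counter


def count_states(netstat_output, port):
    """Tally TCP states for sockets whose local or remote port is `port`."""
    states = Counter()
    suffix = ":%d" % port
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed Linux layout: Proto Recv-Q Send-Q Local Foreign State
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, remote, state = fields[3], fields[4], fields[5]
        if local.endswith(suffix) or remote.endswith(suffix):
            states[state] += 1
    return states


def snapshot(port):
    """Run netstat once and summarise connections touching `port`."""
    raw = subprocess.check_output(["netstat", "-tn"])
    return count_states(raw.decode("utf-8", "replace"), port)
```

Calling `snapshot(9000)` while the script hangs would show whether connections to the SocksPort are piling up (for example in ESTABLISHED or CLOSE_WAIT) instead of being closed.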

On a side note I'm a little concerned about you running multiple
instances of this script to run through multiple relays. 400 requests
x N instances would be quite a bit of traffic to dump on the network.
In addition, each instance is downloading the full tor consensus and
microdescriptors since it does not share a data directory. What
exactly is the goal of your script?
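
To illustrate the data directory point: one possible alternative (an assumption here, not something the original script does) is a single tor instance that listens on several SOCKS ports, so the consensus and microdescriptors are fetched once. The sketch below only renders the torrc text such a setup would use; `render_torrc`, the port numbers, and the `TOR_shared` directory name are hypothetical.

```python
def render_torrc(config):
    """Render a dict of tor options as torrc lines.

    A list value produces one line per entry, which is how repeated
    options such as SocksPort are written in a torrc.
    """
    lines = []
    for key, values in config.items():
        if isinstance(values, str):
            values = [values]
        for value in values:
            lines.append("%s %s" % (key, value))
    return "\n".join(lines)


# Hypothetical shared configuration: three SOCKS listeners, one data directory.
shared_config = {
    "SocksPort": ["9000", "9002", "9004"],
    "ControlPort": "9001",
    "DataDirectory": "TOR_shared",
    "ExitNodes": "{it}",
}

torrc_text = render_torrc(shared_config)
```

Whether requests on different ports then leave through different exits depends on tor's stream isolation settings, which this sketch does not configure.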

Cheers! -Damian

On Thu, Aug 1, 2013 at 10:00 AM, Eduard Natale <eduard.natale at gmail.com> wrote:
> Hello guys,
>
> I have a problem that I'm currently unable to solve. So, here I am ;) I
> have a python script that uses python-stem to create and handle a tor
> instance (on a defined port). It retrieves a web page (using an HTTP GET)
> and submits information (using HTTP POST messages).
> Basically I use tor because I need to test this server from different IP
> addresses with more requests in parallel. I also keep track of
> cookies. Here's a sample of the code I use, based on the example on the stem
> website https://stem.torproject.org/tutorials/to_russia_with_love.html (to
> have more parallel requests, I launch the script many times with different
> socks_port values):
> ----------------------------
> import socket, socks, stem.process
> import mechanize, cookielib
>
> SOCKS_PORT = 9000
> DATA_DIRECTORY = "TOR_%s" % SOCKS_PORT
> socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, '127.0.0.1', SOCKS_PORT)
> socket.socket = socks.socksocket
>
> tor_process = stem.process.launch_tor_with_config(
>           config = {
>             'SocksPort': str(SOCKS_PORT),
>             'ControlPort': str(SOCKS_PORT+1),
>             'DataDirectory': DATA_DIRECTORY,
>             'ExitNodes': '{it}',
>           },
>         )
>
> # Initialize python mechanize with cookies (it behaves exactly like urllib2,
> # urllib3, etc., which I already tried...)
> br = mechanize.Browser()
> cj = cookielib.LWPCookieJar()
> br.set_cookiejar(cj)
> ...
>
> for number in num_list:
>   req = br.open_novisit("http://example.com") #_1_
>   res = req.read()
>   print res
>   req.close()
>   req2 = br.open("http://example.com/post_to_me", data_to_post) #_2_
>   res2 = req2.read()
>   req2.close()
> --------------------------------
>
> And that's it. The problem occurs on the lines I marked as _1_ and _2_:
> when it reaches around 200 requests, it seems to block
> indefinitely, waiting for a response that never comes. Of course,
> wiresharking doesn't help because the traffic is encrypted. The same code, without
> TOR, works perfectly. So, why does it get stuck at about 200 requests!? I tried
> to:
>
> 1. Telnet to the control port, forcing circuits to be renewed with SIGNAL NEWNYM
> 2. Instantiate mechanize (urllib2, 3, whatever) inside the loop
> 3. ...I don't remember what else
>
> I thought it could be a local socket connection limit: without TOR,
> I see in wireshark that the source port changes every time a request is
> performed. But I don't know whether the problem is using the same
> source port every time (I don't think so) and, if so, whether I should close the
> current socket and open a new one. Should I kill the tor process? I can't
> explain to myself why...
> The only thing I know is: *when the script gets stuck, if I kill the python process
> (ctrl+c) and then re-launch it, it starts working again.* I've seen that it's
> possible to set the value of TrackHostExitsExpire; is that useful in my case?
>
> Thanks in advance to whoever can help me!!
> Ed

