Hello everyone,
thanks for replying. I did some further checks as you suggested. In the meantime I removed MaxAdvertisedBandwidth since the default should do.
I pinned the exit node in my client and downloaded an Ubuntu image through it. The download ran at between 800 kB/s and 1 MB/s: not great, but not bad. When I run wget directly on the machine, the same download reaches 30 MB/s, so I do not think we are limited by the provider.
I checked and changed the ulimit.
Unfortunately, no better results so far. I have not yet set up a local resolver. Does it make such a difference? We use the provider's DNS and 8.8.8.8.
Finally, I set the log files to debug level and started grepping for err:
Dec 15 22:55:50.000 [info] {EDGE} connection_edge_process_relay_cell(): 617: end cell (misc error) for stream 7289. Removing stream.
Dec 15 22:55:50.000 [debug] {NET} tor_tls_read(): read returned r=-1, err=-2
Dec 15 22:55:50.000 [debug] {NET} tor_tls_read(): read returned r=-1, err=-2
Dec 15 22:55:50.000 [debug] {NET} tor_tls_read(): read returned r=-1, err=-2
Dec 15 22:55:50.000 [debug] {NET} tor_tls_read(): read returned r=-1, err=-2
Dec 15 22:55:50.000 [info] {EDGE} connection_edge_process_relay_cell(): 356: end cell (misc error) for stream 42558. Removing stream.
Dec 15 22:55:50.000 [debug] {NET} tor_tls_read(): read returned r=-1, err=-2
Dec 15 22:55:50.000 [debug] {NET} tor_tls_read(): read returned r=-1, err=-2
Dec 15 22:55:50.000 [debug] {NET} tor_tls_read(): read returned r=-1, err=-2
Dec 15 22:55:50.000 [debug] {NET} tor_tls_read(): read returned r=-1, err=-2
Dec 15 22:55:50.000 [info] {EDGE} connection_edge_process_relay_cell(): 491: end cell (misc error) for stream 36878. Removing stream.
Dec 15 22:55:50.000 [debug] {NET} tor_tls_read(): read returned r=-1, err=-2
Dec 15 22:55:50.000 [debug] {NET} tor_tls_read(): read returned r=-1, err=-2
Dec 15 22:55:50.000 [debug] {NET} tor_tls_read(): read returned r=-1, err=-2
Dec 15 22:55:50.000 [debug] {NET} tor_tls_read(): read returned r=-1, err=-2
Dec 15 22:55:50.000 [debug] {NET} tor_tls_read(): read returned r=-1, err=-2
Dec 15 22:55:50.000 [debug] {NET} tor_tls_read(): read returned r=-1, err=-2
Dec 15 22:55:51.000 [debug] {NET} tor_tls_read(): read returned r=-1, err=-2
Dec 15 22:55:51.000 [info] {EDGE} connection_edge_process_relay_cell(): 177: end cell (misc error) for stream 60286. Removing stream.
Dec 15 22:55:51.000 [debug] {NET} tor_tls_read(): read returned r=-1, err=-2
Dec 15 22:55:51.000 [debug] {NET} tor_tls_read(): read returned r=-1, err=-2
Dec 15 22:55:51.000 [debug] {NET} tor_tls_read(): read returned r=-1, err=-2
Dec 15 22:55:51.000 [debug] {NET} tor_tls_read(): read returned r=-1, err=-2
Dec 15 22:55:51.000 [debug] {NET} tor_tls_read(): read returned r=-1, err=-2
Dec 15 22:55:51.000 [info] {EDGE} connection_edge_process_relay_cell(): 374: end cell (misc error) for stream 60283. Removing stream.
Dec 15 22:55:51.000 [debug] {NET} tor_tls_read(): read returned r=-1, err=-2
Dec 15 22:55:51.000 [debug] {NET} tor_tls_read(): read returned r=-1, err=-2
Dec 15 22:55:51.000 [info] {EDGE} connection_edge_process_relay_cell(): end cell (misc error) dropped, unknown stream.
Dec 15 22:55:51.000 [info] {EDGE} connection_edge_process_relay_cell(): 174: end cell (misc error) for stream 60289. Removing stream.
Dec 15 22:55:51.000 [info] {EDGE} connection_edge_process_relay_cell(): 384: end cell (misc error) for stream 60288. Removing stream.
Dec 15 22:55:51.000 [debug] {NET} tor_tls_read(): read returned r=-1, err=-2
Dec 15 22:55:51.000 [debug] {NET} tor_tls_read(): read returned r=-1, err=-2
Dec 15 22:55:51.000 [debug] {NET} tor_tls_read(): read returned r=-1, err=-2
Dec 15 22:55:51.000 [debug] {NET} tor_tls_read(): read returned r=-1, err=-2
Dec 15 22:55:51.000 [info] {EDGE} connection_edge_process_relay_cell(): 590: end cell (misc error) for stream 60285. Removing stream.
Dec 15 22:55:51.000 [info] {EDGE} connection_edge_process_relay_cell(): 587: end cell (misc error) for stream 60284. Removing stream.
Dec 15 22:55:51.000 [debug] {NET} tor_tls_read(): read returned r=-1, err=-2
Dec 15 22:55:51.000 [info] {EDGE} connection_edge_process_relay_cell(): 338: end cell (misc error) for stream 32126. Removing stream.
Dec 15 22:55:51.000 [debug] {NET} tor_tls_read(): read returned r=-1, err=-2
Dec 15 22:55:51.000 [debug] {NET} tor_tls_read(): read returned r=-1, err=-2
Dec 15 22:55:51.000 [info] {EDGE} connection_edge_process_relay_cell(): 676: end cell (misc error) for stream 60290. Removing stream.
I get a great many of these. It does not look healthy to me. Is this normal?
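To put a number on how often each message appears, the log can be tallied by severity and function name. This is only a minimal sketch: it assumes the line format seen in the excerpt above, the regular expression and the `tally` helper are my own, and the sample lines are copied from the excerpt.

```python
import re
from collections import Counter

# Matches lines such as:
# Dec 15 22:55:50.000 [debug] {NET} tor_tls_read(): read returned r=-1, err=-2
LOG_LINE = re.compile(
    r"^\w{3} +\d+ [\d:.]+ "
    r"\[(?P<level>\w+)\] \{(?P<domain>\w+)\} (?P<func>\w+)\(\): (?P<msg>.*)$"
)


def tally(lines):
    """Count log entries per (severity, function) pair; skip unparsable lines."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            counts[(match["level"], match["func"])] += 1
    return counts


sample = [
    "Dec 15 22:55:50.000 [info] {EDGE} connection_edge_process_relay_cell(): "
    "617: end cell (misc error) for stream 7289. Removing stream.",
    "Dec 15 22:55:50.000 [debug] {NET} tor_tls_read(): read returned r=-1, err=-2",
    "Dec 15 22:55:50.000 [debug] {NET} tor_tls_read(): read returned r=-1, err=-2",
]

for (level, func), count in tally(sample).most_common():
    print(f"{count:6d}  [{level}] {func}")
```

For a real log, the list can be replaced by an open file object, since `tally` only needs an iterable of lines.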
Again many thanks for your help.
best regards
Dirk
On 14.12.2015 00:06, Tim Wilson-Brown - teor wrote:
On 14 Dec 2015, at 07:18, Dirk Eschbach <tor-relay.dirk@o.banes.ch> wrote:
...
The big question now is: why do the machines not have more throughput? Is the reason the way distribution through the Tor network works? Moritz hinted it might have to do with the way the Tor "bandwidth scanners" measure the ability of a server to handle traffic.
No, this is not the issue; your relay's own self-measured throughput is. See below.
Can you explain, or point me to documentation that describes, how this process works and how it can be optimized? What are the criteria for distributing traffic across Tor exit nodes?
By consensus weight, which is determined by the bandwidth authorities.
https://blog.torproject.org/blog/lifecycle-of-a-new-relay
https://stem.torproject.org/tutorials/examples/votes_by_bandwidth_authoritie...
How do the clients choose the exit?
Randomly, from the servers they believe will exit to the address and port they want, weighted by each server's consensus weight.
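As a rough illustration of weighted random selection, here is a minimal sketch. It assumes the candidate list has already been filtered to exits that allow the wanted address and port, and it ignores everything else a real client takes into account. The relay names and weights are hypothetical.

```python
import random


def selection_probabilities(consensus_weights):
    """Return each relay's share of the total weight (its chance per pick)."""
    total = sum(consensus_weights.values())
    return {name: weight / total for name, weight in consensus_weights.items()}


def pick_exit(consensus_weights, rng=random):
    """Pick one relay at random, proportionally to its consensus weight."""
    names = list(consensus_weights)
    weights = [consensus_weights[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


# Hypothetical candidates: name -> consensus weight.
candidates = {"relayA": 10000, "relayB": 20000, "relayC": 500}

for name, share in selection_probabilities(candidates).items():
    print(f"{name}: {share:.1%} of picks")
print("picked:", pick_exit(candidates))
```

The point of the sketch is that a relay's share of client traffic scales with its consensus weight relative to the other candidates, not with its raw link speed.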
The details of bandwidth weight selection are in section 3.4.2 of the Directory Specification: https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt
"The bandwidth in a "w" line should be taken as the best estimate of the router's actual capacity that the authority has. For now, this should be the lesser of the observed bandwidth and bandwidth rate limit from the server descriptor. It is given in kilobytes per second, and capped at some arbitrary value (currently 10 MB/s).
The Measured= keyword on a "w" line vote is currently computed by multiplying the previous published consensus bandwidth by the ratio of the measured average node stream capacity to the network average. If 3 or more authorities provide a Measured= keyword for a router, the authorities produce a consensus containing a "w" Bandwidth= keyword equal to the median of the Measured= votes."
When I look at the bandwidth authority votes for DigiGesTor1e1, they say:
w Bandwidth=9586 Measured=24200
w Bandwidth=9586 Measured=15900
w Bandwidth=9586 Measured=43500
w Bandwidth=9586 Measured=19700
w Bandwidth=9586 Measured=19200
Those votes are in these large files:
http://171.25.193.9:443/tor/status-vote/current/authority
http://199.254.238.53/tor/status-vote/current/authority
http://131.188.40.189/tor/status-vote/current/authority
http://128.31.0.34:9131/tor/status-vote/current/authority
http://154.35.175.225/tor/status-vote/current/authority
So the bandwidth authorities are having no trouble measuring your relay; they think it should be 2x - 4x as fast. Your relay itself hasn't observed itself sustaining that performance over a 10-second interval, so it won't allow the directory authorities to assign it more bandwidth.
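The five votes quoted above can be compared with a few lines of Python. This is a minimal sketch that follows only the dir-spec paragraph quoted earlier (median of the Measured= values when 3 or more authorities provide one); the `parse_w_line` helper is my own and the vote strings are copied from this message.

```python
import re
from statistics import median

votes = [
    "w Bandwidth=9586 Measured=24200",
    "w Bandwidth=9586 Measured=15900",
    "w Bandwidth=9586 Measured=43500",
    "w Bandwidth=9586 Measured=19700",
    "w Bandwidth=9586 Measured=19200",
]


def parse_w_line(line):
    """Turn 'w Bandwidth=9586 Measured=24200' into a dict of integers."""
    return {key: int(value) for key, value in re.findall(r"(\w+)=(\d+)", line)}


parsed = [parse_w_line(vote) for vote in votes]
measured = [entry["Measured"] for entry in parsed if "Measured" in entry]

# Per the quoted spec text, a median is only used with 3 or more Measured= votes.
median_measured = median(measured) if len(measured) >= 3 else None
print("median of Measured= votes:", median_measured)

# Ratio of each authority's measurement to the relay's own Bandwidth= value.
for entry in parsed:
    print(f"{entry['Measured'] / entry['Bandwidth']:.2f}x")
```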
Section 2.1.1 of the Directory Specification:
""bandwidth" bandwidth-avg bandwidth-burst bandwidth-observed NL
[Exactly once]
Estimated bandwidth for this router, in bytes per second. The "average" bandwidth is the volume per second that the OR is willing to sustain over long periods; the "burst" bandwidth is the volume that the OR is willing to sustain in very short intervals. The "observed" value is an estimate of the capacity this relay can handle. The relay remembers the max bandwidth sustained output over any ten second period in the past day, and another sustained input. The "observed" value is the lesser of these two numbers."
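The "observed" rule in the quoted section can be sketched as a sliding-window maximum. This is not Tor's actual code, only a minimal illustration that assumes one byte counter per second for each direction. The function names and traffic figures are hypothetical.

```python
def max_sustained(bytes_per_second, window=10):
    """Highest average rate (bytes/s) over any `window` consecutive seconds."""
    if len(bytes_per_second) < window:
        return 0
    current = sum(bytes_per_second[:window])
    best = current
    for i in range(window, len(bytes_per_second)):
        # Slide the window one second: add the new sample, drop the oldest.
        current += bytes_per_second[i] - bytes_per_second[i - window]
        best = max(best, current)
    return best // window


def observed_bandwidth(read_history, write_history, window=10):
    """The lesser of the best sustained input rate and best sustained output rate."""
    return min(max_sustained(read_history, window),
               max_sustained(write_history, window))


# Hypothetical traffic: a 5-second burst of 1000 B/s between idle periods.
burst = [0] * 10 + [1000] * 5 + [0] * 10
print(max_sustained(burst))  # the burst is averaged over a full 10-second window
```

A burst shorter than the window raises the observed value only by its average over the whole window, which is why sustained throughput matters more than peak speed.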
Please improve the throughput your relay can sustain over a 10-second period. Try some performance tuning steps, testing your relay with a Tor client after each one. (See below.)
Have you set a limit on MaxAdvertisedBandwidth in the torrc files?
top - 09:57:49 up 16 days, 11:57, 1 user, load average: 1.14, 0.91, 0.81
Tasks: 244 total, 3 running, 241 sleeping, 0 stopped, 0 zombie
%Cpu0 : 12.5 us, 3.4 sy, 0.0 ni, 79.7 id, 0.0 wa, 0.0 hi, 4.4 si, 0.0 st
%Cpu1 : 15.9 us, 5.4 sy, 0.0 ni, 74.3 id, 0.3 wa, 0.0 hi, 4.1 si, 0.0 st
%Cpu2 : 14.3 us, 2.7 sy, 0.0 ni, 77.0 id, 0.0 wa, 0.0 hi, 6.0 si, 0.0 st
%Cpu3 : 9.5 us, 3.4 sy, 0.0 ni, 80.3 id, 0.0 wa, 0.0 hi, 6.8 si, 0.0 st
KiB Mem: 3877624 total, 2739880 used, 1137744 free, 1288 buffers
KiB Swap: 4026364 total, 364264 used, 3662100 free. 10752 cached Mem
Looking at the network connection, it is possible to start big downloads without any problem and without reducing Tor throughput.
Have you tried doing a large download through your exit via a Tor client and seeing how fast that is?
ExitNodes <fingerprint of your exit>
StrictNodes 1
The servers are connected with 1 Gbit/s each.
It looks like your Tor processes are neither CPU-bound nor network-bound.
Does your network connection drop packets or have large latency?
How many file descriptors are the tor processes allowed to open? How many connections does each tor process have open at once? (There should be thousands per process on a busy relay.)
https://gitweb.torproject.org/tor.git/tree/doc/TUNING
Are there any performance-related messages in the Tor logs?
Is your hardware / kernel / firewall / etc. capable of handling many connections?
Does your provider rate-limit any kinds of traffic? Does your provider limit the number of open connections?
Is your DNS resolver keeping up with the requests? Do you have a local caching DNS resolver on each machine?
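For the file descriptor question above, on Linux the limits that apply to a running process can be read from /proc/<pid>/limits. A minimal sketch for pulling out the "Max open files" row follows. The helper name and the sample values are hypothetical, and you need to substitute the PID of each tor process yourself.

```python
def parse_max_open_files(limits_text):
    """Return (soft, hard) from the 'Max open files' row; None means unlimited."""
    label = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(label):
            soft, hard = line[len(label):].split()[:2]
            return tuple(None if v == "unlimited" else int(v) for v in (soft, hard))
    return None


# Hypothetical excerpt in the layout used by /proc/<pid>/limits.
SAMPLE = """\
Limit                     Soft Limit           Hard Limit           Units
Max processes             15000                15000                processes
Max open files            1024                 4096                 files
"""

print(parse_max_open_files(SAMPLE))

# On the relay itself (the PID is an assumption, fill in your own):
# with open("/proc/1234/limits") as handle:
#     print(parse_max_open_files(handle.read()))
```

Reading the limit from /proc shows what the running process actually got, which can differ from the ulimit of your login shell.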
Tim
Tim Wilson-Brown (teor)
teor2345 at gmail dot com PGP 968F094B
teor at blah dot im OTR CAD08081 9755866D 89E2A06F E3558B7F B5A9D14F
tor-relays mailing list tor-relays@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-relays