Hello,
I have a problem with one of my exit relays[1]: about a week ago it lost its Running flag, although it was running the whole time. I upgraded Tor on this relay to 0.2.9.7-rc from the "tor-devel" FreeBSD port and restarted it. The relay was, and still is, accessible from outside via ports 80 and 443. The FreeBSD version is 11.0-RELEASE-p2.
After about 24 hours on the new version, I decided to check the bwauth votes on the detailed page here[2]. At that time the relay had votes for the Running flag from maatuska, dizum and dannenberg. I checked the relay's ability to open connections to the bwauths: it couldn't connect to longclaw (connection timed out), while all other bwauths were reachable from the relay. The test was performed with the command "openssl s_client -connect <bwauth ip>:443". Ipfw and pf are disabled on the server; only tor, ssh and local_unbound are running, and unbound uses my recursive DNS resolver on another server. DNS queries resolve fine.
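The reachability test above can be repeated for several addresses in one go. A minimal sketch, assuming a plain TCP connect to port 443 is an adequate stand-in for the openssl s_client check; the name `check_reachable` and the 10-second timeout are illustrative choices, not anything Tor provides:

```python
import socket

def check_reachable(host, port=443, timeout=10.0):
    """Return (True, "") if a TCP connection succeeds, else (False, reason)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, ""
    except OSError as exc:  # covers timeouts, refusals and unreachable networks
        return False, str(exc) or type(exc).__name__

def report(targets):
    """Build one status line per entry of a {name: ip} mapping."""
    lines = []
    for name, ip in targets.items():
        ok, reason = check_reachable(ip)
        status = "reachable" if ok else "FAILED: " + reason
        lines.append(f"{name} {ip}:443 {status}")
    return lines
```

Calling `report({"longclaw": "199.254.238.53"})` would cover the one address quoted in this message; the other addresses have to be filled in by hand.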
Then I saved the /var/db/tor directory and started Tor with a fresh identity[3], but after about 6 hours it showed similar symptoms: only maatuska, dizum and dannenberg were voting for it, so I switched back to the old identity[1].
On December 23rd I decided to downgrade Tor to 0.2.8 from the FreeBSD port "tor", and I enabled info-level logging. Since then the relay ran version 0.2.8.11; at the moment still only 3 bwauths are voting the Running flag for it: maatuska, dizum and dannenberg. All bwauths except longclaw are accessible from the server. All bwauths are accessible from my neighbouring relays in the same datacenter[4][5].
The info.log file is currently 200 megabytes. The first concerning message is:

Dec 23 11:54:23.000 [warn] Cannot make an outgoing connection without a DirPort.

These messages appear during relay startup but eventually stop:

# grep -c 'outgoing connection without a DirPort' info.log
19

They don't look like the cause of the problem.
Another anomaly is a huge number of messages like:

Dec 23 11:54:24.000 [info] router_pick_published_address: Success: chose address '85.143.213.44'.
...
Dec 26 13:58:31.000 [info] router_pick_published_address: Success: chose address '85.143.213.44'.

# grep -c router_pick_published_address info.log
1653449
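The grep counts above can also be gathered in a single pass over the log. A minimal sketch, assuming it is enough to count lines containing a given substring (which is what grep -c reports); `count_messages` and `count_in_file` are illustrative helper names:

```python
from collections import Counter

def count_messages(lines, patterns):
    """Count how many lines contain each substring in `patterns`."""
    counts = Counter({p: 0 for p in patterns})
    for line in lines:
        for p in patterns:
            if p in line:
                counts[p] += 1
    return counts

def count_in_file(path, patterns):
    # Iterate over the file object so the 200 MB log is read line by line
    # and never loaded into memory as a whole.
    with open(path, "r", errors="replace") as fh:
        return count_messages(fh, patterns)
```

For example, `count_in_file("info.log", ["outgoing connection without a DirPort", "router_pick_published_address"])` returns both counts at once.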
I upgraded Tor to version 0.2.9.8 from the FreeBSD port "tor". The flood of router_pick_published_address messages doesn't reproduce with the new version.
I can compress the log file to a 2 MB file with xz -9 and publish it, but I don't want to do that right away because of this warning:

Dec 23 11:54:20.000 [warn] Your log may contain sensitive information - you're logging more than "notice". Don't log unless it serves an important reason. Overwrite the log afterwards.

I don't really know exactly what information I should scrub from it.
Currently the relay is running with identity[1] and version 0.2.9.8 with info logging. Traceroute to longclaw:

17:05:15-root@bsd1 ~ # traceroute 199.254.238.53
traceroute to 199.254.238.53 (199.254.238.53), 64 hops max, 40 byte packets
 1 192.168.240.102 (192.168.240.102) 0.084 ms 0.088 ms 0.059 ms
 2 192.168.240.5 (192.168.240.5) 0.664 ms 0.180 ms 0.105 ms
 3 85.143.208.1 (85.143.208.1) 1.035 ms 1.437 ms 3.960 ms
 4 185-47-53-25.customer.comfortel.pro (185.47.53.25) 0.759 ms 0.615 ms 0.645 ms
 5 81-27-254-184.rascom.as20764.net (81.27.254.184) 1.507 ms 1.213 ms 1.300 ms
 6 80-64-96-225.rascom.as20764.net (80.64.96.225) 12.082 ms 11.974 ms 11.867 ms
 7 xe-9-2-0.bar1.Stockholm1.Level3.net (213.242.110.113) 11.781 ms 11.575 ms
   lag-127-401.ear1.Stockholm2.Level3.net (213.242.110.41) 12.083 ms
 8 * * *
 9 63-235-41-149.dia.static.qwest.net (63.235.41.149) 112.242 ms 115.032 ms 117.349 ms
10 * * *
11 65.116.154.6 (65.116.154.6) 182.361 ms 182.780 ms 185.128 ms
12 208.99.210.41 (208.99.210.41) 181.991 ms 182.831 ms 182.524 ms
13 208.99.192.65 (208.99.192.65) 181.698 ms 182.007 ms 182.437 ms
14 204.8.32.86 (204.8.32.86) 185.968 ms 185.809 ms 185.945 ms
15 wren.riseup.net (198.252.153.1) 181.940 ms 181.907 ms 181.890 ms
16 * * *
17 * * *
Please tell me what I can do to find the cause of the problem.
P.S. While I was trying to compress the log, the server ran out of memory, and the xz and tor processes were killed. So the log covers the period from Dec 23 11:54:20 UTC to Dec 26 13:58:39 UTC. I'll compress the logs on another machine.
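Compressing in fixed-size chunks with a lower preset is one way to keep memory use down, since higher xz presets use larger dictionaries and need more memory. A minimal sketch using Python's standard lzma module, assuming preset 6 gives an acceptable size; `compress_xz` and the 1 MiB chunk size are illustrative choices:

```python
import lzma

def compress_xz(src_path, dst_path, preset=6, chunk_size=1024 * 1024):
    """Compress src_path into an .xz file at dst_path, one chunk at a time."""
    with open(src_path, "rb") as src, \
         lzma.open(dst_path, "wb", preset=preset) as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break  # end of input
            dst.write(chunk)
```

For example, `compress_xz("info.log", "info.log.xz")` writes a file that the xz command-line tool can decompress.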
[1] 1836695E7DFE2F80B13A9884431759B5DD19F1DA
[2] https://consensus-health.torproject.org/
[3] 9E2833F5F95421B52A5336F66B063E471725F011
[4] 2C2CD992B785F09752278260DCD8D6D6242BB87A
[5] 7F350D9EAD6C74730FB13862308BE60FA9B19678
Hello again,
The bwauths have started voting for my exit relay[1] again, so it looks like the problem has disappeared. However, longclaw is still inaccessible from relay[1]:

10:16:14-root@bsd1 ~ # traceroute 199.254.238.53
traceroute to 199.254.238.53 (199.254.238.53), 64 hops max, 40 byte packets
 1 192.168.240.102 (192.168.240.102) 0.086 ms 0.075 ms 0.129 ms
 2 192.168.240.5 (192.168.240.5) 0.123 ms 0.136 ms 0.105 ms
 3 85.143.208.1 (85.143.208.1) 5.350 ms 7.816 ms 19.672 ms
 4 185-47-53-25.customer.comfortel.pro (185.47.53.25) 0.673 ms 1.111 ms 2.669 ms
 5 81-27-254-184.rascom.as20764.net (81.27.254.184) 1.638 ms 1.129 ms 1.484 ms
 6 80-64-96-225.rascom.as20764.net (80.64.96.225) 11.876 ms 12.420 ms 11.693 ms
 7 xe-9-2-0.bar1.Stockholm1.Level3.net (213.242.110.113) 11.507 ms 11.666 ms
   lag-127-401.ear1.Stockholm2.Level3.net (213.242.110.41) 12.122 ms
 8 * * *
 9 63-235-41-149.dia.static.qwest.net (63.235.41.149) 110.429 ms 109.880 ms 109.923 ms
10 * * *
11 65.116.154.6 (65.116.154.6) 183.700 ms 182.898 ms 185.499 ms
12 208.99.210.41 (208.99.210.41) 182.097 ms 182.302 ms 184.688 ms
13 208.99.192.65 (208.99.192.65) 182.331 ms 184.661 ms 184.562 ms
14 204.8.32.86 (204.8.32.86) 186.152 ms 190.550 ms 191.381 ms
15 wren.riseup.net (198.252.153.1) 182.196 ms 184.517 ms 184.863 ms
16 * * *
17 * * *
18 * * *
It is accessible from a neighbouring relay[2] in the same datacenter:

10:16:42-thirdexit ~ # traceroute 199.254.238.53
traceroute to 199.254.238.53 (199.254.238.53), 30 hops max, 60 byte packets
 1 192.168.240.138 (192.168.240.138) 3.014 ms 3.002 ms 3.001 ms
 2 192.168.240.5 (192.168.240.5) 3.012 ms 3.024 ms 3.034 ms
 3 85.143.208.1 (85.143.208.1) 3.089 ms 8.311 ms 12.929 ms
 4 185-47-53-25.customer.comfortel.pro (185.47.53.25) 3.024 ms 3.019 ms 3.019 ms
 5 81-27-254-184.rascom.as20764.net (81.27.254.184) 3.036 ms 3.047 ms 3.060 ms
 6 80-64-96-225.rascom.as20764.net (80.64.96.225) 12.107 ms 11.761 ms 11.879 ms
 7 lag-127-401.ear1.Stockholm2.Level3.net (213.242.110.41) 12.574 ms
   xe-9-2-0.bar1.Stockholm1.Level3.net (213.242.110.113) 49.256 ms 11.338 ms
 8 * * *
 9 63-235-41-149.dia.static.qwest.net (63.235.41.149) 109.894 ms 109.909 ms 110.102 ms
10 * * *
11 65.116.154.6 (65.116.154.6) 181.851 ms 181.868 ms 182.970 ms
12 208.99.210.41 (208.99.210.41) 182.146 ms 182.042 ms 182.625 ms
13 208.99.192.65 (208.99.192.65) 183.820 ms 183.781 ms 183.789 ms
14 204.8.32.86 (204.8.32.86) 186.095 ms 183.775 ms 183.773 ms
15 wren.riseup.net (198.252.153.1) 186.080 ms 186.032 ms 186.016 ms
16 longclaw.riseup.net (199.254.238.53) 183.757 ms 183.678 ms 185.967 ms
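The two traces can be compared mechanically to see where each one stops getting answers. A minimal sketch, assuming the usual traceroute layout with one hop per line that starts with the hop number; `last_responding_hop` is an illustrative helper, and it skips the extra lines printed when a hop answers from more than one router:

```python
import re

HOP_RE = re.compile(r"^\s*(\d+)\s+(.*)$")                   # "<hop number> <rest>"
ADDR_RE = re.compile(r"\((\d{1,3}(?:\.\d{1,3}){3})\)")      # "(a.b.c.d)"

def last_responding_hop(output):
    """Return (hop_number, ip) of the last hop that answered, or None."""
    last = None
    for line in output.splitlines():
        m = HOP_RE.match(line)
        if not m:
            continue  # header line or continuation line
        addrs = ADDR_RE.findall(m.group(2))
        if addrs:  # hops shown as "* * *" carry no address
            last = (int(m.group(1)), addrs[0])
    return last
```

Fed the two outputs from this message, it gives hop 15 (198.252.153.1) for the affected relay and hop 16 (199.254.238.53) for the neighbouring one.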
I'll try to contact riseup.net outside of the list about this issue.
[1] 1836695E7DFE2F80B13A9884431759B5DD19F1DA
[2] 2C2CD992B785F09752278260DCD8D6D6242BB87A
tor-relays@lists.torproject.org