Dear all,
We are operating two exit nodes in Switzerland, each running two tor processes. The nodes ran quite stably until about 2-3 weeks ago.
Since then we have experienced frequent disruptions (up to several times a day). They are caused by a significant rise in memory consumption by the tor processes and end with a tor process being killed by the Linux kernel:
May 22 00:30:43 tor2 kernel: [2257156.134100] Killed process 40964 (tor) total-vm:448088kB, anon-rss:0kB, file-rss:0kB
The systems are physical machines running Ubuntu 14.04.5 LTS with 4 GB of memory. This was sufficient for the last 2 years.
A sample of this behavior is documented in more detail in a ticket [1]. The Tor Project team is investigating, but so far no hints have led to an improvement.
Due to a tweet last week [2] and the follow-up discussion with another operator, I became suspicious that it is not just us having this problem; rather, it could be more common.
Therefore, a question to you all: do you experience strange behavior like that described in the ticket as well?
Please let me know if so.
If not, maybe it is just an unfortunate coincidence and we will end up doing a clean install of both servers, hoping the problem will go away.
best regards
Dirk
[1] https://trac.torproject.org/projects/tor/ticket/22255
[2] https://twitter.com/FrennVunDerEnn/status/864583496072876034
Hi
We also experience high memory consumption by Tor from time to time, which results in a total halt of the system. I'm now testing several values for MaxMemInQueues...
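For example, a line like the following in the torrc caps the memory tor uses for its queues; the 2 GB value is just an illustrative choice, not a recommendation:

    # Cap memory used for cell/connection queues; tor starts killing
    # circuits once the cap is reached.
    MaxMemInQueues 2 GB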
Greetings
Hello!
I have the same issue with tor-0.2.9.10 (the latest stable on jessie). The process has been killed 6 times since May 14, and a 7th time just now.
Same with 0.3.0.5. Upgrading to 0.3.0.7 helped on most relays.
niftybunny abuse@to-surf-and-protect.net
Where ignorance is bliss, 'Tis folly to be wise.
Thomas Gray
On Mon, May 22, 2017 at 10:48:31PM +0200, niftybunny wrote:
Same with 0.3.0.5. Upgrading to 0.3.0.7 helped on most relays.
We didn't change anything between 0.3.0.5 and 0.3.0.7 that would have helped.
If somebody with a really really fast CPU wants to run their relay under valgrind --leak-check for a while, that would be grand.
https://gitweb.torproject.org/tor.git/tree/doc/HACKING/HelpfulTools.md
The question to try to get some answers on is what sort of bloat problem we have:
A) Maybe it's a memory leak?
B) Maybe it's an inefficiency of the memory allocator, e.g. fragmentation where nothing is technically leaked yet there's a whole lot of wasted space?
C) Maybe it is some behavior by clients that causes the relay to use a lot of memory, e.g. by having a bunch of stuff in buffers? The behavior could be accidental / normal, or it could be intentional / malicious.
My first guess is "C, accidental". But it could easily be 'B', and it would be great if it's 'A' because then we can just fix it.
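One cheap way to gather data on which of these it is: log each tor process's memory use over time and see whether the RSS creeps up steadily or spikes shortly before the OOM kill. A minimal sketch (the log path and interval are arbitrary choices):

    # Append a timestamped line per tor process with its RSS and VSZ (in kB).
    while sleep 60; do
        ps -o pid=,rss=,vsz= -C tor | while read -r pid rss vsz; do
            echo "$(date) pid=$pid rss_kb=$rss vsz_kb=$vsz" >> /var/log/tor-mem.log
        done
    done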
--Roger
Roger,
I have whatever resources you need for testing. Let me know if you would like them.
John
On Tue, May 23, 2017 at 03:01:10AM +0000, John Ricketts wrote:
Roger,
I have whatever resources you need for testing. Let me know if you would like them.
1) git clone https://git.torproject.org/git/tor
   cd tor
   ./autogen.sh && ./configure && make
2) edit /etc/security/limits.conf to have "tord hard nofile 65536" and "tord soft nofile 65536" lines, where tord is your user who will run it.
3) valgrind --leak-check=yes --error-limit=no --undef-value-errors=no src/or/tor orport 9001 dirport 9030 nickname sorryitsslow geoipfile src/config/geoip
(You might also want to set datadirectory, etc if you prefer, or point it to a torrc file if you prefer. Once you've got it working, you might run it with a datadirectory that has keys for an established relay.)
Then watch the output for interesting valgrind complaints (I don't expect any, but if you find some, great!), and when you've let it run for a while, ^C it, let it close down, and see if Valgrind tells you about any "definite" memory leaks.
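For orientation, the leak summary Valgrind prints at shutdown looks roughly like this (the PID and numbers are placeholders); the "definitely lost" entries are the ones that matter here:

    ==12345== LEAK SUMMARY:
    ==12345==    definitely lost: 1,024 bytes in 4 blocks
    ==12345==    indirectly lost: 0 bytes in 0 blocks
    ==12345==      possibly lost: 0 bytes in 0 blocks
    ==12345==    still reachable: 2,048,576 bytes in 300 blocks
    ==12345==         suppressed: 0 bytes in 0 blocks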
Be aware that unless the CPU is super amazing, it will be totally cpu saturated and constantly failing to keep up with the requests it receives. So it is a fine thing to do for active bug hunting, but somewhat rude to do on real relays. :)
--Roger
Hello Roger,
I updated the ticket. You will find the valgrind output there as well: https://trac.torproject.org/projects/tor/attachment/ticket/22255/valgrind.tx...
best regards
Dirk
On Wed, May 24, 2017 at 08:49:38PM +0200, tor-relay.dirk@o.banes.ch wrote:
Hello Roger,
I updated the ticket. You will find the output of the valgrind there as well: https://trac.torproject.org/projects/tor/attachment/ticket/22255/valgrind.tx...
Well, you are a winner, in that you found a new Tor bug (in 0.3.1.1-alpha): https://bugs.torproject.org/22368
Once we resolve that one, I'll ask for another valgrind run. :)
--Roger
On Wed, May 24, 2017 at 07:51:50PM -0400, Roger Dingledine wrote:
Well, you are a winner, in that you found a new Tor bug (in 0.3.1.1-alpha): https://bugs.torproject.org/22368
Once we resolve that one, I'll ask for another valgrind run. :)
Ok, we merged the fix for bug 22368.
If anybody wants to resume running valgrind on git master to hunt for memory issues, we're eagerly awaiting more reports. :)
--Roger
Hello Roger,
thanks for the feedback. We do not have super fast CPUs, but at least an Intel(R) Xeon(R) CPU E3-1220 v3 @ 3.10GHz.
Since our processes keep falling over (I started them 12 hours ago and now they are dead again), I think I can give valgrind a try tonight.
best regards
Dirk
Hi Dirk,
I noticed your comment [1] about your plans to write a script that restarts tor should it get killed.
I just wanted to let you know that if you have plans to upgrade to Ubuntu 16.04, you will get this out of the box thanks to the systemd Restart= [2] service configuration.
[1] https://trac.torproject.org/projects/tor/ticket/22255#comment:23
[2] https://gitweb.torproject.org/debian/tor.git/tree/debian/systemd/tor@.servic...
    https://www.freedesktop.org/software/systemd/man/systemd.service.html#Restar...
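On a systemd-based release where the packaged unit does not already set this, a minimal drop-in sketch might look as follows (the tor@default.service instance name and drop-in path are assumptions based on the Debian/Ubuntu packaging):

    # /etc/systemd/system/tor@default.service.d/restart.conf
    [Service]
    Restart=on-failure
    RestartSec=5

followed by a "systemctl daemon-reload".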
On Thu, 25 May 2017 08:54:00 +0000 nusenu nusenu-lists@riseup.net wrote:
I noticed your comment [1] about your plans to write a script that restarts tor should it get killed.
I just wanted to let you know that if you have plans to upgrade to Ubuntu 16.04 you will get this out of the box due to systemd Restart= [2] service configuration.
[1] https://trac.torproject.org/projects/tor/ticket/22255#comment:23 [2] https://gitweb.torproject.org/debian/tor.git/tree/debian/systemd/tor@.servic... https://www.freedesktop.org/software/systemd/man/systemd.service.html#Restar...
Or you can just put
/etc/init.d/tor start | grep -v "already running"
into your crontab to run every 5 minutes or so. If Tor is already running, this will not do anything, i.e. it won't launch a duplicate or anything of that sort. And in case it crashed, you will automatically get an e-mail (assuming you set up your crontab MAILTO= and your system's MTA properly) telling you that it has been started again, letting you keep assessing the frequency of the issue.
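For example, a crontab along these lines (the MAILTO address is a placeholder):

    MAILTO=relay-admin@example.org
    */5 * * * * /etc/init.d/tor start | grep -v "already running"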
Hi Nusenu and Roman,
thanks for your recommendations. I already implemented a small ~10-line script which does its job and logs what's going on:
Mit Mai 24 21:45:01 CEST 2017 Process tor2 is not running -> starting
Don Mai 25 08:50:02 CEST 2017 Process tor2 is not running -> starting
Don Mai 25 10:10:22 CEST 2017 Process tor1 is not running -> starting
Don Mai 25 11:25:02 CEST 2017 Process tor2 is not running -> starting
Don Mai 25 15:40:02 CEST 2017 Process tor2 is not running -> starting
As you can see, we had this OOM on one server four times yesterday. On the other, everything went fine.
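A minimal sketch of such a watchdog (the torrc paths, the log file, and the assumption that the torrcs take care of daemonizing and privilege dropping are illustrative, not the actual script):

    #!/bin/sh
    # Restart a tor instance if its process is gone, and log the event.
    LOG=/var/log/tor-watchdog.log
    for i in 1 2; do
        if ! pgrep -f "/etc/tor/torrc$i" > /dev/null; then
            echo "$(date) Process tor$i is not running -> starting" >> "$LOG"
            /usr/bin/tor --defaults-torrc /usr/share/tor/tor-service-defaults-torrc \
                         -f "/etc/tor/torrc$i" --hush
        fi
    done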
Upgrading to 16.04 is on the roadmap, but we agreed in the team that it would be a last resort. We want to have an identical configuration on all nodes (and the other servers we operate).
Since we will hopefully soon have an additional provider, we would start with the new machine for a clean setup on the chosen OS and then follow with the existing ones.
Long story short: if it helps with the cause of the current problem, I would start changing the OS today. If not, we will work towards a common platform for all our tor and other servers.
Even though the script works and the exits have now returned to their usual throughput, this auto-restart feels like a dirty solution.
best regards
Dirk
P.S. Nusenu, it seems Onionoo responds better now. Was there a solution to that problem?
On 23 May 2017, at 04:29, tor-relay.dirk@o.banes.ch wrote:
We are operating two exit nodes in Switzerland, each running two tor processes. The nodes ran quite stably until about 2-3 weeks ago.
Since then we have experienced frequent disruptions (up to several times a day). They are caused by a significant rise in memory consumption by the tor processes and end with a tor process being killed by the Linux kernel:
May 22 00:30:43 tor2 kernel: [2257156.134100] Killed process 40964 (tor) total-vm:448088kB, anon-rss:0kB, file-rss:0kB
Your process is being killed when using 0.45 GB of RAM.
On my Exit, which handles about 150 Mbps, it is normal for tor to use 900 MB of RAM or more.
To reduce this, you could try setting:
DirPort 0
DirCache 0
The systems are physical machines running Ubuntu 14.04.5 LTS with 4 GB of memory. This was sufficient for the last 2 years.
If two tor processes are using 0.5 GB, what is using the other 3 GB?
T
-- Tim Wilson-Brown (teor)
teor2345 at gmail dot com
PGP C855 6CED 5D90 A0C5 29F6 4D43 450C BA7F 968F 094B
ricochet:ekmygaiu4rzgsk6n
xmpp: teor at torproject dot org
Hello Teor
Your process is being killed when using 0.45 GB of RAM.
On my Exit, which handles about 150 Mbps, it is normal for tor to use 900 MB of RAM or more.
To reduce this, you could try setting:
DirPort 0
DirCache 0
I will try these settings.
Yesterday I was logged in as it happened. There are no other services running (except ssh, of course). I tried to investigate more, but the only thing I got was the top output (attached), because the machine was so busy and unresponsive that I got kicked out of ssh and a remote console shell session was aborted.
There are no other significant processes on the machine besides tor.
Last login: Fri May 26 22:29:17 2017 from 217.150.229.239
dirk@tor1:~$ ps -ef
UID        PID  PPID  C STIME TTY      TIME     CMD
root         1     0  0 Mai23 ?        00:00:00 /sbin/init
[... mostly idle kernel threads (kthreadd, ksoftirqd/*, rcuos/*, rcuob/*, kworker/*, scsi_eh_*, kdmflush, jbd2, etc.) trimmed; the only notable one is kswapd0: ...]
root       169     2  1 Mai23 ?        01:27:45 [kswapd0]
root     32890     1  0 Mai23 tty1     00:00:00 /bin/login --
root     32898     1  0 Mai23 ?        00:00:00 upstart-udev-bridge --daemon
root     32904     1  0 Mai23 ?        00:00:00 /lib/systemd/systemd-udevd --daemon
root     32993     1  0 Mai23 ?        00:00:00 upstart-file-bridge --daemon
root     33077     1  0 Mai23 tty4     00:00:00 /sbin/getty -8 38400 tty4
root     33080     1  0 Mai23 tty5     00:00:00 /sbin/getty -8 38400 tty5
root     33085     1  0 Mai23 tty2     00:00:00 /sbin/getty -8 38400 tty2
root     33086     1  0 Mai23 tty3     00:00:00 /sbin/getty -8 38400 tty3
root     33088     1  0 Mai23 tty6     00:00:00 /sbin/getty -8 38400 tty6
syslog   33093     1  0 Mai23 ?        00:00:00 rsyslogd
root     33109     1  0 Mai23 ?        00:00:00 acpid -c /etc/acpi/events -s /var/run/acpid.socket
message+ 33117     1  0 Mai23 ?        00:00:00 dbus-daemon --system --fork
root     33121     1  0 Mai23 ?        00:00:00 /usr/sbin/sshd -D
root     33139     1  0 Mai23 ?        00:00:22 /usr/sbin/irqbalance
root     33141     1  0 Mai23 ?        00:00:01 cron
daemon   33142     1  0 Mai23 ?        00:00:00 atd
root     33147     1  0 Mai23 ?        00:00:00 upstart-socket-bridge --daemon
root     33156     1  0 Mai23 ?        00:00:00 /lib/systemd/systemd-logind
root     33245     1  0 Mai23 ?        00:06:51 /usr/bin/perl -w /usr/bin/collectl -D
vnstat   33397     1  0 Mai23 ?        00:00:12 /usr/sbin/vnstatd -d --pidfile /run/vnstat/vnstat.pid
debian-+ 37159     1 26 Mai24 ?        15:30:01 /usr/bin/tor --defaults-torrc /usr/share/tor/tor-service-defaults-torrc -f /etc/tor/torrc2 --hush
debian-+ 44973     1 12 Mai26 ?        02:47:52 /usr/bin/tor --defaults-torrc /usr/share/tor/tor-service-defaults-torrc -f /etc/tor/torrc1 --hush
dirk     45265 32890  0 Mai26 tty1     00:00:00 -bash
root     48579 33121  0 07:52 ?        00:00:00 sshd: dirk [priv]
dirk     48696 48579  0 07:52 ?        00:00:00 sshd: dirk@pts/1
dirk     48697 48696  2 07:52 pts/1    00:00:00 -bash
dirk     48711 48697  0 07:52 pts/1    00:00:00 ps -ef
dirk@tor1:~$
An unresponsive machine like this was also described by Tyler in this thread on 22.05.
The systems are physical machines running Ubuntu 14.04.5 LTS with 4 GB of memory. This was sufficient for the last 2 years.
If two tor processes are using 0.5 GB, what is using the other 3 GB?
Good question. But the memory loss seems to be related to tor; otherwise, why would the OOM killer pick it so frequently?
This is server 2:
Mit Mai 24 21:45:01 CEST 2017 Process tor2 is not running -> starting
Don Mai 25 08:50:02 CEST 2017 Process tor2 is not running -> starting
Don Mai 25 10:10:22 CEST 2017 Process tor1 is not running -> starting
Don Mai 25 11:25:02 CEST 2017 Process tor2 is not running -> starting
Don Mai 25 15:40:02 CEST 2017 Process tor2 is not running -> starting
Fre Mai 26 10:50:22 CEST 2017 Process tor2 is not running -> starting
The incident described above on server 1 brought the server down for about 5 hours until it recovered.
thx and best regards
Dirk