[tor-bugs] #10532 [Tor]: [Tor relay] Random hangs

Tor Bug Tracker & Wiki blackhole at torproject.org
Wed Jan 1 15:20:46 UTC 2014


#10532: [Tor relay] Random hangs
-----------------------+----------------------------------
 Reporter:  mrc0mmand  |          Owner:
     Type:  defect     |         Status:  new
 Priority:  normal     |      Milestone:
Component:  Tor        |        Version:  Tor: unspecified
 Keywords:             |  Actual Points:
Parent ID:             |         Points:
-----------------------+----------------------------------
 The Tor relay runs smoothly for some time (1 - 10% CPU usage), then
 suddenly starts consuming 100% CPU, and its status in top/htop jumps
 between S and D. The process is uninterruptible/unresponsive and ignores
 every signal I send to it, so I can't attach strace/gdb to it or send
 SIGUSR2 for debug output. The only thing I can do is manually invoke the
 OOM killer, which kills mysqld (which is doing literally nothing at these
 times), after which tor's CPU usage returns to 1 - 10%. This happens at
 random intervals, from a few hours to days.
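
 Because the process ignores signals, one way to still record when it flips
 between S and D is to sample procfs from outside. The following is a minimal
 sketch, assuming Linux and Python 3; the names `parse_stat_line` and
 `sample` are hypothetical helpers for illustration and are not part of tor.
 Field positions follow the proc(5) layout of /proc/<pid>/stat.

```python
import time


def parse_stat_line(line):
    """Return (pid, comm, state, utime_ticks, stime_ticks) from a /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split around the first '(' and the last ')'.
    left, _, rest = line.partition("(")
    comm, _, tail = rest.rpartition(")")
    fields = tail.split()
    # fields[0] is the state (field 3 in proc(5)); utime and stime are
    # fields 14 and 15, i.e. fields[11] and fields[12] here.
    return int(left), comm, fields[0], int(fields[11]), int(fields[12])


def sample(pid, interval=1.0, count=10):
    """Print the state and the CPU ticks consumed between samples for one PID."""
    prev = None
    for _ in range(count):
        with open(f"/proc/{pid}/stat") as f:
            _, comm, state, utime, stime = parse_stat_line(f.read())
        total = utime + stime
        delta = 0 if prev is None else total - prev
        print(f"{time.strftime('%H:%M:%S')} {comm} state={state} cpu_ticks=+{delta}")
        prev = total
        time.sleep(interval)
```

 Calling `sample(pid)` with the relay's PID prints one line per interval. A
 run of D states with a growing tick count would match the behaviour seen in
 top/htop, though it would not by itself explain the cause.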

 I'm experiencing this issue with three versions of tor: 0.2.3.25,
 0.2.4.19-rc and now 0.2.4.20. I've tried many different configurations,
 including limiting RelayBandwidthRate and RelayBandwidthBurst to
 100KB/200KB, which reduced CPU usage to 0.5 - 2%, but eventually it got
 stuck too.

 For the reasons above, the only log I was able to get is a stack trace dump
 taken before invoking the OOM killer. I'm not sure if it's useful, but it's
 the only thing I have.

 System info: Fedora 19, kernel 3.11.6-201.fc19.x86_64

 {{{
 [2949490.851017] tor             R  running task        0  9426      1 0x00000084
 [2949490.851017]  0000000000000000 ffff88001fc03c30 ffffffff81097b18 ffff88001c3acf40
 [2949490.851017]  0000000000000000 ffff88001fc03c60 ffffffff81097bf2 ffff88001c3ad0c0
 [2949490.851017]  ffffffff81c9cae0 0000000000000074 0000000000000002 ffff88001fc03c70
 [2949490.851017] Call Trace:
 [2949490.851017]  <IRQ>  [<ffffffff81097b18>] sched_show_task+0xa8/0x110
 [2949490.851017]  [<ffffffff81097bf2>] show_state_filter+0x72/0xb0
 [2949490.851017]  [<ffffffff813bb3d0>] sysrq_handle_showstate+0x10/0x20
 [2949490.851017]  [<ffffffff813bba62>] __handle_sysrq+0xa2/0x170
 [2949490.851017]  [<ffffffff813bbef2>] sysrq_filter+0x392/0x3d0
 [2949490.851017]  [<ffffffff814ad849>] input_to_handler+0x59/0xf0
 [2949490.851017]  [<ffffffff814aeb59>] input_pass_values.part.4+0x159/0x160
 [2949490.851017]  [<ffffffff814b0ba5>] input_handle_event+0x125/0x530
 [2949490.851017]  [<ffffffff814b1006>] input_event+0x56/0x70
 [2949490.851017]  [<ffffffff814b798e>] atkbd_interrupt+0x5be/0x6b0
 [2949490.851017]  [<ffffffff814aaa93>] serio_interrupt+0x43/0x90
 [2949490.851017]  [<ffffffff814abafa>] i8042_interrupt+0x18a/0x370
 [2949490.851017]  [<ffffffff810f5ffe>] handle_irq_event_percpu+0x3e/0x1e0
 [2949490.851017]  [<ffffffff810f61d6>] handle_irq_event+0x36/0x60
 [2949490.851017]  [<ffffffff810f8adf>] handle_edge_irq+0x6f/0x120
 [2949490.851017]  [<ffffffff8101459f>] handle_irq+0xbf/0x150
 [2949490.851017]  [<ffffffff8106c60f>] ? irq_enter+0x4f/0x90
 [2949490.851017]  [<ffffffff81658acd>] do_IRQ+0x4d/0xc0
 [2949490.851017]  [<ffffffff8164e46d>] common_interrupt+0x6d/0x6d
 [2949490.851017]  <EOI>  [<ffffffff8114ed70>] ? shrink_page_list+0x460/0xb00
 [2949490.851017]  [<ffffffff8114fa9a>] shrink_inactive_list+0x18a/0x4e0
 [2949490.851017]  [<ffffffff81150475>] shrink_lruvec+0x345/0x670
 [2949490.851017]  [<ffffffff8107f262>] ? insert_work+0x62/0xa0
 [2949490.851017]  [<ffffffff8107f262>] ? insert_work+0x62/0xa0
 [2949490.851017]  [<ffffffff81150806>] shrink_zone+0x66/0x1a0
 [2949490.851017]  [<ffffffff81150cf0>] do_try_to_free_pages+0xf0/0x590
 [2949490.851017]  [<ffffffff8114cfb4>] ? throttle_direct_reclaim.isra.40+0x84/0x270
 [2949490.851017]  [<ffffffff81151261>] try_to_free_pages+0xd1/0x170
 [2949490.851017]  [<ffffffff811459ea>] __alloc_pages_nodemask+0x69a/0xa30
 [2949490.851017]  [<ffffffff811830e9>] alloc_pages_current+0xa9/0x170
 [2949490.851017]  [<ffffffff81537600>] sk_page_frag_refill+0x70/0x160
 [2949490.851017]  [<ffffffff81591720>] tcp_sendmsg+0x2f0/0xdc0
 [2949490.851017]  [<ffffffff815baac4>] inet_sendmsg+0x64/0xb0
 [2949490.851017]  [<ffffffff8129a973>] ? selinux_socket_sendmsg+0x23/0x30
 [2949490.851017]  [<ffffffff81532e37>] sock_aio_write+0x137/0x150
 [2949490.851017]  [<ffffffff811a7a30>] do_sync_write+0x80/0xb0
 [2949490.851017]  [<ffffffff811a8235>] vfs_write+0x1b5/0x1e0
 [2949490.851017]  [<ffffffff8164bf0a>] ? __schedule+0x2ba/0x750
 [2949490.851017]  [<ffffffff811a8b79>] SyS_write+0x49/0xa0
 [2949490.851017]  [<ffffffff81656919>] system_call_fastpath+0x16/0x1b
 [2949490.851017] tor             S ffff88001fc14180     0 26681      1 0x00000080
 [2949490.851017]  ffff88000f443b30 0000000000000082 ffff88000f443fd8 0000000000014180
 [2949490.851017]  ffff88000f443fd8 0000000000014180 ffff88001cc8de80 7fffffffffffffff
 [2949490.851017]  0000000000000000 ffff88001cc8de80 ffff88001eee4380 ffff88001eee4678
 [2949490.851017] Call Trace:
 [2949490.851017]  [<ffffffff8164c3c9>] schedule+0x29/0x70
 [2949490.851017]  [<ffffffff8164a451>] schedule_timeout+0x201/0x2c0
 [2949490.851017]  [<ffffffff811ecc83>] ? ep_poll_callback+0xf3/0x160
 [2949490.851017]  [<ffffffff815ecd0e>] unix_stream_recvmsg+0x30e/0x850
 [2949490.851017]  [<ffffffff81089420>] ? wake_up_atomic_t+0x30/0x30
 [2949490.851017]  [<ffffffff81534328>] sock_recvmsg+0xa8/0xe0
 [2949490.851017]  [<ffffffff81533c69>] ? sock_sendmsg+0x99/0xd0
 [2949490.851017]  [<ffffffff811655e3>] ? handle_pte_fault+0x93/0xa70
 [2949490.851017]  [<ffffffff8153448f>] SYSC_recvfrom+0xdf/0x160
 [2949490.851017]  [<ffffffff8164bf0a>] ? __schedule+0x2ba/0x750
 [2949490.851017]  [<ffffffff81534c0e>] SyS_recvfrom+0xe/0x10
 [2949490.851017]  [<ffffffff81656919>] system_call_fastpath+0x16/0x1b
 }}}
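
 Read bottom-up, the first task's trace goes from the write syscall through
 tcp_sendmsg into page allocation and direct reclaim (try_to_free_pages,
 shrink_page_list), while the second task is sleeping in
 unix_stream_recvmsg. To make that easier to see, the symbols can be pulled
 out of the dump. This is a minimal sketch, assuming Python 3 and the
 3.11-era trace line format shown above; `FRAME_RE` and `frames` are
 hypothetical names introduced for illustration.

```python
import re

# Matches entries such as "[<ffffffff81591720>] tcp_sendmsg+0x2f0/0xdc0".
# A leading "?" marks a frame the kernel's unwinder could not verify.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def frames(trace_text):
    """Return (symbol, reliable) pairs in the order they appear in the dump."""
    # Collapse all whitespace first so that e-mail line wrapping between an
    # address and its symbol does not hide a frame.
    flat = " ".join(trace_text.split())
    return [(m.group(2), m.group(1) is None) for m in FRAME_RE.finditer(flat)]
```

 Feeding the dump to `frames()` and keeping only the reliable entries gives
 the call chain without addresses or timestamps. It only summarizes what the
 kernel printed and does not identify the root cause.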

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/10532>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

