[tor-bugs] #30912 [Internal Services/Tor Sysadmin Team]: Investigate stunnel outage on crm-ext-01

Tor Bug Tracker & Wiki blackhole at torproject.org
Tue Jul 9 16:28:55 UTC 2019


#30912: Investigate stunnel outage on crm-ext-01
-------------------------------------------------+-------------------------
 Reporter:  peterh                               |          Owner:  tpa
     Type:  defect                               |         Status:
                                                 |  needs_information
 Priority:  Medium                               |      Milestone:
Component:  Internal Services/Tor Sysadmin Team  |        Version:
 Severity:  Normal                               |     Resolution:
 Keywords:                                       |  Actual Points:
Parent ID:                                       |         Points:
 Reviewer:                                       |        Sponsor:
-------------------------------------------------+-------------------------

Comment (by anarcat):

 details from #31119 debugging...

 daemon.log on crm-ext-01:

 {{{
 Jul  9 16:21:58 crm-ext-01/crm-ext-01 stunnel[25094]: LOG5[17173]: Service
 [civicrm-redis-server] accepted connection from 127.0.0.1:55102
 Jul  9 16:21:58 crm-ext-01/crm-ext-01 stunnel[25094]: LOG6[17173]:
 failover: round-robin, starting at entry #1
 Jul  9 16:21:58 crm-ext-01/crm-ext-01 stunnel[25094]: LOG6[17173]:
 s_connect: connecting 138.201.212.235:16379
 Jul  9 16:21:58 crm-ext-01/crm-ext-01 stunnel[25094]: LOG5[17173]:
 s_connect: connected 138.201.212.235:16379
 Jul  9 16:21:58 crm-ext-01/crm-ext-01 stunnel[25094]: LOG5[17173]: Service
 [civicrm-redis-server] connected remote server from 138.201.212.236:54124
 Jul  9 16:21:58 crm-ext-01/crm-ext-01 stunnel[25094]: LOG6[17173]: SNI:
 sending servername: crm-int-01.torproject.org
 Jul  9 16:21:58 crm-ext-01/crm-ext-01 stunnel[25094]: LOG6[17173]: Peer
 certificate required
 Jul  9 16:21:59 crm-ext-01/crm-ext-01 stunnel[25094]: LOG5[17174]: Service
 [civicrm-redis-server] accepted connection from 127.0.0.1:55110
 Jul  9 16:21:59 crm-ext-01/crm-ext-01 stunnel[25094]: LOG6[17174]:
 failover: round-robin, starting at entry #0
 Jul  9 16:21:59 crm-ext-01/crm-ext-01 stunnel[25094]: LOG6[17174]:
 s_connect: connecting 2a01:4f8:172:39ca:0:dad3:11:1:16379
 Jul  9 16:21:59 crm-ext-01/crm-ext-01 stunnel[25094]: LOG5[17174]:
 s_connect: connected 2a01:4f8:172:39ca:0:dad3:11:1:16379
 Jul  9 16:21:59 crm-ext-01/crm-ext-01 stunnel[25094]: LOG5[17174]: Service
 [civicrm-redis-server] connected remote server from
 2a01:4f8:172:39ca:0:dad3:12:1:33678
 Jul  9 16:21:59 crm-ext-01/crm-ext-01 stunnel[25094]: LOG6[17174]: SNI:
 sending servername: crm-int-01.torproject.org
 Jul  9 16:21:59 crm-ext-01/crm-ext-01 stunnel[25094]: LOG6[17174]: Peer
 certificate required
 Jul  9 16:22:13 crm-ext-01/crm-ext-01 stunnel[25094]: LOG6[17165]:
 ssl_start: s_poll_wait: TIMEOUTbusy exceeded: sending reset
 Jul  9 16:22:13 crm-ext-01/crm-ext-01 stunnel[25094]: LOG5[17165]:
 Connection reset: 0 byte(s) sent to TLS, 0 byte(s) sent to socket
 Jul  9 16:22:41 crm-ext-01/crm-ext-01 stunnel[25094]: LOG6[17166]:
 ssl_start: s_poll_wait: TIMEOUTbusy exceeded: sending reset
 Jul  9 16:22:41 crm-ext-01/crm-ext-01 stunnel[25094]: LOG5[17166]:
 Connection reset: 0 byte(s) sent to TLS, 0 byte(s) sent to socket
 Jul  9 16:22:58 crm-ext-01/crm-ext-01 stunnel[25094]: LOG6[17167]:
 ssl_start: s_poll_wait: TIMEOUTbusy exceeded: sending reset
 Jul  9 16:22:58 crm-ext-01/crm-ext-01 stunnel[25094]: LOG5[17167]:
 Connection reset: 0 byte(s) sent to TLS, 0 byte(s) sent to socket
 Jul  9 16:23:11 crm-ext-01/crm-ext-01 stunnel[25094]: LOG6[17168]:
 ssl_start: s_poll_wait: TIMEOUTbusy exceeded: sending reset
 Jul  9 16:23:11 crm-ext-01/crm-ext-01 stunnel[25094]: LOG5[17168]:
 Connection reset: 0 byte(s) sent to TLS, 0 byte(s) sent to socket
 }}}
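
 to see which connections actually stall, it can help to group the ext-01 messages by the connection id in `LOGn[id]`. what follows is a minimal python sketch, not something that was run for this ticket: the regex and the function names (`group_by_thread`, `timed_out_threads`) are made up for illustration, and it assumes one log entry per line as in the real daemon.log, not the mail-wrapped copy above.

```python
import re
from collections import defaultdict

# Matches the stunnel part of a syslog line, e.g.
# "stunnel[25094]: LOG6[17165]: ssl_start: s_poll_wait: TIMEOUTbusy exceeded: sending reset"
# (assumption: the pattern is derived only from the excerpt in this ticket)
LINE_RE = re.compile(
    r"stunnel\[(?P<pid>\d+)\]: LOG(?P<level>\d)\[(?P<thread>\d+)\]:\s*(?P<msg>.*)"
)


def group_by_thread(lines):
    """Group stunnel messages by the connection id found in LOGn[id]."""
    threads = defaultdict(list)
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            threads[match.group("thread")].append(match.group("msg"))
    return dict(threads)


def timed_out_threads(lines):
    """Return the ids of connections that logged 'TIMEOUTbusy exceeded'."""
    return sorted(
        thread
        for thread, messages in group_by_thread(lines).items()
        if any("TIMEOUTbusy exceeded" in msg for msg in messages)
    )
```

 feeding it the lines of daemon.log (for example `timed_out_threads(open("daemon.log"))`, path hypothetical) would list the connection ids that were reset before any byte went through.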

 from crm-int-01's perspective:

 {{{
 Jul  9 16:21:58 crm-int-01/crm-int-01 stunnel[25139]: LOG5[1138]: Service
 [civicrm-redis-server] accepted connection from
 ::ffff:138.201.212.236:54124
 Jul  9 16:21:59 crm-int-01/crm-int-01 stunnel[25139]: LOG5[1139]: Service
 [civicrm-redis-server] accepted connection from
 2a01:4f8:172:39ca:0:dad3:12:1:33678
 }}}

 AKA "all is well here".

 redis is up on int-01:

 {{{
 root@crm-int-01:~# echo PING | nc localhost 6379
 +PONG
 }}}

 but hangs on ext-01:

 {{{
 root@crm-ext-01:~# echo PING | nc localhost 6379
 [nothing]
 }}}
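
 the `nc` check above blocks forever when the tunnel is wedged. a probe with a timeout makes the "connected but no answer" case explicit. this is only a sketch under assumptions: `redis_ping` is a hypothetical helper, the 5 second default is arbitrary, and it relies on redis accepting an inline `PING` the same way the `echo PING | nc` test does.

```python
import socket


def redis_ping(host="localhost", port=6379, timeout=5.0):
    """Send an inline PING and return the reply line.

    Returns None if the peer accepts the connection but sends nothing
    back within `timeout` seconds (the hang seen on ext-01), or if it
    closes the connection without replying.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"PING\r\n")
        try:
            data = sock.recv(64)
        except socket.timeout:
            return None  # connected, but no reply in time
    return data.decode("ascii", "replace").strip() or None
```

 a healthy endpoint should answer with `+PONG`; a refused connection still raises `ConnectionRefusedError`, which keeps "nothing listening" distinct from "listening but stuck".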

 restarting stunnel on ext-01 does nothing. restarting stunnel on int-01
 fixes it. therefore, there might be something wrong on the int-01 side,
 some fd leak maybe?
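
 one way to check the fd leak guess before restarting anything would be to compare the number of open descriptors of the stunnel process on int-01 with its limit. a rough sketch, linux-only and reading `/proc` (needs root for another user's process); `count_open_fds` and `max_open_files` are hypothetical helper names, and whether stunnel is anywhere near its limit is not known from the logs above.

```python
import os


def count_open_fds(pid):
    """Count the open file descriptors of a process via /proc/<pid>/fd."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def max_open_files(pid):
    """Return the soft 'Max open files' limit from /proc/<pid>/limits.

    Returns None if the line is missing or the limit is not a number.
    """
    with open(f"/proc/{pid}/limits") as limits:
        for line in limits:
            if line.startswith("Max open files"):
                soft = line.split()[3]  # columns: name (3 words), soft, hard, unit
                return int(soft) if soft.isdigit() else None
    return None
```

 with the pid from the int-01 log excerpt this would be `count_open_fds(25139)` against `max_open_files(25139)`; the pid changes after a stunnel restart, so it has to be looked up again each time.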

 next debugging step is to restart stunnel on int-01 only.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/30912#comment:7>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

