[tor-bugs] #33732 [Internal Services/Tor Sysadmin Team]: bacula sometimes fails with a gdb error

Tor Bug Tracker & Wiki blackhole at torproject.org
Thu Mar 26 01:36:47 UTC 2020


#33732: bacula sometimes fails with a gdb error
-------------------------------------------------+-------------------------
 Reporter:  anarcat                              |          Owner:  anarcat
     Type:  defect                               |         Status:  closed
 Priority:  High                                 |      Milestone:
Component:  Internal Services/Tor Sysadmin Team  |        Version:
 Severity:  Minor                                |     Resolution:  wontfix
 Keywords:                                       |  Actual Points:
Parent ID:                                       |         Points:
 Reviewer:                                       |        Sponsor:
-------------------------------------------------+-------------------------
Changes (by anarcat):

 * status:  assigned => closed
 * resolution:   => wontfix


Comment:

 so i tried to debug this problem on forrestii. i found the following:

  1. bacula is unhappy because it can't bind on the IPv4 port. i fix this
 by fixing the /etc/hosts so it points to the correct IP. good.
  2. bacula, then, starts, but is *still* unhappy because it can't bind to
 the IPv6 port. i fixed this by adding ipv6 to /etc/hosts, but it's not
 clear to me why this is a problem on this specific host and not others

 regarding bug 2, the "fun" thing with that is that it doesn't immediately
 crash. at first, it gives us this rather cryptic warning:

 {{{
 Mar 26 01:07:06 forrestii bacula-fd[2813]: forrestii.torproject.org-fd:
 Warning: Cannot bind port 9102: ERR=Cannot assign requested address:
 Retrying ...
 }}}

 and then *eventually* it will *crash* (with a SIGSEGV, no less), like so:

 {{{
 Mar 26 01:08:06 forrestii bacula-fd[2813]: forrestii.torproject.org-fd:
 ABORTING due to ERROR in bnet_server.c:132
 Mar 26 01:08:06 forrestii bacula-fd[2813]: Cannot bind port 9102:
 ERR=Cannot assign requested address.
 Mar 26 01:08:06 forrestii bacula-fd[2813]: 26-Mar 01:08
 forrestii.torproject.org-fd: ABORTING due to ERROR in bnet_server.c:132
 Mar 26 01:08:06 forrestii bacula-fd[2813]: Cannot bind port 9102:
 ERR=Cannot assign requested address.
 Mar 26 01:08:06 forrestii bacula-fd[2813]: forrestii.torproject.org-fd:
 ABORTING due to ERROR in bnet_server.c:132
                                            Cannot bind port 9102:
 ERR=Cannot assign requested address.
 }}}

 this error comes from:

 https://sources.debian.org/src/bacula/9.6.3-1/src/lib/bnet_server.c/#L131

 which is basically:

 {{{
          if (--tmax <= 0) {
             Emsg2(M_ABORT, 0, _("Cannot bind port %d: ERR=%s.\n"),
 ntohs(fd_ptr->port),
                   be.bstrerror());
             Pmsg2(000, "Aborting cannot bind port %d: ERR=%s.\n",
 ntohs(fd_ptr->port),
                   be.bstrerror());
          }
 }}}

 from the error message, the seemingly intuitive thing to do is to install
 gdb. but that doesn't give us anything. instead of the reported error, by
 email, we get:

 {{{
 root at forrestii.torproject.org (4 mins. ago) (rapports root tor)
 Subject: Bacula GDB traceback of bacula-fd on forrestii
 To: root at forrestii.torproject.org
 Date: Thu, 26 Mar 2020 01:14:42 +0000

 ptrace: Operation not permitted.
 /var/lib/bacula/2813: No such file or directory.
 /etc/bacula/scripts/btraceback.gdb:1: Error in sourced command file:
 No symbol table is loaded.  Use the "file" command.
 }}}

 ... because of course, bacula can't gdb itself because it dropped
 privileges. (this, by the way, is a regression of [https://bugs.debian.org
 /cgi-bin/bugreport.cgi?bug=556207 bug 556207] in debian.)

 and besides, it segfaulted, so the process is possibly gone already. but
 even if it *could* have ran, this is what it gets:

 {{{
 Thread 1 "bacula-fd" received signal SIGSEGV, Segmentation fault.
 0x00007f7857199185 in e_msg(char const*, int, int, int, char const*, ...)
 () from /usr/lib/bacula/libbac-9.4.2.so
 (gdb) bt
 #0  0x00007f7857199185 in e_msg(char const*, int, int, int, char const*,
 ...) () from /usr/lib/bacula/libbac-9.4.2.so
 #1  0x00007f785717d5c6 in bnet_thread_server(dlist*, int, workq_tag*,
 void* (*)(void*)) () from /usr/lib/bacula/libbac-9.4.2.so
 #2  0x0000557966d9f62c in ?? ()
 #3  0x00007f7856bfc09b in __libc_start_main () from /lib/x86_64-linux-
 gnu/libc.so.6
 #4  0x0000557966d9fb6a in ?? ()
 (gdb) up
 #1  0x00007f785717d5c6 in bnet_thread_server(dlist*, int, workq_tag*,
 void* (*)(void*)) () from /usr/lib/bacula/libbac-9.4.2.so
 (gdb) list
 No symbol table is loaded.  Use the "file" command.
 }}}

 because of course there's no symbols table, because of course we're in
 production. why we would fire up a debugger, *by default*, in production
 code is just beyond me. i looked at the source code:

 https://sources.debian.org/src/bacula/9.6.3-1/src/lib/signal.c/#L165-L181

 and it looks like this behavior is hardcoded in the the upstream source
 code, at compile time:

 https://sources.debian.org/src/bacula/9.6.3-1/src/version.h/?hl=46#L46

 there are no flags to disable it, even at configure/build time.

 this is just a mess.

 here are the problems with the above:

  1. bacula should not crash when it can't bind to a port, especially if it
 *can* bind to *another* port (e.g. in dual stack). this has already been
 reported in https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=949366
  2. bacula shouldn't fail to debug itself when it tries. this is
 regression https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=556207
  3. bacula shouldn't unconditionnally try to send a backtrace on crashes.
 i just reported this as https://bugs.debian.org/cgi-
 bin/bugreport.cgi?bug=954971

 I'm not sure what else to do with this. I mostly opened this bug report to
 document this odyssey, since I remember dealing with this before, but not
 what the outcome was.

 TL;DR: look into `systemctl status bacula-fd`, the error message should be
 there, and is more informative than the backtrace could ever be. Consider
 trashing `/usr/sbin/btraceback` completely to stop it from sending us such
 warnings ever again.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/33732#comment:1>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online


More information about the tor-bugs mailing list