[tor-bugs] #33785 [Internal Services/Tor Sysadmin Team]: cannot create new machines in ganeti cluster

Tor Bug Tracker & Wiki blackhole at torproject.org
Wed Apr 1 20:46:52 UTC 2020


#33785: cannot create new machines in ganeti cluster
-------------------------------------------------+-------------------------
 Reporter:  anarcat                              |          Owner:  anarcat
     Type:  defect                               |         Status:
                                                 |  assigned
 Priority:  High                                 |      Milestone:
Component:  Internal Services/Tor Sysadmin Team  |        Version:
 Severity:  Major                                |     Resolution:
 Keywords:                                       |  Actual Points:
Parent ID:                                       |         Points:
 Reviewer:                                       |        Sponsor:
-------------------------------------------------+-------------------------
Changes (by anarcat):

 * owner:  tpa => anarcat
 * status:  new => assigned


Comment:

 It looks like `gnt-cluster verify` agrees with the allocator that some
 nodes are not quite set up properly:

 {{{
 root@fsn-node-01:~# gnt-cluster verify
 Submitted jobs 70114, 70115
 Waiting for job 70114 ...
 Wed Apr  1 20:43:00 2020 * Verifying cluster config
 Wed Apr  1 20:43:00 2020 * Verifying cluster certificate files
 Wed Apr  1 20:43:00 2020 * Verifying hypervisor parameters
 Wed Apr  1 20:43:00 2020 * Verifying all nodes belong to an existing group
 Waiting for job 70115 ...
 Wed Apr  1 20:43:00 2020 * Verifying group 'default'
 Wed Apr  1 20:43:00 2020 * Gathering data (5 nodes)
 Wed Apr  1 20:43:00 2020 * Gathering information about nodes (5 nodes)
 Wed Apr  1 20:43:03 2020 * Gathering disk information (5 nodes)
 Wed Apr  1 20:43:03 2020 * Verifying configuration file consistency
 Wed Apr  1 20:43:03 2020 * Verifying node status
 Wed Apr  1 20:43:03 2020 * Verifying instance status
 Wed Apr  1 20:43:03 2020 * Verifying orphan volumes
 Wed Apr  1 20:43:03 2020   - WARNING: node fsn-node-05.torproject.org: volume vg_ganeti/troodi.torproject.org-root is unknown
 Wed Apr  1 20:43:03 2020   - WARNING: node fsn-node-05.torproject.org: volume vg_ganeti/srv-tmp is unknown
 Wed Apr  1 20:43:03 2020   - WARNING: node fsn-node-05.torproject.org: volume vg_ganeti/troodi.torproject.org-swap is unknown
 Wed Apr  1 20:43:03 2020   - WARNING: node fsn-node-05.torproject.org: volume vg_ganeti/troodi.torproject.org-lvm is unknown
 Wed Apr  1 20:43:03 2020   - WARNING: node fsn-node-03.torproject.org: volume vg_ganeti/srv-tmp is unknown
 Wed Apr  1 20:43:03 2020 * Verifying N+1 Memory redundancy
 Wed Apr  1 20:43:03 2020 * Other Notes
 Wed Apr  1 20:43:03 2020   - NOTICE: 3 non-redundant instance(s) found.
 Wed Apr  1 20:43:04 2020 * Hooks Results
 root@fsn-node-01:~#
 }}}

 The key part here is:

 {{{
 Wed Apr  1 20:43:03 2020   - NOTICE: 3 non-redundant instance(s) found.
 }}}
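
 Picking findings like this out of a long verify run can be sketched in a
 few lines of Python. This is only a rough illustration: `findings` is a
 hypothetical helper, and it assumes the output format shown above, where
 each log line starts with a timestamp and long lines may have been wrapped.

```python
import re

# Log lines start with a timestamp such as "Wed Apr  1 20:43:03 2020".
TIMESTAMP = re.compile(
    r"^\s*[A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} \d{4} "
)
FINDING = re.compile(r"- (WARNING|NOTICE): (.*)$")


def findings(verify_output):
    """Return (level, message) pairs for WARNING and NOTICE lines."""
    joined = []
    for raw in verify_output.splitlines():
        line = raw.strip()
        if not line:
            continue
        if TIMESTAMP.match(raw) or not joined:
            joined.append(line)
        else:
            # Simplification: any line without a timestamp is treated as
            # the wrapped remainder of the previous line.
            joined[-1] += " " + line
    result = []
    for line in joined:
        match = FINDING.search(line)
        if match:
            result.append((match.group(1), match.group(2)))
    return result


sample = (
    " Wed Apr  1 20:43:03 2020 * Verifying orphan volumes\n"
    " Wed Apr  1 20:43:03 2020   - WARNING: node fsn-node-03.torproject.org:\n"
    " volume vg_ganeti/srv-tmp is unknown\n"
    " Wed Apr  1 20:43:03 2020   - NOTICE: 3 non-redundant instance(s) found.\n"
)
for level, message in findings(sample):
    print(level, message)
```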

 It doesn't say which instances those are, but I suspect they correspond to
 the three nodes `hspace -L` has identified.
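
 One way to check that suspicion would be to list the instances that have
 no secondary node. The sketch below assumes that this is what "non-redundant"
 means here, and that the input comes from something like
 `gnt-instance list --no-headers --separator='|' -o name,pnode,snodes`.
 The helper name and the instance names in the sample are made up for
 illustration.

```python
def non_redundant_instances(listing, separator="|"):
    """Return the names of instances whose secondary-node field is empty.

    Each input line is assumed to hold three fields: name, pnode, snodes.
    """
    names = []
    for line in listing.splitlines():
        if not line.strip():
            continue
        fields = [field.strip() for field in line.split(separator)]
        name = fields[0]
        snodes = fields[2] if len(fields) > 2 else ""
        # Assumption: an empty field (or "-") means no secondary node.
        if snodes in ("", "-"):
            names.append(name)
    return names


# Hypothetical listing; the instance names are placeholders.
listing = (
    "alpha.example.org|fsn-node-01.torproject.org|fsn-node-02.torproject.org\n"
    "beta.example.org|fsn-node-03.torproject.org|\n"
)
print(non_redundant_instances(listing))
```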

 The solution here might simply be to rebalance the cluster. I don't want
 to do this right now because it takes time and would throw a lot of
 machines on fsn-node-05, which I'm going to fill with boxes from macrum
 first.

 Still, rebalancing might solve the problem. I should also document this
 entire process for the next time I stumble upon it.
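
 When the time comes, a cautious approach would be to ask `hbal` for a plan
 first and only execute it afterwards. A minimal sketch, assuming `hbal`
 from the Ganeti htools is available on the master node: `hbal_command` and
 `plan_rebalance` are hypothetical helpers, and the only options used are
 `-L` (query the running cluster), `-C` (print the commands) and `-X`
 (execute them).

```python
import subprocess


def hbal_command(execute=False):
    """Build the hbal command line; a dry run unless execute is True."""
    cmd = ["hbal", "-L", "-C"]
    if execute:
        # -X makes hbal submit the moves itself instead of only printing them.
        cmd.append("-X")
    return cmd


def plan_rebalance():
    """Run hbal in dry-run mode and return what it printed.

    This needs a working Ganeti master node, so it is not called here.
    """
    completed = subprocess.run(
        hbal_command(), check=True, capture_output=True, text=True
    )
    return completed.stdout


print(" ".join(hbal_command()))
```

 Reviewing the printed plan first would show how many machines would land
 on fsn-node-05 before anything actually moves.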

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/33785#comment:2>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

