[tor-bugs] #33785 [Internal Services/Tor Sysadmin Team]: cannot create new machines in ganeti cluster

Tor Bug Tracker & Wiki blackhole at torproject.org
Wed Apr 1 20:05:42 UTC 2020


#33785: cannot create new machines in ganeti cluster
-----------------------------------------------------+-----------------
     Reporter:  anarcat                              |      Owner:  tpa
         Type:  defect                               |     Status:  new
     Priority:  High                                 |  Milestone:
    Component:  Internal Services/Tor Sysadmin Team  |    Version:
     Severity:  Major                                |   Keywords:
Actual Points:                                       |  Parent ID:
       Points:                                       |   Reviewer:
      Sponsor:                                       |
-----------------------------------------------------+-----------------
 For some reason, I can't create new instances in the Ganeti cluster:

 {{{
 root@fsn-node-01:~# gnt-instance add \
     -o debootstrap+buster -t drbd --no-wait-for-sync \
     --disk 0:size=10G --disk 1:size=2G,name=swap \
     --backend-parameters memory=2g,vcpus=2 \
     --net 0:ip=pool,network=gnt-fsn \
     --no-name-check --no-ip-check test-01.torproject.org

 Failure: prerequisites not met for this operation:
 error type: insufficient_resources, error details:
 Can't compute nodes using iallocator 'hail': Request failed: Group default
 (preferred): No valid allocation solutions, failure reasons: FailMem: 8,
 FailN1: 12
 }}}
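
 To make the failure counts easier to reason about, here is a minimal
 Python sketch that parses the reasons out of the message above. It also
 compares their total with the number of ordered (primary, secondary) node
 pairs in a five-node group. That comparison rests on an assumption, not on
 anything confirmed here: that `hail` reports one failure per candidate
 pair for a DRBD instance. The function name `parse_failure_reasons` and
 the shortened node names are only illustrative.

```python
import re
from itertools import permutations

# Error text copied from the gnt-instance output above, joined into one line.
HAIL_MESSAGE = (
    "Can't compute nodes using iallocator 'hail': Request failed: Group default "
    "(preferred): No valid allocation solutions, failure reasons: FailMem: 8, "
    "FailN1: 12"
)


def parse_failure_reasons(message):
    """Return a dict mapping each failure reason name to its count."""
    _, _, tail = message.partition("failure reasons:")
    return {name: int(count) for name, count in re.findall(r"(\w+):\s*(\d+)", tail)}


# Shortened names for the five nodes in the cluster (illustrative only).
NODES = [f"fsn-node-{i:02d}" for i in range(1, 6)]

reasons = parse_failure_reasons(HAIL_MESSAGE)

# Assumption: a DRBD instance needs a (primary, secondary) pair, and the
# allocator tries every ordered pair of distinct nodes.
candidate_pairs = list(permutations(NODES, 2))

print(reasons)
print(sum(reasons.values()), len(candidate_pairs))
```

 If that assumption holds, every candidate pair was rejected either for
 memory (`FailMem`) or for the N+1 redundancy check (`FailN1`), and none
 for disk space.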

 The `gnt-fsn` network is nearly full, but it still had one spare IP when
 that command was run. I see the same behavior with `gnt-fsn13-02`, the new
 network created to cover the new IP allocation from Hetzner, which has
 plenty of room.

 The nodes also have plenty of free disk and memory to satisfy the
 request:

 {{{
 root@fsn-node-01:~# gnt-node list
 Node                       DTotal  DFree MTotal MNode MFree Pinst Sinst
 fsn-node-01.torproject.org 893.1G 451.9G  62.8G 38.5G 23.7G     7    14
 fsn-node-02.torproject.org 893.1G 561.9G  62.8G 22.8G 39.6G     6    15
 fsn-node-03.torproject.org 893.6G 151.4G  62.8G 18.2G 43.6G     5    22
 fsn-node-04.torproject.org 893.6G 450.2G  62.8G 24.0G 38.4G     6    12
 fsn-node-05.torproject.org 893.6G 232.1G  62.8G  832M 60.8G     3     6
 }}}

 It's not clear to me why the allocator is failing.
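
 As a sanity check, the sketch below parses the `gnt-node list` table above
 and tests whether each node could hold the requested instance (2G of
 memory, 10G + 2G of disk) on raw free capacity alone. This is a naive
 check under stated assumptions: it ignores DRBD metadata overhead and any
 memory the allocator keeps in reserve for failover (N+1), and it is not
 how `hail` itself computes a solution. The helper names are hypothetical.

```python
# Table copied from the gnt-node list output above.
NODE_LIST = """\
Node                       DTotal  DFree MTotal MNode MFree Pinst Sinst
fsn-node-01.torproject.org 893.1G 451.9G  62.8G 38.5G 23.7G     7    14
fsn-node-02.torproject.org 893.1G 561.9G  62.8G 22.8G 39.6G     6    15
fsn-node-03.torproject.org 893.6G 151.4G  62.8G 18.2G 43.6G     5    22
fsn-node-04.torproject.org 893.6G 450.2G  62.8G 24.0G 38.4G     6    12
fsn-node-05.torproject.org 893.6G 232.1G  62.8G  832M 60.8G     3     6
"""

# Multipliers to convert a unit suffix to MiB.
UNITS = {"M": 1, "G": 1024, "T": 1024 * 1024}


def to_mib(value):
    """Convert a size such as '23.7G' or '832M' to MiB."""
    return float(value[:-1]) * UNITS[value[-1]]


def parse_nodes(text):
    """Turn the whitespace-separated table into a list of dicts keyed by column."""
    header, *rows = text.strip().splitlines()
    keys = header.split()
    return [dict(zip(keys, row.split())) for row in rows]


def naive_fit(node, mem_mib, disk_mib):
    """True if the node's raw free memory and disk cover the request."""
    return to_mib(node["MFree"]) >= mem_mib and to_mib(node["DFree"]) >= disk_mib


# Request taken from the gnt-instance add command: memory=2g, disks 10G + 2G.
REQUEST_MEM = to_mib("2G")
REQUEST_DISK = to_mib("10G") + to_mib("2G")

nodes = parse_nodes(NODE_LIST)
fitting = [n["Node"] for n in nodes if naive_fit(n, REQUEST_MEM, REQUEST_DISK)]
print(len(fitting), "of", len(nodes), "nodes pass the naive check")
```

 Every node passes this naive check, so whatever is rejecting the
 allocation must be looking at more than the raw `MFree` and `DFree`
 columns.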

 Note that I've been *adopting* new instances without problems for the past
 few weeks, so this could be specifically about *creating* new disks.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/33785>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

