[tor-bugs] #30020 [Internal Services/Tor Sysadmin Team]: switch from our custom YAML implementation to Hiera

Tor Bug Tracker & Wiki blackhole at torproject.org
Mon Sep 9 18:54:20 UTC 2019


#30020: switch from our custom YAML implementation to Hiera
-------------------------------------------------+-------------------------
 Reporter:  anarcat                              |          Owner:  anarcat
     Type:  project                              |         Status:
                                                 |  accepted
 Priority:  Medium                               |      Milestone:
Component:  Internal Services/Tor Sysadmin Team  |        Version:
 Severity:  Normal                               |     Resolution:
 Keywords:                                       |  Actual Points:
Parent ID:  #29387                               |         Points:
 Reviewer:                                       |        Sponsor:
-------------------------------------------------+-------------------------

Comment (by anarcat):

 grand milestone today: `local.yaml` was removed from the repository, along
 with `get_role` and `yamlinfo`, which are all now useless.

 WHOOHOO!

 == Next step: hoster.yaml

 the next chunk we need to convert would be, I think,
 `./modules/torproject_org/misc/hoster.yaml`, which specifies these things
 (a sketch of one entry follows the list):
 * `netrange`: used to create the `TPO_NET` macro in ferm (unused?) and to
 determine which `hoster` a given host belongs to (through `whohosts.rb`,
 which does IP range calculations on the host's IP as seen from LDAP)
 * `mirror-debian`: used in `torproject_org` class to define the APT mirror
 for this host
 * `mirror-debian-security`: unused?
 * `nameservers`: used to configure upstream forwarders in unbound on each
 host
 * `nameservers_break_dnssec`: used to disable unbound forwarding in case
 of broken upstream DNS, unused
 * `allow_dns_query`: used to tell unbound to allow other network ranges
 (i.e. generally the ranges on this site) to use *this* node as a
 recursive DNS server (if `misc.resolver-recursive` is true, which is the
 case when the LDAP IP of the host is listed in the hoster's
 `nameservers` list)
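
 For orientation, a single entry in that file presumably looks something
 like this (key names are the ones listed above; the hoster name and all
 values here are invented):

 {{{
 example-hoster:
   netrange:
     - 192.0.2.0/24
     - 2001:db8::/32
   mirror-debian: https://mirror.example-hoster.example/debian/
   mirror-debian-security: https://mirror.example-hoster.example/debian-security/
   nameservers:
     - 192.0.2.53
     - 192.0.2.54
   nameservers_break_dnssec: false
   allow_dns_query:
     - 192.0.2.0/24
 }}}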

 Debian.org has hooked `hoster.yaml` into Hiera, and the way they did it
 is to have one .yaml file per hoster, for example:

 https://salsa.debian.org/dsa-team/mirror/dsa-puppet/blob/master/hieradata/bytemark.yaml

 Unfortunately, the `hoster.yaml` file is still present on d.o:

 https://salsa.debian.org/dsa-team/mirror/dsa-puppet/blob/master/modules/puppetmaster/lib/puppet/parser/functions/whohosts.rb

 there we have the same code as on tpo:

 {{{
 yamlfile = Puppet::Parser::Files.find_file('debian_org/misc/hoster.yaml',
                                            compiler.environment)
 }}}

 Here is the file, which contains more than just IP ranges:

 https://salsa.debian.org/dsa-team/mirror/dsa-puppet/blob/master/modules/debian_org/files/misc/hoster.yaml

 So their transition isn't complete, but it matches some of the ideas I had
 (namely to have one YAML file per hoster).
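
 To make that concrete, splitting our file the same way would give one
 small file per hoster, say a hypothetical
 `hiera/hoster/example-hoster.yaml`, carrying the same keys without the
 enclosing hoster name (once the variables become class parameters, as in
 step 1 below, the keys would presumably be namespaced instead, e.g.
 `torproject_org::mirror_debian`, but that naming is a guess):

 {{{
 mirror-debian: https://mirror.example-hoster.example/debian/
 nameservers:
   - 192.0.2.53
   - 192.0.2.54
 allow_dns_query:
   - 192.0.2.0/24
 }}}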

 So what I would suggest we do to get rid of the `hoster.yaml` file is
 this:

  1. convert all the aforementioned variables used in `hoster.yaml` into
 class parameters, defaulting to the values loaded from `hoster.yaml`
 (i.e. `$nodeinfo`)
  2. test the variables by overriding them in (e.g.)
 `hiera/nodes/foo.example.com.yaml` (a hypothetical override is sketched
 after this list)
  3. break up the `hoster.yaml` file into multiple smaller files in
 `hiera/hoster/%{hoster}.yaml`
  4. add that path to `hiera.yaml` (see the hierarchy sketch after this
 list)
  5. test that a host can load its variables from the `hoster` search path
 by hardcoding a value by hand in facter (also sketched below)
  6. create a new YAML variable that gives us an IP range -> hoster
 mapping (sketched below as well)
  7. create a function that looks through those to guess the hoster for a
 given IP address
  8. use that function to create a fact (through a template, but with a
 variable defined in the base class) that defines the `$hoster` variable
 that hiera will use to load the right YAML
  9. remove `hoster.yaml`
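
 To illustrate step 2, a per-node override could look like this (the
 namespaced parameter name is a guess, not the real one):

 {{{
 # hiera/nodes/foo.example.com.yaml -- hypothetical per-node override
 torproject_org::mirror_debian: https://deb.debian.org/debian/
 }}}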
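
 For steps 4 and 5, the `hiera.yaml` hierarchy addition and the hand-made
 test fact might look something like this (level names, paths and the
 external-fact location are assumptions, untested):

 {{{
 # hiera.yaml -- hierarchy excerpt, hypothetical
 hierarchy:
   - name: "Per-node data"
     path: "nodes/%{facts.networking.fqdn}.yaml"
   - name: "Per-hoster data"
     path: "hoster/%{facts.hoster}.yaml"
   - name: "Common data"
     path: "common.yaml"
 }}}

 and, to fake the `hoster` fact by hand on a test machine before the real
 fact exists (facter picks up external facts from `/etc/facter/facts.d`):

 {{{
 # /etc/facter/facts.d/hoster.yaml -- hand-written external fact for testing
 hoster: example-hoster
 }}}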
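
 The step 6 mapping could then be a single YAML map that the step 7
 function walks, doing the same IP range calculations `whohosts.rb` does
 today (hoster names and ranges invented):

 {{{
 # hypothetical netrange -> hoster mapping, e.g. in common.yaml
 hoster-netranges:
   example-hoster:
     - 192.0.2.0/24
     - 2001:db8::/32
   other-hoster:
     - 198.51.100.0/24
 }}}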

 That's a first step. At that stage, `hoster.yaml` is gone, but `$nodeinfo`
 remains and still might contain host-specific configuration. That
 configuration should be extracted *out* of the `$nodeinfo` construct and
 into manifest business logic. And *then* the `nodeinfo.rb` code can be
 ripped out.

 There might be a better way to define a hoster per node than guessing it
 from its IP address and dropping it in as a fact, but I can't think of
 anything right now.

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/30020#comment:10>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

