With Gurp now able to configure most of my zones, I wondered whether it should build the zones themselves. Like most illumos users, I had already written my own zone manager which builds zones from YAML definitions but, hey, YAML.
After a bit of experimentation I thought this seemed like a pretty good way to define a zone.
(zone/ensure "serv-media"
  :brand "lipkg"
  (zone-fs "/home" :special "/export/home")
  (zone-fs "/storage/mp3" :special "/export/mp3")
  (zone-fs "/storage/flac" :special "/export/flac")
  (zone-network "media_net0"
    :allowed-address "192.168.1.24/24"
    :defrouter "192.168.1.1")
  :dns {:domain "lan.id264.net"
        :nameservers ["192.168.1.53" "192.168.1.1"]})
I know there’s a slightly ugly mix of functions and structs there, but the
alternative is defining filesystems and networks (both of which there can be
many of) as one big struct, and I didn’t like that at all. There’s no point
making a (brand "brand") function which would map to :brand "brand", and
there’s no obvious name for the :dns block to have, so that would feel forced.
The :dns struct expands to a couple of attr blocks in the real zone config,
and, as (zone-attr) is a thing, you could do it that way if you preferred, but
I like my config with sugar.
The --dump-config flag will make Gurp dump the zonecfg script it creates to
stdout if you’re curious, or need to check it.
If you’ve used zones you will notice some essentials like :zonepath and
:autoboot are missing. These have defaults filled in by the Janet front-end.
A state machine brings zones up and down. I’m just a dumb-ass ops guy, and I
don’t often get the chance to break out even the simplest proper programming
techniques, so wrapping a Rust enum with a few methods felt nice. So far my
state transitions are all via zoneadm and zonecfg, as they should be, but
I’ve seen bhyve zones get pretty wedged when they come down, so it’s not
impossible I’ll end up having to do something nastier in the future. The state
transitions can all time out, so at least if a zone does get wedged Gurp will
give up and move on.
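For flavour, here’s a minimal sketch of what I mean — the state names follow zoneadm’s model (configured, installed, running), but the enum and methods here are illustrative, not Gurp’s actual code:

```rust
// Illustrative zone lifecycle state machine. Each "up" or "down" step
// corresponds to a single zoneadm operation.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ZoneState {
    Configured,
    Installed,
    Running,
}

impl ZoneState {
    // The next state on the way up to Running, if any.
    fn next_up(self) -> Option<ZoneState> {
        match self {
            ZoneState::Configured => Some(ZoneState::Installed), // zoneadm install
            ZoneState::Installed => Some(ZoneState::Running),    // zoneadm boot
            ZoneState::Running => None,
        }
    }

    // The next state on the way down to Configured, if any.
    fn next_down(self) -> Option<ZoneState> {
        match self {
            ZoneState::Running => Some(ZoneState::Installed),    // zoneadm halt
            ZoneState::Installed => Some(ZoneState::Configured), // zoneadm uninstall
            ZoneState::Configured => None,
        }
    }
}

fn main() {
    // Walk a zone up from configured to running, one transition at a time.
    let mut state = ZoneState::Configured;
    while let Some(next) = state.next_up() {
        println!("{:?} -> {:?}", state, next);
        state = next;
    }
    assert_eq!(state, ZoneState::Running);
}
```

The nice part is that the compiler makes illegal transitions unrepresentable: there is no way to go from configured straight to running without passing through installed.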
To know when a zone is “ready”, I use svcs to check if the multiuser milestone
is online. This is more trustworthy than grepping zoneadm output.
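The shape of that check, sketched below with the helper names being mine rather than Gurp’s: the real thing shells out to something like zlogin <zone> svcs -H -o state milestone/multi-user, compares the answer with “online”, and polls until it matches or the timeout expires.

```rust
use std::time::{Duration, Instant};

// Poll a check until it succeeds or we time out. In real life the
// check would run svcs in the zone and parse its output; separating
// the two keeps the logic testable away from an illumos box.
fn wait_until<F: FnMut() -> bool>(mut check: F, timeout: Duration) -> bool {
    let start = Instant::now();
    while start.elapsed() < timeout {
        if check() {
            return true;
        }
        std::thread::sleep(Duration::from_millis(10));
    }
    false
}

// svcs -H -o state prints a bare state word, e.g. "online\n".
fn zone_is_ready(svcs_output: &str) -> bool {
    svcs_output.trim() == "online"
}

fn main() {
    assert!(zone_is_ready("online\n"));
    assert!(!zone_is_ready("offline\n"));

    // Simulate a zone that becomes ready on the third poll.
    let mut polls = 0;
    let ok = wait_until(
        || {
            polls += 1;
            polls >= 3
        },
        Duration::from_secs(1),
    );
    assert!(ok);
    println!("ok");
}
```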
I generally create a gold zone, then clone it. I added :clone-from to the
(zone/ensure) definition, as well as :final-state to shut the zone down so
it can be cloned.
I also like to exercise my infrastructure creation as much as possible, so I
gave the zone doer a :recreate property. This takes an integer value n, and
gives the zone a 1-in-_n_ chance of being destroyed and recreated when Gurp
ensures the definition. So, a value of 1 recreates it every time, 0 leaves it
alone, and 10 means it will be rebuilt roughly once in every ten runs.
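The roll itself is trivial; here is a sketch with the randomness injected so the edge cases are visible (the function name is mine, not Gurp’s):

```rust
// Sketch of the 1-in-n recreate roll. A real implementation would use
// an RNG; taking the roll as a closure makes the logic testable.
fn should_recreate(n: u32, roll: impl FnOnce(u32) -> u32) -> bool {
    match n {
        0 => false,        // never recreate
        1 => true,         // always recreate
        n => roll(n) == 0, // 1-in-n: a uniform pick from 0..n hits once
    }
}

fn main() {
    assert!(!should_recreate(0, |_| 0)); // 0 leaves the zone alone
    assert!(should_recreate(1, |_| 7));  // 1 rebuilds every time
    // With n = 10, only a roll of 0 triggers a rebuild.
    assert!(should_recreate(10, |_| 0));
    assert!(!should_recreate(10, |_| 3));
    println!("ok");
}
```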
There isn’t much point recreating a zone if it doesn’t also recreate what’s
inside it. So you can :bootstrap-from <file>. When this is present, Gurp
copies itself into the zone’s /var/tmp and, once the zone has booted into
multiuser mode, uses zlogin to run that copy, applying the given file.
This works fine for me, because I have my home directory loopback- or NFS-mounted everywhere and I can put my Gurp config there. But I’ve had this vague notion for some time that Gurp could run as an HTTP server and client, which would allow the zone being bootstrapped to pull config and files from a central server. I think that might also be a decent way to handle secrets, which I’ve avoided doing so far.
The zone doer can only create and destroy zones. It can’t modify in place. I
started looking at doing this, but it clearly wasn’t a five-minute job, and I
don’t need to be able to do it. I think it’s almost always going to be better
practice to recreate a zone, and I have a proven mechanism for that. I might
revisit it in the future though, as it could be interesting to tackle with a
proper parser like pest.
LX
Broadcom turned off my Wavefront account, which means I need some other way of collecting and graphing metrics. OmniOS provides a VictoriaMetrics package, but no Grafana. I had a quick look at building a native Grafana package, but it looked somewhere between “loads of work” and “impossible”. It should, however, run pretty easily in an LX Zone.
For those who don’t know, an LX zone is an illumos container which translates system calls from the Linux to the SunOS kernel. This means that you can run the vast majority of Linux software with zero effort. Once you have the zone, that is.
Whereas native zones install by adding packages from a repo, LX zones install
from a pre-baked image file. The OmniOS project provides
images for various Linux distributions,
but I didn’t want to have to get my wget on every time I wanted to build a
zone. So I gave Gurp a helper which pulls images down and caches them in
/var/tmp. It looks at the latest release in the lx-images GitHub repository
which matches the pattern you give it. Say these are the latest releases:
matches the pattern you give it. Say these are the latest releases:
lx-alma-8-2025-04-10_13-01-56.tar.xz
lx-alma-9-2025-04-10_13-01-56.tar.xz
lx-alpine-3-2025-04-10_13-01-56.tar.xz
lx-centos-stream-9-2025-04-10_13-01-56.tar.xz
lx-rocky-8-2025-04-10_13-01-56.tar.xz
If you specify :lx-image "lx-alma", you’ll get the latest alma, in this case
lx-alma-9-2025-04-10_13-01-56.tar.xz. If you specify alma-8 you get
lx-alma-8-2025-04-10_13-01-56.tar.xz.
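Because the image names embed a sortable timestamp, “latest matching release” needs nothing cleverer than a filter and a lexicographic max. A sketch (the function name is mine, not Gurp’s):

```rust
// Pick the newest lx-images release matching a substring pattern.
// The embedded timestamps sort lexicographically, so the latest
// matching name is simply the maximum of the matches.
fn latest_image<'a>(names: &[&'a str], pattern: &str) -> Option<&'a str> {
    names.iter().copied().filter(|n| n.contains(pattern)).max()
}

fn main() {
    let releases = [
        "lx-alma-8-2025-04-10_13-01-56.tar.xz",
        "lx-alma-9-2025-04-10_13-01-56.tar.xz",
        "lx-alpine-3-2025-04-10_13-01-56.tar.xz",
        "lx-centos-stream-9-2025-04-10_13-01-56.tar.xz",
        "lx-rocky-8-2025-04-10_13-01-56.tar.xz",
    ];
    // "lx-alma" matches both alma images; the 9 sorts after the 8.
    assert_eq!(
        latest_image(&releases, "lx-alma"),
        Some("lx-alma-9-2025-04-10_13-01-56.tar.xz")
    );
    // "alma-8" narrows it to the one release.
    assert_eq!(
        latest_image(&releases, "alma-8"),
        Some("lx-alma-8-2025-04-10_13-01-56.tar.xz")
    );
    println!("ok");
}
```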
Building a fully working LX zone was then as simple as:
(zone/ensure "serv-grafana"
  :brand "lx"
  :lx-image "alpine"
  (zone-attr "kernel-version" :value "4.4")
  (zone-fs "/home" :special "/export/home")
  (zone-network "lx_net0"
    :allowed-address "192.168.1.44/24"
    :defrouter "192.168.1.1")
  :dns {:domain "lan.id264.net"
        :nameservers ["192.168.1.53" "192.168.1.1"]})
Note the kernel-version attr. Without that, things won’t work properly.
Remember I said LX zones run Linux binaries? Well, they also run native
binaries, and even give you access to illumos tools through /native/usr. So,
Gurp will run in the LX just like every other zone. Why not use it to configure
LX zones?
One of my guiding principles when writing Gurp was “It is illumos-specific. It
will not support any other OS”. To build a Grafana zone, I only need to install
the grafana package, drop a couple of config files, and start a service. And
remember, one of my other principles was to not allow execution of arbitrary
shell commands.
First, I cheated. I used the zone doer’s :copy-in and :run-in to do what
needed to be done, but I didn’t like it. It was cheating, and it wasn’t config
management; it was setting something up then leaving it alone, like Jumpstart
in 1996.
Like so many people, the first time my principles were challenged, I threw them
out of the window, in this case, by writing a super-basic apk doer. It can
install and remove packages. Nothing else. The file and file-line doers were
capable enough to install my Grafana config and modify the couple of things that
needed changing.
Here’s the full config, with annotations. I share a ZFS dataset from the global zone which holds persistent data. (In this case, just the VictoriaMetrics plugin: everything else is in a MySQL DB.)
(import ../globals)
(import ../secrets)

(def data-dir "/var/lib/grafana")
(def grafana-init "/etc/init.d/grafana")
(def zfs-mounter "/etc/init.d/zfs-mount")

(role grafana
  # No mountpoint property means it is not mounted at all
  (zfs/ensure (zfscat globals/fast-pool "zone" "grafana"))

  (zfs/ensure (zfscat globals/fast-pool "zone" "grafana" "data")
              :properties {:mountpoint data-dir})

  (section mount-zfs-filesystems
    # An LX Zone can see delegated datasets, but it won't mount them
    # unless it's told to. This script does that. We could also use
    # :from and keep it a separate file, but I wanted to show what's
    # happening.
    (indoc zfs-mount-script ```
#!/sbin/openrc-run
description="Mount delegated ZFS dataset"

depend()
{
    need localmount
    before *
}

start()
{
    ebegin "Mounting ZFS dataset"
    /native/usr/sbin/zfs mount -a
    eend $?
}```)

    (file/ensure zfs-mounter
                 :content zfs-mount-script
                 :mode "0755")

    # This is all `rc-update` does. The script above will run in the
    # boot runlevel, which is the first one. So our ZFS data will be
    # visible when the service starts later.
    (symlink/ensure "/etc/runlevels/boot/zfs-mount"
                    :source zfs-mounter))

  (section configure-grafana
    (apk/ensure "grafana")

    # The 'net' dependency is never resolved, because an LX zone
    # doesn't run a proper init, but there is networking, so we
    # can just remove the dependency.
    (file-line/remove grafana-init
                      :match "contains"
                      :pattern "need net")

    # Make the grafana service part of the default runlevel
    (symlink/ensure "/etc/runlevels/default/grafana"
                    :source grafana-init)

    # Accept external connections.
    (file-line/ensure "/etc/conf.d/grafana"
                      :replace "127.0.0.1" :with "0.0.0.0")

    # Configure Grafana. Gurp will turn this struct into an ini file
    (def grafana-config
      {:paths {:data data-dir
               :logs "/var/log/grafana"
               :plugins (pathcat data-dir "plugins")
               :provisioning "conf/provisioning"}
       :server {:protocol "http"
                :http_port 3000}
       :database {:type "mysql"
                  :host "mysql"
                  :name "grafana"
                  :user "grafana"
                  :password secrets/grafana-mysql-password}
       :log {:mode "file"
             :level "info"}
       :news {:news_feed_enabled false}
       :metrics {:enabled true}})

    (file/ensure "/etc/grafana.ini"
                 :from-struct grafana-config
                 :to-format "ini")))
This config doesn’t actually start the grafana service. I drew the line at
managing rc services, so I added a :final-state property to the zone doer
that reboots the zone. The service starts on boot, and it’s never a bad idea to
reboot-test.
Note that though Grafana is configured by an INI file, we define it here as a
Janet struct. This was suggested by a friend, who I think got it from Nix. The
file doer can take a :from-struct parameter, in which you define
configuration as, obviously, a struct. You then supply a :to-format value
which, at time of writing, can be json, yaml, toml, ini, or kvp. The
first three can hold any struct, so you’re guaranteed to get what you ask for.
INI files are more limited, so you must write a struct that will fit into one.
kvp is key-value-pair, so you just need a flat struct for that. This, I
believe, is far more civilised than templating YAML files from other YAML
files.
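To make the INI restriction concrete, here’s a toy version of the struct-to-file idea — sections in INI are only one level deep, which is exactly why the struct you hand :to-format "ini" must be two levels of map and no more. This is an illustration, not Gurp’s converter:

```rust
use std::collections::BTreeMap;

// Render a two-level map (section -> key -> value) as an INI file.
// Anything deeper than this simply has no INI representation, which
// is the limitation discussed above.
fn to_ini(config: &BTreeMap<&str, BTreeMap<&str, String>>) -> String {
    let mut out = String::new();
    for (section, pairs) in config {
        out.push_str(&format!("[{}]\n", section));
        for (key, value) in pairs {
            out.push_str(&format!("{} = {}\n", key, value));
        }
        out.push('\n');
    }
    out
}

fn main() {
    let mut server = BTreeMap::new();
    server.insert("protocol", "http".to_string());
    server.insert("http_port", "3000".to_string());

    let mut config = BTreeMap::new();
    config.insert("server", server);

    let ini = to_ini(&config);
    assert!(ini.contains("[server]\n"));
    assert!(ini.contains("http_port = 3000\n"));
    print!("{}", ini);
}
```

A kvp format is the same thing with the section layer removed, which is why it needs a flat struct.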
Back to zone management though and, as I was doing a lot of testing, I found it
convenient to make a (recreate) helper function to recreate zones on demand.
(defn recreate?
  "If GURP_RECREATE_ZONE is a zone name, recreate that zone. If it's ALL, do them all"
  [zone-name]
  (def env-val (os/getenv "GURP_RECREATE_ZONE"))
  (if (or (= env-val "ALL") (= env-val zone-name)) 1 200))
It sits in the same file as the zone definitions which call it thusly:
(zone/ensure "serv-ws"
  :brand "lipkg"
  :recreate (recreate? "serv-ws")
  ...
If I want to just rebuild the package server zone:
# GURP_RECREATE_ZONE=serv-pkg gurp apply zones.janet
will do it, and if I want to recreate all my zones, perhaps because I’ve changed the gold zone config:
# GURP_RECREATE_ZONE=ALL gurp apply zones.janet
will do it. And there’s a 1-in-200 chance any zone will be recreated on any Gurp run, just to keep things spicy.
v1.1 update
I’m not sure if it’s good form to go back and update blog-posty things, but I reckon worse things are going on on the Internet today, so here’s news.
As of v1.1.0, Gurp can create (and, obviously, destroy) bhyve zones. Bhyve
zones run a single process (other than the zone scheduler), which provides
standard virtualization via BSD’s bhyve
hypervisor. (I say BSD, but you’ll find it everywhere now, not least as
Oxide’s hypervisor.)
# ps -fz gurp-bhyve
     UID   PID  PPID   C    STIME TTY        TIME CMD
    root 22897     1   0 12:26:47 ?          0:00 zsched
    root 22979 22897   0 12:26:47 ?          3:32 bhyve-gurp-bhyve -k /etc/bhyve.cfg
Bhyve has a ton of options, the vast majority of which I’ve never had to deal with, and the zone brand puts a simple shim around the ones you need to get things going. Gurp puts a hopefully even simpler shim around that. Here’s Just Enough Gurp to create a bhyve zone.
(def zone-name "bhyve-example-1")
(def zone-addr "192.168.1.61/24")
(def zone-disk (zfscat "rpool/test" zone-name))

(host "serv"
  (zfs/ensure zone-disk :size "10G")

  (zone/ensure zone-name
    :brand "bhyve"
    (zone-network "bhyve1_net0"
      :defrouter "192.168.1.1"
      :allowed-address zone-addr)
    (zone-bhyve
      :vcpus 4
      :ram "2G"
      :image-url "https://dl-cdn.alpinelinux.org/.../generic_alpine-3.22.1-x86_64-uefi-cloudinit-r0.qcow2"
      :boot-volume zone-disk)))
The first difference from a native zone is that bhyve needs a boot disk, which
is why we do that (zfs/ensure) with a volume size. The other difference is
that (zone-bhyve) function, here shown with its four mandatory arguments. We
need to know how much resource to give the zone, and we need to know where that
boot disk is, and what’s going on it. A bhyve zone could be anything: Linux,
Windows, BSD, Solaris, or something properly weird: all you have to do is write
an image to the disk.
You can define that image, like here, as a URL, which Gurp will download to, and
cache in, /var/tmp (Predictable filename! Security alert!).
# gurp apply bhyve-1.janet
2025-09-29T15:38:32.773392Z INFO doers::host: Configuring host: serv
2025-09-29T15:38:32.899375Z INFO doers::zfs: creating filesystem: rpool/test/bhyve-example-1
2025-09-29T15:38:32.964795Z INFO doers::zone::doer: Must create zone bhyve-example-1
2025-09-29T15:38:34.061336Z INFO doers::zone::doer: installing bhyve-example-1 [bhyve]
2025-09-29T15:38:34.686614Z INFO doers::zone::bhyve: Waiting for zone to be ready
2025-09-29T15:38:34.698758Z INFO doers::zone::bhyve: Logging console output to /var/tmp/gurp-c2d9beeb-d6e9-449a-9e33-f036f2cfd3f3.log
2025-09-29T15:40:38.587242Z INFO commands::apply: Run time: 125.849s
2025-09-29T15:40:38.587360Z INFO commands::apply: resources: 2 changes: 2
# zlogin -C bhyve-example-1
[Connected to zone 'bhyve-example-1' console]
Welcome to Alpine Linux 3.22
Kernel 6.12.38-0-virt on x86_64 (/dev/ttyS0)
localhost login:
Now you can manually log in and configure it. How about not doing that though? Gurp doesn’t work on Linux, we’ve talked about that before, but it can get started on configuring your new VM, via the “magic” of cloud-init.
(import ./globals)

(def zone-name "bhyve-example-2")
(def zone-addr "192.168.1.62/24")
(def zone-disk (zfscat "rpool/test" zone-name))
(def router-ip "192.168.1.1")

(host "serv"
  (zfs/ensure zone-disk :size "10G")

  (zone/ensure zone-name
    :brand "bhyve"
    (zone-bhyve
      :vcpus 4
      :ram "2G"
      :image-url "https://cloud-images.ubuntu.com/noble/current/noble-server-cloudimg-amd64.img"
      :image-format "qcow2"
      :boot-volume zone-disk
      :cloudinit-files [(config-file "cloud-init/user-data")]
      :cloudinit-struct
      {:meta-data (cloudinit-meta-data zone-name)
       :network-config
       {:network {:version 2
                  :ethernets
                  {:enp0s6 {:addresses [zone-addr]
                            :mtu 1500
                            :nameservers {:search [globals/local-domain]
                                          :addresses globals/dns-servers}
                            :routes [{:to "0.0.0.0/0"
                                      :via router-ip}]}}}}})
    (zone-network "bhyve_net0" :allowed-address zone-addr)))
Gurp makes a temporary directory of cloud-init config which it turns into an ISO
image with mkisofs. This is mounted inside the zone, masquerading as a CD-ROM.
(Ask your dad.) Cloud-init sees this, looks for a user-data file inside it,
and uses same. As soon as the zone starts to boot, Gurp removes all of this from
the zone config, so it isn’t there after a reboot.
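For the curious, the ISO follows cloud-init’s NoCloud convention: a volume labelled cidata holding user-data, meta-data and friends. Here’s a sketch of assembling the mkisofs invocation — I’m constructing the argument list rather than running the tool, and the exact flags Gurp passes may differ:

```rust
// Build the mkisofs argument list for a cloud-init NoCloud seed ISO.
// cloud-init identifies the seed by its "cidata" volume label; -J and
// -r add Joliet and Rock Ridge extensions so filenames survive intact.
fn mkisofs_args(iso_path: &str, config_dir: &str) -> Vec<String> {
    ["-o", iso_path, "-V", "cidata", "-J", "-r", config_dir]
        .iter()
        .map(|s| s.to_string())
        .collect()
}

fn main() {
    let args = mkisofs_args("/var/tmp/seed.iso", "/var/tmp/cloud-init.d");
    assert_eq!(args[3], "cidata"); // the label cloud-init looks for
    println!("mkisofs {}", args.join(" "));
}
```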
You can copy local files into the image with :cloudinit-files. Here, we’ve
used the (config-file) helper to refer to files in ../files/ relative to our
config dir. :cloudinit-files is good for static policy type things that are
the same across all your VMs.
We also have a :cloudinit-struct. This struct’s top-level keys are the names
of the files which will be created, and their values are a Janet struct which
will be converted to YAML. This suits dynamic content like the network config
here. In real life I abstract this away in a helper function, but I wanted to
show you the real-deal here. (cloudinit-meta-data) is a built-in Gurp library
function which creates a meta-data file to set the host name of the VM.
Debugging cloud-init can be a real pain. To help with this, Gurp streams the
console log into a local file, which has a random name, in /var/tmp. You can
see it mentioned in the output from earlier. It is also possible to have Gurp
exit as soon as it tries to boot the zone, by setting :wait-for-boot false,
and this can be useful in conjunction with zlogin -C if you’re having problems
getting an image to work.
I have a Kubernetes cluster built on bhyve zones, and I managed to get most of the configuration done with cloud-init, leaving only a couple of manual operations. Maybe Gurp’s bhyve support will be useful for you too.