Cleaning up the lab (#2) - highly available DHCP (in theory)

Friday's shenanigans started innocently enough - "I'll just rip out the DHCP server and run my own, and completely restructure the subnets, and..." - and soon led down a rabbit hole.

Where the last post left off, my "simple" DHCP solution was not working. This, as it turned out, became a problem. The band-aid fix was to re-enable DHCP on the NBN router in the apartment, restoring network access and ensuring there wouldn't be a chain of events where my partner found the entire home network broken and murdered me (not to mention the impact on StorJ, Chia and Ethereum mining and farming).

I was completely unable to get DHCP working in Technitium - and reported my experience in a hopefully-relevant GitHub issue - but I hope to revisit that as a solution some day. Instead I opted to split out DHCP duties to something else... something robust and enterprise-grade and proven and... well, I'll just say it.

ISC Kea DHCP.

I chose it for a number of reasons -

  • Talk of Kea being useful in my professional life
  • Personal interest in trying it out
  • The ability to run highly available, load-balanced DHCP services

Given how worried I had been about running a load-bearing candy cane, highly available DHCP was a killer feature.

Kea checked all the boxes - static DHCP leases, fully featured, stable and open source, and as a bonus would possibly let me set up Foreman some day at home and get automated machine builds.

I spun up two new machines running Debian 11: kea01 (10.1.2.3) and kea02 (10.1.2.4), leaving open 10.1.2.2 for an eventual dns02 machine.

Then it was time to get Kea going. I quickly discovered a few things about it that frustrated me:

  • The Debian-packaged version of Kea uses names like "kea-dhcp4-server", while the ISC-packaged version uses names like "isc-kea-dhcp4-server". I installed the Debian version originally, although I had intended to use the newer ISC version.
  • Much of the documentation makes you cross-reference other bits of the documentation repeatedly, with little in the way of a "quick start guide" or complete examples. This was mitigated a lot by there being plenty of documentation and plenty of partial examples.

I might write a proper deep dive into configuring Kea from scratch some time, maybe as a lab exercise, but for now, here is my complete Kea configuration:

kea-dhcp4 (main configuration):

wings@kea01:~$ cat /etc/kea/kea-dhcp4.conf
{
"Dhcp4": {
    "dhcp-ddns": {
        // Connectivity parameters
        "enable-updates": true,
        "server-ip": "127.0.0.1",
        "server-port": 53001,
        "sender-ip": "",
        "sender-port": 0,
        "max-queue-size": 1024,
        "ncr-protocol": "UDP",
        "ncr-format": "JSON"
    },

    // Behavioral parameters (global)
    "ddns-send-updates": true,
    "ddns-override-no-update": false,
    "ddns-override-client-update": false,
    "ddns-replace-client-name": "never",
    "ddns-generated-prefix": "",
    "ddns-qualifying-suffix": "windowpa.in",
    "ddns-update-on-renew": true,
    "ddns-use-conflict-resolution": true,
    "hostname-char-set": "",
    "hostname-char-replacement": "",

    "interfaces-config": {
        // See section 8.2.4 for more details. You probably want to add just
        // the interface name (e.g. "eth0") or a specific IPv4 address on that
        // interface (e.g. "eth0/192.0.2.1").
        "interfaces": [ "ens18/10.1.2.3" ]

        // "dhcp-socket-type": "udp"
    },

    "control-socket": {
        "socket-type": "unix",
        "socket-name": "/tmp/kea-dhcp4-ctrl.sock"
    },

    "lease-database": {
        "type": "memfile",
        "persist": true,
        "name": "/var/lib/kea/kea-leases4.csv",
        "lfc-interval": 3600
    },

    "expired-leases-processing": {
        "reclaim-timer-wait-time": 10,
        "flush-reclaimed-timer-wait-time": 25,
        "hold-reclaimed-time": 3600,
        "max-reclaim-leases": 100,
        "max-reclaim-time": 250,
        "unwarned-reclaim-cycles": 5
    },

    // Global timers specified here apply to all subnets, unless there are
    // subnet specific values defined in particular subnets.
    "renew-timer": 900,
    "rebind-timer": 1800,
    "min-valid-lifetime": 3600,
    "valid-lifetime": 86400,
    "max-valid-lifetime": 604800,

    "option-data": [
        {
            "name": "domain-name-servers",
            "data": "10.1.2.1, 10.1.2.2"
        },

        // Typically people prefer to refer to options by their names, so they
        // don't need to remember the code numbers. However, some people like
        // to use numerical values. For example, option "domain-name" uses
        // option code 15, so you can refer to it either by
        // "name": "domain-name" or "code": 15.
        {
            "code": 15,
            "data": "windowpa.in"
        },

        // Domain search is also a popular option. It tells the client to
        // attempt to resolve names within those specified domains. For
        // example, name "foo" would be attempted to be resolved as
        // foo.mydomain.example.com and if it fails, then as foo.example.com
        {
            "name": "domain-search",
            "data": "windowpa.in"
        }
    ],
    "hooks-libraries": [{
        "library": "/usr/lib/x86_64-linux-gnu/kea/hooks/libdhcp_stat_cmds.so"
    }, {
        "library": "/usr/lib/x86_64-linux-gnu/kea/hooks/libdhcp_lease_cmds.so",
        "parameters": { }
    }, {
        "library": "/usr/lib/x86_64-linux-gnu/kea/hooks/libdhcp_ha.so",
        "parameters": {
            "high-availability": [{
                "this-server-name": "kea01",
                "mode": "load-balancing",
                "heartbeat-delay": 10000,
                "max-response-delay": 60000,
                "max-ack-delay": 5000,
                "max-unacked-clients": 5,
                "delayed-updates-limit": 100,
                "peers": [{
                    "name": "kea01",
                    "url": "http://10.1.2.3:8000/",
                    "role": "primary",
                    "basic-auth-user": "redacted",
                    "basic-auth-password": "redacted",
                    "auto-failover": true
                }, {
                    "name": "kea02",
                    "url": "http://10.1.2.4:8000/",
                    "role": "secondary",
                    "basic-auth-user": "redacted",
                    "basic-auth-password": "redacted",
                    "auto-failover": true
                }]
            }]
        }
    }],
    "subnet4": [
        {
            "subnet": "10.1.0.0/16",
            "pools": [{
                "pool": "10.1.1.2 - 10.1.1.123",
                "client-class": "HA_kea01"
             }, {
                "pool": "10.1.1.124 - 10.1.1.254",
                "client-class": "HA_kea02"
             }],
            "option-data": [
                {
                    // For each IPv4 subnet you most likely need to specify at
                    // least one router.
                    "name": "routers",
                    "data": "10.1.1.1"
                }
            ],

            "reservations": [
                {
                    // Helios64 - pacman - MooseFS CS, Sia / StorJ / Chia
                    "hw-address": "redacted",
                    "ip-address": "10.1.1.200"
                },
                {
                    // HC4 - blinky - MooseFS Master, Samba
                    "hw-address": "redacted",
                    "ip-address": "10.1.1.201"
                },
                {
                    // HC4 - pinky - MooseFS CS, Samba
                    "hw-address": "redacted",
                    "ip-address": "10.1.1.202"
                },
                {
                    // plexy - Plex Media Server
                    "hw-address": "redacted",
                    "ip-address": "10.1.1.20"
                },
                {
                    // torrent - qBittorrent
                    "hw-address": "redacted",
                    "ip-address": "10.1.1.21"
                },
                {
                    // miner 1 - HiveOS
                    "hw-address": "redacted",
                    "ip-address": "10.1.1.101"
                },
                {
                    // miner 2 - HiveOS
                    "hw-address": "redacted",
                    "ip-address": "10.1.1.102"
                }
            ]
        }
    ],

    "loggers": [
    {
        "name": "kea-dhcp4",
        "output_options": [
            {
                "output": "/var/log/kea/kea-dhcp4.log",

                // Shorter log pattern suitable for use with systemd,
                // avoids redundant information
                "pattern": "%-5p %m\n",

                // This specifies the maximum size of the file before it is
                // rotated.
                "maxsize": 1048576,

                // This specifies the maximum number of rotated files to keep.
                "maxver": 8
            }
        ],
        // This specifies the severity of log messages to keep. Supported values
        // are: FATAL, ERROR, WARN, INFO, DEBUG
        "severity": "INFO",

        // If DEBUG level is specified, this value is used. 0 is least verbose,
        // 99 is most verbose. Be cautious, Kea can generate lots and lots
        // of logs if told to do so.
        "debuglevel": 0
    }
  ]
}
}
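
As a sanity check on the subnet4 block above, here is a small Python sketch (not part of my actual setup) that recomputes the size of each HA pool and reports which pool, if any, each reserved address lands in. The addresses are copied from the config; the host labels and function names are made up for illustration.

```python
import ipaddress

# Values copied from the subnet4 block of kea-dhcp4.conf.
SUBNET = ipaddress.ip_network("10.1.0.0/16")
POOLS = {
    "HA_kea01": ("10.1.1.2", "10.1.1.123"),
    "HA_kea02": ("10.1.1.124", "10.1.1.254"),
}
# Labels are illustrative; only the addresses come from the config.
RESERVATIONS = {
    "pacman": "10.1.1.200",
    "blinky": "10.1.1.201",
    "pinky": "10.1.1.202",
    "plexy": "10.1.1.20",
    "torrent": "10.1.1.21",
    "miner 1": "10.1.1.101",
    "miner 2": "10.1.1.102",
}


def pool_size(first, last):
    """Number of addresses in an inclusive first-last range."""
    return int(ipaddress.ip_address(last)) - int(ipaddress.ip_address(first)) + 1


def pool_for(address):
    """Return the client class whose pool contains address, or None."""
    ip = ipaddress.ip_address(address)
    for name, (first, last) in POOLS.items():
        if ipaddress.ip_address(first) <= ip <= ipaddress.ip_address(last):
            return name
    return None


def pools_overlap():
    """True if any two pool ranges share an address."""
    ranges = sorted(
        (int(ipaddress.ip_address(a)), int(ipaddress.ip_address(b)))
        for a, b in POOLS.values()
    )
    return any(prev[1] >= cur[0] for prev, cur in zip(ranges, ranges[1:]))


if __name__ == "__main__":
    for name, (first, last) in POOLS.items():
        print(f"{name}: {first} - {last} ({pool_size(first, last)} addresses)")
    print("pools overlap:", pools_overlap())
    for host, address in RESERVATIONS.items():
        in_subnet = ipaddress.ip_address(address) in SUBNET
        print(f"{host}: {address} in subnet={in_subnet}, pool={pool_for(address)}")
```

Running it shows the kea01 pool holding 122 addresses and the kea02 pool holding 131, with every reservation sitting inside one of the two pools. The script only reports; whether reservations should live inside the dynamic pools is a separate decision.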

kea-ctrl-agent (API server)

wings@kea01:~$ cat /etc/kea/kea-ctrl-agent.conf
{
"Control-agent": {
    "http-host": "10.1.2.3",
    "http-port": 8000,

    "authentication": {
        "type": "basic",
        "realm": "kea-control-agent",
        "clients": [
        {
            "user": "redacted",
            "password": "redacted"
        } ]
    },

    "control-sockets": {
        "dhcp4": {
            "socket-type": "unix",
            "socket-name": "/tmp/kea-dhcp4-ctrl.sock"
        },
        "dhcp6": {
            "socket-type": "unix",
            "socket-name": "/tmp/kea-dhcp6-ctrl.sock"
        },
        "d2": {
            "socket-type": "unix",
            "socket-name": "/tmp/kea-dhcp-ddns-ctrl.sock"
        }
    },

    "hooks-libraries": [
    ],

    "loggers": [
    {
        "name": "kea-ctrl-agent",
        "output_options": [
            {
                "output": "/var/log/kea/kea-ctrl-agent.log",

                "pattern": "%-5p %m\n",

                "maxsize": 1048576,

                "maxver": 8
            }
        ],
        "severity": "INFO",

        "debuglevel": 0
    }
  ]
}
}
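
With the agent listening, each server can be poked over HTTP. The sketch below builds a request for the HA hook's ha-heartbeat command using only the Python standard library. The helper names build_command and send are my own inventions for this example, and the credentials are placeholders. I have not shown a response because its exact shape depends on the Kea version.

```python
import base64
import json
import urllib.request


def build_command(url, user, password, command, service="dhcp4"):
    """Build (but do not send) a POST request for the Kea Control Agent."""
    body = json.dumps({"command": command, "service": [service]}).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


def send(request, timeout=5):
    """Send the request and return the decoded JSON reply."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (needs a reachable agent, so it is left commented out):
# for peer in ("http://10.1.2.3:8000/", "http://10.1.2.4:8000/"):
#     request = build_command(peer, "redacted", "redacted", "ha-heartbeat")
#     print(peer, send(request))
```

Keeping the request construction separate from the network call makes it easy to inspect exactly what would go over the wire before pointing it at a live DHCP server.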

This configuration provides static DHCP leases for a number of core devices in my network, as well as general DHCP services, across a highly available pair of Kea daemons. A semi-urgent TODO is to add certificate-based authentication: the two servers talk to each other over plaintext HTTP with basic auth, so a sufficiently determined attacker on the network could likely sniff the credentials and gain control over the DHCP servers. However, this is not a substantially higher risk than the plaintext HTTP my NBN router uses for its admin panel 😅 which is yet another reason to switch to this setup.
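
To make the sniffing concern concrete: HTTP basic auth only base64-encodes the username and password, so anyone who captures the Authorization header can reverse it. The Python sketch below demonstrates this with made-up credentials ("kea-peer" and "hunter2" are placeholders, not anything from my config), and the function names are my own.

```python
import base64


def basic_auth_header(user, password):
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def recover_credentials(header_value):
    """Reverse a captured Basic Authorization header into (user, password)."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic authorization header")
    user, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return user, password


if __name__ == "__main__":
    # Hypothetical credentials, for illustration only.
    header = basic_auth_header("kea-peer", "hunter2")
    print("on the wire:", header)
    print("recovered:  ", recover_credentials(header))
```

No key or secret is involved in the decoding step, which is why the credentials need an encrypted transport to be worth anything.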

I have added a pretty major dependency with this setup: my Proxmox cluster must now be online for DHCP to work in the apartment. Luckily, Proxmox is pretty stable, and with 4 low-power nodes that have few moving parts and ample airflow, chances are there will be at least 3 nodes up and running at any given time (and the cluster is configured to survive on 2). The VMs themselves run on the Ceph cluster I created inside Proxmox. I will likely add a third, non-Ceph VM as a standby DHCP server at some point, just in case of a catastrophic Ceph issue.

Once that was squared away, I went to work on DNS. I set up split-horizon DNS using Technitium, created a second Technitium machine (dns02), moved the Technitium service to port 80 since it's running on dedicated machines, and cleared all public DNS records for windowpa.in ahead of re-launching some public services later using Cloudflare Tunnel. I also created DNS records for every important machine in the home lab, reconfigured all of the MooseFS clients, chunkservers and metaloggers to point at "mfsmaster", and finished spinning up the Tailscale "proxy" node. Whew! Details on all of those changes will come in a third blog post in this series at some later point.

The state of this place!

More to come.