UDM-Pro Docker container as a Proxmox quorum device
I recently switched the majority of my self-hosted services over to Proxmox running on a custom-built 1U Supermicro server. It’s been working so well that I decided to build a second one and set up a two-node HA cluster to take advantage of live migration, among other features. Proxmox uses corosync to track cluster state, but the catch is that a two-node cluster cannot keep quorum on its own once one node drops out, so it needs a third vote (a qdevice) to provide quorum and act as a tie breaker. Fortunately you don’t need a full machine for this, as that could get expensive. Most rational homelab folks will opt for a Raspberry Pi or an existing NAS in this case, but I wanted to try something different.
Having also recently switched from a Ubiquiti Security Gateway Pro to the newer UniFi Dream Machine Pro (UDM-Pro), I thought I would try running an extra Docker container on this little ARM64 “unified” router/server. Many words have been spilled over the usefulness (or lack thereof) of the current UniFi OS, and early attempts at running custom services didn’t go so well. However, as anyone into the homelab scene knows, the community is a persistent bunch and has succeeded in providing a more robust framework for running these kinds of services.
So building on the great work of others and being somewhat persistent myself, I’ve worked out what you need in order to run a qdevice on the UDM-Pro. And now for the unfortunately necessary disclaimer:
The instructions that follow can render your UDM-Pro an unusable brick and possibly void your warranty. Follow these steps at your own risk, and if you decide to try them make sure it’s not in a production environment!
First, SSH into your UDM-Pro as root. From there, execute the following:
    # unifi-os shell
    # curl -L https://raw.githubusercontent.com/boostchicken/udm-utilities/master/on-boot-script/packages/udm-boot_1.0.1-1_all.deb -o udm-boot_1.0.1-1_all.deb
    # dpkg -i udm-boot_1.0.1-1_all.deb
    # exit
Note that the above is modifying the underlying OS on the UDM-Pro. After exiting you’ll be back in the restricted shell. Proceed with running these commands:
    # cd /mnt/data/on_boot.d
    # wget https://raw.githubusercontent.com/boostchicken/udm-utilities/master/dns-common/on_boot.d/10-dns.sh -O 10-qnetd.sh
    (edit this file to match your vlan, change "dns" to "qnetd" everywhere)
    # /mnt/data/on_boot.d/10-qnetd.sh
    # cd /mnt/data/podman/cni
    # wget https://raw.githubusercontent.com/boostchicken/udm-utilities/master/cni-plugins/20-dns.conflist -O 20-qnetd.conflist
    (edit this file to match your network, change "dns" to "qnetd")
    # docker network ls
    (you should see qnetd)
    # mkdir -p /mnt/data_ext/corosync && chmod 777 /mnt/data_ext/corosync
    # docker run -d --network=qnetd --name=qnetd --cap-drop=ALL -p5403:5403 -v "/mnt/data_ext/corosync:/etc/corosync" --restart=always modelrockettier/corosync-qnetd:v3-arm64v8

Note that the wget lines use the raw.githubusercontent.com form of the URLs; fetching the github.com/.../blob/... page instead saves an HTML page rather than the script itself.
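The "change dns to qnetd everywhere" edit on the two downloaded files can be scripted. Here is a minimal sketch, assuming a blanket substitution really is what you want; rename_dns_to_qnetd is a hypothetical helper name of my own, not something shipped with udm-utilities. The VLAN and network settings still need to be edited by hand.

```shell
# Hypothetical helper: replace every "dns" with "qnetd" in one file.
# Assumption: nothing in the template needs to keep the string "dns".
rename_dns_to_qnetd() {
    file="$1"
    tmp="$(mktemp)" || return 1
    # Write the substituted text to a temp file, then copy it back over
    # the original so the file keeps its existing permissions.
    sed 's/dns/qnetd/g' "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

For example, run rename_dns_to_qnetd /mnt/data/on_boot.d/10-qnetd.sh and read through the result before executing the script.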
The chmod 777 on /mnt/data_ext/corosync poses a security risk, but I was too impatient to look for a more suitable workaround. Note that you can ignore errors about healthchecks after running the docker run line. From this point, you should proceed with the prerequisites and instruction sections of Dockerized Corosync QNet Daemon starting at step #3. Assuming everything has gone well, you should be able to log in to your qnetd container on the UDM-Pro and see your Proxmox cluster nodes connected:
    # docker exec -it qnetd /bin/bash
    $ corosync-qnetd-tool -s
    QNetd address:        *:5403
    TLS:                  Supported (client certificate required)
    Connected clients:    2
    Connected clusters:   1
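If you want to check this from a script rather than by eye, the client count can be pulled out of that status text. This is a rough sketch that assumes the "Connected clients:" label stays exactly as shown here; qnetd_clients is a hypothetical helper name.

```shell
# Hypothetical helper: read `corosync-qnetd-tool -s` output on stdin and
# print only the number of connected clients.
qnetd_clients() {
    awk -F: '/^Connected clients/ {
        gsub(/[ \t]/, "", $2)   # strip the padding after the colon
        print $2
    }'
}

# Intended use on the UDM-Pro (no -it, since the output is piped):
#   docker exec qnetd corosync-qnetd-tool -s | qnetd_clients
```

For a two-node cluster the printed value should be 2 once both Proxmox nodes have registered.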
You can also verify from one of your Proxmox cluster nodes:
    # pvecm status
    Cluster information
    -------------------
    Name:             mycluster
    Config Version:   2
    Transport:        knet
    Secure auth:      on

    Quorum information
    ------------------
    Date:             Tue Jul 14 21:17:12 2020
    Quorum provider:  corosync_votequorum
    Nodes:            2
    Node ID:          0x00000001
    Ring ID:          1.17
    Quorate:          Yes

    Votequorum information
    ----------------------
    Expected votes:   3
    Highest expected: 3
    Total votes:      3
    Quorum:           2
    Flags:            Quorate Qdevice

    Membership information
    ----------------------
        Nodeid      Votes    Qdevice Name
    0x00000001          1    A,V,NMW 192.168.50.5 (local)
    0x00000002          1    A,V,NMW 192.168.50.6
    0x00000000          1            Qdevice
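The "Expected votes: 3" and "Quorum: 2" figures in that output come from plain majority arithmetic, which also shows why the qdevice is worth the trouble. Here is a small sketch of the calculation; the function names are my own for illustration and are not corosync commands.

```shell
# Majority quorum for a given number of expected votes:
# floor(votes / 2) + 1, using shell integer division.
quorum_for() {
    echo $(( $1 / 2 + 1 ))
}

# How many votes can disappear before quorum is lost.
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

quorum_for 2           # two nodes, no qdevice        -> 2
tolerated_failures 2   # losing either node is fatal  -> 0
quorum_for 3           # two nodes plus the qdevice   -> 2
tolerated_failures 3   # one vote can go missing      -> 1
```

With only two votes, either node going down leaves the survivor without quorum; the third vote from the qdevice lets the cluster ride out the loss of one node.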
If you’ve made it this far, congratulations on your new two-node HA cluster! Feedback and improvements to this process are welcome. I’ll follow up in a few months with a status update on how it’s working.