UDM-Pro docker container as a Proxmox quorum device
I recently switched the majority of my self-hosted services over to Proxmox running on a custom-built 1U Supermicro server. It’s been working so well that I decided to build a second one and set up a two-node HA cluster to take advantage of live migration, among other features. Proxmox uses corosync to track cluster status, and corosync needs a majority of votes to keep quorum: with only two nodes you have 2 votes and need 2, so losing either node stalls the cluster. The fix is a third vote from a qdevice acting as a tie-breaker. Fortunately you don’t need a full third machine for this, which could get expensive. Most rational homelab folks would opt for a Raspberry Pi or an existing NAS here, but I wanted to try something different.
Having also recently switched from a Ubiquiti Security Gateway Pro to the newer UniFi Dream Machine Pro (UDM-Pro), I thought I would try running an extra Docker container on this little ARM64 “unified” router/server. Many words have been spilled over the usefulness (or lack thereof) of the current UniFi OS, and early attempts at running custom services didn’t go so well. However, as anyone in the homelab scene knows, the community is a persistent bunch and has prevailed in providing a more robust framework for running these kinds of services.
So building on the great work of others and being somewhat persistent myself, I’ve worked out what you need in order to run a qdevice on the UDM-Pro. And now for the unfortunately necessary disclaimer:
The instructions that follow can render your UDM-Pro an unusable brick and possibly void your warranty. Follow these steps at your own risk, and if you decide to try them make sure it’s not in a production environment!
The Magic
First ssh as root into your UDM-Pro. From there, execute the following:
# unifi-os shell
root@udmpro:/# curl -L https://raw.githubusercontent.com/boostchicken/udm-utilities/master/on-boot-script/packages/udm-boot_1.0.1-1_all.deb -o udm-boot_1.0.1-1_all.deb
root@udmpro:/# dpkg -i udm-boot_1.0.1-1_all.deb
root@udmpro:/# exit
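(Optional) If you want to confirm the boot service registered before moving on, you can hop back into the UniFi OS shell and check it. I’m assuming the systemd unit is named udm-boot to match the package, so adjust if yours differs:
# unifi-os shell
root@udmpro:/# systemctl is-enabled udm-boot   # unit name assumed from the package name
root@udmpro:/# exit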
Note that the above is modifying the underlying OS on the UDM-Pro. After exiting you’ll be back in the restricted shell. Proceed with running these commands:
# cd /mnt/data/on_boot.d
# wget https://raw.githubusercontent.com/boostchicken/udm-utilities/master/dns-common/on_boot.d/10-dns.sh -O 10-qnetd.sh
(edit this file to match your VLAN, changing "dns" to "qnetd" everywhere, and chmod +x it so it can be executed)
# /mnt/data/on_boot.d/10-qnetd.sh
# cd /mnt/data/podman/cni
# wget https://raw.githubusercontent.com/boostchicken/udm-utilities/master/cni-plugins/20-dns.conflist -O 20-qnetd.conflist
(edit this file to match your network, change "dns" to "qnetd")
# docker network ls
(you should see qnetd)
# mkdir -p /mnt/data_ext/corosync && chmod 777 /mnt/data_ext/corosync
# docker run -d --network=qnetd --name=qnetd --cap-drop=ALL -p5403:5403 -v "/mnt/data_ext/corosync:/etc/corosync" --restart=always modelrockettier/corosync-qnetd:v3-arm64v8
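Before moving on it’s worth a quick sanity check that the container actually came up; these are just standard Docker commands against the qnetd container created above:
# docker ps --filter name=qnetd
# docker logs qnetd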
The chmod 777 above poses a security risk, but I was too impatient to look for a more suitable workaround. Note that you can ignore errors about healthchecks after running the docker run line. From this point, you should proceed with the prerequisites and instruction sections of Dockerized Corosync QNet Daemon starting at step #3; the sketch after this paragraph shows roughly what that boils down to on the Proxmox side.
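For orientation, this is the stock (non-containerized) qnetd procedure as the Proxmox documentation describes it: install corosync-qdevice on every cluster node and register the qnetd host with pvecm. The address below is an assumption for illustration (substitute wherever your UDM-Pro answers), and because the qnetd here lives in a container, follow the linked guide for the certificate exchange rather than relying on pvecm qdevice setup to handle it over SSH:
root@node1:~# apt install corosync-qdevice
root@node2:~# apt install corosync-qdevice
root@node1:~# pvecm qdevice setup 192.168.50.1   # address of the UDM-Pro (assumed; use yours)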
Assuming everything has gone well, you should be able to log in to the qnetd container on the UDM-Pro and see your Proxmox cluster nodes connected:
# docker exec -it qnetd /bin/bash
coroqnetd@6587531c9d73:/$ corosync-qnetd-tool -s
QNetd address:              *:5403
TLS:                        Supported (client certificate required)
Connected clients:          2
Connected clusters:         1
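If you want more than the summary, corosync-qnetd-tool can also list the connected clients; the -l flag lists them and -v adds membership and vote details:
coroqnetd@6587531c9d73:/$ corosync-qnetd-tool -l -v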
You can also verify from one of your Proxmox cluster nodes:
root@node1:~# pvecm status
Cluster information
-------------------
Name:             mycluster
Config Version:   2
Transport:        knet
Secure auth:      on

Quorum information
------------------
Date:             Tue Jul 14 21:17:12 2020
Quorum provider:  corosync_votequorum
Nodes:            2
Node ID:          0x00000001
Ring ID:          1.17
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2
Flags:            Quorate Qdevice

Membership information
----------------------
    Nodeid      Votes    Qdevice Name
0x00000001          1    A,V,NMW 192.168.50.5 (local)
0x00000002          1    A,V,NMW 192.168.50.6
0x00000000          1            Qdevice
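As an optional sanity test of the tie-breaker, power off one node and re-run pvecm status on the survivor; with the qdevice supplying the third vote you should still see Quorate: Yes (with Total votes: 2) rather than a stalled cluster. Something along these lines, assuming node2 is the one you take down:
root@node2:~# shutdown -h now   # take one node offline (either node works)
root@node1:~# pvecm status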
If you’ve made it this far, congratulations on your new two-node HA cluster! Feedback and improvements to this process are welcome. I’ll follow up in a few months with a status update on how it’s working.