I recently switched the majority of my self-hosted services over to Proxmox running on a custom-built 1U Supermicro server. It’s been working so well that I decided to build a second one and set up a two-node HA cluster to take advantage of live migration, among other features. Proxmox uses corosync to track cluster status, but the catch is that a two-node cluster needs a third vote (a qdevice) to provide quorum and act as a tiebreaker. Fortunately you don’t need a full machine for this, which could get expensive. Most rational homelab folks will opt for a Raspberry Pi or an existing NAS in this case, but I wanted to try something different.
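To see why the tiebreaker matters, consider the vote math: corosync grants quorum only to a strict majority of votes, so a two-node cluster that loses one node also loses quorum. A quick sketch of the arithmetic:

```shell
# With 2 nodes, the majority threshold is 2, so losing one node (1 vote left)
# means losing quorum. Adding a qdevice brings the total to 3 votes:
total_votes=3
quorum=$(( total_votes / 2 + 1 ))
echo "majority threshold: $quorum votes"   # prints 2: one node can fail and the
                                           # survivor plus qdevice keep quorum
```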

Having also recently switched from a Ubiquiti Security Gateway Pro to the newer UniFi Dream Machine Pro (UDM-Pro), I thought I would try running an extra Docker container on this little ARM64 “unified” router/server. Many words have been spilled over the usefulness (or lack thereof) of the current UniFi OS, and early attempts at running custom services didn’t go well. But as anyone in the homelab scene knows, the community is a persistent bunch and has prevailed in providing a more robust framework for running these kinds of services.

So building on the great work of others and being somewhat persistent myself, I’ve worked out what you need in order to run a qdevice on the UDM-Pro. And now for the unfortunately necessary disclaimer:

The instructions that follow can turn your UDM-Pro into an unusable brick and possibly void your warranty. Follow these steps at your own risk, and if you decide to try them, make sure it’s not in a production environment!

The Magic

First, SSH into your UDM-Pro as root. From there, execute the following:

# unifi-os shell
root@udmpro:/# curl -L https://raw.githubusercontent.com/boostchicken/udm-utilities/master/on-boot-script/packages/udm-boot_1.0.1-1_all.deb -o udm-boot_1.0.1-1_all.deb
root@udmpro:/# dpkg -i udm-boot_1.0.1-1_all.deb
root@udmpro:/# exit
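If you want to confirm the package took before moving on, hop back into the unifi-os shell and check the service it installs (udm-boot is the systemd unit name that .deb provides, assuming the 1.0.1 release above):

```shell
# Inside the unifi-os shell: verify the on-boot service is registered and enabled
systemctl is-enabled udm-boot
```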

Note that the above is modifying the underlying OS on the UDM-Pro. After exiting you’ll be back in the restricted shell. Proceed with running these commands:

# cd /mnt/data/on_boot.d
# wget https://raw.githubusercontent.com/boostchicken/udm-utilities/master/dns-common/on_boot.d/10-dns.sh -O 10-qnetd.sh
(edit this file to match your VLAN, changing "dns" to "qnetd" everywhere)
# chmod +x 10-qnetd.sh
# /mnt/data/on_boot.d/10-qnetd.sh
# cd /mnt/data/podman/cni
# wget https://raw.githubusercontent.com/boostchicken/udm-utilities/master/cni-plugins/20-dns.conflist -O 20-qnetd.conflist
(edit this file to match your network, changing "dns" to "qnetd")
# docker network ls
(you should see qnetd)
# mkdir -p /mnt/data_ext/corosync && chmod 777 /mnt/data_ext/corosync
# docker run -d --network=qnetd --name=qnetd --cap-drop=ALL -p5403:5403 -v "/mnt/data_ext/corosync:/etc/corosync" --restart=always modelrockettier/corosync-qnetd:v3-arm64v8
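If you’d rather script the two “edit this file” steps above, a blunt find-and-replace works, assuming the only occurrences of “dns” in those files are the ones being renamed (still double-check the VLAN and subnet values by hand, since those differ per network):

```shell
# Rename every "dns" reference to "qnetd" in both downloaded files
sed -i 's/dns/qnetd/g' /mnt/data/on_boot.d/10-qnetd.sh
sed -i 's/dns/qnetd/g' /mnt/data/podman/cni/20-qnetd.conflist
```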

The chmod 777 here poses a security risk, but I was too impatient to look for a more suitable workaround. You can ignore errors about health checks after running the docker run line. From this point, proceed with the prerequisites and instructions sections of Dockerized Corosync QNet Daemon, starting at step #3. Assuming everything has gone well, you should be able to log in to your qnetd container on the UDM-Pro and see your Proxmox cluster nodes connected:

# docker exec -it qnetd /bin/bash
coroqnetd@6587531c9d73:/$ corosync-qnetd-tool -s
QNetd address:			*:5403
TLS:				Supported (client certificate required)
Connected clients:		2
Connected clusters:		1
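For reference, the Proxmox-side portion of that guide amounts to roughly the following (the package name and pvecm subcommand come from the Proxmox docs; substitute your UDM-Pro’s address for the placeholder):

```shell
# On every Proxmox node: install the qdevice client package
apt install corosync-qdevice

# On exactly one node: register the qnetd running on the UDM-Pro
# (<UDM-PRO-IP> is a placeholder for your router's address)
pvecm qdevice setup <UDM-PRO-IP>
```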

You can also verify from one of your Proxmox cluster nodes:

root@node1:~# pvecm status
Cluster information
Name:             mycluster
Config Version:   2
Transport:        knet
Secure auth:      on

Quorum information
Date:             Tue Jul 14 21:17:12 2020
Quorum provider:  corosync_votequorum
Nodes:            2
Node ID:          0x00000001
Ring ID:          1.17
Quorate:          Yes

Votequorum information
Expected votes:   3
Highest expected: 3
Total votes:      3
Quorum:           2
Flags:            Quorate Qdevice

Membership information
    Nodeid      Votes    Qdevice Name
0x00000001          1    A,V,NMW (local)
0x00000002          1    A,V,NMW
0x00000000          1            Qdevice

If you’ve made it this far, congratulations on your new two-node HA cluster! Feedback and improvements to this process are welcome. I’ll follow up in a few months with a status update on how it’s working.