PVE Cluster Quorum Recovery: A Field Manual
When a Proxmox VE cluster loses quorum, /etc/pve flips to read-only. You can't start, stop, or migrate VMs. You can't edit configs. The cluster filesystem is a quorum gate, and without majority you don't get to write to it.
Most quorum-loss incidents look the same: one or more nodes are unreachable, corosync is unhappy, and the survivor sits there with pvecm status reporting Quorate: No. The recovery path depends on whether the missing nodes are coming back, whether you can still form a majority, and whether the underlying problem is the network or the nodes themselves.
This is the workflow I run when it happens.
Confirm quorum is actually lost
pvecm status
Key fields:
- Quorate: Yes/No
- Total votes vs Expected votes
- Highest expected (the cluster-configured baseline)
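Quorum needs a strict majority of the expected votes: floor(Expected votes / 2) + 1. If Total votes is below that number, you're not quorate. The Quorum field in the same output shows the threshold.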
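As a rough sketch of that arithmetic, the majority rule can be written as two small shell functions. The names quorum_threshold and is_quorate are made up for this example and are not part of pvecm; the sketch also assumes every node carries one vote and no tie-breaker is configured.

```shell
#!/bin/sh
# Votes needed for quorum under a simple majority rule:
# floor(expected / 2) + 1 (shell integer division already floors).
quorum_threshold() {
    expected=$1
    echo $(( expected / 2 + 1 ))
}

# Exit 0 when the votes present reach the majority threshold.
# Hypothetical helper for illustration only.
is_quorate() {
    total=$1
    expected=$2
    [ "$total" -ge "$(quorum_threshold "$expected")" ]
}
```

For a three-node cluster, quorum_threshold 3 prints 2, so one surviving node (is_quorate 1 3) fails the check while two survivors pass it.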
Cross-check membership:
corosync-quorumtool -s
corosync-cfgtool -s
Each ring should show Active. If it shows FAULTY, that's a network problem, not a node problem — fix the link before anything else.
Logs:
journalctl -u corosync -u pve-cluster -n 200 --no-pager
Look for link: 0 is down, Sync members, or New configuration with N nodes.
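To cut the journal down to just those messages, a small filter can help. This is a sketch: corosync_events is a hypothetical helper name, and the patterns are only the three strings mentioned above, loosened so that any link number matches.

```shell
#!/bin/sh
# Keep only the log lines that mention link-down events or membership changes.
# Reads log text on stdin; the patterns come from the messages named above.
corosync_events() {
    grep -E 'link: [0-9]+ is down|Sync members|New configuration'
}
```

Typical use would be to pipe the journal through it: journalctl -u corosync -u pve-cluster -n 200 --no-pager | corosync_events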
Scenario 1: One node down in a healthy 3+ node cluster
The cluster is still quorate. No recovery action needed on the survivors — they keep running. When the dead node comes back, corosync re-syncs automatically.
If pmxcfs is out of sync after the node reboots:
systemctl restart corosync pve-cluster
If that doesn't catch up, force a clean restart of just pmxcfs:
systemctl stop pve-cluster
systemctl start pve-cluster
⚠️ Never run pmxcfs -l (local mode) on more than one node simultaneously while reconnecting. They'll diverge and you'll spend longer reconciling than you saved. Local mode is for inspection only, never for sustained operation.
Scenario 2: Lost majority, single survivor
Two of three nodes are gone, you're on the last one, and you need critical VMs running now. Force the survivor to consider itself quorate:
pvecm expected 1
/etc/pve becomes writable, VMs can start, you can edit configs.
⚠️ This is a temporary survival measure, not a fix. The moment the dead nodes come back online with the original expected_votes, you have a split-brain risk. As soon as the cluster is healthy again:
pvecm expected 3 # or whatever your real cluster size is
⚠️ Never run pvecm expected 1 simultaneously on two separated nodes during a network partition. Both will consider themselves authoritative, both will accept writes, and you'll have two divergent versions of /etc/pve to merge by hand. Pick one side to be authoritative and shut corosync down on the other until the network is fixed.
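Before forcing the vote count, it is worth confirming that the node really reports Quorate: No. A minimal sketch, assuming the Quorate: line appears in pvecm status output in the form quoted earlier; quorate_field and lost_quorum are hypothetical helper names.

```shell
#!/bin/sh
# Print the value of the "Quorate:" field from `pvecm status` text on stdin.
quorate_field() {
    awk '$1 == "Quorate:" { print $2; exit }'
}

# Exit 0 only when the status text on stdin reports Quorate: No.
lost_quorum() {
    [ "$(quorate_field)" = "No" ]
}
```

Used as a guard, that becomes: pvecm status | lost_quorum && pvecm expected 1. The guard only checks the local view. It cannot tell you whether another partition is also alive, so the warning above still applies.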
Scenario 3: Network partition (split brain)
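Two parts of the cluster can each see their own members but not the other side. Whichever side holds a majority of the votes retains quorum. The minority side goes read-only. In an exact even split with no tie-breaker configured, neither side has a majority and both go read-only.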
Recovery:
- Identify the partition by running corosync-cfgtool -s on each side. Compare the ring addresses each side is bound to.
- Fix the underlying network issue: usually a switch, a firewall, or an MTU mismatch on the corosync ring.
- After re-merge, run pvecm status on every node and confirm they all agree on Quorate: Yes and the same node list.
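For the MTU case in the steps above, one way to test the path is a ping with fragmentation prohibited. This is a sketch assuming IPv4 and the Linux iputils ping; icmp_payload_for_mtu and check_ring_mtu are hypothetical helper names, and the peer address is whatever your corosync ring uses.

```shell
#!/bin/sh
# ICMP echo payload that exactly fills an IPv4 packet of the given MTU:
# MTU minus 20 bytes of IPv4 header minus 8 bytes of ICMP header.
icmp_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# Ping a peer's ring address with the don't-fragment bit set (-M do).
# Fails if the peer is unreachable or the path cannot carry packets
# of that size unfragmented. MTU defaults to 1500.
check_ring_mtu() {
    peer=$1
    mtu=${2:-1500}
    ping -M do -c 3 -W 2 -s "$(icmp_payload_for_mtu "$mtu")" "$peer"
}
```

If check_ring_mtu succeeds at 1500 but fails at 9000 on a ring you believed was running jumbo frames, some hop between the nodes has a smaller MTU configured.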
Watch corosync re-form the membership:
journalctl -u corosync -f
⚠️ Use a dedicated network for corosync if you can. A separate VLAN over a 1Gbit cross-connect is enough: corosync is latency-sensitive, and a noisy backup job on the management network can cause spurious partitions. The two-ring knet setup available since PVE 6 is even better.
Scenario 4: Permanently dead node
If a node is gone for good (hardware loss, decommissioning), remove it from the cluster cleanly. From a surviving, quorate node:
pvecm delnode <nodename>
Clean up its leftover state. /etc/pve is replicated by pmxcfs, so run this once on any quorate node:
rm -rf /etc/pve/nodes/<nodename>
Update expected_votes in /etc/pve/corosync.conf if needed:
quorum {
provider: corosync_votequorum
expected_votes: 2 # was 3
}
⚠️ Bump the config_version field in the totem section of corosync.conf whenever you edit it, or corosync won't pick up the change:
totem {
...
config_version: 7 # was 6
}
Then reload:
systemctl reload corosync
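The version bump described above is easy to forget, so it can be scripted. A minimal sketch: bump_config_version is a hypothetical helper that reads a corosync.conf on stdin and writes a copy with config_version incremented by one. It assumes the field sits on its own line as in the snippet above.

```shell
#!/bin/sh
# Copy stdin to stdout, incrementing the number on the config_version line.
# Everything else, including indentation, is passed through unchanged.
bump_config_version() {
    awk '
        /^[ \t]*config_version:[ \t]*[0-9]+/ {
            match($0, /[0-9]+/)
            ver = substr($0, RSTART, RLENGTH) + 1
            print substr($0, 1, RSTART - 1) ver substr($0, RSTART + RLENGTH)
            next
        }
        { print }
    '
}
```

One cautious way to use it is to write to a scratch copy first, for example bump_config_version < /etc/pve/corosync.conf > /etc/pve/corosync.conf.new, review the diff, and only then move the new file over the original. The .new filename is just a convention for this example.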
⚠️ If the dead node ever comes back online with its old config, it will think it's still part of the cluster and corosync will reject it noisily. Reinstall it from scratch before reusing the hardware.
Scenario 5: Rebuild from corrupted corosync state
When /etc/corosync/corosync.conf is out of sync between nodes, or authkey differences cause Authentication failed errors:
On the authoritative node (latest config, most recent state), confirm the source of truth:
cat /etc/pve/corosync.conf
/etc/pve/corosync.conf is the cluster-wide source of truth; /etc/corosync/corosync.conf is the local cache.
On a misbehaving node, stop services:
systemctl stop pve-cluster corosync
Copy the correct config and authkey from the authoritative node:
scp authoritative:/etc/corosync/corosync.conf /etc/corosync/
scp authoritative:/etc/corosync/authkey /etc/corosync/
Restart:
systemctl start corosync pve-cluster
⚠️ /etc/pve/priv/authkey.key and /etc/corosync/authkey are different files with different purposes. Wiping /etc/pve/priv/authkey.key breaks the web UI and node-to-node API operations. Wiping /etc/corosync/authkey breaks corosync membership. Don't touch either casually — and never assume they're interchangeable.
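To see whether a node's corosync files actually differ from the authoritative copy before overwriting anything, comparing checksums is usually enough. A sketch, with file_digest and digests_match as hypothetical helper names:

```shell
#!/bin/sh
# Print only the SHA-256 hex digest of a file.
file_digest() {
    sha256sum "$1" | awk '{ print $1 }'
}

# Exit 0 if both files are readable and identical, 1 if they differ,
# 2 if either file is missing or unreadable.
digests_match() {
    [ -r "$1" ] && [ -r "$2" ] || return 2
    [ "$(file_digest "$1")" = "$(file_digest "$2")" ]
}
```

Across nodes, the same idea works over ssh: capture ssh authoritative sha256sum /etc/corosync/authkey, take the first field, and compare it with file_digest /etc/corosync/authkey on the local node. Here authoritative is the same placeholder hostname used in the scp commands.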
The dist-upgrade trap
Upgrading PVE major versions (7 → 8, 8 → 9) triggers conffile prompts:
Configuration file '/etc/network/interfaces'
==> Modified (by you or by a script) since installation.
==> Package distributor has shipped an updated version.
What would you like to do about it?
⚠️ Always answer "keep the local version" for /etc/network/interfaces, /etc/corosync/corosync.conf, and /etc/hosts. Replacing any of these with the package default on a clustered node mid-upgrade is the fastest way to lose quorum and lock yourself out of a remote host simultaneously.
To avoid the question entirely:
DEBIAN_FRONTEND=noninteractive \
apt-get -o Dpkg::Options::="--force-confold" \
-o Dpkg::Options::="--force-confdef" \
dist-upgrade -y
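With --force-confold, dpkg keeps your file and saves the packaged version next to it with a .dpkg-dist suffix. After the upgrade it is worth listing those leftovers to see which conffiles the packages wanted to change. A small sketch, where list_conffile_leftovers is a hypothetical helper name:

```shell
#!/bin/sh
# List conffile leftovers under a directory (default /etc):
#   *.dpkg-dist  packaged version saved because the local file was kept
#   *.dpkg-old   previous local version saved because the new one was installed
list_conffile_leftovers() {
    dir=${1:-/etc}
    find "$dir" -type f \( -name '*.dpkg-dist' -o -name '*.dpkg-old' \) 2>/dev/null | sort
}
```

A line such as /etc/network/interfaces.dpkg-dist in the output means your interfaces file was kept and the packaged version is available for a manual diff.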
The pvecm add trap
Re-adding a previously-removed node to a cluster overwrites several local files on the joining node, including /etc/pve/storage.cfg. If the joining node had local-only storage definitions (a directory storage that doesn't exist on other nodes, an NFS mount unique to that host), they vanish.
Before re-adding:
cp /etc/pve/storage.cfg /root/storage.cfg.bak
After pvecm add:
diff /etc/pve/storage.cfg /root/storage.cfg.bak
Manually merge missing entries via the web UI — storage edits propagate to all nodes via pmxcfs.
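If the diff is noisy, it can be easier to list only the storage IDs that existed before the join and are gone now. A sketch under the assumption that each storage.cfg section starts with an unindented "type: id" header line and its properties are indented; missing_storages is a hypothetical helper name.

```shell
#!/bin/sh
# Print storage IDs present in the backup but absent from the current file.
# $1 = backup taken before the join, $2 = current storage.cfg
missing_storages() {
    awk '
        /^[a-z][a-z0-9]*:[ \t]/ {
            if (FILENAME == ARGV[1]) current[$2] = 1    # first file: current config
            else if (!($2 in current)) print $2          # second file: backup
        }
    ' "$2" "$1"
}
```

Running missing_storages /root/storage.cfg.bak /etc/pve/storage.cfg prints one storage ID per line. Those are the entries to re-create, restricted to the node that actually has them.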
Quick reference
| Symptom | Action |
|---|---|
| Quorate: No, majority gone | pvecm expected N where N = surviving votes |
| Single node, need VMs running NOW | pvecm expected 1 (temporary, then fix) |
| Node permanently lost | pvecm delnode <name> from a quorate node |
| Network partition | Fix the link, then watch journalctl -u corosync |
| Authentication failed in corosync | Sync /etc/corosync/authkey from authoritative |
| /etc/pve is read-only | You're not quorate. pvecm status to diagnose |
| Web UI broken after authkey work | You wiped /etc/pve/priv/authkey.key. Restore it. |
Quorum loss is rarely the actual problem — it's a symptom of a network failure, a node failure, or a botched upgrade. Fix the underlying cause first, then restore quorum. Forcing pvecm expected 1 to silence the alarm without understanding why it fired is how you turn a recoverable incident into a split-brain mess that takes a day to untangle.