Cluster Restore for Disaster Recovery

Restoring a Proxmox VE cluster is different from restoring a standalone node because cluster state is split between:

  • the cluster database (/var/lib/pve-cluster/config.db and related files)
  • the pmxcfs mount (/etc/pve, which is not a regular filesystem)
  • corosync configuration (/etc/corosync/**)

ProxSave includes dedicated safety handling for these parts to reduce the risk of overwriting critical cluster state on a running cluster.

How to run the cluster restore workflow

TUI (interactive screens):

proxsave --restore

CLI (text prompts):

proxsave --restore --cli

For deeper troubleshooting output:

proxsave --restore --cli --log-level debug

Critical safety rule: /etc/pve is never overwritten

/etc/pve is a pmxcfs FUSE mount backed by the cluster database. ProxSave intentionally never writes to /etc/pve, even when restoring to the system root (/).

If the backup contains /etc/pve data, ProxSave treats it as export-only and extracts it to an export directory (for example pve-config-export-YYYYMMDD-HHMMSS under your ProxSave base directory).

The “base directory” is the value of BASE_DIR; when not set explicitly, it typically defaults to /opt/proxsave.

SAFE vs RECOVERY (cluster database restore)

If the backup is marked as a cluster backup and your restore selection includes the pve_cluster category, ProxSave asks you to choose how to handle the cluster database:

SAFE (recommended for most situations)

SAFE does not overwrite /var/lib/pve-cluster/config.db.

Instead, ProxSave exports cluster-related data (and any export-only /etc/pve data you selected) to the export directory. It can then optionally help you apply parts of that data via pvesh (for example, VM/CT configs for the current node, storage.cfg, and datacenter.cfg).

Note: the optional pvesh apply step targets the current node name (derived from the hostname) when importing VM/CT configs.

Use SAFE when:

  • the cluster is still healthy (or you are not sure)
  • you only need to recover configuration, not force a database replacement
  • you want to review exported config before applying anything
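
To review exported configuration before applying anything, a plain diff against the live file is often enough. The following is a minimal sketch and not part of ProxSave: diff_exported is a hypothetical helper, and because the internal layout of the export directory is not assumed, the file is located by name rather than by a fixed path.

```shell
# Hypothetical helper (not a ProxSave command): compare a live config file
# with the copy found in a ProxSave export directory.
# Usage: diff_exported <export_dir> <file_name> [live_root]
diff_exported() {
    export_dir="$1"
    file_name="$2"
    live_root="${3:-/etc/pve}"

    # The export layout is not assumed; take the first file with that name.
    exported="$(find "$export_dir" -type f -name "$file_name" | head -n 1)"
    if [ -z "$exported" ]; then
        echo "no $file_name found under $export_dir" >&2
        return 2
    fi

    # diff exits 0 when the files are identical and 1 when they differ.
    diff -u "$live_root/$file_name" "$exported"
}
```

For example, diff_exported /opt/proxsave/pve-config-export-YYYYMMDD-HHMMSS storage.cfg shows what applying the exported storage.cfg would change; lines prefixed with + exist only in the exported copy.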

RECOVERY (disaster recovery only)

RECOVERY restores the full cluster database under /var/lib/pve-cluster/**.

Use RECOVERY only when:

  • the cluster is offline/isolated and you understand split-brain risks
  • there is no healthy node left to act as the source of truth

What ProxSave does in RECOVERY:

  • stops key PVE services (pve-cluster, pvedaemon, pveproxy, pvestatd)
  • attempts to unmount /etc/pve
  • restores the selected categories (including pve_cluster) back to /
  • restarts services after restore
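
When troubleshooting a RECOVERY run, it can help to check whether /etc/pve is currently mounted and whether the services listed above are running. A minimal sketch, assuming a Linux mounts table at /proc/mounts; is_mounted is a hypothetical helper and not part of ProxSave.

```shell
# Hypothetical helper: succeed if <path> appears as a mount point in a
# mounts table (defaults to /proc/mounts).
# Usage: is_mounted <path> [mounts_file]
is_mounted() {
    awk -v target="$1" '
        $2 == target { found = 1 }
        END { exit(found ? 0 : 1) }
    ' "${2:-/proc/mounts}"
}

# Example on a PVE node (commented out so the sketch is safe to source):
#   if is_mounted /etc/pve; then echo "/etc/pve is mounted"; else echo "/etc/pve is not mounted"; fi
#   systemctl is-active pve-cluster pvedaemon pveproxy pvestatd
```

After a completed restore, /etc/pve would be expected to be mounted again and the four services active; anything else is a reason to read the debug log before retrying.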

What ProxSave restores (relevant categories)

ProxSave restore is category-based. For cluster disaster recovery, the most relevant categories are:

  • pve_cluster: cluster database and state under /var/lib/pve-cluster/** (SAFE or RECOVERY depending on your choice)
  • pve_config_export: export-only /etc/pve/** (never written to system paths)
  • corosync: /etc/corosync/**
  • network, ssh, services: OS-level reachability and service configuration (optional; depends on the restore mode you choose)
  • zfs: ZFS-related configuration under /etc/zfs/** (optional)

Multi-node guidance (operational notes)

ProxSave is not a cluster orchestrator. If you use RECOVERY on one node while other nodes are still running with a different cluster database, you can cause a split-brain.

Recommended approach:

  • If other nodes are still healthy, prefer rejoining the failed node to the cluster over replacing its cluster database.
  • If you must use RECOVERY, ensure other nodes are powered off or fully isolated from the corosync network before proceeding.
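
Before a RECOVERY restore, a rough reachability check against the other cluster members can catch a node that is still online. The sketch below is an illustration under assumptions: it reads ringN_addr entries from a corosync.conf in the usual nodelist format, corosync_peer_addrs is a hypothetical helper, and a failed ping only suggests (it does not prove) that a node is isolated from the corosync network.

```shell
# Hypothetical helper: print every ringN_addr value from a corosync.conf.
# Usage: corosync_peer_addrs [conf_file]
corosync_peer_addrs() {
    awk '$1 ~ /^ring[0-9]+_addr:$/ { print $2 }' "${1:-/etc/corosync/corosync.conf}"
}

# Example (commented out; the list includes this node's own address):
#   for addr in $(corosync_peer_addrs); do
#       if ping -c 1 -W 1 "$addr" >/dev/null 2>&1; then
#           echo "WARNING: $addr still answers"
#       fi
#   done
```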

Post-restore checks

After a cluster-related restore, verify quorum, node membership, storage status, and the pve-cluster service log:

pvecm status
pvecm nodes
pvesm status
journalctl -u pve-cluster -n 100
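
The checks above can be wrapped so that one failure does not hide the others. A minimal sketch: check and failures are hypothetical names introduced here, and a zero exit status only shows that a command ran successfully, not that the cluster is healthy, so the output still needs to be read.

```shell
# Hypothetical wrapper: run a command, keep going on failure, count failures.
failures=0
check() {
    printf '== %s\n' "$*"
    if ! "$@"; then
        failures=$((failures + 1))
        printf 'FAILED: %s\n' "$*" >&2
    fi
}

# Example on a PVE node (commented out so the sketch is safe to source):
#   check pvecm status
#   check pvecm nodes
#   check pvesm status
#   check journalctl -u pve-cluster -n 100
#   echo "$failures check(s) failed"
```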

Rollback and confirmation

  • ProxSave asks you to confirm by typing RESTORE before applying changes.
  • Before overwriting system-path categories, ProxSave creates a safety backup tarball at /tmp/proxsave/restore_backup_YYYYMMDD_HHMMSS.tar.gz; a rollback hint is printed in the restore output.
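
To see what a rollback would restore, the safety backup can be listed without extracting it. A minimal sketch, assuming the tarballs follow the restore_backup_YYYYMMDD_HHMMSS.tar.gz naming so that lexical order matches chronological order; newest_safety_backup is a hypothetical helper, and the rollback hint printed by ProxSave remains the authoritative instruction.

```shell
# Hypothetical helper: print the newest safety backup tarball, if any.
# Usage: newest_safety_backup [backup_dir]
newest_safety_backup() {
    backup_dir="${1:-/tmp/proxsave}"
    newest=""
    # Glob results are sorted, so the last match carries the latest timestamp.
    for archive in "$backup_dir"/restore_backup_*.tar.gz; do
        [ -f "$archive" ] || continue
        newest="$archive"
    done
    [ -n "$newest" ] || return 1
    printf '%s\n' "$newest"
}

# Example (commented out): list the first entries without extracting.
#   tar -tzf "$(newest_safety_backup)" | head -n 20
```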