
Releases: doutsis/vmbackup

vmbackup 0.6.0 — Unification release

05 Jun 05:40

Unification release. vmbackup and vmrestore now ship from one source tree as one Debian package containing two binaries. Both binaries always carry the same version, share a single lib/ of cross-tool helpers, and read from a single SQLite catalogue. Existing flags, invocations, systemd units, and operator scripts continue to work unchanged. The old standalone vmrestore package is replaced cleanly on upgrade.

Added

  • Unified package — vmbackup and vmrestore ship from a single source tree, build via one Makefile, and install from one .deb. The package declares Provides: vmrestore, Replaces: vmrestore (<< 0.6.0), and Conflicts: vmrestore (<< 0.6.0), so apt removes the old standalone vmrestore package automatically on upgrade. The vmrestore binary continues to live at /usr/local/bin/vmrestore (symlink to /opt/vmbackup/vmrestore.sh).
  • Shared lib/ consumed by both binaries — 16 libraries now provide one canonical implementation of behaviour that was previously duplicated or divergent across the two tools: logging, exit codes, versioning, per-VM locking, signal traps, config-instance resolution, period handling, the backup-tree walker, the read-only SQLite reader, path and VM-name helpers, TPM artefact reading, and the virtnbdbackup / virtnbdrestore and virsh wrappers. Where a behaviour exists in both binaries it now comes from exactly one place, so they can no longer disagree by accident.
  • vmrestore is catalogue-aware — vmrestore --list reads the same SQLite catalogue that drives vmbackup --status --chains and appends a per-VM Chains: <N> active, <N> broken, last backup <ISO> line. It falls back to walker-only output when the catalogue is unavailable, preserving the standalone DR contract.
  • vmrestore writes restore-session rows to the catalogue — schema bumped to v2.2 with a new restore_sessions table. Every invocation records start, end, VM, restore type, and final outcome. A new vmbackup --status --restores subcommand reads the table, so backup and restore history live in one place. Catalogue failures degrade to a single WARN and never block the restore; --dry-run writes no row.
  • vmrestore gains per-VM locking and signal handlers — restoring a VM now takes the same lock a backup of that VM takes, so backup and restore can no longer race each other. SIGINT and SIGTERM clean up staging directories and partial disk files.
  • vmrestore --restore-path overlap guard — refuses any path that equals or sits inside a configured vmbackup BACKUP_PATH (checked across all config instances), preventing accidental restores into the live backup tree.
  • Broken-chain detector for vmrestore — incomplete chains (truncated by an interrupted backup, or partially archived) are no longer offered as the default latest restore target. The reason for skipping is logged so the operator can override with --include-incomplete (forensic use only).
  • In-session re-entry guard for chain archival — vmbackup refuses to archive the same VM twice within one invocation, eliminating collision-suffixed .archives/chain-<date>.1 directories.
  • Misplaced-database guard — vmbackup refuses to create the SQLite catalogue inside .archives/ or under a period directory, closing a class of bugs where a misconfigured backup path could spawn a second catalogue that silently diverged from the canonical one.
  • vmbackup --cleanup-stale-manifests — one-shot subcommand that removes leftover per-VM chain-manifest.json files from the backup tree. Invoked automatically by debian/postinst on package upgrade and safe to re-run manually.

Changed

  • chain_health.archive_size_bytes populated at archive transition — the retention-path archive caller now writes the archive size immediately, matching the active-path caller. Previously the column stayed at 0 until manual reconciliation.
  • TPM-restore reporting is now truthful — when disks restore successfully but TPM/BitLocker unlock fails, vmrestore no longer reports overall success. The summary line carries a TPM ✓ / TPM ✗ (manual unlock required) token (omitted on VMs without TPM).
  • SharePoint replication verify logs actionable diagnostics on mismatch — when the post-upload rclone check reports a difference, the cloud transport now logs the rclone check exit code, elapsed time, and the specific differing/missing files, replacing the previous opaque Found differences message. Transient SharePoint verify warnings are now diagnosable instead of mysterious.

Fixed

  • False-positive backup failures from substring ERROR matches in the virtnbdbackup log — the post-run guard used a case-insensitive substring match for ERROR, which mis-flagged successful runs whenever the log mentioned internal error, ERROR — trim not supported, or carried ANSI colour codes. False positives recorded the chain as failed and promoted the next monthly backup from incremental to full, inflating destination write volume. Now anchored to virtnbdbackup's own log-line format and ANSI-stripped. Originally reported and proposed by @hostarts with co-author @houssamchergui.
  • Email notifier "intentionally skipped" return value logged as a delivery failure — all four send_backup_report call sites collapsed the notifier's three return values (delivered / transport failure / intentionally skipped) into pass-or-fail, so operators who set EMAIL_ON_SUCCESS=no saw a misleading "Failed to send email report" WARN on every successful run. A new _handle_notifier_rc dispatcher distinguishes the three cases and is wired into all four call sites. Originally proposed by @hostarts (email-only scope adopted).
  • get_last_backup_timestamp() blind to archived chains — the probe's find -maxdepth was too shallow to see archived data after the chain-archive layout change, so offline-unchanged VMs were treated as having no prior backup and re-ran a full backup nightly. Probe depth corrected; the offline-skip path now fires as intended.
  • False "incomplete backup" WARN on clean shutdown — cleanup_on_exit emitted a misleading WARN on every clean exit because its duplicate-call gate was an in-memory flag the success path could not clear before the trap fired. The gate now uses the sqlite_session_end() return code itself. Independently reported by @hostarts in PR #4.
  • TPM artefact validation accepted empty bundles — validate_tpm_backup() was -s-testing the tpm2/ directory instead of its files, so an empty TPM bundle passed validation. Replaced with a per-file size check and a minimum-size floor on tpm2-00.permall.
  • xmllint listed as required but never invoked — phantom dependency removed.
  • Dead restore_vm_tpm() body removed — had different semantics from vmrestore's restore_tpm() and would have corrupted a recovery if ever called. Already marked # DEAD CODE; now physically gone.
  • Undefined VM-name sanitisation helper in prune paths — vmbackup's prune code paths called a helper that had never been defined, so the call was a silent no-op. Replaced by the canonical helper in lib/vm_name_utils.sh.
  • NVRAM/disk coherency on restore (BdsDxe: No mapping boot failure) — vmrestore paired restored disks with the live host NVRAM instead of the NVRAM captured at the backed-up checkpoint, so restoring an older period over a VM that had since rebooted left UEFI variables (SecureBoot keys, BootOrder, MOK) out of step with the disks and the guest failed to boot. It now pairs each restore with the matching checkpoint's NVRAM — clones and in-place alike — backing up the live NVRAM to <path>.before-restore.<timestamp> first.
  • chain_check_complete false-positive on chains containing CD-ROM devices — the completeness check treated every <disk> in the libvirt checkpoint XML as a data disk, but that XML carries no device= attribute, so CD-ROMs (which virtnbdbackup correctly skips) were indistinguishable from genuinely missing disks — flagging healthy chains ⚠ INCOMPLETE in --list-restore-points and refusing them without --include-incomplete. The check now consults the per-checkpoint domain XML snapshot, which preserves device='cdrom', and skips those phantom targets. Chains without that snapshot keep prior behaviour.
  • Stale chain_id recorded on SIGTERM / SIGINT — interrupted backups wrote a chain_health row whose chain_id was derived from an in-memory index that had never been committed to disk, so the interrupted-chain entry could not be correlated with anything retention or restore could see. The id is now derived from the on-disk chain layout, so the row matches the chain that actually exists.
  • vmrestore skipped valid restore points on large backup trees (SIGPIPE under pipefail) — the chain-presence probe used find … | grep -q .; with set -o pipefail now globally enabled, grep -q closed the pipe on the first match and the resulting SIGPIPE made find exit non-zero, so has_backup_data() wrongly returned false. Rewritten to find … -print -quit. The same pipefail-vulnerable idiom was audited and fixed everywhere it occurred (across vmbackup.sh, lib/chain_validation.sh, and an integration test).
  • _state/logs/ rotation never ran; central logs grew unbounded — the rotation routine was gated behind a directory that no code path ever created, so it had been dead since the early-2026 modular refactor: per-VM, replication, and email logs accumulated indefinitely, and vmbackup.log / vmprune.log grew append-only forever. Rotation now runs at most once per calendar day from the pre-backup hook, and the central logs are size-capped by a new LOG_MAX_BYTES knob (default 50 MiB) — an oversized file is rolled to <name>.<epoch> and aged out under the existing LOG_KEEP_DAYS rule. Deployed installs inherit the default with no config change; the first post-upgrade session clears...
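
The SIGPIPE fix above is worth illustrating, because the failing idiom looks harmless. In find … | grep -q ., grep exits on the first match, find is killed by SIGPIPE on its next write, and under set -o pipefail the pipeline then reports failure even though a match was found. The sketch below shows the pipe-free form the notes describe. The body is a hypothetical re-creation, not the project's code, and the *.data file pattern is an assumption borrowed from the retention notes further down.

```shell
# Hypothetical re-creation of the probe: succeeds when at least one *.data
# file exists anywhere under the given directory.
has_backup_data() {
    local first_match
    # -print -quit makes find stop by itself after the first hit, so there is
    # no pipe and therefore no SIGPIPE. The verdict comes from whether find
    # printed anything, not from a pipeline exit status.
    first_match=$(find "$1" -type f -name '*.data' -print -quit 2>/dev/null)
    [ -n "$first_match" ]
}
```

Because find terminates on the first match, this form is also cheap on large backup trees, which is where the original failure showed up.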

v0.5.6

26 Apr 08:20

Changed

  • Structured exit codes — categorised non-zero exits (config / lock / storage / VM / tool / CLI / dependency) let monitoring distinguish why a run failed without parsing logs. Symmetric with vmrestore.

Fixed

  • Retention not enforced for skipped or excluded VMs — Retention was wired only to the post-backup success path, so any VM that was skipped (SKIP_OFFLINE_UNCHANGED_BACKUPS=true) or excluded (policy=never) accumulated period directories indefinitely with no rotation. The same code path also created the period directory via mkdir -p before deciding whether the backup would run, leaving an empty stub on disk every time a VM was skipped, excluded, or failed before first write. Combined effect on production: VMs at RETENTION_WEEKS=4 carrying 8+ weekly directories, including pure stubs that no later session would ever clean up. Retention is now invoked from the skip and exclude paths via the new run_retention_for_unbacked_vm wrapper, which orders stub cleanup before retention so the period count is correct before the limit check runs. Excluded VMs (policy=never) have stubs removed but their populated periods are preserved by the policy short-circuit. Failed-path retention remains intentionally suppressed; failed-path stubs are reaped on the next non-failed session.

Added

  • Stub-aware retention pipeline for unbacked VMs — A new run_retention_for_unbacked_vm wrapper in modules/retention_module.sh runs stub cleanup → retention → orphan retention in that order whenever a VM is skipped or excluded, so the on-disk period count is correct before the limit check fires. Stub cleanup is performed by the new _remove_empty_period_dirs helper, which removes pure stub directories (zero *.data, no .archives/) and is anchored to BACKUP_PATH with a path-shape regex guard, deliberately bypassing _remove_period's keep-last, replication, and protected-period guards (all inappropriate for empty directories). Stub deletions in SQLite go through a new UPDATE-only library function sqlite_mark_chain_deleted_if_exists (in lib/sqlite_module.sh) to avoid injecting phantom active-then-deleted chain_health rows when a pure stub never had a row to begin with.
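
The "pure stub" test described above (zero *.data files, no .archives/) can be sketched as a small predicate. This is an illustration only: is_stub_period_dir and list_stub_period_dirs are hypothetical names, and the sketch only reports candidates. It leaves out the BACKUP_PATH anchoring, the path-shape guard, and the SQLite update that the real _remove_empty_period_dirs helper is described as performing.

```shell
# Hypothetical predicate: a period directory is a pure stub when it has no
# .archives/ subdirectory and holds no *.data file at any depth.
is_stub_period_dir() {
    local dir="$1"
    [ -d "$dir" ] || return 1
    [ ! -e "$dir/.archives" ] || return 1
    [ -z "$(find "$dir" -type f -name '*.data' -print -quit 2>/dev/null)" ]
}

# Hypothetical dry-run lister: prints every stub period directory that sits
# directly under one VM directory, without deleting anything.
list_stub_period_dirs() {
    local vm_dir="$1" d
    for d in "$vm_dir"/*/; do
        [ -d "$d" ] || continue           # glob matched nothing
        is_stub_period_dir "${d%/}" && printf '%s\n' "${d%/}"
    done
    return 0
}
```

Listing before deleting mirrors the ordering in the notes: stubs are identified and removed first, so the period count that retention later compares against its limit only includes populated directories.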

Changed

  • retention_events audit attribution — Field 12 (triggered_by) no longer carries hardcoded function-name literals; it now records the high-level event that drove the prune, with new enum values skipped, excluded, and orphan_dir joining the existing post_backup, prune, and orphan_retention. The action column also gains remove_stub for the new stub-cleanup path. Internally, _remove_period, _remove_orphan_period, _remove_archive_chain, _remove_archives_in_period, _remove_vm_all, run_retention_for_vm, and run_orphan_retention_for_vm all gain a new trigger parameter so the originating event propagates through the call chain into the audit row — making it possible to attribute retention activity to skipped-VM and excluded-VM sessions for the first time.

v0.5.5

25 Apr 09:02

Added

  • Configurable backup-destination space thresholds — Four new optional vmbackup.conf settings (DISK_ABORT_PCT, DISK_WARN_PCT, DISK_ABORT_GB, DISK_WARN_GB) let check_disk_space() be tuned per instance. Percent and absolute thresholds are evaluated together so either can fire independently; setting any threshold to 0 disables it. Defaults (20%/30% and 10 GB/50 GB) preserve previous behaviour.
  • Disk-space snapshot per session (schema v2.1) — sessions table gains disk_free_bytes and disk_total_bytes columns, populated by sqlite_session_end() from a df capture against BACKUP_PATH. Migration from v2.0 is automatic, idempotent and additive.
  • --status reporting command — Seven report modes: sessions, VM history, failures, replication, chains, storage, policies. Terminal tables by default, --csv for export, --days N for time window, --all-instances to span every config instance. Sessions output is job-type-aware (backup / prune / replicate-only / mixed) and scoped to the active CONFIG_INSTANCE by default. The storage report includes per-VM size trends and a destination-growth projection that names the configured DISK_ABORT_PCT threshold.
  • Post-upgrade config advisory in postinst — On dpkg upgrade, lists .dpkg-dist files awaiting merge with per-file diff -u commands and points custom config instances at config/template/vmbackup.conf for new variables. Visible only on upgrade.
  • vmbackup --config-prune-removed — Cleanup helper that comments out configuration variables removed in the running release. Idempotent; supports --dry-run. Operates on default/ and all custom instances; skips template/. Per-name allowlist keyed to release version, designed to be extended by future config-pruning ENHs.
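
The threshold rules above (percent and absolute limits evaluated together, either able to fire, 0 disabling a limit) can be sketched as follows. The four variable names and their defaults come from the notes. Everything else is an assumption made for illustration: that the thresholds apply to free space, that the comparison is strictly "less than", and the helper names below_threshold and classify_disk_space.

```shell
# Defaults quoted in the release notes; a value of 0 disables that threshold.
DISK_ABORT_PCT=${DISK_ABORT_PCT:-20}
DISK_WARN_PCT=${DISK_WARN_PCT:-30}
DISK_ABORT_GB=${DISK_ABORT_GB:-10}
DISK_WARN_GB=${DISK_WARN_GB:-50}

# Hypothetical helper: true when the threshold is enabled (non-zero) and the
# free-space figure is below it. Usage: below_threshold <free> <threshold>
below_threshold() {
    [ "$2" -gt 0 ] && [ "$1" -lt "$2" ]
}

# Hypothetical classifier: prints abort, warn or ok for integer free-space
# figures. Usage: classify_disk_space <free_pct> <free_gb>
classify_disk_space() {
    local free_pct="$1" free_gb="$2"
    if below_threshold "$free_pct" "$DISK_ABORT_PCT" ||
       below_threshold "$free_gb" "$DISK_ABORT_GB"; then
        echo abort
    elif below_threshold "$free_pct" "$DISK_WARN_PCT" ||
         below_threshold "$free_gb" "$DISK_WARN_GB"; then
        echo warn
    else
        echo ok
    fi
}
```

Under these assumptions, a destination with 60% free but only 40 GB left would be classified as warn, because the absolute limit fires even though the percentage limit does not. Setting DISK_WARN_GB=0 would turn that same case into ok.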

Fixed

  • Pre-flight aborts failed silently with no email — check_backup_destination(), check_scratch_path() and check_disk_space() exit before main() reaches its normal email send, so destination/scratch/space failures left no notification (only a journal entry). cleanup_on_exit() now sends a failure report on any non-zero exit once a SQLite session has been registered, gated by _EMAIL_SENT so the existing success/failure path remains the single source of truth on normal runs.
  • SKIP_OFFLINE_UNCHANGED_BACKUPS is now honoured — Previously the variable was defined and validated but never read; offline-unchanged VMs were always skipped regardless of the setting. The change-detection call in backup_vm() is now gated by this flag.

Removed

  • OFFLINE_CHANGE_DETECTION_THRESHOLD — Was never read by the change-detection code (which uses strict mtime > last_backup). The variable inverted the safe default and would have introduced false negatives if implemented. Existing values in operator configs are inert; run vmbackup --config-prune-removed to clean them up.
  • EMAIL_INCLUDE_REPLICATION — Was never read. Hiding replication results from the email is operator-hostile (silent on success, dangerous on failure). The empty-section logic already handles the no-replication case.
  • EMAIL_INCLUDE_DISK_SPACE — Was never read; gated a section that was never built. A real disk-usage email section is tracked as ENH-16.
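
For reference, the strict mtime > last_backup comparison mentioned in the first item above amounts to very little code. A minimal sketch, assuming GNU stat and a last-backup time expressed in epoch seconds; the function name disk_changed_since is hypothetical:

```shell
# Hypothetical helper: succeeds when the disk image was modified strictly
# after the last backup. Usage: disk_changed_since <disk_image> <last_backup_epoch>
disk_changed_since() {
    local mtime
    # stat -c %Y prints the modification time in seconds since the epoch.
    # If the file cannot be examined, report "changed" so a backup still runs.
    mtime=$(stat -c %Y -- "$1" 2>/dev/null) || return 0
    [ "$mtime" -gt "$2" ]
}
```

A strict comparison leaves no tunable window in which a modified disk could be treated as unchanged, which is the kind of false negative the removed threshold is said to have risked.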

v0.5.4

12 Apr 08:09

Fixed

  • SQLite session not finalised on normal exit — Sessions could be left permanently "in progress" in the database. Now finalised unconditionally in cleanup_on_exit() with idempotency guard.
  • Silent permission failures on backup path — chown/chmod failures now log warnings instead of being silently suppressed with || true.
  • Stale lock cleanup could delete active locks — Now validates PID liveness before deletion and uses correct vmbackup-*.lock glob.
  • Session PID lock race condition — Replaced non-atomic check-then-write with noclobber pattern.
  • Double email on SIGTERM — Added _EMAIL_SENT guard flag.
  • virtnbdbackup not confirmed dead before retry — Added pgrep/pkill cleanup and virsh domjobabort before retry.
  • Reorder config-instance validation before session lock — --config-instance nonexistent now fails immediately at startup.
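
Two of the lock fixes above rely on standard shell techniques: creating a lock file atomically with noclobber, and checking that the recorded PID is dead before removing a lock. The sketch below illustrates both. The function names are hypothetical, and storing the bare PID as the lock file's only content is an assumption about the file format.

```shell
# Hypothetical helper: atomically create a lock file holding our PID.
# With noclobber set, the redirection fails if the file already exists, so
# the existence check and the write happen as one step instead of two.
acquire_lock() {
    local lock_file="$1"
    ( set -o noclobber; echo "$$" > "$lock_file" ) 2>/dev/null
}

# Hypothetical helper: remove a lock only when the PID recorded in it is no
# longer running. Returns 0 if the stale lock was removed, 1 otherwise.
remove_lock_if_stale() {
    local lock_file="$1" pid
    pid=$(cat "$lock_file" 2>/dev/null) || return 1
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;            # empty or non-numeric: leave it alone
    esac
    kill -0 "$pid" 2>/dev/null && return 1  # owner still running: keep the lock
    rm -f -- "$lock_file"
}
```

A caller would try acquire_lock first and, on failure, call remove_lock_if_stale before retrying once. Note that kill -0 also fails for a live process owned by another user, so this liveness test is only reliable when the checker runs with sufficient privileges, such as root.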

Install

wget https://github.com/doutsis/vmbackup/releases/download/v0.5.4/vmbackup_0.5.4_all.deb
sudo dpkg -i vmbackup_0.5.4_all.deb

Full changelog: CHANGELOG.md

v0.5.3

10 Apr 07:22

Added

  • --run flag required to start backups — explicit mode for all operations
  • --vm targeted backup mode — back up specific VMs on demand
  • Unknown flag detection and --cancel-replication conflict guards
  • Root privilege check with clear error message
  • Global session lock to prevent concurrent invocations
  • session_type column in SQLite sessions table (schema v2.0)

Changed

  • SKIP_OFFLINE_UNCHANGED_BACKUPS default changed to true
  • --help output restructured
  • Systemd service updated with --run flag
  • Documentation rewritten — condensed from ~4,500 to ~2,900 lines

Removed

  • Host Configuration Backup feature

Fixed

  • RETENTION_ORPHAN_DRY_RUN config setting was being ignored

See CHANGELOG.md for full details.

v0.5.2

22 Mar 05:48

See CHANGELOG.md for details.

v0.5.1

18 Mar 02:51

Fixed

  • chain_health off-by-one — total_checkpoints and restorable_count were 0 after first backup instead of 1
  • restore_points counted per-disk instead of per-backup — multi-disk VMs reported 3× the correct count
  • csv_ variable name remnants — 25 stale variable names and dead CSV cleanup code removed
  • Archived chains missing vmconfig XML and TPM marker — chain archives were incomplete; now self-contained

See CHANGELOG.md for details.

v0.5.0

14 Mar 10:05

Initial public release.

Install:

sudo dpkg -i vmbackup_0.5.0_all.deb

See README for configuration and usage.