docs(home-lab): DR runbooks page (needs RTO/RPO + paging path decisions) #21

@JacobPEvans

Goal

Create operational DR runbooks for the home lab.

Why this needs user input

DR pages are step-by-step failover/failback procedures. They need:

  1. RTO/RPO targets per service tier: what is the recovery time objective (RTO) for Splunk, for Cribl, and for the homelab dashboard? What is the recovery point objective (RPO), i.e. how much data loss is acceptable for each?
  2. Paging path: who or what gets notified when DR triggers? A personal pager, a push notification, email, or nothing?
  3. Failback validation steps: what proves DR has succeeded? Specific synthetic queries, health checks, or smoke tests?
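
To make the first question concrete, here is a minimal sketch of how per-service targets could be recorded and checked after a DR exercise. The service keys, the `RecoveryTarget` and `met_targets` names, and every number are hypothetical placeholders; the real values are exactly what this issue is asking for.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryTarget:
    """Recovery objectives for one service, in minutes."""
    rto_minutes: int  # maximum tolerated time to restore service
    rpo_minutes: int  # maximum tolerated window of lost data


# PLACEHOLDER values only: the real targets are the open question in this issue.
TARGETS = {
    "splunk": RecoveryTarget(rto_minutes=60, rpo_minutes=15),
    "cribl": RecoveryTarget(rto_minutes=120, rpo_minutes=60),
    "homelab-dashboard": RecoveryTarget(rto_minutes=1440, rpo_minutes=1440),
}


def met_targets(service: str, downtime_minutes: float, data_loss_minutes: float) -> bool:
    """Return True if a DR exercise stayed within both objectives for the service."""
    target = TARGETS[service]
    return (
        downtime_minutes <= target.rto_minutes
        and data_loss_minutes <= target.rpo_minutes
    )
```

A table like this could also drive the verification step of each runbook, since the measured downtime and data loss of a test failover can be compared directly against the agreed targets.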

The current docs (about/homelab.mdx) describe DR conceptually but don't operationalize it.

Proposed page

  • Path: infrastructure/dr-runbooks.mdx
  • Sidebar group: Infrastructure
  • Tier: 2
  • Key sources: terraform-aws (AWS DR infra), observability/tf-splunk-aws.mdx (existing AWS Splunk doc), ansible-proxmox (host-level recovery)

Done definition

  • One runbook per critical service (Splunk failover, Proxmox restore from snapshot, network failover).
  • Each runbook has: trigger conditions, prerequisites, step-by-step actions, verification commands, rollback plan.
  • Linked from infrastructure/overview.mdx.
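
The per-runbook requirement above could be checked mechanically. The sketch below assumes each runbook uses one markdown heading per required part; the heading names and the `missing_sections` helper are hypothetical, not an existing convention in the docs repo.

```python
# Hypothetical heading names, one per part listed in the done definition.
REQUIRED_SECTIONS = (
    "Trigger conditions",
    "Prerequisites",
    "Steps",
    "Verification",
    "Rollback",
)


def missing_sections(runbook_text: str) -> list[str]:
    """Return the required section names that have no matching markdown heading."""
    headings = {
        line.lstrip("#").strip().lower()
        for line in runbook_text.splitlines()
        if line.startswith("#")
    }
    return [name for name in REQUIRED_SECTIONS if name.lower() not in headings]
```

For example, a draft that contains only `## Trigger conditions`, `## Prerequisites`, and `## Steps` would be reported as missing `Verification` and `Rollback`.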
