Point 2: Compute Plane (OptiPlex Proxmox Cluster), FINAL

Role

  • Cluster that runs all Docker Swarm workloads
  • Kept separate from the out-of-band control plane (Watchtower)
  • Designed to tolerate loss of one physical node without losing quorum
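
  The quorum goal above can be checked with simple arithmetic. The sketch below assumes the usual majority rule (more than half of the voters must be reachable), which is how Swarm's Raft managers behave and how Corosync counts votes when every node has one vote. The function names are illustrative, not part of any tool.

```python
def quorum(voters: int) -> int:
    """Smallest majority for a group with the given number of voters."""
    return voters // 2 + 1


def tolerated_failures(voters: int) -> int:
    """How many voters can be lost while a majority is still reachable."""
    return voters - quorum(voters)


if __name__ == "__main__":
    # Three voters matches this design: 3 Proxmox nodes, 3 Swarm managers.
    for n in (1, 2, 3, 4, 5):
        print(f"{n} voters: quorum={quorum(n)}, tolerates={tolerated_failures(n)}")
```

  With three voters the majority is two, so one host can fail. An even count gains nothing: four voters still tolerate only one failure.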

Physical hosts

  • 3× Dell OptiPlex Micro 7010: pve01-pve03
  • Local NVMe only; no shared storage dependency
  • Hosts sized with headroom; no aggressive CPU/RAM overcommit by default

Proxmox cluster

  • 3-node Proxmox VE cluster with Corosync over LAN
  • Static IPs on all hosts
  • vmbr0 = primary LAN bridge; VLAN-capable but unused initially
  • Proxmox HA: off by default (may be added later via separate design)

VM layout per host

  • Each OptiPlex runs exactly 2× Ubuntu Server LTS VMs:
    • 1× Swarm Manager VM
    • 1× Swarm Worker VM
  • No additional "misc" VMs on these hosts without an explicit architecture update
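
  The layout rule is easy to check mechanically. This is a minimal sketch, assuming an inventory kept as a plain dictionary; the VM names (swarm-mgr-01 and so on) are hypothetical, since this spec does not define a naming scheme.

```python
from collections import Counter

# Hypothetical inventory: Proxmox host -> list of (vm_name, role).
LAYOUT = {
    "pve01": [("swarm-mgr-01", "manager"), ("swarm-wrk-01", "worker")],
    "pve02": [("swarm-mgr-02", "manager"), ("swarm-wrk-02", "worker")],
    "pve03": [("swarm-mgr-03", "manager"), ("swarm-wrk-03", "worker")],
}

# The spec allows exactly one manager VM and one worker VM per host.
EXPECTED = Counter({"manager": 1, "worker": 1})


def layout_violations(layout):
    """Return one message per host whose VM set differs from EXPECTED."""
    problems = []
    for host, vms in sorted(layout.items()):
        roles = Counter(role for _name, role in vms)
        if roles != EXPECTED:
            problems.append(
                f"{host}: expected 1 manager + 1 worker, found {dict(roles)}"
            )
    return problems


if __name__ == "__main__":
    for line in layout_violations(LAYOUT) or ["layout OK"]:
        print(line)
```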

Swarm roles and placement

  • Total: 3 managers, 3 workers (one of each per host)
  • Managers hold Swarm Raft state and scheduling decisions
  • Workers run application workloads
  • Managers are schedulable only for light/infra tasks; no heavy or noisy apps
  • Node labels and placement constraints enforce "apps → workers" by default
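
  One way to express "apps → workers" is a node label plus a service constraint. The sketch below only builds the Docker CLI argument lists; it does not run them. The label key and values (tier=apps, tier=infra), node names, and service name are assumptions for illustration. `docker node update --label-add` and `docker service create --constraint` are standard Docker CLI options.

```python
import shlex


def label_node_cmd(node: str, tier: str) -> list[str]:
    """Command that attaches the hypothetical label tier=<tier> to a node."""
    return ["docker", "node", "update", "--label-add", f"tier={tier}", node]


def app_service_cmd(name: str, image: str) -> list[str]:
    """Command that creates a service restricted to nodes labelled tier=apps."""
    return [
        "docker", "service", "create",
        "--name", name,
        "--constraint", "node.labels.tier==apps",
        image,
    ]


if __name__ == "__main__":
    # Hypothetical node and service names.
    print(shlex.join(label_node_cmd("swarm-wrk-01", "apps")))
    print(shlex.join(label_node_cmd("swarm-mgr-01", "infra")))
    print(shlex.join(app_service_cmd("demo-app", "nginx:stable")))
```

  Swarm also exposes a built-in `node.role` attribute, so a constraint of `node.role==worker` would work without custom labels; a custom label leaves room for finer tiers later.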

Resource allocation (initial)

  • Manager VM
    • 2 vCPU
    • 4–6 GB RAM
    • ~40 GB disk
  • Worker VM
    • 4–6 vCPU
    • 16–24 GB RAM
    • ≥100 GB disk
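
  To see what the two VMs ask of each host, the allocations can be summed. This sketch reads the manager RAM, worker vCPU, and worker RAM figures as ranges (4 to 6 GB, 4 to 6 vCPU, 16 to 24 GB) and treats the disk figures as minimums. The 32 GB host in the example is purely hypothetical; this spec does not state host RAM.

```python
# (low, high) per resource. Disk uses the stated approximate/minimum size for both ends.
MANAGER = {"vcpu": (2, 2), "ram_gb": (4, 6), "disk_gb": (40, 40)}
WORKER = {"vcpu": (4, 6), "ram_gb": (16, 24), "disk_gb": (100, 100)}


def per_host_total(*vms):
    """Sum the low and high ends of each resource across the given VMs."""
    return {
        key: (sum(vm[key][0] for vm in vms), sum(vm[key][1] for vm in vms))
        for key in vms[0]
    }


def ram_headroom_gb(host_ram_gb, totals):
    """RAM left for the hypervisor when every VM sits at the top of its range."""
    return host_ram_gb - totals["ram_gb"][1]


if __name__ == "__main__":
    totals = per_host_total(MANAGER, WORKER)
    print(totals)
    # Hypothetical host size, for illustration only.
    print("headroom on a 32 GB host:", ram_headroom_gb(32, totals), "GB")
```

  Under these assumptions each host commits 6 to 8 vCPU, 20 to 30 GB RAM, and at least 140 GB of disk to its two VMs.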

Storage model

  • VM disks: local Proxmox storage (ZFS or LVM-thin), no shared VM disks
  • Container data: bind-mounts inside VMs
  • Swarm control plane and core workloads do not depend on shared storage
  • Production data path:
    • Primary: TerraMaster
    • Backup: TerraMaster → Synology via rsync
    • Offsite: Synology → cloud
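
  The TerraMaster → Synology hop could look like the sketch below, which builds an rsync argument list without running it. Both paths are hypothetical, as is the choice to mirror deletions; the spec only says the backup uses rsync. `--archive`, `--delete`, and `--dry-run` are standard rsync options.

```python
import shlex


def rsync_backup_cmd(source, dest, mirror=True, dry_run=True):
    """Build an rsync command that copies source to dest.

    mirror=True adds --delete so files removed at the source are also
    removed at the destination (an assumption, not stated in the spec).
    dry_run=True adds --dry-run so nothing is changed until reviewed.
    """
    cmd = ["rsync", "--archive"]
    if mirror:
        cmd.append("--delete")
    if dry_run:
        cmd.append("--dry-run")
    cmd += [source, dest]
    return cmd


if __name__ == "__main__":
    # Hypothetical mount point and destination; adjust to the real shares.
    cmd = rsync_backup_cmd(
        "/mnt/terramaster/data/",
        "backup@synology.example:/volume1/backup/data/",
    )
    print(shlex.join(cmd))
```

  Defaulting to a dry run keeps a mistyped path from deleting backup data on the first attempt.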

Networking assumptions

  • All Proxmox hosts and VMs attach to primary LAN via vmbr0
  • Compute plane runs on a flat LAN at baseline
  • Detailed VLAN and IP design will live in a separate networking architecture document, which this spec will reference

Operational constraints ("never do this")

  • Do not run Docker workloads or Swarm nodes directly on Proxmox hosts
  • Do not run heavy or stateful application stacks on manager VMs
  • Do not introduce shared storage as a hard dependency for Swarm or cluster boot
  • Do not use storage appliances (TerraMaster, Synology, etc.) as Swarm managers or workers

Expansion and change model

  • To add compute capacity:
    • Add a new OptiPlex node to the Proxmox cluster
    • Create at least one new Swarm Worker VM on that host
    • Join the VM to Swarm with standard labels and constraints
    • Gradually rebalance workloads; no redesign of existing nodes required
  • Any change that alters manager count, enables Proxmox HA, or significantly changes storage/networking models requires an explicit architecture review and doc update
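
  The capacity-expansion steps can be sketched as an ordered command plan. The code below only assembles Docker CLI argument lists and notes where each would run; it executes nothing. The label tier=apps, the node name, the manager address, and the service name are hypothetical. The commands themselves (`docker swarm join-token`, `docker swarm join`, `docker node update --label-add`, `docker service update --force`) are standard Docker CLI, and 2377 is Swarm's default management port.

```python
import shlex


def expansion_plan(manager_addr, node_name, token="<worker-token>", services=()):
    """Return (where, argv) steps for adding one worker VM to the Swarm."""
    steps = [
        # 1. Fetch the worker join token from an existing manager.
        ("manager", ["docker", "swarm", "join-token", "-q", "worker"]),
        # 2. Join the new VM using that token.
        ("new-vm", ["docker", "swarm", "join", "--token", token,
                    f"{manager_addr}:2377"]),
        # 3. Apply the (hypothetical) standard label so apps can land here.
        ("manager", ["docker", "node", "update", "--label-add", "tier=apps",
                     node_name]),
    ]
    # 4. Optionally force a rolling update per service so tasks spread out.
    for service in services:
        steps.append(
            ("manager", ["docker", "service", "update", "--force", service])
        )
    return steps


if __name__ == "__main__":
    plan = expansion_plan("192.0.2.10", "swarm-wrk-04", services=["demo-app"])
    for where, argv in plan:
        print(f"[{where}] {shlex.join(argv)}")
```

  Forcing service updates one at a time matches the "gradually rebalance" step: each service restarts its tasks on its own schedule, and existing nodes need no changes.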