✅ Point 2 – Compute Plane (OptiPlex Proxmox Cluster) – FINAL
Role
- Cluster that runs all Docker Swarm workloads
- Separate from out-of-band control (Watchtower)
- Designed to tolerate loss of one physical node without losing quorum
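The one-node fault tolerance stated above follows from majority voting. Below is a minimal sketch of that arithmetic, assuming every node carries exactly one vote. The function names are illustrative only and are not part of any Proxmox, Corosync, or Docker API.

```python
def quorum_size(voters: int) -> int:
    """Smallest strict majority among `voters` equally weighted nodes."""
    if voters < 1:
        raise ValueError("need at least one voter")
    return voters // 2 + 1


def failures_tolerated(voters: int) -> int:
    """How many voters can be lost while the remainder still forms a majority."""
    return voters - quorum_size(voters)


if __name__ == "__main__":
    # Compare the 3-node baseline with hypothetical larger clusters.
    for n in (3, 4, 5):
        print(f"{n} nodes: quorum={quorum_size(n)}, tolerates={failures_tolerated(n)}")
```

With three voters the quorum is two, so one host can fail. A fourth voter raises the quorum to three without tolerating any additional failure, which is one reason odd voter counts are usually preferred.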
Physical hosts
- 3× Dell OptiPlex Micro 7010: pve01, pve02, pve03
- Local NVMe only; no shared storage dependency
- Hosts sized with headroom; no aggressive CPU/RAM overcommit by default
Proxmox cluster
- 3-node Proxmox VE cluster with Corosync over LAN
- Static IPs on all hosts
- vmbr0 = primary LAN bridge; VLAN-capable but unused initially
- Proxmox HA: off by default (may be added later via separate design)
VM layout per host
- Each OptiPlex runs exactly 2× Ubuntu Server LTS VMs:
  - 1× Swarm Manager VM
  - 1× Swarm Worker VM
- No additional "misc" VMs on these hosts without an explicit architecture update
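The per-host layout rule above is easy to check mechanically. The following is a rough sketch, assuming an inventory expressed as a mapping from Proxmox host name to the roles of the VMs on it. The inventory format and the function name are hypothetical and not taken from any Proxmox tooling.

```python
from collections import Counter
from typing import Dict, List

# The spec allows exactly one manager VM and one worker VM per host.
EXPECTED_ROLES = Counter({"manager": 1, "worker": 1})


def layout_violations(inventory: Dict[str, List[str]]) -> List[str]:
    """Return one message per host whose VM roles differ from the expected layout."""
    problems = []
    for host, roles in sorted(inventory.items()):
        if Counter(roles) != EXPECTED_ROLES:
            problems.append(
                f"{host}: expected 1 manager + 1 worker, found {sorted(roles)}"
            )
    return problems


if __name__ == "__main__":
    # Hypothetical inventory: pve03 carries an unapproved "misc" VM.
    inventory = {
        "pve01": ["manager", "worker"],
        "pve02": ["manager", "worker"],
        "pve03": ["manager", "worker", "misc"],
    }
    for line in layout_violations(inventory):
        print(line)
```

A check like this could run before any VM is created, so that an extra VM triggers the architecture update the rule calls for.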
Swarm roles and placement
- Total: 3 managers, 3 workers (one of each per host)
- Managers hold Swarm Raft state and scheduling decisions
- Workers run application workloads
- Managers are schedulable only for light/infra tasks; no heavy or noisy apps
- Node labels and placement constraints enforce "apps → workers" by default
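To illustrate the "apps → workers" default, here is a small sketch that builds a `docker service create` argument list with a placement constraint. This spec does not define the actual label names, so the sketch uses Docker's built-in `node.role` attribute; custom labels would follow the `node.labels.<key>==<value>` form. The helper name, service names, and image names are hypothetical.

```python
from typing import List


def service_create_args(name: str, image: str, allow_managers: bool = False) -> List[str]:
    """Build an argument list for `docker service create`.

    By default the service is constrained to worker nodes. Only services
    explicitly flagged with allow_managers=True are left unconstrained
    (an assumption here: reserved for light infrastructure tasks).
    """
    args = ["docker", "service", "create", "--name", name]
    if not allow_managers:
        # Built-in Swarm node attribute; keeps application tasks off managers.
        args += ["--constraint", "node.role==worker"]
    args.append(image)
    return args


if __name__ == "__main__":
    # Hypothetical application service, pinned to workers by default.
    print(" ".join(service_create_args("web", "example/app:1.0")))
```

Making the worker constraint the default means a forgotten flag results in the safe placement, not in an application landing on a manager.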
Resource allocation (initial)
- Manager VM
  - 2 vCPU
  - 4–6 GB RAM
  - ~40 GB disk
- Worker VM
  - 4–6 vCPU
  - 16–24 GB RAM
  - ≥100 GB disk
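Adding the two VM profiles gives the guest footprint each host must carry. This is a minimal sketch of that sum using the figures listed above. The `VmSpec` type and `host_footprint` function are illustrative names, and the disk values are treated as baseline figures (about 40 GB and at least 100 GB).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class VmSpec:
    vcpu: Tuple[int, int]    # (min, max) vCPUs
    ram_gb: Tuple[int, int]  # (min, max) RAM in GB
    disk_gb: int             # baseline disk in GB


MANAGER = VmSpec(vcpu=(2, 2), ram_gb=(4, 6), disk_gb=40)
WORKER = VmSpec(vcpu=(4, 6), ram_gb=(16, 24), disk_gb=100)


def host_footprint(*vms: VmSpec) -> dict:
    """Sum the (min, max) ranges and baseline disk of all VMs on one host."""
    return {
        "vcpu": (sum(v.vcpu[0] for v in vms), sum(v.vcpu[1] for v in vms)),
        "ram_gb": (sum(v.ram_gb[0] for v in vms), sum(v.ram_gb[1] for v in vms)),
        "disk_gb": sum(v.disk_gb for v in vms),
    }


if __name__ == "__main__":
    print(host_footprint(MANAGER, WORKER))
```

For one manager plus one worker this comes to 6–8 vCPU, 20–30 GB RAM, and roughly 140 GB of disk per host, before any allowance for the hypervisor itself.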
Storage model
- VM disks: local Proxmox storage (ZFS or LVM-thin), no shared VM disks
- Container data: bind mounts inside VMs
- Swarm control plane and core workloads do not depend on shared storage
- Production data path:
  - Primary: TerraMaster
  - Backup: TerraMaster → Synology via rsync
  - Offsite: Synology → cloud
Networking assumptions
- All Proxmox hosts and VMs attach to primary LAN via vmbr0
- Compute plane runs on a flat LAN at baseline
- Detailed VLAN and IP design will live in a separate networking architecture document that this spec can reference
Operational constraints ("never do this")
- Do not run Docker workloads or Swarm nodes directly on Proxmox hosts
- Do not run heavy or stateful application stacks on manager VMs
- Do not introduce shared storage as a hard dependency for Swarm or cluster boot
- Do not use storage appliances (TerraMaster, Synology, etc.) as Swarm managers or workers
Expansion and change model
- To add compute capacity:
  - Add a new OptiPlex node to the Proxmox cluster
  - Create at least one new Swarm Worker VM on that host
  - Join the VM to Swarm with standard labels and constraints
  - Gradually rebalance workloads; no redesign of existing nodes required
- Any change that alters manager count, enables Proxmox HA, or significantly changes storage/networking models requires an explicit architecture review and doc update
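The "join the VM to Swarm with standard labels" step in the capacity procedure above might look roughly like the following sketch, which only assembles the commands and does not run them. The node name, manager address, token placeholder, and label set are all hypothetical, since this spec does not define them.

```python
from typing import Dict, List


def worker_onboarding_cmds(
    node: str, manager_addr: str, token: str, labels: Dict[str, str]
) -> List[List[str]]:
    """Return the commands needed to join a new worker and label it.

    The first command is run on the new worker VM; the `docker node update`
    commands must be run on an existing manager.
    """
    cmds = [["docker", "swarm", "join", "--token", token, manager_addr]]
    for key, value in sorted(labels.items()):
        cmds.append(["docker", "node", "update", "--label-add", f"{key}={value}", node])
    return cmds


if __name__ == "__main__":
    # All values below are placeholders for illustration.
    for cmd in worker_onboarding_cmds(
        node="swarm-worker-04",
        manager_addr="192.0.2.10:2377",
        token="<worker-join-token>",
        labels={"tier": "app"},
    ):
        print(" ".join(cmd))
```

Joining with a worker token leaves the manager count at three, so this path stays outside the changes that require an architecture review.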