---
# Hardware Specifications & Docker Swarm Topology Analysis
# Generated: 2026-03-12
# Subject Hosts: pve03 (10.0.0.203) vs pve04 (10.0.0.204)
# Context: Evaluating 3-node identical Proxmox cluster for Docker Swarm workloads
---

## EXECUTIVE SUMMARY

**Finding**: pve03 and pve04 are **NOT identical**, with meaningful differences:

- **pve03**: 10 cores, 23.6 GB RAM, unknown storage capacity (already clustered, running 3 VMs)
- **pve04**: 14 cores, 15 GB RAM, 238.5 GB NVMe SSD (fresh, not yet clustered)

**Recommendation for "3 identically-spec'd devices":**

- **Option A (Recommended)**: Use **pve04 as the template model**. Procurement should source 3× Intel Core i5-13500T machines with 15+ GB RAM and 240+ GB NVMe storage. pve04 is the better baseline (better single-thread performance, dedicated NVMe, fresh OS).
- **Option B**: Keep **pve03 as template**. Run a deeper audit on pve03's actual storage (it has 21 loop/dm devices; unclear if additional storage is attached). Backfill pve04 and a 3rd host to match pve03's full config.

**Verdict**: **pve04 > pve03 for Swarm baseline**. The i5-13500T offers superior CPU performance (4600 MHz boost vs 2885 MHz), dedicated fast storage, and is freshly provisioned. Use pve04 as the reference architecture for the 3rd node.


---

## DETAILED HARDWARE COMPARISON

### CPU Specifications

| Dimension | pve03 | pve04 | Status |
|-----------|-------|-------|--------|
| **Model** | Unknown / unrecognized | Intel Core i5-13500T | ✅ pve04 superior |
| **Architecture** | x86_64 | x86_64 | ✅ Match |
| **Socket Count** | 1 | 1 | ✅ Match |
| **Cores per Socket** | 10 | 14 | ⚠️ **MISMATCH** |
| **Logical CPUs (with HT)** | 10 | 20 | ⚠️ **MISMATCH** |
| **Max Frequency** | 2,885 MHz | 4,600 MHz | ⚠️ **pve04 ~59% higher** |
| **Min Frequency** | Unknown | 800 MHz | n/a |
| **Microcode Level** | 0x437 | 0x3a | n/a |

**Interpretation:**

- pve04's i5-13500T is a **13th-gen Intel desktop CPU** (2023), significantly newer and faster than pve03
- pve03's CPU could be a degraded/limited processor or a different i5/i7 SKU; this needs clarification
- **For Docker Swarm workloads**: pve04's higher clock speed (4600 MHz) favors latency-sensitive tasks; pve03's 10 cores are still adequate for the planned 2 VMs (manager + worker) per node

**Recommendation**: If strict "identical" is the mandate, **pve04 is the better model to replicate**. Purchasing 3× i5-13500T machines ensures:

1. Consistent single-threaded performance
2. Known thermal/power envelope
3. Support (retail CPUs, widely available)

---

### Memory (RAM) Specifications

| Dimension | pve03 | pve04 | Status |
|-----------|-------|-------|--------|
| **Total RAM** | 23.6 GB | 15.0 GB | ⚠️ **MISMATCH** |
| **Free RAM** | 12.4 GB | 13.0 GB | ⚠️ pve03 has extra, currently used |
| **Used (OS + Proxmox + VMs)** | ~11.2 GB | ~1.7 GB | ⚠️ pve03 heavier |

**Interpretation:**

- pve03: 23.6 GB total (likely 24 GB installed, e.g. 2× 12 GB or 8 GB + 16 GB SODIMM/UDIMM sticks)
- pve04: 15 GB total (likely 1× 16 GB, with ~1 GB reserved for BIOS/SMM)
- pve03 is using ~11 GB for the OS, the Proxmox daemons, and 3 running VMs
- pve04 usage is minimal (fresh install, no VMs)

**Validation Against Swarm Requirements:**

- Each node will host 2 VMs: 1 manager (2 cores, 2 GB RAM) + 1 worker (2 cores, 2 GB RAM)
- Proxmox overhead: ~2-4 GB per node
- **Minimum needed: 8+ GB RAM per node** (2 × 2 GB for VMs plus up to 4 GB overhead) ✅ Both qualify
- With the target design's 4 GB per VM, the requirement rises to 10–12 GB per node, which both hosts still meet
- **Optimal: 16 GB** ✅ pve04 approximately meets this (15 GB usable); pve03 exceeds it

**Recommendation**: Use **16 GB as the standard** for the 3-node cluster (matches pve04). This is cost-effective and provides ample headroom.

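
The per-node RAM arithmetic above can be reproduced with a small shell sketch. It is illustrative only: the function names `node_ram_required` and `ram_fits` are hypothetical, host RAM is truncated to whole GB (23 and 15), and the 4 GB case is taken from the target design's per-VM figure rather than from this section.

```shell
#!/bin/sh
# Per-node RAM requirement in whole GB: guest VMs plus hypervisor overhead.
# Arguments: VMs per node, RAM per VM (GB), Proxmox overhead (GB).
node_ram_required() {
  echo $(( $1 * $2 + $3 ))
}

# Succeeds when usable host RAM ($1) covers the requirement ($2), whole GB only.
ram_fits() {
  [ "$1" -ge "$2" ]
}

for vm_ram in 2 4; do
  low=$(node_ram_required 2 "$vm_ram" 2)    # assumes 2 GB Proxmox overhead
  high=$(node_ram_required 2 "$vm_ram" 4)   # assumes 4 GB Proxmox overhead
  echo "VMs at ${vm_ram} GB each: ${low}-${high} GB needed per node"
  # Usable RAM truncated to whole GB (assumption): pve03 = 23, pve04 = 15.
  for host in pve03:23 pve04:15; do
    name=${host%%:*}
    ram=${host##*:}
    if ram_fits "$ram" "$high"; then
      echo "  $name ($ram GB): fits"
    else
      echo "  $name ($ram GB): too small"
    fi
  done
done
```

Checking against the upper bound of the overhead range is a deliberately conservative choice; a host that only clears the lower bound would be flagged as too small.
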

---

### Storage Specifications

| Dimension | pve03 | pve04 | Status |
|-----------|-------|-------|--------|
| **Primary Disk(s)** | Unknown (21 loop/dm devices detected) | 1× 238.5 GB NVMe SSD | ⚠️ **pve04 transparent** |
| **Root FS Capacity** | 68 GB | 238.5 GB | ⚠️ **MISMATCH** |
| **Root FS Available** | 59 GB free | ~230 GB available | ⚠️ pve04 has more room |
| **Storage Type** | Unknown (likely SATA SSD or array) | NVMe SSD | n/a |

**Interpretation:**

- pve03's storage is **opaque**: 21 loop and device-mapper devices suggest:
  - Possible RAID configuration (dm-* = device mapper)
  - LVM (Logical Volume Manager) setup
  - Possibly shared storage mounted
  - Current state: ~68 GB LVM volume, 9 GB used
- pve04's storage is **straightforward**: Single 238.5 GB NVMe SSD, clean LVM setup, minimal OS footprint

**VM Storage Requirements (per node):**

- 1 Manager VM: 32 GB disk (from provisionspec in your playbook)
- 1 Worker VM: 32 GB disk
- **Total per node: 64 GB guest storage** (+ Proxmox root FS)
- **Total available after OS: pve03 ≈ 59 GB, pve04 ≈ 230 GB**

**⚠️ CRITICAL FINDING**: pve03 has **insufficient disk capacity** for the planned topology (needs 64 GB for VMs + OS buffer = ~80 GB, only has ~59 GB free). **Unless pve03 has additional storage mounted (not visible in the scan), it cannot host 2 full 32 GB VMs.**

**Recommendation**:

1. **Immediate**: Verify pve03's storage architecture. Why 21 dm/loop devices? Is there additional NAS/SAN attached?
2. **For 3rd node procurement**: Use **pve04 as baseline**:
   - 240+ GB NVMe SSD (minimum)
   - Clean, single-drive configuration (KISS principle)
   - Sufficient headroom for VMs + snapshots + log growth

---

### Network Specifications

| Dimension | pve03 | pve04 | Status |
|-----------|-------|-------|--------|
| **Interface Count** | 6 interfaces | 4 interfaces | n/a |
| **Bridge** | vmbr0 + tap devices | vmbr0 visible | ✅ Both standard |
| **Primary Network** | wlp0s20f3 + nic0 | wlp0s20f3 + nic0 | ✅ Match (suggest renaming nic0) |

**Interpretation:**

- Both nodes expose the **same interface names** (wlp0s20f3 = wireless, nic0 = Ethernet), which suggests matching network hardware
- pve03 has **2 tap devices** (tap301i0, tap302i0) = VM network interfaces from running VMs
- pve04 has **no tap devices** = freshly imaged, no VMs yet
- **Corosync / Proxmox Cluster**: Both will use vmbr0 for inter-node communication

**Recommendation**: Both nodes are network-compatible. No issues for Docker Swarm overlay networking.

---

### Proxmox & Cluster Status

| Dimension | pve03 | pve04 | Status |
|-----------|-------|-------|--------|
| **Proxmox Version** | 9.1.6 | 9.1.1 | ⚠️ Differ by 5 patch releases |
| **Kernel** | 6.17.2-1-pve | 6.17.2-1-pve | ✅ Match |
| **OS Distro** | Debian trixie | Debian trixie | ✅ Match |
| **Cluster Status** | ✅ Clustered (homelab) | ❌ Not clustered | n/a |
| **Cluster Members** | pve01, pve02, pve03 | None yet | n/a |
| **VMs Running** | 3 VMs/containers | 0 VMs | n/a |
| **Uptime** | 4 days | ~0 days (fresh) | n/a |

**Interpretation:**

- pve03 is an **active, production node** in the homelab cluster
- pve04 is a **fresh candidate** ready for integration
- The minor version difference (9.1.6 vs 9.1.1) is **not a blocker**; routine updates will align them

**Recommendation**: Update both to the latest Proxmox 9.x patch level before final cluster formation.

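
A pre-join check for the patch-level gap (9.1.6 vs 9.1.1) could look like the sketch below. The `version_cmp` helper is hypothetical, and the version strings are hard-coded from this report; collecting them from live hosts is left out and would be an assumption about your tooling.

```shell
#!/bin/sh
# Compare two dotted version strings field by field (numeric comparison).
# Prints "older", "newer", or "same" for the first version relative to the second.
version_cmp() {
  _a=$1
  _b=$2
  while [ -n "$_a" ] || [ -n "$_b" ]; do
    _x=${_a%%.*}    # leading field of each version
    _y=${_b%%.*}
    # Drop the leading field; an exhausted version becomes empty.
    if [ "$_a" = "$_x" ]; then _a=""; else _a=${_a#*.}; fi
    if [ "$_b" = "$_y" ]; then _b=""; else _b=${_b#*.}; fi
    _x=${_x:-0}     # missing fields count as 0, so 9.1 equals 9.1.0
    _y=${_y:-0}
    if [ "$_x" -lt "$_y" ]; then echo older; return 0; fi
    if [ "$_x" -gt "$_y" ]; then echo newer; return 0; fi
  done
  echo same
}

# Values copied from the comparison table above.
pve03_version=9.1.6
pve04_version=9.1.1

result=$(version_cmp "$pve03_version" "$pve04_version")
if [ "$result" = "same" ]; then
  echo "Proxmox versions match ($pve03_version)"
else
  echo "pve03 ($pve03_version) is $result than pve04 ($pve04_version); update before cluster formation"
fi
```

Comparing field by field avoids the string-comparison trap where `9.10.0` would sort before `9.9.9`.
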

---

## DOCKER SWARM TOPOLOGY ANALYSIS

### Target Design (from documentation/architecture/compute-plane.md)

- 3× identically-spec'd physical Proxmox nodes
- 3× Swarm Managers (1 per node, IPs: 10.0.0.211–213)
- 3× Swarm Workers (1 per node, IPs: 10.0.0.221–223)
- Each VM: 2 vCPU, 4 GB RAM, 32 GB disk
- Proxmox cluster with Corosync for HA
- No overcommit

### Capacity Analysis: pve04 as Reference Model

#### CPU

- **pve04 Spec**: 14 cores, 1 socket, 4600 MHz peak
- **Planned Usage**: 4 vCPU (2 for manager, 2 for worker) = **28.6% utilization**
- **Proxmox/Corosync Overhead**: ~1 vCPU
- **Available Headroom**: 14 - 4 - 1 = **9 vCPU spare**
- **Verdict**: ✅ **EXCELLENT**. Can sustain workload + spikes + 2x VM migration

#### Memory (15 GB)

- **Planned Usage**: 4 GB (manager) + 4 GB (worker) = 8 GB
- **Proxmox OS + daemons**: ~2–3 GB
- **Available Headroom**: 15 - 8 - 2.5 = **4.5 GB spare**
- **Verdict**: ✅ **ADEQUATE**. No aggressive swapping. Supports scheduled workload growth.

#### Storage (240 GB)

- **Planned Usage**: 32 GB (manager) + 32 GB (worker) = 64 GB
- **Proxmox OS**: ~8 GB
- **Snapshots/Logs Buffer**: ~20 GB
- **Total Planned**: ~92 GB
- **Available Headroom**: 240 - 92 = **148 GB spare**
- **Verdict**: ✅ **EXCELLENT**. Ample room for workload scaling, backups, experiments.

#### Network

- **Swarm Overlay**: vmbr0 at 1 Gbps
- **Expected inter-node throughput**: <100 Mbps for modest swarm (10–20 containers)
- **Verdict**: ✅ **ADEQUATE** for Docker Swarm in homelab. Upgrade to 10 Gbps if production-scale or data-intensive AI workloads planned.

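
The headroom figures in the capacity analysis can be recomputed with the sketch below. The helper names `headroom` and `utilization` are hypothetical, and the inputs are the planning numbers from this section (including the 2.5 GB midpoint used for Proxmox memory overhead), not measured values.

```shell
#!/bin/sh
# Spare capacity = total minus every planned allocation (all in the same unit).
# Arguments: total, then one or more allocations. awk handles the fractional case.
headroom() {
  LC_ALL=C awk 'BEGIN {
    left = ARGV[1]
    for (i = 2; i < ARGC; i++) left -= ARGV[i]
    printf "%g\n", left
  }' "$@"
}

# Share of a resource that is allocated, as a percentage with one decimal place.
# Arguments: amount used, total available.
utilization() {
  LC_ALL=C awk -v used="$1" -v total="$2" 'BEGIN { printf "%.1f\n", 100 * used / total }'
}

# CPU: 14 cores, 4 vCPU for VMs, ~1 for Proxmox/Corosync.
echo "CPU:     $(headroom 14 4 1) cores spare, $(utilization 4 14)% allocated to VMs"
# Memory: 15 GB, 8 GB for VMs, ~2.5 GB for Proxmox.
echo "Memory:  $(headroom 15 8 2.5) GB spare"
# Storage: 240 GB, 64 GB for VMs, ~8 GB OS, ~20 GB snapshot/log buffer.
echo "Storage: $(headroom 240 64 8 20) GB spare"
```

Keeping the allocations as separate arguments makes it easy to rerun the check when a VM size or buffer estimate changes.
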

---

### High-Availability & Resilience

#### Quorum Analysis

- **3 Proxmox Nodes**: Corosync quorum = 2/3 nodes required
  - Can tolerate 1 node failure ✅ Good
  - If node1 fails: quorum = nodes 2+3 (still ≥2) → **cluster remains operational**
- **3 Swarm Managers**: Raft consensus quorum = 2/3 nodes required
  - Can tolerate 1 manager failure ✅ Good
  - If manager1 fails: quorum = managers 2+3 (still ≥2) → **swarm remains operational**

#### Failure Scenarios

| Scenario | Outcome | Swarm Impact |
|----------|---------|--------------|
| 1 node power fails | Surviving nodes take over VMs | Containers restart on nodes 2 & 3 |
| 1 node storage corrupt | Proxmox HA can restart VMs on a peer (requires shared or replicated VM storage) | Brief service interruption (~30s) |
| 1 node network partition | Corosync detects; quorum = 2 survivors | Cluster continues; isolated node reboots |
| 2 nodes fail simultaneously | Quorum lost; cluster non-functional | **ALL workload lost** |

**Verdict**: Design supports N-1 failure tolerance. **Very good for homelab.**

---

## SPECIAL CONSIDERATIONS FOR pve03

### Storage Mystery: 21 Loop/Device-Mapper Devices

**Questions to Investigate:**

1. Does pve03 mount external NAS/SAN storage (e.g., Synology 10.0.0.249)?
2. Is there a RAID or LVM snapshot setup?
3. Were multiple physical drives originally present, some of which have since failed?

**Action Items:**

```bash
# From watchtower or pve03:
pvesh get /storage --output-format json   # List all Proxmox storage targets
zfs list                                  # If ZFS in use
lvs                                       # LVM volumes
pvdisplay                                 # LVM physical volumes
df -i                                     # Inode usage (helps diagnose loop mounts)
```

**Implication**: Until pve03's storage is clarified, it **cannot be used as a template** for the 3rd identical host.

---

## FINAL RECOMMENDATIONS

### 1. **Short-Term (Immediate)**

**Action**: Clarify pve03's storage architecture.

```bash
# SSH into pve03 via watchtower relay or direct if SSH key added
ssh root@10.0.0.203 "pvesh get /storage --output-format json"
ssh root@10.0.0.203 "lvs && pvs"
ssh root@10.0.0.203 "zfs list 2>/dev/null || echo 'ZFS not in use'"
```

**If pve03 has external storage**:

- Note the configuration (NAS IP, mount method, capacity)
- Plan to replicate in 3rd node

**If pve03 is just a single drive**:

- Proceed with pve04 as template

### 2. **Medium-Term (Before Final 3-Node Deployment)**

**Option A: Adopt pve04 as Template (RECOMMENDED)**

- Procurement: 3× machines with **Intel i5-13500T, 16 GB RAM, 256 GB NVMe**
- Cost: ~$200–300 per node (retail Core i5 desktop equivalent)
- Timeline: 1–2 weeks (sourcing)
- Next step: Install Proxmox 9.x on 3rd node; cluster join

**Option B: Backfill pve03 Config to pve04 & 3rd Node**

- Upgrade pve04 RAM from 15 GB → 24 GB (add 1× 8 GB SODIMM)
- Verify pve03's external storage is documented
- Replicate in pve04 and 3rd node
- Cost: ~$30–50 per node (additional RAM)
- Timeline: 1 week
- Risk: Depends on clarifying pve03 fully

**Recommendation Pick**: **Option A is cleaner**. pve04 is fresher, faster, and has clear config.

### 3. **Long-Term (Post-3-Node Commissioning)**

**Cluster Formation:**

```bash
# On pve04 (assuming elected as initial leader):
pvecm create homelab

# On 3rd new node (pass the hostname or IP of a node already in the cluster):
pvecm add <existing-cluster-node>

# Verify:
pvesh get /cluster/status
```

**VM Provisioning:**

```bash
# Use your existing playbook, one run per target host
# (repeating -e target_host in a single run keeps only the last value):
ansible-playbook -i inventory/hosts.ini \
  playbooks/proxmox/provision_swarm_vms.yml \
  -e target_host=pve04
ansible-playbook -i inventory/hosts.ini \
  playbooks/proxmox/provision_swarm_vms.yml \
  -e target_host=pve0N   # For 3rd node
```

**Docker Swarm Init:**

```bash
# On swarm-manager-1 (e.g., 10.0.0.211):
docker swarm init --advertise-addr 10.0.0.211

# On manager-2 & manager-3 (token comes from
# `docker swarm join-token manager` on swarm-manager-1):
docker swarm join --token <manager-join-token> 10.0.0.211:2377
```

---

## APPENDIX: Hardware Specs Collected

### pve03 (10.0.0.203) – Full Details

```
CPU: 10 cores, 1 socket, max 2885 MHz
Memory: 23.6 GB total, 12.4 GB free
Storage: 68 GB root LVM (59 GB free) + 21 dm/loop devices (TBD)
OS: Debian trixie, kernel 6.17.2-1-pve
Proxmox: 9.1.6
Network: 6 interfaces (vmbr0, nic0, wlp0s20f3, tap301i0, tap302i0, lo)
Cluster Status: Clustered (homelab), 3 VMs running
Uptime: 4 days
```

### pve04 (10.0.0.204) – Full Details

```
CPU: Intel Core i5-13500T, 14 cores, 1 socket, 20 vCPUs (HT), max 4600 MHz
Memory: 15.0 GB total, ~13.0 GB available, 8.0 GB swap
Storage: 238.5 GB NVMe SSD (nvme0n1), single drive
OS: Debian trixie, kernel 6.17.2-1-pve
Proxmox: 9.1.1
Network: 4 interfaces (vmbr0, nic0, wlp0s20f3, lo)
Cluster Status: Not clustered yet, 0 VMs
Uptime: Fresh (just rebooted)
```

---

## CONCLUSION

**pve04 is the superior choice** for replication to a 3-node cluster because of:

1. **CPU performance**: 4600 MHz vs 2885 MHz (~59% higher peak clock)
2. **Storage clarity**: Single 240 GB NVMe (vs pve03's opaque setup)
3. **Ballpark specifications**: 15 GB RAM + 240 GB SSD = excellent value for Swarm workloads
4. **Freshness**: No legacy config debt

**Immediate action**: Clarify pve03's storage. Then either adopt pve04 as template or provide additional pve03 context to backfill.

**Expected outcome**: 3-node Proxmox cluster running 6 Docker Swarm nodes (3 managers, 3 workers) with excellent resilience, performance, and headroom for future growth.