Castaldi Family Homelab
A GitOps-managed, Ansible-automated infrastructure running media services, container orchestration, and hypervisor management across distributed ARM and x86 nodes.
🚀 Why This Homelab?
- Zero-Touch Deployments: Push to Git → Auto-deploy via webhooks → Containers update automatically
- Ansible Automation: All nodes managed by Ansible from the Watchtower control plane
- Infrastructure as Code: Services defined in compose.yaml + infrastructure managed with Ansible playbooks
- GPU Transcoding: Hardware-accelerated media streaming with NVIDIA GTX 1060 Mobile
- Distributed Architecture: Services across Proxmox hypervisor, VMs, physical servers, and Raspberry Pi
- Self-Hosted Git: No external dependencies; Gitea runs on-premises with automated backups
- Production-Grade Networking: Traefik reverse proxy with automatic SSL (Cloudflare DNS challenge)
- Hypervisor Management: Proxmox VE for VM orchestration with automated post-install configuration
🏗️ Architecture
graph TB
    subgraph Internet
        CF[Cloudflare DNS]
    end
    subgraph PVE01["PVE01 - Proxmox VE Hypervisor (10.0.0.201)"]
        subgraph Heimdall["Heimdall VM (10.0.0.151)"]
            Traefik[Traefik Reverse Proxy<br/>:80, :443]
            Komodo[Komodo Core<br/>Container Orchestrator]
            Gitea[Gitea<br/>Self-Hosted Git]
            Redis[Redis Cache]
            Trek[Trek]
            Vault[Vaultwarden]
        end
    end
    subgraph Waldorf["Waldorf - Physical Server (10.0.0.251)"]
        Plex[Plex Media Server<br/>GPU Transcoding]
        Tunarr[Tunarr<br/>IPTV Channels]
        GPU[NVIDIA GTX 1060 Mobile<br/>6GB VRAM]
        KomodoW[Komodo Periphery]
    end
    subgraph Watchtower["Watchtower - Raspberry Pi 5 (10.0.0.200)"]
        Ansible[Ansible Control Node<br/>Infrastructure Automation]
        KomodoP[Komodo Periphery]
        VSCode[VS Code Server]
    end
    subgraph "TerraMaster NAS (10.0.0.250)"
        NFS[NFS Storage<br/>Volume1: /appdata<br/>Volume2: /media]
    end
    CF -->|HTTPS| Traefik
    Traefik --> Gitea
    Traefik --> Komodo
    Traefik --> Plex
    Traefik --> Tunarr
    Komodo <-->|WebSocket| KomodoW
    Komodo <-->|WebSocket| KomodoP
    Gitea -->|Webhook| Komodo
    Ansible -.->|SSH| PVE01
    Ansible -.->|SSH| Heimdall
    Ansible -.->|SSH| Waldorf
    Plex --> GPU
    Tunarr --> GPU
    Heimdall -.->|NFS v4| NFS
    Waldorf -.->|NFS v4| NFS
    Watchtower -.->|NFS v3| NFS
    style NFS fill:#f9a825,color:#000
    style PVE01 fill:#e57000,color:#fff
📦 Infrastructure Inventory
| Node | IP | Hardware | Platform/OS | Role | Services |
|---|---|---|---|---|---|
| PVE01 | 10.0.0.201 | Physical Server, Intel i5-13500T (14c), 15GB RAM | Proxmox VE 9.1.7 | Hypervisor | Hosts Heimdall VM |
| Heimdall | 10.0.0.151 | Proxmox VM on PVE01, Intel N100 (4c), 15GB RAM | Ubuntu 24.04 | Core Services | Komodo Core, Gitea, Traefik, Redis, Trek, Vaultwarden |
| Waldorf | 10.0.0.251 | Physical Server, i7-7820HQ (8c), GTX 1060, 16GB | Ubuntu 24.04 | Media Processing | Plex, Tunarr (GPU transcoding), Komodo Periphery |
| Watchtower | 10.0.0.200 | Raspberry Pi 5, ARM Cortex-A76 (4c), 16GB | Debian Trixie | Control Plane | Ansible, Komodo Periphery, VS Code Server |
| TerraMaster | 10.0.0.250 | NAS | TOS | Shared Storage | NFS (Volume1: /appdata, Volume2: /media) |
⚡ Quick Start
Prerequisites
- SSH access to nodes
- Git configured with credentials:
git config --global credential.helper wincred # Windows
git config --global core.autocrlf true
Clone & Deploy
# Clone from self-hosted Gitea
git clone https://git.castaldifamily.com/nathan/homelab.git
cd homelab
# Deploy a service (via Komodo UI or SSH)
ssh chester@10.0.0.251
cd /etc/komodo/stacks/tunarr
docker compose up -d
Automated GitOps Workflow
- Edit nodes/{node}/{service}/compose.yaml locally
- Commit and push to Gitea: git add . && git commit -m "feat: update service" && git push
- Webhook triggers Komodo Core (heimdall)
- Auto-deploy pulls latest code and restarts containers
- Monitor via Komodo UI at http://10.0.0.151:9000
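Because every push can trigger a deploy, it may help to know which stacks a commit touches before pushing. The following is a minimal sketch, assuming the nodes/{node}/{service}/compose.yaml layout described above; the function name changed_stacks is hypothetical and not part of the repository.

```shell
#!/bin/sh
# Hypothetical helper: read changed file paths on stdin and print the
# unique stack directories (nodes/<node>/<service>) whose compose.yaml changed.
changed_stacks() {
    # Keep only paths of the exact form nodes/<node>/<service>/compose.yaml,
    # strip the file name, and de-duplicate.
    grep -E '^nodes/[^/]+/[^/]+/compose\.yaml$' \
        | sed 's|/compose\.yaml$||' \
        | sort -u
}
```

One possible use is to pipe `git diff --name-only HEAD~1 HEAD` into `changed_stacks` and run `docker compose -f "$dir/compose.yaml" config -q` on each printed directory, so that a malformed file is caught before the webhook fires.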
⚙️ Automation
Ansible Control Plane
Watchtower (10.0.0.200) manages all infrastructure via Ansible:
# SSH into control node
ssh chester@10.0.0.200
cd ~/homelab/ansible
# Test connectivity to all nodes
ansible all -m ping
# Gather live system facts
ansible-playbook playbooks/gather-node-facts.yml
# Deploy Proxmox post-install config
ansible-playbook playbooks/onboard-proxmox.yml --limit pve01
# Run commands across node groups
ansible docker_nodes -m command -a "docker ps"
ansible proxmox_cluster -m command -a "pveversion"
Managed Node Groups
control_plane: watchtower
docker_nodes: heimdall, waldorf
proxmox_cluster: pve01
nfs_clients: heimdall, waldorf
core_services: heimdall
media_services: waldorf
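The groups above can be expressed as an INI inventory. The sketch below writes one such file, taking host IPs from the Infrastructure Inventory table; the function name write_inventory is hypothetical, and the real ansible/inventory/hosts.ini may be organized differently.

```shell
#!/bin/sh
# Hypothetical sketch: write an INI inventory that mirrors the groups listed
# above. This is an assumed layout, not a copy of the repository's hosts.ini.
write_inventory() {
    # $1: destination file
    cat > "$1" <<'EOF'
[control_plane]
watchtower ansible_host=10.0.0.200

[docker_nodes]
heimdall ansible_host=10.0.0.151
waldorf ansible_host=10.0.0.251

[proxmox_cluster]
pve01 ansible_host=10.0.0.201

[nfs_clients]
heimdall
waldorf

[core_services]
heimdall

[media_services]
waldorf
EOF
}
```

Running `write_inventory /tmp/hosts.ini` followed by `ansible-inventory -i /tmp/hosts.ini --graph` should print the group tree without contacting any node.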
🎯 Active Missions
Traffic Light System: 🟢 Complete | 🟡 In Progress | 🔴 Blocked
| Status | Mission | Details |
|---|---|---|
| 🟢 | Komodo GitOps | All stacks migrated to Git sources with webhook automation |
| 🟢 | GPU Transcoding | GTX 1060 Mobile accessible in Plex/Tunarr containers |
| 🟢 | Documentation Structure | KBAs and SOPs organized in documentation/ |
| 🟢 | Ansible Automation | All 4 nodes onboarded and managed by Ansible from Watchtower |
| 🟢 | Proxmox Post-Install | PVE01 configured: subscription nag removed, repos optimized |
| 🟡 | Hardware Transcoding Validation | Monitor Plex for (hw) indicator during active streams |
| 🟢 | NFS Mount Stability | NFSv3 on Pi, NFSv4 on x86 nodes |
📂 Repository Structure
homelab/
├── ansible/ # Ansible automation (active)
│ ├── inventory/ # Managed hosts and groups
│ │ ├── hosts.ini # 4-node inventory
│ │ └── host_vars/ # Per-node configuration
│ ├── playbooks/ # Automation workflows
│ │ ├── onboard-nodes.yml # Node SSH key deployment
│ │ ├── onboard-proxmox.yml # Proxmox post-install
│ │ └── gather-node-facts.yml # System discovery
│ ├── roles/ # Reusable automation
│ │ └── proxmox_post_install/ # Nag removal, repo config
│ └── group_vars/ # Global variables
├── nodes/ # Service definitions per node
│ ├── heimdall/ # Core infrastructure (VM on PVE01)
│ │ ├── core/ # Komodo, Traefik, Redis
│ │ ├── trek/ # Trek service
│ │ ├── vaultwarden/ # Password manager
│ │ └── (gitea via Komodo) # Self-hosted Git
│ ├── waldorf/ # Media services (Physical)
│ │ ├── plex/ # Media server + GPU
│ │ └── tunarr/ # IPTV channels + GPU
│ └── watchtower/ # Control plane (Pi 5)
│ └── vscode/ # Remote development
├── documentation/ # Technical knowledge base
│ ├── KBAs/ # Troubleshooting guides
│ ├── SOPs/ # Operational procedures
│ ├── plans/ # Implementation roadmaps
│ └── TECHNICAL_RUNBOOK.md # Emergency reference
└── scripts/ # Utility scripts
├── bootstrap.sh # Day-0 node initialization
└── lib/ # Shared function libraries
🔧 Common Operations
Deploy a New Stack
# 1. Create directory structure
mkdir -p nodes/waldorf/sonarr
# 2. Create compose.yaml
cat > nodes/waldorf/sonarr/compose.yaml <<EOF
services:
  sonarr:
    image: lscr.io/linuxserver/sonarr:latest
    restart: unless-stopped
    ports:
      - 8989:8989
    volumes:
      - /mnt/appdata/sonarr:/config
EOF
# 3. Commit and push
git add nodes/waldorf/sonarr/
git commit -m "feat(stacks): add Sonarr to Waldorf"
git push
# 4. Configure in Komodo UI
# - Source Type: Git Repo
# - Run Directory: nodes/waldorf/sonarr
# - Deploy!
Check Service Status
# Via Komodo API
curl http://10.0.0.151:9000/api/stacks
# Direct SSH to node
ssh chester@10.0.0.251
docker ps | grep tunarr
docker logs tunarr --tail 50
Emergency Rollback
# In Komodo UI: Click "Rollback" on stack
# Or via Git:
git revert HEAD
git push # Triggers auto-rollback
📚 Documentation
| Document | Purpose |
|---|---|
| TECHNICAL_RUNBOOK.md | Infrastructure overview, emergency procedures, maintenance schedule |
| KBA-001 | Troubleshooting Git-linked stack failures |
| SOP-001 | Step-by-step guide to migrate stacks to GitOps |
| Node READMEs | Hardware specs and service details per node |
🛡️ Security & Best Practices
Secrets Management
- ❌ NEVER commit passwords, API keys, or tokens to Git
- ✅ DO use Komodo Environment Variables for secrets
- ✅ DO use Gitea App Tokens for authentication (avoids SSH key exchange issues)
Example:
# In Git (compose.yaml)
environment:
  - PUID=1000
  - PGID=1000
  - API_KEY=${PLEX_API_KEY} # Injected by Komodo
# In Komodo UI: Set PLEX_API_KEY in Environment Variables
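The "never commit secrets" rule can be partly checked mechanically. The sketch below is a rough heuristic, not an existing repository script: the function name find_hardcoded_secrets and the list of secret-like name fragments are assumptions, and quoted references such as "${VAR}" would be flagged as false positives.

```shell
#!/bin/sh
# Hypothetical pre-commit check: flag environment entries whose name looks
# secret-like (KEY, TOKEN, SECRET, PASSWORD) but whose value is a literal
# instead of a ${VARIABLE} reference.
find_hardcoded_secrets() {
    # $1: path to a compose.yaml file. Prints offending lines with line
    # numbers; returns 0 if any are found, 1 if the file looks clean.
    grep -nE '^[[:space:]]*-?[[:space:]]*[A-Za-z0-9_]*(KEY|TOKEN|SECRET|PASSWORD)[A-Za-z0-9_]*[=:][[:space:]]*[^[:space:]$]' "$1"
}
```

For example, `find_hardcoded_secrets nodes/waldorf/plex/compose.yaml && echo "review before committing"` prints any suspicious lines and a reminder, and stays silent for entries like API_KEY=${PLEX_API_KEY}.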
NFS Mount Configuration
Critical: Raspberry Pi requires NFSv3 (not v4) due to ID-domain mismatches:
# /etc/fstab on Watchtower (Pi 5)
10.0.0.250:/Volume1/appdata /mnt/appdata nfs nfsvers=3,rw,sync 0 0
# /etc/fstab on Heimdall/Waldorf (x86 Ubuntu)
10.0.0.250:/Volume1/appdata /mnt/appdata nfs4 rw,sync 0 0
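The per-platform choice above could be encoded in a small helper, for instance in a bootstrap script. This is a sketch only: the function name appdata_fstab_line is hypothetical, and using the CPU architecture to tell the Pi from the x86 nodes is an assumption that holds for the four nodes listed here.

```shell
#!/bin/sh
# Hypothetical helper: print the /etc/fstab entry for the appdata share,
# choosing the NFS version from the machine architecture as described above
# (NFSv3 on the ARM Pi, NFSv4 on the x86 nodes).
appdata_fstab_line() {
    # $1: architecture string as reported by `uname -m`
    case "$1" in
        aarch64|arm*)
            echo "10.0.0.250:/Volume1/appdata /mnt/appdata nfs nfsvers=3,rw,sync 0 0" ;;
        *)
            echo "10.0.0.250:/Volume1/appdata /mnt/appdata nfs4 rw,sync 0 0" ;;
    esac
}
```

Calling `appdata_fstab_line "$(uname -m)"` on a node prints the line that matches its platform, which can then be compared against the existing /etc/fstab.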
Backup Strategy
- Git Repository: Daily backups via Gitea's built-in backup feature
- Docker Volumes: Weekly snapshots to /mnt/appdata/backups/
- Proxmox VMs: Daily snapshots with 7-day retention
- Configuration Files: Tracked in Git under nodes/{hostname}/
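Weekly snapshots accumulate unless old archives are pruned. The sketch below shows one way to do that; the function name prune_backups, the *.tar.gz naming, and any retention value passed to it are assumptions, since the retention for Docker volume snapshots is not stated above.

```shell
#!/bin/sh
# Hypothetical sketch: delete old snapshot archives from a backup directory.
prune_backups() {
    # $1: backup directory, $2: retention in days.
    # find rounds file age down to whole days, so "-mtime +28" matches files
    # last modified at least 29 days ago. Deleted paths are printed.
    find "$1" -maxdepth 1 -type f -name '*.tar.gz' -mtime +"$2" -print -exec rm -f {} \;
}
```

A call such as `prune_backups /mnt/appdata/backups 28` would keep roughly the last four weekly archives and leave any other files in the directory untouched.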
📊 Stats
- Total Nodes: 5 (1 hypervisor + 3 compute + 1 storage)
- Automation: Ansible managing 4 active nodes from Watchtower
- Container Orchestration: Komodo v2.1.2
- Active Services: 12+ (Traefik, Plex, Tunarr, Gitea, Trek, Vaultwarden, etc.)
- Total RAM: 62GB (15GB PVE01 + 15GB Heimdall + 16GB Waldorf + 16GB Watchtower)
- Total CPU Cores: 30 physical (14c i5-13500T + 8c i7-7820HQ + 4c N100 + 4c ARM)
- Virtualization: Proxmox VE 9.1.7 hosting 1 VM (expandable)
- GPU Acceleration: NVIDIA GTX 1060 Mobile (6GB VRAM)
- Storage: TerraMaster NAS (NFSv3/v4)
🔥 Emergency Procedures
NFS Mount Failure
# Check connectivity
ping 10.0.0.250
# Remount
sudo umount /mnt/appdata
sudo mount -a
df -h | grep appdata
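The manual check above can be turned into a test that a cron job or health check could call. This is a minimal sketch: the function name nfs_mounted is hypothetical, and it assumes the standard /proc/mounts column layout (device, mount point, filesystem type).

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given mount point is currently
# mounted with an NFS filesystem type (nfs or nfs4).
nfs_mounted() {
    # $1: mount point, $2: mounts table to read (defaults to /proc/mounts)
    awk -v mp="$1" '$2 == mp && $3 ~ /^nfs/ { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}
```

For example, `nfs_mounted /mnt/appdata || sudo mount -a` attempts a remount only when the share is actually missing.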
Komodo Periphery Offline
# Check that the Komodo Core endpoint is reachable (many curl builds lack ws:// support, so probe over HTTP)
curl -v http://10.0.0.151:9120
# Restart agent
docker restart komodo-periphery
docker logs -f komodo-periphery
Traefik SSL Certificate Issues
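# Inspect the Traefik static config (confirm the Cloudflare DNS challenge settings)
docker exec traefik cat /etc/traefik/traefik.yml
# Restart Traefik so it retries certificate issuance and renewal on startup
docker restart traefik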
docker logs traefik | grep -i "cloudflare\|certificate"
🤝 Contributing
This is a personal homelab, but documentation improvements and issue reports are welcome!
- Fork via Gitea: https://git.castaldifamily.com/nathan/homelab
- Create feature branch: git checkout -b feat/my-improvement
- Commit using Conventional Commits
- Push and create Pull Request
📜 License
Personal infrastructure configuration. Documentation licensed under CC BY-SA 4.0.
Maintained by: Nathan Castaldi
Last Updated: April 13, 2026
Status: 🟢 Operational
Automation Status: 🟢 Ansible Fully Deployed