Castaldi Family Homelab

A GitOps-managed, Ansible-automated infrastructure running media services, container orchestration, and hypervisor management across distributed ARM and x86 nodes.



🚀 Why This Homelab?

  • Zero-Touch Deployments: Push to Git → Auto-deploy via webhooks → Containers update automatically
  • Ansible Automation: All nodes managed by Ansible from the Watchtower control plane
  • Infrastructure as Code: Services defined in compose.yaml + infrastructure managed with Ansible playbooks
  • GPU Transcoding: Hardware-accelerated media streaming with NVIDIA GTX 1060 Mobile
  • Distributed Architecture: Services across physical servers with Proxmox hypervisor ready for VM deployment
  • Self-Hosted Git: No external dependencies; Gitea runs on-premises with automated backups
  • Production-Grade Networking: Traefik reverse proxy with automatic SSL (Cloudflare DNS challenge)
  • Hypervisor Management: Proxmox VE ready for VM orchestration with automated post-install configuration

📦 Infrastructure Inventory

| Node | IP | Hardware | Platform/OS | Role | Services |
|------|----|----------|-------------|------|----------|
| PVE01 | 10.0.0.201 | Physical Server: Intel i5-13500T (14c), 15GB RAM | Proxmox VE 9.1.7 | Hypervisor | VM orchestration platform |
| Heimdall | 10.0.0.151 | Physical Server: Intel N100 (4c), 15GB RAM | Ubuntu 24.04 | Core Services | Komodo, Gitea, Traefik |
| Waldorf | 10.0.0.251 | Physical Server: i7-7820HQ (4c/8t), GTX 1060, 16GB | Ubuntu 24.04 | Media Processing | Plex and Related Media Services |
| Watchtower | 10.0.0.200 | Physical Server: ARM Cortex-A76 (4c), 16GB | Debian Trixie | Control Plane | Ansible, VS Code, Monitoring Tools |
| TerraMaster | 10.0.0.250 | NAS | TOS | Shared Storage | NFS (Volume1: /appdata, Volume2: /media) |

Quick Start

Prerequisites

  • SSH access to nodes
  • Git configured with credentials:
    git config --global credential.helper wincred  # Windows
    git config --global core.autocrlf true
    

Clone & Deploy

# Clone from self-hosted Gitea
git clone https://git.castaldifamily.com/nathan/homelab.git
cd homelab

# Deploy a service (via Komodo UI or SSH)
ssh chester@10.0.0.251
cd /etc/komodo/stacks/tunarr
docker compose up -d

Automated GitOps Workflow

  1. Edit nodes/{node}/{service}/compose.yaml locally
  2. Commit and push to Gitea: git add . && git commit -m "feat: update service" && git push
  3. Webhook triggers Komodo Core (heimdall)
  4. Auto-deploy pulls latest code and restarts containers
  5. Monitor via Komodo UI at http://10.0.0.151:9000
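
The workflow above redeploys whichever stacks a push touches. As a rough sketch of how to preview that before pushing, the helper below maps changed paths to node/service pairs. The function name changed_stacks is hypothetical, and it assumes the nodes/{node}/{service}/ layout from step 1; it is not part of Komodo or of this repository.

```shell
#!/bin/sh
# Hypothetical helper: list the stacks touched by a set of changed paths.
# Reads one repository-relative path per line on stdin and prints the
# unique "node/service" pairs found under nodes/<node>/<service>/.
changed_stacks() {
    # Split on "/", keep only paths at least four components deep under
    # nodes/, then de-duplicate the resulting pairs.
    awk -F/ '$1 == "nodes" && NF >= 4 { print $2 "/" $3 }' | sort -u
}
```

For example, piping the output of git diff --name-only origin/main...HEAD into changed_stacks (assuming a remote named origin and a branch named main) lists the stacks a push would affect.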

⚙️ Automation

Ansible Control Plane

Watchtower (10.0.0.200) manages all infrastructure via Ansible:

Status: 🟢 PRODUCTION READY (4 nodes, all responding)

# SSH into control node
ssh chester@10.0.0.200
cd ~/homelab/ansible

# Quick health check
./validate-environment.sh

# Test connectivity to all nodes
ansible all -m ping

# Gather live system facts
ansible-playbook playbooks/gather-node-facts.yml

# Deploy Proxmox post-install config
ansible-playbook playbooks/onboard-proxmox.yml --limit pve01

# Run commands across node groups
ansible docker_nodes -m command -a "docker ps"
ansible proxmox_cluster -m command -a "pveversion"

Quick Reference: See ansible/QUICK-REFERENCE.md for a comprehensive command guide.
Setup Documentation: documentation/plans/plan-ansibleSetup.md

Managed Node Groups

control_plane:     watchtower
docker_nodes:      heimdall, waldorf
proxmox_cluster:   pve01
nfs_clients:       heimdall, waldorf
core_services:     heimdall
media_services:    waldorf
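
The groups above map naturally onto an INI-style inventory. The sketch below shows one way such a file could look and how to list the members of a group with plain awk. The sample inventory is an assumption (the real ansible/inventory/hosts.ini is not reproduced here), and sample_inventory and hosts_in_group are hypothetical names.

```shell
#!/bin/sh
# Hypothetical sample inventory; the real ansible/inventory/hosts.ini may differ.
sample_inventory() {
    cat <<'EOF'
[control_plane]
watchtower ansible_host=10.0.0.200

[docker_nodes]
heimdall ansible_host=10.0.0.151
waldorf ansible_host=10.0.0.251

[proxmox_cluster]
pve01 ansible_host=10.0.0.201
EOF
}

# Print the host names listed under one [group] header of an INI inventory
# read from stdin. Child groups and variable sections are not handled.
hosts_in_group() {
    awk -v group="$1" '
        /^\[/            { current = substr($0, 2, index($0, "]") - 2); next }  # remember the active section
        NF == 0          { next }                                               # skip blank lines
        $1 ~ /^[#;]/     { next }                                               # skip comment lines
        current == group { print $1 }                                           # first field is the host name
    '
}
```

Running sample_inventory | hosts_in_group docker_nodes prints heimdall and waldorf on separate lines. On the control node, ansible-inventory -i inventory/hosts.ini --graph remains the authoritative view of group membership.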

🎯 Active Missions

Traffic Light System: 🟢 Complete | 🟡 In Progress | 🔴 Blocked

| Status | Mission | Details |
|--------|---------|---------|
| 🟢 | Komodo GitOps | All stacks migrated to Git sources with webhook automation |
| 🟢 | GPU Transcoding | GTX 1060 Mobile accessible in Plex/Tunarr containers |
| 🟢 | Documentation Structure | KBAs and SOPs organized in documentation/ |
| 🟢 | Ansible Automation | All 4 nodes onboarded and managed by Ansible from Watchtower |
| 🟢 | Proxmox Post-Install | PVE01 configured: subscription nag removed, repos optimized |
| 🟡 | Hardware Transcoding Validation | Monitor Plex for (hw) indicator during active streams |
| 🟢 | NFS Mount Stability | NFSv3 on Pi, NFSv4 on x86 nodes |

📂 Repository Structure

homelab/
├── ansible/                    # Ansible automation (active)
│   ├── inventory/              # Managed hosts and groups
│   │   ├── hosts.ini           # 4-node inventory
│   │   └── host_vars/          # Per-node configuration
│   ├── playbooks/              # Automation workflows
│   │   ├── onboard-nodes.yml   # Node SSH key deployment
│   │   ├── onboard-proxmox.yml # Proxmox post-install
│   │   └── gather-node-facts.yml # System discovery
│   ├── roles/                  # Reusable automation
│   │   └── proxmox_post_install/ # Nag removal, repo config
│   └── group_vars/             # Global variables
├── nodes/                      # Service definitions per node
│   ├── heimdall/               # Core infrastructure (Physical)
│   │   ├── core/               # Komodo, Traefik, Redis
│   │   ├── trek/               # Trek service
│   │   ├── vaultwarden/        # Password manager
│   │   └── (gitea via Komodo)  # Self-hosted Git
│   ├── waldorf/                # Media services (Physical)
│   │   ├── plex/               # Media server + GPU
│   │   └── tunarr/             # IPTV channels + GPU
│   └── watchtower/             # Control plane (Pi 5)
│       └── vscode/             # Remote development
├── documentation/              # Technical knowledge base
│   ├── KBAs/                   # Troubleshooting guides
│   ├── SOPs/                   # Operational procedures
│   ├── plans/                  # Implementation roadmaps
│   └── TECHNICAL_RUNBOOK.md    # Emergency reference
└── scripts/                    # Utility scripts
    ├── bootstrap.sh            # Day-0 node initialization
    └── lib/                    # Shared function libraries

🔧 Common Operations

Deploy a New Stack

# 1. Create directory structure
mkdir -p nodes/waldorf/sonarr

# 2. Create compose.yaml
cat > nodes/waldorf/sonarr/compose.yaml <<'EOF'
services:
  sonarr:
    image: lscr.io/linuxserver/sonarr:latest
    restart: unless-stopped
    ports:
      - "8989:8989"
    volumes:
      - /mnt/appdata/sonarr:/config
EOF

# 3. Commit and push
git add nodes/waldorf/sonarr/
git commit -m "feat(stacks): add Sonarr to Waldorf"
git push

# 4. Configure in Komodo UI
# - Source Type: Git Repo
# - Run Directory: nodes/waldorf/sonarr
# - Deploy!

Check Service Status

# Via Komodo API
curl http://10.0.0.151:9000/api/stacks

# Direct SSH to node
ssh chester@10.0.0.251
docker ps | grep tunarr
docker logs tunarr --tail 50

Emergency Rollback

# In Komodo UI: Click "Rollback" on stack
# Or via Git:
git revert HEAD
git push  # Triggers auto-rollback

📚 Documentation

| Document | Purpose |
|----------|---------|
| TECHNICAL_RUNBOOK.md | Infrastructure overview, emergency procedures, maintenance schedule |
| KBA-001 | Troubleshooting Git-linked stack failures |
| SOP-001 | Step-by-step guide to migrate stacks to GitOps |
| Node READMEs | Hardware specs and service details per node |

🛡️ Security & Best Practices

Secrets Management

  • NEVER commit passwords, API keys, or tokens to Git
  • DO use Komodo Environment Variables for secrets
  • DO use Gitea App Tokens for authentication (avoids SSH key exchange issues)

Example:

# In Git (compose.yaml)
environment:
  - PUID=1000
  - PGID=1000
  - API_KEY=${PLEX_API_KEY}  # Injected by Komodo

# In Komodo UI: Set PLEX_API_KEY in Environment Variables
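
To make the "never commit secrets" rule checkable, a heuristic scan can run before each commit. The sketch below flags compose lines that assign a literal value to a secret-looking variable rather than a ${VAR} reference. The function name check_compose_secrets and the KEY/TOKEN/PASSWORD/SECRET name patterns are assumptions. Being a simple pattern match, it will miss unusual names and will flag quoted references such as "${VAR}".

```shell
#!/bin/sh
# Hypothetical pre-commit check for compose files.
# Prints offending lines and returns 1 if any secret-looking variable is
# assigned a literal value; returns 0 when every match uses a ${VAR} reference.
check_compose_secrets() {
    # Matches NAME=value or NAME: value where NAME contains KEY, TOKEN,
    # PASSWORD or SECRET and the value does not start with "$".
    if grep -nE '(KEY|TOKEN|PASSWORD|SECRET)[A-Z_]*[=:][[:space:]]*[^[:space:]$]' "$@"; then
        echo "possible hard-coded secrets found" >&2
        return 1
    fi
    return 0
}
```

A typical call would be check_compose_secrets nodes/waldorf/plex/compose.yaml, for instance from a Git pre-commit hook.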

NFS Mount Configuration

Critical: Raspberry Pi requires NFSv3 (not v4) due to ID-domain mismatches:

# /etc/fstab on Watchtower (Pi 5)
10.0.0.250:/Volume1/appdata /mnt/appdata nfs nfsvers=3,rw,sync 0 0

# /etc/fstab on Heimdall/Waldorf (x86 Ubuntu)
10.0.0.250:/Volume1/appdata /mnt/appdata nfs4 rw,sync 0 0
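
The two entries differ only in how the NFS version is selected, so the choice can be scripted. The sketch below picks the entry from the CPU architecture. The function name nfs_fstab_line is hypothetical, and it assumes architecture alone separates the Pi (aarch64) from the x86 nodes, which holds for the nodes listed in this README but may not hold elsewhere.

```shell
#!/bin/sh
# Print the /etc/fstab entry for the appdata share, chosen by architecture.
# With no argument, the architecture of the current machine is used.
nfs_fstab_line() {
    _arch="${1:-$(uname -m)}"
    case "$_arch" in
        aarch64|arm64)
            # Raspberry Pi: pin NFSv3 to avoid the ID-domain mismatch.
            echo "10.0.0.250:/Volume1/appdata /mnt/appdata nfs nfsvers=3,rw,sync 0 0" ;;
        x86_64)
            # x86 Ubuntu nodes: NFSv4.
            echo "10.0.0.250:/Volume1/appdata /mnt/appdata nfs4 rw,sync 0 0" ;;
        *)
            echo "unsupported architecture: $_arch" >&2
            return 1 ;;
    esac
}
```

The output is meant to be reviewed before it is appended to /etc/fstab; the function itself does not modify any file.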

Backup Strategy

  • Git Repository: Daily backups via Gitea's built-in backup feature
  • Docker Volumes: Weekly snapshots to /mnt/appdata/backups/
  • Proxmox VMs: Daily snapshots with 7-day retention (when VMs are deployed)
  • Configuration Files: Tracked in Git under nodes/{hostname}/
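
How the weekly volume snapshots are produced is not specified above. As one possible approach, the sketch below archives a directory into a dated tarball. The function name snapshot_dir and the archive naming scheme are assumptions, not the mechanism this homelab actually uses.

```shell
#!/bin/sh
# Hypothetical snapshot helper: archive one directory into a dated tarball.
# Usage: snapshot_dir <source_dir> <backup_dir>
# Prints the path of the archive it created.
snapshot_dir() {
    _src="${1%/}"                                            # drop a trailing slash, if any
    _dest="$2/$(basename "$_src")-$(date +%Y-%m-%d).tar.gz"
    mkdir -p "$2" || return 1
    # -C makes archive paths relative to the parent of the source directory.
    tar -czf "$_dest" -C "$(dirname "$_src")" "$(basename "$_src")" || return 1
    echo "$_dest"
}
```

For example, snapshot_dir /mnt/appdata/sonarr /mnt/appdata/backups would write an archive named sonarr-<date>.tar.gz. Containers that are writing to the directory should be stopped first so the archive is consistent.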

📊 Stats

  • Total Nodes: 5 (1 hypervisor + 3 compute + 1 storage)
  • Automation: Ansible managing 4 active nodes from Watchtower
  • Container Orchestration: Komodo v2.1.2
  • Active Services: 12+ (Traefik, Plex, Tunarr, Gitea, Trek, Vaultwarden, etc.)
  • Total RAM: 62GB (15GB PVE01 + 15GB Heimdall + 16GB Waldorf + 16GB Watchtower)
  • Total CPU Cores: 26 physical (14c i5-13500T + 4c i7-7820HQ + 4c N100 + 4c ARM); the i7-7820HQ is a 4-core/8-thread part
  • Virtualization: Proxmox VE 9.1.7 available (no VMs currently deployed)
  • GPU Acceleration: NVIDIA GTX 1060 Mobile (6GB VRAM)
  • Storage: TerraMaster NAS (NFSv3/v4)

🔥 Emergency Procedures

NFS Mount Failure

# Check connectivity
ping 10.0.0.250

# Remount
sudo umount /mnt/appdata
sudo mount -a
df -h | grep appdata
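
Before unmounting, it helps to confirm whether the share is mounted at all. The sketch below checks the mounts table for an NFS entry at a given mount point. The function name is_nfs_mounted is hypothetical, and the optional second argument exists only so the check can be pointed at a file other than /proc/mounts.

```shell
#!/bin/sh
# Succeed if the given mount point appears as an NFS mount (nfs or nfs4).
# Usage: is_nfs_mounted <mount_point> [mounts_file]
is_nfs_mounted() {
    # Field 2 of each mounts line is the mount point, field 3 the filesystem type.
    awk -v mp="$1" '$2 == mp && $3 ~ /^nfs/ { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}
```

A remount can then be made conditional, for example: is_nfs_mounted /mnt/appdata || sudo mount /mnt/appdata.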

Komodo Periphery Offline

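# Check that the Komodo port on Heimdall is reachable
# (many distribution curl builds do not support ws:// URLs, so test the TCP port instead)
nc -zv 10.0.0.151 9120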

# Restart agent
docker restart komodo-periphery
docker logs -f komodo-periphery

Traefik SSL Certificate Issues

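# Inspect the Traefik static config (certificate resolver and DNS challenge settings)
docker exec traefik cat /etc/traefik/traefik.yml

# Restart Traefik so it retries certificate issuance, then check the logs
docker restart traefik
docker logs traefik 2>&1 | grep -iE "cloudflare|certificate"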

🤝 Contributing

This is a personal homelab, but documentation improvements and issue reports are welcome!

  1. Fork via Gitea: https://git.castaldifamily.com/nathan/homelab
  2. Create feature branch: git checkout -b feat/my-improvement
  3. Commit using Conventional Commits
  4. Push and create Pull Request

📜 License

Personal infrastructure configuration. Documentation licensed under CC BY-SA 4.0.


Maintained by: Nathan Castaldi
Last Updated: April 13, 2026
Status: 🟢 Operational
Automation Status: 🟢 Ansible Fully Deployed
