# Technical Runbook: Castaldi Family Lab

**Status:** ACTIVE & OPERATIONAL

**Last Updated:** April 11, 2026

**Maintainer:** Nathan Castaldi

---

## Table of Contents

1. [Infrastructure Overview](#infrastructure-overview)
2. [Critical Fixes](#critical-fixes)
3. [Lessons Learned](#lessons-learned)
4. [Network Map](#network-map)
5. [Active Tasks](#active-tasks)
6. [Emergency Procedures](#emergency-procedures)

---

## Infrastructure Overview

### Node Inventory

| Node | IP Address | Hardware | Services |
|------|------------|----------|----------|
| **Heimdall** | 10.0.0.151 | Proxmox VM | Komodo Core, Gitea, Traefik |
| **Waldorf** | 10.0.0.XXX | NVIDIA GTX 1060 | Plex, Tunarr |
| **Watchtower** | 10.0.0.200 | Raspberry Pi | Komodo Periphery |
| **TerraMaster** | 10.0.0.250 | NAS | NFS Storage (`/Volume1/appdata`) |

### Repository Structure

```text
/nodes
  /heimdall
    /core           # Komodo Core
    /gitea          # Git Repository Server
  /waldorf
    /plex           # Media Server (NVIDIA Optimized)
    /tunarr         # Channel Management (GPU Passthrough)
  /watchtower       # Komodo Periphery
```

---

## Critical Fixes

> ⚠️ **DO NOT REVERT THESE CONFIGURATIONS**

### 1. NFS Mount: Watchtower (Raspberry Pi)

**Problem:** Permission Denied on `/mnt/appdata` despite matching UIDs.

**Root Cause:** NFSv4 ID-domain mismatch between Pi and TerraMaster NAS.

**Solution:**

```bash
# /etc/fstab entry (Force NFSv3)
10.0.0.250:/Volume1/appdata /mnt/appdata nfs rw,nfsvers=3,hard,intr,x-systemd.automount,nolock 0 0
```

**Mount Point Ownership:**

```bash
# Set ownership WHILE UNMOUNTED
sudo chown chester:chester /mnt/appdata
```

---

### 2. Komodo Periphery Connectivity

**Problem:** Hairpin NAT prevents `*.castaldifamily.com` access from internal nodes.

**Solution:**

- **Core URL (Internal):** `ws://10.0.0.151:9120`
- **Key Paths:** `/config/keys/periphery.pub`
- **Environment Variable:** `file:/config/keys/periphery.pub`

---

### 3. Gitea & GitOps

**Problem:** SSH Key Exchange (Kex) errors on Windows (`diffie-hellman-group1-sha1`).


**Solution:**

```bash
# Use HTTPS instead of SSH
git clone https://git.castaldifamily.com/nathan/homelab.git

# Windows Credential Storage
git config --global credential.helper wincred

# Cross-Platform Line Endings
git config --global core.autocrlf true

# Network Share Permissions
git config --global safe.directory "*"
```

---

### 4. GPU Passthrough (Plex/Waldorf)

**Problem:** Plex sees the GPU but doesn't use it for hardware transcoding.

**Solution:**

```yaml
# compose.yaml
services:
  plex:
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```

**Verification:**

- Monitor the Plex Dashboard for `(hw)` status during transcoding.

---

## Lessons Learned

### "NFSv4 is too smart"

Modern NFS (v4) tries to sync user identities across a "Domain." If the Pi and NAS don't agree on the domain name, file ownership falls back to `nobody`.

**Fix:** Force NFSv3, which only compares numeric UIDs (here, 1000).

---

### "Naked Mount Point"

If the local folder (`/mnt/appdata`) is owned by `root`, the regular user can't "pass through" it to see the NAS data once the share mounts.

**Fix:** `chown` the mount point to the user **while unmounted**.

---

### "Hairpin NAT"

Many routers won't let internal traffic go out to a public IP and then back in (hairpinning).

**Fix:** Use **Internal IPs** (`10.0.0.X`) for node-to-node communication.

---

### "GPU Passthrough"

Docker isolation is strict. Simply having drivers on the host isn't enough.

**Fix:** Use the `deploy: resources: reservations` block in Compose to "hand the keys" of the hardware to the container.

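The NFSv3 and GPU lessons above can be checked against the config files before anything is deployed. The following is a minimal sketch, not something from the lab's repo: the function names `fstab_forces_nfsv3` and `compose_reserves_nvidia` are hypothetical, and the Compose check is a plain text match rather than a real YAML parse, so treat it as a smoke test only.

```bash
#!/usr/bin/env bash
# Sketch: config pre-flight checks for two of the lessons above.
# Function names are illustrative assumptions, not part of the lab's tooling.

# Succeeds if the fstab-style file has a non-comment entry for the given
# mount point whose option list pins NFS to version 3.
# Field 2 is the mount point; field 4 is the comma-separated option list.
fstab_forces_nfsv3() {
  local fstab_file="$1" mount_point="$2"
  awk -v mp="$mount_point" '
    $1 !~ /^#/ && $2 == mp {
      n = split($4, opts, ",")
      for (i = 1; i <= n; i++)
        if (opts[i] == "nfsvers=3" || opts[i] == "vers=3") found = 1
    }
    END { exit (found ? 0 : 1) }
  ' "$fstab_file"
}

# Succeeds if the Compose file contains both a "driver: nvidia" device entry
# and a capabilities list that includes "gpu".
# Text match only: it does not verify that both belong to the same service.
compose_reserves_nvidia() {
  local compose_file="$1"
  grep -Eq '^[[:space:]]*-?[[:space:]]*driver:[[:space:]]*nvidia[[:space:]]*$' "$compose_file" &&
    grep -Eq 'capabilities:[[:space:]]*\[[^]]*gpu[^]]*]' "$compose_file"
}
```

After sourcing the functions, a call such as `fstab_forces_nfsv3 /etc/fstab /mnt/appdata && echo "NFSv3 pinned"` on Watchtower, or `compose_reserves_nvidia compose.yaml || echo "no GPU reservation found"` in the Plex stack directory, gives a quick yes/no answer without touching the running services.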

---

## Network Map

| Service | Protocol | Internal Address | External URL |
|---------|----------|------------------|--------------|
| **Komodo Core** | HTTP | `10.0.0.151:9000` | `komodo.castaldifamily.com` |
| **Gitea** | HTTPS | `10.0.0.151:3000` | `git.castaldifamily.com` |
| **Plex** | Host Network | `10.0.0.XXX:32400` | `plex.castaldifamily.com` |
| **Tunarr** | HTTP | `10.0.0.XXX:8000` | `tunarr.castaldifamily.com` |

---

## Active Tasks

### Current Focus

1. **Git-ify Stacks**
   - ✅ `plex` and `tunarr` pushed to Gitea
   - ⏳ Convert remaining "Manual" stacks to "Git" sources in Komodo
2. **Webhooks**
   - ⏳ Ensure Gitea Webhooks fire to Komodo Stack URLs for auto-deployment
3. **Hardware Transcoding**
   - ⏳ Monitor Waldorf for `(hw)` status in Plex

---

## Emergency Procedures

### 🔥 NFS Mount Failure (Watchtower)

```bash
# Check NFS Server
ping 10.0.0.250

# Remount NFS Share
sudo umount /mnt/appdata
sudo mount -a

# Verify Mount
df -h | grep appdata
```

---

### 🔥 Komodo Periphery Offline

```bash
# Check Core Connectivity
# (plain HTTP to the same port; many curl builds reject ws:// URLs,
#  and any HTTP response here shows the port is reachable)
curl -v http://10.0.0.151:9120

# Restart Periphery Container
docker restart komodo-periphery
docker logs -f komodo-periphery
```

---

### 🔥 Plex Not Using GPU

```bash
# Verify NVIDIA Runtime
docker info | grep -i nvidia

# Check GPU Access in Container
docker exec -it plex nvidia-smi
```

---

### 🔥 Git Authentication Failure

```bash
# Regenerate Gitea App Token
# Settings > Applications > Generate New Token

# Update Credential Helper
git config --global credential.helper wincred

# Test Clone
git clone https://git.castaldifamily.com/nathan/homelab.git
```

---

## Credential Management

- ❌ **DO NOT** store passwords in `compose.yaml` in the Git repo
- ✅ **DO** use Komodo Stack "Environment Variables" to inject secrets
- ✅ **DO** use Gitea **App Tokens** for Git authentication (iPad/Windows)

---

## Maintenance Schedule

| Task | Frequency | Notes |
|------|-----------|-------|
| Update Docker Images | Weekly | Via Komodo or Watchtower |
| Backup Gitea | Weekly | `/data/gitea` directory |
| Backup Plex Metadata | Monthly | `/config/Library` directory |
| Check NFS Mount Health | Monthly | `df -h`, verify permissions |

---

**End of Runbook**