# SOP-002: Initial Infrastructure Deployment

**Status:** Active  
**Created:** April 12, 2026  
**Last Updated:** April 12, 2026  
**Owner:** Nathan Castaldi  
**Applies To:** Fresh homelab deployments and disaster recovery scenarios

---

## Purpose

Deploy the complete homelab infrastructure from a clean state using GitOps principles and automation. This SOP covers:

- Secure repository setup with encrypted secrets
- Ansible control node configuration
- Core service deployment (Komodo, Traefik, Gitea, Redis)
- Validation and health checks

**Use Cases:**

- New homelab initialization
- Disaster recovery (full infrastructure rebuild)
- Node replacement or migration

---

## Prerequisites

### Required Access

- [ ] Physical or console access to all nodes (Heimdall, Waldorf, Watchtower)
- [ ] GitHub account with access to `homelab` repository
- [ ] Gitea credentials (if repository already hosted locally)
- [ ] Root/sudo privileges on all nodes

### Required Infrastructure

- [ ] Nodes have base OS installed (Debian/Ubuntu recommended)
- [ ] Network connectivity between all nodes
- [ ] NFS storage accessible at `10.0.0.250:/Volume1/appdata`
- [ ] DNS/hosts file configured for node resolution
- [ ] Internet access for package installation

### Security Requirements

- [ ] Git-crypt symmetric key (if repository already encrypted)
- [ ] Password manager for storing credentials
- [ ] Secure workstation for handling keys and secrets

---

## Security & Pre-Deployment Setup

### Step 1: Prepare Your Workstation

**Time:** 15-20 minutes

1. **Install Required Tools:**

   **Linux/macOS:**

   ```bash
   # Install git-crypt
   brew install git-crypt  # macOS
   # OR
   sudo apt install git-crypt  # Debian/Ubuntu

   # Verify installation
   git-crypt --version
   ```

   **Windows (Git Bash/WSL):**

   ```bash
   # Download git-crypt binary
   curl -L https://github.com/AGWA/git-crypt/releases/download/0.7.0/git-crypt-0.7.0-x86_64.exe -o /usr/local/bin/git-crypt
   chmod +x /usr/local/bin/git-crypt
   ```

2. **Configure Git Identity:**

   ```bash
   git config --global user.name "Your Name"
   git config --global user.email "your.email@domain.com"
   git config --global core.autocrlf true  # Windows only
   ```

---

### Step 2: Clone Repository & Initialize Secrets

**Time:** 10-15 minutes

1. **Clone from Source:**

   **Option A: GitHub (Initial Clone):**

   ```bash
   cd ~/dev  # Or your preferred code directory
   git clone https://github.com/your-username/homelab.git
   cd homelab
   ```

   **Option B: Gitea (Production Environment):**

   ```bash
   cd ~/dev
   git clone https://git.castaldifamily.com/nathan/homelab.git
   cd homelab
   ```

2. **Unlock Encrypted Secrets (If Repository Already Uses Git-crypt):**

   ```bash
   # Import the symmetric key (retrieve from password manager)
   git-crypt unlock /path/to/homelab-secrets.key

   # Verify decryption
   ls -lh nodes/heimdall/core/.env.secrets
   # File should be readable plaintext, not binary
   ```

   **⚠️ Security Warning:** Store `homelab-secrets.key` in:

   - Password manager (1Password, Bitwarden, etc.)
   - Encrypted backup drive
   - **NEVER** commit it to the repository

3. **Initialize Git-crypt (First-Time Setup Only):**

   ```bash
   # If repository is NOT yet encrypted
   git-crypt init
   git-crypt export-key ~/homelab-secrets.key

   # Secure the key immediately
   chmod 600 ~/homelab-secrets.key
   ```

---

## Ansible Control Node Setup

### Step 3: Configure Watchtower as Control Node

**Time:** 25-35 minutes

**Rationale:** Watchtower (Raspberry Pi 5) serves as the Ansible control node to manage all infrastructure, including itself.

1. **SSH to Watchtower:**

   ```bash
   ssh chester@10.0.0.200
   ```

2. **Install Ansible Toolchain:**

   ```bash
   # Update package index
   sudo apt update

   # Install Ansible and dependencies
   sudo apt install -y ansible ansible-lint sshpass python3-pip git

   # Install Python libraries
   pip3 install proxmoxer requests --break-system-packages

   # Verify installation
   ansible --version
   # Expected: ansible [core 2.x.x]
   ```

3. **Generate SSH Keys for Automation:**

   ```bash
   # Generate ED25519 key (modern cryptography)
   ssh-keygen -t ed25519 -C "ansible@watchtower" -f ~/.ssh/id_ed25519 -N ""

   # Set proper permissions
   chmod 600 ~/.ssh/id_ed25519
   chmod 644 ~/.ssh/id_ed25519.pub
   ```

4. **Distribute Keys to All Nodes:**

   ```bash
   # Deploy to Heimdall
   ssh-copy-id -i ~/.ssh/id_ed25519.pub chester@10.0.0.151

   # Deploy to Waldorf
   ssh-copy-id -i ~/.ssh/id_ed25519.pub chester@10.0.0.251

   # Deploy to localhost (self-management)
   ssh-copy-id -i ~/.ssh/id_ed25519.pub chester@localhost
   ```

5. **Validate Passwordless Authentication:**

   ```bash
   # Test each node
   ssh -i ~/.ssh/id_ed25519 chester@10.0.0.151 "hostname"
   # Expected: heimdall

   ssh -i ~/.ssh/id_ed25519 chester@10.0.0.251 "hostname"
   # Expected: waldorf

   ssh -i ~/.ssh/id_ed25519 chester@localhost "hostname"
   # Expected: watchtower
   ```

6. **Clone Repository to Control Node:**

   ```bash
   cd ~
   git clone https://git.castaldifamily.com/nathan/homelab.git
   cd homelab

   # Unlock secrets (if using git-crypt)
   # git-crypt must be installed on this node: sudo apt install -y git-crypt
   # Transfer key securely via scp from workstation
   git-crypt unlock ~/homelab-secrets.key
   ```

---

## Core Infrastructure Deployment

### Step 4: Deploy Core Stack on Heimdall

**Time:** 20-30 minutes

**Core Stack Components:**

- Docker Socket Proxy (security boundary)
- Traefik (reverse proxy with automatic SSL)
- Redis (caching layer)
- Komodo Core (container orchestration)

**Deployment Method:** Manual Docker Compose (Ansible automation planned for future state)

1. **SSH to Heimdall:**

   ```bash
   ssh chester@10.0.0.151
   ```

2. **Install Docker & Docker Compose:**

   ```bash
   # Install Docker
   curl -fsSL https://get.docker.com -o get-docker.sh
   sudo sh get-docker.sh

   # Add user to docker group
   sudo usermod -aG docker $USER

   # Log out and back in for group to take effect
   exit
   ssh chester@10.0.0.151

   # Verify Docker installation
   docker --version
   docker compose version
   ```

3. **Create Komodo Directory Structure:**

   ```bash
   sudo mkdir -p /etc/komodo/{stacks,repos,volumes}
   sudo chown -R $USER:$USER /etc/komodo
   ```

4. **Mount NFS Storage (If Required):**

   ```bash
   # Install NFS client
   sudo apt install -y nfs-common

   # Create mount point
   sudo mkdir -p /mnt/nas

   # Add to /etc/fstab (persistent mount)
   echo "10.0.0.250:/Volume1/appdata /mnt/nas nfs defaults,nfsvers=3 0 0" | sudo tee -a /etc/fstab

   # Mount immediately
   sudo mount -a

   # Verify mount
   df -h | grep nas
   ```

5. **Clone Repository to Heimdall:**

   ```bash
   cd ~
   git clone https://git.castaldifamily.com/nathan/homelab.git
   cd homelab

   # Unlock secrets if repository uses git-crypt
   # git-crypt must be installed on this node: sudo apt install -y git-crypt
   git-crypt unlock ~/homelab-secrets.key
   ```

6. **Deploy Core Stack:**

   ```bash
   cd ~/homelab/nodes/heimdall/core

   # Review configuration
   cat compose.yaml
   cat .env.secrets  # Verify secrets are decrypted

   # Pull images
   docker compose pull

   # Start services in detached mode
   docker compose up -d

   # Monitor logs
   docker compose logs -f
   # Press Ctrl+C to exit log streaming
   ```

7. **Verify Core Services:**

   ```bash
   # Check running containers
   docker ps

   # Expected containers:
   # - dockerproxy
   # - traefik
   # - redis
   # - komodo-core

   # Check health
   docker compose ps
   # All services should show "running" status
   ```

---

## Validation & Health Checks

### Step 5: Service Verification

**Time:** 15-20 minutes

1. **Test Internal Connectivity:**

   ```bash
   # From Heimdall

   # Test Komodo Core
   curl -I http://localhost:9000
   # Expected: HTTP/1.1 200 OK

   # Test Redis
   docker exec -it redis redis-cli ping
   # Expected: PONG

   # Test Docker Socket Proxy
   curl http://localhost:2375/version
   # Expected: JSON response with Docker version
   ```

2. **Test External Access (From Workstation):**

   ```bash
   # Test Traefik dashboard (if exposed)
   curl -I https://traefik.castaldifamily.com

   # Test Komodo Core UI
   curl -I https://komodo.castaldifamily.com
   # Expected: HTTP/2 200
   ```

3. **Verify Traefik SSL Certificates:**

   ```bash
   # SSH to Heimdall
   ssh chester@10.0.0.151

   # Check Traefik logs for ACME certificate retrieval
   docker logs traefik 2>&1 | grep -i "certificate"

   # Verify cert storage
   ls -lh /etc/komodo/volumes/traefik/acme.json
   ```

4. **Komodo Core Initial Configuration:**

   - Navigate to `https://komodo.castaldifamily.com` in browser
   - Complete first-time setup wizard
   - Create admin account
   - Add server nodes (Heimdall, Waldorf, Watchtower)

---

## Post-Deployment Configuration

### Step 6: Configure GitOps Integration

**Time:** 20-25 minutes

1. **Install Komodo Periphery on Remote Nodes:**

   **On Waldorf (10.0.0.251):**

   ```bash
   ssh chester@10.0.0.251

   # Install Docker
   curl -fsSL https://get.docker.com -o get-docker.sh
   sudo sh get-docker.sh
   sudo usermod -aG docker $USER

   # Create Komodo directory
   sudo mkdir -p /etc/komodo/{stacks,repos}
   sudo chown -R $USER:$USER /etc/komodo

   # Deploy Periphery (via Komodo UI or manually)
   # See Komodo documentation for Periphery setup
   ```

   **On Watchtower (10.0.0.200):**

   ```bash
   # Repeat same process as Waldorf
   ```

2. **Configure Repository Cloning in Komodo:**

   In Komodo UI:

   - Navigate to **Settings** → **Git Providers**
   - Add Gitea provider:
     - **URL:** `https://git.castaldifamily.com`
     - **Token:** Generate from Gitea Settings → Applications
   - Test connection

3. **Create Git-Linked Stacks:**

   For each service (Plex, Tunarr, etc.):

   - Navigate to **Stacks** → **New Stack**
   - Select **Git Repository** as source
   - Configure:
     - **Repo:** `nathan/homelab`
     - **Branch:** `main`
     - **Path:** `nodes/{node-name}/{service-name}`
     - **Compose File:** `compose.yaml`
   - Enable **Auto-Deploy on Push**

4. **Configure Gitea Webhooks:**

   In Gitea repository settings:

   - Navigate to **Settings** → **Webhooks**
   - Add webhook:
     - **URL:** `https://komodo.castaldifamily.com/api/webhook/pull-stack/{stack-id}`
     - **Secret:** From Komodo stack configuration
     - **Events:** Push events only
     - **Active:** Enabled

---

## Troubleshooting

### Common Issues

**Issue:** `git-crypt unlock` fails with "File is not encrypted"

**Resolution:**

- Verify you're in the correct repository directory
- Check if repository is actually using git-crypt: `git-crypt status`
- Ensure `.gitattributes` file exists and defines encryption rules

---

**Issue:** SSH key authentication fails to nodes

**Resolution:**

```bash
# Verify key permissions
ls -lh ~/.ssh/id_ed25519
# Should be: -rw------- (600)

# Test manual SSH with verbose logging
ssh -vvv -i ~/.ssh/id_ed25519 chester@10.0.0.151

# Check authorized_keys on target node
ssh chester@10.0.0.151 "cat ~/.ssh/authorized_keys"
```

---

**Issue:** Docker Compose fails with "network not found"

**Resolution:**

```bash
# Remove unused Docker networks, then recreate the stack (and its networks)
docker network prune -f
docker compose up -d --force-recreate
```

---

**Issue:** NFS mount fails with "Operation not permitted"

**Resolution:**

```bash
# Check NFS server exports
showmount -e 10.0.0.250

# Force NFSv3 (avoid ID mapping issues)
sudo mount -t nfs -o nfsvers=3 10.0.0.250:/Volume1/appdata /mnt/nas

# Update fstab with explicit version
# 10.0.0.250:/Volume1/appdata /mnt/nas nfs defaults,nfsvers=3 0 0
```

---

## Emergency Rollback

### Complete Stack Teardown

If deployment fails and rollback is required:

```bash
# On Heimdall
cd ~/homelab/nodes/heimdall/core

# Run ONE of the following two commands:
docker compose down -v  # Also removes volumes (DESTRUCTIVE)
docker compose down     # Preserves data (omits the -v flag)

# Remove repository clone
cd ~
rm -rf homelab
```

### Restore Previous State

```bash
# Re-clone repository at specific commit
git clone https://git.castaldifamily.com/nathan/homelab.git
cd homelab
git checkout {commit-hash}  # Hash before failed deployment

# Unlock secrets and redeploy
git-crypt unlock ~/homelab-secrets.key
cd nodes/heimdall/core
docker compose up -d
```

---

## Success Criteria

Deployment is **complete** when:

- [ ] All core services running on Heimdall (Komodo, Traefik, Redis, Docker Proxy)
- [ ] Komodo Periphery agents connected on Waldorf and Watchtower
- [ ] Traefik SSL certificates issued and valid
- [ ] Komodo UI accessible at `https://komodo.castaldifamily.com`
- [ ] Git-linked stacks successfully pull from Gitea
- [ ] Webhooks trigger automatic deployments on push
- [ ] NFS mounts stable across all nodes
- [ ] Ansible control node (Watchtower) can execute playbooks against all nodes

---

## Next Steps

After successful deployment:

1. **Deploy Application Stacks:**
   - Use [SOP-001: Migrate Stack from UI to Git](SOP-001-Migrate-Stack-from-UI-to-Git.md) for each service
   - Prioritize critical services: Plex, Gitea, Tunarr

2. **Configure Backups:**
   - Implement automated Gitea repository backups
   - Schedule NFS snapshot retention policy
   - Export Komodo configuration regularly

3. **Security Hardening:**
   - Enable Traefik authentication for internal services
   - Configure fail2ban for SSH protection
   - Implement network segmentation (VLANs)

4. **Monitoring & Observability:**
   - Deploy Prometheus/Grafana stack
   - Configure health check endpoints
   - Set up uptime monitoring (Uptime Kuma)

---

## Related Documentation

- [SOP-001: Migrate Stack from UI to Git](SOP-001-Migrate-Stack-from-UI-to-Git.md) - Convert existing services to GitOps
- [KBA-001: Komodo GitOps Deployment Failures](../KBAs/KBA-001-Komodo-GitOps-Stack-Deployment-Failures.md) - Troubleshooting guide
- [plan-ansibleSetup.md](../plans/plan-ansibleSetup.md) - Detailed Ansible control node configuration
- [plan-gitcryptMigration.md](../plans/plan-gitcryptMigration.md) - Comprehensive git-crypt setup guide
- [TECHNICAL_RUNBOOK.md](../TECHNICAL_RUNBOOK.md) - Emergency procedures and reference

---

## Revision History

| Date | Version | Change Description |
|------|---------|-------------------|
| 2026-04-12 | 1.0 | Initial SOP creation |
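
As a companion to the "Verify Core Services" check in Step 4, the manual comparison of `docker ps` output against the expected container list could be scripted. The following is a minimal sketch, not part of the established procedure: the function name `missing_containers` and the variable `EXPECTED_CONTAINERS` are hypothetical, and the container names are assumed to match the "Expected containers" comment in Step 4 exactly.

```bash
#!/usr/bin/env bash
# Sketch only: report which expected core containers are not running.
# Names below are copied from the "Expected containers" list in Step 4;
# adjust them if compose.yaml uses different container names.
EXPECTED_CONTAINERS="dockerproxy traefik redis komodo-core"

# Reads running container names on stdin (one per line) and prints
# every expected name that is absent, one per line.
missing_containers() {
  local running name
  running="$(cat)"
  for name in $EXPECTED_CONTAINERS; do
    # -x matches the whole line, -F treats the name as a literal string
    if ! printf '%s\n' "$running" | grep -qxF -- "$name"; then
      printf '%s\n' "$name"
    fi
  done
}
```

On Heimdall this might be used as `docker ps --format '{{.Names}}' | missing_containers`; empty output would indicate that all four expected containers are running. Reading names from stdin keeps the function independent of Docker, so it can be exercised with a canned list before being trusted on a live node.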