Ansible Control Node Setup: Path to Production Readiness
Overview
Transform Watchtower (Raspberry Pi 5) into a production-ready Ansible control node capable of managing the entire homelab infrastructure. This guide builds the foundational runtime environment required to execute automation against Heimdall, Waldorf, and Watchtower itself.
Control Node: Watchtower (10.0.0.200) — Raspberry Pi 5, ARM Cortex-A76, 16GB RAM
Managed Nodes: Heimdall (10.0.0.151), Waldorf (10.0.0.251), Watchtower (localhost)
End State: Fully configured Ansible environment with validated connectivity, encrypted secrets, and role scaffolding.
Estimated Time to Complete: 2-3 hours (first-time setup) | 45-60 minutes (experienced operator)
Time Breakdown by Phase
| Phase | Description | Time Estimate |
|---|---|---|
| Phase 1 | Control Node Foundation | 20-30 minutes |
| Phase 2 | Ansible Project Configuration | 25-35 minutes |
| Phase 3 | Validation & First Automation | 15-25 minutes |
| Phase 4 | Role Scaffolding & Developer Experience | 20-30 minutes |
| Phase 5 | Final Verification & Documentation | 15-20 minutes |
| Total | End-to-End Setup | 95-140 minutes (allow 2-3 hours with troubleshooting) |
Prerequisites
- SSH access to Watchtower as chester (or your primary user)
- Watchtower has network access to Heimdall, Waldorf, and TerraMaster NAS
- Git repository cloned to Watchtower at /home/chester/homelab (or similar)
- Sudo privileges on Watchtower
- Basic understanding of YAML syntax
- VSCode with Remote-SSH extension (optional, but recommended)
Phase 1: Control Node Foundation (Watchtower Setup)
Estimated Time: 20-30 minutes
Step 1: Install Ansible Toolchain
Time: 10-15 minutes (depends on network speed)
Connect to Watchtower via SSH and install the complete Ansible stack:
# SSH to Watchtower
ssh chester@10.0.0.200
# Update package index
sudo apt update
# Install Ansible core components
sudo apt install -y ansible ansible-lint sshpass python3-pip python3-venv git
# Verify Ansible installation
ansible --version
# Expected: ansible [core 2.x.x] or newer
# Install Python API libraries
pip3 install proxmoxer requests --break-system-packages
# Verify ansible-lint
ansible-lint --version
# Expected: ansible-lint 6.x.x or newer
Why these tools:
- ansible: Execution engine
- ansible-lint: Code quality enforcement (aligns with .ansible-lint configuration)
- sshpass: Enables password-based initial SSH key deployment
- proxmoxer: Required for Proxmox API automation (future state)
- python3-pip: Package manager for Python libraries
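After the install, a quick preflight check can confirm each tool is actually on PATH. This is a minimal sketch; the function name check_tools is hypothetical and not part of the homelab repository.

```bash
# Report which of the given commands are missing from PATH.
# Returns 0 if every command is found, 1 otherwise.
check_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "ok:      $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example (on Watchtower, after Step 1):
#   check_tools ansible ansible-lint sshpass pip3 git
```

A non-zero return makes the function easy to use as a gate in a setup script.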
Step 2: Generate SSH Keys (ED25519)
Time: 2-3 minutes
Create the SSH key pair that Ansible will use for node authentication:
# Generate ED25519 key (modern, secure, fast)
ssh-keygen -t ed25519 -C "ansible@watchtower" -f ~/.ssh/id_ed25519 -N ""
# Set proper permissions
chmod 600 ~/.ssh/id_ed25519
chmod 644 ~/.ssh/id_ed25519.pub
# Verify key creation
ls -lh ~/.ssh/id_ed25519*
# Expected: Two files (private key and .pub)
# Display public key (for manual distribution if needed)
cat ~/.ssh/id_ed25519.pub
Security Note: The private key (id_ed25519) never leaves Watchtower. Only the .pub file is distributed to managed nodes.
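Re-running ssh-keygen against an existing key path triggers an overwrite prompt, and replacing a key that is already distributed would lock Ansible out of the managed nodes. A guard along these lines avoids that; the function name ensure_ssh_key is an assumption made for illustration.

```bash
# Generate the ED25519 key only if it does not exist yet.
# $1 = private key path (for example ~/.ssh/id_ed25519)
ensure_ssh_key() {
    local key_path="$1"
    if [ -f "$key_path" ]; then
        echo "exists: $key_path"
        return 0
    fi
    ssh-keygen -t ed25519 -C "ansible@watchtower" -f "$key_path" -N "" -q
    chmod 600 "$key_path"
    chmod 644 "$key_path.pub"
    echo "created: $key_path"
}

# Example:
#   ensure_ssh_key ~/.ssh/id_ed25519
```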
Step 3: Distribute SSH Key to Managed Nodes
Time: 3-5 minutes
Deploy the public key to all nodes (including localhost for self-management):
# Deploy to Heimdall
ssh-copy-id -i ~/.ssh/id_ed25519.pub chester@10.0.0.151
# Deploy to Waldorf
ssh-copy-id -i ~/.ssh/id_ed25519.pub chester@10.0.0.251
# Deploy to localhost (Watchtower managing itself)
ssh-copy-id -i ~/.ssh/id_ed25519.pub chester@localhost
# Test passwordless authentication
ssh -i ~/.ssh/id_ed25519 chester@10.0.0.151 "hostname && exit"
# Expected: heimdall
ssh -i ~/.ssh/id_ed25519 chester@10.0.0.251 "hostname && exit"
# Expected: waldorf
ssh -i ~/.ssh/id_ed25519 chester@localhost "hostname && exit"
# Expected: watchtower
Troubleshooting: If ssh-copy-id fails, ensure:
- You can SSH to the target with password authentication first
- The target user has a ~/.ssh directory with proper permissions (700)
- The firewall allows SSH (port 22)
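The permission requirement above can be applied on a target node with a small helper. This is a sketch under the assumption that authorized_keys should be owner-only (600); the name fix_ssh_perms is hypothetical.

```bash
# Set the permissions sshd expects: 700 on the directory, 600 on authorized_keys.
# $1 = SSH directory (defaults to ~/.ssh)
fix_ssh_perms() {
    local ssh_dir="${1:-$HOME/.ssh}"
    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"
    if [ -f "$ssh_dir/authorized_keys" ]; then
        chmod 600 "$ssh_dir/authorized_keys"
    fi
}

# Example, run on the target node itself:
#   fix_ssh_perms
```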
Step 4: Configure Passwordless Sudo (If Required)
Time: 5-7 minutes (per node)
If Ansible tasks require privilege escalation without password prompts:
# On EACH node (Heimdall, Waldorf, Watchtower), run:
sudo visudo
# Add this line (replace 'chester' with your username):
chester ALL=(ALL) NOPASSWD: ALL
# Save and exit (:wq in vi)
Alternative (More Secure): Use Ansible Vault to encrypt the sudo password and configure ansible_become_pass in inventory. See Step 7 below.
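Instead of editing the main sudoers file, the same rule can live in a drop-in file that is syntax-checked before it is installed. The sketch below assumes a drop-in named ansible-chester; both that file name and the helper write_sudoers_dropin are illustrative.

```bash
# Write a NOPASSWD rule for one user to a file with mode 0440.
# $1 = username, $2 = output path (staged outside /etc first)
write_sudoers_dropin() {
    local user="$1" out="$2"
    printf '%s ALL=(ALL) NOPASSWD: ALL\n' "$user" > "$out"
    chmod 440 "$out"
}

# Example: stage, validate, then install only if the syntax check passes.
#   write_sudoers_dropin chester /tmp/ansible-chester
#   sudo visudo -cf /tmp/ansible-chester && \
#       sudo install -o root -g root -m 0440 /tmp/ansible-chester /etc/sudoers.d/ansible-chester
```

Validating with visudo -cf before installing reduces the risk of a syntax error that blocks sudo entirely.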
Phase 2: Ansible Project Configuration
Estimated Time: 25-35 minutes
Step 5: Create ansible.cfg
Time: 5-7 minutes
Navigate to the homelab repository on Watchtower and create the main configuration file:
cd ~/homelab/ansible
cat > ansible.cfg <<'EOF'
[defaults]
# Inventory Configuration
inventory = ./inventory/hosts.yml
host_key_checking = False
# SSH Behavior
remote_user = chester
private_key_file = ~/.ssh/id_ed25519
timeout = 30
forks = 3
# Output & Logging
stdout_callback = yaml
display_skipped_hosts = False
display_ok_hosts = True
log_path = ./ansible.log
# Vault Configuration
vault_password_file = ./.vault_pass
# Role Path
roles_path = ./roles
# Retry Configuration
retry_files_enabled = False
[privilege_escalation]
become = True
become_method = sudo
become_user = root
become_ask_pass = False
[ssh_connection]
ssh_args = -o ControlMaster=auto -o ControlPersist=60s
pipelining = True
EOF
# Verify syntax
ansible-config dump --only-changed
Key Decisions:
- host_key_checking = False: Simplifies homelab automation (acceptable for trusted private network)
- vault_password_file: Points to .vault_pass (created in Step 7)
- forks = 3: Limits parallel execution (prevents overwhelming Pi resources)
- pipelining = True: Performance optimization
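To spot-check a single setting in the file without invoking Ansible, an awk lookup over the INI sections is one option. This is a rough sketch, not a full INI parser; the helper name cfg_get is hypothetical.

```bash
# Print the value of a key inside one [section] of an INI-style file.
# $1 = file, $2 = section name (without brackets), $3 = key
cfg_get() {
    awk -F '=' -v section="[$2]" -v key="$3" '
        /^\[/ { in_section = ($0 == section); next }
        in_section {
            k = $1
            gsub(/^[ \t]+|[ \t]+$/, "", k)
            if (k == key) {
                # Keep everything after the first "=" so values may contain "=".
                v = substr($0, index($0, "=") + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", v)
                print v
                exit
            }
        }
    ' "$1"
}

# Example:
#   cfg_get ~/homelab/ansible/ansible.cfg defaults forks
```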
Step 6: Create Inventory Structure
Time: 8-10 minutes
Define the three-node infrastructure with hybrid grouping (hardware type + function):
# Create inventory directory
mkdir -p ~/homelab/ansible/inventory/group_vars/all
# Create main inventory file
cat > ~/homelab/ansible/inventory/hosts.yml <<'EOF'
---
# Homelab Infrastructure Inventory
# Control Node: Watchtower (10.0.0.200)
all:
  vars:
    ansible_user: chester
    ansible_ssh_private_key_file: ~/.ssh/id_ed25519
  children:
    # --- Hardware Hierarchy ---
    proxmox_vms:
      hosts:
        heimdall:
          ansible_host: 10.0.0.151
    physical_servers:
      hosts:
        waldorf:
          ansible_host: 10.0.0.251
          # GPU passthrough capability
          gpu_enabled: true
          gpu_type: nvidia
    raspberry_pi:
      hosts:
        watchtower:
          ansible_host: localhost
          ansible_connection: local
    # --- Functional Hierarchy ---
    infrastructure:
      hosts:
        heimdall:
        watchtower:
    media_servers:
      hosts:
        waldorf:
    docker_hosts:
      hosts:
        heimdall:
        waldorf:
        watchtower:
EOF
# Validate inventory
ansible-inventory --list
ansible-inventory --graph
Inventory Design:
- Hardware groups (proxmox_vms, physical_servers, raspberry_pi): Target based on architecture
- Functional groups (infrastructure, media_servers, docker_hosts): Target based on role
- Localhost optimization: Watchtower uses ansible_connection: local (no SSH overhead)
Step 7: Initialize Ansible Vault
Time: 12-15 minutes
Create encrypted storage for sensitive variables (passwords, API keys, tokens):
cd ~/homelab/ansible
# Create vault password file (replace the placeholder; single quotes keep bash from history-expanding "!")
echo 'YourSecureVaultPassword123!' > .vault_pass
chmod 600 .vault_pass
# CRITICAL: Add to .gitignore
echo ".vault_pass" >> ../.gitignore
# Create encrypted variable file
cat > inventory/group_vars/all/vault.yml <<'EOF'
---
# Encrypted Secrets (Ansible Vault)
# Edit with: ansible-vault edit inventory/group_vars/all/vault.yml
vault_sudo_password: "YourSudoPasswordHere"
vault_nfs_password: "" # If NFS requires auth
vault_proxmox_api_token: "" # For future Proxmox automation
vault_gitea_token: "" # For Git automation
EOF
# Encrypt the file
ansible-vault encrypt inventory/group_vars/all/vault.yml
# Verify encryption
cat inventory/group_vars/all/vault.yml
# Expected: $ANSIBLE_VAULT... (encrypted content)
# Test decryption
ansible-vault view inventory/group_vars/all/vault.yml
# Expected: Original YAML content
Usage Pattern:
- Store all secrets in vault.yml with the vault_ prefix
- Reference in inventory or group_vars as: ansible_become_pass: "{{ vault_sudo_password }}"
- Edit encrypted file: ansible-vault edit inventory/group_vars/all/vault.yml
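Rather than typing a password into the shell, where it lands in history, the vault password file can be filled with random data. The sketch below assumes /dev/urandom and base64 are available, as on a stock Debian-based system; the function name generate_vault_pass is hypothetical.

```bash
# Write a random 32-byte, base64-encoded password to the given file.
# The file is created owner-only (600) and is never overwritten.
# $1 = output path (defaults to .vault_pass in the current directory)
generate_vault_pass() {
    local out="${1:-.vault_pass}"
    if [ -e "$out" ]; then
        echo "refusing to overwrite $out" >&2
        return 1
    fi
    # umask in a subshell so the restrictive mode does not leak into the session
    ( umask 077 && head -c 32 /dev/urandom | base64 > "$out" )
}

# Example:
#   cd ~/homelab/ansible && generate_vault_pass .vault_pass
```

Keep a copy of the generated password somewhere safe outside the repository, since the encrypted vault cannot be opened without it.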
Phase 3: Validation & First Automation
Estimated Time: 15-25 minutes
Step 8: Create Connectivity Validation Playbook
Time: 8-10 minutes
Build a simple playbook to prove the entire stack works:
mkdir -p ~/homelab/ansible/playbooks
cat > ~/homelab/ansible/playbooks/validate-connectivity.yml <<'EOF'
---
- name: Ansible Environment Validation
  hosts: all
  gather_facts: true
  tasks:
    - name: Test ping module
      ansible.builtin.ping:
    - name: Display node facts
      ansible.builtin.debug:
        msg: |
          Hostname: {{ ansible_hostname }}
          OS: {{ ansible_distribution }} {{ ansible_distribution_version }}
          Architecture: {{ ansible_architecture }}
          Python: {{ ansible_python_version }}
    - name: Test privilege escalation
      ansible.builtin.command:
        cmd: whoami
      become: true
      # Read-only, so run it even under --check; otherwise the assert below has no stdout to test
      check_mode: false
      register: sudo_test
      changed_when: false
    - name: Verify sudo worked
      ansible.builtin.assert:
        that:
          - sudo_test.stdout == "root"
        success_msg: "Privilege escalation: PASS"
        fail_msg: "Privilege escalation: FAIL"
    - name: Check NFS mount (infrastructure nodes only)
      ansible.builtin.stat:
        path: /mnt/appdata
      register: nfs_mount
      when: inventory_hostname in groups['infrastructure']
    - name: Display NFS status
      ansible.builtin.debug:
        msg: "NFS mount exists: {{ nfs_mount.stat.exists | default(false) }}"
      when: inventory_hostname in groups['infrastructure']
EOF
# Validate playbook syntax
ansible-playbook playbooks/validate-connectivity.yml --syntax-check
# Lint check (must pass with zero errors)
ansible-lint playbooks/validate-connectivity.yml
Step 9: Execute Validation Playbook
Time: 7-15 minutes (includes troubleshooting)
Run the playbook to confirm end-to-end functionality:
cd ~/homelab/ansible
# Dry-run first (check mode)
ansible-playbook playbooks/validate-connectivity.yml --check
# Full execution
ansible-playbook playbooks/validate-connectivity.yml
# Expected output summary (7 tasks including fact gathering; waldorf skips the two NFS tasks):
# heimdall : ok=7 changed=0 unreachable=0 failed=0 skipped=0
# waldorf : ok=5 changed=0 unreachable=0 failed=0 skipped=2
# watchtower : ok=7 changed=0 unreachable=0 failed=0 skipped=0
Success Criteria:
- All hosts return ok status (no unreachable or failed)
- Sudo test shows "Privilege escalation: PASS"
- Facts display correct OS/architecture for each node
Troubleshooting:
- unreachable: Check SSH keys, network connectivity, firewall
- failed on sudo: Verify passwordless sudo or Vault configuration
- Lint errors: Fix YAML indentation, task naming, FQCN usage
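When reviewing saved output or ansible.log after the fact, the recap lines can be scanned for unreachable or failed hosts. This is a minimal sketch that assumes the usual key=value recap format; the function name recap_ok is hypothetical.

```bash
# Read playbook output on stdin. Succeeds only if at least one recap line
# was seen and every recap line reports unreachable=0 and failed=0.
recap_ok() {
    awk '
        /unreachable=[0-9]+/ && /failed=[0-9]+/ {
            seen = 1
            for (i = 1; i <= NF; i++) {
                split($i, kv, "=")
                if ((kv[1] == "unreachable" || kv[1] == "failed") && kv[2] + 0 > 0)
                    bad = 1
            }
        }
        END { exit ((seen && !bad) ? 0 : 1) }
    '
}

# Example:
#   ansible-playbook playbooks/validate-connectivity.yml | tee /tmp/validate.out
#   recap_ok < /tmp/validate.out && echo "all hosts green"
```

For live runs, the exit status of ansible-playbook already signals failure; the scan is mainly useful for logs.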
Phase 4: Role Scaffolding & Developer Experience
Estimated Time: 20-30 minutes
Step 10: Create Standard Role Directory Structure
Time: 5-7 minutes
Generate the skeleton for reusable Ansible roles:
cd ~/homelab/ansible
# Create roles directory
mkdir -p roles
# Generate a sample role (follows .ansible-standards.md)
cat > roles/setup-docker.yml <<'EOF'
---
# Placeholder: This will become a proper role with:
# - roles/setup-docker/tasks/main.yml
# - roles/setup-docker/defaults/main.yml
# - roles/setup-docker/handlers/main.yml
# - roles/setup-docker/meta/main.yml
#
# For now, just structural validation.
EOF
# Create external roles directory (for Ansible Galaxy)
mkdir -p roles/external
# Update .ansible-lint exclusion (already configured)
grep -q "roles/external" .ansible-lint && echo "✅ Lint exclusion exists"
Next Steps (Future):
- Install Galaxy roles: ansible-galaxy install geerlingguy.docker -p roles/external/
- Create custom roles following the patterns in .ansible-standards.md
- Use molecule for role testing (not installed in Step 1; install it separately before use)
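When the placeholder is ready to become a real role, the directory layout listed in it can be created as below. This is a sketch only; the helper name scaffold_role is hypothetical, and ansible-galaxy role init is the built-in alternative that generates a fuller skeleton.

```bash
# Create the standard role layout with a minimal main.yml in each directory.
# $1 = roles directory, $2 = role name
scaffold_role() {
    local roles_dir="$1" name="$2" sub
    for sub in tasks defaults handlers meta; do
        mkdir -p "$roles_dir/$name/$sub"
        # Keep any main.yml that already exists.
        [ -f "$roles_dir/$name/$sub/main.yml" ] || printf -- '---\n' > "$roles_dir/$name/$sub/main.yml"
    done
}

# Example:
#   scaffold_role ~/homelab/ansible/roles setup-docker
```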
Step 11: Configure VSCode Remote Development
Time: 15-20 minutes (includes extension installation)
Connect VSCode from your Windows workstation to Watchtower for seamless editing:
On Windows:
- Install the Remote - SSH extension in VSCode
- Open Command Palette (Ctrl+Shift+P) → "Remote-SSH: Connect to Host"
- Enter: chester@10.0.0.200
- VSCode opens a new window connected to Watchtower
- Navigate to /home/chester/homelab/ansible
- Install extensions on the remote (VSCode will prompt):
  - Ansible (by Red Hat)
  - YAML (by Red Hat)
Verify:
- Open ansible.cfg → Syntax highlighting works
- Open playbooks/validate-connectivity.yml → Ansible linting shows in Problems panel
- Terminal in VSCode → Runs commands directly on Watchtower
Phase 5: Final Verification & Documentation
Estimated Time: 15-20 minutes
Step 12: Execute Full Environment Test
Time: 10-12 minutes
Run comprehensive checks to certify the environment:
cd ~/homelab/ansible
# 1. Lint all playbooks
ansible-lint playbooks/*.yml
# 2. Configuration dump
ansible-config dump --only-changed
# 3. Inventory validation
ansible-inventory --list --yaml
# 4. Ad-hoc ping test
ansible all -m ping
# 5. Fact gathering test
ansible all -m setup -a "filter=ansible_distribution*"
# 6. Vault operations test
ansible-vault view inventory/group_vars/all/vault.yml
ansible-vault edit inventory/group_vars/all/vault.yml # Add a test variable, save, exit
# 7. Privilege escalation test
ansible all -m command -a "whoami" --become
# 8. Full playbook run
ansible-playbook playbooks/validate-connectivity.yml
Success State:
- ✅ Zero lint errors
- ✅ All nodes respond to ping
- ✅ Facts gathered from all hosts
- ✅ Vault encrypt/decrypt cycle works
- ✅ Sudo escalation succeeds
- ✅ Validation playbook completes with no failures
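The non-interactive checks can be wrapped in a small runner that prints one PASS or FAIL line per item and counts failures. This is a sketch; run_check and the failures counter are illustrative names, and the interactive ansible-vault edit step is deliberately left out.

```bash
failures=0

# Run a command quietly and report PASS or FAIL under the given label.
# $1 = label, remaining arguments = command to execute
run_check() {
    local label="$1"
    shift
    if "$@" >/dev/null 2>&1; then
        echo "PASS  $label"
    else
        echo "FAIL  $label"
        failures=$((failures + 1))
    fi
}

# Example (from ~/homelab/ansible):
#   run_check "ping"      ansible all -m ping
#   run_check "sudo"      ansible all -m command -a "whoami" --become
#   run_check "playbook"  ansible-playbook playbooks/validate-connectivity.yml
#   echo "failures: $failures"
```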
Step 13: Update Repository Documentation
Time: 5-8 minutes
Document the new Ansible capabilities:
cd ~/homelab
# Update main README
cat >> README.md <<'EOF'
## Ansible Automation
**Control Node:** Watchtower (10.0.0.200)
**Managed Nodes:** Heimdall, Waldorf, Watchtower
**Quick Start:**
```bash
# SSH to control node
ssh chester@10.0.0.200
# Run validation
cd ~/homelab/ansible
ansible-playbook playbooks/validate-connectivity.yml
# Ad-hoc commands
ansible all -m ping
ansible docker_hosts -m command -a "docker --version"
```
**Configuration:**
- Inventory: `ansible/inventory/hosts.yml`
- Main Config: `ansible/ansible.cfg`
- Secrets: `ansible/inventory/group_vars/all/vault.yml` (encrypted)

**Standards:** See `ansible/.ansible-standards.md` for architectural patterns.
EOF
# Commit the changes (README.md and .gitignore were also modified)
git add ansible/ README.md .gitignore
git commit -m "feat(ansible): complete control node setup on Watchtower

- Install ansible-core, ansible-lint, proxmoxer
- Generate ED25519 SSH keys and distribute to nodes
- Create ansible.cfg with vault integration
- Build YAML inventory with hardware + functional grouping
- Initialize Ansible Vault for secret management
- Create validate-connectivity.yml playbook
- Verify end-to-end automation capability
Environment tested and production-ready."
git push origin main
Maintenance & Next Steps
Ongoing Operations
Update Ansible:
sudo apt update && sudo apt install --only-upgrade ansible ansible-lint
Rotate Vault Password:
ansible-vault rekey inventory/group_vars/all/vault.yml
# Update .vault_pass file with new password
Add New Managed Node:
- Deploy SSH key: ssh-copy-id -i ~/.ssh/id_ed25519.pub user@new-host
- Add to inventory/hosts.yml
- Test: ansible new-host -m ping
Future Enhancements
- Proxmox Automation:
  - Create playbook to manage VM creation/deletion via Proxmox API
  - Use proxmoxer library (already installed)
- Docker Stack Management:
  - Ansible role to deploy Compose stacks (replacing manual Git pulls)
  - Integration with Komodo API for automated deployments
- System Maintenance:
  - Scheduled playbook for OS updates (apt update && apt upgrade)
  - NFS mount validation and auto-remediation
  - Log rotation and backup verification
- CI/CD Integration:
  - Gitea webhook triggers Ansible playbook runs
  - Automated testing via Molecule in Docker containers
Troubleshooting
Issue: "Host key verification failed"
Cause: SSH strict host checking is enabled despite ansible.cfg setting.
Fix:
# Remove only the stale entry for the affected host (repeat per host)
ssh-keygen -R TARGET_IP
# Force disable in SSH config
cat >> ~/.ssh/config <<EOF
Host 10.0.0.*
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
EOF
Issue: "Permission denied (publickey)"
Cause: SSH key not properly deployed to target node.
Fix:
# Re-deploy key manually
ssh-copy-id -i ~/.ssh/id_ed25519.pub chester@TARGET_IP
# Verify key is in authorized_keys
ssh chester@TARGET_IP "cat ~/.ssh/authorized_keys | grep ansible@watchtower"
Issue: "Vault password file not found"
Cause: .vault_pass missing or wrong permissions.
Fix:
# Recreate vault password file
echo "YourVaultPassword" > ~/homelab/ansible/.vault_pass
chmod 600 ~/homelab/ansible/.vault_pass
# Verify ansible.cfg points to it
grep vault_password_file ~/homelab/ansible/ansible.cfg
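Before recreating the file, it can help to confirm which of the two causes applies. The check below is a sketch that treats 600 as the expected mode, matching Step 7; the function name check_vault_pass is hypothetical.

```bash
# Diagnose a vault password file: missing, empty, or wrong permissions.
# $1 = path to the password file
check_vault_pass() {
    local f="$1" mode
    [ -f "$f" ] || { echo "missing: $f"; return 1; }
    [ -s "$f" ] || { echo "empty: $f"; return 1; }
    mode=$(ls -l "$f" | cut -c1-10)
    [ "$mode" = "-rw-------" ] || { echo "mode is $mode, expected -rw-------"; return 1; }
    echo "ok: $f"
}

# Example:
#   check_vault_pass ~/homelab/ansible/.vault_pass
```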
Issue: Lint errors on playbook execution
Cause: Code violates .ansible-lint safety profile rules.
Fix:
# Run linter to see specific violations
ansible-lint playbooks/YOUR_PLAYBOOK.yml
# Common fixes:
# - Use FQCN: ansible.builtin.command instead of 'command'
# - Add 'name:' to all tasks
# - Use changed_when/failed_when for shell/command tasks
# - Add check_mode support for idempotency testing
Summary Checklist
- Ansible toolchain installed on Watchtower
- ED25519 SSH keys generated and distributed
- Passwordless sudo configured (or Vault password set)
- ansible.cfg created and validated
- Inventory file with all three nodes defined
- Ansible Vault initialized and .vault_pass secured
- First playbook run successful (all hosts green)
- VSCode Remote-SSH connected to Watchtower
- Repository documentation updated
- Git commit pushed to Gitea
Environment Status: 🟢 PRODUCTION READY
Document Version: 1.0
Last Updated: April 12, 2026
Author: FrankGPT (Ansible Architect Mode)
Review Cycle: Quarterly or after infrastructure changes