docs: add comprehensive README for Castaldi Family Homelab
Commit `3d7eba7044` (parent `a23a8581ee`) adds `README.md` as a new file (319 lines).

# Castaldi Family Homelab

> **A GitOps-managed, self-hosted infrastructure running media services, container orchestration, and automation across distributed ARM and x86 nodes.**

[](https://komo.do)
[](#architecture)
[](documentation/)

---

## 🚀 Why This Homelab?

- **Zero-Touch Deployments:** Push to Git → Auto-deploy via webhooks → Containers update automatically
- **Infrastructure as Code:** All services defined in version-controlled `compose.yaml` files
- **GPU Transcoding:** Hardware-accelerated media streaming with NVIDIA GTX 1060
- **Distributed Architecture:** Services placed by role across a Proxmox VM, a physical server, and a Raspberry Pi
- **Self-Hosted Git:** No external dependencies: Gitea runs on-premises with automated backups
- **Production-Grade Networking:** Traefik reverse proxy with automatic SSL (Cloudflare DNS challenge)

---

## 🏗️ Architecture

```mermaid
graph TB
    subgraph Internet
        CF[Cloudflare DNS]
    end

    subgraph Heimdall["Heimdall (Proxmox VM - 10.0.0.151)"]
        Traefik[Traefik Reverse Proxy<br/>:80, :443]
        Komodo[Komodo Core<br/>Container Orchestrator]
        Gitea[Gitea<br/>Self-Hosted Git]
        Redis[Redis Cache]
    end

    subgraph Waldorf["Waldorf (Physical Server - 10.0.0.251)"]
        Plex[Plex Media Server<br/>GPU Transcoding]
        Tunarr[Tunarr<br/>IPTV Channels]
        GPU[NVIDIA GTX 1060]
    end

    subgraph Watchtower["Watchtower (Raspberry Pi 5 - 10.0.0.200)"]
        Periphery[Komodo Periphery<br/>Remote Agent]
    end

    subgraph TerraMaster["TerraMaster NAS (10.0.0.250)"]
        NFS[NFS Storage<br/>/Volume1/appdata]
    end

    CF -->|HTTPS| Traefik
    Traefik --> Gitea
    Traefik --> Komodo
    Traefik --> Plex
    Traefik --> Tunarr

    Komodo <-->|WebSocket| Periphery
    Gitea -->|Webhook| Komodo

    Plex --> GPU
    Tunarr --> GPU

    Heimdall -.->|NFSv3| NFS
    Waldorf -.->|NFSv3| NFS
    Watchtower -.->|NFSv3| NFS

    style Traefik fill:#326ce5,color:#fff
    style Komodo fill:#ff6b6b,color:#fff
    style GPU fill:#76b900,color:#fff
    style NFS fill:#f9a825,color:#000
```

---

## 📦 Infrastructure Inventory

| Node | IP | Hardware | Role | Services |
|------|------|----------|------|----------|
| **Heimdall** | `10.0.0.151` | Proxmox VM<br/>Intel N100, 16GB RAM | Core Services | Komodo, Gitea, Traefik, Redis |
| **Waldorf** | `10.0.0.251` | Physical Server<br/>i7-7820HQ, GTX 1060, 16GB | Media Processing | Plex, Tunarr (GPU transcoding) |
| **Watchtower** | `10.0.0.200` | Raspberry Pi 5<br/>ARM Cortex-A76, 16GB | Periphery Node | Komodo Agent |
| **TerraMaster** | `10.0.0.250` | NAS | Shared Storage | NFSv3 (`/Volume1/appdata`) |

---

## ⚡ Quick Start

### Prerequisites

- SSH access to nodes
- Git configured with credentials:

  ```bash
  git config --global credential.helper wincred  # Windows
  git config --global core.autocrlf true
  ```

### Clone & Deploy

```bash
# Clone from self-hosted Gitea
git clone https://git.castaldifamily.com/nathan/homelab.git
cd homelab

# Deploy a service (via Komodo UI or SSH)
ssh chester@10.0.0.251
cd /etc/komodo/stacks/tunarr
docker compose up -d
```

### Automated GitOps Workflow

1. **Edit** `nodes/{node}/{service}/compose.yaml`
2. **Commit** and push to `main` branch
3. **Webhook** triggers Komodo pull
4. **Auto-deploy** updates running containers

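
The workflow above depends on the `nodes/{node}/{service}/compose.yaml` layout to tell which stack a push touches. As a rough sketch only (the helper name `stack_from_path` is hypothetical and not part of Komodo or Gitea), a changed file path could be mapped back to its node and service like this:

```bash
# Hypothetical helper: print "<node> <service>" for a changed compose file.
# Assumes the nodes/{node}/{service}/compose.yaml layout used in this repo.
stack_from_path() {
  local path="$1" rest node service
  case "$path" in
    nodes/*/compose.yaml) ;;      # candidate path, validated below
    *) return 1 ;;                # anything else is not a stack definition
  esac
  rest="${path#nodes/}"           # drop the leading "nodes/"
  rest="${rest%/compose.yaml}"    # drop the trailing "/compose.yaml"
  node="${rest%%/*}"              # text before the first slash
  service="${rest#*/}"            # text after the first slash
  # Reject paths that are too shallow (no service) or have an empty part.
  if [ "$node" = "$rest" ] || [ -z "$node" ] || [ -z "$service" ]; then
    return 1
  fi
  # Reject paths that are nested deeper than node/service.
  case "$service" in */*) return 1 ;; esac
  printf '%s %s\n' "$node" "$service"
}
```

For example, piping `git diff --name-only HEAD~1 HEAD` through this function line by line would list the stacks affected by the last commit, which can help when checking that a webhook redeployed what you expected.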

---

## 🎯 Active Missions

> **Traffic Light System:** 🟢 Complete | 🟡 In Progress | 🔴 Blocked

| Status | Mission | Details |
|--------|---------|---------|
| 🟢 | **GitOps Migration** | All production stacks migrated to Git-based deployment |
| 🟢 | **Webhook Automation** | Gitea webhooks trigger auto-deploy on push |
| 🟢 | **GPU Passthrough** | NVIDIA GTX 1060 accessible in Plex/Tunarr containers |
| 🟢 | **Documentation Structure** | KBAs and SOPs organized in `documentation/` |
| 🟡 | **Hardware Transcoding Validation** | Monitor Plex for `(hw)` indicator during active streams |
| 🟢 | **NFS Mount Stability** | NFSv3 forced on Raspberry Pi to prevent ID-domain errors |
| 🟢 | **Credential Security** | Secrets managed via Komodo Environment Variables (not Git) |

---

## 📂 Repository Structure

```
homelab/
├── nodes/                    # Service definitions per node
│   ├── heimdall/             # Core infrastructure (VM)
│   │   ├── core/             # Komodo, Traefik, Redis
│   │   └── gitea/            # Self-hosted Git
│   ├── waldorf/              # Media services (Physical)
│   │   ├── plex/             # Media server + GPU
│   │   └── tunarr/           # IPTV channels + GPU
│   └── watchtower/           # Periphery agent (Pi 5)
├── documentation/            # Technical knowledge base
│   ├── KBAs/                 # Troubleshooting guides
│   ├── SOPs/                 # Operational procedures
│   └── TECHNICAL_RUNBOOK.md  # Emergency reference
├── ansible/                  # (Future) Automated provisioning
└── scripts/                  # Utility scripts
```
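
Because every stack lives at `nodes/{node}/{service}/compose.yaml`, the set of defined stacks can be read straight from the tree. The following is a small sketch under that assumption; `list_stacks` is a hypothetical helper name, and it relies on a `find` that supports `-mindepth`/`-maxdepth` (GNU, BSD, and BusyBox do):

```bash
# Hypothetical helper: list every "node/service" that has a compose.yaml.
# Takes the repository root as an optional argument (defaults to ".").
list_stacks() {
  local root="${1:-.}"
  (
    cd "$root/nodes" 2>/dev/null || exit 1
    # Depth 3 relative to nodes/ is ./<node>/<service>/compose.yaml
    find . -mindepth 3 -maxdepth 3 -type f -name compose.yaml
  ) | sed -e 's|^\./||' -e 's|/compose\.yaml$||' | sort
}
```

Run from a checkout, `list_stacks .` prints one `node/service` pair per line, which can be compared against what the Komodo UI shows.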

---

## 🔧 Common Operations

### Deploy a New Stack

```bash
# 1. Create directory structure
mkdir -p nodes/waldorf/sonarr

# 2. Create compose.yaml
cat > nodes/waldorf/sonarr/compose.yaml <<EOF
services:
  sonarr:
    image: lscr.io/linuxserver/sonarr:latest
    restart: unless-stopped
    ports:
      - 8989:8989
    volumes:
      - /mnt/appdata/sonarr:/config
EOF

# 3. Commit and push
git add nodes/waldorf/sonarr/
git commit -m "feat(stacks): add Sonarr to Waldorf"
git push

# 4. Configure in Komodo UI
#    - Source Type: Git Repo
#    - Run Directory: nodes/waldorf/sonarr
#    - Deploy!
```

### Check Service Status

```bash
# Via Komodo API
curl http://10.0.0.151:9000/api/stacks

# Direct SSH to node
ssh chester@10.0.0.251
docker ps | grep tunarr
docker logs tunarr --tail 50
```

### Emergency Rollback

```bash
# In Komodo UI: Click "Rollback" on stack
# Or via Git:
git revert HEAD
git push  # Triggers auto-rollback
```

---

## 📚 Documentation

| Document | Purpose |
|----------|---------|
| [TECHNICAL_RUNBOOK.md](documentation/TECHNICAL_RUNBOOK.md) | Infrastructure overview, emergency procedures, maintenance schedule |
| [KBA-001](documentation/KBAs/KBA-001-Komodo-GitOps-Stack-Deployment-Failures.md) | Troubleshooting Git-linked stack failures |
| [SOP-001](documentation/SOPs/SOP-001-Migrate-Stack-from-UI-to-Git.md) | Step-by-step guide to migrate stacks to GitOps |
| [Node READMEs](nodes/) | Hardware specs and service details per node |

---

## 🛡️ Security & Best Practices

### Secrets Management

- ❌ **NEVER** commit passwords, API keys, or tokens to Git
- ✅ **DO** use Komodo Environment Variables for secrets
- ✅ **DO** use Gitea App Tokens for authentication (avoids SSH key exchange issues)

Example:

```yaml
# In Git (compose.yaml)
environment:
  - PLEX_CLAIM=${PLEX_CLAIM}  # Placeholder

# In Komodo UI → Stack → Environment Variables
PLEX_CLAIM=claim-xxxxxxxxx
```
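
One way to back up the "never commit secrets" rule is a check that runs before each commit. The sketch below is a heuristic, not a full secret scanner: the function name `find_hardcoded_secrets` and the list of name fragments (TOKEN, KEY, SECRET, PASSWORD, CLAIM) are assumptions to adapt to your own stacks.

```bash
# Hypothetical pre-commit check: print compose.yaml lines that assign a literal
# value to a secret-looking variable instead of a ${VAR} placeholder.
# Returns 0 when at least one suspicious line is found, 1 when the file is clean.
# Limitation: a quoted placeholder such as "${VAR}" is also reported.
find_hardcoded_secrets() {
  grep -nE '^[[:space:]]*-?[[:space:]]*[A-Z0-9_]*(TOKEN|KEY|SECRET|PASSWORD|CLAIM)[A-Z0-9_]*[=:][[:space:]]*[^$[:space:]]' "$1"
}
```

A hook could call it as `if find_hardcoded_secrets nodes/waldorf/plex/compose.yaml; then exit 1; fi`, so the commit is refused whenever a literal value shows up.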

### NFS Mount Configuration

**Critical:** Raspberry Pi requires NFSv3 (not v4) due to ID-domain mismatches:

```bash
# /etc/fstab on Watchtower
10.0.0.250:/Volume1/appdata /mnt/appdata nfs rw,nfsvers=3,hard,intr,x-systemd.automount,nolock 0 0
```
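
Since the NFSv3 pin is what keeps the Pi's mount stable, it may be worth verifying that it is still present after any fstab edit. This is a minimal sketch; `fstab_uses_nfsv3` is a hypothetical helper, and it treats both `nfsvers=3` and the equivalent `vers=3` spelling as valid:

```bash
# Hypothetical check: succeed only if the fstab entry for the given mount point
# is an NFS mount that pins version 3. Reads /etc/fstab unless a file is given.
fstab_uses_nfsv3() {
  local mountpoint="$1" fstab="${2:-/etc/fstab}"
  awk -v mp="$mountpoint" '
    # Skip comment lines; match on mount point (field 2) and fs type (field 3).
    $1 !~ /^#/ && $2 == mp && $3 ~ /^nfs/ {
      n = split($4, opts, ",")
      for (i = 1; i <= n; i++)
        if (opts[i] == "nfsvers=3" || opts[i] == "vers=3") found = 1
    }
    END { exit(found ? 0 : 1) }
  ' "$fstab"
}
```

On Watchtower, `fstab_uses_nfsv3 /mnt/appdata && echo "NFSv3 pinned"` would confirm the entry shown above.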

---

## 🔥 Emergency Procedures

### NFS Mount Failure

```bash
# Check connectivity
ping 10.0.0.250

# Remount
sudo umount /mnt/appdata
sudo mount -a
df -h | grep appdata
```

### Komodo Periphery Offline

```bash
# Check WebSocket connectivity
curl -v ws://10.0.0.151:9120

# Restart agent
docker restart komodo-periphery
docker logs -f komodo-periphery
```

### Traefik SSL Certificate Issues

```bash
# Check Cloudflare API token
docker exec traefik cat /etc/traefik/traefik.yml

# Force certificate renewal
docker restart traefik
docker logs traefik | grep -i "cloudflare\|certificate"
```

---

## 🤝 Contributing

This is a personal homelab, but documentation improvements and issue reports are welcome!

1. Fork via Gitea: `https://git.castaldifamily.com/nathan/homelab`
2. Create feature branch: `git checkout -b feat/my-improvement`
3. Commit using [Conventional Commits](https://www.conventionalcommits.org/)
4. Push and create Pull Request

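
To make step 3 easier to follow, a commit subject can be checked against the Conventional Commits shape before pushing. The sketch below is illustrative: `is_conventional_commit` is a hypothetical helper, and the list of types is the commonly used set rather than anything this repository defines.

```bash
# Hypothetical commit-msg check: accept subjects such as
# "feat(stacks): add Sonarr to Waldorf" or "docs: update runbook".
# Only the first line of the message is inspected.
is_conventional_commit() {
  printf '%s\n' "$1" | head -n 1 | grep -qE \
    '^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)(\([a-z0-9._/-]+\))?!?: .+'
}
```

For instance, `is_conventional_commit "$(git log -1 --pretty=%s)" || echo "subject needs a type prefix"` checks the most recent commit.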

---

## 📊 Stats

- **Total Nodes:** 4 (3 compute + 1 storage)
- **Container Orchestration:** Komodo v2.1.2
- **Active Services:** 8+ (Traefik, Plex, Tunarr, Gitea, etc.)
- **Total RAM:** 48GB (across compute nodes)
- **GPU Acceleration:** NVIDIA GTX 1060 Mobile (6GB)
- **Storage:** TerraMaster NAS + local node storage

---

## 📜 License

Personal infrastructure configuration. Documentation licensed under [CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/).

---

**Maintained by:** Nathan Castaldi
**Last Updated:** April 12, 2026
**Status:** 🟢 Operational