# Plan: Homelab MCP Gateway MVP with Traefik Shard

## TL;DR

Build a modular MCP (Model Context Protocol) Gateway on Waldorf that routes tool requests to specialized shards. MVP includes the Traefik shard (for dynamic route management) plus a template for creating additional shards. Each shard can fetch its service's documentation from the internet on-demand.

**Approach:** Python-based using mcp.server.fastmcp, deploy via single docker-compose on Waldorf, no authentication (trust internal network), web fetching for live documentation.

---

## Steps

### Phase 1: Infrastructure Setup

1. Create unified directory structure on Waldorf
   - `/nodes/waldorf/mcp-system/` with single compose.yaml
   - `/nodes/waldorf/mcp-system/gateway/` for Gateway code
   - `/nodes/waldorf/mcp-system/traefik-shard/` for Traefik Shard code

2. Create shared template directory (*parallel with step 1*)
   - `/mcp_root/template/` for shard template files
   - Documentation: `/mcp_root/template/README.md`

### Phase 2: Gateway Implementation

3. Build Gateway core functionality (*depends on step 1*)
   - Shard registry (discover and register shards)
   - Tool routing (forward requests to appropriate shard)
   - Health check aggregation
   - Startup logic to discover available shards

4. Create Gateway Dockerfile and requirements.txt (*parallel with step 3*)
   - Python 3.11 base image
   - Install mcp, httpx, pyyaml

### Phase 3: Traefik Shard Implementation

5. Implement Traefik Shard with 7 tools (*depends on step 1*)
   - `list_routes` - Query Traefik API for all routes
   - `create_route` - Write new YAML file to `/dynamic/mcp-managed/`
   - `delete_route` - Remove route YAML file
   - `validate_config` - YAML syntax check + Traefik API validation
   - `get_backend_status` - Health check backend services
   - `check_ssl_status` - Query Traefik API for cert info
   - `reload_config` - Trigger Traefik config reload (if needed)

6. Add documentation fetcher to Traefik Shard (*parallel with step 5*)
   - Tool: `get_traefik_docs(topic)` - Fetch from docs.traefik.io
   - Use httpx to fetch and cache temporarily
   - Parse HTML/Markdown for relevant sections

7. Implement shard registration with Gateway (*depends on step 5*)
   - Health endpoint for Gateway discovery
   - Tool manifest endpoint (list available tools)

8. Create Traefik Shard Dockerfile and requirements.txt (*depends on step 5*)
   - Python 3.11 base image
   - Install mcp, httpx, pyyaml, beautifulsoup4

9. Create unified docker-compose.yaml (*depends on steps 4, 8*)
   - Gateway service with appdata mount
   - Traefik Shard service with NFS mount to `/mnt/appdata/traefik/dynamic:rw`
   - Shared Docker network for inter-shard communication
   - Environment: `TRAEFIK_API_URL=http://10.0.0.151:8080/api` (reach Heimdall)

### Phase 4: Prepare Traefik Integration

10. Create `/mnt/appdata/traefik/dynamic/mcp-managed/` directory (*depends on step 9*)
    - Isolated folder for MCP-managed routes (safer, easier cleanup)
    - Traefik file watcher will auto-detect changes here

11. Verify Traefik allows write access (*parallel with step 10*)
    - Confirm NFS mount on Waldorf allows writes to `/mnt/appdata/traefik/dynamic/`
    - If needed, update Traefik mount from `:ro` to `:rw` in `nodes/heimdall/core/compose.yaml`

### Phase 5: Shard Template Creation

12. Create comprehensive shard template (*depends on steps 5-7*)
    - `template/shard_template.py` - Skeleton MCP server
    - `template/Dockerfile.template` - Standard container build
    - `template/compose.yaml.template` - Docker compose service boilerplate
    - `template/requirements.txt` - Common dependencies

13. Write template documentation (*parallel with step 12*)
    - `/mcp_root/template/README.md` - How to create a new shard
    - `/mcp_root/template/INTEGRATION.md` - How shards register with Gateway
    - `/mcp_root/ARCHITECTURE.md` - Overall system design

### Phase 6: Deployment & Validation

14. Deploy unified MCP system on Waldorf (*depends on steps 9, 10*)
    - `docker compose up` in `/nodes/waldorf/mcp-system/`
    - Verify Gateway logs show successful startup and shard discovery
    - Verify Traefik Shard registers successfully

15. Test tool execution (*depends on step 14*)
    - Gateway → list_routes → Traefik Shard → Traefik API (Heimdall)
    - Create test route for validation
    - Verify documentation fetcher works

16. Integration with Open WebUI (*depends on step 15*)
    - Update `/nodes/waldorf/openwebui/compose.yaml` to connect to MCP Gateway
    - Configure MCP Gateway connection in Open WebUI (localhost since same host)
    - Test end-to-end LLM → Gateway → Shard flow

---

## Relevant Files

- `ansible/archive/scripts/ansible_mcp_server.py` - Reference implementation showing MCP server patterns, job tracking, configuration
- `nodes/heimdall/core/compose.yaml` - Contains Traefik service definition (lines 10-50), needs mount permission update
- `nodes/waldorf/openwebui/compose.yaml` - Open WebUI config with commented MCP Gateway integration (lines 15-17)
- `ansible/archive/outputs/heimdall-baseline-20260312T214117/traefik_configs/traefik.yml` - Static Traefik config showing API endpoint, providers, file watch
- `ansible/archive/outputs/heimdall-baseline-20260312T214117/traefik_configs/static-backends.yml` - Example dynamic route structure to replicate
- `ansible/archive/outputs/heimdall-baseline-20260312T214117/traefik_configs/middleware.yml` - Existing middleware definitions to reference

---

## Verification

1. **Gateway Health Check**: `curl http://10.0.0.251:9100/health` returns shard registry
2. **Shard Registration**: Gateway logs show Traefik shard discovered and registered
3. **Tool Execution**: Call `list_routes` through Gateway, receive Traefik API response
4. **Route Creation**: Create test route `test.castaldifamily.com` → Appears in Traefik dashboard
5. **Documentation Fetcher**: Call `get_traefik_docs("middlewares")` → Returns relevant Traefik docs
6. **File Validation**: Check `/mnt/appdata/traefik/dynamic/mcp-managed/` contains created routes
7. **Traefik Reload**: Verify Traefik auto-detects new YAML files (file watch enabled)
8. **Open WebUI Integration**: Send message in Open WebUI that triggers MCP tool → See logs in Gateway
9. **Template Usability**: Follow template README to create a stub "Dozzle Shard" → Registers successfully

---

## Decisions

- **Language**: Python (mcp.server.fastmcp) - matches existing Ansible MCP server pattern
- **Deployment Location**: All components on Waldorf (10.0.0.251) - stable 24/7 node with 16GB RAM, runs Open WebUI
- **Single Compose File**: Gateway + all shards in one docker-compose.yaml - simpler MVP, easier debugging
- **Traefik Access**: Shard reaches Traefik API on Heimdall via `http://10.0.0.151:8080/api`, writes to shared NFS mount `/mnt/appdata/traefik/dynamic/`
- **Authentication**: None for MVP - trust internal network isolation (add in future if needed)
- **Documentation Fetching**: On-demand web fetching using httpx - fetch from official service docs when tool is called
- **Route Management**: Create isolated `/mcp-managed/` subdirectory in Traefik dynamic config - safer than mixing with existing routes
- **All 7 Traefik tools included**: list_routes, create_route, delete_route, validate_config, get_backend_status, check_ssl_status, reload_config

---

## Scope Boundaries

**Included:**

- MCP Gateway with shard discovery and routing
- Complete Traefik shard with 7 tools + documentation fetcher
- Comprehensive template for creating new shards
- Integration with Open WebUI
- Single docker-compose deployment on Waldorf

**Excluded:**

- Additional shards (Dozzle, Authentik) - future work, use template to create
- Authentication/authorization - trust network for MVP
- Monitoring/metrics collection - add later if needed
- Web UI for Gateway management - CLI/API only for MVP
- Advanced caching for documentation - simple in-memory cache only
- Cross-node service mesh networking - direct HTTP between containers
- Ansible playbook for automated deployment - manual docker compose for MVP

---

## Further Considerations

None - all clarifications obtained. Ready for implementation.
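
---

## Illustrative Sketch: `create_route`

The `create_route` tool from Phase 3 writes one YAML file per route into the isolated `mcp-managed/` directory, where the Traefik file watcher picks it up. The following is a minimal sketch of that file-writing core, not the planned implementation. It uses only the standard library (the plan itself lists pyyaml as a dependency), and the helper names `render_route_yaml` and `write_route`, the `mcp-` router-name prefix, the `.part` temp suffix, and the example backend URL are all assumptions made for illustration. Wrapping it as an MCP tool via mcp.server.fastmcp is left out.

```python
import os
import re
import tempfile
from pathlib import Path

# Hypothetical validation rules: keep names filesystem-safe and keep
# hosts/URLs free of characters that would break the YAML or the rule.
_NAME_RE = re.compile(r"[a-z0-9][a-z0-9-]{0,62}")
_HOST_RE = re.compile(r"[A-Za-z0-9.-]+")
_URL_RE = re.compile(r"https?://[^\s\"]+")


def render_route_yaml(name: str, host: str, backend_url: str) -> str:
    """Render a minimal Traefik dynamic-config document for one HTTP route."""
    if not _NAME_RE.fullmatch(name):
        raise ValueError(f"invalid route name: {name!r}")
    if not _HOST_RE.fullmatch(host):
        raise ValueError(f"invalid host: {host!r}")
    if not _URL_RE.fullmatch(backend_url):
        raise ValueError(f"invalid backend URL: {backend_url!r}")
    key = f"mcp-{name}"  # assumed prefix; also avoids YAML keys like "true" or "123"
    return (
        "http:\n"
        "  routers:\n"
        f"    {key}:\n"
        f'      rule: "Host(`{host}`)"\n'
        f"      service: {key}\n"
        "  services:\n"
        f"    {key}:\n"
        "      loadBalancer:\n"
        "        servers:\n"
        f'          - url: "{backend_url}"\n'
    )


def write_route(managed_dir: Path, name: str, host: str, backend_url: str) -> Path:
    """Write <name>.yml into managed_dir via a temp file and an atomic rename."""
    content = render_route_yaml(name, host, backend_url)
    managed_dir.mkdir(parents=True, exist_ok=True)
    target = managed_dir / f"{name}.yml"
    # Temp file lives in the same directory so os.replace stays a rename;
    # the non-YAML suffix is meant to keep the watcher from loading it.
    fd, tmp_path = tempfile.mkstemp(dir=managed_dir, prefix=".tmp-", suffix=".part")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(content)
        # mkstemp creates the file as 0600; relax it so another host
        # reading the shared mount can open the file.
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, target)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return target


if __name__ == "__main__":
    # Demo against a throwaway directory; the backend URL is a placeholder.
    with tempfile.TemporaryDirectory() as tmp:
        path = write_route(
            Path(tmp) / "mcp-managed",
            "test",
            "test.castaldifamily.com",
            "http://10.0.0.251:8080",
        )
        print(path.read_text(encoding="utf-8"))
```

Writing to a temporary file and renaming it means the watcher should never see a half-written route, and rejecting names outside a strict pattern keeps every write inside `mcp-managed/`. In the real shard, `managed_dir` would presumably point at the container path where `/mnt/appdata/traefik/dynamic/mcp-managed/` is mounted.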