# Plan: Homelab MCP Gateway MVP with Traefik Shard

## TL;DR

Build a modular MCP (Model Context Protocol) Gateway on Waldorf that routes tool requests to specialized shards. The MVP includes the Traefik shard (for dynamic route management) plus a template for creating additional shards. Each shard can fetch its service's documentation from the internet on demand.

**Approach:** Python-based, using `mcp.server.fastmcp`; deployed via a single Docker Compose file on Waldorf; no authentication (the internal network is trusted); live documentation fetched from the web.

---

## Steps

### Phase 1: Infrastructure Setup

1. Create unified directory structure on Waldorf
- `/nodes/waldorf/mcp-system/` with single `compose.yaml`
- `/nodes/waldorf/mcp-system/gateway/` for Gateway code
- `/nodes/waldorf/mcp-system/traefik-shard/` for Traefik Shard code

2. Create shared template directory (*parallel with step 1*)
- `/mcp_root/template/` for shard template files
- Documentation: `/mcp_root/template/README.md`

### Phase 2: Gateway Implementation

3. Build Gateway core functionality (*depends on step 1*)
- Shard registry (discover and register shards)
- Tool routing (forward requests to appropriate shard)
- Health check aggregation
- Startup logic to discover available shards

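
The registry, routing, and health aggregation in step 3 could be sketched roughly as follows. This is a minimal in-memory illustration, not the planned implementation: the `Shard` fields, the `ShardRegistry` class, and the "ok"/"degraded" status strings are all assumptions, and real discovery would fill the registry from the shards' own endpoints rather than from hard-coded values.

```python
from dataclasses import dataclass, field


@dataclass
class Shard:
    """One registered shard. All fields are illustrative assumptions."""
    name: str
    base_url: str
    tools: list[str] = field(default_factory=list)
    healthy: bool = True


class ShardRegistry:
    """Maps tool names to the shard that serves them."""

    def __init__(self) -> None:
        self._shards: dict[str, Shard] = {}
        self._tool_index: dict[str, str] = {}

    def register(self, shard: Shard) -> None:
        # Reject a tool name already owned by a different shard,
        # so routing stays unambiguous.
        for tool in shard.tools:
            owner = self._tool_index.get(tool)
            if owner is not None and owner != shard.name:
                raise ValueError(f"tool {tool!r} already owned by {owner!r}")
        # Drop stale entries from an earlier registration of the same shard.
        self._tool_index = {
            t: n for t, n in self._tool_index.items() if n != shard.name
        }
        self._shards[shard.name] = shard
        self._tool_index.update({tool: shard.name for tool in shard.tools})

    def route(self, tool: str) -> Shard:
        # Look up the owning shard; an unknown tool is a caller error.
        try:
            return self._shards[self._tool_index[tool]]
        except KeyError:
            raise LookupError(f"no shard provides tool {tool!r}") from None

    def health(self) -> dict:
        # The gateway reports "ok" only if every registered shard is healthy.
        shards = {name: s.healthy for name, s in self._shards.items()}
        status = "ok" if all(shards.values()) else "degraded"
        return {"status": status, "shards": shards}
```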
4. Create Gateway Dockerfile and requirements.txt (*parallel with step 3*)
- Python 3.11 base image
- Install mcp, httpx, pyyaml

### Phase 3: Traefik Shard Implementation

5. Implement Traefik Shard with 7 tools (*depends on step 1*)
- `list_routes` - Query Traefik API for all routes
- `create_route` - Write new YAML file to `/dynamic/mcp-managed/`
- `delete_route` - Remove route YAML file
- `validate_config` - YAML syntax check + Traefik API validation
- `get_backend_status` - Health check backend services
- `check_ssl_status` - Query Traefik API for cert info
- `reload_config` - Trigger Traefik config reload (if needed)

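
As a rough illustration of what `create_route` and `delete_route` in step 5 might do internally, the sketch below builds a Traefik dynamic-configuration mapping (one router, one load-balanced service) and derives a safe file path for it. The helper names (`route_file`, `build_route`), the one-file-per-route layout, the `.yml` suffix, and the route-name rule are assumptions made for this example. Serializing the mapping to disk is left out; PyYAML's `yaml.safe_dump` would be one option, since `pyyaml` is already in the shard's dependencies.

```python
import re
from pathlib import Path

# Directory named in the plan; the shard would reach it through its mount.
MANAGED_DIR = Path("/dynamic/mcp-managed")

# Assumed naming rule: lowercase letters, digits, and hyphens only.
_NAME_RE = re.compile(r"[a-z0-9][a-z0-9-]{0,62}")


def route_file(name: str, base: Path = MANAGED_DIR) -> Path:
    """Return the file path for a route, rejecting names that could escape base."""
    if not _NAME_RE.fullmatch(name):
        raise ValueError(f"invalid route name: {name!r}")
    return base / f"{name}.yml"


def build_route(name: str, host: str, backend_url: str) -> dict:
    """Build a dynamic-config mapping with one HTTP router and one service."""
    return {
        "http": {
            "routers": {
                name: {
                    "rule": f"Host(`{host}`)",
                    "service": name,
                }
            },
            "services": {
                name: {"loadBalancer": {"servers": [{"url": backend_url}]}}
            },
        }
    }


def delete_route(name: str, base: Path = MANAGED_DIR) -> bool:
    """Remove a route file; return False if it did not exist."""
    path = route_file(name, base)
    if not path.exists():
        return False
    path.unlink()
    return True
```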
6. Add documentation fetcher to Traefik Shard (*parallel with step 5*)
- Tool: `get_traefik_docs(topic)` - Fetch from docs.traefik.io
- Use httpx to fetch pages and cache them temporarily
- Parse HTML/Markdown for relevant sections

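
The "fetch and cache temporarily" behaviour in step 6 could look something like the time-to-live cache below. It is a sketch under assumptions: the `TTLCache` class and its `get_or_fetch` method are hypothetical names, and the `fetch` callable stands in for whatever httpx request the shard ends up making. Passing the fetcher and the clock in as arguments keeps the cache testable without network access.

```python
import time
from typing import Callable


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl seconds."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl
        self._clock = clock
        # Maps key -> (time stored, cached value).
        self._entries: dict[str, tuple[float, str]] = {}

    def get_or_fetch(self, key: str, fetch: Callable[[str], str]) -> str:
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # Still fresh: skip the network round trip.
        value = fetch(key)  # In the shard this would be an httpx request.
        self._entries[key] = (now, value)
        return value
```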
7. Implement shard registration with Gateway (*depends on step 5*)
- Health endpoint for Gateway discovery
- Tool manifest endpoint (list available tools)

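
One possible shape for the two endpoints in step 7 is sketched below using only the standard library. The plan does not fix the registration protocol, so the `/tools` path, the JSON field names, and the `serve` helper with its default port 9101 are all assumptions. `/health` mirrors the path used for the Gateway check in the Verification section.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

SHARD_NAME = "traefik"  # Hypothetical shard identifier.
TOOLS = [
    "list_routes", "create_route", "delete_route", "validate_config",
    "get_backend_status", "check_ssl_status", "reload_config",
    "get_traefik_docs",
]


def respond(path: str) -> tuple[int, dict]:
    """Map a request path to (HTTP status, JSON-serializable body)."""
    if path == "/health":
        return 200, {"shard": SHARD_NAME, "status": "ok"}
    if path == "/tools":  # Assumed manifest path.
        return 200, {"shard": SHARD_NAME, "tools": TOOLS}
    return 404, {"error": f"unknown path {path}"}


class ShardHandler(BaseHTTPRequestHandler):
    """Serves the health and tool-manifest endpoints as JSON."""

    def do_GET(self) -> None:
        status, body = respond(self.path)
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 9101) -> None:
    """Run the endpoints until interrupted; the port is an assumption."""
    HTTPServer(("0.0.0.0", port), ShardHandler).serve_forever()
```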
8. Create Traefik Shard Dockerfile and requirements.txt (*depends on step 5*)
- Python 3.11 base image
- Install mcp, httpx, pyyaml, beautifulsoup4

9. Create unified docker-compose.yaml (*depends on steps 4, 8*)
- Gateway service with appdata mount
- Traefik Shard service with NFS mount to `/mnt/appdata/traefik/dynamic:rw`
- Shared Docker network for inter-shard communication
- Environment: `TRAEFIK_API_URL=http://10.0.0.151:8080/api` (reach Heimdall)

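
To show how the shard might consume `TRAEFIK_API_URL` from step 9, here is a hedged sketch of `list_routes`. It assumes the Traefik API's `/http/routers` endpoint returns a JSON list of router objects with `name`, `rule`, and `service` keys. `http_get_json` is a standard-library stand-in for the httpx call the shard would more likely use, and both helper signatures are illustrative.

```python
import json
import os
from typing import Callable
from urllib.request import urlopen

# Default mirrors the value given for the compose environment in the plan.
API_URL = os.environ.get("TRAEFIK_API_URL", "http://10.0.0.151:8080/api")


def http_get_json(url: str):
    """Fetch a URL and decode its JSON body (stand-in for an httpx call)."""
    with urlopen(url, timeout=5) as resp:
        return json.load(resp)


def list_routes(
    get_json: Callable[[str], list] = http_get_json,
    api_url: str = API_URL,
) -> list[dict]:
    """Return name, rule, and service for each HTTP router Traefik reports."""
    routers = get_json(f"{api_url.rstrip('/')}/http/routers")
    return [
        {"name": r["name"], "rule": r.get("rule"), "service": r.get("service")}
        for r in routers
    ]
```

Accepting the fetch function as a parameter lets the tool be exercised with canned responses before the shard can actually reach Heimdall.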
### Phase 4: Prepare Traefik Integration

10. Create `/mnt/appdata/traefik/dynamic/mcp-managed/` directory (*depends on step 9*)
- Isolated folder for MCP-managed routes (safer, easier cleanup)
- Traefik file watcher will auto-detect changes here

11. Verify Traefik allows write access (*parallel with step 10*)
- Confirm NFS mount on Waldorf allows writes to `/mnt/appdata/traefik/dynamic/`
- If needed, update Traefik mount from `:ro` to `:rw` in `nodes/heimdall/core/compose.yaml`

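
The write-access check in step 11 can be done by hand, but a small probe like the one below could also run at shard startup. This is a sketch: `is_writable` is a hypothetical helper, and it assumes that briefly creating a temporary file with no YAML extension in the directory is acceptable there.

```python
import os
import tempfile
from pathlib import Path


def is_writable(directory: Path) -> bool:
    """Return True if a file can be created and removed inside directory."""
    try:
        fd, name = tempfile.mkstemp(prefix=".mcp-write-test-", dir=directory)
    except OSError:
        # Covers a missing directory, a read-only mount, and permission errors.
        return False
    os.close(fd)
    os.unlink(name)
    return True
```

For example, `is_writable(Path("/mnt/appdata/traefik/dynamic/mcp-managed"))` returning `False` on Waldorf would suggest the mount is still read-only or the directory has not been created yet.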
### Phase 5: Shard Template Creation

12. Create comprehensive shard template (*depends on steps 5-7*)
- `template/shard_template.py` - Skeleton MCP server
- `template/Dockerfile.template` - Standard container build
- `template/compose.yaml.template` - Docker Compose service boilerplate
- `template/requirements.txt` - Common dependencies

13. Write template documentation (*parallel with step 12*)
- `/mcp_root/template/README.md` - How to create a new shard
- `/mcp_root/template/INTEGRATION.md` - How shards register with Gateway
- `/mcp_root/ARCHITECTURE.md` - Overall system design

### Phase 6: Deployment & Validation

14. Deploy unified MCP system on Waldorf (*depends on steps 9, 10*)
- `docker compose up` in `/nodes/waldorf/mcp-system/`
- Verify Gateway logs show successful startup and shard discovery
- Verify Traefik Shard registers successfully

15. Test tool execution (*depends on step 14*)
- Gateway → `list_routes` → Traefik Shard → Traefik API (Heimdall)
- Create test route for validation
- Verify documentation fetcher works

16. Integration with Open WebUI (*depends on step 15*)
- Update `/nodes/waldorf/openwebui/compose.yaml` to connect to MCP Gateway
- Configure MCP Gateway connection in Open WebUI (localhost since same host)
- Test end-to-end LLM → Gateway → Shard flow

---

## Relevant Files

- `ansible/archive/scripts/ansible_mcp_server.py` - Reference implementation showing MCP server patterns, job tracking, configuration
- `nodes/heimdall/core/compose.yaml` - Contains Traefik service definition (lines 10-50), needs mount permission update
- `nodes/waldorf/openwebui/compose.yaml` - Open WebUI config with commented MCP Gateway integration (lines 15-17)
- `ansible/archive/outputs/heimdall-baseline-20260312T214117/traefik_configs/traefik.yml` - Static Traefik config showing API endpoint, providers, file watch
- `ansible/archive/outputs/heimdall-baseline-20260312T214117/traefik_configs/static-backends.yml` - Example dynamic route structure to replicate
- `ansible/archive/outputs/heimdall-baseline-20260312T214117/traefik_configs/middleware.yml` - Existing middleware definitions to reference

---

## Verification

1. **Gateway Health Check**: `curl http://10.0.0.251:9100/health` returns shard registry
2. **Shard Registration**: Gateway logs show Traefik shard discovered and registered
3. **Tool Execution**: Call `list_routes` through Gateway, receive Traefik API response
4. **Route Creation**: Create test route `test.castaldifamily.com` → Appears in Traefik dashboard
5. **Documentation Fetcher**: Call `get_traefik_docs("middlewares")` → Returns relevant Traefik docs
6. **File Validation**: Check `/mnt/appdata/traefik/dynamic/mcp-managed/` contains created routes
7. **Traefik Reload**: Verify Traefik auto-detects new YAML files (file watch enabled)
8. **Open WebUI Integration**: Send message in Open WebUI that triggers MCP tool → See logs in Gateway
9. **Template Usability**: Follow template README to create a stub "Dozzle Shard" → Registers successfully

---

## Decisions

- **Language**: Python (`mcp.server.fastmcp`) - matches existing Ansible MCP server pattern
- **Deployment Location**: All components on Waldorf (10.0.0.251) - stable 24/7 node with 16GB RAM, runs Open WebUI
- **Single Compose File**: Gateway + all shards in one docker-compose.yaml - simpler MVP, easier debugging
- **Traefik Access**: Shard reaches Traefik API on Heimdall via `http://10.0.0.151:8080/api`, writes to shared NFS mount `/mnt/appdata/traefik/dynamic/`
- **Authentication**: None for MVP - trust internal network isolation (add in future if needed)
- **Documentation Fetching**: On-demand web fetching using httpx - fetch from official service docs when tool is called
- **Route Management**: Create isolated `/mcp-managed/` subdirectory in Traefik dynamic config - safer than mixing with existing routes
- **All 7 Traefik tools included**: `list_routes`, `create_route`, `delete_route`, `validate_config`, `get_backend_status`, `check_ssl_status`, `reload_config`

---

## Scope Boundaries

**Included:**
- MCP Gateway with shard discovery and routing
- Complete Traefik shard with 7 tools + documentation fetcher
- Comprehensive template for creating new shards
- Integration with Open WebUI
- Single docker-compose deployment on Waldorf

**Excluded:**
- Additional shards (Dozzle, Authentik) - future work, use template to create
- Authentication/authorization - trust network for MVP
- Monitoring/metrics collection - add later if needed
- Web UI for Gateway management - CLI/API only for MVP
- Advanced caching for documentation - simple in-memory cache only
- Cross-node service mesh networking - direct HTTP between containers
- Ansible playbook for automated deployment - manual docker compose for MVP

---

## Further Considerations

None - all clarifications obtained. Ready for implementation.