homelab/.github/prompts/plan-homelabMcpGatewayMvp.prompt.md

Plan: Homelab MCP Gateway MVP with Traefik Shard

TL;DR

Build a modular MCP (Model Context Protocol) Gateway on Waldorf that routes tool requests to specialized shards. The MVP includes the Traefik shard (for dynamic route management) plus a template for creating additional shards. Each shard can fetch its service's documentation from the internet on demand.

Approach: Python using mcp.server.fastmcp, deployed from a single compose.yaml on Waldorf, with no authentication (the internal network is trusted) and on-demand web fetching for live documentation.


Steps

Phase 1: Infrastructure Setup

  1. Create unified directory structure on Waldorf

    • /nodes/waldorf/mcp-system/ with single compose.yaml
    • /nodes/waldorf/mcp-system/gateway/ for Gateway code
    • /nodes/waldorf/mcp-system/traefik-shard/ for Traefik Shard code
  2. Create shared template directory (parallel with step 1)

    • /mcp_root/template/ for shard template files
    • Documentation: /mcp_root/template/README.md

Phase 2: Gateway Implementation

  3. Build Gateway core functionality (depends on step 1)

    • Shard registry (discover and register shards)
    • Tool routing (forward requests to appropriate shard)
    • Health check aggregation
    • Startup logic to discover available shards
  4. Create Gateway Dockerfile and requirements.txt (parallel with step 3)

    • Python 3.11 base image
    • Install mcp, httpx, pyyaml
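
The registry, routing, and health aggregation pieces of the Gateway core can be sketched in plain Python. This is a minimal illustration, not the planned implementation: the `Shard` and `ShardRegistry` names, the `base_url` field, and the shape of the health report are all assumptions, and the actual HTTP forwarding to the shard is left out.

```python
from dataclasses import dataclass


@dataclass
class Shard:
    """One registered shard: where it lives and which tools it exposes."""
    name: str
    base_url: str  # hypothetical, e.g. a service name on the shared Docker network
    tools: list[str]
    healthy: bool = True


class ShardRegistry:
    """Maps tool names to the shard that owns them."""

    def __init__(self) -> None:
        self._shards: dict[str, Shard] = {}
        self._tool_owner: dict[str, str] = {}

    def register(self, shard: Shard) -> None:
        # Refuse a tool name that another shard already owns, so routing stays unambiguous.
        for tool in shard.tools:
            owner = self._tool_owner.get(tool)
            if owner is not None and owner != shard.name:
                raise ValueError(f"tool {tool!r} already registered by {owner!r}")
        # Drop tools left over from a previous registration of the same shard.
        self._tool_owner = {t: o for t, o in self._tool_owner.items() if o != shard.name}
        self._shards[shard.name] = shard
        for tool in shard.tools:
            self._tool_owner[tool] = shard.name

    def route(self, tool: str) -> Shard:
        """Return the shard that should receive a call to `tool`."""
        try:
            return self._shards[self._tool_owner[tool]]
        except KeyError:
            raise LookupError(f"no shard provides tool {tool!r}") from None

    def health(self) -> dict:
        """Aggregate per-shard health flags into one report."""
        shards = {name: s.healthy for name, s in self._shards.items()}
        status = "ok" if all(shards.values()) else "degraded"
        return {"status": status, "shards": shards}
```

Keeping a flat tool-to-shard map makes routing a single dictionary lookup, at the cost of requiring tool names to be unique across shards.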

Phase 3: Traefik Shard Implementation

  5. Implement Traefik Shard with 7 tools (depends on step 1)

    • list_routes - Query Traefik API for all routes
    • create_route - Write new YAML file to /dynamic/mcp-managed/
    • delete_route - Remove route YAML file
    • validate_config - YAML syntax check + Traefik API validation
    • get_backend_status - Health check backend services
    • check_ssl_status - Query Traefik API for cert info
    • reload_config - Trigger Traefik config reload (if needed)
  6. Add documentation fetcher to Traefik Shard (parallel with step 5)

    • Tool: get_traefik_docs(topic) - Fetch from doc.traefik.io (the current official docs host)
    • Use httpx to fetch and cache temporarily
    • Parse HTML/Markdown for relevant sections
  7. Implement shard registration with Gateway (depends on step 5)

    • Health endpoint for Gateway discovery
    • Tool manifest endpoint (list available tools)
  8. Create Traefik Shard Dockerfile and requirements.txt (depends on step 5)

    • Python 3.11 base image
    • Install mcp, httpx, pyyaml, beautifulsoup4
  9. Create unified compose.yaml (depends on steps 4, 8)

    • Gateway service with appdata mount
    • Traefik Shard service mounting the NFS-backed /mnt/appdata/traefik/dynamic read-write (:rw)
    • Shared Docker network for inter-shard communication
    • Environment: TRAEFIK_API_URL=http://10.0.0.151:8080/api (Traefik API on Heimdall)
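
The create_route and delete_route tools come down to writing and removing one small YAML file per route. The sketch below shows one way that could look using only the standard library. The function names, the `.yml` file naming, the validation patterns, and the example backend URL are assumptions; the YAML layout follows Traefik's dynamic configuration structure for HTTP routers and services.

```python
import re
from pathlib import Path

# Route names become file names, so restrict them to a safe character set.
_NAME_RE = re.compile(r"[a-z0-9][a-z0-9-]{0,62}")
_HOST_RE = re.compile(r"[A-Za-z0-9.-]+")
_URL_RE = re.compile(r"https?://[A-Za-z0-9.:/_-]+")


def render_route_yaml(name: str, host: str, backend_url: str) -> str:
    """Render a minimal dynamic config with one router and one service."""
    return (
        "http:\n"
        "  routers:\n"
        f"    {name}:\n"
        f"      rule: \"Host(`{host}`)\"\n"
        f"      service: {name}\n"
        "  services:\n"
        f"    {name}:\n"
        "      loadBalancer:\n"
        "        servers:\n"
        f"          - url: \"{backend_url}\"\n"
    )


def create_route(managed_dir: Path, name: str, host: str, backend_url: str) -> Path:
    """Validate the inputs, then write <name>.yml into the managed directory."""
    if not _NAME_RE.fullmatch(name):
        raise ValueError(f"invalid route name: {name!r}")
    if not _HOST_RE.fullmatch(host):
        raise ValueError(f"invalid host: {host!r}")
    if not _URL_RE.fullmatch(backend_url):
        raise ValueError(f"invalid backend URL: {backend_url!r}")
    target = managed_dir / f"{name}.yml"
    tmp = managed_dir / f"{name}.yml.tmp"
    # Write under a temporary name first, then rename, so a reader is
    # unlikely to pick up a half-written file.
    tmp.write_text(render_route_yaml(name, host, backend_url), encoding="utf-8")
    tmp.replace(target)
    return target


def delete_route(managed_dir: Path, name: str) -> bool:
    """Remove <name>.yml; return False if it did not exist."""
    if not _NAME_RE.fullmatch(name):
        raise ValueError(f"invalid route name: {name!r}")
    target = managed_dir / f"{name}.yml"
    if target.exists():
        target.unlink()
        return True
    return False
```

Validating the name before building the path is what keeps a request such as `../evil` from escaping the mcp-managed directory.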

Phase 4: Prepare Traefik Integration

  10. Create /mnt/appdata/traefik/dynamic/mcp-managed/ directory (depends on step 9)

    • Isolated folder for MCP-managed routes (safer, easier cleanup)
    • Traefik file watcher will auto-detect changes here
  11. Verify write access to the Traefik dynamic config directory (parallel with step 10)

    • Confirm NFS mount on Waldorf allows writes to /mnt/appdata/traefik/dynamic/
    • If needed, update Traefik mount from :ro to :rw in nodes/heimdall/core/compose.yaml
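
The write check can be scripted with a small probe that creates and removes a file, which is more reliable on network filesystems than inspecting permission bits. This is a sketch; the `can_write` name and the probe file naming are assumptions.

```python
import uuid
from pathlib import Path


def can_write(directory: Path) -> bool:
    """Return True if a file can be created and removed in `directory`."""
    # A dot-prefixed name without a YAML extension, so it is not a route file.
    probe = directory / f".write-probe-{uuid.uuid4().hex}"
    try:
        probe.write_text("probe", encoding="utf-8")
        probe.unlink()
        return True
    except OSError:
        # Covers a missing directory, a read-only mount, and permission errors.
        return False
```

Running `can_write(Path("/mnt/appdata/traefik/dynamic/mcp-managed"))` from inside the shard container would confirm that the mount is writable from where the shard actually runs.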

Phase 5: Shard Template Creation

  12. Create comprehensive shard template (depends on steps 5-7)

    • template/shard_template.py - Skeleton MCP server
    • template/Dockerfile.template - Standard container build
    • template/compose.yaml.template - Docker compose service boilerplate
    • template/requirements.txt - Common dependencies
  13. Write template documentation (parallel with step 12)

    • /mcp_root/template/README.md - How to create a new shard
    • /mcp_root/template/INTEGRATION.md - How shards register with Gateway
    • /mcp_root/ARCHITECTURE.md - Overall system design
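
As a possible starting point for INTEGRATION.md, the health and tool manifest endpoints a shard exposes for Gateway discovery could look like the sketch below. It uses only the standard library and would sit alongside the shard's MCP server. The plan does not fix a wire format, so the `/health` and `/tools` paths, the JSON shapes, and the shortened manifest are assumptions.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical manifest: tool names come from the plan, the list is shortened.
TOOLS = [
    {"name": "list_routes", "description": "Query Traefik API for all routes"},
    {"name": "get_traefik_docs", "description": "Fetch Traefik documentation for a topic"},
]


class ShardHandler(BaseHTTPRequestHandler):
    """Serves the two read-only discovery endpoints."""

    def do_GET(self) -> None:
        if self.path == "/health":
            self._send_json(200, {"status": "ok", "shard": "traefik"})
        elif self.path == "/tools":
            self._send_json(200, {"tools": TOOLS})
        else:
            self._send_json(404, {"error": "not found"})

    def _send_json(self, status: int, payload: dict) -> None:
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format: str, *args) -> None:
        pass  # keep the sketch quiet; a real shard would log requests


def serve(host: str = "127.0.0.1", port: int = 0) -> ThreadingHTTPServer:
    """Start the endpoints in a background thread; port 0 picks a free port."""
    server = ThreadingHTTPServer((host, port), ShardHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Inside a container the shard would bind to `0.0.0.0` on a fixed port so the Gateway can reach it over the shared Docker network; the loopback default here only keeps the sketch safe to run locally.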

Phase 6: Deployment & Validation

  14. Deploy unified MCP system on Waldorf (depends on steps 9, 10)

    • docker compose up in /nodes/waldorf/mcp-system/
    • Verify Gateway logs show successful startup and shard discovery
    • Verify Traefik Shard registers successfully
  15. Test tool execution (depends on step 14)

    • Gateway → list_routes → Traefik Shard → Traefik API (Heimdall)
    • Create test route for validation
    • Verify documentation fetcher works
  16. Integration with Open WebUI (depends on step 15)

    • Update /nodes/waldorf/openwebui/compose.yaml to connect to MCP Gateway
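    • Configure MCP Gateway connection in Open WebUI (same host; if Open WebUI runs in a container without host networking, localhost points at that container, so use 10.0.0.251 or a shared Docker network instead)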
    • Test end-to-end LLM → Gateway → Shard flow
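
The "depends on" notes in the steps form a dependency graph over steps 1 to 16, counted consecutively across phases. A valid execution order can be checked mechanically with the standard library's `graphlib`. The sketch below transcribes only the stated dependencies; "parallel with" notes are treated as having no ordering constraint, which is an interpretation rather than something the plan states.

```python
from graphlib import TopologicalSorter

# Step number -> steps it depends on, transcribed from the "depends on" notes.
DEPENDS_ON = {
    1: set(), 2: set(),                            # Phase 1
    3: {1}, 4: set(),                              # Phase 2
    5: {1}, 6: set(), 7: {5}, 8: {5}, 9: {4, 8},   # Phase 3
    10: {9}, 11: set(),                            # Phase 4
    12: {5, 6, 7}, 13: set(),                      # Phase 5
    14: {9, 10}, 15: {14}, 16: {15},               # Phase 6
}


def execution_order(graph: dict[int, set[int]]) -> list[int]:
    """Return one order in which every step follows its dependencies.

    Raises graphlib.CycleError if the dependencies contain a cycle.
    """
    return list(TopologicalSorter(graph).static_order())
```

Calling `execution_order(DEPENDS_ON)` yields one valid ordering. Several orderings are possible because steps without a dependency between them can run in either order.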

Relevant Files

  • ansible/archive/scripts/ansible_mcp_server.py - Reference implementation showing MCP server patterns, job tracking, configuration
  • nodes/heimdall/core/compose.yaml - Contains Traefik service definition (lines 10-50), needs mount permission update
  • nodes/waldorf/openwebui/compose.yaml - Open WebUI config with commented MCP Gateway integration (lines 15-17)
  • ansible/archive/outputs/heimdall-baseline-20260312T214117/traefik_configs/traefik.yml - Static Traefik config showing API endpoint, providers, file watch
  • ansible/archive/outputs/heimdall-baseline-20260312T214117/traefik_configs/static-backends.yml - Example dynamic route structure to replicate
  • ansible/archive/outputs/heimdall-baseline-20260312T214117/traefik_configs/middleware.yml - Existing middleware definitions to reference

Verification

  1. Gateway Health Check: curl http://10.0.0.251:9100/health returns shard registry
  2. Shard Registration: Gateway logs show Traefik shard discovered and registered
  3. Tool Execution: Call list_routes through Gateway, receive Traefik API response
  4. Route Creation: Create test route test.castaldifamily.com → Appears in Traefik dashboard
  5. Documentation Fetcher: Call get_traefik_docs("middlewares") → Returns relevant Traefik docs
  6. File Validation: Check /mnt/appdata/traefik/dynamic/mcp-managed/ contains created routes
  7. Traefik Reload: Verify Traefik auto-detects new YAML files (file watch enabled)
  8. Open WebUI Integration: Send message in Open WebUI that triggers MCP tool → See logs in Gateway
  9. Template Usability: Follow template README to create a stub "Dozzle Shard" → Registers successfully

Decisions

  • Language: Python (mcp.server.fastmcp) - matches existing Ansible MCP server pattern
  • Deployment Location: All components on Waldorf (10.0.0.251) - stable 24/7 node with 16GB RAM, runs Open WebUI
  • Single Compose File: Gateway + all shards in one compose.yaml - simpler MVP, easier debugging
  • Traefik Access: Shard reaches Traefik API on Heimdall via http://10.0.0.151:8080/api, writes to shared NFS mount /mnt/appdata/traefik/dynamic/
  • Authentication: None for MVP - trust internal network isolation (add in future if needed)
  • Documentation Fetching: On-demand web fetching using httpx - fetch from official service docs when tool is called
  • Route Management: Create isolated /mcp-managed/ subdirectory in Traefik dynamic config - safer than mixing with existing routes
  • All 7 Traefik tools included: list_routes, create_route, delete_route, validate_config, get_backend_status, check_ssl_status, reload_config
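
For the on-demand documentation fetching decision, the temporary cache could be a small time-to-live map in front of the fetch call. The sketch below is one way to do it; the `TTLCache` name and its interface are assumptions, and the fetch function is passed in so that the real shard could supply an httpx-based fetcher.

```python
import time
from typing import Callable


class TTLCache:
    """Tiny in-memory cache whose entries expire after `ttl` seconds."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: dict[str, tuple[float, str]] = {}

    def get_or_fetch(self, key: str, fetch: Callable[[str], str]) -> str:
        """Return the cached value for `key`, calling `fetch(key)` on a miss or expiry."""
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        value = fetch(key)
        self._entries[key] = (now, value)
        return value
```

A tool such as get_traefik_docs(topic) would then call `cache.get_or_fetch(topic, fetch_docs)`, where `fetch_docs` is a hypothetical function that downloads and trims the relevant page.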

Scope Boundaries

Included:

  • MCP Gateway with shard discovery and routing
  • Complete Traefik shard with 7 tools + documentation fetcher
  • Comprehensive template for creating new shards
  • Integration with Open WebUI
  • Single docker-compose deployment on Waldorf

Excluded:

  • Additional shards (Dozzle, Authentik) - future work, use template to create
  • Authentication/authorization - trust network for MVP
  • Monitoring/metrics collection - add later if needed
  • Web UI for Gateway management - CLI/API only for MVP
  • Advanced caching for documentation - simple in-memory cache only
  • Cross-node service mesh networking - direct HTTP between containers
  • Ansible playbook for automated deployment - manual docker compose for MVP

Further Considerations

None - all clarifications obtained. Ready for implementation.