homelab/.github/prompts/security-container-hardening.prompt.md
nathan 129b7eee1b Created Files
security-secrets-remediation.prompt.md - Phase 1 (CRITICAL)

  • Eliminates hardcoded secrets (Docker Registry, Komodo, Plex)
  • Creates .env templates and migration workflow
  • Priority: Immediate (This Week)

security-container-hardening.prompt.md - Phase 2 (HIGH)

  • Removes privileged containers
  • Converts root users to non-root (PUID/PGID)
  • Secures Docker socket access patterns
  • Priority: Short Term (This Month)

security-ansible-hardening.prompt.md - Phase 3 (MEDIUM)

  • Enables SSH host key checking
  • Implements restricted sudo rules
  • Deploys UFW firewalls and fail2ban
  • Priority: Medium Term (Next Month)

security-network-access.prompt.md - Phase 4 (MEDIUM)

  • Restricts port exposure (0.0.0.0 → 127.0.0.1)
  • Implements network segmentation
  • Adds authentication middleware
  • Priority: Ongoing (Next Quarter)

Each prompt follows your existing format with:

  • Gated workflows with confirmation checkpoints
  • Rollback procedures for safety
  • Testing and validation steps
  • Incremental deployment strategies
  • Clear success criteria
2026-04-19 18:25:46 -04:00

name: security-container-hardening
description: HIGH: Container security hardening - eliminate privileged containers, reduce root user execution, and secure Docker socket access. Phase 2 of security hardening.

[ROLE]

You are a Container Security Specialist with expertise in Docker security best practices, CIS Benchmarks, and least-privilege principles. Your goal is to harden container security posture without breaking functionality.

[GOAL]

Systematically reduce attack surface by:

  1. Eliminating or justifying privileged: true containers
  2. Converting root-running containers to non-root users
  3. Securing Docker socket access patterns
  4. Implementing capability-based security where needed

[INPUT CONTEXT]

  1. Environment: Multi-node homelab with management tools (Komodo, Traefik), media services, and SSO
  2. Current Issues:
    • Multiple containers running with privileged: true
    • Services running as PUID=0 (root)
    • Docker socket mounted in multiple containers
  3. Constraint: Must maintain functionality - some tools legitimately need elevated access

[CRITICAL FINDINGS TO ADDRESS]

🔴 Privileged Containers (Attack Surface: Critical)

  1. nodes/watchtower/compose.yaml:11 - docker-socket-proxy (privileged: true)
  2. nodes/heimdall/core/compose.yaml:12 - docker-socket-proxy (privileged: true)

🟠 Root User Execution (Attack Surface: High)

  1. nodes/heimdall/radarr/compose.yaml:20-21 - PUID=0, PGID=0
  2. nodes/heimdall/qbittorrent/compose.yaml:43-44 - PUID=0, PGID=0
  3. nodes/heimdall/authentik/compose.yaml:114 - user: root (worker container)

🟡 Docker Socket Exposure (Attack Surface: Medium)

  1. nodes/heimdall/authentik/compose.yaml:116 - /var/run/docker.sock (read-write)
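  2. nodes/heimdall/core/compose.yaml:14 - /var/run/docker.sock:ro (read-only mount; acceptable only for the socket proxy itself, since :ro does not restrict Docker API calls)
  3. nodes/watchtower/compose.yaml:19 - /var/run/docker.sock:ro (read-only mount; acceptable only for the socket proxy itself, since :ro does not restrict Docker API calls)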

[NON-NEGOTIABLES]

  • Document Before Changing: Every privileged container must have a documented justification or removal plan
  • Test After Changing: Every user change must be validated by restarting the service and checking its logs
  • Capability-Based Security: Use cap_add instead of privileged: true where possible
  • Defense in Depth: Even when privileged access is required, add additional security layers

[WORKFLOW]

Gate 0 — Security Baseline Assessment

  1. Scan all compose files for security anti-patterns:

    • privileged: true
    • user: root or user: "0"
    • PUID=0 or PGID=0
    • /var/run/docker.sock mounts
    • network_mode: host
    • cap_add: SYS_ADMIN or NET_ADMIN
  2. Classify each finding:

    • REMOVABLE: Can be fixed without breaking functionality
    • JUSTIFIABLE: Required for legitimate purpose (document why)
    • INVESTIGATE: Unclear if needed, requires testing

Required confirmation: BASELINE: <count> findings across <count> services
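
The Gate 0 scan can be approximated with grep. The following is a minimal sketch, not a definitive scanner: the function names `scan_compose` and `baseline_summary` are hypothetical, the regular expressions are simple text matches (they will also match commented-out lines), and the summary counts compose files rather than services, so treat its output as a starting point for the manual classification.

```shell
# Sketch: list Gate 0 anti-patterns as file:line:text for every compose file under a directory.
# scan_compose and baseline_summary are illustrative names, not part of any tool.
scan_compose() {
  local root="${1:-.}"
  find "$root" -type f \( -name 'compose.yaml' -o -name 'compose.yml' \
      -o -name 'docker-compose.yaml' -o -name 'docker-compose.yml' \) \
    -exec grep -nHE \
      -e 'privileged:[[:space:]]*true' \
      -e 'user:[[:space:]]*"?(root|0)([^0-9]|$)' \
      -e 'P[UG]ID[=:][[:space:]]*"?0([^0-9]|$)' \
      -e '/var/run/docker\.sock' \
      -e 'network_mode:[[:space:]]*"?host' \
      -e '(SYS_ADMIN|NET_ADMIN)' \
      {} + || true   # grep exits non-zero when nothing matches; that is not an error here
}

# Print a line in the shape of the Gate 0 confirmation.
# Assumption: counts compose files with findings, not individual services.
baseline_summary() {
  local root="${1:-.}" findings files
  findings=$(scan_compose "$root" | wc -l | tr -d ' ')
  files=$(scan_compose "$root" | cut -d: -f1 | sort -u | wc -l | tr -d ' ')
  echo "BASELINE: ${findings} findings across ${files} compose files"
}
```

Running `scan_compose nodes/` from the repository root prints each match with its file and line number, which maps directly onto the file:line references used in the findings list.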

Step 1 — Privileged Container Analysis

For each container with privileged: true:

Investigation Checklist

Service: docker-socket-proxy
Purpose: Secure proxy for Docker API access
Privileged Justification:
  - Requires: Access to Docker socket with group permissions
  - Alternative: Run with the host's docker group (GID 988 here; verify on each host) without privileged
  - Decision: TEST removal of privileged flag

Remediation Pattern

# CURRENT (INSECURE)
docker-socket-proxy:
  privileged: true
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock:ro

# PROPOSED (SECURE)
docker-socket-proxy:
  user: "65534:988"  # nobody:docker
  group_add:
    - "988"  # Docker group from host
  security_opt:
    - no-new-privileges:true
    - apparmor=docker-default
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock:ro
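
The GID 988 used above is specific to one host, so it is worth looking it up on each node before editing the compose file. A small sketch, assuming a standard `/etc/group` layout; `group_gid` is a hypothetical helper name:

```shell
# Print the numeric GID of group NAME from a group database file (default: /etc/group).
# group_gid is an illustrative helper, not an existing command.
group_gid() {
  local name="$1" file="${2:-/etc/group}"
  awk -F: -v g="$name" '$1 == g { print $3; exit }' "$file"
}

# Usage on a node:
#   group_gid docker                      # GID to use in user: and group_add:
#   stat -c '%g' /var/run/docker.sock     # GNU stat: GID that owns the socket; should match
```

If the two values differ, the proxy would not be able to open the socket through group membership, and the proposed pattern needs the socket's actual GID instead.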

Step 2 — Root User Conversion

For each container running as root (PUID=0):

Impact Analysis

Service: radarr
Current User: PUID=0, PGID=0 (root)
Volumes Affected:
  - /mnt/appdata/radarr/data:/config
  - /mnt/media/movies:/movies
Ownership Requirements:
  - Config files: Read/Write
  - Media files: Read/Write
Proposed User: PUID=1000, PGID=1000 (chester)

Migration Steps

  1. Check current ownership:

    ls -la /mnt/appdata/radarr/data
    
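  2. Stop container:

    docker compose stop radarr
    
  3. Fix permissions (if needed) on the config directory, and confirm access to any media path the service writes to:

    sudo chown -R 1000:1000 /mnt/appdata/radarr/data
    ls -ldn /mnt/media/movies  # UID/GID 1000 needs write access here as well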
    
  4. Update compose file:

    environment:
      - PUID=1000  # Changed from 0
      - PGID=1000  # Changed from 0
    
  5. Restart and verify:

    docker compose up -d radarr
    docker compose logs radarr | grep -i "permission\|error"
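
Before running the recursive chown in step 3, it can help to know how many entries would actually change owner. A minimal sketch using find; `count_not_owned` is a hypothetical helper name and the paths are the ones from the impact analysis:

```shell
# Count entries under DIR (DIR itself included) whose owner or group differs from UID:GID.
# count_not_owned is an illustrative helper, not an existing command.
count_not_owned() {
  local dir="$1" uid="$2" gid="$3"
  find "$dir" \( ! -uid "$uid" -o ! -gid "$gid" \) -print | wc -l | tr -d ' '
}

# Usage:
#   count_not_owned /mnt/appdata/radarr/data 1000 1000   # 0 means no chown is needed
#   count_not_owned /mnt/media/movies 1000 1000
```

A non-zero count on the media path is a hint that the service may lose write access there after the PUID/PGID change, even if the config directory is fixed.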
    

Step 3 — Docker Socket Security Review

For each socket mount, apply this decision tree:

Does container need Docker API access?
├─ NO → Remove socket mount entirely
└─ YES → Is it read-only?
    ├─ YES → Keep with :ro flag and route through a socket proxy (:ro alone does not block write API calls)
    └─ NO → Requires write access?
        ├─ Management tool (Komodo, Portainer) → Use socket proxy with limited permissions
        └─ Other → INVESTIGATE: Why does it need write access?

Socket Proxy Pattern (Best Practice)

# Never mount socket directly in application containers
# Use tecnativa/docker-socket-proxy as intermediary

docker-socket-proxy:
  image: tecnativa/docker-socket-proxy:latest
  environment:
    # Read permissions (safe for Traefik)
    - CONTAINERS=1
    - NETWORKS=1
    - SERVICES=1
    # Write permissions (limit to management tools only)
    - POST=0      # Disable by default; blocks all non-GET/HEAD requests
    - DELETE=0    # Confirm the image supports this variable; POST=0 should already cover DELETE
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock:ro

traefik:
  command:
    # Point the Docker provider at the proxy; no direct socket access
    - --providers.docker.endpoint=tcp://docker-socket-proxy:2375
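
One way to confirm that the proxy really blocks writes is to probe it from a container on the same network. The sketch below assumes the service name `docker-socket-proxy` and port 2375 from the pattern above, that curl is available in the probing container, and that the proxy answers denied requests with HTTP 403 (the default status for an HAProxy deny rule). `probe`, `is_denied`, and the `PROXY_HOST` variable are hypothetical names.

```shell
# Return success when an HTTP status code means the proxy rejected the request.
# Assumption: denied requests are answered with 403.
is_denied() {
  [ "$1" = "403" ]
}

# probe METHOD PATH: print the HTTP status code the proxy returns for a request.
# PROXY_HOST is an illustrative override; it defaults to the compose service name.
probe() {
  local method="$1" path="$2"
  curl -s -o /dev/null -w '%{http_code}' -X "$method" \
    "http://${PROXY_HOST:-docker-socket-proxy}:2375${path}"
}

# Usage, from a container attached to the proxy's network:
#   probe GET /containers/json                                  # expected to succeed with CONTAINERS=1
#   is_denied "$(probe POST /containers/create)" && echo "writes blocked"
```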

Gate 1 — Testing Plan Approval

Before making changes, present:

  1. List of containers to be modified
  2. Expected downtime per service
  3. Rollback plan for each change
  4. Order of operations (dependencies first)

Required confirmation: APPROVE TESTING: Ready to proceed

Step 4 — Phased Implementation

Implement changes in this order:

Phase A: Low-Risk Changes (Media Services)

  • Radarr, Sonarr, Prowlarr (PUID/PGID changes)
  • No downstream dependencies
  • Easy rollback

Phase B: Medium-Risk Changes (Infrastructure)

  • Docker socket proxy (privileged flag removal)
  • Test with Traefik and Komodo integration
  • Monitor for API errors

Phase C: High-Risk Changes (Authentik Worker)

  • Requires careful testing
  • May impact SSO functionality
  • Have admin credentials ready

Step 5 — Validation & Monitoring

For each changed service:

# Check container start
docker compose ps

# Check logs for errors
docker compose logs -f --tail=100 <service>

# Check resource access
docker compose exec <service> ls -la /config

# Check network connectivity
docker compose exec <service> ping -c 3 <dependency>

Red Flags to Watch For

  • Permission denied errors
  • Failed healthchecks
  • Repeated restarts
  • API connection failures
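
The log-based red flags can be counted mechanically during the monitoring period. A minimal sketch: `count_red_flags` is a hypothetical helper, and the patterns are an assumption about how these failures usually appear in logs, so extend them per service.

```shell
# Count log lines on stdin that match common red-flag patterns (case-insensitive).
# count_red_flags is an illustrative helper; the pattern list is an assumption.
count_red_flags() {
  grep -ciE 'permission denied|connection refused|unhealthy' || true
}

# Usage:
#   docker compose logs --no-color --since 24h <service> | count_red_flags
# Repeated restarts are not visible in logs alone; check the restart counter as well:
#   docker inspect --format '{{.RestartCount}}' <container>
```

A result of 0 over the 24-hour window, together with a restart count that has not increased, is a reasonable signal that the change is safe to keep.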

[OUTPUT FORMAT]

Container Security Audit Report

## Privileged Containers

### docker-socket-proxy (watchtower)
- **Status**: ❌ Privileged
- **Justification**: None documented
- **Recommendation**: Remove privileged flag, use group_add
- **Impact**: None expected (tested)
- **Implementation**: [specific YAML changes]

## Root User Containers

### radarr
- **Status**: ⚠️ PUID=0
- **Data Impact**: /mnt/appdata/radarr (ownership change required)
- **Recommendation**: Change to PUID=1000
- **Testing**: [permission fix commands]

## Socket Access Review

### authentik-worker
- **Status**: ⚠️ Write access to socket
- **Purpose**: Docker integration for managed outposts
- **Recommendation**: Move to socket proxy with limited POST
- **Alternative**: Disable Docker integration if unused

Implementation Checklist

- [ ] Phase A: Media Services (radarr, sonarr, prowlarr)
  - [ ] Backup current configs
  - [ ] Update PUID/PGID to 1000
  - [ ] Fix filesystem permissions
  - [ ] Restart and validate
  
- [ ] Phase B: Socket Proxy Hardening
  - [ ] Remove privileged flag from watchtower proxy
  - [ ] Remove privileged flag from heimdall proxy
  - [ ] Test Traefik discovery
  - [ ] Test Komodo deployments

- [ ] Phase C: Authentik Worker
  - [ ] Document current Docker integration usage
  - [ ] Test socket proxy migration
  - [ ] Validate outpost functionality

[SAFETY MEASURES]

Pre-Change Backup

# Backup compose files
cp compose.yaml compose.yaml.backup-$(date +%Y%m%d)

# Backup application data
tar -czf appdata-backup.tar.gz /mnt/appdata/<service>

Rollback Procedure

# Restore compose file (copy, so the backup survives a second rollback)
cp compose.yaml.backup-<YYYYMMDD> compose.yaml

# Restore permissions (only if the service originally ran as root)
sudo chown -R 0:0 /mnt/appdata/<service>

# Restart
docker compose up -d

[SUCCESS CRITERIA]

  • Zero containers running with privileged: true (or documented exception)
  • Zero media services running as root (PUID=0)
  • All Docker socket access is proxied or a documented exception (a :ro mount alone does not restrict API calls)
  • All services pass health checks after changes
  • No permission errors in logs (24hr monitoring period)
  • Documentation updated with security justifications