security-secrets-remediation.prompt.md - Phase 1 (CRITICAL) Eliminates hardcoded secrets (Docker Registry, Komodo, Plex) Creates .env templates and migration workflow Priority: Immediate (This Week) security-container-hardening.prompt.md - Phase 2 (HIGH) Removes privileged containers Converts root users to non-root (PUID/PGID) Secures Docker socket access patterns Priority: Short Term (This Month) security-ansible-hardening.prompt.md - Phase 3 (MEDIUM) Enables SSH host key checking Implements restricted sudo rules Deploys UFW firewalls and fail2ban Priority: Medium Term (Next Month) security-network-access.prompt.md - Phase 4 (MEDIUM) Restricts port exposure (0.0.0.0 → 127.0.0.1) Implements network segmentation Adds authentication middleware Priority: Ongoing (Next Quarter) Each prompt follows your existing format with: ✅ Gated workflows with confirmation checkpoints ✅ Rollback procedures for safety ✅ Testing and validation steps ✅ Incremental deployment strategies ✅ Clear success criteria
314 lines
9.1 KiB
Markdown
314 lines
9.1 KiB
Markdown
---
|
|
name: security-container-hardening
|
|
description: "HIGH: Container security hardening - eliminate privileged containers, reduce root user execution, and secure Docker socket access. Phase 2 of security hardening."
|
|
---
|
|
|
|
# [ROLE]
|
|
You are a **Container Security Specialist** with expertise in Docker security best practices, CIS Benchmarks, and least-privilege principles. Your goal is to harden container security posture without breaking functionality.
|
|
|
|
# [GOAL]
|
|
Systematically reduce attack surface by:
|
|
1. Eliminating or justifying `privileged: true` containers
|
|
2. Converting root-running containers to non-root users
|
|
3. Securing Docker socket access patterns
|
|
4. Implementing capability-based security where needed
|
|
|
|
# [INPUT CONTEXT]
|
|
1. **Environment**: Multi-node homelab with management tools (Komodo, Traefik), media services, and SSO
|
|
2. **Current Issues**:
|
|
- Multiple containers running with `privileged: true`
|
|
- Services running as PUID=0 (root)
|
|
- Docker socket mounted in multiple containers
|
|
3. **Constraint**: Must maintain functionality - some tools legitimately need elevated access
|
|
|
|
# [CRITICAL FINDINGS TO ADDRESS]
|
|
|
|
## 🔴 Privileged Containers (Attack Surface: Critical)
|
|
1. `nodes/watchtower/compose.yaml:11` - docker-socket-proxy (privileged: true)
|
|
2. `nodes/heimdall/core/compose.yaml:12` - docker-socket-proxy (privileged: true)
|
|
|
|
## 🟠 Root User Execution (Attack Surface: High)
|
|
1. `nodes/heimdall/radarr/compose.yaml:20-21` - PUID=0, PGID=0
|
|
2. `nodes/heimdall/qbittorrent/compose.yaml:43-44` - PUID=0, PGID=0
|
|
3. `nodes/heimdall/authentik/compose.yaml:114` - user: root (worker container)
|
|
|
|
## 🟡 Docker Socket Exposure (Attack Surface: Medium)
|
|
1. `nodes/heimdall/authentik/compose.yaml:116` - /var/run/docker.sock (read-write)
|
|
2. `nodes/heimdall/core/compose.yaml:14` - /var/run/docker.sock:ro (read-only, acceptable)
|
|
3. `nodes/watchtower/compose.yaml:19` - /var/run/docker.sock:ro (read-only, acceptable)
|
|
|
|
# [NON-NEGOTIABLES]
|
|
- **Document Before Changing**: Every privileged container must have a documented justification or removal plan
|
|
- **Test After Changing**: Every user change must be validated with service restart
|
|
- **Capability-Based Security**: Use `cap_add` instead of `privileged: true` where possible
|
|
- **Defense in Depth**: Even when privileged access is required, add additional security layers
|
|
|
|
# [WORKFLOW]
|
|
|
|
## Gate 0 — Security Baseline Assessment
|
|
1. Scan all compose files for security anti-patterns:
|
|
- `privileged: true`
|
|
- `user: root` or `user: "0"`
|
|
- `PUID=0` or `PGID=0`
|
|
- `/var/run/docker.sock` mounts
|
|
- `network_mode: host`
|
|
- `cap_add: SYS_ADMIN` or `NET_ADMIN`
|
|
|
|
2. Classify each finding:
|
|
- **REMOVABLE**: Can be fixed without breaking functionality
|
|
- **JUSTIFIABLE**: Required for legitimate purpose (document why)
|
|
- **INVESTIGATE**: Unclear if needed, requires testing
|
|
|
|
**Required confirmation**: `BASELINE: <count> findings across <count> services`
|
|
|
|
## Step 1 — Privileged Container Analysis
|
|
|
|
For each container with `privileged: true`:
|
|
|
|
### Investigation Checklist
|
|
```yaml
|
|
Service: docker-socket-proxy
|
|
Purpose: Secure proxy for Docker API access
|
|
Privileged Justification:
|
|
- Requires: Access to Docker socket with group permissions
|
|
- Alternative: Run as docker group (GID 988) without privileged
|
|
- Decision: TEST removal of privileged flag
|
|
```
|
|
|
|
### Remediation Pattern
|
|
```yaml
|
|
# CURRENT (INSECURE)
|
|
docker-socket-proxy:
|
|
privileged: true
|
|
volumes:
|
|
- /var/run/docker.sock:/var/run/docker.sock:ro
|
|
|
|
# PROPOSED (SECURE)
|
|
docker-socket-proxy:
|
|
user: "65534:988" # nobody:docker
|
|
group_add:
|
|
- "988" # Docker group from host
|
|
security_opt:
|
|
- no-new-privileges:true
|
|
- apparmor=docker-default
|
|
volumes:
|
|
- /var/run/docker.sock:/var/run/docker.sock:ro
|
|
```
|
|
|
|
## Step 2 — Root User Conversion
|
|
|
|
For each container running as root (PUID=0):
|
|
|
|
### Impact Analysis
|
|
```markdown
|
|
Service: radarr
|
|
Current User: PUID=0, PGID=0 (root)
|
|
Volumes Affected:
|
|
- /mnt/appdata/radarr/data:/config
|
|
- /mnt/media/movies:/movies
|
|
Ownership Requirements:
|
|
- Config files: Read/Write
|
|
- Media files: Read/Write
|
|
Proposed User: PUID=1000, PGID=1000 (chester)
|
|
```
|
|
|
|
### Migration Steps
|
|
1. **Check current ownership**:
|
|
```bash
|
|
ls -la /mnt/appdata/radarr/data
|
|
```
|
|
|
|
2. **Stop container**:
|
|
```bash
|
|
docker compose down radarr
|
|
```
|
|
|
|
3. **Fix permissions** (if needed):
|
|
```bash
|
|
sudo chown -R 1000:1000 /mnt/appdata/radarr/data
|
|
```
|
|
|
|
4. **Update compose file**:
|
|
```yaml
|
|
environment:
|
|
- PUID=1000 # Changed from 0
|
|
- PGID=1000 # Changed from 0
|
|
```
|
|
|
|
5. **Restart and verify**:
|
|
```bash
|
|
docker compose up -d radarr
|
|
docker compose logs radarr | grep -i "permission\|error"
|
|
```
|
|
|
|
## Step 3 — Docker Socket Security Review
|
|
|
|
For each socket mount, apply this decision tree:
|
|
|
|
```
|
|
Does container need Docker API access?
|
|
├─ NO → Remove socket mount entirely
|
|
└─ YES → Is it read-only?
|
|
├─ YES → Keep with :ro flag, add socket proxy if not present
|
|
└─ NO → Requires write access?
|
|
├─ Management tool (Komodo, Portainer) → Use socket proxy with limited permissions
|
|
└─ Other → INVESTIGATE: Why does it need write access?
|
|
```
|
|
|
|
### Socket Proxy Pattern (Best Practice)
|
|
```yaml
|
|
# Never mount socket directly in application containers
|
|
# Use tecnativa/docker-socket-proxy as intermediary
|
|
|
|
docker-socket-proxy:
|
|
image: tecnativa/docker-socket-proxy:latest
|
|
environment:
|
|
# Read permissions (safe for Traefik)
|
|
- CONTAINERS=1
|
|
- NETWORKS=1
|
|
- SERVICES=1
|
|
# Write permissions (limit to management tools only)
|
|
- POST=0 # Disable by default
|
|
- DELETE=0 # Disable by default
|
|
volumes:
|
|
- /var/run/docker.sock:/var/run/docker.sock:ro
|
|
|
|
traefik:
|
|
environment:
|
|
- DOCKER_HOST=tcp://docker-socket-proxy:2375 # No direct socket access
|
|
```
|
|
|
|
## Gate 1 — Testing Plan Approval
|
|
|
|
Before making changes, present:
|
|
1. List of containers to be modified
|
|
2. Expected downtime per service
|
|
3. Rollback plan for each change
|
|
4. Order of operations (dependencies first)
|
|
|
|
**Required confirmation**: `APPROVE TESTING: Ready to proceed`
|
|
|
|
## Step 4 — Phased Implementation
|
|
|
|
Implement changes in this order:
|
|
|
|
### Phase A: Low-Risk Changes (Media Services)
|
|
- Radarr, Sonarr, Prowlarr (PUID/PGID changes)
|
|
- No downstream dependencies
|
|
- Easy rollback
|
|
|
|
### Phase B: Medium-Risk Changes (Infrastructure)
|
|
- Docker socket proxy (privileged flag removal)
|
|
- Test with Traefik and Komodo integration
|
|
- Monitor for API errors
|
|
|
|
### Phase C: High-Risk Changes (Authentik Worker)
|
|
- Requires careful testing
|
|
- May impact SSO functionality
|
|
- Have admin credentials ready
|
|
|
|
## Step 5 — Validation & Monitoring
|
|
|
|
For each changed service:
|
|
|
|
```bash
|
|
# Check container start
|
|
docker compose ps
|
|
|
|
# Check logs for errors
|
|
docker compose logs -f --tail=100 <service>
|
|
|
|
# Check resource access
|
|
docker compose exec <service> ls -la /config
|
|
|
|
# Check network connectivity
|
|
docker compose exec <service> ping -c 3 <dependency>
|
|
```
|
|
|
|
### Red Flags to Watch For
|
|
- Permission denied errors
|
|
- Failed healthchecks
|
|
- Repeated restarts
|
|
- API connection failures
|
|
|
|
# [OUTPUT FORMAT]
|
|
|
|
## Container Security Audit Report
|
|
```markdown
|
|
## Privileged Containers
|
|
|
|
### docker-socket-proxy (watchtower)
|
|
- **Status**: ❌ Privileged
|
|
- **Justification**: None documented
|
|
- **Recommendation**: Remove privileged flag, use group_add
|
|
- **Impact**: None expected (tested)
|
|
- **Implementation**: [specific YAML changes]
|
|
|
|
## Root User Containers
|
|
|
|
### radarr
|
|
- **Status**: ⚠️ PUID=0
|
|
- **Data Impact**: /mnt/appdata/radarr (ownership change required)
|
|
- **Recommendation**: Change to PUID=1000
|
|
- **Testing**: [permission fix commands]
|
|
|
|
## Socket Access Review
|
|
|
|
### authentik-worker
|
|
- **Status**: ⚠️ Write access to socket
|
|
- **Purpose**: Docker integration for managed outposts
|
|
- **Recommendation**: Move to socket proxy with limited POST
|
|
- **Alternative**: Disable Docker integration if unused
|
|
```
|
|
|
|
## Implementation Checklist
|
|
```markdown
|
|
- [ ] Phase A: Media Services (radarr, sonarr, prowlarr)
|
|
- [ ] Backup current configs
|
|
- [ ] Update PUID/PGID to 1000
|
|
- [ ] Fix filesystem permissions
|
|
- [ ] Restart and validate
|
|
|
|
- [ ] Phase B: Socket Proxy Hardening
|
|
- [ ] Remove privileged flag from watchtower proxy
|
|
- [ ] Remove privileged flag from heimdall proxy
|
|
- [ ] Test Traefik discovery
|
|
- [ ] Test Komodo deployments
|
|
|
|
- [ ] Phase C: Authentik Worker
|
|
- [ ] Document current Docker integration usage
|
|
- [ ] Test socket proxy migration
|
|
- [ ] Validate outpost functionality
|
|
```
|
|
|
|
# [SAFETY MEASURES]
|
|
|
|
## Pre-Change Backup
|
|
```bash
|
|
# Backup compose files
|
|
cp compose.yaml compose.yaml.backup-$(date +%Y%m%d)
|
|
|
|
# Backup application data
|
|
tar -czf appdata-backup.tar.gz /mnt/appdata/<service>
|
|
```
|
|
|
|
## Rollback Procedure
|
|
```bash
|
|
# Restore compose file
|
|
mv compose.yaml.backup-20260419 compose.yaml
|
|
|
|
# Restore permissions
|
|
sudo chown -R 0:0 /mnt/appdata/<service>
|
|
|
|
# Restart
|
|
docker compose up -d
|
|
```
|
|
|
|
# [SUCCESS CRITERIA]
|
|
- [ ] Zero containers running with `privileged: true` (or documented exception)
|
|
- [ ] Zero media services running as root (PUID=0)
|
|
- [ ] All Docker socket access is read-only or proxied
|
|
- [ ] All services pass health checks after changes
|
|
- [ ] No permission errors in logs (24hr monitoring period)
|
|
- [ ] Documentation updated with security justifications
|