Created Files
security-secrets-remediation.prompt.md - Phase 1 (CRITICAL)
- Eliminates hardcoded secrets (Docker Registry, Komodo, Plex)
- Creates .env templates and migration workflow
- Priority: Immediate (This Week)

security-container-hardening.prompt.md - Phase 2 (HIGH)
- Removes privileged containers
- Converts root users to non-root (PUID/PGID)
- Secures Docker socket access patterns
- Priority: Short Term (This Month)

security-ansible-hardening.prompt.md - Phase 3 (MEDIUM)
- Enables SSH host key checking
- Implements restricted sudo rules
- Deploys UFW firewalls and fail2ban
- Priority: Medium Term (Next Month)

security-network-access.prompt.md - Phase 4 (MEDIUM)
- Restricts port exposure (0.0.0.0 → 127.0.0.1)
- Implements network segmentation
- Adds authentication middleware
- Priority: Ongoing (Next Quarter)

Each prompt follows your existing format with:
✅ Gated workflows with confirmation checkpoints
✅ Rollback procedures for safety
✅ Testing and validation steps
✅ Incremental deployment strategies
✅ Clear success criteria
This commit is contained in:
parent
417501dbd1
commit
129b7eee1b
406
.github/prompts/security-ansible-hardening.prompt.md
vendored
Normal file
@ -0,0 +1,406 @@
|
||||
---
|
||||
name: security-ansible-hardening
|
||||
description: "MEDIUM: Ansible security hardening - SSH configuration, sudo security, and host-level security controls. Phase 3 of security hardening."
|
||||
---
|
||||
|
||||
# [ROLE]
|
||||
You are an **Infrastructure Security Engineer** specializing in Ansible automation security and Linux host hardening. Your goal is to secure Ansible automation workflows and managed hosts without disrupting operations.
|
||||
|
||||
# [GOAL]
|
||||
Harden Ansible security posture by:
|
||||
1. Implementing secure SSH configuration (host key checking)
|
||||
2. Configuring least-privilege sudo access
|
||||
3. Enabling host-level firewalls (UFW)
|
||||
4. Securing Ansible Vault password files
|
||||
5. Implementing fail2ban for brute-force protection
|
||||
|
||||
# [INPUT CONTEXT]
|
||||
1. **Environment**: Multi-node homelab managed via Ansible
|
||||
2. **Current State**:
|
||||
- SSH host key checking disabled
|
||||
- Passwordless sudo without restrictions
|
||||
- No host firewalls (UFW disabled)
|
||||
- Vault password file permissions not verified
|
||||
3. **Managed Nodes**: Proxmox (root), Docker nodes (chester user), Raspberry Pi (chester user)
|
||||
|
||||
# [FINDINGS TO ADDRESS]
|
||||
|
||||
## 🟠 Ansible Configuration Security
|
||||
1. `ansible/ansible.cfg:34` - `host_key_checking = False`
|
||||
2. `ansible/ansible.cfg:35` - `StrictHostKeyChecking=no`
|
||||
3. `ansible/ansible.cfg:30` - `become_ask_pass = False`
|
||||
4. `ansible/ansible.cfg:11` - Vault password file permissions not enforced
|
||||
|
||||
## 🟡 Host Security Controls
|
||||
1. `ansible/group_vars/all.yml:29` - UFW disabled
|
||||
2. `ansible/group_vars/all.yml:30` - fail2ban disabled
|
||||
3. No SSH key rotation policy
|
||||
4. No sudo command restrictions
|
||||
|
||||
# [NON-NEGOTIABLES]
|
||||
- **Gradual Rollout**: Enable security controls one node at a time
|
||||
- **Maintain Access**: Never lock yourself out during SSH hardening
|
||||
- **Test Playbooks**: Validate all changes with `--check` mode first
|
||||
- **Document Exceptions**: Some settings (like Proxmox root access) may have valid reasons
|
||||
|
||||
# [WORKFLOW]
|
||||
|
||||
## Gate 0 — Current State Assessment
|
||||
|
||||
Run these validation commands:
|
||||
|
||||
```bash
# Check vault password file permissions
ls -la ansible/vault/.vault_pass

# Check SSH key distribution
ansible all -m shell -a "ls -la ~/.ssh/authorized_keys"

# Check sudo configuration
ansible all -b -m shell -a "grep -r NOPASSWD /etc/sudoers*"

# Check firewall status
ansible all -b -m shell -a "ufw status"
```
|
||||
|
||||
Create inventory of current security posture.
|
||||
|
||||
**Required confirmation**: `ASSESSMENT COMPLETE: <count> nodes evaluated`
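
The two `ansible.cfg` findings listed earlier (`host_key_checking = False` and `StrictHostKeyChecking=no`) could also be checked mechanically as part of this gate. The following is a minimal sketch, not part of the original prompt: the function name `audit_ansible_cfg` is hypothetical, and it assumes the settings appear on a single line each and that commented-out lines should be ignored.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report active (non-commented) insecure SSH settings
# in an ansible.cfg file. Prints the offending lines with line numbers and
# returns 1 if any are found, 0 otherwise.
audit_ansible_cfg() {
  local cfg="$1" hits
  # First grep: find the two insecure settings (case-insensitive).
  # Second grep: drop lines whose content starts with a comment marker.
  hits=$(grep -nEi 'host_key_checking[[:space:]]*=[[:space:]]*false|StrictHostKeyChecking=no' "$cfg" \
    | grep -vE '^[0-9]+:[[:space:]]*[#;]' || true)
  if [ -n "$hits" ]; then
    printf '%s\n' "$hits"
    return 1
  fi
  return 0
}
```

A possible invocation would be `audit_ansible_cfg ansible/ansible.cfg`, run before and after Step 2 to confirm the findings are cleared.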
|
||||
|
||||
## Step 1 — Vault Password File Security
|
||||
|
||||
### Current Risk
|
||||
Vault password file may have insecure permissions allowing read by other users.
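
Whether the risk applies can be confirmed before changing anything. The sketch below is an illustration only: the function name `vault_file_is_private` is hypothetical, and it assumes GNU `stat` (on BSD/macOS the equivalent flag would be `stat -f '%Lp'`).

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the file mode is exactly 600.
# Returns 0 when the mode is 600, 1 for any other mode,
# and 2 when the file is missing or cannot be inspected.
vault_file_is_private() {
  local path="$1" mode
  mode=$(stat -c '%a' "$path" 2>/dev/null) || return 2
  [ "$mode" = "600" ]
}
```

For example, `vault_file_is_private ansible/vault/.vault_pass || echo "fix permissions"` would flag a file that still needs the remediation.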
|
||||
|
||||
### Remediation
|
||||
```yaml
# Add to ansible/playbooks/secure-vault-file.yml
---
- name: Secure Ansible Vault password file
  hosts: localhost
  # Facts must be gathered: ansible_user_id (used below) is a gathered fact.
  gather_facts: true
  tasks:
    - name: Check vault password file exists
      ansible.builtin.stat:
        path: "{{ playbook_dir }}/../vault/.vault_pass"
      register: vault_pass_file

    - name: Ensure vault password file has secure permissions
      ansible.builtin.file:
        path: "{{ playbook_dir }}/../vault/.vault_pass"
        mode: '0600'
        owner: "{{ ansible_user_id }}"
      when: vault_pass_file.stat.exists

    - name: Verify vault directory permissions
      ansible.builtin.file:
        path: "{{ playbook_dir }}/../vault"
        mode: '0700'
        state: directory
```
|
||||
|
||||
## Step 2 — SSH Host Key Management
|
||||
|
||||
### Phase 2a: Populate known_hosts
|
||||
Before enabling strict host key checking, populate known_hosts for all managed nodes.
|
||||
|
||||
```yaml
# ansible/playbooks/populate-known-hosts.yml
---
- name: Populate SSH known_hosts for all managed nodes
  hosts: localhost
  gather_facts: false
  vars:
    ansible_connection: local
  tasks:
    # ssh-keyscan trusts whatever key the network returns; verify the
    # fingerprints out of band (e.g. on each node's console) before relying on them.
    - name: Scan SSH host keys
      ansible.builtin.shell: |
        # Drop any existing entry first: hashed (-H) entries are salted,
        # so the sort -u step below cannot deduplicate them.
        ssh-keygen -R {{ item }} >/dev/null 2>&1 || true
        ssh-keyscan -H {{ item }} >> ~/.ssh/known_hosts 2>/dev/null
      loop: "{{ groups['all'] | map('extract', hostvars, 'ansible_host') | list }}"
      changed_when: false

    - name: Remove duplicate entries
      ansible.builtin.shell: |
        sort -u ~/.ssh/known_hosts > ~/.ssh/known_hosts.tmp
        mv ~/.ssh/known_hosts.tmp ~/.ssh/known_hosts
        chmod 600 ~/.ssh/known_hosts
      changed_when: false
```
|
||||
|
||||
### Phase 2b: Enable Host Key Checking
|
||||
After known_hosts is populated, update ansible.cfg:
|
||||
|
||||
```ini
# ansible/ansible.cfg
[defaults]
# Changed from False (ansible.cfg does not allow inline "#" comments after a value)
host_key_checking = True

[ssh_connection]
# Remove -o StrictHostKeyChecking=no
ssh_args = -o ControlMaster=auto -o ControlPersist=60s -o UserKnownHostsFile=~/.ssh/known_hosts
```
|
||||
|
||||
### Phase 2c: Verification
|
||||
```bash
|
||||
# Test connection to all hosts
|
||||
ansible all -m ping
|
||||
|
||||
# Should succeed without warnings
|
||||
```
|
||||
|
||||
## Step 3 — Sudo Security Configuration
|
||||
|
||||
### Current Risk
|
||||
`become_ask_pass = False` assumes all nodes have unrestricted NOPASSWD sudo.
|
||||
|
||||
### Recommended Approach
|
||||
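Create restricted sudoers files for automation. Note that Ansible runs most modules through a shell and Python wrapper rather than calling binaries such as `apt` directly, so a command allowlist only covers tasks that invoke those exact commands; confirm on the test node that `become` still works before removing unrestricted access: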
|
||||
|
||||
```yaml
# ansible/playbooks/configure-sudo-security.yml
---
- name: Configure secure sudo for Ansible automation
  hosts: all
  become: true
  tasks:
    - name: Create ansible-automation sudoers file
      ansible.builtin.copy:
        dest: /etc/sudoers.d/50-ansible-automation
        content: |
          # Ansible automation - restricted sudo commands
          # User: {{ ansible_user }}

          # Package management
          {{ ansible_user }} ALL=(ALL) NOPASSWD: /usr/bin/apt, /usr/bin/apt-get, /usr/bin/dpkg

          # Service management
          {{ ansible_user }} ALL=(ALL) NOPASSWD: /usr/bin/systemctl

          # Docker operations
          {{ ansible_user }} ALL=(ALL) NOPASSWD: /usr/bin/docker

          # File operations in managed paths only
          # Caution: sudoers wildcards in arguments also match "/" and spaces,
          # so these patterns are broader than they look.
          {{ ansible_user }} ALL=(ALL) NOPASSWD: /usr/bin/mkdir -p /mnt/appdata/*
          {{ ansible_user }} ALL=(ALL) NOPASSWD: /usr/bin/chown -R * /mnt/appdata/*

          # UFW firewall
          {{ ansible_user }} ALL=(ALL) NOPASSWD: /usr/sbin/ufw
        mode: '0440'
        validate: 'visudo -cf %s'

    - name: Remove unrestricted sudo access
      ansible.builtin.lineinfile:
        path: /etc/sudoers.d/90-cloud-init-users
        regexp: '^{{ ansible_user }}\s+ALL=\(ALL\)\s+NOPASSWD:\s+ALL$'
        state: absent
        # Validate the edited file as well, so a bad edit cannot break sudo.
        validate: 'visudo -cf %s'
      when: ansible_distribution == "Ubuntu"
```
|
||||
|
||||
### Alternative: Keep Unrestricted but Add Logging
|
||||
If restricted sudo is too limiting:
|
||||
|
||||
```yaml
# Enable sudo logging
- name: Enable sudo command logging
  ansible.builtin.lineinfile:
    path: /etc/sudoers
    line: 'Defaults log_output'
    validate: 'visudo -cf %s'
```
|
||||
|
||||
## Step 4 — Host Firewall Configuration
|
||||
|
||||
### Phase 4a: Create UFW Role
|
||||
```yaml
# ansible/roles/ufw_baseline/tasks/main.yml
---
- name: Install UFW
  ansible.builtin.apt:
    name: ufw
    state: present
    update_cache: true

- name: Set UFW default policies
  community.general.ufw:
    direction: "{{ item.direction }}"
    policy: "{{ item.policy }}"
  loop:
    - { direction: 'incoming', policy: 'deny' }
    - { direction: 'outgoing', policy: 'allow' }
    - { direction: 'routed', policy: 'allow' }

- name: Allow SSH (prevent lockout)
  community.general.ufw:
    rule: allow
    port: '22'
    proto: tcp
    comment: 'SSH access'

- name: Allow service-specific ports
  community.general.ufw:
    rule: allow
    port: "{{ item.port }}"
    proto: "{{ item.proto }}"
    comment: "{{ item.comment }}"
  loop: "{{ ufw_allowed_ports | default([]) }}"

- name: Enable UFW
  community.general.ufw:
    state: enabled
  when: ufw_enable_firewall | default(false)
```
|
||||
|
||||
### Phase 4b: Define Per-Node Firewall Rules
|
||||
```yaml
|
||||
# ansible/inventory/host_vars/heimdall.yml
|
||||
ufw_allowed_ports:
|
||||
- { port: '80', proto: 'tcp', comment: 'HTTP - Traefik' }
|
||||
- { port: '443', proto: 'tcp', comment: 'HTTPS - Traefik' }
|
||||
- { port: '9120', proto: 'tcp', comment: 'Komodo Core' }
|
||||
- { port: '2377', proto: 'tcp', comment: 'Docker Swarm (if used)' }
|
||||
|
||||
ufw_enable_firewall: true
|
||||
```
|
||||
|
||||
### Phase 4c: Gradual Rollout
|
||||
Test on one node first:
|
||||
|
||||
```bash
|
||||
# Test on watchtower (non-critical node)
|
||||
ansible watchtower -b -m include_role -a name=ufw_baseline --check
|
||||
|
||||
# Apply if check succeeds
|
||||
ansible watchtower -b -m include_role -a name=ufw_baseline
|
||||
|
||||
# Verify SSH still works
|
||||
ansible watchtower -m ping
|
||||
|
||||
# Roll out to other nodes
|
||||
ansible docker_nodes -b -m include_role -a name=ufw_baseline
|
||||
```
|
||||
|
||||
## Step 5 — Fail2ban Configuration
|
||||
|
||||
### Basic Fail2ban Role
|
||||
```yaml
# ansible/roles/fail2ban/tasks/main.yml
---
- name: Install fail2ban
  ansible.builtin.apt:
    name: fail2ban
    state: present

# The "Restart fail2ban" handler notified below must be defined in this
# role's handlers/main.yml, otherwise the play fails when the file changes.
- name: Configure fail2ban for SSH
  ansible.builtin.copy:
    dest: /etc/fail2ban/jail.local
    content: |
      [DEFAULT]
      bantime = 1h
      findtime = 10m
      maxretry = 5

      [sshd]
      enabled = true
      port = ssh
      logpath = /var/log/auth.log
    mode: '0644'
  notify: Restart fail2ban

- name: Ensure fail2ban is running
  ansible.builtin.systemd:
    name: fail2ban
    state: started
    enabled: true
```
|
||||
|
||||
## Gate 1 — Pre-Deployment Testing
|
||||
|
||||
Run all playbooks in check mode:
|
||||
```bash
|
||||
ansible-playbook ansible/playbooks/secure-vault-file.yml --check
|
||||
ansible-playbook ansible/playbooks/populate-known-hosts.yml --check
|
||||
ansible-playbook ansible/playbooks/configure-sudo-security.yml --check
|
||||
ansible all -b -m include_role -a name=ufw_baseline --check
|
||||
ansible all -b -m include_role -a name=fail2ban --check
|
||||
```
|
||||
|
||||
**Required confirmation**: `CHECKS PASSED: Ready for deployment`
|
||||
|
||||
## Step 6 — Phased Deployment
|
||||
|
||||
Deploy in this order:
|
||||
|
||||
1. **Local security** (vault file, known_hosts)
|
||||
2. **Test node** (watchtower) - full hardening
|
||||
3. **Docker nodes** (heimdall, waldorf) - after validating watchtower
|
||||
4. **Proxmox** (pve01) - last, as it's most critical
|
||||
|
||||
# [OUTPUT FORMAT]
|
||||
|
||||
## Security Hardening Plan
|
||||
```markdown
|
||||
## Phase 1: Ansible Controller Security
|
||||
- [ ] Secure vault password file (chmod 600)
|
||||
- [ ] Populate SSH known_hosts
|
||||
- [ ] Enable host key checking in ansible.cfg
|
||||
- [ ] Test: `ansible all -m ping`
|
||||
|
||||
## Phase 2: Sudo Hardening
|
||||
- [ ] Create restricted sudoers on watchtower (test node)
|
||||
- [ ] Validate Ansible operations still work
|
||||
- [ ] Roll out to remaining nodes
|
||||
- [ ] Document sudo command allowlist
|
||||
|
||||
## Phase 3: Host Firewalls
|
||||
- [ ] Deploy UFW role to watchtower
|
||||
- [ ] Verify SSH access maintained
|
||||
- [ ] Verify Docker services accessible
|
||||
- [ ] Roll out to docker_nodes group
|
||||
- [ ] Configure Proxmox firewall separately (PVE-specific)
|
||||
|
||||
## Phase 4: Intrusion Detection
|
||||
- [ ] Deploy fail2ban to all nodes
|
||||
- [ ] Configure SSH jail
|
||||
- [ ] Test ban/unban procedures
|
||||
- [ ] Set up alerting (optional)
|
||||
```
|
||||
|
||||
## Rollback Procedures
|
||||
```markdown
|
||||
### If locked out after UFW enable:
|
||||
1. Access via Proxmox console (for VMs/LXC)
|
||||
2. Run: `sudo ufw disable`
|
||||
3. Fix rule, re-enable
|
||||
|
||||
### If sudo restrictions break Ansible:
|
||||
1. SSH to node manually
|
||||
2. `sudo visudo -f /etc/sudoers.d/50-ansible-automation`
|
||||
3. Add required commands or remove file
|
||||
```
|
||||
|
||||
# [VALIDATION CHECKLIST]
|
||||
|
||||
After each phase:
|
||||
```bash
|
||||
# Connectivity test
|
||||
ansible all -m ping
|
||||
|
||||
# Privilege escalation test
|
||||
ansible all -b -m shell -a "whoami"
|
||||
|
||||
# Service verification
|
||||
ansible docker_nodes -b -m shell -a "docker ps"
|
||||
|
||||
# Firewall status
|
||||
ansible all -b -m shell -a "ufw status numbered"
|
||||
```
|
||||
|
||||
# [SUCCESS CRITERIA]
|
||||
- [ ] SSH host key checking enabled without connection failures
|
||||
- [ ] Sudo access restricted and logged
|
||||
- [ ] UFW enabled on all Docker nodes with service-specific rules
|
||||
- [ ] Fail2ban active and monitoring SSH
|
||||
- [ ] Vault password file secured (600 permissions)
|
||||
- [ ] All Ansible playbooks execute successfully
|
||||
- [ ] No SSH lockouts occurred
|
||||
- [ ] Documentation updated with security procedures
|
||||
313
.github/prompts/security-container-hardening.prompt.md
vendored
Normal file
@ -0,0 +1,313 @@
|
||||
---
|
||||
name: security-container-hardening
|
||||
description: "HIGH: Container security hardening - eliminate privileged containers, reduce root user execution, and secure Docker socket access. Phase 2 of security hardening."
|
||||
---
|
||||
|
||||
# [ROLE]
|
||||
You are a **Container Security Specialist** with expertise in Docker security best practices, CIS Benchmarks, and least-privilege principles. Your goal is to harden container security posture without breaking functionality.
|
||||
|
||||
# [GOAL]
|
||||
Systematically reduce attack surface by:
|
||||
1. Eliminating or justifying `privileged: true` containers
|
||||
2. Converting root-running containers to non-root users
|
||||
3. Securing Docker socket access patterns
|
||||
4. Implementing capability-based security where needed
|
||||
|
||||
# [INPUT CONTEXT]
|
||||
1. **Environment**: Multi-node homelab with management tools (Komodo, Traefik), media services, and SSO
|
||||
2. **Current Issues**:
|
||||
- Multiple containers running with `privileged: true`
|
||||
- Services running as PUID=0 (root)
|
||||
- Docker socket mounted in multiple containers
|
||||
3. **Constraint**: Must maintain functionality - some tools legitimately need elevated access
|
||||
|
||||
# [CRITICAL FINDINGS TO ADDRESS]
|
||||
|
||||
## 🔴 Privileged Containers (Attack Surface: Critical)
|
||||
1. `nodes/watchtower/compose.yaml:11` - docker-socket-proxy (privileged: true)
|
||||
2. `nodes/heimdall/core/compose.yaml:12` - docker-socket-proxy (privileged: true)
|
||||
|
||||
## 🟠 Root User Execution (Attack Surface: High)
|
||||
1. `nodes/heimdall/radarr/compose.yaml:20-21` - PUID=0, PGID=0
|
||||
2. `nodes/heimdall/qbittorrent/compose.yaml:43-44` - PUID=0, PGID=0
|
||||
3. `nodes/heimdall/authentik/compose.yaml:114` - user: root (worker container)
|
||||
|
||||
## 🟡 Docker Socket Exposure (Attack Surface: Medium)
|
||||
1. `nodes/heimdall/authentik/compose.yaml:116` - /var/run/docker.sock (read-write)
|
||||
2. `nodes/heimdall/core/compose.yaml:14` - /var/run/docker.sock:ro (read-only, acceptable)
|
||||
3. `nodes/watchtower/compose.yaml:19` - /var/run/docker.sock:ro (read-only, acceptable)
|
||||
|
||||
# [NON-NEGOTIABLES]
|
||||
- **Document Before Changing**: Every privileged container must have a documented justification or removal plan
|
||||
- **Test After Changing**: Every user change must be validated with service restart
|
||||
- **Capability-Based Security**: Use `cap_add` instead of `privileged: true` where possible
|
||||
- **Defense in Depth**: Even when privileged access is required, add additional security layers
|
||||
|
||||
# [WORKFLOW]
|
||||
|
||||
## Gate 0 — Security Baseline Assessment
|
||||
1. Scan all compose files for security anti-patterns:
|
||||
- `privileged: true`
|
||||
- `user: root` or `user: "0"`
|
||||
- `PUID=0` or `PGID=0`
|
||||
- `/var/run/docker.sock` mounts
|
||||
- `network_mode: host`
|
||||
- `cap_add: SYS_ADMIN` or `NET_ADMIN`
|
||||
|
||||
2. Classify each finding:
|
||||
- **REMOVABLE**: Can be fixed without breaking functionality
|
||||
- **JUSTIFIABLE**: Required for legitimate purpose (document why)
|
||||
- **INVESTIGATE**: Unclear if needed, requires testing
|
||||
|
||||
**Required confirmation**: `BASELINE: <count> findings across <count> services`
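
The scan in item 1 can be approximated with a single recursive `grep`. This is a rough sketch under stated assumptions: the function name `scan_compose_antipatterns` is hypothetical, compose files are assumed to be named `compose.yaml`, `compose.yml`, or `docker-compose.yml`, and the patterns are line-based, so they will not catch settings split across lines or injected through variables.

```bash
#!/usr/bin/env bash
# Hypothetical helper: list compose lines that match the Gate 0 anti-patterns.
# Output format is grep's usual "file:line:content". Always returns 0, so an
# empty result simply means nothing was flagged.
scan_compose_antipatterns() {
  local root="${1:-.}"
  grep -rnE \
    --include='compose.yaml' --include='compose.yml' --include='docker-compose.yml' \
    -e 'privileged:[[:space:]]*true' \
    -e 'user:[[:space:]]*"?(root|0)"?' \
    -e 'P[UG]ID=0([^0-9]|$)' \
    -e '/var/run/docker\.sock' \
    -e 'network_mode:[[:space:]]*"?host' \
    -e '(SYS_ADMIN|NET_ADMIN)' \
    "$root" || true
}
```

Running `scan_compose_antipatterns nodes/ | wc -l` would give a starting figure for the `<count> findings` part of the confirmation; each hit still needs manual classification as REMOVABLE, JUSTIFIABLE, or INVESTIGATE.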
|
||||
|
||||
## Step 1 — Privileged Container Analysis
|
||||
|
||||
For each container with `privileged: true`:
|
||||
|
||||
### Investigation Checklist
|
||||
```yaml
|
||||
Service: docker-socket-proxy
|
||||
Purpose: Secure proxy for Docker API access
|
||||
Privileged Justification:
|
||||
- Requires: Access to Docker socket with group permissions
|
||||
- Alternative: Run as docker group (GID 988) without privileged
|
||||
- Decision: TEST removal of privileged flag
|
||||
```
|
||||
|
||||
### Remediation Pattern
|
||||
```yaml
# CURRENT (INSECURE)
docker-socket-proxy:
  privileged: true
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock:ro

# PROPOSED (SECURE)
docker-socket-proxy:
  user: "65534:988" # nobody:docker
  group_add:
    - "988" # Docker group from host
  security_opt:
    - no-new-privileges:true
    - apparmor=docker-default
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock:ro
```
|
||||
|
||||
## Step 2 — Root User Conversion
|
||||
|
||||
For each container running as root (PUID=0):
|
||||
|
||||
### Impact Analysis
|
||||
```markdown
|
||||
Service: radarr
|
||||
Current User: PUID=0, PGID=0 (root)
|
||||
Volumes Affected:
|
||||
- /mnt/appdata/radarr/data:/config
|
||||
- /mnt/media/movies:/movies
|
||||
Ownership Requirements:
|
||||
- Config files: Read/Write
|
||||
- Media files: Read/Write
|
||||
Proposed User: PUID=1000, PGID=1000 (chester)
|
||||
```
|
||||
|
||||
### Migration Steps
|
||||
1. **Check current ownership**:
|
||||
```bash
|
||||
ls -la /mnt/appdata/radarr/data
|
||||
```
|
||||
|
||||
2. **Stop container**:
|
||||
```bash
|
||||
docker compose down radarr
|
||||
```
|
||||
|
||||
3. **Fix permissions** (if needed):
|
||||
```bash
|
||||
sudo chown -R 1000:1000 /mnt/appdata/radarr/data
|
||||
```
|
||||
|
||||
4. **Update compose file**:
|
||||
```yaml
|
||||
environment:
|
||||
- PUID=1000 # Changed from 0
|
||||
- PGID=1000 # Changed from 0
|
||||
```
|
||||
|
||||
5. **Restart and verify**:
|
||||
```bash
|
||||
docker compose up -d radarr
|
||||
docker compose logs radarr | grep -i "permission\|error"
|
||||
```
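
After the `chown` in step 3 and before the restart in step 5, it can help to confirm that nothing under the data directory is still owned by another user. A minimal sketch, assuming a `find` that supports `-uid` and `-gid` (GNU and BSD both do); the function name `find_foreign_owned` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print every path under DIR that is not owned by
# the given UID and GID. Empty output means the tree is ready for the
# non-root container user.
find_foreign_owned() {
  local dir="$1" uid="$2" gid="$3"
  find "$dir" \( ! -uid "$uid" -o ! -gid "$gid" \) -print
}
```

For the radarr example this would be called as `find_foreign_owned /mnt/appdata/radarr/data 1000 1000`; any path it prints is a likely source of "permission denied" errors after the switch.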
|
||||
|
||||
## Step 3 — Docker Socket Security Review
|
||||
|
||||
For each socket mount, apply this decision tree:
|
||||
|
||||
```
|
||||
Does container need Docker API access?
|
||||
├─ NO → Remove socket mount entirely
|
||||
└─ YES → Is it read-only?
|
||||
├─ YES → Keep with :ro flag, add socket proxy if not present
|
||||
└─ NO → Requires write access?
|
||||
├─ Management tool (Komodo, Portainer) → Use socket proxy with limited permissions
|
||||
└─ Other → INVESTIGATE: Why does it need write access?
|
||||
```
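
The "Is it read-only?" branch of the tree can be pre-filtered automatically. The sketch below is an assumption-laden illustration: the function name `find_writable_socket_mounts` is hypothetical, and it only recognizes short-form volume strings ending in `:ro`, not long-form `read_only: true` mounts.

```bash
#!/usr/bin/env bash
# Hypothetical helper: list docker.sock mounts in compose files that do
# not carry the :ro flag. Always returns 0; empty output means every
# socket mount found was marked read-only.
find_writable_socket_mounts() {
  local root="${1:-.}"
  grep -rnE --include='compose.yaml' --include='compose.yml' \
    '/var/run/docker\.sock' "$root" \
    | grep -vE 'docker\.sock:ro([^[:alnum:]]|$)' || true
}
```

Each line it prints goes to the "Requires write access?" branch. A clean result should be read with care: the `:ro` flag restricts changes to the socket file itself, and API requests sent through the socket may still be accepted, which is one reason the proxy pattern is preferred.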
|
||||
|
||||
### Socket Proxy Pattern (Best Practice)
|
||||
```yaml
# Never mount socket directly in application containers
# Use tecnativa/docker-socket-proxy as intermediary

docker-socket-proxy:
  image: tecnativa/docker-socket-proxy:latest
  environment:
    # Read permissions (safe for Traefik)
    - CONTAINERS=1
    - NETWORKS=1
    - SERVICES=1
    # Write permissions (limit to management tools only)
    - POST=0 # Disable by default
    - DELETE=0 # Disable by default
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock:ro

traefik:
  environment:
    - DOCKER_HOST=tcp://docker-socket-proxy:2375 # No direct socket access
```
|
||||
|
||||
## Gate 1 — Testing Plan Approval
|
||||
|
||||
Before making changes, present:
|
||||
1. List of containers to be modified
|
||||
2. Expected downtime per service
|
||||
3. Rollback plan for each change
|
||||
4. Order of operations (dependencies first)
|
||||
|
||||
**Required confirmation**: `APPROVE TESTING: Ready to proceed`
|
||||
|
||||
## Step 4 — Phased Implementation
|
||||
|
||||
Implement changes in this order:
|
||||
|
||||
### Phase A: Low-Risk Changes (Media Services)
|
||||
- Radarr, Sonarr, Prowlarr (PUID/PGID changes)
|
||||
- No downstream dependencies
|
||||
- Easy rollback
|
||||
|
||||
### Phase B: Medium-Risk Changes (Infrastructure)
|
||||
- Docker socket proxy (privileged flag removal)
|
||||
- Test with Traefik and Komodo integration
|
||||
- Monitor for API errors
|
||||
|
||||
### Phase C: High-Risk Changes (Authentik Worker)
|
||||
- Requires careful testing
|
||||
- May impact SSO functionality
|
||||
- Have admin credentials ready
|
||||
|
||||
## Step 5 — Validation & Monitoring
|
||||
|
||||
For each changed service:
|
||||
|
||||
```bash
|
||||
# Check container start
|
||||
docker compose ps
|
||||
|
||||
# Check logs for errors
|
||||
docker compose logs -f --tail=100 <service>
|
||||
|
||||
# Check resource access
|
||||
docker compose exec <service> ls -la /config
|
||||
|
||||
# Check network connectivity
|
||||
docker compose exec <service> ping -c 3 <dependency>
|
||||
```
|
||||
|
||||
### Red Flags to Watch For
|
||||
- Permission denied errors
|
||||
- Failed healthchecks
|
||||
- Repeated restarts
|
||||
- API connection failures
|
||||
|
||||
# [OUTPUT FORMAT]
|
||||
|
||||
## Container Security Audit Report
|
||||
```markdown
|
||||
## Privileged Containers
|
||||
|
||||
### docker-socket-proxy (watchtower)
|
||||
- **Status**: ❌ Privileged
|
||||
- **Justification**: None documented
|
||||
- **Recommendation**: Remove privileged flag, use group_add
|
||||
- **Impact**: None expected (tested)
|
||||
- **Implementation**: [specific YAML changes]
|
||||
|
||||
## Root User Containers
|
||||
|
||||
### radarr
|
||||
- **Status**: ⚠️ PUID=0
|
||||
- **Data Impact**: /mnt/appdata/radarr (ownership change required)
|
||||
- **Recommendation**: Change to PUID=1000
|
||||
- **Testing**: [permission fix commands]
|
||||
|
||||
## Socket Access Review
|
||||
|
||||
### authentik-worker
|
||||
- **Status**: ⚠️ Write access to socket
|
||||
- **Purpose**: Docker integration for managed outposts
|
||||
- **Recommendation**: Move to socket proxy with limited POST
|
||||
- **Alternative**: Disable Docker integration if unused
|
||||
```
|
||||
|
||||
## Implementation Checklist
|
||||
```markdown
|
||||
- [ ] Phase A: Media Services (radarr, sonarr, prowlarr)
|
||||
- [ ] Backup current configs
|
||||
- [ ] Update PUID/PGID to 1000
|
||||
- [ ] Fix filesystem permissions
|
||||
- [ ] Restart and validate
|
||||
|
||||
- [ ] Phase B: Socket Proxy Hardening
|
||||
- [ ] Remove privileged flag from watchtower proxy
|
||||
- [ ] Remove privileged flag from heimdall proxy
|
||||
- [ ] Test Traefik discovery
|
||||
- [ ] Test Komodo deployments
|
||||
|
||||
- [ ] Phase C: Authentik Worker
|
||||
- [ ] Document current Docker integration usage
|
||||
- [ ] Test socket proxy migration
|
||||
- [ ] Validate outpost functionality
|
||||
```
|
||||
|
||||
# [SAFETY MEASURES]
|
||||
|
||||
## Pre-Change Backup
|
||||
```bash
|
||||
# Backup compose files
|
||||
cp compose.yaml compose.yaml.backup-$(date +%Y%m%d)
|
||||
|
||||
# Backup application data
|
||||
tar -czf appdata-backup.tar.gz /mnt/appdata/<service>
|
||||
```
|
||||
|
||||
## Rollback Procedure
|
||||
```bash
|
||||
# Restore compose file
|
||||
mv compose.yaml.backup-20260419 compose.yaml
|
||||
|
||||
# Restore permissions
|
||||
sudo chown -R 0:0 /mnt/appdata/<service>
|
||||
|
||||
# Restart
|
||||
docker compose up -d
|
||||
```
|
||||
|
||||
# [SUCCESS CRITERIA]
|
||||
- [ ] Zero containers running with `privileged: true` (or documented exception)
|
||||
- [ ] Zero media services running as root (PUID=0)
|
||||
- [ ] All Docker socket access is read-only or proxied
|
||||
- [ ] All services pass health checks after changes
|
||||
- [ ] No permission errors in logs (24hr monitoring period)
|
||||
- [ ] Documentation updated with security justifications
|
||||
454
.github/prompts/security-network-access.prompt.md
vendored
Normal file
@ -0,0 +1,454 @@
|
||||
---
|
||||
name: security-network-access
|
||||
description: "MEDIUM: Network security and access control hardening - port exposure review, network isolation, and authentication layers. Phase 4 of security hardening."
|
||||
---
|
||||
|
||||
# [ROLE]
|
||||
You are a **Network Security Architect** specializing in container networking, service mesh security, and zero-trust access controls. Your goal is to implement defense-in-depth network security for containerized applications.
|
||||
|
||||
# [GOAL]
|
||||
Harden network security posture by:
|
||||
1. Reviewing and restricting exposed ports (0.0.0.0 → 127.0.0.1 where appropriate)
|
||||
2. Implementing network segmentation (separate Docker networks)
|
||||
3. Enforcing authentication on exposed services
|
||||
4. Documenting network architecture and access policies
|
||||
5. Implementing monitoring for unauthorized access attempts
|
||||
|
||||
# [INPUT CONTEXT]
|
||||
1. **Environment**: Multi-node Docker homelab with Traefik reverse proxy
|
||||
2. **Current State**:
|
||||
- Some services bound to 0.0.0.0 (accessible from LAN)
|
||||
- Single shared network (`proxy-net`) for all services
|
||||
- Redis exposed without authentication
|
||||
- Mixed use of `network_mode: host`
|
||||
3. **Target**: Defense-in-depth with principle of least exposure
|
||||
|
||||
# [FINDINGS TO ADDRESS]
|
||||
|
||||
## 🟡 Exposed Ports Without Authentication
|
||||
1. `nodes/heimdall/core/compose.yaml:50` - Redis `6379:6379` (no auth)
|
||||
2. `nodes/heimdall/qbittorrent/compose.yaml:20` - qBittorrent `0.0.0.0:8081:8081`
|
||||
3. `nodes/heimdall/core/compose.yaml:125` - Komodo `9120:9120` (should be behind Traefik only)
|
||||
|
||||
## 🟡 Network Mode: Host
|
||||
1. `nodes/waldorf/plex/compose.yaml:5` - Plex (required for discovery)
|
||||
2. `nodes/watchtower/compose.yaml:39` - Periphery (accessing external IPs)
|
||||
|
||||
## 🟡 Network Segmentation Opportunity
|
||||
- All services on single `proxy-net` network
|
||||
- No separation between public-facing and internal services
|
||||
- Database services mixed with application services
|
||||
|
||||
# [NON-NEGOTIABLES]
|
||||
- **Maintain Functionality**: Port changes must preserve service accessibility
|
||||
- **Document Network Architecture**: Create network diagrams showing service relationships
|
||||
- **Test Before Deploying**: Validate network changes don't break inter-service communication
|
||||
- **Graceful Degradation**: Services should fail safely, not expose more access
|
||||
|
||||
# [WORKFLOW]
|
||||
|
||||
## Gate 0 — Network Discovery & Mapping
|
||||
|
||||
### Scan Current Network Configuration
|
||||
```bash
|
||||
# For each node, inventory:
|
||||
# 1. Exposed ports
|
||||
docker ps --format "table {{.Names}}\t{{.Ports}}"
|
||||
|
||||
# 2. Networks
|
||||
docker network ls
|
||||
docker network inspect proxy-net --format '{{range .Containers}}{{.Name}} {{end}}'
|
||||
|
||||
# 3. Listening ports on host
|
||||
sudo netstat -tlnp | grep LISTEN
|
||||
```
|
||||
|
||||
### Create Network Map
|
||||
Document:
|
||||
- Which services need external (LAN) access
|
||||
- Which services need only internal (container-to-container) access
|
||||
- Which services need internet access
|
||||
- Service dependencies (A → B communication)
|
||||
|
||||
**Required confirmation**: `NETWORK MAP COMPLETE: <count> services cataloged`
|
||||
|
||||
## Step 1 — Port Exposure Remediation
|
||||
|
||||
For each exposed port, apply this decision tree:
|
||||
|
||||
```
|
||||
Should this port be accessible from LAN?
|
||||
├─ NO (internal only)
|
||||
│ └─ Remove port binding, use Docker DNS
|
||||
│ Example: Redis 6379:6379 → no ports: section
|
||||
│
|
||||
├─ YES (behind reverse proxy)
|
||||
│ └─ Bind to localhost only
|
||||
│ Example: 0.0.0.0:8080:8080 → 127.0.0.1:8080:8080
|
||||
│
|
||||
└─ YES (direct LAN access needed)
|
||||
└─ Document justification + add authentication
|
||||
Example: qBittorrent web UI (VPN-only traffic)
|
||||
```
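
The first cut of this tree depends only on the shape of the `ports:` entry, which can be classified mechanically. A minimal sketch, assuming short-form port strings; the function name `classify_binding` and the three labels are hypothetical. An entry with no host IP is treated as LAN-exposed because Docker publishes on all interfaces by default.

```bash
#!/usr/bin/env bash
# Hypothetical helper: classify a compose short-form `ports:` entry.
# Prints one of:
#   internal  - empty entry, i.e. no published port
#   localhost - bound to a loopback address only
#   lan       - bound to all interfaces or to a specific non-loopback IP
classify_binding() {
  local binding="$1"
  case "$binding" in
    "")                      echo "internal" ;;
    127.*:*:*|"[::1]":*:*)   echo "localhost" ;;
    *)                       echo "lan" ;;
  esac
}
```

For instance, `classify_binding "6379:6379"` prints `lan`, matching the Redis finding, while `classify_binding "127.0.0.1:8081:8081"` prints `localhost`. Whether a `lan` result is justified remains the manual decision described in the tree.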
|
||||
|
||||
### Example Remediations
|
||||
|
||||
#### Redis (CRITICAL - No Authentication)
|
||||
```yaml
# BEFORE (INSECURE - accessible from LAN)
redis:
  image: redis:7-alpine
  ports:
    - "6379:6379" # ❌ No authentication, LAN accessible
  networks:
    - proxy-net

# AFTER (SECURE - internal only)
redis:
  image: redis:7-alpine
  # No ports section - only accessible via Docker DNS
  networks:
    - internal-net # Separated network
  command: redis-server --requirepass ${REDIS_PASSWORD}
  environment:
    - REDIS_PASSWORD=${REDIS_PASSWORD}

# Update clients to connect via redis:6379 (Docker DNS)
traefik:
  environment:
    - REDIS_ADDR=redis:6379
    - REDIS_PASSWORD=${REDIS_PASSWORD}
```
|
||||
|
||||
#### qBittorrent (VPN-Attached Service)
|
||||
```yaml
# BEFORE
qbittorrent:
  network_mode: "service:gluetun"
  # Exposed via gluetun on 0.0.0.0:8081

gluetun:
  ports:
    - 0.0.0.0:8081:8081 # ❌ Accessible from any LAN device

# AFTER
gluetun:
  ports:
    - 127.0.0.1:8081:8081 # ✅ Only localhost access
  networks:
    - proxy-net

# Access via Traefik only (adds authentication layer)
# No direct IP:8081 access from LAN
```

#### Komodo (Management Interface)
```yaml
# BEFORE
komodo-core:
  ports:
    - 9120:9120  # ❌ Direct LAN access, bypassing Traefik auth

# AFTER
komodo-core:
  # Remove direct port exposure - Traefik only
  networks:
    - proxy-net
  labels:
    - "traefik.http.services.komodo.loadbalancer.server.port=9120"
    # Add authentication middleware (Authentik or BasicAuth)
    - "traefik.http.routers.komodo.middlewares=authentik@file"

# Access only via https://komodo.castaldifamily.com (authenticated)
```
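
To find further candidates for the remediations above, the compose files can be scanned for published ports that lack an explicit loopback bind. The following is a minimal sketch, not part of the original workflow: the function name `find_exposed_ports` is hypothetical, and it assumes short-syntax port entries (`- "6379:6379"`, `- 0.0.0.0:8081:8081`) in files named `compose.yaml` or `docker-compose.yml`. Long-syntax entries (`published:` / `target:`) are not detected.

```bash
#!/usr/bin/env bash
# Sketch: list short-syntax port mappings that are NOT bound to 127.0.0.1.
# find_exposed_ports is a hypothetical helper name, not an existing tool.

find_exposed_ports() {
  local root="${1:-.}"
  # First grep: list items shaped like HOST:CONTAINER or IP:HOST:CONTAINER.
  # Second grep: drop entries that already bind to the loopback address.
  grep -rnE --include='compose.yaml' --include='docker-compose.yml' \
    '^[[:space:]]*-[[:space:]]*"?([0-9.]+:)?[0-9]+:[0-9]+' "$root" \
    | grep -vE -- '-[[:space:]]*"?127\.0\.0\.1:' || true
}

# Usage (run from the repository root):
#   find_exposed_ports nodes/
```

Each hit still needs a manual decision against the tree above: remove the binding, move it to `127.0.0.1`, or document why LAN exposure is required.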

## Step 2 — Network Segmentation

Create purpose-specific networks:

```yaml
# nodes/heimdall/core/compose.yaml
networks:
  # Public-facing services (Traefik, auth)
  proxy-net:
    name: proxy-net
    driver: bridge

  # Internal services (databases, cache)
  internal-net:
    name: internal-net
    driver: bridge
    internal: true  # ✅ No external connectivity

  # Management tools (Komodo, Portainer)
  mgmt-net:
    name: mgmt-net
    driver: bridge
```

### Service Network Assignment Strategy
```yaml
# Public-facing reverse proxy
traefik:
  networks:
    - proxy-net     # Internet-facing
    - internal-net  # Access to backends
    - mgmt-net      # Komodo integration

# Backend databases
authentik_postgres:
  networks:
    - internal-net  # Only internal access

# Application with both public and DB access
authentik_server:
  networks:
    - proxy-net     # Traefik → authentik
    - internal-net  # authentik → postgres
```

## Step 3 — Authentication Layer Enforcement

### Audit Current Authentication State
For each publicly accessible service:

```markdown
| Service | URL | Authentication | Risk Level |
|---------|-----|----------------|------------|
| Traefik Dashboard | proxy.castaldifamily.com | ❌ None | HIGH |
| Komodo | komodo.castaldifamily.com | ❌ Direct port 9120 | HIGH |
| qBittorrent | qbit.castaldifamily.com | ⚠️ App-level only | MEDIUM |
| Vaultwarden | vault.castaldifamily.com | ✅ App + rate limit | LOW |
```

### Implement Traefik Middleware Authentication
```yaml
# nodes/heimdall/core/compose.yaml - Add to Traefik dynamic config
# /mnt/appdata/traefik/dynamic/middlewares.yml

http:
  middlewares:
    # Option 1: Authentik SSO (recommended)
    authentik:
      forwardAuth:
        address: http://authentik_server:9000/outpost.goauthentik.io/auth/traefik
        trustForwardHeader: true
        authResponseHeaders:
          - X-authentik-username
          - X-authentik-groups
          - X-authentik-email

    # Option 2: Basic Auth (fallback)
    basic-auth:
      basicAuth:
        users:
          - "admin:$apr1$..."  # Generate with htpasswd
        realm: "Homelab Services"

    # Option 3: IP Whitelist (LAN-only)
    # Note: Traefik v3 renames this middleware to ipAllowList
    lan-only:
      ipWhiteList:
        sourceRange:
          - "10.0.0.0/24"   # Your LAN subnet
          - "127.0.0.1/32"  # Localhost
```

### Apply Middleware to Services
```yaml
# Example: Protect Traefik dashboard
traefik:
  labels:
    - "traefik.http.routers.traefik-secure.middlewares=authentik@file"

# Example: Protect Komodo
komodo-core:
  labels:
    - "traefik.http.routers.komodo.middlewares=authentik@file,lan-only@file"
```

## Step 4 — Host Network Mode Review

For services using `network_mode: host`:

### Plex (Justified - DLNA Discovery)
```yaml
# CURRENT
plex:
  network_mode: host  # Required for DLNA/discovery

# DOCUMENTATION
# Justification: Plex requires host networking for:
# - DLNA/UPnP device discovery (UDP multicast)
# - Bonjour/Avahi service advertisement
# - Client auto-detection on LAN
#
# Mitigation:
# - UFW rules to restrict access to Plex ports (32400)
# - Plex app-level authentication enforced
# - Regular security updates

# UFW Configuration
ufw_allowed_ports:
  - { port: '32400', proto: 'tcp', comment: 'Plex Media Server', src: '10.0.0.0/24' }
```

### Periphery (Justified - External IP Access)
```yaml
# CURRENT
periphery:
  network_mode: host
  # Needs to bind to external IP for Komodo Core connection

# ALTERNATIVE (Preferred)
periphery:
  networks:
    - proxy-net
  environment:
    - PERIPHERY_BIND_ADDRESS=10.0.0.200  # Explicit IP binding
  # Remove host network mode
```

## Step 5 — Monitoring & Alerting

### Implement Traefik Access Logging
```yaml
# /mnt/appdata/traefik/traefik.yml
accessLog:
  filePath: "/var/log/traefik/access.log"
  format: json
  filters:
    statusCodes:
      - "400-499"  # Client errors
      - "500-599"  # Server errors
```

### Monitor for Unauthorized Access Attempts
```bash
#!/bin/bash
# scripts/monitor-access.sh
# Monitoring script: list the 20 most recent failed auth attempts (HTTP 401/403).
# Assumes the access log configured above is bind-mounted to this host path.

jq -r 'select(.DownstreamStatus == 401 or .DownstreamStatus == 403)
       | [.ClientHost, .RequestPath, .DownstreamStatus] | @tsv' \
  /mnt/appdata/traefik/access-logs/access.log | tail -20

# Alert on excessive failures (integration with fail2ban)
```

## Gate 1 — Impact Assessment

Before deploying network changes:

1. **Connectivity Matrix**: Document which services will lose direct access
2. **Downtime Estimate**: Calculate restart time for network changes
3. **Rollback Plan**: Prepare to revert network changes if issues arise
4. **User Communication**: Notify users of service interruptions

**Required confirmation**: `IMPACT UNDERSTOOD: Proceed with changes`

## Step 6 — Phased Deployment

### Week 1: Internal Network Segmentation
- Create `internal-net` network
- Move Redis to internal-only network
- Update client connections to use Docker DNS
- Verify all services can still reach Redis

### Week 2: Port Binding Restrictions
- Change 0.0.0.0 bindings to 127.0.0.1 for proxied services
- Remove direct port exposure for Komodo
- Test all Traefik reverse proxy routes

### Week 3: Authentication Middleware
- Deploy Authentik middleware to Traefik
- Apply to high-value services (Komodo, Traefik dashboard)
- Test SSO flow for protected services

### Week 4: Monitoring & Documentation
- Enable Traefik access logging
- Create network architecture diagram
- Document authentication requirements per service
- Set up alerting for security events

# [OUTPUT FORMAT]

## Network Security Assessment
````markdown
## Port Exposure Audit

### Critical (Remove Direct Exposure)
- [ ] Redis 6379 → Remove port binding, use Docker DNS
- [ ] Komodo 9120 → Remove direct port, Traefik-only access

### Medium (Restrict to Localhost)
- [ ] qBittorrent 0.0.0.0:8081 → 127.0.0.1:8081

### Low (Document Justification)
- [ ] Plex host network → Required for DLNA, add UFW rules

## Network Segmentation Plan

### Network Architecture
```
                  ┌─────────────┐
                  │  Internet   │
                  └──────┬──────┘
                         │
                  ┌──────▼──────┐
                  │   Traefik   │ (proxy-net + internal-net + mgmt-net)
                  └──────┬──────┘
                         │
        ┌────────────────┼────────────────┐
        │                │                │
  ┌─────▼─────┐    ┌─────▼─────┐    ┌─────▼─────┐
  │ Authentik │    │ Services  │    │  Komodo   │
  │ (public)  │    │ (internal)│    │  (mgmt)   │
  └─────┬─────┘    └─────┬─────┘    └───────────┘
        │                │
  ┌─────▼─────┐    ┌─────▼─────┐
  │ Postgres  │    │  Redis    │
  │(internal) │    │(internal) │
  └───────────┘    └───────────┘
```

## Authentication Matrix

| Service | Access Method | Auth Layer | Status |
|---------|--------------|------------|--------|
| Traefik Dashboard | https://proxy.* | Authentik SSO | ✅ Implement |
| Komodo | https://komodo.* | Authentik SSO | ✅ Implement |
| Vaultwarden | https://vault.* | App-level + Rate Limit | ✅ Already secure |
| qBittorrent | https://qbit.* | App-level | ⚠️ Add IP whitelist |
| Plex | https://plex.* | Plex Auth | ℹ️ Already secure |
````

# [VALIDATION CHECKLIST]

After each deployment phase:
```bash
# Test internal service connectivity
docker compose exec traefik ping redis

# Test Traefik routing
curl -I https://komodo.castaldifamily.com

# Test authentication
curl -I https://proxy.castaldifamily.com/dashboard/
# Should return 401/403 without auth

# Verify no exposed ports
nmap 10.0.0.151 -p 6379,9120
# Should show filtered/closed
```

# [SUCCESS CRITERIA]
- [ ] Zero services with unnecessary 0.0.0.0 port bindings
- [ ] Internal-only services (Redis, Postgres) not accessible from LAN
- [ ] All management interfaces protected by authentication
- [ ] Network segmentation implemented (3+ networks)
- [ ] Host networking documented and justified
- [ ] Access logging enabled and monitored
- [ ] Network architecture diagram created
- [ ] All services accessible via intended methods (Traefik)
- [ ] No regression in service functionality
161
.github/prompts/security-secrets-remediation.prompt.md
vendored
Normal file
@ -0,0 +1,161 @@
---
name: security-secrets-remediation
description: "CRITICAL: Systematic remediation of hardcoded secrets in Docker Compose files. Phase 1 of security hardening - addresses exposed credentials in version control."
---

# [ROLE]
You are a **Security Engineer** specializing in secrets management for containerized infrastructure. Your goal is to eliminate hardcoded secrets from Docker Compose files and establish secure credential management practices.

# [GOAL]
Systematically identify and remediate all hardcoded secrets in Docker Compose files, replacing them with secure `.env` file references while maintaining operational integrity.

# [INPUT CONTEXT]
1. **Environment**: Multi-node Docker homelab with Traefik reverse proxy, Authentik SSO, and media services
2. **Current State**: Several compose files contain hardcoded secrets in version control
3. **Target State**: All secrets externalized to `.env` files (gitignored) with template documentation

# [CRITICAL FINDINGS TO ADDRESS]

## 🔴 Priority 1 - Exposed Credentials
1. **Docker Registry**: `REGISTRY_HTTP_SECRET=temporary_secret_123` in `nodes/heimdall/docker_registry/compose.yaml`
2. **Komodo Onboarding Key**: `PERIPHERY_ONBOARDING_KEY=O_VegHtPxiQKrzsAd8MqlrJEs2WLxZ_O` in `nodes/watchtower/compose.yaml`
3. **Plex Claim Token**: `PLEX_CLAIM=claim-sxFpsPTDzzF-9RZAxtUL` in `nodes/waldorf/plex/compose.yaml`

## 🟠 Priority 2 - Verification Required
- Cloudflare API tokens in `nodes/heimdall/core/compose.yaml` (verify if in .env)
- Database passwords in Authentik stack (verify vault usage)
- VPN credentials in qBittorrent stack (verify .env)

# [NON-NEGOTIABLES]
- **NEVER** commit `.env` files containing actual secrets
- **ALWAYS** create `.env.template` files with placeholder values
- **VERIFY** `.env` is in `.gitignore` before proceeding
- **TEST** each service after secret migration to prevent service disruption

# [WORKFLOW]

## Gate 0 — Inventory & Confirmation
1. Scan all `compose.yaml` files in the workspace for patterns:
   - Hardcoded tokens: `*_TOKEN=`, `*_KEY=`, `*_SECRET=`
   - Hardcoded passwords: `PASSWORD=`, `PASS=`
   - API keys: `API_KEY=`, `CLAIM=`
2. Create inventory list with file paths and secret names
3. Present findings for confirmation

**Required confirmation**: `CONFIRM INVENTORY: <count> secrets found`
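
The inventory scan can be sketched as a single `grep` pass. This is an illustrative helper under stated assumptions, not the prompt's prescribed tooling: `scan_hardcoded_secrets` is a hypothetical name, the keyword list mirrors the patterns above, and only files named `compose.yaml` are searched. Values that begin with `$` (variable references such as `${REDIS_PASSWORD}`) are skipped, and the output is reduced to `file:line:NAME` so the secret values themselves are not echoed to the terminal.

```bash
#!/usr/bin/env bash
# Sketch: inventory hardcoded secrets in compose files (names only, no values).
# scan_hardcoded_secrets is a hypothetical helper name.

scan_hardcoded_secrets() {
  local root="${1:-.}"
  local names='(TOKEN|KEY|SECRET|PASSWORD|PASS|CLAIM)'
  # Match NAME=value or NAME: value where the value does not start with "$".
  # -o keeps only the name plus the first value character, which sed strips.
  grep -rnoE --include='compose.yaml' \
    "[A-Za-z0-9_]*${names}[A-Za-z0-9_]*[=:][[:space:]]*\"?[^[:space:]\$\"]" "$root" \
    | sed -E 's/[=:][[:space:]]*"?.$//' || true
}

# Usage (run from the repository root):
#   scan_hardcoded_secrets nodes/ | tee secrets-inventory.txt
#   wc -l < secrets-inventory.txt   # count for the Gate 0 confirmation
```

Keyword matching of this kind produces false positives (any variable whose name contains `PASS` or `KEY`), so the list is a starting point for manual review rather than a final inventory.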

## Step 1 — Create .env Template Structure
For each affected compose file:
1. Identify the directory (e.g., `nodes/heimdall/docker_registry/`)
2. Create `.env.template` with:
   ```bash
   # Generated: [DATE]
   # Service: [SERVICE_NAME]
   # Required secrets for deployment

   # [SECRET_NAME] - [DESCRIPTION]
   # Generate with: [COMMAND if applicable]
   SECRET_NAME=CHANGEME_[HINT]
   ```

## Step 2 — Update Compose Files
For each hardcoded secret:
1. Replace inline value with variable reference:
   ```yaml
   # BEFORE
   environment:
     - REGISTRY_HTTP_SECRET=temporary_secret_123

   # AFTER
   environment:
     - REGISTRY_HTTP_SECRET=${REGISTRY_HTTP_SECRET}
   ```
2. Add `env_file: .env` if not present and the container should receive every variable in the file (Compose already reads `.env` from the project directory when interpolating `${VAR}` references)
3. Document in comments what the secret is used for
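
Once a compose file uses `${VAR}` references, a starting `.env.template` can be derived from it mechanically. The sketch below is an assumption-laden convenience, not part of the original workflow: `make_env_template` is a hypothetical name, and the placeholder hint is simply the lowercased variable name, so descriptions and generation commands still have to be added by hand.

```bash
#!/usr/bin/env bash
# Sketch: derive placeholder lines for .env.template from ${VAR} references.
# make_env_template is a hypothetical helper name.

make_env_template() {
  local compose_file="$1"
  # Collect each ${NAME...} reference, strip the leading "${", de-duplicate.
  grep -oE '\$\{[A-Za-z_][A-Za-z0-9_]*' "$compose_file" \
    | cut -c3- \
    | sort -u \
    | while read -r name; do
        printf '%s=CHANGEME_%s\n' "$name" \
          "$(printf '%s' "$name" | tr '[:upper:]' '[:lower:]')"
      done
}

# Usage:
#   make_env_template nodes/heimdall/docker_registry/compose.yaml >> .env.template
```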

## Step 3 — Generate Actual Secrets
Provide commands to generate secure random secrets:
```bash
# Registry HTTP secret (32 random bytes = 64 hex characters)
openssl rand -hex 32

# JWT secrets (64 random bytes = 128 hex characters)
openssl rand -hex 64

# API tokens (varies)
# Manual: Regenerate from service UI
```
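
Note that the argument to `openssl rand -hex` is a byte count, so the printed string is twice as long. A small wrapper makes that relationship explicit; `gen_hex_secret` is a hypothetical name, and the `/dev/urandom` fallback is an assumption for hosts without `openssl` installed.

```bash
#!/usr/bin/env bash
# Sketch: generate N random bytes as a hex string (2*N characters).
# gen_hex_secret is a hypothetical helper name.

gen_hex_secret() {
  local bytes="${1:-32}"
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -hex "$bytes"
  else
    # Fallback: read from the kernel CSPRNG and hex-encode with od.
    head -c "$bytes" /dev/urandom | od -An -v -tx1 | tr -d ' \n'
    echo
  fi
}

# Usage:
#   secret=$(gen_hex_secret 32)
#   echo "${#secret}"   # 64
```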

## Gate 1 — Pre-Deployment Verification
Before applying changes, verify:
- [ ] `.env` is in `.gitignore` (check root and service-level)
- [ ] `.env.template` files created for all affected services
- [ ] No actual secrets in `.env.template` files
- [ ] Compose file syntax valid (`docker compose config`)

**Required confirmation**: `VERIFY COMPLETE: Ready to deploy`
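
The "no actual secrets in `.env.template`" check lends itself to automation. The following sketch assumes the `CHANGEME_` placeholder convention from Step 1 and flags any assignment whose value does not start with it; `check_env_template` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Sketch: fail if a template contains anything other than CHANGEME placeholders.
# check_env_template is a hypothetical helper name.

check_env_template() {
  local template="$1"
  local bad
  # Keep assignment lines only, then drop those whose value starts with CHANGEME.
  bad=$(grep -E '^[A-Za-z_][A-Za-z0-9_]*=' "$template" \
        | grep -vE '^[^=]+=CHANGEME' \
        | cut -d= -f1 || true)
  if [ -n "$bad" ]; then
    echo "$bad" | sed 's/^/Non-placeholder value for: /' >&2
    return 1
  fi
  return 0
}

# Usage:
#   find nodes -name '.env.template' | while read -r f; do
#     check_env_template "$f" || echo "FAILED: $f"
#   done
```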

## Step 4 — Deployment & Testing
For each service:
1. Create `.env` from `.env.template`
2. Populate with actual secret values
3. Test compose file validation: `docker compose config`
4. Restart service: `docker compose up -d`
5. Verify service health and logs
6. Document any issues encountered

## Step 5 — Post-Deployment Cleanup
1. **Git Operations**:
   - Commit updated `compose.yaml` files
   - Commit `.env.template` files
   - Verify no `.env` files staged: `git status`
   - Push changes
2. **Documentation**:
   - Update service README with secret requirements
   - Document rotation procedures
   - Create recovery instructions

# [OUTPUT FORMAT]

## Secrets Inventory Report
```markdown
## Hardcoded Secrets Inventory

### Critical (Exposed in Git)
- [ ] `nodes/heimdall/docker_registry/compose.yaml:8` - REGISTRY_HTTP_SECRET
- [ ] `nodes/watchtower/compose.yaml:43` - PERIPHERY_ONBOARDING_KEY
- [ ] `nodes/waldorf/plex/compose.yaml:11` - PLEX_CLAIM

### Verification Required
- [ ] Cloudflare tokens in core stack
- [ ] Database passwords in Authentik

## Remediation Steps
[Generated per-service instructions]

## Validation Checklist
[Pre and post-deployment checks]
```

## .env.template Example
```bash
# Service: Docker Registry
# Path: nodes/heimdall/docker_registry/.env
# Generated: 2026-04-19

# Registry HTTP secret for securing HTTP operations
# Generate with: openssl rand -hex 32
REGISTRY_HTTP_SECRET=CHANGEME_generate_with_openssl
```

# [SAFETY CHECKS]
- **Pre-commit hook**: Suggest adding git hook to prevent `.env` commits
- **Secret rotation**: Document how to rotate each type of secret
- **Backup**: Ensure secrets are backed up securely (password manager, encrypted vault)
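
One possible shape for the suggested pre-commit hook is shown below. It is a sketch under assumptions: `staged_env_files` is a hypothetical helper, the filter treats `.env` and `.env.<suffix>` as blocked while allowing `.env.template`, and the wiring into `.git/hooks/pre-commit` is shown only as comments.

```bash
#!/usr/bin/env bash
# Sketch: filter a list of staged paths down to env files that must not be committed.
# staged_env_files is a hypothetical helper name.

staged_env_files() {
  # Reads paths on stdin; prints those named .env or .env.<suffix>,
  # except .env.template, which is meant to be committed.
  grep -E '(^|/)\.env(\.[^/]+)?$' | grep -vE '(^|/)\.env\.template$' || true
}

# In .git/hooks/pre-commit, the function could be wired up like this:
#   blocked=$(git diff --cached --name-only --diff-filter=ACM | staged_env_files)
#   if [ -n "$blocked" ]; then
#     echo "Refusing to commit env files:" >&2
#     echo "$blocked" >&2
#     exit 1
#   fi
```

A hook only protects clones where it is installed, so it complements the `.gitignore` check rather than replacing it.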

# [SUCCESS CRITERIA]
- [ ] Zero hardcoded secrets remain in any `compose.yaml` file
- [ ] All services successfully restart with `.env` file secrets
- [ ] `.env.template` files committed to version control
- [ ] Actual `.env` files never committed (verified via `git log`)
- [ ] Documentation updated with secret management procedures