Created Files
security-secrets-remediation.prompt.md - Phase 1 (CRITICAL)
- Eliminates hardcoded secrets (Docker Registry, Komodo, Plex)
- Creates .env templates and migration workflow
- Priority: Immediate (This Week)

security-container-hardening.prompt.md - Phase 2 (HIGH)
- Removes privileged containers
- Converts root users to non-root (PUID/PGID)
- Secures Docker socket access patterns
- Priority: Short Term (This Month)

security-ansible-hardening.prompt.md - Phase 3 (MEDIUM)
- Enables SSH host key checking
- Implements restricted sudo rules
- Deploys UFW firewalls and fail2ban
- Priority: Medium Term (Next Month)

security-network-access.prompt.md - Phase 4 (MEDIUM)
- Restricts port exposure (0.0.0.0 → 127.0.0.1)
- Implements network segmentation
- Adds authentication middleware
- Priority: Ongoing (Next Quarter)

Each prompt follows your existing format with:
✅ Gated workflows with confirmation checkpoints
✅ Rollback procedures for safety
✅ Testing and validation steps
✅ Incremental deployment strategies
✅ Clear success criteria
Parent: 417501dbd1
Commit: 129b7eee1b
406
.github/prompts/security-ansible-hardening.prompt.md
vendored
Normal file
@ -0,0 +1,406 @@
|
|||||||
|
---
|
||||||
|
name: security-ansible-hardening
|
||||||
|
description: "MEDIUM: Ansible security hardening - SSH configuration, sudo security, and host-level security controls. Phase 3 of security hardening."
|
||||||
|
---
|
||||||
|
|
||||||
|
# [ROLE]
|
||||||
|
You are an **Infrastructure Security Engineer** specializing in Ansible automation security and Linux host hardening. Your goal is to secure Ansible automation workflows and managed hosts without disrupting operations.
|
||||||
|
|
||||||
|
# [GOAL]
|
||||||
|
Harden Ansible security posture by:
|
||||||
|
1. Implementing secure SSH configuration (host key checking)
|
||||||
|
2. Configuring least-privilege sudo access
|
||||||
|
3. Enabling host-level firewalls (UFW)
|
||||||
|
4. Securing Ansible Vault password files
|
||||||
|
5. Implementing fail2ban for brute-force protection
|
||||||
|
|
||||||
|
# [INPUT CONTEXT]
|
||||||
|
1. **Environment**: Multi-node homelab managed via Ansible
|
||||||
|
2. **Current State**:
|
||||||
|
- SSH host key checking disabled
|
||||||
|
- Passwordless sudo without restrictions
|
||||||
|
- No host firewalls (UFW disabled)
|
||||||
|
- Vault password file permissions not verified
|
||||||
|
3. **Managed Nodes**: Proxmox (root), Docker nodes (chester user), Raspberry Pi (chester user)
|
||||||
|
|
||||||
|
# [FINDINGS TO ADDRESS]
|
||||||
|
|
||||||
|
## 🟠 Ansible Configuration Security
|
||||||
|
1. `ansible/ansible.cfg:34` - `host_key_checking = False`
|
||||||
|
2. `ansible/ansible.cfg:35` - `StrictHostKeyChecking=no`
|
||||||
|
3. `ansible/ansible.cfg:30` - `become_ask_pass = False`
|
||||||
|
4. `ansible/ansible.cfg:11` - Vault password file permissions not enforced
|
||||||
|
|
||||||
|
## 🟡 Host Security Controls
|
||||||
|
1. `ansible/group_vars/all.yml:29` - UFW disabled
|
||||||
|
2. `ansible/group_vars/all.yml:30` - fail2ban disabled
|
||||||
|
3. No SSH key rotation policy
|
||||||
|
4. No sudo command restrictions
|
||||||
|
|
||||||
|
# [NON-NEGOTIABLES]
|
||||||
|
- **Gradual Rollout**: Enable security controls one node at a time
|
||||||
|
- **Maintain Access**: Never lock yourself out during SSH hardening
|
||||||
|
- **Test Playbooks**: Validate all changes with `--check` mode first
|
||||||
|
- **Document Exceptions**: Some settings (like Proxmox root access) may have valid reasons
|
||||||
|
|
||||||
|
# [WORKFLOW]
|
||||||
|
|
||||||
|
## Gate 0 — Current State Assessment
|
||||||
|
|
||||||
|
Run these validation commands:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Check vault password file permissions
|
||||||
|
ls -la ansible/vault/.vault_pass
|
||||||
|
|
||||||
|
# Check SSH key distribution
|
||||||
|
ansible all -m shell -a "ls -la ~/.ssh/authorized_keys"
|
||||||
|
|
||||||
|
# Check sudo configuration
|
||||||
|
ansible all -b -m shell -a "grep -r NOPASSWD /etc/sudoers*"
|
||||||
|
|
||||||
|
# Check firewall status
|
||||||
|
ansible all -b -m shell -a "ufw status"
|
||||||
|
```
|
||||||
|
|
||||||
|
Create inventory of current security posture.
|
||||||
|
|
||||||
|
**Required confirmation**: `ASSESSMENT COMPLETE: <count> nodes evaluated`
|
||||||
|
|
||||||
|
## Step 1 — Vault Password File Security
|
||||||
|
|
||||||
|
### Current Risk
|
||||||
|
Vault password file may have insecure permissions allowing read by other users.
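Before running the remediation, the risk can be confirmed by reading the file's mode bits directly. The sketch below is a hypothetical helper (`is_private_file` is not part of this repository) and assumes GNU `stat`, as shipped on Debian/Ubuntu hosts. It treats any group or other permission bit as insecure and demonstrates the check on a throwaway file rather than the real vault password file.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if the file exists and grants no
# permissions to group or others (for example 600 or 400).
# Assumes GNU stat (stat -c); returns 2 if the file cannot be inspected.
is_private_file() {
  local path="$1" mode
  [ -f "$path" ] || return 2
  mode=$(stat -c '%a' "$path") || return 2
  # The last two octal digits are the group/other bits; both must be 0.
  case "$mode" in
    0|*00) return 0 ;;
    *)     return 1 ;;
  esac
}

# Demo on a scratch file, not on ansible/vault/.vault_pass itself.
demo_file=$(mktemp)
chmod 644 "$demo_file"
is_private_file "$demo_file" || echo "insecure mode: $(stat -c '%a' "$demo_file")"
chmod 600 "$demo_file"
is_private_file "$demo_file" && echo "secure mode: $(stat -c '%a' "$demo_file")"
rm -f "$demo_file"
```

Pointing the same function at `ansible/vault/.vault_pass` gives a yes/no answer that can be recorded in the Gate 0 assessment.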
|
||||||
|
|
||||||
|
### Remediation
|
||||||
|
```yaml
# Add to ansible/playbooks/secure-vault-file.yml
---
- name: Secure Ansible Vault password file
  hosts: localhost
  # Facts must be gathered: ansible_user_id (used below) is a fact
  gather_facts: true
  tasks:
    - name: Check vault password file exists
      ansible.builtin.stat:
        path: "{{ playbook_dir }}/../vault/.vault_pass"
      register: vault_pass_file

    - name: Ensure vault password file has secure permissions
      ansible.builtin.file:
        path: "{{ playbook_dir }}/../vault/.vault_pass"
        mode: '0600'
        owner: "{{ ansible_user_id }}"
      when: vault_pass_file.stat.exists

    - name: Verify vault directory permissions
      ansible.builtin.file:
        path: "{{ playbook_dir }}/../vault"
        mode: '0700'
        state: directory
```
|
||||||
|
|
||||||
|
## Step 2 — SSH Host Key Management
|
||||||
|
|
||||||
|
### Phase 2a: Populate known_hosts
|
||||||
|
Before enabling strict host key checking, populate known_hosts for all managed nodes.
|
||||||
|
|
||||||
|
```yaml
# ansible/playbooks/populate-known-hosts.yml
# Note: ssh-keyscan trusts whatever key the network returns; compare the
# fingerprints against each host's console before relying on them.
---
- name: Populate SSH known_hosts for all managed nodes
  hosts: localhost
  gather_facts: false
  vars:
    ansible_connection: local
  tasks:
    - name: Scan SSH host keys
      ansible.builtin.shell: |
        ssh-keyscan -H {{ hostvars[item].ansible_host | default(item) }} >> ~/.ssh/known_hosts 2>/dev/null
      # Falls back to the inventory name when ansible_host is not set
      loop: "{{ groups['all'] }}"
      changed_when: false

    - name: Remove duplicate entries
      # -H salts every hash, so re-scanned hosts produce new lines that
      # sort -u cannot merge; only byte-identical lines are removed
      ansible.builtin.shell: |
        sort -u ~/.ssh/known_hosts > ~/.ssh/known_hosts.tmp
        mv ~/.ssh/known_hosts.tmp ~/.ssh/known_hosts
        chmod 600 ~/.ssh/known_hosts
      changed_when: false
```
|
||||||
|
|
||||||
|
### Phase 2b: Enable Host Key Checking
|
||||||
|
After known_hosts is populated, update ansible.cfg:
|
||||||
|
|
||||||
|
```ini
# ansible/ansible.cfg
[defaults]
# Changed from False. Kept on its own line: ansible.cfg only accepts ';'
# as an inline comment marker, so a trailing '#' would become part of the value.
host_key_checking = True

[ssh_connection]
# Remove -o StrictHostKeyChecking=no
ssh_args = -o ControlMaster=auto -o ControlPersist=60s -o UserKnownHostsFile=~/.ssh/known_hosts
```
|
||||||
|
|
||||||
|
### Phase 2c: Verification
|
||||||
|
```bash
|
||||||
|
# Test connection to all hosts
|
||||||
|
ansible all -m ping
|
||||||
|
|
||||||
|
# Should succeed without warnings
|
||||||
|
```
|
||||||
|
|
||||||
|
## Step 3 — Sudo Security Configuration
|
||||||
|
|
||||||
|
### Current Risk
|
||||||
|
`become_ask_pass = False` assumes all nodes have unrestricted NOPASSWD sudo.
|
||||||
|
|
||||||
|
### Recommended Approach
|
||||||
|
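Create restricted sudoers files for automation. Be aware that Ansible modules do not run as individual commands: `become` launches a shell that executes the module's Python code, so a per-command allowlist typically breaks module-based tasks and mainly suits `command`/`shell` invocations. Validate on the test node before rolling out: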
|
||||||
|
|
||||||
|
```yaml
# ansible/playbooks/configure-sudo-security.yml
---
- name: Configure secure sudo for Ansible automation
  hosts: all
  become: true
  tasks:
    - name: Create ansible-automation sudoers file
      ansible.builtin.copy:
        dest: /etc/sudoers.d/50-ansible-automation
        content: |
          # Ansible automation - restricted sudo commands
          # User: {{ ansible_user }}

          # Package management
          {{ ansible_user }} ALL=(ALL) NOPASSWD: /usr/bin/apt, /usr/bin/apt-get, /usr/bin/dpkg

          # Service management
          {{ ansible_user }} ALL=(ALL) NOPASSWD: /usr/bin/systemctl

          # Docker operations
          {{ ansible_user }} ALL=(ALL) NOPASSWD: /usr/bin/docker

          # File operations in managed paths only
          # Caution: in command arguments, '*' also matches spaces and
          # slashes, so these patterns are broader than they look
          {{ ansible_user }} ALL=(ALL) NOPASSWD: /usr/bin/mkdir -p /mnt/appdata/*
          {{ ansible_user }} ALL=(ALL) NOPASSWD: /usr/bin/chown -R * /mnt/appdata/*

          # UFW firewall
          {{ ansible_user }} ALL=(ALL) NOPASSWD: /usr/sbin/ufw
        mode: '0440'
        validate: 'visudo -cf %s'

    - name: Remove unrestricted sudo access
      ansible.builtin.lineinfile:
        path: /etc/sudoers.d/90-cloud-init-users
        regexp: '^{{ ansible_user }}\s+ALL=\(ALL\)\s+NOPASSWD:\s+ALL$'
        state: absent
        validate: 'visudo -cf %s'
      when: ansible_distribution == "Ubuntu"
```
|
||||||
|
|
||||||
|
### Alternative: Keep Unrestricted but Add Logging
|
||||||
|
If restricted sudo is too limiting:
|
||||||
|
|
||||||
|
```yaml
# Enable sudo logging
- name: Enable sudo command logging
  ansible.builtin.lineinfile:
    path: /etc/sudoers
    line: 'Defaults log_output'
    validate: 'visudo -cf %s'
```
|
||||||
|
|
||||||
|
## Step 4 — Host Firewall Configuration
|
||||||
|
|
||||||
|
### Phase 4a: Create UFW Role
|
||||||
|
```yaml
# ansible/roles/ufw_baseline/tasks/main.yml
---
- name: Install UFW
  ansible.builtin.apt:
    name: ufw
    state: present
    update_cache: true

# The SSH rule is added before the default-deny policy so that a host with
# UFW already enabled never has a window without it.
- name: Allow SSH (prevent lockout)
  community.general.ufw:
    rule: allow
    port: '22'
    proto: tcp
    comment: 'SSH access'

- name: Set UFW default policies
  community.general.ufw:
    direction: "{{ item.direction }}"
    policy: "{{ item.policy }}"
  loop:
    - { direction: 'incoming', policy: 'deny' }
    - { direction: 'outgoing', policy: 'allow' }
    - { direction: 'routed', policy: 'allow' }

- name: Allow service-specific ports
  community.general.ufw:
    rule: allow
    port: "{{ item.port }}"
    proto: "{{ item.proto }}"
    comment: "{{ item.comment }}"
  loop: "{{ ufw_allowed_ports | default([]) }}"

- name: Enable UFW
  community.general.ufw:
    state: enabled
  when: ufw_enable_firewall | default(false)
```
|
||||||
|
|
||||||
|
### Phase 4b: Define Per-Node Firewall Rules
|
||||||
|
```yaml
|
||||||
|
# ansible/inventory/host_vars/heimdall.yml
|
||||||
|
ufw_allowed_ports:
|
||||||
|
- { port: '80', proto: 'tcp', comment: 'HTTP - Traefik' }
|
||||||
|
- { port: '443', proto: 'tcp', comment: 'HTTPS - Traefik' }
|
||||||
|
- { port: '9120', proto: 'tcp', comment: 'Komodo Core' }
|
||||||
|
- { port: '2377', proto: 'tcp', comment: 'Docker Swarm (if used)' }
|
||||||
|
|
||||||
|
ufw_enable_firewall: true
|
||||||
|
```
|
||||||
|
|
||||||
|
### Phase 4c: Gradual Rollout
|
||||||
|
Test on one node first:
|
||||||
|
|
||||||
|
```bash
# Test on watchtower (non-critical node).
# -b is required because the role installs packages and edits firewall rules.
# If ufw is not installed yet, --check may fail at the first ufw task,
# since the package install is only simulated.
ansible watchtower -b -m include_role -a name=ufw_baseline --check

# Apply if check succeeds
ansible watchtower -b -m include_role -a name=ufw_baseline

# Verify SSH still works
ansible watchtower -m ping

# Roll out to other nodes
ansible docker_nodes -b -m include_role -a name=ufw_baseline
```
|
||||||
|
|
||||||
|
## Step 5 — Fail2ban Configuration
|
||||||
|
|
||||||
|
### Basic Fail2ban Role
|
||||||
|
```yaml
# ansible/roles/fail2ban/tasks/main.yml
---
- name: Install fail2ban
  ansible.builtin.apt:
    name: fail2ban
    state: present

# Requires a handler named "Restart fail2ban" in the role's handlers/main.yml.
# On hosts that log only to journald (no /var/log/auth.log), the sshd jail
# needs a systemd backend instead of logpath.
- name: Configure fail2ban for SSH
  ansible.builtin.copy:
    dest: /etc/fail2ban/jail.local
    content: |
      [DEFAULT]
      bantime = 1h
      findtime = 10m
      maxretry = 5

      [sshd]
      enabled = true
      port = ssh
      logpath = /var/log/auth.log
    mode: '0644'
  notify: Restart fail2ban

- name: Ensure fail2ban is running
  ansible.builtin.systemd:
    name: fail2ban
    state: started
    enabled: true
```
|
||||||
|
|
||||||
|
## Gate 1 — Pre-Deployment Testing
|
||||||
|
|
||||||
|
Run all playbooks in check mode:
|
||||||
|
```bash
ansible-playbook ansible/playbooks/secure-vault-file.yml --check
# shell tasks are skipped in check mode, so this run only validates syntax and inventory
ansible-playbook ansible/playbooks/populate-known-hosts.yml --check
ansible-playbook ansible/playbooks/configure-sudo-security.yml --check
# -b is required: both roles install packages and change system configuration
ansible all -b -m include_role -a name=ufw_baseline --check
ansible all -b -m include_role -a name=fail2ban --check
```
|
||||||
|
|
||||||
|
**Required confirmation**: `CHECKS PASSED: Ready for deployment`
|
||||||
|
|
||||||
|
## Step 6 — Phased Deployment
|
||||||
|
|
||||||
|
Deploy in this order:
|
||||||
|
|
||||||
|
1. **Local security** (vault file, known_hosts)
|
||||||
|
2. **Test node** (watchtower) - full hardening
|
||||||
|
3. **Docker nodes** (heimdall, waldorf) - after validating watchtower
|
||||||
|
4. **Proxmox** (pve01) - last, as it's most critical
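The ordering above can be enforced with a small wrapper that applies a playbook to one target at a time and stops at the first failure. This is a minimal sketch, not part of the repository: `rollout_in_order` and the `ANSIBLE_PLAYBOOK_CMD` override are hypothetical names, and the override exists only so the loop can be dry-run with `echo` instead of calling `ansible-playbook`.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: run one playbook against each target in the given
# order and stop at the first failure, so later (more critical) nodes are
# never touched by a change that already broke an earlier one.
# ANSIBLE_PLAYBOOK_CMD is an assumed override used here for dry runs.
rollout_in_order() {
  local playbook="$1" target
  shift
  for target in "$@"; do
    echo "==> $target"
    ${ANSIBLE_PLAYBOOK_CMD:-ansible-playbook} "$playbook" --limit "$target" || {
      echo "stopped at $target" >&2
      return 1
    }
  done
}

# Dry run: print the commands that would be executed, in deployment order.
ANSIBLE_PLAYBOOK_CMD=echo rollout_in_order \
  ansible/playbooks/configure-sudo-security.yml \
  watchtower heimdall waldorf pve01
```

Dropping the `ANSIBLE_PLAYBOOK_CMD=echo` prefix runs the real playbook; the controller-side steps (vault file, known_hosts) still run first as separate playbooks.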
|
||||||
|
|
||||||
|
# [OUTPUT FORMAT]
|
||||||
|
|
||||||
|
## Security Hardening Plan
|
||||||
|
```markdown
|
||||||
|
## Phase 1: Ansible Controller Security
|
||||||
|
- [ ] Secure vault password file (chmod 600)
|
||||||
|
- [ ] Populate SSH known_hosts
|
||||||
|
- [ ] Enable host key checking in ansible.cfg
|
||||||
|
- [ ] Test: `ansible all -m ping`
|
||||||
|
|
||||||
|
## Phase 2: Sudo Hardening
|
||||||
|
- [ ] Create restricted sudoers on watchtower (test node)
|
||||||
|
- [ ] Validate Ansible operations still work
|
||||||
|
- [ ] Roll out to remaining nodes
|
||||||
|
- [ ] Document sudo command allowlist
|
||||||
|
|
||||||
|
## Phase 3: Host Firewalls
|
||||||
|
- [ ] Deploy UFW role to watchtower
|
||||||
|
- [ ] Verify SSH access maintained
|
||||||
|
- [ ] Verify Docker services accessible
|
||||||
|
- [ ] Roll out to docker_nodes group
|
||||||
|
- [ ] Configure Proxmox firewall separately (PVE-specific)
|
||||||
|
|
||||||
|
## Phase 4: Intrusion Detection
|
||||||
|
- [ ] Deploy fail2ban to all nodes
|
||||||
|
- [ ] Configure SSH jail
|
||||||
|
- [ ] Test ban/unban procedures
|
||||||
|
- [ ] Set up alerting (optional)
|
||||||
|
```
|
||||||
|
|
||||||
|
## Rollback Procedures
|
||||||
|
```markdown
|
||||||
|
### If locked out after UFW enable:
|
||||||
|
1. Access via Proxmox console (for VMs/LXC)
|
||||||
|
2. Run: `sudo ufw disable`
|
||||||
|
3. Fix rule, re-enable
|
||||||
|
|
||||||
|
### If sudo restrictions break Ansible:
|
||||||
|
1. SSH to node manually
|
||||||
|
2. `sudo visudo -f /etc/sudoers.d/50-ansible-automation`
|
||||||
|
3. Add required commands or remove file
|
||||||
|
```
|
||||||
|
|
||||||
|
# [VALIDATION CHECKLIST]
|
||||||
|
|
||||||
|
After each phase:
|
||||||
|
```bash
|
||||||
|
# Connectivity test
|
||||||
|
ansible all -m ping
|
||||||
|
|
||||||
|
# Privilege escalation test
|
||||||
|
ansible all -b -m shell -a "whoami"
|
||||||
|
|
||||||
|
# Service verification
|
||||||
|
ansible docker_nodes -b -m shell -a "docker ps"
|
||||||
|
|
||||||
|
# Firewall status
|
||||||
|
ansible all -b -m shell -a "ufw status numbered"
|
||||||
|
```
|
||||||
|
|
||||||
|
# [SUCCESS CRITERIA]
|
||||||
|
- [ ] SSH host key checking enabled without connection failures
|
||||||
|
- [ ] Sudo access restricted and logged
|
||||||
|
- [ ] UFW enabled on all Docker nodes with service-specific rules
|
||||||
|
- [ ] Fail2ban active and monitoring SSH
|
||||||
|
- [ ] Vault password file secured (600 permissions)
|
||||||
|
- [ ] All Ansible playbooks execute successfully
|
||||||
|
- [ ] No SSH lockouts occurred
|
||||||
|
- [ ] Documentation updated with security procedures
|
||||||
313
.github/prompts/security-container-hardening.prompt.md
vendored
Normal file
@ -0,0 +1,313 @@
|
|||||||
|
---
|
||||||
|
name: security-container-hardening
|
||||||
|
description: "HIGH: Container security hardening - eliminate privileged containers, reduce root user execution, and secure Docker socket access. Phase 2 of security hardening."
|
||||||
|
---
|
||||||
|
|
||||||
|
# [ROLE]
|
||||||
|
You are a **Container Security Specialist** with expertise in Docker security best practices, CIS Benchmarks, and least-privilege principles. Your goal is to harden container security posture without breaking functionality.
|
||||||
|
|
||||||
|
# [GOAL]
|
||||||
|
Systematically reduce attack surface by:
|
||||||
|
1. Eliminating or justifying `privileged: true` containers
|
||||||
|
2. Converting root-running containers to non-root users
|
||||||
|
3. Securing Docker socket access patterns
|
||||||
|
4. Implementing capability-based security where needed
|
||||||
|
|
||||||
|
# [INPUT CONTEXT]
|
||||||
|
1. **Environment**: Multi-node homelab with management tools (Komodo, Traefik), media services, and SSO
|
||||||
|
2. **Current Issues**:
|
||||||
|
- Multiple containers running with `privileged: true`
|
||||||
|
- Services running as PUID=0 (root)
|
||||||
|
- Docker socket mounted in multiple containers
|
||||||
|
3. **Constraint**: Must maintain functionality - some tools legitimately need elevated access
|
||||||
|
|
||||||
|
# [CRITICAL FINDINGS TO ADDRESS]
|
||||||
|
|
||||||
|
## 🔴 Privileged Containers (Attack Surface: Critical)
|
||||||
|
1. `nodes/watchtower/compose.yaml:11` - docker-socket-proxy (privileged: true)
|
||||||
|
2. `nodes/heimdall/core/compose.yaml:12` - docker-socket-proxy (privileged: true)
|
||||||
|
|
||||||
|
## 🟠 Root User Execution (Attack Surface: High)
|
||||||
|
1. `nodes/heimdall/radarr/compose.yaml:20-21` - PUID=0, PGID=0
|
||||||
|
2. `nodes/heimdall/qbittorrent/compose.yaml:43-44` - PUID=0, PGID=0
|
||||||
|
3. `nodes/heimdall/authentik/compose.yaml:114` - user: root (worker container)
|
||||||
|
|
||||||
|
## 🟡 Docker Socket Exposure (Attack Surface: Medium)
|
||||||
|
1. `nodes/heimdall/authentik/compose.yaml:116` - /var/run/docker.sock (read-write)
|
||||||
|
2. `nodes/heimdall/core/compose.yaml:14` - /var/run/docker.sock:ro (read-only mount; `:ro` protects the socket file but does not restrict Docker API calls, so review what consumes it)
3. `nodes/watchtower/compose.yaml:19` - /var/run/docker.sock:ro (read-only mount; same caveat applies)
|
||||||
|
|
||||||
|
# [NON-NEGOTIABLES]
|
||||||
|
- **Document Before Changing**: Every privileged container must have a documented justification or removal plan
|
||||||
|
- **Test After Changing**: Every user change must be validated with service restart
|
||||||
|
- **Capability-Based Security**: Use `cap_add` instead of `privileged: true` where possible
|
||||||
|
- **Defense in Depth**: Even when privileged access is required, add additional security layers
|
||||||
|
|
||||||
|
# [WORKFLOW]
|
||||||
|
|
||||||
|
## Gate 0 — Security Baseline Assessment
|
||||||
|
1. Scan all compose files for security anti-patterns:
|
||||||
|
- `privileged: true`
|
||||||
|
- `user: root` or `user: "0"`
|
||||||
|
- `PUID=0` or `PGID=0`
|
||||||
|
- `/var/run/docker.sock` mounts
|
||||||
|
- `network_mode: host`
|
||||||
|
- `cap_add: SYS_ADMIN` or `NET_ADMIN`
|
||||||
|
|
||||||
|
2. Classify each finding:
|
||||||
|
- **REMOVABLE**: Can be fixed without breaking functionality
|
||||||
|
- **JUSTIFIABLE**: Required for legitimate purpose (document why)
|
||||||
|
- **INVESTIGATE**: Unclear if needed, requires testing
|
||||||
|
|
||||||
|
**Required confirmation**: `BASELINE: <count> findings across <count> services`
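The scan in item 1 can be approximated with a recursive `grep` over the compose files. The following is a rough sketch under stated assumptions: `scan_compose_dir` is a hypothetical helper, it relies on GNU `grep` (`--include`), and it matches text rather than parsing YAML, so commented-out lines are reported as well. The demo runs against a scratch directory, not the real `nodes/` tree.

```bash
#!/usr/bin/env bash
# Hypothetical helper: list lines in compose files that match the Gate 0
# anti-patterns. Output uses grep's "file:line:text" format.
# Text matching only; YAML comments and multi-line values are not understood.
scan_compose_dir() {
  local dir="$1"
  grep -rnE \
    --include='compose.yaml' --include='compose.yml' \
    --include='docker-compose.yaml' --include='docker-compose.yml' \
    -e 'privileged:[[:space:]]*true' \
    -e 'user:[[:space:]]*"?(root|0)([[:space:]":]|$)' \
    -e 'P[UG]ID=0([^0-9]|$)' \
    -e '/var/run/docker\.sock' \
    -e 'network_mode:[[:space:]]*"?host' \
    -e '(SYS_ADMIN|NET_ADMIN)' \
    "$dir"
}

# Demo on a scratch compose file with two findings (PUID=0, privileged).
demo_dir=$(mktemp -d)
cat > "$demo_dir/compose.yaml" <<'EOF'
services:
  radarr:
    environment:
      - PUID=0
      - PGID=1000
  proxy:
    privileged: true
EOF
scan_compose_dir "$demo_dir"
echo "findings: $(scan_compose_dir "$demo_dir" | wc -l)"
rm -rf "$demo_dir"
```

Each reported line still needs the manual REMOVABLE / JUSTIFIABLE / INVESTIGATE classification from item 2; the scan only produces the candidate list.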
|
||||||
|
|
||||||
|
## Step 1 — Privileged Container Analysis
|
||||||
|
|
||||||
|
For each container with `privileged: true`:
|
||||||
|
|
||||||
|
### Investigation Checklist
|
||||||
|
```yaml
|
||||||
|
Service: docker-socket-proxy
|
||||||
|
Purpose: Secure proxy for Docker API access
|
||||||
|
Privileged Justification:
|
||||||
|
- Requires: Access to Docker socket with group permissions
|
||||||
|
- Alternative: Run as docker group (GID 988) without privileged
|
||||||
|
- Decision: TEST removal of privileged flag
|
||||||
|
```
|
||||||
|
|
||||||
|
### Remediation Pattern
|
||||||
|
```yaml
# CURRENT (INSECURE)
docker-socket-proxy:
  privileged: true
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock:ro

# PROPOSED (SECURE)
docker-socket-proxy:
  user: "65534:988"  # nobody:docker
  group_add:
    - "988"  # Docker group from host
  security_opt:
    - no-new-privileges:true
    - apparmor=docker-default
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock:ro
```
|
||||||
|
|
||||||
|
## Step 2 — Root User Conversion
|
||||||
|
|
||||||
|
For each container running as root (PUID=0):
|
||||||
|
|
||||||
|
### Impact Analysis
|
||||||
|
```markdown
|
||||||
|
Service: radarr
|
||||||
|
Current User: PUID=0, PGID=0 (root)
|
||||||
|
Volumes Affected:
|
||||||
|
- /mnt/appdata/radarr/data:/config
|
||||||
|
- /mnt/media/movies:/movies
|
||||||
|
Ownership Requirements:
|
||||||
|
- Config files: Read/Write
|
||||||
|
- Media files: Read/Write
|
||||||
|
Proposed User: PUID=1000, PGID=1000 (chester)
|
||||||
|
```
|
||||||
|
|
||||||
|
### Migration Steps
|
||||||
|
1. **Check current ownership**:
|
||||||
|
```bash
|
||||||
|
ls -la /mnt/appdata/radarr/data
|
||||||
|
```
|
||||||
|
|
||||||
|
2. **Stop container**:
|
||||||
|
```bash
|
||||||
|
docker compose stop radarr
|
||||||
|
```
|
||||||
|
|
||||||
|
3. **Fix permissions** (if needed):
|
||||||
|
```bash
|
||||||
|
sudo chown -R 1000:1000 /mnt/appdata/radarr/data
|
||||||
|
```
|
||||||
|
|
||||||
|
4. **Update compose file**:
|
||||||
|
```yaml
|
||||||
|
environment:
|
||||||
|
- PUID=1000 # Changed from 0
|
||||||
|
- PGID=1000 # Changed from 0
|
||||||
|
```
|
||||||
|
|
||||||
|
5. **Restart and verify**:
|
||||||
|
```bash
|
||||||
|
docker compose up -d radarr
|
||||||
|
docker compose logs radarr | grep -i "permission\|error"
|
||||||
|
```
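Step 3 says to fix permissions "if needed". One way to decide is to list anything under the data directory that is not already owned by the target user and group. This is a sketch, assuming a `find` that supports `-uid` and `-gid` (GNU and BSD both do); `list_foreign_owned` is a hypothetical name, and the demo uses a scratch directory instead of `/mnt/appdata/radarr/data`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print every path under DIR whose owner or group
# differs from the target UID/GID. No output means no chown is needed.
list_foreign_owned() {
  local dir="$1" uid="$2" gid="$3"
  find "$dir" \( ! -uid "$uid" -o ! -gid "$gid" \) -print
}

# Demo against a scratch directory created by the current user.
demo_dir=$(mktemp -d)
touch "$demo_dir/config.xml"
if [ -z "$(list_foreign_owned "$demo_dir" "$(id -u)" "$(id -g)")" ]; then
  echo "ownership already matches; chown can be skipped"
fi
rm -rf "$demo_dir"
```

For the radarr example this would be called as `list_foreign_owned /mnt/appdata/radarr/data 1000 1000`, and the same listing is a useful record of what the rollback would need to restore.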
|
||||||
|
|
||||||
|
## Step 3 — Docker Socket Security Review
|
||||||
|
|
||||||
|
For each socket mount, apply this decision tree:
|
||||||
|
|
||||||
|
```
|
||||||
|
Does container need Docker API access?
|
||||||
|
├─ NO → Remove socket mount entirely
|
||||||
|
└─ YES → Is it read-only?
|
||||||
|
├─ YES → Keep with :ro flag, add socket proxy if not present
|
||||||
|
└─ NO → Requires write access?
|
||||||
|
├─ Management tool (Komodo, Portainer) → Use socket proxy with limited permissions
|
||||||
|
└─ Other → INVESTIGATE: Why does it need write access?
|
||||||
|
```
|
||||||
|
|
||||||
|
### Socket Proxy Pattern (Best Practice)
|
||||||
|
```yaml
# Never mount socket directly in application containers
# Use tecnativa/docker-socket-proxy as intermediary

docker-socket-proxy:
  image: tecnativa/docker-socket-proxy:latest
  environment:
    # Read permissions (safe for Traefik)
    - CONTAINERS=1
    - NETWORKS=1
    - SERVICES=1
    # Write permissions (limit to management tools only)
    - POST=0    # Disable by default
    - DELETE=0  # Disable by default
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock:ro

traefik:
  environment:
    - DOCKER_HOST=tcp://docker-socket-proxy:2375  # No direct socket access
```
|
||||||
|
|
||||||
|
## Gate 1 — Testing Plan Approval
|
||||||
|
|
||||||
|
Before making changes, present:
|
||||||
|
1. List of containers to be modified
|
||||||
|
2. Expected downtime per service
|
||||||
|
3. Rollback plan for each change
|
||||||
|
4. Order of operations (dependencies first)
|
||||||
|
|
||||||
|
**Required confirmation**: `APPROVE TESTING: Ready to proceed`
|
||||||
|
|
||||||
|
## Step 4 — Phased Implementation
|
||||||
|
|
||||||
|
Implement changes in this order:
|
||||||
|
|
||||||
|
### Phase A: Low-Risk Changes (Media Services)
|
||||||
|
- Radarr, Sonarr, Prowlarr (PUID/PGID changes)
|
||||||
|
- No downstream dependencies
|
||||||
|
- Easy rollback
|
||||||
|
|
||||||
|
### Phase B: Medium-Risk Changes (Infrastructure)
|
||||||
|
- Docker socket proxy (privileged flag removal)
|
||||||
|
- Test with Traefik and Komodo integration
|
||||||
|
- Monitor for API errors
|
||||||
|
|
||||||
|
### Phase C: High-Risk Changes (Authentik Worker)
|
||||||
|
- Requires careful testing
|
||||||
|
- May impact SSO functionality
|
||||||
|
- Have admin credentials ready
|
||||||
|
|
||||||
|
## Step 5 — Validation & Monitoring
|
||||||
|
|
||||||
|
For each changed service:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Check container start
|
||||||
|
docker compose ps
|
||||||
|
|
||||||
|
# Check logs for errors
|
||||||
|
docker compose logs -f --tail=100 <service>
|
||||||
|
|
||||||
|
# Check resource access
|
||||||
|
docker compose exec <service> ls -la /config
|
||||||
|
|
||||||
|
# Check network connectivity
|
||||||
|
docker compose exec <service> ping -c 3 <dependency>
|
||||||
|
```
|
||||||
|
|
||||||
|
### Red Flags to Watch For
|
||||||
|
- Permission denied errors
|
||||||
|
- Failed healthchecks
|
||||||
|
- Repeated restarts
|
||||||
|
- API connection failures
|
||||||
|
|
||||||
|
# [OUTPUT FORMAT]
|
||||||
|
|
||||||
|
## Container Security Audit Report
|
||||||
|
```markdown
|
||||||
|
## Privileged Containers
|
||||||
|
|
||||||
|
### docker-socket-proxy (watchtower)
|
||||||
|
- **Status**: ❌ Privileged
|
||||||
|
- **Justification**: None documented
|
||||||
|
- **Recommendation**: Remove privileged flag, use group_add
|
||||||
|
- **Impact**: None expected (tested)
|
||||||
|
- **Implementation**: [specific YAML changes]
|
||||||
|
|
||||||
|
## Root User Containers
|
||||||
|
|
||||||
|
### radarr
|
||||||
|
- **Status**: ⚠️ PUID=0
|
||||||
|
- **Data Impact**: /mnt/appdata/radarr (ownership change required)
|
||||||
|
- **Recommendation**: Change to PUID=1000
|
||||||
|
- **Testing**: [permission fix commands]
|
||||||
|
|
||||||
|
## Socket Access Review
|
||||||
|
|
||||||
|
### authentik-worker
|
||||||
|
- **Status**: ⚠️ Write access to socket
|
||||||
|
- **Purpose**: Docker integration for managed outposts
|
||||||
|
- **Recommendation**: Move to socket proxy with limited POST
|
||||||
|
- **Alternative**: Disable Docker integration if unused
|
||||||
|
```
|
||||||
|
|
||||||
|
## Implementation Checklist
|
||||||
|
```markdown
|
||||||
|
- [ ] Phase A: Media Services (radarr, sonarr, prowlarr)
|
||||||
|
- [ ] Backup current configs
|
||||||
|
- [ ] Update PUID/PGID to 1000
|
||||||
|
- [ ] Fix filesystem permissions
|
||||||
|
- [ ] Restart and validate
|
||||||
|
|
||||||
|
- [ ] Phase B: Socket Proxy Hardening
|
||||||
|
- [ ] Remove privileged flag from watchtower proxy
|
||||||
|
- [ ] Remove privileged flag from heimdall proxy
|
||||||
|
- [ ] Test Traefik discovery
|
||||||
|
- [ ] Test Komodo deployments
|
||||||
|
|
||||||
|
- [ ] Phase C: Authentik Worker
|
||||||
|
- [ ] Document current Docker integration usage
|
||||||
|
- [ ] Test socket proxy migration
|
||||||
|
- [ ] Validate outpost functionality
|
||||||
|
```
|
||||||
|
|
||||||
|
# [SAFETY MEASURES]
|
||||||
|
|
||||||
|
## Pre-Change Backup
|
||||||
|
```bash
|
||||||
|
# Backup compose files
|
||||||
|
cp compose.yaml compose.yaml.backup-$(date +%Y%m%d)
|
||||||
|
|
||||||
|
# Backup application data
|
||||||
|
tar -czf appdata-backup.tar.gz /mnt/appdata/<service>
|
||||||
|
```
|
||||||
|
|
||||||
|
## Rollback Procedure
|
||||||
|
```bash
|
||||||
|
# Restore compose file
|
||||||
|
mv compose.yaml.backup-20260419 compose.yaml
|
||||||
|
|
||||||
|
# Restore permissions
|
||||||
|
sudo chown -R 0:0 /mnt/appdata/<service>
|
||||||
|
|
||||||
|
# Restart
|
||||||
|
docker compose up -d
|
||||||
|
```
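The rollback above hard-codes a backup date. A small lookup avoids guessing which file to restore; this is a sketch that assumes backups follow the `compose.yaml.backup-YYYYMMDD` naming from the pre-change backup step, and `latest_compose_backup` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the newest compose.yaml.backup-YYYYMMDD file
# in DIR, or return 1 if there is none. The YYYYMMDD suffix sorts
# lexicographically, and globs expand in sorted order, so the last match
# is the most recent backup.
latest_compose_backup() {
  local dir="$1" candidate newest=""
  for candidate in "$dir"/compose.yaml.backup-*; do
    [ -e "$candidate" ] || continue   # the glob matched nothing
    newest="$candidate"
  done
  [ -n "$newest" ] && printf '%s\n' "$newest"
}

# Demo with two scratch backups; the later date is selected.
demo_dir=$(mktemp -d)
touch "$demo_dir/compose.yaml.backup-20260101" \
      "$demo_dir/compose.yaml.backup-20260419"
latest_compose_backup "$demo_dir"
rm -rf "$demo_dir"
```

Copying the selected file back (rather than moving it) keeps the backup available if the rollback itself needs to be repeated.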
|
||||||
|
|
||||||
|
# [SUCCESS CRITERIA]
|
||||||
|
- [ ] Zero containers running with `privileged: true` (or documented exception)
|
||||||
|
- [ ] Zero media services running as root (PUID=0)
|
||||||
|
- [ ] All Docker socket access is read-only or proxied
|
||||||
|
- [ ] All services pass health checks after changes
|
||||||
|
- [ ] No permission errors in logs (24hr monitoring period)
|
||||||
|
- [ ] Documentation updated with security justifications
|
||||||
454
.github/prompts/security-network-access.prompt.md
vendored
Normal file
@ -0,0 +1,454 @@
|
|||||||
|
---
|
||||||
|
name: security-network-access
|
||||||
|
description: "MEDIUM: Network security and access control hardening - port exposure review, network isolation, and authentication layers. Phase 4 of security hardening."
|
||||||
|
---
|
||||||
|
|
||||||
|
# [ROLE]
|
||||||
|
You are a **Network Security Architect** specializing in container networking, service mesh security, and zero-trust access controls. Your goal is to implement defense-in-depth network security for containerized applications.
|
||||||
|
|
||||||
|
# [GOAL]
|
||||||
|
Harden network security posture by:
|
||||||
|
1. Reviewing and restricting exposed ports (0.0.0.0 → 127.0.0.1 where appropriate)
|
||||||
|
2. Implementing network segmentation (separate Docker networks)
|
||||||
|
3. Enforcing authentication on exposed services
|
||||||
|
4. Documenting network architecture and access policies
|
||||||
|
5. Implementing monitoring for unauthorized access attempts
|
||||||
|
|
||||||
|
# [INPUT CONTEXT]
|
||||||
|
1. **Environment**: Multi-node Docker homelab with Traefik reverse proxy
|
||||||
|
2. **Current State**:
|
||||||
|
- Some services bound to 0.0.0.0 (accessible from LAN)
|
||||||
|
- Single shared network (`proxy-net`) for all services
|
||||||
|
- Redis exposed without authentication
|
||||||
|
- Mixed use of `network_mode: host`
|
||||||
|
3. **Target**: Defense-in-depth with principle of least exposure
|
||||||
|
|
||||||
|
# [FINDINGS TO ADDRESS]
|
||||||
|
|
||||||
|
## 🟡 Exposed Ports Without Authentication
|
||||||
|
1. `nodes/heimdall/core/compose.yaml:50` - Redis `6379:6379` (no auth)
|
||||||
|
2. `nodes/heimdall/qbittorrent/compose.yaml:20` - qBittorrent `0.0.0.0:8081:8081`
|
||||||
|
3. `nodes/heimdall/core/compose.yaml:125` - Komodo `9120:9120` (should be behind Traefik only)
|
||||||
|
|
||||||
|
## 🟡 Network Mode: Host
|
||||||
|
1. `nodes/waldorf/plex/compose.yaml:5` - Plex (required for discovery)
|
||||||
|
2. `nodes/watchtower/compose.yaml:39` - Periphery (accessing external IPs)
|
||||||
|
|
||||||
|
## 🟡 Network Segmentation Opportunity
|
||||||
|
- All services on single `proxy-net` network
|
||||||
|
- No separation between public-facing and internal services
|
||||||
|
- Database services mixed with application services
|
||||||
|
|
||||||
|
# [NON-NEGOTIABLES]
|
||||||
|
- **Maintain Functionality**: Port changes must preserve service accessibility
|
||||||
|
- **Document Network Architecture**: Create network diagrams showing service relationships
|
||||||
|
- **Test Before Deploying**: Validate network changes don't break inter-service communication
|
||||||
|
- **Graceful Degradation**: Services should fail safely, not expose more access
|
||||||
|
|
||||||
|
# [WORKFLOW]

## Gate 0 — Network Discovery & Mapping

### Scan Current Network Configuration
```bash
# For each node, inventory:
# 1. Exposed ports
docker ps --format "table {{.Names}}\t{{.Ports}}"

# 2. Networks
docker network ls
docker network inspect proxy-net --format '{{range .Containers}}{{.Name}} {{end}}'

# 3. Listening ports on host
sudo netstat -tlnp | grep LISTEN
```

### Create Network Map
Document:
- Which services need external (LAN) access
- Which services need only internal (container-to-container) access
- Which services need internet access
- Service dependencies (A → B communication)

**Required confirmation**: `NETWORK MAP COMPLETE: <count> services cataloged`

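
The port inventory above can be narrowed to the containers that matter for this phase. The following is a minimal sketch, not part of the original workflow: `flag_wide_bindings` is a hypothetical helper name, and it assumes tab-separated `name<TAB>ports` input such as `docker ps --format '{{.Names}}\t{{.Ports}}'` is expected to produce.

```bash
#!/usr/bin/env bash
# Sketch: print the names of containers that publish at least one port on
# every interface (0.0.0.0 for IPv4, ::: or [::] for IPv6).
# Input on stdin: one "<name><TAB><ports>" line per container.
flag_wide_bindings() {
  awk -F'\t' '$2 ~ /(0\.0\.0\.0|\[::\]|::):[0-9]+->/ { print $1 }'
}

# Intended use on a node (requires a running Docker daemon):
#   docker ps --format '{{.Names}}\t{{.Ports}}' | flag_wide_bindings
```

Containers bound only to `127.0.0.1` or without published ports produce no output, so the result doubles as a starting list for Step 1.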
## Step 1 — Port Exposure Remediation

For each exposed port, apply this decision tree:

```
Should this port be accessible from LAN?
├─ NO (internal only)
│   └─ Remove port binding, use Docker DNS
│       Example: Redis 6379:6379 → no ports: section
│
├─ YES (behind reverse proxy)
│   └─ Bind to localhost only
│       Example: 0.0.0.0:8080:8080 → 127.0.0.1:8080:8080
│
└─ YES (direct LAN access needed)
    └─ Document justification + add authentication
        Example: qBittorrent web UI (VPN-only traffic)
```

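
When walking the decision tree, note that a Compose entry without a host IP (for example `6379:6379`) is published on all interfaces, exactly like an explicit `0.0.0.0`. A small sketch of that classification follows; `classify_binding` is a hypothetical helper, and it deliberately ignores bracketed IPv6 host addresses.

```bash
#!/usr/bin/env bash
# Sketch: classify a Compose "ports:" entry by how widely it is exposed.
# Handles "container", "host:container" and "ip:host:container" forms (IPv4 only).
classify_binding() {
  local entry="${1%/*}" ip         # drop an optional /tcp or /udp suffix
  local colons="${entry//[^:]/}"   # keep only the colons to count the fields
  if [ "${#colons}" -lt 2 ]; then
    echo "all-interfaces"          # no host IP given: Docker binds 0.0.0.0
    return
  fi
  ip="${entry%%:*}"
  case "$ip" in
    0.0.0.0) echo "all-interfaces" ;;
    127.*)   echo "loopback" ;;
    *)       echo "specific-address" ;;
  esac
}

# Example: classify_binding "127.0.0.1:8080:8080"
```

Entries classified as `all-interfaces` are the ones that need a decision under the tree above.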
### Example Remediations

#### Redis (CRITICAL - No Authentication)
```yaml
# BEFORE (INSECURE - accessible from LAN)
redis:
  image: redis:7-alpine
  ports:
    - "6379:6379" # ❌ No authentication, LAN accessible
  networks:
    - proxy-net

# AFTER (SECURE - internal only)
redis:
  image: redis:7-alpine
  # No ports section - only accessible via Docker DNS
  networks:
    - internal-net # Separated network
  command: redis-server --requirepass ${REDIS_PASSWORD}
  environment:
    - REDIS_PASSWORD=${REDIS_PASSWORD}

# Update clients to connect via redis:6379 (Docker DNS)
# Each client must also be attached to internal-net to resolve the name
traefik:
  environment:
    - REDIS_ADDR=redis:6379
    - REDIS_PASSWORD=${REDIS_PASSWORD}
```

#### qBittorrent (VPN-Attached Service)
```yaml
# BEFORE
qbittorrent:
  network_mode: "service:gluetun"
  # Exposed via gluetun on 0.0.0.0:8081

gluetun:
  ports:
    - 0.0.0.0:8081:8081 # ❌ Accessible from any LAN device

# AFTER
gluetun:
  ports:
    - 127.0.0.1:8081:8081 # ✅ Only localhost access
  networks:
    - proxy-net

# Access via Traefik only (adds authentication layer)
# No direct IP:8081 access from LAN
```

#### Komodo (Management Interface)
```yaml
# BEFORE
komodo-core:
  ports:
    - 9120:9120 # ❌ Direct LAN access, bypassing Traefik auth

# AFTER
komodo-core:
  # Remove direct port exposure - Traefik only
  networks:
    - proxy-net
  labels:
    - "traefik.http.services.komodo.loadbalancer.server.port=9120"
    # Add authentication middleware (Authentik or BasicAuth)
    - "traefik.http.routers.komodo.middlewares=authentik@file"

# Access only via https://komodo.castaldifamily.com (authenticated)
```

## Step 2 — Network Segmentation

Create purpose-specific networks:

```yaml
# nodes/heimdall/core/compose.yaml
networks:
  # Public-facing services (Traefik, auth)
  proxy-net:
    name: proxy-net
    driver: bridge

  # Internal services (databases, cache)
  internal-net:
    name: internal-net
    driver: bridge
    internal: true # ✅ No external connectivity

  # Management tools (Komodo, Portainer)
  mgmt-net:
    name: mgmt-net
    driver: bridge
```

### Service Network Assignment Strategy
```yaml
# Public-facing reverse proxy
traefik:
  networks:
    - proxy-net # Internet-facing
    - internal-net # Access to backends
    - mgmt-net # Komodo integration

# Backend databases
authentik_postgres:
  networks:
    - internal-net # Only internal access

# Application with both public and DB access
authentik_server:
  networks:
    - proxy-net # Traefik → authentik
    - internal-net # authentik → postgres
```

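
Once the networks exist, the segmentation can be checked from the Docker host. The sketch below assumes the network names used above; the helper names `network_is_internal` and `network_members` are illustrative only.

```bash
#!/usr/bin/env bash
# Sketch: confirm that a network was created with "internal: true" and list
# which containers are attached to it.

# Succeeds when Docker reports the network as internal
network_is_internal() {
  [ "$(docker network inspect "$1" --format '{{.Internal}}')" = "true" ]
}

# Prints the names of the containers attached to the network
network_members() {
  docker network inspect "$1" --format '{{range .Containers}}{{.Name}} {{end}}'
}

# Example:
#   network_is_internal internal-net && echo "internal-net has no external route"
#   network_members internal-net
```

A database container that appears in `network_members proxy-net` would indicate that the assignment strategy has not been fully applied.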
## Step 3 — Authentication Layer Enforcement

### Audit Current Authentication State
For each publicly accessible service:

```markdown
| Service | URL | Authentication | Risk Level |
|---------|-----|----------------|------------|
| Traefik Dashboard | proxy.castaldifamily.com | ❌ None | HIGH |
| Komodo | komodo.castaldifamily.com | ❌ Direct port 9120 | HIGH |
| qBittorrent | qbit.castaldifamily.com | ⚠️ App-level only | MEDIUM |
| Vaultwarden | vault.castaldifamily.com | ✅ App + rate limit | LOW |
```

### Implement Traefik Middleware Authentication
```yaml
# nodes/heimdall/core/compose.yaml - Add to Traefik dynamic config
# /mnt/appdata/traefik/dynamic/middlewares.yml

http:
  middlewares:
    # Option 1: Authentik SSO (recommended)
    authentik:
      forwardAuth:
        address: http://authentik_server:9000/outpost.goauthentik.io/auth/traefik
        trustForwardHeader: true
        authResponseHeaders:
          - X-authentik-username
          - X-authentik-groups
          - X-authentik-email

    # Option 2: Basic Auth (fallback)
    basic-auth:
      basicAuth:
        users:
          - "admin:$apr1$..." # Generate with htpasswd
        realm: "Homelab Services"

    # Option 3: IP Whitelist (LAN-only)
    # Newer Traefik releases deprecate ipWhiteList in favor of ipAllowList
    lan-only:
      ipWhiteList:
        sourceRange:
          - "10.0.0.0/24" # Your LAN subnet
          - "127.0.0.1/32" # Localhost
```

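
The Basic Auth entry above lives in a file-provider YAML, where the hash is used as written. If the same `user:hash` string is ever placed in a Compose label instead, every `$` has to be doubled so that Compose does not treat it as variable interpolation. A minimal sketch, with `escape_for_compose` as a hypothetical helper name:

```bash
#!/usr/bin/env bash
# Sketch: double every "$" in a user:hash entry read from stdin so it can be
# pasted into a Compose label without triggering variable interpolation.
escape_for_compose() {
  sed 's/\$/$$/g'
}

# Example (assumes htpasswd from apache2-utils is installed):
#   htpasswd -nb admin 'choose-a-password' | escape_for_compose
```

The escaped form is only for Compose files; the dynamic config file shown above should keep single `$` characters.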
### Apply Middleware to Services
```yaml
# Example: Protect Traefik dashboard
traefik:
  labels:
    - "traefik.http.routers.traefik-secure.middlewares=authentik@file"

# Example: Protect Komodo
komodo-core:
  labels:
    - "traefik.http.routers.komodo.middlewares=authentik@file,lan-only@file"
```

## Step 4 — Host Network Mode Review

For services using `network_mode: host`:

### Plex (Justified - DLNA Discovery)
```yaml
# CURRENT
plex:
  network_mode: host # Required for DLNA/discovery

# DOCUMENTATION
# Justification: Plex requires host networking for:
# - DLNA/UPnP device discovery (UDP multicast)
# - Bonjour/Avahi service advertisement
# - Client auto-detection on LAN
#
# Mitigation:
# - UFW rules to restrict access to Plex ports (32400)
# - Plex app-level authentication enforced
# - Regular security updates

# UFW Configuration
ufw_allowed_ports:
  - { port: '32400', proto: 'tcp', comment: 'Plex Media Server', src: '10.0.0.0/24' }
```

### Periphery (Justified - External IP Access)
```yaml
# CURRENT
periphery:
  network_mode: host
  # Needs to bind to external IP for Komodo Core connection

# ALTERNATIVE (Preferred)
periphery:
  networks:
    - proxy-net
  environment:
    - PERIPHERY_BIND_ADDRESS=10.0.0.200 # Explicit IP binding
  # Remove host network mode
```

## Step 5 — Monitoring & Alerting

### Implement Traefik Access Logging
```yaml
# /mnt/appdata/traefik/traefik.yml
accessLog:
  filePath: "/var/log/traefik/access.log"
  format: json
  filters:
    statusCodes:
      - "400-499" # Client errors
      - "500-599" # Server errors
```

### Monitor for Unauthorized Access Attempts
```bash
#!/bin/bash
# scripts/monitor-access.sh
# Host-side path of the Traefik access log; adjust to match the volume mount
LOG=/mnt/appdata/traefik/access-logs/access.log

# Show the 20 most recent requests that were answered with 401 or 403
jq -r 'select(.DownstreamStatus == 401 or .DownstreamStatus == 403)
       | [.ClientHost, .RequestPath, .DownstreamStatus] | @tsv' "$LOG" | tail -20

# Alert on excessive failures (integration with fail2ban)
```

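
The "alert on excessive failures" step can be approximated before fail2ban is wired in. The sketch below assumes Traefik writes one compact JSON object per line with `ClientHost` and `DownstreamStatus` fields; the function name and the default threshold of 5 are illustrative choices, and it uses plain text tools rather than a JSON parser, so it is less robust than the `jq` filter above.

```bash
#!/usr/bin/env bash
# Sketch: list clients with at least <threshold> 401/403 responses.
# Usage: failed_auth_by_client <logfile> [threshold]
# Output: one "<client> <count>" line per offending client.
failed_auth_by_client() {
  local log="$1" threshold="${2:-5}"
  grep -E '"DownstreamStatus":(401|403)[,}]' "$log" \
    | grep -oE '"ClientHost":"[^"]+"' \
    | cut -d'"' -f4 \
    | sort | uniq -c \
    | awk -v t="$threshold" '$1 >= t { print $2, $1 }'
}

# Example:
#   failed_auth_by_client /mnt/appdata/traefik/access-logs/access.log 10
```

Run from cron or a systemd timer, a non-empty result is a reasonable trigger for a notification.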
## Gate 1 — Impact Assessment

Before deploying network changes:

1. **Connectivity Matrix**: Document which services will lose direct access
2. **Downtime Estimate**: Calculate restart time for network changes
3. **Rollback Plan**: Prepare to revert network changes if issues arise
4. **User Communication**: Notify users of service interruptions

**Required confirmation**: `IMPACT UNDERSTOOD: Proceed with changes`

## Step 6 — Phased Deployment

### Week 1: Internal Network Segmentation
- Create `internal-net` network
- Move Redis to internal-only network
- Update client connections to use Docker DNS
- Verify all services can still reach Redis

### Week 2: Port Binding Restrictions
- Change 0.0.0.0 bindings to 127.0.0.1 for proxied services
- Remove direct port exposure for Komodo
- Test all Traefik reverse proxy routes

### Week 3: Authentication Middleware
- Deploy Authentik middleware to Traefik
- Apply to high-value services (Komodo, Traefik dashboard)
- Test SSO flow for protected services

### Week 4: Monitoring & Documentation
- Enable Traefik access logging
- Create network architecture diagram
- Document authentication requirements per service
- Set up alerting for security events

# [OUTPUT FORMAT]

## Network Security Assessment
````markdown
## Port Exposure Audit

### Critical (Remove Direct Exposure)
- [ ] Redis 6379 → Remove port binding, use Docker DNS
- [ ] Komodo 9120 → Remove direct port, Traefik-only access

### Medium (Restrict to Localhost)
- [ ] qBittorrent 0.0.0.0:8081 → 127.0.0.1:8081

### Low (Document Justification)
- [ ] Plex host network → Required for DLNA, add UFW rules

## Network Segmentation Plan

### Network Architecture
```
                  ┌─────────────┐
                  │  Internet   │
                  └──────┬──────┘
                         │
                  ┌──────▼──────┐
                  │   Traefik   │ (proxy-net + internal-net + mgmt-net)
                  └──────┬──────┘
                         │
        ┌────────────────┼────────────────┐
        │                │                │
  ┌─────▼─────┐    ┌─────▼─────┐    ┌─────▼─────┐
  │ Authentik │    │ Services  │    │  Komodo   │
  │ (public)  │    │ (internal)│    │  (mgmt)   │
  └─────┬─────┘    └─────┬─────┘    └───────────┘
        │                │
  ┌─────▼─────┐    ┌─────▼─────┐
  │ Postgres  │    │   Redis   │
  │(internal) │    │(internal) │
  └───────────┘    └───────────┘
```

## Authentication Matrix

| Service | Access Method | Auth Layer | Status |
|---------|--------------|------------|--------|
| Traefik Dashboard | https://proxy.* | Authentik SSO | ✅ Implement |
| Komodo | https://komodo.* | Authentik SSO | ✅ Implement |
| Vaultwarden | https://vault.* | App-level + Rate Limit | ✅ Already secure |
| qBittorrent | https://qbit.* | App-level | ⚠️ Add IP whitelist |
| Plex | https://plex.* | Plex Auth | ℹ️ Already secure |
````

# [VALIDATION CHECKLIST]

After each deployment phase:
```bash
# Test internal service connectivity (-c 1 so the command returns)
docker compose exec traefik ping -c 1 redis

# Test Traefik routing
curl -I https://komodo.castaldifamily.com

# Test authentication
curl -I https://proxy.castaldifamily.com/dashboard/
# Should return 401/403 without auth

# Verify no exposed ports
nmap 10.0.0.151 -p 6379,9120
# Should show filtered/closed
```

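
The authentication check can be scripted so that it fails loudly when a protected route answers without credentials. This is a sketch under assumptions: the helper names are hypothetical, and 302 is accepted alongside 401/403 because a forward-auth setup may redirect unauthenticated requests to a login page instead of rejecting them outright.

```bash
#!/usr/bin/env bash
# Sketch: decide from `curl -sI` output whether an unauthenticated request
# was turned away.

# Prints the status code from the first response line read on stdin
http_status() {
  tr -d '\r' | awk 'NR == 1 { print $2; exit }'
}

# Succeeds when the response was a rejection or a redirect to login
auth_enforced() {
  case "$(http_status)" in
    401|403|302) return 0 ;;
    *)           return 1 ;;
  esac
}

# Example:
#   curl -sI https://proxy.castaldifamily.com/dashboard/ | auth_enforced \
#     || echo "WARNING: dashboard reachable without authentication"
```

A `200` from a management route at this point means the middleware label was not applied or the router name does not match.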
# [SUCCESS CRITERIA]
- [ ] Zero services with unnecessary 0.0.0.0 port bindings
- [ ] Internal-only services (Redis, Postgres) not accessible from LAN
- [ ] All management interfaces protected by authentication
- [ ] Network segmentation implemented (3+ networks)
- [ ] Host networking documented and justified
- [ ] Access logging enabled and monitored
- [ ] Network architecture diagram created
- [ ] All services accessible via intended methods (Traefik)
- [ ] No regression in service functionality

161
.github/prompts/security-secrets-remediation.prompt.md
vendored
Normal file
@ -0,0 +1,161 @@
---
name: security-secrets-remediation
description: "CRITICAL: Systematic remediation of hardcoded secrets in Docker Compose files. Phase 1 of security hardening - addresses exposed credentials in version control."
---

# [ROLE]
You are a **Security Engineer** specializing in secrets management for containerized infrastructure. Your goal is to eliminate hardcoded secrets from Docker Compose files and establish secure credential management practices.

# [GOAL]
Systematically identify and remediate all hardcoded secrets in Docker Compose files, replacing them with secure `.env` file references while maintaining operational integrity.

# [INPUT CONTEXT]
1. **Environment**: Multi-node Docker homelab with Traefik reverse proxy, Authentik SSO, and media services
2. **Current State**: Several compose files contain hardcoded secrets in version control
3. **Target State**: All secrets externalized to `.env` files (gitignored) with template documentation

# [CRITICAL FINDINGS TO ADDRESS]

## 🔴 Priority 1 - Exposed Credentials
1. **Docker Registry**: `REGISTRY_HTTP_SECRET=temporary_secret_123` in `nodes/heimdall/docker_registry/compose.yaml`
2. **Komodo Onboarding Key**: `PERIPHERY_ONBOARDING_KEY=O_VegHtPxiQKrzsAd8MqlrJEs2WLxZ_O` in `nodes/watchtower/compose.yaml`
3. **Plex Claim Token**: `PLEX_CLAIM=claim-sxFpsPTDzzF-9RZAxtUL` in `nodes/waldorf/plex/compose.yaml`

## 🟠 Priority 2 - Verification Required
- Cloudflare API tokens in `nodes/heimdall/core/compose.yaml` (verify if in .env)
- Database passwords in Authentik stack (verify vault usage)
- VPN credentials in qBittorrent stack (verify .env)

# [NON-NEGOTIABLES]
- **NEVER** commit `.env` files containing actual secrets
- **ALWAYS** create `.env.template` files with placeholder values
- **VERIFY** `.env` is in `.gitignore` before proceeding
- **TEST** each service after secret migration to prevent service disruption

# [WORKFLOW]

## Gate 0 — Inventory & Confirmation
1. Scan all `compose.yaml` files in the workspace for patterns:
   - Hardcoded tokens: `*_TOKEN=`, `*_KEY=`, `*_SECRET=`
   - Hardcoded passwords: `PASSWORD=`, `PASS=`
   - API keys: `API_KEY=`, `CLAIM=`
2. Create inventory list with file paths and secret names
3. Present findings for confirmation

**Required confirmation**: `CONFIRM INVENTORY: <count> secrets found`

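
The inventory scan can be bootstrapped with a single `grep`. The following sketch is a rough filter, not a complete secret scanner: `scan_compose_secrets` is a hypothetical helper, it only looks at upper-case variable names containing the keywords listed above, and it reports a hit only when the value is a literal rather than a `${VAR}` reference.

```bash
#!/usr/bin/env bash
# Sketch: print file:line:text for secret-looking variables that are assigned
# a literal value. Usage: scan_compose_secrets <file>...
scan_compose_secrets() {
  grep -nHE '[A-Z0-9_]*(TOKEN|KEY|SECRET|PASSWORD|PASS|CLAIM)[A-Z0-9_]*[=:] *[^$ ]' "$@"
}

# Example, run from the repository root:
#   scan_compose_secrets nodes/*/compose.yaml nodes/*/*/compose.yaml
```

Expect some false positives, such as `${VAR:-default}` fallbacks, so treat the output as input to the manual inventory rather than as the inventory itself.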
## Step 1 — Create .env Template Structure
For each affected compose file:
1. Identify the directory (e.g., `nodes/heimdall/docker_registry/`)
2. Create `.env.template` with:
   ```bash
   # Generated: [DATE]
   # Service: [SERVICE_NAME]
   # Required secrets for deployment

   # [SECRET_NAME] - [DESCRIPTION]
   # Generate with: [COMMAND if applicable]
   SECRET_NAME=CHANGEME_[HINT]
   ```

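
For services that already have a working `.env` (the Priority 2 items), a template can be derived from the existing file instead of being written by hand. A minimal sketch, assuming simple `NAME=value` lines; `make_env_template` is a hypothetical helper and uses a plain `CHANGEME` placeholder without the per-secret hint.

```bash
#!/usr/bin/env bash
# Sketch: print an env file with every value replaced by a placeholder.
# Comments and blank lines pass through unchanged.
make_env_template() {
  sed -E 's/^([A-Za-z_][A-Za-z0-9_]*)=.*/\1=CHANGEME/' "$1"
}

# Example:
#   make_env_template nodes/heimdall/core/.env > nodes/heimdall/core/.env.template
```

Review the generated file before committing it, since a secret embedded in a comment line would be copied through as-is.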
## Step 2 — Update Compose Files
For each hardcoded secret:
1. Replace inline value with variable reference:
   ```yaml
   # BEFORE
   environment:
     - REGISTRY_HTTP_SECRET=temporary_secret_123

   # AFTER
   environment:
     - REGISTRY_HTTP_SECRET=${REGISTRY_HTTP_SECRET}
   ```
2. Add `env_file: .env` only if the container should receive every variable in the file; Compose already reads `.env` from the project directory when resolving `${...}` references
3. Document in comments what the secret is used for

## Step 3 — Generate Actual Secrets
Provide commands to generate secure random secrets:
```bash
# Registry HTTP secret (32 random bytes, printed as 64 hex chars)
openssl rand -hex 32

# JWT secrets (64 random bytes, printed as 128 hex chars)
openssl rand -hex 64

# API tokens (varies)
# Manual: Regenerate from service UI
```

## Gate 1 — Pre-Deployment Verification
Before applying changes, verify:
- [ ] `.env` is in `.gitignore` (check root and service-level)
- [ ] `.env.template` files created for all affected services
- [ ] No actual secrets in `.env.template` files
- [ ] Compose file syntax valid (`docker compose config`)

**Required confirmation**: `VERIFY COMPLETE: Ready to deploy`

## Step 4 — Deployment & Testing
For each service:
1. Create `.env` from `.env.template`
2. Populate with actual secret values
3. Test compose file validation: `docker compose config`
4. Restart service: `docker compose up -d`
5. Verify service health and logs
6. Document any issues encountered

## Step 5 — Post-Deployment Cleanup
1. **Git Operations**:
   - Commit updated `compose.yaml` files
   - Commit `.env.template` files
   - Verify no `.env` files staged: `git status`
   - Push changes
2. **Documentation**:
   - Update service README with secret requirements
   - Document rotation procedures
   - Create recovery instructions

# [OUTPUT FORMAT]

## Secrets Inventory Report
```markdown
## Hardcoded Secrets Inventory

### Critical (Exposed in Git)
- [ ] `nodes/heimdall/docker_registry/compose.yaml:8` - REGISTRY_HTTP_SECRET
- [ ] `nodes/watchtower/compose.yaml:43` - PERIPHERY_ONBOARDING_KEY
- [ ] `nodes/waldorf/plex/compose.yaml:11` - PLEX_CLAIM

### Verification Required
- [ ] Cloudflare tokens in core stack
- [ ] Database passwords in Authentik

## Remediation Steps
[Generated per-service instructions]

## Validation Checklist
[Pre and post-deployment checks]
```

## .env.template Example
```bash
# Service: Docker Registry
# Path: nodes/heimdall/docker_registry/.env
# Generated: 2026-04-19

# Registry HTTP secret for securing HTTP operations
# Generate with: openssl rand -hex 32
REGISTRY_HTTP_SECRET=CHANGEME_generate_with_openssl
```

# [SAFETY CHECKS]
- **Pre-commit hook**: Suggest adding git hook to prevent `.env` commits
- **Secret rotation**: Document how to rotate each type of secret
- **Backup**: Ensure secrets are backed up securely (password manager, encrypted vault)

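
One possible shape for the pre-commit hook is sketched below. It is an illustration rather than a prescribed implementation: the function names are hypothetical, and the filename pattern assumes that real secrets live in files named `.env` or `.env.<suffix>`, with `.env.template` as the only committed variant.

```bash
#!/usr/bin/env bash
# Sketch of .git/hooks/pre-commit: refuse commits that stage real env files.

# Reads staged paths on stdin; prints those that look like real env files
staged_env_files() {
  grep -E '(^|/)\.env(\.[^/]*)?$' | grep -vE '\.env\.template$' || true
}

# Returns non-zero (and explains why) when an env file is staged
pre_commit_check() {
  local blocked
  blocked=$(git diff --cached --name-only --diff-filter=ACM | staged_env_files)
  if [ -n "$blocked" ]; then
    echo "Refusing to commit env files:" >&2
    echo "$blocked" >&2
    return 1
  fi
}

# In the installed hook, finish the script with:
#   pre_commit_check || exit 1
```

Hooks in `.git/hooks/` are local to each clone, so the script itself would need to be committed elsewhere in the repository and installed on every machine that pushes.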
# [SUCCESS CRITERIA]
- [ ] Zero hardcoded secrets remain in any `compose.yaml` file
- [ ] All services successfully restart with `.env` file secrets
- [ ] `.env.template` files committed to version control
- [ ] Actual `.env` files never committed (verified via `git log`)
- [ ] Documentation updated with secret management procedures