Compare commits
3 Commits
0018930255...0ed4e7198d
| Author | SHA1 | Date | |
|---|---|---|---|
| | 0ed4e7198d | | |
| | e9eaa32765 | | |
| | 202ca9ebea | | |
56
.github/prompts/plan-ansibleAptMaintenance.prompt.md
vendored
Normal file
@@ -0,0 +1,56 @@
## Plan: Ansible apt maintenance role rollout

Create a reusable Ansible role and thin orchestration playbook to run apt cache refresh, package upgrade, conditional reboot, and structured error handling across nodes. Reuse proven patterns already present in this repository, but generalize them beyond Proxmox and Swarm. Default behavior should be safe and idempotent, while variables allow switching between safe upgrade and dist-upgrade. Because you selected all-at-once rollout, include stronger pre-flight checks and clear failure controls to reduce blast radius.

**Steps**
1. Phase 1: Scope and variable contract definition.
Define role inputs and defaults for upgrade mode, reboot policy, retry behavior, and error strategy. Include variables for continuing past a failed host versus failing fast globally, for apt cache refresh timing, and for health checks.

2. Phase 2: Create a dedicated role structure.
Add a role such as ansible/roles/linux_apt_maintenance with split task files for preflight, update, reboot, validation, and reporting. Keep orchestration logic in playbooks and task logic inside the role.

3. Phase 3: Implement pre-flight guards.
Add checks for Debian-family hosts, apt/dpkg lock contention, and optional maintenance window gating. Capture pre-state with read-only tasks that never report a change, so repeated runs are safe.

4. Phase 4: Implement idempotent package update and upgrade flow.
Use ansible.builtin.apt instead of shell commands for package actions. Support both safe upgrade and dist-upgrade through a variable. Wrap critical steps in block/rescue/always to collect errors and produce host-level result facts.

5. Phase 5: Implement conditional reboot and reconnection.
Detect whether a reboot is needed (for example, via the reboot-required marker file, a kernel version mismatch, or both), reboot only when required, and wait for connectivity to return. Add timeout controls and post-reboot health checks.

6. Phase 6: Orchestration playbook and rollout strategy.
Create a playbook that applies the role to target groups. Because rollout is all-at-once, make explicit max_fail_percentage and any_errors_fatal decisions and define clear tags (update, reboot, health-check) so risky parts can be skipped.

7. Phase 7: Observability and failure reporting.
Summarize per-host outcomes at the end of the run: upgraded, rebooted, skipped, failed, and failure reason. Optionally add notification hooks later.

8. Phase 8: Validate and document usage.
Run in check mode where applicable, run against a test subset first, then full inventory. Add concise operator documentation for variables, tags, and safe usage examples.
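
The flow described in Phases 3 to 5 might look roughly like the task file below. This is a minimal sketch under stated assumptions, not the repository's implementation: every `apt_maint_*` variable and fact name is hypothetical, the file path is only a suggestion, and the play is assumed to gather facts and escalate privileges (`become: true`). The reboot check uses only the `/var/run/reboot-required` marker file; kernel mismatch detection is left out.

```yaml
# roles/linux_apt_maintenance/tasks/main.yml
# Illustrative sketch: all apt_maint_* names are hypothetical.
- name: Preflight | require a Debian-family host
  ansible.builtin.assert:
    that:
      - ansible_facts['os_family'] == 'Debian'
    fail_msg: "This role only supports Debian-family hosts."

- name: Update, upgrade, and reboot with structured error handling
  block:
    - name: Refresh the apt cache and upgrade packages
      ansible.builtin.apt:
        update_cache: true
        cache_valid_time: "{{ apt_maint_cache_valid_time | default(3600) }}"
        # 'safe' or 'dist', selected through a role variable
        upgrade: "{{ apt_maint_upgrade_mode | default('safe') }}"
        # Seconds to wait for the apt/dpkg lock before failing
        lock_timeout: "{{ apt_maint_lock_timeout | default(120) }}"
      register: apt_maint_upgrade

    - name: Check for the reboot-required marker
      ansible.builtin.stat:
        path: /var/run/reboot-required
      register: apt_maint_reboot_marker

    - name: Reboot only when the marker exists, then wait for reconnection
      ansible.builtin.reboot:
        reboot_timeout: "{{ apt_maint_reboot_timeout | default(600) }}"
      when: apt_maint_reboot_marker.stat.exists

    - name: Record a successful outcome
      ansible.builtin.set_fact:
        apt_maint_result:
          status: ok
          upgraded: "{{ apt_maint_upgrade is changed }}"
          rebooted: "{{ apt_maint_reboot_marker.stat.exists }}"
  rescue:
    - name: Record the failure and its reason
      ansible.builtin.set_fact:
        apt_maint_result:
          status: failed
          reason: "{{ ansible_failed_result.msg | default('unknown error') }}"
  always:
    - name: Report the per-host outcome
      ansible.builtin.debug:
        var: apt_maint_result
```

One design consequence worth noting: once the rescue section handles an error, Ansible no longer counts that host as failed. If play-level failure thresholds should still apply, a final task could fail the host explicitly when `apt_maint_result.status` is `failed`.
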

**Relevant files**
- /home/chester/homelab/ansible/roles/proxmox_post_install/tasks/post_common.yml: Reuse apt + conditional reboot pattern.
- /home/chester/homelab/ansible/archive/playbooks/proxmox/pve_update.yml: Reuse pre/post-flight checks and reboot wait pattern.
- /home/chester/homelab/ansible/archive/playbooks/docker/swarm_update.yml: Reuse structured rolling-maintenance and health assertion ideas.
- /home/chester/homelab/ansible/archive/playbooks/onboarding/proxmox_host.yml: Reuse block/rescue/always error-handling style.
- /home/chester/homelab/ansible/group_vars/all.yml: Reuse maintenance variable naming style and defaults.
- /home/chester/homelab/ansible/playbooks: Add new orchestration playbook entrypoint.
- /home/chester/homelab/ansible/roles: Add new linux apt maintenance role directory.

**Verification**
1. Syntax and lint checks for playbook and role task files.
2. Dry-run where supported to verify targeting and conditional branches.
3. Functional test on a non-critical host group: confirm that package upgrades occur and that a second run reports no changes.
4. Reboot-required scenario test: confirm that a reboot triggers only when needed and that the host reconnects successfully.
5. Failure-path test: simulate an apt lock or package failure and verify that rescue reporting is clear.
6. Full-run validation: confirm that the host summary shows the expected counts and no silent failures.

**Decisions**
- Included: configurable safe upgrade or dist-upgrade behavior.
- Included: reboot only when required.
- Included: all-at-once rollout as requested, with explicit risk controls.
- Excluded for now: automatic rollback and external alert integrations.
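
These decisions, together with the Phase 1 variable contract, could translate into role defaults along the following lines. This is only a sketch: every variable name and default value here is hypothetical and would need to follow the naming style already used in group_vars/all.yml.

```yaml
# roles/linux_apt_maintenance/defaults/main.yml
# Hypothetical names and values, shown only to illustrate the variable contract.
apt_maint_upgrade_mode: safe          # 'safe' or 'dist', passed to the apt module's 'upgrade' option
apt_maint_cache_valid_time: 3600      # seconds before the apt cache is refreshed again
apt_maint_lock_timeout: 120           # seconds to wait for the apt/dpkg lock
apt_maint_reboot_policy: if_required  # 'if_required' or 'never'
apt_maint_reboot_timeout: 600         # seconds to wait for a rebooted host to return
apt_maint_retries: 3                  # attempts for retryable steps
apt_maint_retry_delay: 15             # seconds between attempts
```

Keeping the safe mode as the default means an operator has to opt in to dist-upgrade explicitly, for example per group or with an extra variable at run time.
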

**Further Considerations**
1. Recommendation: start with a canary inventory group for the first production run, even if the final strategy remains all-at-once.
2. Recommendation: set max_fail_percentage to a conservative threshold to prevent a broad outage during all-at-once runs.
3. Recommendation: keep role generic and place service-specific drain/undrain logic in separate roles.
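
The first two recommendations could be sketched in the orchestration playbook as shown below. The playbook path, the `apt_canary` group, the `apt_maint_target` variable, and the 20 percent threshold are all assumptions chosen for illustration, not values taken from this repository.

```yaml
# playbooks/apt_maintenance.yml
# Illustrative sketch: group names, variable names, and thresholds are assumptions.
- name: Apt maintenance across all targeted hosts at once
  # Defaults to a hypothetical canary group unless a target is passed explicitly
  hosts: "{{ apt_maint_target | default('apt_canary') }}"
  become: true
  gather_facts: true
  serial: "100%"            # all-at-once: a single batch containing every host
  max_fail_percentage: 20   # abort the play if more than 20% of the batch fails
  any_errors_fatal: false   # a single failed host does not stop the others
  roles:
    - role: linux_apt_maintenance
      tags: [update]
```

Passing `-e apt_maint_target=all` would widen the run from the canary group to the full inventory. Note that `max_fail_percentage` only counts hosts that Ansible marks as failed, so failures that a role rescues without failing again will not trip the threshold.
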
6
.github/prompts/tutor-ansible.prompt.md
vendored
@@ -4,8 +4,14 @@ description: Generates Ansible code with beginner-friendly explanations.
---

You are a Senior DevOps Engineer acting as a Mentor.

The user is a beginner. Your goal is not just to provide code, but to teach "Best Practices."

The user will ask for Ansible code to accomplish a task. You will respond with:
1. A plain English explanation of the concept.
2. A file path where the code should be placed.
3. A valid YAML block of Ansible code.

**Rules for your output:**
1. **Structure First:** Always suggest creating a `Role` instead of a giant monolithic playbook.
2. **Explain Why:** For every module you use (e.g., `ansible.builtin.copy`), explain *why* you chose it over a shell command.
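
As an illustration of rule 2, a response might pair a task like the one below with a short justification in a comment. The role name, file path, and file names are hypothetical and exist only for this example.

```yaml
# roles/motd/tasks/main.yml (hypothetical role and file names)
- name: Install the message of the day
  # ansible.builtin.copy compares the source and destination and reports
  # "changed" only when they differ, so repeated runs are idempotent.
  # A raw `cp` through the shell module would report a change on every run.
  ansible.builtin.copy:
    src: motd            # looked up in the role's files/ directory
    dest: /etc/motd
    owner: root
    group: root
    mode: "0644"
```
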