# nvidia_runtime_setup

Ansible role that installs and verifies the NVIDIA driver and runtime on Debian-family hosts.

## What it does

- Detects NVIDIA GPU hardware via `lspci`
- Auto-selects a recommended driver on Ubuntu (or uses an explicit package pin)
- Installs the NVIDIA driver package
- Optionally installs the CUDA toolkit and the NVIDIA container toolkit
- Optionally reboots the host when a reboot is needed
- Verifies readiness with `nvidia-smi`
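The detection and verification steps listed above could be sketched as Ansible tasks along the following lines. This is only an illustration under stated assumptions, not the role's actual task file: the register names `gpu_lspci` and `gpu_smi` are hypothetical, and the role may detect hardware or gate validation differently.

```yaml
---
# Illustrative sketch only; the role's real tasks may differ.
- name: Detect NVIDIA GPU hardware
  ansible.builtin.shell: lspci | grep -i nvidia
  register: gpu_lspci                        # hypothetical variable name
  changed_when: false
  # grep exits 1 when nothing matches; treat that as "no GPU", not as an error
  failed_when: gpu_lspci.rc not in [0, 1]

- name: Verify driver readiness
  ansible.builtin.command: nvidia-smi
  register: gpu_smi                          # hypothetical variable name
  changed_when: false
  # Run the check only when a GPU was detected above
  when: gpu_lspci.rc == 0
```

In this sketch the second task fails the play when `nvidia-smi` is missing or returns a non-zero exit code, which is the behavior a readiness check generally wants.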

## Safe defaults

- Reboot is disabled by default (`nvidia_runtime_reboot_if_needed: false`)
- CUDA and container toolkit installs are disabled by default
- Validation is enabled by default and fails if `nvidia-smi` is unavailable
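As a rough picture of how these defaults might look in a `defaults/main.yml`, consider the sketch below. Only `nvidia_runtime_reboot_if_needed` is named in this README; the other three variable names are hypothetical placeholders chosen for illustration, so check the role's own defaults file for the real names before overriding anything.

```yaml
---
# Illustrative sketch of defaults/main.yml, not the role's actual file.

# Name and default value as stated in this README
nvidia_runtime_reboot_if_needed: false

# Hypothetical variable names; the role may call these something else
nvidia_runtime_install_cuda: false
nvidia_runtime_install_container_toolkit: false
nvidia_runtime_validate: true
```

Defaults defined this way have the lowest variable precedence in Ansible, so any of them can be overridden per play, per group, or per host.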

## Example

```yaml
---
- name: Configure NVIDIA runtime for AI nodes
  hosts: ai_nodes
  become: true
  roles:
    - role: nvidia_runtime_setup
      vars:
        nvidia_runtime_reboot_if_needed: true
```