docs: Update README with shard status details and resilience improvements

nathan 2026-04-13 11:35:32 -04:00
parent 8240d1b6b3
commit 15a63015fd


@@ -16,14 +16,14 @@ This page is the high-visibility execution status for Nexus-MCP, the sharded ent
 Each shard is independently toggleable via feature flags. Shards load only when their `ENABLE_*` flag is set to `true` in `.env`.
-| Shard | System(s) | Tools | Status | WIS Ref | Flag |
+| Shard | System(s) | Tools | Status | Notes | Flag |
 |---|---|---|---|---|---|
-| `identity` | Active Directory + Entra ID | 15 | 🟢 Green | WIS-017 | `ENABLE_IDENTITY` |
-| `workday` | Workday HCM | 7 | 🟢 Green | WIS-009 | `ENABLE_WORKDAY` |
-| `audit` | Cross-system drift | 9 | 🟡 Yellow | WIS-014-018 | `ENABLE_AUDIT` |
-| `itsm` | BMC Helix ITSM | 6 | 🔴 Red | Planned | `ENABLE_ITSM` |
-| `assets` | Lansweeper + Intune | 11 | 🔴 Red | Planned | `ENABLE_ASSETS` |
-| `logistics` | FedEx | 5 | 🔴 Red | Planned | `ENABLE_LOGISTICS` |
+| `identity` | Active Directory + Entra ID | 15 | 🟢 Green | Fully functional with AD live adapter | `ENABLE_IDENTITY` |
+| `workday` | Workday HCM | 7 | 🟢 Green | Production-ready with mock & live modes | `ENABLE_WORKDAY` |
+| `audit` | Cross-system drift | 9 | 🟡 Yellow | Minimal stub (restored 2026-04-13 for startup stability) | `ENABLE_AUDIT` |
+| `itsm` | BMC Helix ITSM | 6 | 🔴 Red | Placeholder pending credentials | `ENABLE_ITSM` |
+| `assets` | Lansweeper + Intune | 11 | 🔴 Red | Placeholder pending credentials | `ENABLE_ASSETS` |
+| `logistics` | FedEx | 5 | 🔴 Red | Placeholder pending credentials | `ENABLE_LOGISTICS` |
 **Architecture:** Plugin-based sharded model — each shard is a self-contained module (`src/shards/*.py`) that registers its tools via a `register(mcp)` function. The orchestrator (`src/main.py`) checks feature flags and loads only enabled shards. This allows piece-at-a-time deployment without touching the core server code.
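Reviewer note: the flag-gated loading described in the Architecture paragraph could look roughly like the sketch below. Only the `ENABLE_*` flag names and the `register(mcp)` convention come from the README. `FakeMCP`, `add_tool`, `enabled_shards`, `load_shards`, the tool names, and the case-insensitive flag comparison are hypothetical stand-ins, not the repository's actual code.

```python
import os
from typing import Callable, Dict, List, Mapping

# Flag names are taken from the shard table; everything else is illustrative.
SHARD_FLAGS: Dict[str, str] = {
    "identity": "ENABLE_IDENTITY",
    "workday": "ENABLE_WORKDAY",
    "audit": "ENABLE_AUDIT",
    "itsm": "ENABLE_ITSM",
    "assets": "ENABLE_ASSETS",
    "logistics": "ENABLE_LOGISTICS",
}


class FakeMCP:
    """Hypothetical stand-in for the server object; it only records tool names."""

    def __init__(self) -> None:
        self.tools: List[str] = []

    def add_tool(self, name: str) -> None:
        self.tools.append(name)


def enabled_shards(env: Mapping[str, str]) -> List[str]:
    """Return shard names whose flag is 'true' (compared case-insensitively here)."""
    return [
        name
        for name, flag in SHARD_FLAGS.items()
        if env.get(flag, "").strip().lower() == "true"
    ]


def load_shards(
    mcp: FakeMCP,
    registry: Mapping[str, Callable[[FakeMCP], None]],
    env: Mapping[str, str] = os.environ,
) -> List[str]:
    """Call register(mcp) for each enabled shard that has a registered loader."""
    loaded: List[str] = []
    for name in enabled_shards(env):
        if name in registry:
            registry[name](mcp)  # the shard's register(mcp) function
            loaded.append(name)
    return loaded
```

With `ENABLE_IDENTITY=true` and every other flag unset or `false`, only the identity shard's `register` function runs, which matches the piece-at-a-time deployment the paragraph describes.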
@@ -31,21 +31,40 @@ Each shard is independently toggleable via feature flags. Shards load only when
 | Engineering discipline pillar | Current state | Evidence |
 | --- | --- | --- |
+| Enterprise resilience (tenacity) | 🟢 Green | All HTTP clients wrapped with automatic retry (exponential backoff 2s→4s→8s), circuit breaker pattern, graceful degradation. Retries 5xx/timeouts only (not 4xx). |
 | Atomic deployment discipline | 🟢 Green | Each shard can be deployed independently via feature flags without risk to other shards. |
 | Type hinting discipline | 🟢 Green | All shards and lib/ adapters use typed return contracts per repository standards. |
 | Modular architecture discipline | 🟢 Green | Orchestrator (main.py), shards (tools), lib/ (adapters) cleanly separated — no cross-contamination. |
-| Mock-mode discipline | 🟢 Green | USE_MOCK flag enables full 53-tool testing without credentials (lib/mock_data.py with drift scenarios). |
+| Mock-mode discipline | 🟢 Green | USE_MOCK flag enables full testing without credentials (lib/mock_data.py with realistic drift scenarios). |
 | SOC 2 audit logging | 🟢 Green | Automatic JSONL audit trail with PII redaction for every tool invocation (CC7.2 / CC6.1). |
-| Traceability discipline | 🟢 Green | WIS IDs embedded in tool docstrings; shard status board maps directly to roadmap. |
+| Traceability discipline | 🟢 Green | Commits, PRs, and session snapshots tracked. Feature flags and shard status mapped to roadmap priorities. |
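Reviewer note: the "JSONL audit trail with PII redaction" row could be backed by something along these lines. This is a minimal sketch under stated assumptions. The record fields, the `SENSITIVE_KEYS` set, the email pattern, and the function names are all hypothetical, and the README does not say which fields the real logger redacts.

```python
import json
import re
from datetime import datetime, timezone
from typing import Any, Optional

# Hypothetical redaction rules; the real rule set is not shown in the README.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SENSITIVE_KEYS = {"password", "ssn", "token"}


def redact(value: Any) -> Any:
    """Recursively mask sensitive keys and email-like strings."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if key.lower() in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    if isinstance(value, str):
        return EMAIL_RE.sub("[REDACTED_EMAIL]", value)
    return value


def audit_record(tool: str, arguments: dict, now: Optional[datetime] = None) -> str:
    """Build one JSONL line (a single JSON object, no trailing newline)."""
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    record = {"ts": timestamp, "tool": tool, "arguments": redact(arguments)}
    return json.dumps(record, sort_keys=True)
```

Each tool invocation would append one such line to the audit file, so the trail stays greppable and can be parsed line by line.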
 ## Execution roadmap
-| Workstream | WIS IDs | Status | Execution posture |
+| Workstream | Status | Notes | Next steps |
 | --- | --- | --- | --- |
-| Core shards (Identity + Workday + Audit) | WIS-006 to WIS-018 | 🟢 Green | Nexus-MCP sharded architecture operational with 31 tools in mock mode. |
-| API/credentials transition | WIS-001 to WIS-008 | 🟡 Yellow | Live AD backend working; Workday API and Entra awaiting credential approval. |
-| Extended shards (ITSM + Assets + Logistics) | Phase 2+ | 🔴 Red | Stub shards created; awaiting credential provisioning and client development. |
-| Automation, reporting, remediation | WIS-019 to WIS-030 | 🔴 Red | Flow automation, KPI instrumentation, and cutover remain roadmap backlog. |
+| **Core shards (Identity + Workday + Audit)** | 🟢 Green | Nexus-MCP sharded architecture fully operational. Server loads all 6 shards. Identity/Workday functional; Audit stubbed pending full implementation. | Complete audit tools implementation; run pytest validation suite. |
+| **Enterprise resilience layer** | 🟢 Green | Retry logic, circuit breaker, and graceful degradation implemented across all HTTP clients. Automated testing validates 4xx vs 5xx distinction. | Monitor production behavior; gather metrics on retry/circuit breaker events. |
+| **API/credentials transition** | 🟡 Yellow | Live AD backend working. Workday API and Entra Graph API awaiting credential approval (WIS-001, WIS-008 "Holding Pattern"). | Provision API tokens; activate live modes in Workday and Entra shards. |
+| **Extended shards (ITSM + Assets + Logistics)** | 🔴 Red | Stub shards created as placeholders. Await credential provisioning and client library development. | Design client adapters; implement stub tools for Helix, Lansweeper, Intune, FedEx. |
+## Latest changes (2026-04-13)
+**✅ Server startup stabilized** — All shards loading successfully
+- Fixed broken audit shard that prevented server initialization (commit 8240d1b)
+- Audit module temporarily uses minimal stub pending full implementation
+- All 6 shards (identity, workday, audit, itsm, assets, logistics) now register correctly
+**📚 Enterprise resilience layer added**
+- Implemented automatic retry logic with exponential backoff (tenacity library)
+- Circuit breaker pattern prevents cascading failures
+- Graceful degradation in audit tools — continues with available systems if some fail
+- Health check tool added for proactive monitoring (commit 15a0007)
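Reviewer note: for readers unfamiliar with the circuit breaker pattern named in the list above, here is a minimal sketch of the idea. It is not the implementation in this commit. The class name, the thresholds, and the injectable `clock` are assumptions made for illustration.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    """Open after N consecutive failures; allow one trial call after a cool-down."""

    def __init__(
        self,
        failure_threshold: int = 3,
        reset_after: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                # Fail fast instead of hitting a backend known to be down.
                raise CircuitOpenError("circuit open; call skipped")
            # Cool-down elapsed: half-open, so a single failure reopens it.
            self._opened_at = None
            self._failures = self.failure_threshold - 1
        try:
            result = fn()
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        self._failures = 0  # any success closes the circuit fully
        return result
```

Failing fast while the circuit is open is what keeps one unavailable backend from tying up every tool call that depends on it.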
+**🔧 Stability improvements**
+- Fixed retry logic: Now correctly retries 5xx/timeouts but NOT 4xx errors
+- Updated deprecated datetime calls for Python 3.14+ compatibility
+- All HTTP clients (Workday, Entra, AD, Intune, Lansweeper, FedEx, Helix) wrapped with resilience decorators
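Reviewer note: the retry rule in the list above (retry 5xx responses and timeouts, never 4xx, with 2s, 4s, 8s waits) can be illustrated as follows. The README says the project uses the tenacity library. This sketch deliberately uses only the standard library so it runs on its own, and `HTTPStatusError`, `is_retryable`, and `call_with_retry` are hypothetical names rather than the project's decorators.

```python
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


class HTTPStatusError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed request."""

    def __init__(self, status: int) -> None:
        super().__init__(f"HTTP {status}")
        self.status = status


def is_retryable(exc: BaseException) -> bool:
    """Retry timeouts and 5xx responses; 4xx means the request itself is wrong."""
    if isinstance(exc, TimeoutError):
        return True
    if isinstance(exc, HTTPStatusError):
        return 500 <= exc.status < 600
    return False


def call_with_retry(
    fn: Callable[[], T],
    delays: Sequence[float] = (2.0, 4.0, 8.0),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, waiting delays[i] seconds after the i-th retryable failure."""
    for delay in delays:
        try:
            return fn()
        except Exception as exc:
            if not is_retryable(exc):
                raise  # e.g. 404 or 401: retrying cannot help
            sleep(delay)
    return fn()  # final attempt; any error propagates to the caller
```

Passing `sleep` as a parameter lets a test record the waits instead of actually sleeping, which is one way to validate the 4xx vs 5xx distinction quickly.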
 ## Recent activity (from git history)