
Perfect — this gives us the **real-world shape** of the Workday record we should design around, *before* you ever get API access. Based on what you pasted, your **Employee ID is numeric (8 digits)** and you have a stable set of “Job Details” + “Public Contact” fields we can treat as your **Phase 1 canonical Worker contract**. [\[wheelsinc-...epoint.com\]](https://wheelsinc-my.sharepoint.com/personal/castn1_wheels_com/Documents/Microsoft%20Copilot%20Chat%20Files/server.py)

Below I’m going to do two things:

1.  Translate your Workday profile into a **clean canonical schema** (fields, types, constraints) that you can enforce with Pydantic (**WIS-010**) now.
2.  Show how that schema feeds your **dry-run diff tool** (**WIS-019**) and your later Workday→AD sync logic—without needing Workday access yet.

***

## 1) What your profile tells us about “employee\_id” (and constraints)

From your profile:

*   `Employee ID` = `10016638` → **numeric string**, **8 digits** (at least in your tenant).
*   This is a **better primary key** than email or name for matching to AD.

### Recommended constraint (for now)

*   Treat `employee_id` as a **string** (not int) to preserve leading zeros if they exist in other records.
*   Enforce pattern: `^[0-9]{8}$` (8 digits)

> If you later discover IDs vary (e.g., 6–10 digits), you can loosen the regex to `^[0-9]{6,10}$` without breaking downstream contracts, because the field stays a string either way.
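A minimal sketch of that check using only the standard library. The helper name `is_valid_employee_id` is hypothetical, and the 8-digit rule is an assumption based on the single profile seen so far.

```python
import re

# Assumption: IDs are exactly 8 ASCII digits (loosen to {6,10} if that changes).
EMPLOYEE_ID_PATTERN = re.compile(r"[0-9]{8}")


def is_valid_employee_id(value: object) -> bool:
    """Return True only for a string of exactly eight ASCII digits."""
    # fullmatch anchors both ends, so surrounding whitespace or a trailing
    # newline is rejected instead of being silently accepted.
    return isinstance(value, str) and EMPLOYEE_ID_PATTERN.fullmatch(value) is not None


print(is_valid_employee_id("10016638"))  # True
print(is_valid_employee_id("00016638"))  # True: leading zeros survive as a string
print(is_valid_employee_id(10016638))    # False: ints are rejected on purpose
print(is_valid_employee_id("EMP001"))    # False
```

Using `[0-9]` rather than `\d` keeps the match to ASCII digits; in Python, `\d` also accepts other Unicode digit characters.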

***

## 2) Canonical Worker Schema (Phase 1 / no Workday API required)

Here’s a **normalized Worker model** derived from your field list. I’m using names that are stable + automation-friendly, while keeping the original Workday labels clear.

### Field mapping (Workday → canonical)

**Identity**

*   `employee_id` ← Employee ID (string, 8-digit numeric)
*   `work_email` ← Email (string, email format)

**Org / Reporting**

*   `supervisory_organization` ← Supervisory Organization (string)
*   `manager_name` ← manager name that trails the Supervisory Organization value (string, optional)
*   `manager_employee_id` ← (not present in your paste; optional until Workday API provides it)

**Job**

*   `position` ← Position
*   `business_title` ← Business Title
*   `job_profile` ← Job Profile
*   `employee_type` ← Employee Type (e.g., Regular)
*   `management_level` ← Management Level (e.g., Individual Contributor)
*   `time_type` ← Time Type (e.g., Full time)
*   `fte_percent` ← FTE (decimal percent)

**Employment dates**

*   `hire_date` ← Hire Date (date)
*   `original_hire_date` ← Original Hire Date (date)
*   `continuous_service_date` ← Continuous Service Date (date)

**Location**

*   `location` ← Location
*   `work_address` ← Work Address (string)
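The mapping above can be sketched as a plain lookup table. The function name `to_canonical` is hypothetical, and `manager_name` / `manager_employee_id` are left out because they are derived or not yet available rather than direct labels.

```python
# Workday display label -> canonical field name, taken from the mapping above.
WORKDAY_TO_CANONICAL = {
    "Employee ID": "employee_id",
    "Email": "work_email",
    "Supervisory Organization": "supervisory_organization",
    "Position": "position",
    "Business Title": "business_title",
    "Job Profile": "job_profile",
    "Employee Type": "employee_type",
    "Management Level": "management_level",
    "Time Type": "time_type",
    "FTE": "fte_percent",
    "Hire Date": "hire_date",
    "Original Hire Date": "original_hire_date",
    "Continuous Service Date": "continuous_service_date",
    "Location": "location",
    "Work Address": "work_address",
}


def to_canonical(raw: dict) -> tuple[dict, list]:
    """Rename known Workday labels; collect unknown labels instead of dropping them."""
    canonical, unknown = {}, []
    for label, value in raw.items():
        key = WORKDAY_TO_CANONICAL.get(label.strip())
        if key is None:
            unknown.append(label)
        else:
            canonical[key] = value
    return canonical, unknown


record, unknown = to_canonical({"Employee ID": "10016638", "Cost Center": "n/a"})
print(record, unknown)
```

Returning the unknown labels separately lets you log schema drift without failing the whole record.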

### Why this specific cut matters

This set gives you enough to drive:

*   onboarding timing (hire date)
*   basic directory attributes (title, department-like org fields, location)
*   reporting chain resolution (supervisory org / manager)
*   future group mapping (job profile, employee type, management level)

…and it’s stable enough to validate *before* Workday access.

***

## 3) How I’d update your current MCP server **right now** (targeted improvements)

Your current `server.py` is a solid scaffold, but it’s using:

*   mock IDs like `EMP001`, and
*   a flat dict with fields (`name`, `status`, `dept`, `email`, `manager_id`) [\[wheelsinc-...epoint.com\]](https://wheelsinc-my.sharepoint.com/personal/castn1_wheels_com/Documents/Microsoft%20Copilot%20Chat%20Files/server.py)

To align with *real Workday data* **before you ever connect**:

### A) Switch your mock dataset to real ID format

Change mock keys from `EMP001` → `10016638` style.

*   This forces you to solve input validation now, and it removes casing issues entirely (numeric IDs have no casing).
*   It prevents later refactors when Workday is live.

### B) Standardize your tool outputs (strongly recommended)

Right now, tools return either:

*   a worker dict OR
*   `{"error": ...}` [\[wheelsinc-...epoint.com\]](https://wheelsinc-my.sharepoint.com/personal/castn1_wheels_com/Documents/Microsoft%20Copilot%20Chat%20Files/server.py)

Before Workday access, lock in a stable envelope like:

```json
{
  "ok": true,
  "data": { ... },
  "error": null,
  "meta": {
    "correlation_id": "uuid",
    "source": "mock",
    "contract_version": "wis.v1"
  }
}
```

Why now? Because Power Automate + connectors love consistent shapes. If you wait, you’ll end up rewriting flows and schemas later.
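One way to produce that envelope from every tool is a single helper, sketched here with the standard library only. The name `make_envelope` and the error shape (`code`, `message`) are assumptions, not something from your current `server.py`.

```python
import uuid
from typing import Optional

CONTRACT_VERSION = "wis.v1"


def make_envelope(data: Optional[dict] = None,
                  error: Optional[dict] = None,
                  source: str = "mock") -> dict:
    """Wrap a tool result so success and failure share one shape."""
    # Exactly one of data/error must be set; both or neither is a caller bug.
    if (data is None) == (error is None):
        raise ValueError("pass exactly one of data or error")
    return {
        "ok": error is None,
        "data": data,
        "error": error,
        "meta": {
            "correlation_id": str(uuid.uuid4()),
            "source": source,
            "contract_version": CONTRACT_VERSION,
        },
    }


success = make_envelope(data={"employee_id": "10016638"})
failure = make_envelope(error={"code": "NOT_FOUND", "message": "no such worker"})
print(success["ok"], failure["ok"])  # True False
```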

### C) Build Pydantic models for:

*   Worker input (`WorkerModel`)
*   Tool response (`WorkerStatusResponse`)
*   Error (`ErrorModel`)
*   Envelope (`ToolResponse[T]`)

This is the core of **WIS-010**.

***

## 4) Concrete Pydantic Contract Proposal (WIS-010)

Here’s the **contract design** I recommend based on your profile fields and your automation goals.

### WorkerModel constraints (recommended)

*   `employee_id`: `^[0-9]{8}$`
*   `work_email`: must validate as email
*   `fte_percent`: 0–100 (float)
*   `time_type`: enum (`Full time`, `Part time`, etc.)
*   Dates: parsed into real `date` objects

### “Strict mode” recommendation

Start with:

*   `extra="forbid"` (reject unknown fields)
*   later relax to `extra="ignore"` if Workday payloads contain extra fields you don’t care about

Forbidding extras surfaces schema drift early: an unexpected field fails validation instead of being silently dropped.
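Here is a rough sketch of how these constraints could look, assuming Pydantic v2 (adjust if your repo is on v1). The field subset, the `time_type` values, and the simplified email pattern are assumptions; Pydantic's `EmailStr` is stricter but needs the optional `email-validator` package. All values are mock data.

```python
from datetime import date
from typing import Literal, Optional

from pydantic import BaseModel, ConfigDict, Field, ValidationError


class WorkerModel(BaseModel):
    # Strict mode: unknown fields are rejected rather than silently dropped.
    model_config = ConfigDict(extra="forbid")

    employee_id: str = Field(pattern=r"^[0-9]{8}$")
    # Simplified shape check (assumption), not full email validation.
    work_email: str = Field(pattern=r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
    business_title: Optional[str] = None
    # Extend this list once other Time Type values are confirmed.
    time_type: Literal["Full time", "Part time"]
    fte_percent: float = Field(ge=0, le=100)
    hire_date: date  # ISO strings such as "2021-03-15" are parsed into date objects


# Mock values for illustration only.
worker = WorkerModel(
    employee_id="10016638",
    work_email="worker@example.com",
    time_type="Full time",
    fte_percent=100,
    hire_date="2021-03-15",
)

try:
    WorkerModel(**{**worker.model_dump(), "fte_percent": 150.0})
except ValidationError as exc:
    # Each error entry names the failing field in its "loc" tuple.
    print(exc.errors()[0]["loc"])
```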

***

## 5) Your Dry-Run Diff Tool (WIS-019) — now shaped by real Workday fields

Once you validate Workday-shaped inputs, your dry-run diff becomes meaningful:

### What it should compare (Phase 1 scope)

From the fields you pasted, your first drift checks should include:

*   Title / Business Title / Job Profile
*   Location
*   Manager / supervisory org reference
*   Employee status → AD enabled/disabled state (status was not in your paste, so keep using the mock's `status` field until Workday supplies it)
*   Email → UPN/mail alignment

### Diff item example (your intended format + enhancements)

You suggested:

```json
{"field":"department","old":"Sales","new":"Marketing","action":"update"}
```

I’d add two fields **now** that become critical in Phase 3:

*   `severity` (low/medium/high)
*   `reversible` (true/false)

Example:

```json
{
  "field": "business_title",
  "old": "Team Lead, Deskside Support",
  "new": "Team Lead, Deskside Support",
  "action": "none",
  "severity": "low",
  "reversible": true
}
```

Then your approval payload can be generated directly from the diff result (no extra logic later).
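A minimal sketch of a diff function that emits items in this format. The name `dry_run_compare` matches the planned tool, but the signature and the severity/reversibility table are assumptions you would tune per field. Here `old` is the current AD value and `new` is the Workday value.

```python
# Hypothetical per-field policy: (severity, reversible). Tune before Phase 3.
FIELD_POLICY = {
    "business_title": ("low", True),
    "location": ("low", True),
    "manager_name": ("medium", True),
    "work_email": ("high", False),
}


def dry_run_compare(workday: dict, ad: dict, fields=tuple(FIELD_POLICY)) -> list:
    """Describe what a sync would change, without changing anything."""
    diff = []
    for field in fields:
        old, new = ad.get(field), workday.get(field)
        # Fields without a policy entry default to the cautious setting.
        severity, reversible = FIELD_POLICY.get(field, ("medium", False))
        diff.append({
            "field": field,
            "old": old,
            "new": new,
            "action": "none" if old == new else "update",
            "severity": severity,
            "reversible": reversible,
        })
    return diff


workday = {"business_title": "Team Lead, Deskside Support", "location": "Site A"}
ad = {"business_title": "Team Lead, Deskside Support", "location": "Site B"}
for item in dry_run_compare(workday, ad, fields=("business_title", "location")):
    print(item)
```

Emitting `"action": "none"` items, rather than omitting unchanged fields, makes the report show what was checked as well as what would change.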

***

## 6) Break Tests you can run **today** (with your real constraints)

Since we now know your ID shape, your “try to break it” script should include:

### Employee ID failure cases

*   empty string `""`
*   non-numeric `"EMP001"`
*   wrong length `"1001663"` (7 digits), `"100166380"` (9 digits)
*   whitespace `" 10016638 "`

### Data failures (WorkerModel)

*   invalid email `"not-an-email"`
*   FTE out of range `150.0`
*   Hire date invalid `"2021-13-40"`
*   Missing required fields (`employee_id`, `work_email`)

### Expected behavior

*   Tools return `ok:false`
*   Error has stable code (e.g., `VALIDATION_ERROR`)
*   No sensitive fields are echoed in logs (see below)

This is how you prove WIS-010 is real, not theoretical.
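The cases above can be driven from one table. This is a sketch only: `validate_worker` is a hypothetical standard-library stand-in for whatever validator you end up with (for example, a Pydantic model), and the `fields` key in the error is an assumption.

```python
import re
from datetime import date


def validate_worker(payload: dict) -> dict:
    """Hypothetical stand-in validator; returns an {ok, data, error} dict."""
    problems = []
    employee_id = payload.get("employee_id")
    if not isinstance(employee_id, str) or not re.fullmatch(r"[0-9]{8}", employee_id):
        problems.append("employee_id")
    email = payload.get("work_email")
    if not isinstance(email, str) or not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", email):
        problems.append("work_email")
    if "fte_percent" in payload:
        fte = payload["fte_percent"]
        if not isinstance(fte, (int, float)) or not 0 <= fte <= 100:
            problems.append("fte_percent")
    if "hire_date" in payload:
        try:
            date.fromisoformat(str(payload["hire_date"]))
        except ValueError:
            problems.append("hire_date")
    if problems:
        # Report field names only, never the offending values.
        return {"ok": False, "data": None,
                "error": {"code": "VALIDATION_ERROR", "fields": problems}}
    return {"ok": True, "data": payload, "error": None}


GOOD = {"employee_id": "10016638", "work_email": "worker@example.com"}

BREAK_CASES = [
    {"employee_id": ""},
    {"employee_id": "EMP001"},
    {"employee_id": "1001663"},      # 7 digits
    {"employee_id": "100166380"},    # 9 digits
    {"employee_id": " 10016638 "},   # surrounding whitespace
    {"work_email": "not-an-email"},
    {"fte_percent": 150.0},
    {"hire_date": "2021-13-40"},
]

for override in BREAK_CASES:
    result = validate_worker({**GOOD, **override})
    assert result["ok"] is False, override
    assert result["error"]["code"] == "VALIDATION_ERROR"

# Missing required fields must fail too, and the baseline must pass.
assert validate_worker({})["error"]["fields"] == ["employee_id", "work_email"]
assert validate_worker(GOOD)["ok"] is True
```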

***

## 7) Security/Hardening decisions informed by your profile fields

Because Workday records contain PII-like data (email, address), your hardening step (WIS-027) should **explicitly redact**:

*   `work_email`
*   `work_address`
*   `employee_id` (optional partial mask like `******38`)

### Redaction policy (recommended)

*   In logs: redact always
*   In tool responses:
    *   keep email if the tool is intended for automation and access is controlled
    *   or provide a config switch: `RETURN_PII=false` (defaults safe)

This is much easier to implement now than after Workday is live.
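A small sketch of the redaction policy, standard library only. The helper names (`mask_employee_id`, `redact_for_log`, `return_pii`) and the `[REDACTED]` placeholder are assumptions; `RETURN_PII` is read from the environment here, which is one possible home for that switch.

```python
import os

REDACTED_FIELDS = ("work_email", "work_address")


def mask_employee_id(employee_id: str, visible: int = 2) -> str:
    """Keep only the last `visible` characters, e.g. '10016638' -> '******38'."""
    # IDs too short to mask partially are masked completely.
    if visible <= 0 or len(employee_id) <= visible:
        return "*" * len(employee_id)
    cut = len(employee_id) - visible
    return "*" * cut + employee_id[cut:]


def redact_for_log(record: dict) -> dict:
    """Return a copy that is safe to log; the input dict is left untouched."""
    safe = dict(record)
    for field in REDACTED_FIELDS:
        if field in safe:
            safe[field] = "[REDACTED]"
    if isinstance(safe.get("employee_id"), str):
        safe["employee_id"] = mask_employee_id(safe["employee_id"])
    return safe


def return_pii() -> bool:
    """RETURN_PII defaults to false, so PII is withheld unless explicitly enabled."""
    return os.environ.get("RETURN_PII", "false").strip().lower() == "true"


print(redact_for_log({"employee_id": "10016638", "work_email": "worker@example.com"}))
```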

***

## 8) What I would change in your current `server.py` (actionable list)

In order, **before Workday API access**:

1.  **Replace EMP### mock IDs with numeric employee IDs** (like your `10016638`) [\[wheelsinc-...epoint.com\]](https://wheelsinc-my.sharepoint.com/personal/castn1_wheels_com/Documents/Microsoft%20Copilot%20Chat%20Files/server.py)
2.  **Add Pydantic WorkerModel + Response envelope** (WIS-010) [\[wheelsinc-...epoint.com\]](https://wheelsinc-my.sharepoint.com/personal/castn1_wheels_com/Documents/Microsoft%20Copilot%20Chat%20Files/server.py)
3.  **Normalize tool outputs** into `{ok,data,error,meta}` for every tool [\[wheelsinc-...epoint.com\]](https://wheelsinc-my.sharepoint.com/personal/castn1_wheels_com/Documents/Microsoft%20Copilot%20Chat%20Files/server.py)
4.  **Implement dry\_run\_compare() tool** returning a typed DiffResult (WIS-019)
5.  **Add rate-limit simulation decorator** and make errors explicitly retryable (WIS-027)
6.  **Add log redaction helper** for email/address/employee\_id (WIS-027)
7.  **Externalize mock data** into JSON + support multiple scenario datasets
8.  **Pytest suite**: manager resolution scenarios (Found/Not Found/No Manager) using your mock structure [\[wheelsinc-...epoint.com\]](https://wheelsinc-my.sharepoint.com/personal/castn1_wheels_com/Documents/Microsoft%20Copilot%20Chat%20Files/server.py)
9.  **Pre-commit hooks** to block commits on failing tests/lint
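For item 5, a rate-limit simulation decorator might look roughly like this. `RetryableError`, `simulate_rate_limit`, and the `retry_after` attribute are hypothetical names, and the decorated tool is a stand-in, not your actual `server.py` function.

```python
import functools
import random


class RetryableError(Exception):
    """Hypothetical error type: the caller may retry after `retry_after` seconds."""

    def __init__(self, message: str, retry_after: float):
        super().__init__(message)
        self.retry_after = retry_after


def simulate_rate_limit(failure_rate: float, retry_after: float = 1.0, rng=random.random):
    """Make a mock tool raise RetryableError on roughly `failure_rate` of calls."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # rng() returns a float in [0, 1); 0.0 never fails, 1.0 always fails.
            if rng() < failure_rate:
                raise RetryableError("simulated rate limit", retry_after)
            return func(*args, **kwargs)
        return wrapper
    return decorator


@simulate_rate_limit(failure_rate=0.2)
def get_worker_status(employee_id: str) -> dict:
    # Stand-in tool body returning mock data.
    return {"employee_id": employee_id, "status": "Active"}
```

Passing `rng` as a parameter lets tests inject a deterministic function instead of relying on real randomness.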

***

## 9) Two quick questions so I lock the schema correctly

1.  In your tenant, is **Employee ID always 8 digits**, or have you seen contractors/vendors with a different format?
2.  Do you want **Work Address** treated as:
    *   a field you *store* in AD/Entra, or
    *   *reference only* (used for location logic but not written anywhere)?

***

## If you want, I can take the next step automatically

If you say “go,” I’ll draft:

*   A **Pydantic schema pack** (`models.py`) for WorkerModel, DiffResult, Envelope
*   A revised `server.py` that:
    *   uses numeric employee IDs
    *   returns consistent envelopes
    *   includes `dry_run_compare`
    *   includes redaction + rate-limit simulation toggles
*   A `/tests` folder with the manager tests + “break tests” script

Just tell me whether you prefer **Pydantic v1 or v2** (most orgs are v2 now, but I’ll match your repo dependency reality).