Working Paper · People Operations & AI

Building the Context Layer Inside People Operations

A practitioner’s blueprint for the data and context foundation that makes HR AI agents work in production, not just in the demo.

[Illustration: a brick foundation being built beneath a platform where agents work, representing the context layer under People Operations.]

Executive summary

AI adoption in HR is nearly universal and realized value is rare. The cause sits below the model. HR agents fail in production because the data is fragmented, the definitions are contested, and the rules live in people’s heads instead of in a form a machine can read.

The fix is a context layer: a governed set of machine-readable definitions, identities, and rules that every HR agent must resolve against before it answers or acts. People Operations is the right place to build it first, because it owns the richest tribal knowledge in the company and carries the highest cost when context is wrong.

This paper sets out what is uniquely hard in HR, the foundation in priority order, a 60 to 90 day starting plan, and the metrics that keep you honest. The throughline: only the function that runs the work can author the ground truth the agents need. That reframes People Operations from a consumer of AI into the owner of its foundation.

01 · The problem

The failure is below the model

Start with the gap that should worry every HR leader. Adoption is high. Value is rare. And most teams cannot tell whether they are getting any value, because they do not measure it.

The pattern repeats across every major research house in 2025 and 2026. Gartner’s HR survey found 88% of HR leaders say their organization has not realized significant business value from AI tools. SHRM’s State of AI in HR 2026 found that 56% of HR teams do not formally measure AI success at all, and only 49% have an AI policy. McKinsey’s 2025 work found about 88% of companies use AI somewhere, only about a third have scaled it, and only around 6% are real high performers. The MIT NANDA study supplied the headline that 95% of enterprise generative AI efforts show no measurable return, with the root cause named as the learning gap: tools that never connect to how the work actually runs.

Read together, these say one thing. The bottleneck is not model quality. It is the foundation underneath the model.

The governance gap, in one statistic

Roughly 74% of organizations plan to deploy agentic AI within two years. Only about 21% have a mature governance model for it. Agents are scaling faster than the guardrails. In HR, where the data is the most sensitive in the company, that gap is not a risk to tolerate. It is the risk.

02 · What is uniquely hard in HR

The definitions everyone assumes and no one wrote down

Horizontal AI advice skips the part that actually breaks HR agents. A general data strategy never has to answer what a case is, or who counts as a worker. In People Operations, these are the whole game, and each one is contested across systems.

What is a case, and what is resolved

In modern case management the case itself, not the workflow, is the unit that matters, and it carries many interdependent workflows. The same case type is named one way in the employee-facing catalog, another way in the back-office topic taxonomy, and a third way in the HR service record. An agent that cannot resolve those three to one definition will route, count, and close cases inconsistently. And resolved is not closed. If the machine cannot tell the difference, every metric built on it is fiction.
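
One way to picture the definition layer for cases is a small lookup that every agent must pass through. The sketch below is illustrative only: the system names, labels, and the canonical key `payroll.bank_change` are hypothetical, and the rule that "resolved" means the worker's issue was fixed while "closed" means the record was shut for any reason is an assumption, not a definition taken from any particular case system.

```python
from dataclasses import dataclass
from enum import Enum


class CaseStatus(Enum):
    OPEN = "open"
    RESOLVED = "resolved"  # assumed meaning: the worker's issue was fixed
    CLOSED = "closed"      # assumed meaning: the record was shut, fixed or not


# Hypothetical labels: one case type as it is named in three systems.
CASE_TYPE_ALIASES = {
    ("catalog", "Update my bank details"): "payroll.bank_change",
    ("taxonomy", "Payroll > Direct Deposit"): "payroll.bank_change",
    ("service_record", "PAY-DD-CHG"): "payroll.bank_change",
}


def canonical_case_type(system: str, label: str) -> str:
    """Map a system-specific label to the single canonical case type."""
    try:
        return CASE_TYPE_ALIASES[(system, label)]
    except KeyError:
        # Fail loudly: an unmapped label goes to a human, it is never guessed.
        raise LookupError(f"No canonical case type for {system!r}: {label!r}")


@dataclass
class Case:
    case_type: str
    status: CaseStatus


def counts_as_resolved(case: Case) -> bool:
    """Only RESOLVED counts toward resolution metrics; CLOSED does not."""
    return case.status is CaseStatus.RESOLVED
```

The design choice that matters is the `LookupError`: when a label has no mapping, the agent stops and escalates instead of inventing a category.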

Who counts as a worker

Identity is the quiet killer. The same person carries different identifiers across the HRIS, the case system, and the directory. Worse, contingent workers are often kept out of the HRIS on purpose because they live in a separate vendor management system. An HR agent wired only to the HRIS is blind to a large slice of the workforce. The fix is a single canonical worker identity with merge rules at ingest and scheduled de-duplication, and access tied to contract dates.
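
A minimal sketch of merge rules at ingest, under stated assumptions: the normalized work email is used as the merge key (real programs often need a stronger key), the `W-00001` ID format is invented for illustration, and the record fields are hypothetical. It shows contingent workers from the VMS landing in the same index as employees, with access ending on the contract end date.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class SourceRecord:
    system: str                          # e.g. "hris", "case_system", "directory", "vms"
    local_id: str                        # the identifier that system uses
    work_email: str                      # assumed merge key
    contract_end: Optional[date] = None  # set for contingent workers


def merge_key(rec: SourceRecord) -> str:
    """Normalize the assumed merge key so trivial variants collapse."""
    return rec.work_email.strip().lower()


def build_identity_index(records):
    """Merge source records into one canonical worker entry per merge key."""
    index = {}
    for rec in records:
        key = merge_key(rec)
        if key not in index:
            index[key] = {
                "worker_id": f"W-{len(index) + 1:05d}",  # hypothetical ID format
                "local_ids": {},
                "contract_end": None,
            }
        worker = index[key]
        worker["local_ids"][rec.system] = rec.local_id
        if rec.contract_end is not None:
            worker["contract_end"] = rec.contract_end
    return index


def has_access(worker: dict, today: date) -> bool:
    """Access is tied to contract dates; no end date means no expiry."""
    end = worker["contract_end"]
    return end is None or today <= end
```

Running the same function on a schedule over all source systems is one simple form of the de-duplication pass described above.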

Service levels, severity, and sensitivity

HR service levels split response time from resolution time and tier both by severity. Those tiers have to be data the agent reads, not a convention in someone’s memory. And HR holds the most sensitive data in the company: identifiers, salary, health, performance, disciplinary, and investigation files. Data sensitivity classes and legal-hold rules are hard constraints on what any agent may retrieve or surface.
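
To make "tiers as data" concrete, here is a sketch of an SLA matrix and sensitivity classes an agent could read at runtime. Every value is illustrative: the severity names, the time targets, the four class names, and the rule that a record on legal hold is never retrievable by an agent are all assumptions to be replaced by your own policy.

```python
from datetime import timedelta
from enum import IntEnum


class Sensitivity(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2  # e.g. salary, performance (illustrative grouping)
    RESTRICTED = 3    # e.g. health, investigation files (illustrative grouping)


# severity -> (response target, resolution target); numbers are placeholders
SLA_TIERS = {
    "critical": (timedelta(hours=1), timedelta(hours=8)),
    "high": (timedelta(hours=4), timedelta(days=2)),
    "normal": (timedelta(days=1), timedelta(days=5)),
}


def sla_breaches(severity: str, time_to_response: timedelta,
                 time_to_resolution: timedelta) -> dict:
    """Check response and resolution separately against the tier's targets."""
    response_target, resolution_target = SLA_TIERS[severity]
    return {
        "response": time_to_response > response_target,
        "resolution": time_to_resolution > resolution_target,
    }


def may_retrieve(field_class: Sensitivity, agent_clearance: Sensitivity,
                 on_legal_hold: bool) -> bool:
    """Hard constraint: legal hold blocks retrieval regardless of clearance."""
    if on_legal_hold:
        return False
    return field_class <= agent_clearance
```

Keeping response and resolution as two separate checks mirrors the split in the service levels themselves: a case can be answered on time and still be resolved late.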

Why this is the high-leverage move

An HR agent works in the demo and dies in production because case, resolved, and worker are contested, system-specific definitions. The leverage is not a better model. It is a machine-readable definition layer that every agent resolves against. Only the function that runs case management can author it credibly.

03 · The foundation

What to build, in priority order

The context layer is not one product. It is a stack, and the order matters. Build it bottom up.

| Layer | What it is | Why it comes first |
| --- | --- | --- |
| 1. Canonical worker identity | One employee ID, merge rules at ingest, scheduled de-dup, contingent workers included from the VMS. | Every other layer is wrong if the agent cannot tell who it is talking about. |
| 2. Machine-readable definitions | Case taxonomy, the resolution rule, SLA tiers by severity, data-sensitivity classes, encoded as data. | This is the ground truth agents resolve against. Wiki pages do not count. |
| 3. Governed context layer | Definitions, access policy, and lineage enforced in the query path, respecting existing permissions. | Governance has to fire before an answer is generated, not after. |
| 4. Sensitive-data guardrails | Sensitivity classes, legal hold, purpose limits, so a self-replanning agent cannot drift into special-category data. | In HR a wrong norm is a privacy incident, not a bad chart. |
| 5. Compliance scaffolding | Worker notice, human in the loop, log retention, explanation capability. | The EU AI Act puts HR AI in its high-risk class. Build for it now. |
| 6. Measurement | Accuracy, autonomous resolution, escalation, drift, against a human baseline. | 56% of HR teams measure nothing. Pick this fight on day one. |

The compliance work and the capability work are the same work

HR and employment AI is high-risk under the EU AI Act, with duties to inform workers, keep a human in the loop, retain logs for at least six months, and explain decisions. A proposed delay to late 2027 is not yet law, so do not defer. The audit trail those rules demand is the same lineage and logging the context layer needs to function. Build it once.
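
The "build it once" point can be sketched as a single check in the query path that both enforces policy and writes the audit record. This is a sketch under stated assumptions, not a compliance implementation: the purpose name, field names, and `authorize` function are hypothetical, and the 184-day figure is simply the longest possible six-month span in days, used here as an illustrative floor.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Illustrative floor for "at least six months" of log retention.
MIN_LOG_RETENTION = timedelta(days=184)

# Hypothetical purpose limits: which fields each declared purpose may touch.
ALLOWED_FIELDS = {"benefits_lookup": {"benefits_plan", "enrollment_window"}}

AUDIT_LOG = []  # stand-in for a durable, append-only log store


@dataclass
class Decision:
    allowed: bool
    needs_human: bool
    reason: str


def authorize(purpose: str, field: str, affects_employment: bool,
              now: datetime = None) -> Decision:
    """Run before any answer is generated; every call is logged."""
    now = now or datetime.now(timezone.utc)
    if field not in ALLOWED_FIELDS.get(purpose, set()):
        decision = Decision(False, False, "field outside declared purpose")
    elif affects_employment:
        decision = Decision(True, True, "employment-affecting: human review required")
    else:
        decision = Decision(True, False, "within declared purpose")
    AUDIT_LOG.append({
        "at": now,
        "purpose": purpose,
        "field": field,
        "allowed": decision.allowed,
        "needs_human": decision.needs_human,
        "reason": decision.reason,  # supports explanation capability
        "retain_until": now + MIN_LOG_RETENTION,
    })
    return decision
```

Because denials are logged alongside approvals, the same record serves as lineage for the context layer and as the trail an auditor would ask for.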

04 · The plan

A 60 to 90 day start

This plan is a synthesis of patterns that work, not a vendor framework. The goal is one defensible win, not a platform.

Days 0–30 · Inventory and define

Map every system that holds worker data: the HRIS, the case system, the VMS. Author version one of the machine-readable definitions: what a case is, what resolved means, the SLA tier matrix, the data-sensitivity classes, and the canonical worker identity. Pick one high-volume, low-sensitivity case type, such as a benefits or policy lookup, as the beachhead.

Days 30–60 · Ground and guardrail

Stand up a context layer over the beachhead that enforces definitions and permissions in the query path. Wire in canonical identity. Add runtime guardrails: purpose limitation, a human in the loop on any decision that affects employment, and logging on from day one. Instrument three numbers, not one: deflection, autonomous resolution, and accuracy.
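
The three numbers can be instrumented with very little code. The sketch below assumes working definitions that your team would need to settle for itself: deflection is the share of cases that never reached a human, autonomous resolution is the share the agent actually resolved without a human, and accuracy is the share where the agent's answer matched a human-reviewed answer. The `CaseOutcome` fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CaseOutcome:
    reached_human: bool      # did a person have to touch the case?
    resolved_by_agent: bool  # did the agent itself resolve the issue?
    agent_correct: bool      # did the agent match the human-reviewed answer?


def agent_metrics(outcomes) -> dict:
    """Compute the three rates over a batch of reviewed case outcomes."""
    n = len(outcomes)
    if n == 0:
        raise ValueError("no outcomes to measure")
    deflected = sum(not o.reached_human for o in outcomes)
    auto_resolved = sum(
        o.resolved_by_agent and not o.reached_human for o in outcomes
    )
    correct = sum(o.agent_correct for o in outcomes)
    return {
        "deflection": deflected / n,
        "autonomous_resolution": auto_resolved / n,
        "accuracy": correct / n,
    }
```

Under these definitions a case where the worker simply gave up counts as deflected but not as resolved, which is why deflection alone tends to overstate what the agent achieved.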

Days 60–90 · Pilot, measure, govern

Run the beachhead agent on real cases against a human baseline. Publish the number you can defend. Execute the compliance hygiene: worker notice, an oversight roster, log retention. Then decide to scale or kill on measured value. This is how you avoid the pilot purgatory that swallows most programs.

05 · Measurement

Measure the things that can embarrass you

The old software metrics do not capture autonomous decisions. The agent metric stack that does covers accuracy, autonomous resolution, escalation, and drift, each measured against a human baseline.

If you report only deflection, you are reporting the flattering number. Report autonomous resolution and accuracy against a human baseline, or you are not measuring at all.

06 · The authority claim

Why People Operations owns this

Enterprise systems are very good at recording outcomes: the final status, the closed case. They are poor at recording the reasoning that produced them. That reasoning still lives in chat threads, side conversations, and people’s heads, and it has rarely been treated as data. It is the context layer waiting to be built.

In People Operations, the reasoning is the playbook in your best caseworker’s head. It is the most valuable and least governed asset in the company. The function that runs case management is the only one that can turn it into ground truth, because it is the only one that knows what is true. That is the reframe. People Operations is not a buyer of someone else’s AI. It is the owner of the foundation that decides whether any of the company’s HR AI works at all.

Takeaways

What to remember

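  1. The failure is below the model. 88% of HR leaders report no significant value, 56% of HR teams do not formally measure, and 95% of enterprise generative AI efforts show no measurable return. The cause is fragmented data and missing definitions, not weak models.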
  2. Case, resolved, and worker are contested definitions. Settle them as machine-readable data, or every agent and every metric inherits the confusion.
  3. Identity is the quiet killer. One canonical worker ID, including contingent staff, or the agent is blind to part of the workforce.
  4. Govern in the query path. Permissions and policy have to fire before an answer is generated, not after it ships.
  5. Compliance and capability are one build. The EU AI Act audit trail is the same logging the context layer needs. Do not defer it.
  6. Measure the honest number. Autonomous resolution and accuracy against a human baseline, not deflection alone.
  7. People Operations owns the ground truth. Only the function that runs the work can author the context. That is the authority, and the opportunity.

Sources

Where this comes from

Drawn from 2025 and 2026 research across independent houses, primary regulation, and named case studies. Vendor metrics are flagged and should be read as ceilings, not typical results.
