Architecting AI Agent Security to Stay Compliant with NIST's Identity and Authorization Framework

Authors
  • Parminder Singh

NIST's comment window on AI agent identity and authorization closes April 2. If you're deploying AI agents and haven't read the framework yet, start here: NIST just put formal language around a structural gap that most organizations are already sitting in.

The framework has three pillars. I'll walk through each one, what it requires at the implementation level, and where most deployments fail. Then I'll show what satisfying the framework actually looks like in practice.

Photo by Jesse Collins on Unsplash

NIST Initiative

NIST's AI Agent Standards Initiative isn't about model safety or red-teaming or responsible AI. It's about something more specific and more tractable: how do you prove that an AI agent acted within its authorized scope?

That's an identity and authorization problem, the same class of problem that zero trust architecture solved for human users and service accounts. NIST is saying it now needs to be solved for AI agents too.

NIST's own terminology is slightly different — identification, authorization, access delegation, and logging. I've condensed these into three pillars that map more directly to what engineering teams need to implement.

Agent Identity

Agents must have their own identities rather than sharing credentials, session tokens, API keys, and the like.

Each agent action needs to carry a verifiable identity that tells the enforcement layer: who initiated this, what their role is, and whether that identity is valid right now.

This is an upstream application architecture requirement. It means your application layer (agent orchestration, copilot features, API gateway, and so on) must attach verified identity context to every AI request before it reaches any enforcement point. If your agent calls an LLM with a static service account credential, the enforcement layer has nothing to evaluate against. It knows the credential is valid. It has no idea who the human principal behind it is or what their role permits.

Most deployments today run on static service credentials. Pillar 1 fails before you even get to enforcement.
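As a concrete sketch of what "attaching verified identity context" means, the snippet below mints a short-lived HS256 JWT carrying the human principal's user and role claims and attaches it to an outbound AI request. The signing key, header names, and claim layout are illustrative assumptions, not part of the NIST framework; in production you would use your identity provider's tokens rather than hand-rolling them.

```python
import base64, hashlib, hmac, json, time

SECRET = b"demo-signing-key"  # assumption: HMAC key for illustration only

def b64url(data: bytes) -> str:
    # JWT-style base64url encoding without padding
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def mint_identity_token(user_id: str, role: str) -> str:
    """Build a minimal HS256 JWT carrying the human principal behind the agent."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    now = int(time.time())
    payload = b64url(json.dumps({
        "sub": user_id,
        "role": role,
        "iat": now,
        "exp": now + 300,  # short-lived, not a permanent static credential
    }).encode())
    sig = b64url(hmac.new(SECRET, f"{header}.{payload}".encode(),
                          hashlib.sha256).digest())
    return f"{header}.{payload}.{sig}"

def build_ai_request(prompt: str, user_id: str, role: str) -> dict:
    """Attach verified identity context to an AI request before enforcement."""
    return {
        "headers": {
            "Authorization": f"Bearer {mint_identity_token(user_id, role)}",
            "X-User-Id": user_id,      # hypothetical header names
            "X-User-Role": role,
        },
        "body": {"prompt": prompt},
    }
```

The point is the shape, not the crypto: every request leaves the application layer carrying who initiated it and what their role is, so a downstream enforcement point has something to evaluate.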

Delegated Authority

Permissions must be scoped per request, not permanent.

This is where least privilege meets the AI layer. Traditional least privilege says that every identity has the minimum access necessary for its function. Static AI service credentials violate this by design. One credential, permanent access to the full model API, works for any caller, any prompt, any data context — regardless of what the specific call actually needs.

Delegated authority means something different. For this specific request, from this specific user role, under this specific policy: what is permitted?

The answer changes per request. For example, an analyst in a healthcare SaaS accessing patient summaries gets a different scope than a compliance officer querying audit logs. Same model, same endpoint, different authority — evaluated at call time, not at credential issuance time.

Satisfying Pillar 2 requires an enforcement point that:

  • Receives identity context from Pillar 1
  • Evaluates the incoming request against a per-route, per-role policy
  • Makes a real-time permit or block decision before the request reaches the model
  • Produces a record of that decision
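A minimal sketch of such an enforcement point, assuming a hypothetical per-route, per-role policy table (the route names, roles, and sensitivity tiers below are illustrative):

```python
from dataclasses import dataclass

# Assumed policy table: (route, role) -> maximum data sensitivity permitted.
POLICY = {
    ("/v1/messages", "analyst"):            {"max_sensitivity": "PII_L1"},
    ("/v1/messages", "compliance_officer"): {"max_sensitivity": "PII_L2"},
}

# Ordering of the illustrative sensitivity tiers, lowest to highest.
SENSITIVITY_RANK = {"PUBLIC": 0, "PII_L1": 1, "PII_L2": 2}

@dataclass
class Decision:
    outcome: str    # "PERMIT" or "BLOCK"
    policy_id: str  # which policy produced this decision

def evaluate(route: str, role: str, data_sensitivity: str) -> Decision:
    """Real-time permit/block decision before the request reaches the model."""
    rule = POLICY.get((route, role))
    if rule is None:
        # No rule for this (route, role) pair: deny by default.
        return Decision("BLOCK", "default_deny")
    if SENSITIVITY_RANK[data_sensitivity] > SENSITIVITY_RANK[rule["max_sensitivity"]]:
        return Decision("BLOCK", "data_access_policy_v3")
    return Decision("PERMIT", "data_access_policy_v3")
```

Note the default-deny branch: an identity with no matching rule gets blocked rather than inheriting some ambient scope, which is the per-request inversion of the static-credential model.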

Action Lineage

Every agent action needs a traceable record that proves what authority was in effect at the time.

Not an application log. Application logs tell you what happened. Action lineage tells you who authorized it, under which policy, at what moment, and what the outcome was.

The difference matters to auditors and incident responders. When a regulated environment has a data exposure event and the investigation asks "what policy was in effect when this agent accessed this data, and who approved that policy being applied to this user's role?", you need to be able to answer that question from a structured record, not reconstruct it from scattered log lines.

The minimum viable action lineage record looks like this:

{
  "user_id": "jsmith@acme.com",
  "role": "analyst",
  "policy_id": "data_access_policy_v3",
  "policy_outcome": "PERMIT",
  "data_sensitivity": "PII_L2",
  "model_endpoint": "/v1/messages",
  "timestamp": "2026-03-17T09:14:22Z"
}

That's a per-decision audit record. Every AI request gets one.
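Emitting that record is mechanically simple; the discipline is doing it for every decision, permit or block. A sketch, assuming the field set above (the function name and serialization format are my choices, not prescribed by NIST):

```python
import json
from datetime import datetime, timezone

def audit_record(user_id: str, role: str, policy_id: str,
                 outcome: str, data_sensitivity: str, endpoint: str) -> str:
    """Serialize one per-decision audit record as a JSON line."""
    record = {
        "user_id": user_id,
        "role": role,
        "policy_id": policy_id,
        "policy_outcome": outcome,
        "data_sensitivity": data_sensitivity,
        "model_endpoint": endpoint,
        # RFC 3339 UTC timestamp, e.g. "2026-03-17T09:14:22Z"
        "timestamp": datetime.now(timezone.utc)
                       .isoformat(timespec="seconds")
                       .replace("+00:00", "Z"),
    }
    return json.dumps(record)
```

Append each line to a log stream that lives outside the execution environment, so the record survives whatever happens to the agent that triggered it.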

Current Deployment Gaps

Most enterprise AI deployments today:

  • Run agents on shared service accounts with permanent broad credentials → Pillar 1 fails
  • Have no per-request policy evaluation at the model API layer → Pillar 2 fails
  • Log at the application level (request_id: abc123, status: 200) → Pillar 3 fails

All three failures are structural. These gaps cannot be closed with model-level guardrails or prompt engineering; they require an enforcement architecture that operates at the AI API call layer, separate from the model itself.

Compliant Architecture

The control boundary matters here.

Your application layer owns Pillar 1. Identity resolution — attaching a verified JWT carrying user and role context to every AI request — is your responsibility. Without it, an enforcement layer has nothing to evaluate against. This is not a product problem; it's an architecture requirement.

An enforcement layer handles Pillars 2 and 3. Once identity context is in the request, the enforcement layer:

AI request arrives with identity context (user_id, role)
Policy engine evaluates:
  - Does this role permit this request type?
  - Does data classification on this route match this role's scope?
  PERMIT → request forwarded to model
  BLOCK  → request rejected, decision logged
Per-decision audit record written:
  { user_id, role, policy_id, outcome, data_sensitivity, timestamp }
Model response returned to caller

This is model-agnostic: it doesn't care whether the model is self-hosted or runs on a cloud provider. The enforcement layer sits between the caller and the model API, and the policy evaluates identity and role, not model behavior.

Policy Drift

Baking policy logic into individual agents or application code creates the same problem that baking authorization logic into microservices created before centralized policy engines existed — policy drift. Rules get inconsistently applied. There's no single source of truth. When a regulation changes, you update ten different places and inevitably miss one.

The enforcement layer needs to be separate from the execution environment. That separation gives you a unified policy decision point, an audit record that isn't co-located with the system it's auditing, and the ability to update a policy once and enforce it across every AI integration instantly.

DeepInspect

DeepInspect enforces Pillars 2 and 3 today.

It's a stateless proxy that sits between your application and any LLM API. Every request that passes through it is evaluated against per-route, per-role policies using the identity context your application supplies. PII is detected and redacted or blocked based on data classification rules. Every decision — permit or block — produces a per-decision audit record with identity, policy evaluated, data sensitivity, and timestamp.

Pillar 1 is your upstream responsibility. The application layer must resolve identity and attach it to the request. DeepInspect evaluates what you give it. If you give it a shared service account, it evaluates policy for that service account's role.

The April 2 comment window is worth tracking because the enforcement guidance that follows will define what "demonstrably compliant" means in audit conversations. That language is being written now.

If you're mapping your AI agent deployments to NIST's framework, DeepInspect handles Pillars 2 and 3 out of the box.

DeepInspect is a model-agnostic AI control plane for regulated environments. It enforces per-request AI usage policies based on user identity and role, and generates per-decision audit records for every AI API call.