A Claude-hosted supervisor agent takes one user message and fans out across IBM Verify, Microsoft Entra ID, and a banking system. Every hop carries its own delegation chain and its own authorization decision. Privileged writes get an approval on the user's phone, sometimes from a Verify access policy that fires step-up, sometimes from the agent's own HITL gate. And when the destination is a database, the credential that ran the SQL did not exist sixty seconds ago and will not exist five minutes from now. The trust model is everything.
The user types one sentence into a chat box.
The User Prompt
"Onboard sarah.smith@example.com across IBM Verify and Microsoft Entra ID, open her a trading account, and transfer $50 from her checking into it."
Three different systems. Three different identity domains. Three different places where something could go wrong. And the agent that fields this request is not running in your data center. Anthropic owns the model. Anthropic owns the inference layer. You own the agent loop, the tools, the secrets, and the authorization decisions. The model runs on someone else's infrastructure. Everything else lives in your tenant.
That last point is the one a lot of teams are about to learn the hard way. Hosted agents are showing up in enterprise architectures faster than the identity controls around them. Microsoft's Copilot Studio agents, AWS Bedrock agents, IBM watsonx Orchestrate, and Anthropic's own managed agents all share the same risk shape. The model is somebody else's. The actions are yours.
In 2024, this usually would not have been one engineer writing a script and calling it done. It would have been a line-of-business request turned into an integration project across application developers, IAM engineers, security architects, compliance reviewers, and the system owners for Verify, Entra ID, and the banking stack. Somewhere in that build, the path of least resistance would win: a few service principals, a couple of long-lived client secrets, maybe an automation account for the database, and broad API permissions justified as “temporary” so the workflow could ship. Those credentials would end up in CI, vaults, runbooks, containers, or environment files, and over time the standing privilege would become the architecture. Once that happens, the real authority no longer belongs to the user who asked for the action. It belongs to whatever non-human identity the system was allowed to keep.
In 2026 we can do this differently. This post walks you through exactly how.
Charlie, our AI concierge, is a Claude-hosted agent running on Anthropic's infrastructure. Charlie speaks English. Charlie has three tools: the IIA specialist (pointed at IBM Verify), the Copilot specialist (pointed at Microsoft Entra ID), and the Banking specialist (pointed at the banking stack).
Each of those three specialists is itself a Claude-hosted agent with its own tools pointed at its own API. Charlie is a supervisor. When the user types one sentence, the model inside Charlie decides which specialists to call, in which order, and Charlie dispatches.
The interesting bit is not the routing. Anthropic has had tool use for a while, and every agent SDK can draw this picture. The interesting bit is what flows through each hop, and what stops the hop cold when policy says no.
Why This Matters For Hosted Agents
The browser never talks to Anthropic directly. The Anthropic-hosted model never holds an OAuth client secret. Every authority the agent exercises is minted at request time, attested by IBM Verify, and bound to a single operation. If someone walked off with the chat history, they would have a transcript and nothing else. The model is on Anthropic's infrastructure. The secrets and the authorization decisions live entirely in your tenant.
Pick the hardest case. The user asked Charlie to create an account in Microsoft Entra ID. Most enterprises will recognize this one because most enterprises live in Entra. Here is what happens between Charlie and the actual Microsoft Graph call.
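Before walking the hop-by-hop detail, it helps to see the exchange primitive itself: each hop rests on an RFC 8693 token exchange carrying a RAR payload (RFC 9396). A minimal sketch of the request body follows; the token values are illustrative and this is the generic standard shape, not IBM Verify's exact API surface:

```python
import json

def exchange_params(subject_token, actor_token, scope, rar):
    """Build the form parameters for an RFC 8693 token exchange.

    subject_token: the user's token (the original delegator).
    actor_token:   the calling agent's own token.
    rar:           a list of RFC 9396 authorization_details entries.
    """
    return {
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "subject_token": subject_token,
        "subject_token_type": "urn:ietf:params:oauth:token-type:access_token",
        "actor_token": actor_token,
        "actor_token_type": "urn:ietf:params:oauth:token-type:access_token",
        "scope": scope,
        # RAR travels as a JSON-encoded string parameter per RFC 9396.
        "authorization_details": json.dumps(rar),
    }

params = exchange_params(
    "t_user", "t_copilot", "entra:users:write",
    [{"type": "urn:smt:agent:copilot",
      "operationDetails": {"action": "graph_user_create"}}],
)
```

Every exchange in this post is some variation of that request; what changes per hop is whose token sits in `actor_token` and what the RAR describes.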
Concierge does the first token exchange with IBM Verify, gets an OBO token (call it T1), and calls the Copilot specialist's /api/a2a endpoint carrying T1. Copilot sees the token and knows who the original user was, that Charlie is the outer actor, and what scope the chain has. Copilot now needs its OWN delegation: same user, Copilot as the actor, scope entra:users:write, new RAR payload describing the Microsoft Graph call with the exact target attributes. Second token exchange. If the access policy decides the operation needs step-up authentication, Verify returns a challenge token instead of an OBO. The user's phone buzzes. They approve. A second leg with the same RAR re-sent gets the real OBO. Call it T2.
When The Destination Is A Database
The banking specialist does not call an API. It writes rows in Postgres. The third step there looks different.
The agent does not hold a long-lived database connection or a Vault client secret. The OBO JWT itself is what authenticates the agent to HashiCorp Vault. Vault Enterprise validates the OBO against IBM Verify's JWKS endpoint and resolves the agent_id claim to a Vault entity. Two enforcement points then fire on the same call.
The OAuth Resource Server profile, built into Vault 2.0, reads the vault:path_access RAR entries and answers the question "can this JWT call verify-rar/creds/banking-transfers?" That gate enforces WHICH path. Then our verify-rar plugin reads the business RAR (urn:smt:agent:banking|transfer_funds with the source account, destination account, and amount) and matches it against operator-configured mappings per role. That gate enforces WHICH operation. Different business RAR, different Postgres role. No matching mapping, no credential.
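A minimal sketch of that second gate, with the mapping keys and role names hypothetical rather than taken from the plugin's actual configuration format:

```python
# Hypothetical operator-configured mappings: (RAR type, action) -> Postgres role.
ROLE_MAPPINGS = {
    ("urn:smt:agent:banking", "transfer_funds"): "role_transfer_writer",
    ("urn:smt:agent:iia", "scim_user_create"): "role_scim_writer",
}

def resolve_role(rar_entry):
    """Return the Postgres role for this business RAR, or None (deny).

    Different business RAR, different role; no matching mapping, no credential.
    """
    key = (
        rar_entry.get("type"),
        rar_entry.get("operationDetails", {}).get("action"),
    )
    return ROLE_MAPPINGS.get(key)
```

The point of the lookup being total and closed is the point of the gate: an operation nobody mapped in advance resolves to `None`, and `None` never becomes a credential.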
If both gates pass, the plugin mints a five-minute ephemeral Postgres user with GRANTs scoped to exactly that operation. The agent runs the SQL as that user. Five minutes later the user is gone. The IIA specialist uses the same plugin for SCIM writes, where the credential is audit metadata and the SCIM call still uses the OBO as a Bearer. The Copilot specialist does not need the plugin because Microsoft Graph carries its own permission model on top of Entra.
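The lease lifecycle is simple to model. This is a sketch of the mint-and-expire behavior described above, with the username format an assumption (Vault's dynamic database credentials follow a similar random-suffix convention):

```python
import secrets
import time

LEASE_TTL_SECONDS = 300  # the five-minute lease described above

def mint_ephemeral_user(role, now=None):
    """Sketch: mint a short-lived database user bound to one role."""
    now = time.time() if now is None else now
    username = f"v-{role}-{secrets.token_hex(4)}"  # assumed naming scheme
    return {
        "username": username,
        "role": role,
        "expires_at": now + LEASE_TTL_SECONDS,
    }

def is_live(lease, now):
    """After expiry the user is dropped; any held connection is useless."""
    return now < lease["expires_at"]
```

In the real system the revocation is Vault dropping the Postgres role on lease expiry; the property that matters is the one the sketch shows, that the credential's validity is a function of time, not of anyone remembering to rotate it.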
That is two nested token exchanges per specialist, three specialists, six total. The two specialists that hit databases or sensitive APIs add a Vault authorization on top of that. Every one of these is audited. Every one of them can be independently denied. And every OBO token carries an act.act claim that looks like this:
{
  "sub": "rgraham@us.ibm.com",
  "act": {
    "sub": "spiffe://ibm-verify-lab.com/copilot-agent",
    "act": {
      "sub": "spiffe://ibm-verify-lab.com/concierge-agent"
    }
  },
  "scope": "entra:users:write",
  "authorization_details": [
    {
      "type": "urn:smt:agent:copilot",
      "operationDetails": {
        "action": "graph_user_create",
        "upn": "sarah.smith@example.com"
      }
    }
  ]
}
Read it as nested wrappers around the user. The sub at the top of the JWT is the user, the original delegator. The first act wraps that with the most recent actor (Copilot). The deeper act.act is the prior actor in the chain (Concierge). So the delegation reads: Copilot acts on behalf of Concierge, which acts on behalf of the user. Every link is traceable. Every link had a policy decision. This is not a flow diagram. This is a working implementation, live. You can decode the token at jwt.io and see it. If you follow me on LinkedIn, come find me at a conference or message me for a live demo!
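Reading the chain mechanically takes a few lines. Given decoded claims shaped like the token above, this sketch flattens the nested act wrappers into an ordered delegation list, most recent actor first, original user last:

```python
def delegation_chain(token_claims):
    """Flatten nested RFC 8693 act claims: [leaf actor, ..., user]."""
    chain = []
    node = token_claims.get("act")
    while node:
        chain.append(node["sub"])      # most recent actor first
        node = node.get("act")         # descend to the prior actor
    chain.append(token_claims["sub"])  # the original delegator last
    return chain

claims = {
    "sub": "rgraham@us.ibm.com",
    "act": {
        "sub": "spiffe://ibm-verify-lab.com/copilot-agent",
        "act": {"sub": "spiffe://ibm-verify-lab.com/concierge-agent"},
    },
}
# Reads as: Copilot on behalf of Concierge on behalf of the user.
```

An audit pipeline or a downstream resource server can run exactly this walk to answer "who touched this, through whom" without any out-of-band lookup.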
Structurally Similar, Not Identical
Swap "Copilot" and "Microsoft Graph" for "IIA" and "IBM Verify SCIM" and the upstream picture matches: same token exchange chain, same Verify policy decision, OBO as Bearer on the destination API. Swap again for "Banking" and "Postgres" and a new gate appears in front of the destination, the verify-rar plugin minting a leased credential. One identity fabric, three target systems, with one extra trust gate on the database path.
Now type: "Delete sarah.smith@example.com permanently."
Claude inside Charlie picks up on "delete". Tool use fires. Concierge does the first token exchange with its chain scope and a RAR shaped like { target: "iia", op: "delete" }. Verify's access policy sees the scope and the RAR and matches a Tier 4 rule: deletion is blocked for this tenant.
The token exchange returns denied. No OBO token is ever issued. IIA's /api/a2a is never called. The SCIM endpoint is never touched.
Charlie's Honest Reply
"I can't do that. IBM Verify denied the action at the policy layer. This tenant blocks permanent deletes."
That is the difference.
Without the policy decision at the token exchange endpoint, your only enforcement is whatever your agent code remembered to do. With it, the authorization logic lives in a separate system that you can change without redeploying the agent, that an attacker with code execution in the agent cannot bypass, and that produces an audit record regardless of what the agent intended.
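That separation is easy to see in miniature. Here is a hedged sketch of a policy decision living outside the agent; the Tier 4 rule shape and return values are assumptions for illustration, not Verify's actual policy language:

```python
# Assumed tenant-level Tier 4 rule: permanent deletes are blocked outright.
BLOCKED_OPS = {"delete"}

def decide(scope, rar):
    """Policy decision at the token exchange endpoint.

    The agent never sees this code; it only sees an OBO token or a denial.
    Changing BLOCKED_OPS changes behavior with no agent redeploy.
    """
    if rar.get("op") in BLOCKED_OPS:
        return {"issued": False, "reason": "policy_denied_tier4"}
    return {"issued": True}

# The delete request from above: denied, no OBO ever minted.
decision = decide("iia:users:write", {"target": "iia", "op": "delete"})
```

Because the check runs in the authorization server, an attacker with code execution inside the agent can ask all day; the answer is still a denial and an audit record.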
The world is full of "just plug your agent into the endpoint" stories. Pick your favorite agent SDK, point it at the API, it works. Then a month later a customer asks who authorized the agent to delete that customer record at 3 AM. Then the answer is "the service account." Then the answer is "I rotated the key." Then the answer is "I think we have logs somewhere." Then you move to something like this.
The architecture in this post answers three questions by construction:
Who asked? The sub is the user. Who acted? The act.act chain is the delegation. What was allowed? The scope is the grant, and the authorization_details array is the exact operation with parameters.

Revocation rides a session_killed CAEP event. Antenna calls Verify's session delete API, and within seconds every agent gets active:false on its next token introspection and refuses the call. One signal, many agents.

One thing falls out of this architecture that you do not get when you bolt agents onto an API: a write trail that three independent systems agree on, written by three independent processes, joined by IDs that none of them invented locally.
Take banking's transfer_funds as the canonical case. The committed row stamps three things at commit time: the IBM Verify jti from the OBO that authorized the call, the OAuth grant_id that ties this OBO back to the user's original session, and the ephemeral Postgres username that ran the SQL. Those same three IDs appear independently in IBM Verify's SSO event log (the original token issuance), in HashiCorp Vault's audit log (the lease that produced the Postgres user, with the full RAR claims), and in Postgres's log_statement=mod output (the SQL run AS that user). The IIA SCIM writes have an analogous three-way correlation against IBM Verify's tenant audit. Different destination, same property.
Three independent writers. Three independent stores. The same three IDs. To forge a record after the fact, you would have to convince all three log stores to lie in agreement. That is the audit posture a CISO actually wants, and you do not get it by writing more application code. It falls out of building on identity primitives that already cooperate.
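A verifier joining those stores only has to check that the same three IDs line up everywhere. The field names below are assumptions standing in for the actual log schemas:

```python
# The three join keys stamped on the committed row (names assumed).
KEYS = ("jti", "grant_id", "pg_username")

def logs_agree(committed_row, verify_event, vault_audit, postgres_log):
    """True only if all three independent stores carry the row's three IDs.

    Forging history means making every store lie in agreement; a mismatch
    on any one key in any one store fails the check.
    """
    return all(
        committed_row[k] == verify_event[k]
                         == vault_audit[k]
                         == postgres_log[k]
        for k in KEYS
    )
```

In practice each store keeps these under its own field names, so a real verifier would normalize first; the correlation logic itself stays this small.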
Where Vault 2.0 Stops And Our Plugin Picks Up
Vault 2.0 alone closes the question "did the right token reach Vault?" The OAuth Resource Server profile validates the JWT and enforces path-level RAR entries. That is real and necessary, but it is not enough. Without plugin help, every call to a given creds path returns the same credential. The grant is frozen at config time, not shaped by what the user actually approved on this request.
Our verify-rar plugin closes the second question: "did the credential Vault minted match what the user actually authorized?" It reads the business RAR, picks the role from operator-configured mappings, and mints a credential whose authority is bounded to that exact intent. Different intent, different role. No matching mapping, no credential. That is the difference between "the right caller reached the right path" and "the right caller got the right authority for the right operation at the right moment."
The wire format we use for Concierge-to-specialist calls is bespoke: Bearer token in Authorization, chain context in X-Chain-* headers, JSON body. Works fine.
The industry is converging on an A2A protocol that standardizes the envelope: JSON-RPC 2.0, task lifecycle, signed agent cards, and so on. None of that changes the security model. The token exchange, the RAR, the SPIFFE attestation, and the JWKS-validated Vault path all live below the envelope.
Adopting A2A would let a Microsoft Semantic Kernel agent, a watsonx Orchestrate agent, or any future protocol-compliant client drive the Identity Agent or the Copilot specialist with no custom client code. Point it at /.well-known/agent-card.json and it works, governed identically. The hosted agent vendor changes. The trust fabric does not.
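For flavor, a JSON-RPC 2.0 envelope of the kind A2A standardizes might look like this; the method name and params shape are illustrative, not quoted from the final spec:

```python
import json

def make_task_request(task_id, message, rpc_id=1):
    """Sketch of an A2A-style JSON-RPC 2.0 task envelope.

    The identity machinery (token exchange, RAR, SPIFFE attestation)
    rides below this layer, in transport headers and token claims.
    """
    return json.dumps({
        "jsonrpc": "2.0",
        "id": rpc_id,
        "method": "tasks/send",  # illustrative method name
        "params": {"id": task_id, "message": message},
    })

req = make_task_request("t-onboard-001",
                        {"role": "user", "text": "Onboard sarah.smith@example.com"})
```

Swapping our bespoke JSON body for an envelope like this changes nothing in the trust model, which is the whole point of the section above.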
On the Roadmap
The retrofit work is queued. Same trust model, new envelope. It is the kind of change that should take days, not quarters, because everything underneath the envelope is already built.
The hard part of an AI agent is not the agent. It is making sure that when the agent is wrong, confused, compromised, or manipulated, the damage it can do is bounded by a system outside of its control.
In this build, that bounding system is the identity fabric: a policy decision at every token exchange and an act.act chain on the wire.

Charlie did not just onboard sarah.smith@example.com across IBM Verify and Microsoft Entra ID and move money into her trading account. Charlie proved that, in 2026, you can give a hosted AI supervisor real authority over your identity tenant, watch it chain across three target systems, approve every privileged write on your phone (sometimes via a Verify access policy that fires step-up, sometimes via the agent's own HITL gate), mint database credentials that exist for five minutes and gate the actual SQL, and have an auditable, revocable, bounded, policy-enforced record of every single step.
That's the demo.
The next one is yours.