
Secretless by Design:
Zero-Trust Agentic AI
on AWS EC2

How I deployed a LangChain-powered Agentic AI with an MCP server on AWS EC2 — with no secrets in the code, no secrets in the .env, no Vault tokens on disk, mutual cryptographic identity via SPIFFE, continuous session evaluation with CAEP, and fine-grained delegated authorization through Token Exchange and Rich Authorization Requests.

DevSecOps  ·  Zero Trust  ·  Agentic AI  ·  Identity-First Architecture  ·  CAEP · RAR · Token Exchange

00 The Problem With "Secrets in .env"

Nearly every tutorial for deploying an AI agent looks the same: spin up an EC2 instance, SSH in, create a .env file, paste your OpenAI key, your database password, your third-party API credentials — and ship it. It works. Until it doesn't.

Secrets in .env files get committed to Git, leaked via misconfigured S3 buckets, or exfiltrated in a breach. Static credentials don't rotate. And an AI agent that holds a live API key is one prompt injection away from being weaponized.

I wanted to build something different — a deployment where the application has no secrets at all, not even a Vault token, and where every service proves its identity cryptographically before any secret is ever exchanged.

TL;DR

In production with SPIFFE, the .env contains only configuration — no secrets. Client IDs, endpoint URLs, port numbers — nothing exploitable. No passwords. No API keys. No Vault tokens. All secrets are fetched at runtime from IBM Verify SaaS via HashiCorp Vault, and authentication to Vault itself is handled entirely by SPIFFE/SPIRE workload identity — no credential pre-seeding required.

// System Architecture Overview

Cloudflare — WAF · DDoS protection · Zero Trust Network Access
  ↕ All traffic flows through Cloudflare tunnels — zero exposed inbound ports anywhere in the VPC

AWS VPC — private network · no public ingress
  cloudflared tunnel daemon — outbound-only · no open ports · routes all VPC-internal & outbound calls through CF Zero Trust

  EC2 — SPIRE Server
    Trust domain CA · node attestation via AWS IID · SVID issuance · short-lived JWT / X.509

  EC2 — HashiCorp Vault
    JWT auth method · IBM Verify plugin · dynamic secrets · lease TTLs · AI API key storage
    SPIRE Agent sidecar — issues SVID JWTs · authenticates Vault's workload identity to the SPIRE Server

  EC2 — Application
    Svelte frontend — UI · served via CF tunnel · no direct exposure
    Agentic AI (LangChain) — reasoning · planning · MCP tool calls · Token Exchange
    MCP server — tool orchestration · Model Context Protocol
    Shared Signals / CAEP — SSF event receiver · W1–W5 warning ladder · cross-Verify revocation trigger
    SPIRE Agent sidecar — SVID JWTs for agent & app workloads · no Vault token on disk

  Flows (all via CF tunnel): Vault plugin clientId lookup → dynamic secret lease · CAEP SET events over the SSF stream · Token Exchange & RAR policy calls

IBM Verify SaaS — external
  Secret store · auto-rotation · CAEP / SSF · RAR authorization policies
  Token Exchange (RFC 8693) · act-on-behalf-of consent · workload identity

01 The Stack

LangChain
Agentic AI framework — orchestrates tools, memory, and reasoning loops
MCP Server
Model Context Protocol — standardizes tool exposure to the AI agent
HashiCorp Vault
Secrets engine — dynamic secret issuance and lease management
IBM Verify Vault Plugin
Custom Vault engine — bridges Vault to IBM Verify SaaS for secret retrieval and rotation
IBM Verify SaaS
Identity provider — stores and rotates application credentials
SPIFFE / SPIRE
Workload identity — cryptographic identity for every process via SVIDs
AWS EC2
Compute host — the runtime environment for the full stack

02 What's in the .env? Configuration, Not Credentials.

The application is built with standard .env conventions — nothing exotic about the app itself. But the contents of that file tell a very different story than most deployments.

# .env — production with SPIFFE. No secrets anywhere.
OIDC_CLIENT_ID=a1434344-e66f6-7210-abcd-ef13211267890
EXCHANGE_CLIENT_ID=6abty5f0-e31d-421f6-859a-fc40a7f7a496
MCP_CLIENT_ID=f4219-1a7b-47he-88ca-572w6cf23d
VAULT_BASE_URI=https://vault.example.com
AUTH_METHOD=spiffe

# Client IDs, endpoints, ports — that's it.
# No client_secret. No API keys. No Vault token.
# No database passwords. No AI provider keys.
.env

Client IDs are not secrets — they're public identifiers, like usernames without passwords. They tell the IBM Verify Vault Engine Plugin which application is asking, but they prove nothing on their own. The actual secret retrieval is gated entirely behind Vault, which is gated entirely behind SPIFFE identity.

Why this matters

If this EC2 instance is compromised, an attacker finds client IDs and endpoint URLs in the .env. Without a valid SPIFFE SVID — which is cryptographically bound to this specific workload and short-lived — those identifiers are useless. There is nothing to steal.
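One way to operationalize "configuration, not credentials" is a startup guard that refuses to boot if anything secret-shaped appears in the environment. A minimal sketch — the pattern and variable names are illustrative, not part of the deployment:

```python
import re

# Env var names that suggest a credential slipped into configuration.
# (The pattern is an illustrative policy, not an exhaustive one.)
FORBIDDEN = re.compile(r"(SECRET|PASSWORD|PASSWD|API_KEY|TOKEN)", re.IGNORECASE)

def audit_env(env: dict) -> list:
    """Return the names of variables that look like credentials.
    In a secretless deployment this list should always be empty."""
    return sorted(k for k in env if FORBIDDEN.search(k))
```

Running this against the .env above yields an empty list; a leaked `OPENAI_API_KEY` or `DB_PASSWORD` would fail the check before the agent ever starts.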

The AI API Key Is in Vault Too

The Agentic AI provider API key (used by LangChain to call the underlying model) is stored as a secret in HashiCorp Vault — not in the .env, not in the codebase, not in any configuration file on disk. At startup, the agent authenticates to Vault via its SPIFFE identity, retrieves the key with a short-lived lease, and uses it for that session.

When the lease expires, the key is gone from memory. The next request gets a fresh one. There is no persistent AI API credential anywhere in the deployment.
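The expire-and-refetch behavior can be sketched as a small in-memory wrapper around whatever Vault client the agent uses — the `fetch` callable and TTL handling here are illustrative, not the deployment's actual code:

```python
import time

class LeasedSecret:
    """Hold a secret only for the lifetime of its Vault lease; an
    expired lease forces a fresh issuance on the next access.
    (Sketch: `fetch` stands in for a real Vault client call.)"""

    def __init__(self, fetch):
        self._fetch = fetch          # callable returning (secret, ttl_seconds)
        self._secret = None
        self._expires_at = 0.0       # epoch seconds; 0 → nothing cached

    def get(self):
        if time.time() >= self._expires_at:
            # Lease lapsed (or first access) → trigger fresh issuance
            self._secret, ttl = self._fetch()
            self._expires_at = time.time() + ttl
        return self._secret
```

When the lease lapses, the old key simply falls out of use; nothing persists on disk between issuances.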

03 IBM Verify Vault Engine Plugin: How Secret Rotation Works

The IBM Verify Vault Engine Plugin is a custom secrets engine that extends HashiCorp Vault with the ability to communicate directly with an IBM Verify SaaS tenant. It acts as a bridge between Vault's dynamic secrets model and IBM Verify's credential management.

1. App requests secret from Vault. The LangChain agent calls the Vault API path mounted by the IBM Verify engine, passing the client ID from the .env. No credential is sent — just the identifier.

2. Vault plugin reaches out to IBM Verify SaaS. The plugin uses its own pre-configured credentials (stored securely in Vault, not accessible to the app) to call the IBM Verify SaaS API and retrieve or generate the secret for the given client ID.

3. Secret is returned with a lease. Vault returns the secret to the application with a TTL-bound lease. IBM Verify handles the rotation schedule on its end — credentials are cycled automatically without any application-side logic.

4. Lease expires, secret is invalidated. When the Vault lease expires, the credential is no longer valid. The next request triggers a fresh issuance. There is no long-lived secret sitting anywhere in the runtime environment.

IBM Verify SaaS Integration

IBM Verify SaaS maintains the authoritative credential store. The Vault plugin is effectively a thin, policy-enforcing proxy — it validates that the caller (the Vault client) is authorized, then delegates secret issuance and rotation to IBM Verify. The application never talks to IBM Verify directly.
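Concretely, the app's side of this flow is just an authenticated read against the engine's mount path. A sketch of the request shape — the `ibmverify` mount name and path layout are assumptions, not the plugin's documented API:

```python
def verify_engine_read(base_url: str, mount: str, client_id: str,
                       vault_token: str) -> dict:
    """Build the Vault API read for the IBM Verify secrets engine.
    Only the client ID travels in the path — no credential is sent;
    the short-lived session token is the sole proof of identity."""
    return {
        "method": "GET",
        "url": f"{base_url}/v1/{mount}/creds/{client_id}",
        "headers": {"X-Vault-Token": vault_token},
    }
```

The `X-Vault-Token` here is the ephemeral session token from the SPIFFE login, held in memory only — never a static credential.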

04 SPIFFE / SPIRE: No Vault Token. Ever.

This is the piece that closes the loop. Most Vault deployments still have a bootstrapping problem: how does the app authenticate to Vault in the first place? Common answers involve a static Vault token, an AppRole secret ID stored somewhere, or AWS IAM (which requires careful IAM policy management). All of these involve something pre-seeded.

SPIFFE eliminates that entirely.

What Is SPIFFE?

SPIFFE (Secure Production Identity Framework for Everyone) is an open standard for cryptographic workload identity. Every process — the LangChain agent, the MCP server — gets an SVID (SPIFFE Verifiable Identity Document), which is a short-lived X.509 certificate or JWT that proves: "I am this specific workload, running in this specific environment."

SPIRE is the reference implementation. A SPIRE Server acts as the certificate authority. A SPIRE Agent runs on each EC2 node and issues SVIDs to local workloads through a Unix domain socket — no network call, no credential exchange, just kernel-attested identity.

JWT SVID → Vault JWT Auth

HashiCorp Vault's JWT auth method is configured to trust SVIDs issued by the SPIRE Server's trust bundle. When the agent starts up:

# Conceptual startup flow (pseudocode)
svid_jwt = spire_agent.fetch_svid(audience="vault")
# → "spiffe://example.org/ns/prod/sa/langchain-agent"

vault_token = vault.auth.jwt.login(
    role="langchain-agent",
    jwt=svid_jwt
)
# Vault validates the JWT against SPIRE's trust bundle.
# If valid → short-lived Vault token issued for this session.
# That token is never written to disk.
python
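On the Vault side, that trust relationship is typically established by pointing the JWT auth method at SPIRE's OIDC discovery endpoint and binding a role to one workload identity. A sketch — the discovery URL, role name, TTL, and policy name are assumptions for this deployment, not copied from it:

```shell
vault auth enable jwt

# Trust SVIDs via SPIRE's OIDC discovery provider
vault write auth/jwt/config \
    oidc_discovery_url="https://oidc-discovery.example.org"

# Bind the role to one specific workload identity
vault write auth/jwt/role/langchain-agent \
    role_type="jwt" \
    user_claim="sub" \
    bound_audiences="vault" \
    bound_subject="spiffe://example.org/ns/prod/sa/langchain-agent" \
    token_ttl="15m" \
    token_policies="langchain-agent"
```

The `bound_subject` pin means a valid SVID from any other workload — even in the same trust domain — cannot log in to this role.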

The SVID is short-lived (minutes to hours). Vault issues a correspondingly short-lived token. Neither is stored anywhere. The workload identity is ephemeral by design.

Mutual Zero Trust

SPIFFE enables mutual zero trust — not just "client proves identity to server," but both sides present and verify cryptographic identity. The LangChain agent knows it's talking to the real Vault (mTLS via SVID). Vault knows it's talking to the real LangChain agent (JWT SVID validation). No static shared secrets. No trust-on-first-use.

SPIRE Node Attestation on EC2

On AWS EC2, the SPIRE Agent uses the AWS IID (Instance Identity Document) attestor to bootstrap the node's own identity — the EC2 instance proves to the SPIRE Server that it is a legitimate AWS instance in the expected account and region. From that point, all workload identity flows from the SPIRE Agent without any human-injected credentials.
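Once the node is attested, workload registration entries tie each SPIFFE ID to the attested instance. A sketch — the account ID, region, instance ID, and selector are placeholders:

```shell
# Register the LangChain agent workload under the IID-attested node
spire-server entry create \
    -parentID "spiffe://example.org/spire/agent/aws_iid/123456789012/us-east-1/i-0abc123" \
    -spiffeID "spiffe://example.org/ns/prod/sa/langchain-agent" \
    -selector "unix:uid:1000"
```

The SPIRE Agent will only issue that SVID to a local process matching the selector on that attested node — no human-injected credential at any step.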

05 Before vs. After: The Threat Model Difference

Attack Surface | Traditional .env Deployment | This Architecture
.env file leaked | All secrets exposed immediately | Only a client ID exposed — not exploitable alone
EC2 instance compromised | Attacker has all credentials on disk | In-memory secrets with short TTLs; SVID expires quickly
Git repo leak | Credentials in commit history | Nothing sensitive ever touches the codebase
Vault token stolen | Vault access persists until manual revocation | No static Vault token exists; SPIFFE-derived tokens expire by design
Credential rotation | Manual process, often skipped | Automatic — IBM Verify rotates, Vault leases enforce TTL
Lateral movement | Stolen creds reusable across environments | SVIDs are workload-specific; a stolen JWT can't be replayed from another workload

06 Why This Matters Especially for Agentic AI

An agentic AI system is not a passive API endpoint. It reasons, plans, and takes actions — potentially calling external APIs, writing data, executing tools through the MCP server. That autonomy makes the security posture significantly more important than for a traditional application.

Consider the risk surface of an AI agent holding live credentials:

Prompt injection attacks attempt to hijack the agent's reasoning to exfiltrate secrets. Tool poisoning via a compromised MCP tool definition could redirect the agent's actions. Runaway tool loops could burn through API quotas or trigger unintended writes. In every case, the blast radius is bounded by what credentials the agent actually holds at runtime.

With this architecture, the agent holds nothing persistently. If the agent is compromised mid-session, an attacker gets ephemeral, scope-limited credentials that expire on their own. There's no master key to exfiltrate. There's no .env to read.

Defense in Depth for AI Systems

Pairing a zero-trust credential architecture with an Agentic AI isn't over-engineering. It's recognizing that an autonomous reasoning system with tool access is a high-value target, and designing accordingly. The SPIFFE identity also enables audit logging that is cryptographically attributable — every secret request is tied to a specific workload SPIFFE ID, not just an IP address.

07 Shared Signals Framework & CAEP: Continuous Session Enforcement

Static authentication — a token issued at login that lives until it expires — is fundamentally at odds with zero trust. A user (or agent) can authenticate legitimately, then immediately begin behaving maliciously. The token doesn't know. Nothing responds.

This deployment integrates IBM Verify's implementation of the Shared Signals Framework (SSF) with the Continuous Access Evaluation Profile (CAEP) — an OpenID Foundation standard that enables real-time, event-driven session revocation across all participating relying parties simultaneously.

The Warning Escalation Model

The agentic AI monitors every user-initiated request for behavioral anomalies. When the agent detects something suspicious — such as a request to transfer funds to an external account, access another user's account data, or any pattern inconsistent with normal banking activity — it does not immediately terminate the session. It escalates through a structured warning ladder.

W1 · First Warning — Soft Caution. Agent flags the suspicious request, logs the signal to SSF, and challenges the user inline. Session remains active. No tokens affected.

W2 · Second Warning — Elevated Scrutiny. A second anomalous signal within the session triggers step-up authentication. The user must re-verify. SSF emits a credential-change event to downstream listeners.

W3 · Third Warning — High Risk Flagged. Session is marked high-risk in IBM Verify. MFA challenge is mandatory. Risk score is propagated via CAEP session-presented events to all relying parties holding tokens for this subject.

W4 · Fourth Warning — Final Pre-Revocation State. The subject is placed in a restricted state. All new token issuance is blocked. Existing tokens are flagged as potentially compromised across the SSF receiver network. Human review may be triggered at this stage.

W5 · Fifth Signal — Threshold Breached: Full Cross-Verify Revocation. This is the kill switch. IBM Verify emits a CAEP session-revoked event across the entire Shared Signals network. Every relying party subscribed to this subject's event stream immediately invalidates all active sessions, revokes all grants, and expires all tokens — regardless of their individual TTLs. The revocation is cross-Verify: it is not limited to this application. Every service that participates in the SSF fabric loses its session for this subject simultaneously.
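The ladder itself is a small per-session state machine. A minimal sketch — the rung names match the ladder above, but the one-signal-per-rung threshold is illustrative:

```python
RUNGS = ("W1", "W2", "W3", "W4", "W5")

class WarningLadder:
    """Per-session escalation state: each anomalous signal advances
    one rung; reaching W5 triggers full cross-Verify revocation.
    (Sketch only — real rungs would also emit SSF events.)"""

    def __init__(self):
        self.level = 0                       # 0 = no warnings yet

    def signal(self) -> str:
        """Record one anomalous signal and return the rung reached."""
        self.level = min(self.level + 1, len(RUNGS))
        return RUNGS[self.level - 1]

    @property
    def revoke(self) -> bool:
        # True once the W5 threshold is breached
        return self.level >= len(RUNGS)
```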

Why Cross-Verify Revocation Matters

A typical session revocation only kills the token the application currently holds. The user (or attacker) may still have valid tokens in other apps, browser sessions open in other tabs, or OAuth grants issued to third-party clients. CAEP's cross-Verify revocation tears down the entire identity fabric for that subject at once — it's a synchronized kill across every service that speaks SSF, not just the one that detected the anomaly.

Unusual Behavior Detection: Out-of-Pattern Banking Requests

The second CAEP trigger condition is behavioral rather than rule-based. The agent evaluates each request against a learned baseline of typical banking interactions for the session context. Requests that fall outside that baseline — unusual transfer amounts, atypical destination accounts, access to administrative functions a user has never touched, queries about other users' account data — are treated as anomalous signals and fed into the same warning escalation ladder.

This isn't just fraud detection. It's continuous authorization: the question isn't only "did this user authenticate?" but "should this authenticated user be doing this specific thing right now?" CAEP operationalizes that question at the protocol level.

// Conceptual CAEP session-revoked event payload (SSF SET format)
{
  "iss": "https://verify.ibm.com/tenantid",
  "jti": "a-unique-event-id",
  "iat": 1710000000,
  "aud": ["https://banking-app.example.com", "https://api.example.com"],
  "events": {
    "https://schemas.openid.net/secevent/caep/event-type/session-revoked": {
      "subject": {
        "format": "iss_sub",
        "iss":    "https://verify.ibm.com/tenantid",
        "sub":    "user-uid-1234"
      },
      "reason_admin": "Anomalous transfer pattern — W5 threshold exceeded",
      "event_timestamp": 1710000000
      // All RPs receiving this SET immediately revoke all
      // sessions, grants, and tokens for sub "user-uid-1234"
    }
  }
}
SET / JWT
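On the receiving side, each relying party's SSF endpoint inspects the SET for the session-revoked event type and tears everything down for that subject. A minimal receiver step — it assumes the SET's JWT signature was already verified against the issuer's keys, and `revoke_subject` stands in for the app's own teardown logic:

```python
CAEP_SESSION_REVOKED = (
    "https://schemas.openid.net/secevent/caep/event-type/session-revoked"
)

def handle_set(payload: dict, revoke_subject) -> bool:
    """If the SET carries a CAEP session-revoked event, revoke all
    sessions, grants, and tokens for its subject. Returns True when
    a revocation was performed."""
    event = payload.get("events", {}).get(CAEP_SESSION_REVOKED)
    if event is None:
        return False                      # not a revocation event
    revoke_subject(event["subject"]["sub"])
    return True
```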

08 Token Exchange, Act-on-Behalf-Of & RAR: Delegated Authorization Done Right

An agentic AI acting on behalf of a user is a delegation problem. The agent is not the user. The agent should be able to act on the user's behalf — but only for specific, consented, policy-bounded operations. Getting that wrong is how AI agents end up with overprivileged access that can be abused or misused.

IBM Verify handles all three layers of this problem natively: the consent UX, the token exchange mechanics, and the policy enforcement.

OAuth 2.0 Token Exchange (RFC 8693)

When the LangChain agent needs to act on behalf of an authenticated user, it uses OAuth 2.0 Token Exchange (RFC 8693). The agent presents the user's subject token to IBM Verify, declares its own identity as the actor, and requests a new token scoped to the specific operation it needs to perform.

# Token Exchange request to IBM Verify
POST /oauth2/token
Content-Type: application/x-www-form-urlencoded

grant_type=urn:ietf:params:oauth:grant-type:token-exchange
&subject_token=<user's_access_token>
&subject_token_type=urn:ietf:params:oauth:token-type:access_token
&actor_token=<agent's_svid_jwt>
&actor_token_type=urn:ietf:params:oauth:token-type:jwt
&requested_token_type=urn:ietf:params:oauth:token-type:access_token
&scope=transfer:all
&authorization_details=<RAR object — see below>
HTTP

The resulting token carries an act claim in the JWT that identifies the agent as the actor — creating a cryptographically verifiable, auditable record of "this agent acted on behalf of this user for this operation." IBM Verify enforces this entirely — the application doesn't implement any delegation logic itself.

Act-on-Behalf-Of: Consent Is Handled in Verify

Before any impersonation or delegation token is issued, IBM Verify enforces a consent step. The user must explicitly approve the agent acting on their behalf for the requested scope. That consent record lives in Verify — it's durable, auditable, and revocable. The application never manages consent state. If a user revokes consent, the next Token Exchange attempt for that actor/subject pair is rejected immediately, regardless of any in-flight sessions.

Why This Matters for Agentic AI

An AI agent that can act on a user's behalf without explicit, per-operation consent is an abuse vector. Verify's Token Exchange + consent model means the agent's authority is always user-granted, always scoped, and always revocable — not inferred from having the user's credentials.

Rich Authorization Requests (RAR — RFC 9396) with Verify Policies

Standard OAuth scopes — transfer:all — are coarse. They say "this token can initiate a transfer" but say nothing about limits, recipients, or conditions. Rich Authorization Requests (RAR) solve this by embedding a structured authorization_details object directly in the token request.

// authorization_details object in the Token Exchange request
[
  {
    "type":           "transfer_funds",
    "instructedAmount": {
      "currency": "USD",
      "amount":   500.00
    }
  }
]
JSON

IBM Verify evaluates the authorization_details against authorization policies defined in the tenant. If the request exceeds policy limits — amount too high, unauthorized transfer pattern — the token exchange can be denied or trigger a step-up challenge before a token is ever issued. The policies live in Verify, not in the application code.

The token returned from Verify carries the authorization_details in its claims. The resource server receiving the token can inspect those claims and enforce them independently — even if it never talks to Verify again. The authorization intent is embedded in the token itself, cryptographically signed.
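That independent enforcement can be a plain claims check at the resource server. A sketch — claim shapes follow RFC 8693's `act` claim and RFC 9396's `authorization_details`, and the `transfer_funds` type mirrors the RAR object above; this is not Verify's own enforcement logic:

```python
def allow_delegated_transfer(claims: dict, amount: float) -> bool:
    """Permit a transfer only if the token records a delegated actor
    (RFC 8693 `act` claim) and the RAR details cover the amount
    (RFC 9396 `authorization_details`)."""
    if "sub" not in claims.get("act", {}):
        return False                       # no auditable actor → reject
    for detail in claims.get("authorization_details", []):
        if detail.get("type") == "transfer_funds":
            limit = float(detail["instructedAmount"]["amount"])
            return amount <= limit
    return False                           # no matching detail → reject
```

Because the limits ride inside the signed token, this check needs no call back to Verify at request time.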

The Full Delegation Picture

Token Exchange establishes who is acting for whom. Act-on-behalf-of consent ensures the user approved it. RAR specifies exactly what is permitted and under what conditions. Verify policies enforce all three at issuance time. No single layer is sufficient alone — together they make delegated AI authorization genuinely safe for production banking workloads.


09 Key Takeaways

This deployment is a proof-of-concept that secretless infrastructure is achievable today, without exotic tooling, on a standard EC2 instance running a mainstream AI stack. The components — LangChain, HashiCorp Vault, IBM Verify, SPIFFE/SPIRE — are all production-grade, well-documented, and composable.

The important shifts in mindset this architecture requires:

Identity is the new perimeter. SPIFFE workload identity replaces static credentials as the root of trust. Every service proves who it is before receiving anything sensitive.

Secrets should be dynamic, not static. A secret with a 1-hour TTL that IBM Verify rotates automatically is fundamentally safer than a 90-day API key that "might" get rotated.

Authentication is a point-in-time event. Authorization must be continuous. CAEP and the Shared Signals Framework turn session validity into a living contract — revocable cross-system in real time when behavior crosses defined thresholds.

AI agent delegation requires explicit, auditable consent. Token Exchange and act-on-behalf-of ensure the agent's authority is always user-granted and scope-limited — never inferred, never inherited, always traceable.

RAR gives authorization the precision that scopes never could. Embedding structured authorization intent directly into tokens — and enforcing it via Verify policies at issuance — moves policy out of application code and into the identity layer where it belongs.

The .env is for configuration, not credentials. Client IDs, endpoint URLs, feature flags — these belong in configuration. Secrets do not.

What's Next

Next steps include tuning the CAEP warning thresholds based on observed agent behavior patterns, adding Vault response wrapping for single-use secret delivery, wiring CAEP session events into a SIEM for full audit attribution, and evaluating SPIFFE Federation for multi-cluster trust across environments.