How I deployed a LangChain-powered Agentic AI with an MCP server on AWS EC2 — with no secrets
in the code, no secrets in the .env, no Vault tokens on disk, mutual
cryptographic identity via SPIFFE, continuous session evaluation with CAEP, and fine-grained
delegated authorization through Token Exchange and Rich Authorization Requests.
Nearly every tutorial for deploying an AI agent looks the same: spin up an EC2 instance,
SSH in, create a .env file, paste your OpenAI key, your database password,
your third-party API credentials — and ship it. It works. Until it doesn't.
Secrets in .env files get committed to Git, leaked via misconfigured S3
buckets, or exfiltrated in a breach. Static credentials don't rotate. And an AI agent
that holds a live API key is one prompt injection away from being weaponized.
I wanted to build something different — a deployment where the application has no secrets at all, not even a Vault token, and where every service proves its identity cryptographically before any secret is ever exchanged.
TL;DR
In production with SPIFFE, the .env contains only configuration — no secrets.
Client IDs, endpoint URLs, port numbers — nothing exploitable. No passwords. No API keys. No Vault tokens.
All secrets are fetched at runtime from IBM Verify SaaS via HashiCorp Vault, and authentication
to Vault itself is handled entirely by SPIFFE/SPIRE workload identity — no credential pre-seeding required.
The application is built with standard .env conventions — nothing exotic about
the app itself. But the contents of that file tell a very different story than most deployments.
# .env — production with SPIFFE. No secrets anywhere.
OIDC_CLIENT_ID=a1434344-e66f6-7210-abcd-ef13211267890
EXCHANGE_CLIENT_ID=6abty5f0-e31d-421f6-859a-fc40a7f7a496
MCP_CLIENT_ID=f4219-1a7b-47he-88ca-572w6cf23d
VAULT_BASE_URI=https://vault.example.com
AUTH_METHOD=spiffe
# Client IDs, endpoints, ports — that's it.
# No client_secret. No API keys. No Vault token.
# No database passwords. No AI provider keys.
Client IDs are not secrets — they're public identifiers, like usernames without passwords. They tell the IBM Verify Vault Engine Plugin which application is asking, but they prove nothing on their own. The actual secret retrieval is gated entirely behind Vault, which is gated entirely behind SPIFFE identity.
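As a belt-and-braces guard, the deployment's startup path can refuse to boot if anything secret-shaped ever creeps back into configuration. A minimal sketch (the key patterns are my own illustrative denylist, not part of any standard):

```python
import re

# Denylist of secret-shaped key patterns (illustrative, not an official convention).
FORBIDDEN = [re.compile(p, re.IGNORECASE)
             for p in (r"secret", r"password", r"api[_-]?key", r"vault[_-]?token")]

def assert_secretless(env: dict) -> None:
    """Refuse to boot if any configuration key looks like a credential."""
    offenders = [k for k in env if any(p.search(k) for p in FORBIDDEN)]
    if offenders:
        raise RuntimeError(f"secret-shaped keys found in .env: {offenders}")

# The production .env shown above passes cleanly:
assert_secretless({
    "OIDC_CLIENT_ID": "a1434344-...",
    "VAULT_BASE_URI": "https://vault.example.com",
    "AUTH_METHOD": "spiffe",
})
```

Running this in CI as well as at startup catches the common failure mode where a developer "temporarily" pastes an API key into the .env.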
Why this matters
If this EC2 instance is compromised, an attacker finds client IDs and endpoint URLs in the .env.
Without a valid SPIFFE SVID — which is cryptographically bound to this specific workload
and short-lived — those identifiers are useless. There is nothing to steal.
The Agentic AI provider API key (used by LangChain to call the underlying model) is stored
as a secret in HashiCorp Vault — not in the .env, not in the codebase, not in
any configuration file on disk. At startup, the agent authenticates to Vault via its SPIFFE
identity, retrieves the key with a short-lived lease, and uses it for that session.
When the lease expires, the key is gone from memory. The next request gets a fresh one. There is no persistent AI API credential anywhere in the deployment.
The IBM Verify Vault Engine Plugin is a custom secrets engine that extends HashiCorp Vault with the ability to communicate directly with an IBM Verify SaaS tenant. It acts as a bridge between Vault's dynamic secrets model and IBM Verify's credential management.
The LangChain agent calls the Vault API path mounted by the IBM Verify engine, passing the client ID from the .env. No credential is sent, only the identifier.
The plugin uses its own pre-configured credentials (stored securely in Vault, not accessible to the app) to call the IBM Verify SaaS API and retrieve or generate the secret for the given client ID.
Vault returns the secret to the application with a TTL-bound lease. IBM Verify handles the rotation schedule on its end — credentials are cycled automatically without any application-side logic.
When the Vault lease expires, the credential is no longer valid. The next request triggers a fresh issuance. There is no long-lived secret sitting anywhere in the runtime environment.
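Client-side, the lease model reduces to a small pattern: hold the secret only in memory, and re-fetch when the lease runs out. A sketch with a stand-in fetcher (`fake_fetch` substitutes for the real call to the Vault path mounted by the IBM Verify engine):

```python
import time
from typing import Callable, Tuple

class LeasedSecret:
    """Client-side sketch of Vault's TTL-bound lease model: the secret
    lives only in memory and is re-fetched once its lease expires."""

    def __init__(self, fetch: Callable[[], Tuple[str, int]], clock=time.monotonic):
        self._fetch = fetch          # returns (secret_value, lease_duration_seconds)
        self._clock = clock
        self._value = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._value is None or now >= self._expires_at:
            # Lease expired (or first use): trigger a fresh issuance.
            self._value, ttl = self._fetch()
            self._expires_at = now + ttl
        return self._value

# Stand-in fetcher; real code would call the Vault API instead.
issued = []
def fake_fetch():
    issued.append(1)
    return (f"secret-{len(issued)}", 60)

cache = LeasedSecret(fake_fetch)
```

Note there is no "store secret to disk" path at all; expiry simply drops the value and the next `get()` forces a fresh issuance.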
IBM Verify SaaS Integration
IBM Verify SaaS maintains the authoritative credential store. The Vault plugin is effectively a thin, policy-enforcing proxy — it validates that the caller (the Vault client) is authorized, then delegates secret issuance and rotation to IBM Verify. The application never talks to IBM Verify directly.
This is the piece that closes the loop. Most Vault deployments still have a bootstrapping problem: how does the app authenticate to Vault in the first place? Common answers involve a static Vault token, an AppRole secret ID stored somewhere, or AWS IAM (which requires careful IAM policy management). All of these involve something pre-seeded.
SPIFFE eliminates that entirely.
SPIFFE (Secure Production Identity Framework for Everyone) is an open standard for cryptographic workload identity. Every process — the LangChain agent, the MCP server — gets an SVID (SPIFFE Verifiable Identity Document), a short-lived X.509 certificate or JWT that proves: "I am this specific workload, running in this specific environment."
SPIRE is the reference implementation. A SPIRE Server acts as the certificate authority. A SPIRE Agent runs on each EC2 node and issues SVIDs to local workloads through a Unix domain socket — no network call, no credential exchange, just kernel-attested identity.
HashiCorp Vault's JWT auth method is configured to trust SVIDs issued by the SPIRE Server's trust bundle. When the agent starts up:
# Conceptual startup flow (pseudocode)
svid_jwt = spire_agent.fetch_svid(audience="vault")
# SPIFFE ID inside the SVID: "spiffe://example.org/ns/prod/sa/langchain-agent"
vault_token = vault.auth.jwt.login(
    role="langchain-agent",
    jwt=svid_jwt,
)
# Vault validates the JWT against SPIRE's trust bundle.
# If valid → short-lived Vault token issued for this session.
# That token is never written to disk.
The SVID is short-lived (minutes to hours). Vault issues a correspondingly short-lived token. Neither is stored anywhere. The workload identity is ephemeral by design.
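Vault's JWT role for the agent is bound to a specific SPIFFE ID. A minimal sketch of that binding check in Python (a deliberately simplified parser; the full grammar lives in the SPIFFE ID specification):

```python
from urllib.parse import urlparse
from typing import Tuple

def parse_spiffe_id(spiffe_id: str) -> Tuple[str, str]:
    """Split a SPIFFE ID into (trust_domain, workload_path).
    Simplified: the full grammar is defined by the SPIFFE spec."""
    parts = urlparse(spiffe_id)
    if parts.scheme != "spiffe" or not parts.netloc:
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")
    return parts.netloc, parts.path

def is_authorized(spiffe_id: str, trust_domain: str, allowed_path: str) -> bool:
    """The kind of binding a Vault JWT role enforces: exact trust
    domain plus exact workload path — no wildcards, no substrings."""
    td, path = parse_spiffe_id(spiffe_id)
    return td == trust_domain and path == allowed_path
```

An SVID JWT stolen from a different workload fails this check even if its signature is valid, which is what makes lateral movement with a stolen token fruitless.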
Mutual Zero Trust
SPIFFE enables mutual zero trust — not just "client proves identity to server," but both sides present and verify cryptographic identity. The LangChain agent knows it's talking to the real Vault (mTLS via SVID). Vault knows it's talking to the real LangChain agent (JWT SVID validation). No static shared secrets. No trust-on-first-use.
On AWS EC2, the SPIRE Agent uses the AWS IID (Instance Identity Document) attestor to bootstrap the node's own identity — the EC2 instance proves to the SPIRE Server that it is a legitimate AWS instance in the expected account and region. From that point, all workload identity flows from the SPIRE Agent without any human-injected credentials.
| Attack Surface | Traditional .env Deployment | This Architecture |
|---|---|---|
| .env file leaked | All secrets exposed immediately | Only a client ID exposed — not exploitable alone |
| EC2 instance compromised | Attacker has all credentials on disk | In-memory secrets with short TTLs; SVID expires quickly |
| Git repo leak | Credentials in commit history | Nothing sensitive ever touches the codebase |
| Vault token stolen | Vault access persists until manual revocation | No static Vault token exists; SPIFFE tokens expire by design |
| Credential rotation | Manual process, often skipped | Automatic — IBM Verify rotates, Vault leases enforce TTL |
| Lateral movement | Stolen creds reusable across environments | SVIDs are workload-specific; stolen JWT can't be replayed from another workload |
An agentic AI system is not a passive API endpoint. It reasons, plans, and takes actions: calling external APIs, writing data, executing tools through the MCP server. That autonomy makes a strong security posture far more important than it is for a traditional application.
Consider the risk surface of an AI agent holding live credentials:
Prompt injection attacks attempt to hijack the agent's reasoning to exfiltrate secrets. Tool poisoning via a compromised MCP tool definition could redirect the agent's actions. Runaway tool loops could burn through API quotas or trigger unintended writes. In every case, the blast radius is bounded by what credentials the agent actually holds at runtime.
With this architecture, the agent holds nothing persistently. If the agent is
compromised mid-session, an attacker gets ephemeral, scope-limited credentials that expire
on their own. There's no master key to exfiltrate. There's no .env to read.
Defense in Depth for AI Systems
Pairing a zero-trust credential architecture with an Agentic AI isn't over-engineering. It's recognizing that an autonomous reasoning system with tool access is a high-value target, and designing accordingly. The SPIFFE identity also enables audit logging that is cryptographically attributable — every secret request is tied to a specific workload SPIFFE ID, not just an IP address.
Static authentication — a token issued at login that lives until it expires — is fundamentally at odds with zero trust. A user (or agent) can authenticate legitimately, then immediately begin behaving maliciously. The token doesn't know. Nothing responds.
This deployment integrates IBM Verify's implementation of the Shared Signals Framework (SSF) with the Continuous Access Evaluation Profile (CAEP) — an OpenID Foundation standard that enables real-time, event-driven session revocation across all participating relying parties simultaneously.
The agentic AI monitors every user-initiated request for behavioral anomalies. When the agent detects something suspicious — such as a request to transfer funds to an external account, access another user's account data, or any pattern inconsistent with normal banking activity — it does not immediately terminate the session. It escalates through a structured warning ladder.
W1: The agent flags the suspicious request, logs the signal to SSF, and challenges the user inline. The session remains active. No tokens are affected.
W2: A second anomalous signal within the session triggers step-up authentication. The user must re-verify. SSF emits a credential-change event to downstream listeners.
W3: The session is marked high-risk in IBM Verify. An MFA challenge is mandatory. The risk score is propagated via CAEP session-presented events to all relying parties holding tokens for this subject.
W4: The subject is placed in a restricted state. All new token issuance is blocked. Existing tokens are flagged as potentially compromised across the SSF receiver network. Human review may be triggered at this stage.
W5: This is the kill switch. IBM Verify emits a CAEP session-revoked event across the entire Shared Signals network. Every relying party subscribed to this subject's event stream immediately invalidates all active sessions, revokes all grants, and expires all tokens, regardless of their individual TTLs. The revocation is cross-Verify: it is not limited to this application. Every service that participates in the SSF fabric loses its session for this subject simultaneously.
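The ladder reduces to a small state machine: each anomalous signal escalates one rung, and the fifth rung is the kill switch. A sketch (the action names are illustrative labels, not IBM Verify API calls):

```python
from enum import Enum

class Action(Enum):
    INLINE_CHALLENGE = 1   # W1: flag, log to SSF, challenge inline
    STEP_UP_AUTH = 2       # W2: re-verification, credential-change event
    MANDATORY_MFA = 3      # W3: high-risk, risk score propagated via CAEP
    RESTRICT_SUBJECT = 4   # W4: block new tokens, flag existing ones
    SESSION_REVOKED = 5    # W5: CAEP session-revoked across the SSF network

class WarningLadder:
    """Each anomalous signal escalates one rung; W5 is terminal."""

    def __init__(self):
        self.level = 0

    def signal(self) -> Action:
        self.level = min(self.level + 1, 5)
        return Action(self.level)

ladder = WarningLadder()
```

The important property is monotonicity: signals only escalate within a session, so an attacker cannot "reset" the ladder by interleaving benign requests.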
Why Cross-Verify Revocation Matters
A typical session revocation only kills the token the application currently holds. The user (or attacker) may still have valid tokens in other apps, browser sessions open in other tabs, or OAuth grants issued to third-party clients. CAEP's cross-Verify revocation tears down the entire identity fabric for that subject at once — it's a synchronized kill across every service that speaks SSF, not just the one that detected the anomaly.
The second CAEP trigger condition is behavioral rather than rule-based. The agent evaluates each request against a learned baseline of typical banking interactions for the session context. Requests that fall outside that baseline — unusual transfer amounts, atypical destination accounts, access to administrative functions a user has never touched, queries about other users' account data — are treated as anomalous signals and fed into the same warning escalation ladder.
This isn't just fraud detection. It's continuous authorization: the question isn't only "did this user authenticate?" but "should this authenticated user be doing this specific thing right now?" CAEP operationalizes that question at the protocol level.
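As a toy stand-in for the learned baseline, a simple statistical check conveys the idea: flag any transfer amount far outside the session's history. A real deployment would use a trained behavioral model, not a z-score; this sketch only illustrates the shape of the signal feeding the warning ladder:

```python
from statistics import mean, stdev
from typing import List

def is_anomalous(amount: float, history: List[float],
                 z_threshold: float = 3.0) -> bool:
    """Toy behavioral baseline: flag a transfer amount more than
    z_threshold standard deviations above the session's history."""
    if len(history) < 2:
        return False              # not enough baseline to judge
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return amount != mu       # any deviation from a flat baseline
    return (amount - mu) / sigma > z_threshold
```

Each `True` here would be one anomalous signal pushed into the escalation ladder, not an immediate session kill.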
// Conceptual CAEP session-revoked event payload (SSF SET format)
{
"iss": "https://verify.ibm.com/tenantid",
"jti": "a-unique-event-id",
"iat": 1710000000,
"aud": ["https://banking-app.example.com", "https://api.example.com"],
"events": {
"https://schemas.openid.net/secevent/caep/event-type/session-revoked": {
"subject": {
"format": "iss_sub",
"iss": "https://verify.ibm.com/tenantid",
"sub": "user-uid-1234"
},
"reason_admin": "Anomalous transfer pattern — W5 threshold exceeded",
"event_timestamp": 1710000000
// All RPs receiving this SET immediately revoke all
// sessions, grants, and tokens for sub "user-uid-1234"
}
}
}
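On the receiving side, handling such a SET is straightforward once the signature has been verified. A receiver-side sketch, assuming the SET has already been validated and decoded to a dict (`revoke` stands in for the relying party's own session store):

```python
SESSION_REVOKED = (
    "https://schemas.openid.net/secevent/caep/event-type/session-revoked"
)

def handle_set(payload: dict, revoke) -> bool:
    """If the (already signature-verified) SET carries a CAEP
    session-revoked event, tear down everything for that subject.
    `revoke` is a callback into the RP's session/grant store."""
    event = payload.get("events", {}).get(SESSION_REVOKED)
    if event is None:
        return False              # not a revocation event; ignore here
    subject = event["subject"]
    revoke(issuer=subject["iss"], sub=subject["sub"])
    return True
```

The key design point is that the RP acts purely on the pushed event; it does not need to poll Verify or wait for the token's own TTL to lapse.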
An agentic AI acting on behalf of a user is a delegation problem. The agent is not the user. The agent should be able to act on the user's behalf — but only for specific, consented, policy-bounded operations. Getting that wrong is how AI agents end up with overprivileged access that can be abused or misused.
IBM Verify handles all three layers of this problem natively: the consent UX, the token exchange mechanics, and the policy enforcement.
When the LangChain agent needs to act on behalf of an authenticated user, it uses OAuth 2.0 Token Exchange (RFC 8693). The agent presents the user's subject token to IBM Verify, declares its own identity as the actor, and requests a new token scoped to the specific operation it needs to perform.
# Token Exchange request to IBM Verify
POST /oauth2/token
Content-Type: application/x-www-form-urlencoded
grant_type=urn:ietf:params:oauth:grant-type:token-exchange
&subject_token=<user's_access_token>
&subject_token_type=urn:ietf:params:oauth:token-type:access_token
&actor_token=<agent's_svid_jwt>
&actor_token_type=urn:ietf:params:oauth:token-type:jwt
&requested_token_type=urn:ietf:params:oauth:token-type:access_token
&scope=transfer:all
&authorization_details=<RAR object — see below>
The resulting token carries an act claim in the JWT that identifies the agent
as the actor — creating a cryptographically verifiable, auditable record of
"this agent acted on behalf of this user for this operation." IBM Verify enforces
this entirely — the application doesn't implement any delegation logic itself.
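A resource server consuming the exchanged token can verify the delegation chain with a simple check on the decoded claims. A sketch, assuming signature verification has already happened upstream:

```python
def verify_delegation(claims: dict, expected_actor: str) -> str:
    """Check the RFC 8693 `act` claim on an (already signature-verified)
    exchanged token: confirm the expected agent is the actor, and return
    the delegating user's subject for audit logging."""
    actor = claims.get("act", {}).get("sub")
    if actor != expected_actor:
        raise PermissionError(f"unexpected actor: {actor!r}")
    return claims["sub"]
```

Because the actor is the agent's SPIFFE ID, every audit log line can name both the human principal and the exact workload that acted for them.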
Before any impersonation or delegation token is issued, IBM Verify enforces a consent step. The user must explicitly approve the agent acting on their behalf for the requested scope. That consent record lives in Verify — it's durable, auditable, and revocable. The application never manages consent state. If a user revokes consent, the next Token Exchange attempt for that actor/subject pair is rejected immediately, regardless of any in-flight sessions.
Why This Matters for Agentic AI
An AI agent that can act on a user's behalf without explicit, per-operation consent is an abuse vector. Verify's Token Exchange + consent model means the agent's authority is always user-granted, always scoped, and always revocable — not inferred from having the user's credentials.
Standard OAuth scopes — transfer:all — are coarse. They say
"this token can initiate a transfer" but say nothing about limits, recipients,
or conditions. Rich Authorization Requests (RAR) solve this by embedding
a structured authorization_details object directly in the token request.
// authorization_details object in the Token Exchange request
[
{
"type": "transfer_funds",
"instructedAmount": {
"currency": "USD",
"amount": 500.00
}
}
]
IBM Verify evaluates the authorization_details against
authorization policies defined in the tenant. If the request exceeds
policy limits — amount too high, unauthorized transfer pattern — the token exchange
can be denied or trigger a step-up challenge before a token is ever issued. The policies
live in Verify, not in the application code.
The token returned from Verify carries the authorization_details in its claims.
The resource server receiving the token can inspect those claims and enforce them
independently — even if it never talks to Verify again. The authorization intent is
embedded in the token itself, cryptographically signed.
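Enforcement at the resource server is then a local claims check, with no call back to Verify required. A sketch against the `transfer_funds` shape shown earlier (field names follow that example; treat them as assumptions about the tenant's RAR type):

```python
def enforce_transfer(claims: dict, currency: str, amount: float) -> bool:
    """Resource-server sketch: allow a transfer only if the token's
    embedded authorization_details cover it. The `transfer_funds`
    shape mirrors the RAR object shown above."""
    for detail in claims.get("authorization_details", []):
        if detail.get("type") != "transfer_funds":
            continue
        granted = detail["instructedAmount"]
        if granted["currency"] == currency and amount <= granted["amount"]:
            return True
    return False     # no matching grant: deny by default
```

Deny-by-default matters here: a token with no `authorization_details` at all authorizes nothing, which is the safe failure mode for an autonomous agent.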
The Full Delegation Picture
Token Exchange establishes who is acting for whom. Act-on-behalf-of consent ensures the user approved it. RAR specifies exactly what is permitted and under what conditions. Verify policies enforce all three at issuance time. No single layer is sufficient alone — together they make delegated AI authorization genuinely safe for production banking workloads.
This deployment is a proof-of-concept that secretless infrastructure is achievable today, without exotic tooling, on a standard EC2 instance running a mainstream AI stack. The components — LangChain, HashiCorp Vault, IBM Verify, SPIFFE/SPIRE — are all production-grade, well-documented, and composable.
The important shifts in mindset this architecture requires:
Identity is the new perimeter. SPIFFE workload identity replaces static credentials as the root of trust. Every service proves who it is before receiving anything sensitive.
Secrets should be dynamic, not static. A secret with a 1-hour TTL that IBM Verify rotates automatically is fundamentally safer than a 90-day API key that "might" get rotated.
Authentication is a point-in-time event. Authorization must be continuous. CAEP and the Shared Signals Framework turn session validity into a living contract — revocable cross-system in real time when behavior crosses defined thresholds.
AI agent delegation requires explicit, auditable consent. Token Exchange and act-on-behalf-of ensure the agent's authority is always user-granted and scope-limited — never inferred, never inherited, always traceable.
RAR gives authorization the precision that scopes never could. Embedding structured authorization intent directly into tokens — and enforcing it via Verify policies at issuance — moves policy out of application code and into the identity layer where it belongs.
The .env is for configuration, not credentials. Client IDs, endpoint URLs, feature flags — these belong in configuration. Secrets do not.
What's Next
Next steps include tuning the CAEP warning thresholds based on observed agent behavior patterns, adding Vault response wrapping for single-use secret delivery, wiring CAEP session events into a SIEM for full audit attribution, and evaluating SPIFFE Federation for multi-cluster trust across environments.