NIST Submissions
Vorion actively participates in NIST's AI standards development. Our submissions draw from direct experience operating the BASIS-governed AI platform to provide concrete, implementable recommendations to policymakers.
NIST CAISI: Security Considerations for AI Agents
Vorion's response to NIST CAISI's RFI on AI agent security, drawing from direct experience building the Vorion Governed AI Execution Platform. Addresses all five RFI topics with concrete implementation patterns, quantitative data, and open-source reference implementations.
NIST-2025-0035
Submitted March 2026
Federal Register Vol. 91, No. 5 (pp. 698-701)
10 threat categories specific to agentic AI, mapped to OWASP Top 10 for Agentic Applications
Trust-tiered gating: T0-T7 tiers with 16 behavioral factors governing what agents can do
Full reference implementation at github.com/vorionsys/vorion (Apache-2.0)
Security Threats, Risks & Vulnerabilities
Traditional software threats exploit implementation bugs. Agent threats exploit the fundamental architecture -- agents receive instructions in the same medium they process data (natural language), operate with persistent state, and combine tools in emergent ways no individual tool author anticipated.
10 Agentic Threat Categories (OWASP Mapped)
| OWASP | Threat | Why Agents Are Different |
|---|---|---|
| ASI01 | Goal hijacking | Natural-language instructions cannot be distinguished from data; prompt injection exploits meaning, not syntax. |
| ASI02 | Tool weaponization | Agents chain legitimate tools in unintended sequences; individually safe tools create attack paths when combined. |
| ASI03 | Identity inheritance | Agents inherit human-level permissions by default with no established "agent identity" pattern. |
| ASI04 | Supply chain compromise | MCP servers, plugins, and prompt templates are loaded dynamically from unverified sources at runtime. |
| ASI05 | Code execution escape | Agents generate and execute code as normal operation; the boundary between "output" and "command" is blurred. |
| ASI06 | Memory poisoning | A single successful injection persists across sessions; every future interaction inherits the compromise. |
| ASI07 | Inter-agent spoofing | Multi-agent communication uses natural language with implicit trust -- no TLS equivalent exists. |
| ASI08 | Cascading failures | Connected agent systems amplify errors exponentially; one compromised agent poisons downstream chains within hours. |
| ASI09 | Trust exploitation | Agents generate authoritative explanations that turn human-in-the-loop review into rubber stamps. |
| ASI10 | Rogue behavior | Agents may develop misaligned objectives through reward hacking or memory drift without any external attacker. |
Barriers to Adoption (Question 1c)
Without graduated trust and containment, the risk profile of deploying an autonomous agent is binary -- it works or it causes damage with no intermediate states.
SOC 2 and ISO 27001 have no controls specific to AI agent behavior. Regulated industries cannot demonstrate compliance for agent deployments.
Cyber insurance policies typically exclude AI incidents or lack underwriting models for agentic risk. Without quantifiable trust metrics, insurers cannot price the risk.
Multi-Agent Specific Threats (Question 1e)
When Agent A trusts Agent B's output for downstream decisions, a single compromise propagates through the chain. There is no "certificate revocation" for poisoned outputs already consumed.
Agents may each operate within policy boundaries, yet their collective behavior produces outcomes no single agent's policy was designed to prevent -- the composition problem applied to natural language.
A low-trust agent may request a higher-trust agent to perform actions it cannot do directly. Without explicit delegation controls, multi-agent systems bypass per-agent access controls.
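The delegation pattern above can be sketched as a simple control: a delegated intent never executes with more authority than the lowest-trust agent in the chain. This is an illustrative sketch (the function name and tier encoding are assumptions, not the Vorion API):

```python
# Illustrative delegation control: authority floors at the weakest link
# in the delegation chain, so a low-trust agent cannot "launder" an
# action through a higher-trust agent. Names are hypothetical.

def effective_tier(delegation_chain: list[int]) -> int:
    """Tiers encoded 0 (T0) through 7 (T7); return the chain's minimum."""
    return min(delegation_chain)

# A T1 agent asking a T5 agent to act on its behalf gets T1 authority:
assert effective_tier([5, 1]) == 1
```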
Security Practices: Mitigations & Technical Controls
Vorion recommends three categories of controls informed by operational experience: model-level robustness controls, agent system-level graduated trust architecture, and cryptographic proof chains.
Graduated Trust Architecture -- T0 through T7
| Tier | Score | Agent Capabilities | Governance Posture |
|---|---|---|---|
| T0 Sandbox | 0-199 | Read-only, no external calls | All intents require approval |
| T1 Observed | 200-349 | Basic tools, scoped data | Enhanced logging active |
| T2 Provisional | 350-499 | Standard tools, rate-limited | Sensitive ops require review |
| T3 Monitored | 500-649 | Full standard toolset | Continuous monitoring |
| T4 Standard | 650-799 | Extended tools + external APIs | Green-light for most operations |
| T5 Trusted | 800-875 | Cross-namespace access | Elevated authority scope |
| T6 Certified | 876-950 | Administrative operations | Can approve others' intents |
| T7 Autonomous | 951-1000 | Unrestricted within policy | Self-governing |
Scoring Design Principles
All new agents begin at T0 (Sandbox). Trust is earned, never assumed.
Failures penalize trust more heavily than successes reward it -- a deliberate design bias toward reliability over speed of trust acquisition.
182-day (6-month) half-life via 9 milestones. Grace period: days 0-6. Day 7: -6%, Day 14: -12%, Day 28: -18%, Day 56: -30%, Day 112: -40%, Day 182: -50%. Any activity before a milestone resets the clock.
Scores computed from: Behavioral (40%), Compliance (25%), Identity (20%), Context (15%).
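The scoring math above can be sketched as follows. The tier thresholds come from the T0-T7 table, and the component weights and decay milestones from the text; the function names and exact rounding behavior are assumptions, not the Vorion implementation:

```python
# Illustrative sketch of trust scoring: weighted composite, tier lookup,
# and inactivity decay. Thresholds, weights, and milestones are taken
# from the surrounding text; everything else is hypothetical.

TIER_THRESHOLDS = [  # (minimum score, tier), checked highest first
    (951, "T7"), (876, "T6"), (800, "T5"), (650, "T4"),
    (500, "T3"), (350, "T2"), (200, "T1"), (0, "T0"),
]

WEIGHTS = {"behavioral": 0.40, "compliance": 0.25,
           "identity": 0.20, "context": 0.15}

DECAY_MILESTONES = [  # (days inactive, fraction of score lost)
    (182, 0.50), (112, 0.40), (56, 0.30), (28, 0.18), (14, 0.12), (7, 0.06),
]

def composite_score(components: dict[str, float]) -> float:
    """Weighted sum of the four component scores (each on the 0-1000 scale)."""
    return sum(WEIGHTS[k] * components[k] for k in WEIGHTS)

def tier_for(score: float) -> str:
    for minimum, name in TIER_THRESHOLDS:
        if score >= minimum:
            return name
    return "T0"

def decayed(score: float, days_inactive: int) -> float:
    """Apply the largest decay milestone reached; days 0-6 are a grace period."""
    for days, loss in DECAY_MILESTONES:
        if days_inactive >= days:
            return score * (1 - loss)
    return score
```

For example, an agent scoring 700 on every component lands at T4; after 182 days of inactivity the score halves to 350, dropping it to T2.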
Fluid Governance -- Three-Tier Decision Model
GREEN -- Proceed with constraints: allowed tools, data scopes, rate limits, execution time, and reversibility requirements.
YELLOW -- Transform "access denied" into collaborative negotiation. The agent can reduce scope, add constraints, request approval, or decompose the intent.
RED -- Hard policy violation. Triggers containment escalation and trust score decay. No negotiation path.
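A minimal sketch of the three-tier decision model, assuming a hypothetical rule shape (the enum and function names are illustrative, not the Vorion API):

```python
# Illustrative three-tier governance decision: RED for hard policy
# violations, YELLOW to open negotiation on risky intents, GREEN
# (with constraints attached) otherwise.
from enum import Enum

class Decision(Enum):
    GREEN = "proceed_with_constraints"
    YELLOW = "negotiate"
    RED = "deny_and_contain"

def govern(intent_risk: str, violates_hard_policy: bool) -> Decision:
    if violates_hard_policy:
        return Decision.RED      # no negotiation path; containment escalates
    if intent_risk == "high":
        return Decision.YELLOW   # agent may reduce scope or request approval
    return Decision.GREEN        # proceed, with constraints attached

assert govern("low", False) is Decision.GREEN
```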
Cryptographic Proof Chains
SHA-256 links each record to its predecessor; parallel SHA3-256 integrity anchors provide algorithmic diversity and a migration path if either is weakened (ADR-017).
Ed25519 signatures bind each record to a specific agent identity -- non-repudiation for every governance decision.
Periodic Merkle tree construction is scaffolded for external anchoring and batch verification in high-frequency deployments.
Pedersen commitment and range proof interfaces are scaffolded to enable agents to prove trust tier membership without revealing exact scores. Production integration with ristretto255/circom is planned.
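The dual-hash chaining described above can be sketched with the standard library alone. This is an illustrative structure, not the Vorion implementation; Ed25519 signing of each record is omitted to keep the example self-contained:

```python
# Illustrative dual-hash proof chain: each record carries a SHA-256
# link and a parallel SHA3-256 anchor, so tampering is detectable even
# if one algorithm is later weakened. Record layout is hypothetical.
import hashlib
import json

GENESIS = {"sha256": "0" * 64, "sha3_256": "0" * 64}

def append(chain: list[dict], payload: dict) -> None:
    prev = chain[-1] if chain else GENESIS
    body = json.dumps(payload, sort_keys=True)
    chain.append({
        "payload": payload,
        "sha256": hashlib.sha256((prev["sha256"] + body).encode()).hexdigest(),
        "sha3_256": hashlib.sha3_256((prev["sha3_256"] + body).encode()).hexdigest(),
    })

def verify(chain: list[dict]) -> bool:
    """Recompute both hash links; either algorithm alone detects tampering."""
    prev = GENESIS
    for rec in chain:
        body = json.dumps(rec["payload"], sort_keys=True)
        if rec["sha256"] != hashlib.sha256((prev["sha256"] + body).encode()).hexdigest():
            return False
        if rec["sha3_256"] != hashlib.sha3_256((prev["sha3_256"] + body).encode()).hexdigest():
            return False
        prev = rec
    return True
```

In a full implementation each record would additionally carry an Ed25519 signature binding it to the acting agent's identity, giving the non-repudiation property described above.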
Assessing Security
Assessment spans four dimensions: trust posture, capability scope, governance coverage, and proof chain health. These build on established SIEM and behavioral analytics practices, adapted for the unique characteristics of AI agents.
Four Assessment Dimensions
Trust posture: current trust tier, score history, and trajectory. An agent with a stable T4 score and no recent containment events presents lower risk than an agent oscillating between T2 and T4.
Capability scope: what tools, data, and external services can the agent access? Assessment must include both explicitly granted and implicitly available capabilities (tools accessible through other tools).
Governance coverage: what percentage of possible actions are covered by explicit policy rules? Systems with gaps in P0--P2 coverage present unacceptable risk.
Proof chain health: chain completeness, signature validity, chain integrity verification pass rate, and anchoring freshness -- monitored continuously.
Governance Coverage -- Policy Priority Levels
| Priority | Category | Purpose | Assessment Metric |
|---|---|---|---|
| P0 | Hard disqualifier | Immediate denial | False negative rate (target: 0%) |
| P1 | Regulatory mandate | Compliance | Compliance gap % |
| P2 | Security critical | Violation detection | Detection latency (ms) |
| P3 | Policy enforcement | Standard policies | Override rate |
| P4 | Soft constraint | Guidelines | Acknowledgment rate |
| P5 | Clarification trigger | Ambiguity resolution | Clarification success rate |
| P6 | Logging only | Audit trail | Coverage % |
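Priority-ordered evaluation of the table above might be sketched as follows; the rule shape and outcome strings are assumptions for illustration only:

```python
# Illustrative priority-ordered policy evaluation: P0 hard
# disqualifiers are checked first and short-circuit to denial;
# P1-P2 route to review; P6 rules only log. P3-P5 handling
# (constraints, guidelines, clarification) is omitted from this sketch.

def evaluate(intent: dict, rules: list[dict]) -> str:
    for rule in sorted(rules, key=lambda r: r["priority"]):  # P0 first
        if rule["matches"](intent):
            if rule["priority"] == 0:
                return "deny"    # hard disqualifier: immediate denial
            if rule["priority"] <= 2:
                return "review"  # regulatory mandate / security critical
            if rule["priority"] == 6:
                print("audit:", rule["name"])  # logging only
    return "proceed"
```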
Post-Deployment Incident Detection Methods
Sudden trust score drops indicate behavioral anomalies. Default: a 100-point drop triggers containment escalation. Configurable per deployment.
Continuous proof chain verification detects tampering or audit trail gaps. Chain completeness, signature validity, and hash integrity are treated as security metrics.
Compare tool usage frequency, error rates, and operation scope against established baselines. Significant deviations trigger enhanced monitoring.
Agents repeatedly triggering YELLOW or RED decisions indicate policy misconfiguration or genuine behavioral instability -- both require investigation.
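The first detection method above reduces to a simple threshold check. The 100-point default comes from the text; the function name is hypothetical:

```python
# Illustrative trust-score-drop trigger: a sudden drop of at least
# `threshold` points escalates containment; gradual decline does not.

def should_escalate(previous: float, current: float, threshold: float = 100) -> bool:
    return (previous - current) >= threshold

assert should_escalate(720, 600)      # 120-point drop: escalate
assert not should_escalate(720, 680)  # gradual decline: monitor only
```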
Limiting, Modifying & Monitoring Agent Access
Progressive containment provides proportional response to detected threats across 7 levels. Every GREEN governance decision attaches specific constraints -- allowed tools, data scopes, rate limits, max execution time, and reversibility requirements.
7-Level Progressive Containment
| Level | Name | Restrictions | Trigger |
|---|---|---|---|
| L1 | Full Autonomy | None | Default for T5+ agents |
| L2 | Monitored | Enhanced logging | Trust score below T4 threshold |
| L3 | Tool Restricted | High-risk capabilities blocked | Policy violation detected |
| L4 | Human-in-Loop | All actions require approval | Trust score drop >100 points |
| L5 | Simulation Only | Read-only, no side effects | Suspected goal hijacking |
| L6 | Read Only | Write/execute blocked | Confirmed anomalous behavior |
| L7 | Halted | ALL operations blocked | Kill switch activated |
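The containment ladder above can be sketched as a restriction lookup plus an escalation rule. This is illustrative (restriction names and the no-automatic-de-escalation rule shape are assumptions, not the Vorion API):

```python
# Illustrative 7-level containment ladder. Escalation only moves toward
# more restrictive levels; relaxing containment is assumed to require
# explicit review rather than happening automatically.

RESTRICTIONS = {
    1: set(),                                           # L1 Full Autonomy
    2: {"enhanced_logging"},                            # L2 Monitored
    3: {"enhanced_logging", "high_risk_tools_blocked"}, # L3 Tool Restricted
    4: {"enhanced_logging", "high_risk_tools_blocked",
        "approval_required"},                           # L4 Human-in-Loop
    5: {"read_only", "no_side_effects"},                # L5 Simulation Only
    6: {"read_only", "write_exec_blocked"},             # L6 Read Only
    7: {"all_operations_blocked"},                      # L7 Halted (kill switch)
}

def escalate(current: int, proposed: int) -> int:
    """Containment never relaxes automatically: keep the stricter level."""
    return max(current, proposed)

assert escalate(2, 5) == 5
assert escalate(6, 3) == 6  # attempted de-escalation is ignored
```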
Zero-Trust Adaptations for Agents
Agent identity verified on every request, not just at session establishment. Trust scores re-evaluated continuously -- not just at authentication time.
No agent is trusted by default, regardless of its creator or deployment context. All agents start at T0.
Categorical Agentic Registry provides unique, verifiable identity per agent distinct from human credentials. No credential inheritance.
All agent-to-agent communication is treated as potentially adversarial. Trust scores propagate -- agents do not blindly accept instructions from lower-trust agents.
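A per-request zero-trust check following the adaptations above might look like this; the registry shape and field names are assumptions for illustration:

```python
# Illustrative per-request zero-trust check: identity and trust tier
# are re-verified on every request (never cached from session setup),
# and every message must carry a valid signature.

def authorize(request: dict, registry: dict, min_tier: int) -> bool:
    agent = registry.get(request["agent_id"])  # identity re-verified now
    if agent is None:
        return False                           # unknown agent: deny
    if agent["tier"] < min_tier:               # trust re-evaluated now
        return False
    return request["signature_valid"]          # every message authenticated

registry = {"agent-7": {"tier": 4}}
assert authorize({"agent_id": "agent-7", "signature_valid": True}, registry, 3)
```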
Legal & Privacy Considerations
Proof chains may contain sensitive data (user queries, agent reasoning traces). Retention policies must balance auditability with privacy requirements.
Conflicts with immutable proof chains. Addressed through pseudonymization -- proof records reference entity IDs, not personal data -- and Pedersen commitment interfaces.
Monitoring agents acting on behalf of specific employees may implicate privacy regulations in some jurisdictions. Clear scope and purpose policies are required.
Additional Considerations
Open-source reference implementations paired with open standards are the most effective adoption mechanism. The U.S. should lead on runtime agent governance standards -- the critical gap between pre-deployment evaluation (where other jurisdictions have invested) and runtime behavioral controls (where no jurisdiction has published guidance).
Critical Research Priorities (Question 5c)
Despite 3+ years since prompt injection was first identified, no robust defense exists. Research should focus on architectural solutions -- separating instruction and data channels -- rather than input filtering.
Formal models for how trust should propagate, decay, and revoke across agent fleets. Current approaches are entirely ad hoc.
Methods to detect when an agent's behavior has shifted from its intended purpose through subtle changes that individually appear benign but collectively represent goal drift.
Cryptographic protocols for agent-to-agent communication providing authentication, integrity, and non-repudiation without prohibitive latency.
Empirical research on how effectively containment mechanisms prevent harm propagation, and what levels are appropriate for different risk categories.
Most Urgent Government Collaboration Areas (Question 5b)
No established standard exists for AI agent identity distinct from human identity or service accounts. Government collaboration analogous to PIV/CAC for humans would address ASI03, ASI07, and ASI09 simultaneously.
As agencies deploy multi-agent systems, trust propagation between agencies' agents will require interoperable trust frameworks. NIST is uniquely positioned to develop cross-organizational agent trust standards.
AI agent security incidents are currently unreportable through existing channels (CISA, CVE). A mechanism for prompt injection campaigns, supply chain compromises, and behavioral drift events would improve collective defense.
FedRAMP and FISMA have no mapping to AI agent security controls. NIST guidance on how agent security controls satisfy existing compliance requirements would accelerate safe adoption.
Applicable Practices from Other Fields (Question 5e)
Crew resource management -- multi-layered authority model (captain, first officer, ATC) maps to trust tiers and escalation chains.
Defense in depth -- multiple independent barriers principle informs progressive containment design. No single failure should cascade.
Graduated authority limits based on demonstrated competence and track record. Trust scoring adapts this for AI agents.
FDA process validation (IQ/OQ/PQ) maps to agent lifecycle: installation, operational, and performance qualification.
SCADA safety systems are physically separate from control systems. Agent kill switches must be architecturally isolated from the agent control plane.
Standards Alignment
| Standard | How This Work Aligns |
|---|---|
| NIST AI RMF (AI 100-1) | GOVERN: trust tiers and policy rules; MAP: threat taxonomy; MEASURE: trust scoring metrics; MANAGE: progressive containment and kill switches |
| NIST CSF 2.0 | Extends ID, PR, DE, RS, RC functions with agent-specific controls; agents addressed as a distinct asset type |
| NIST IR 8596 (Cyber AI Profile) | Builds on our January 2026 public comment prepared for NIST's Cyber AI Profile |
| OWASP Top 10 for Agentic AI | Full ASI01-ASI10 mapping with implemented technical controls per threat category |
| ISO/IEC 42001 | Trust scoring complements AIMS with runtime behavioral measurement |
| EU AI Act | Trust tiers map to risk categories; progressive containment addresses Article 14 human oversight requirements |
Vorion's Open Standards Work
The BASIS standard and Vorion platform are fully open-source. All governance infrastructure referenced in these submissions is available for NIST review and public implementation.