AI Agent Security: Trace, Test, and Protect in Production

Trident gives you end-to-end security coverage for AI agents — from the first prompt your agent receives to the findings that land in your inbox. This section covers the four pillars of agent security in Trident: distributed tracing to capture every LLM call and tool invocation, automated red-teaming to discover vulnerabilities before attackers do, a runtime firewall to block prompt injection and jailbreaks in production, and a findings inbox to triage, replay, and resolve issues.

Tracing

Automatically capture every prompt, tool call, and LLM response your agent makes — with zero-config PII redaction built in.

Red-Teaming

Run 200+ adversarial attack vectors against your agents covering the full OWASP Agentic Top-10.

Firewall

Scan every prompt and output for injection attacks, jailbreaks, and data exfiltration attempts in real time.

Findings

Review, replay, and resolve security issues from red-team runs, the firewall, static analysis, monitors, and agent self-reports.

End-to-end workflow

Trident’s agent security features are designed to work together across the full development and production lifecycle.

Instrument

Install the Trident SDK and call trident.init() (TypeScript) or vouch_sdk.init() (Python) once at startup. The SDK wraps OpenLLMetry and auto-traces every LLM call in the process — no additional code required.
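As a rough illustration of what "auto-tracing" generally means, the sketch below patches a client method once at startup so that every later call is recorded. It does not use the Trident SDK or OpenLLMetry; `FakeLLMClient`, `traced`, `init`, and `SPANS` are hypothetical names invented for this example, and the real SDK may work differently.

```python
import functools
import time
from typing import Any, Callable, Dict, List

# Stand-in for a span exporter (assumption: real SDKs ship spans to a backend).
SPANS: List[Dict[str, Any]] = []


def traced(name: str) -> Callable:
    """Return a decorator that records one span per call to the wrapped function."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                # Record the span even if the call raised.
                SPANS.append({"name": name, "duration_s": time.perf_counter() - start})
        return wrapper
    return decorator


class FakeLLMClient:
    """Hypothetical client standing in for a real provider SDK."""

    def complete(self, prompt: str) -> str:
        return prompt.upper()


def init() -> None:
    """Patch the client class once so every later call is traced."""
    FakeLLMClient.complete = traced("llm.complete")(FakeLLMClient.complete)


init()
FakeLLMClient().complete("hello")  # application code is unchanged
```

The point of the pattern is that application code keeps calling the client as before; only the one-time `init()` call is added.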

Red-team

Before you ship, run a red-team campaign from the dashboard or via the API. Trident fires 200+ attack vectors at your agent and scores every result against the OWASP Agentic Top-10 and the AIVSS severity scale.

Firewall

In production, route prompts through the Trident gateway proxy or call trident.scan() directly before each LLM call. Every prompt passes through a two-stage firewall: your project’s custom deny rules fire first, then the LLM Guard ensemble.
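The ordering described above (custom deny rules first, then a scanner ensemble) can be sketched as a short pipeline. This is a minimal illustration under stated assumptions, not Trident's implementation: the `Verdict` type, the example deny rules, and the toy `looks_like_exfiltration` scanner are all hypothetical, and the local `scan` function here is unrelated to the real `trident.scan()` signature.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Verdict:
    allowed: bool
    stage: Optional[str] = None   # which stage blocked the prompt, if any
    reason: Optional[str] = None  # matched rule or scanner message


# Stage 1: project-specific deny rules (hypothetical examples).
DENY_RULES = [
    re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),
    re.compile(r"reveal (the )?system prompt", re.IGNORECASE),
]


def looks_like_exfiltration(prompt: str) -> Optional[str]:
    """Toy stand-in for one ensemble scanner; returns a reason or None."""
    lowered = prompt.lower()
    if "base64" in lowered and "secret" in lowered:
        return "possible data exfiltration"
    return None


# Stage 2: stand-in for the scanner ensemble (assumption: a list of callables).
ENSEMBLE: List[Callable[[str], Optional[str]]] = [looks_like_exfiltration]


def scan(prompt: str) -> Verdict:
    # Deny rules run first, so a match returns before the ensemble is consulted.
    for rule in DENY_RULES:
        if rule.search(prompt):
            return Verdict(False, "deny_rule", rule.pattern)
    for scanner in ENSEMBLE:
        reason = scanner(prompt)
        if reason is not None:
            return Verdict(False, "ensemble", reason)
    return Verdict(True)
```

Running cheap, project-specific rules first means known-bad prompts are rejected without paying for the heavier second stage.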

Review findings

All issues — from red-team campaigns, the runtime firewall, static analysis, monitors, and agent self-reports — land in the Findings inbox. Confirm a finding to automatically push a new firewall rule. Replay it at any time to verify your fix.

Supported frameworks

Trident’s tracing and firewall integrations work with every major LLM framework and provider. Install the SDK and your existing code is instrumented automatically, with no framework-specific plugins required.

LLM providers: OpenAI, Anthropic, Amazon Bedrock, Google Vertex AI, Cohere
Agent frameworks: LangChain, CrewAI, LlamaIndex
Protocols: Model Context Protocol (MCP)

PII redaction

Trident redacts sensitive data in your process before any trace data leaves your infrastructure. The following patterns are scrubbed from span attributes automatically:

Email addresses
Credit card numbers (validated with Luhn checksum)
US Social Security Numbers (SSNs)
AWS access key IDs
JSON Web Tokens (JWTs)
API keys (OpenAI, Anthropic, and similar sk- prefixed secrets)
IBANs
IPv4 addresses
Phone numbers

You can supply your own redaction rules or disable redaction entirely by passing options to trident.init(). See the Tracing page for details.
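To make the pattern list above concrete, here is a minimal sketch of in-process redaction for a few of those categories, including a Luhn check for card numbers. The regular expressions, placeholder labels, and function names are assumptions chosen for illustration; Trident's actual patterns and configuration options are not shown in this section and may differ.

```python
import re


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 13:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


# Illustrative patterns only (assumptions, deliberately simple).
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")
CARD = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")  # 13 to 19 digits, optional separators


def redact(text: str) -> str:
    """Replace matches with placeholders before the text leaves the process."""
    text = EMAIL.sub("[EMAIL]", text)
    text = SSN.sub("[SSN]", text)
    text = AWS_KEY_ID.sub("[AWS_KEY_ID]", text)
    # Only replace digit runs that pass the Luhn check, to limit false positives.
    return CARD.sub(
        lambda m: "[CARD]" if luhn_valid(m.group()) else m.group(), text
    )
```

Validating candidates with the Luhn checksum keeps ordinary long numbers, such as order IDs, from being scrubbed as card numbers.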
