POST /api/public/trident/redteam/campaign
Enqueue an autonomous Trident-AI red-team campaign against a target agent. The campaign engine selects attack skills based on your chosen `scanMode`, runs multi-turn adversarial interactions, and files findings directly into your project inbox.
Endpoint: POST https://app.usetrident.dev/api/public/trident/redteam/campaign
Authentication: HTTP Basic — see Authentication
Request body
The ID of the agent to attack. Must match `[a-zA-Z0-9._-]+` and be registered in your project.

Describes how the campaign runner reaches your agent. Three target kinds are supported:

`openai-chat`: OpenAI-compatible chat endpoint.

| Field | Type | Notes |
|---|---|---|
| `kind` | `"openai-chat"` | Required |
| `baseUrl` | string (URL) | Required. Your agent’s chat completions URL |
| `apiKey` | string | Optional. Bearer token for the endpoint |
| `model` | string | Optional. Model ID to pass in the request |
| `systemPrompt` | string | Optional. System prompt to prepend |

`http-proxy`: Arbitrary HTTP endpoint with a templated body.

| Field | Type | Notes |
|---|---|---|
| `kind` | `"http-proxy"` | Required |
| `url` | string (URL) | Required |
| `headers` | object | Optional key/value request headers |
| `bodyTemplate` | string | Required. `$INPUT` is replaced with the attack text |
| `responsePath` | string | Optional. Dot-path to the reply field in the JSON response |
| `timeoutMs` | number | Optional. Per-request timeout, max 120 000 ms |

`echo`: No-op target for testing your pipeline.

| Field | Type | Notes |
|---|---|---|
| `kind` | `"echo"` | Required |
| `responses` | object | Optional map of input → response strings |

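The three target kinds map naturally onto a discriminated union. The following is a minimal sketch of how a client might model and pre-check a target object before enqueuing. The type names and the `validateTarget` helper are illustrative and not part of the Trident API. Treating a missing `$INPUT` placeholder and a non-positive `timeoutMs` as problems is an assumption, and the server's own validation may differ.

```typescript
// Illustrative client-side types; field names come from the tables above.
type OpenAiChatTarget = {
  kind: "openai-chat";
  baseUrl: string;
  apiKey?: string;
  model?: string;
  systemPrompt?: string;
};

type HttpProxyTarget = {
  kind: "http-proxy";
  url: string;
  headers?: Record<string, string>;
  bodyTemplate: string;
  responsePath?: string;
  timeoutMs?: number;
};

type EchoTarget = {
  kind: "echo";
  responses?: Record<string, string>;
};

type CampaignTarget = OpenAiChatTarget | HttpProxyTarget | EchoTarget;

const MAX_TIMEOUT_MS = 120_000; // documented per-request maximum

function isUrl(value: string): boolean {
  try {
    new URL(value);
    return true;
  } catch {
    return false;
  }
}

// Returns a list of problems; an empty list means the target looks plausible.
function validateTarget(target: CampaignTarget): string[] {
  const problems: string[] = [];
  switch (target.kind) {
    case "openai-chat":
      if (!isUrl(target.baseUrl)) problems.push("baseUrl must be a valid URL");
      break;
    case "http-proxy":
      if (!isUrl(target.url)) problems.push("url must be a valid URL");
      // Assumption: a template without $INPUT would never carry the attack text.
      if (!target.bodyTemplate.includes("$INPUT")) {
        problems.push("bodyTemplate should contain $INPUT");
      }
      // Assumption: the lower bound is not documented; only the 120 000 ms cap is.
      if (
        target.timeoutMs !== undefined &&
        (target.timeoutMs <= 0 || target.timeoutMs > MAX_TIMEOUT_MS)
      ) {
        problems.push("timeoutMs must be positive and at most 120000");
      }
      break;
    case "echo":
      break; // nothing is required beyond `kind`
  }
  return problems;
}
```

Returning a list of problems instead of throwing lets a CI script report every issue with the target configuration in one pass.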
Preset scan profile. Recommended for CI use because each mode has a calibrated cost envelope and duration estimate.

| Mode | Skills | Expected duration | Recommended for |
|---|---|---|---|
| `quick` | ~3 | < 2 min | PR checks |
| `standard` | ~7 | ~10 min | Main branch gates |
| `deep` | ~12 | ~25 min | Release gates |
| `exhaustive` | ~20 | ~60 min | Periodic full audits |

Omit `scanMode` to configure skills manually via `skillIds`, `maxSteps`, and `perStepMaxIterations`.

Explicit list of attack skill IDs to run. Use when you need fine-grained control over which attacks are attempted. Ignored when `scanMode` is set.

Maximum number of attack steps across the campaign, 1–20. Defaults are set by `scanMode`; override here for custom budgets.

Maximum adversarial turns per skill step, 1–60.

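Because `skillIds` is ignored when `scanMode` is set, a client can treat the two as alternatives. The sketch below is one way to build that part of a request body. The `skillSelection` helper and its types are hypothetical, and the integer check on the ranges is an assumption rather than documented behaviour.

```typescript
type ScanMode = "quick" | "standard" | "deep" | "exhaustive";

// Approximate figures copied from the mode table above.
const SCAN_MODES: Record<ScanMode, { skills: number; approxMinutes: number }> = {
  quick: { skills: 3, approxMinutes: 2 },
  standard: { skills: 7, approxMinutes: 10 },
  deep: { skills: 12, approxMinutes: 25 },
  exhaustive: { skills: 20, approxMinutes: 60 },
};

type ManualConfig = {
  skillIds: string[];
  maxSteps?: number; // documented range: 1 to 20
  perStepMaxIterations?: number; // documented range: 1 to 60
};

type SkillSelection = { scanMode: ScanMode } | ManualConfig;

// Assumption: the values are whole numbers; the docs only give the ranges.
function inRange(value: number | undefined, min: number, max: number): boolean {
  return value === undefined || (Number.isInteger(value) && value >= min && value <= max);
}

// Builds either a preset selection or a manual one, never both,
// since skillIds is ignored when scanMode is set.
function skillSelection(choice: ScanMode | ManualConfig): SkillSelection {
  if (typeof choice === "string") return { scanMode: choice };
  if (!inRange(choice.maxSteps, 1, 20)) {
    throw new RangeError("maxSteps must be between 1 and 20");
  }
  if (!inRange(choice.perStepMaxIterations, 1, 60)) {
    throw new RangeError("perStepMaxIterations must be between 1 and 60");
  }
  return { ...choice };
}
```

The `SCAN_MODES` table can also be used to pick a polling timeout, for example a small multiple of `approxMinutes` for the chosen mode.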
Per-step LLM cost ceiling in USD, max $10. The runner halts the step when this threshold is crossed.

Absolute total cost ceiling in USD, max $100. The campaign aborts if cumulative LLM spend reaches this value.

Up to 500 characters of context about what the agent should never do. The attack engine uses this to focus on the most relevant attack categories.

When `true`, the campaign skips skills that already have a landed finding against this agent, running only attack types that have not yet succeeded. Useful for incremental scans that build on earlier results.

Look-back window in days (1–180) used by `gapFillOnly` to determine which skills already have findings.

Optional context objects that sharpen specific attack categories.

| Field | Type | Purpose |
|---|---|---|
| `exfilCanaries` | string[] | Secret strings to watch for in outputs (data exfiltration probes) |
| `forbiddenTokens` | string[] | Strings the agent must never output |
| `highBlastTools` | string[] | Names of tools with a high blast radius if misused |
| `attackerInjectedTargets` | string[] | Injection targets for indirect prompt injection probes |

Example request
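As a hedged sketch, a request from TypeScript might look like the following. The top-level field names `agentId` and `target` are assumptions (this page describes those fields but does not show their names), as are the placeholder agent ID and credentials. `scanMode` and the `echo` target kind are taken from the sections above.

```typescript
// ASSUMPTIONS: `agentId` and `target` are guessed field names; confirm them
// against the API reference. The credentials are placeholders for whatever
// the Authentication page specifies for HTTP Basic.
const CAMPAIGN_URL = "https://app.usetrident.dev/api/public/trident/redteam/campaign";

// btoa handles Latin-1 input only, which is enough for typical API keys.
function basicAuthHeader(username: string, password: string): string {
  return "Basic " + btoa(`${username}:${password}`);
}

function buildCampaignRequest(username: string, password: string) {
  const body = {
    agentId: "support-bot", // hypothetical agent ID
    target: { kind: "echo" }, // no-op target for testing the pipeline
    scanMode: "quick",
  };
  return {
    url: CAMPAIGN_URL,
    init: {
      method: "POST",
      headers: {
        Authorization: basicAuthHeader(username, password),
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}

// Sends the request; a successful enqueue is documented as HTTP 202.
async function enqueueCampaign(username: string, password: string): Promise<unknown> {
  const { url, init } = buildCampaignRequest(username, password);
  const response = await fetch(url, init);
  if (response.status !== 202) {
    throw new Error(`Unexpected status ${response.status}`);
  }
  return response.json();
}
```

Splitting request construction from the network call keeps the payload easy to inspect or log in CI before anything is sent.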
Example response (HTTP 202)
GET /api/public/trident/redteam/campaign/{jobId}
Poll a campaign’s current state. Returns the job state plus a count of findings filed so far, so you can distinguish “still running” from “done, zero findings.”
Endpoint: GET https://app.usetrident.dev/api/public/trident/redteam/campaign/{jobId}
Authentication: HTTP Basic — see Authentication
Path parameters
The `jobId` returned by `POST /api/public/trident/redteam/campaign`.
Example request
curl
Example response
State values
| `state` | Meaning |
|---|---|
| `queued` | Job is waiting to be picked up |
| `active` | Worker is currently running the campaign |
| `completed` | Campaign finished; check `findingsUrl` for results |
| `failed` | Campaign encountered an unrecoverable error; see `failedReason` |
| `delayed` | Job is scheduled for a future time |
| `waiting` | Job is waiting for a concurrency slot |
| `unknown` | Job ID not found or no longer tracked. If `findingCount > 0`, the campaign completed |

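The state table can be collapsed into a small decision function. This is a minimal sketch under stated assumptions: the `Outcome` labels and the `classify` helper are hypothetical names, and an `unknown` state with zero findings is treated as "not found" even though it could also be an untracked campaign that finished with no findings.

```typescript
type JobState =
  | "queued"
  | "active"
  | "completed"
  | "failed"
  | "delayed"
  | "waiting"
  | "unknown";

// Hypothetical client-side labels, not values returned by the API.
type Outcome = "pending" | "succeeded" | "failed" | "not-found";

function classify(state: JobState, findingCount: number): Outcome {
  switch (state) {
    case "completed":
      return "succeeded";
    case "failed":
      return "failed";
    case "unknown":
      // Findings on file imply the campaign ran to completion.
      return findingCount > 0 ? "succeeded" : "not-found";
    default:
      return "pending"; // queued, active, delayed, waiting
  }
}
```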
Finished campaigns may no longer report a specific state. If `state` is `"unknown"` but `findingCount` is greater than zero, the campaign ran successfully and findings are available.

POST /api/public/trident/redteam/run
Enqueue a Promptfoo-based red-team run. This engine uses static Promptfoo plugins and strategies for deterministic, broad-coverage scanning, which suits compliance audits where you need reproducible attack counts.
Endpoint: POST https://app.usetrident.dev/api/public/trident/redteam/run
Authentication: HTTP Basic — see Authentication
Request body
Agent ID. Must match `[a-zA-Z0-9._-]+`.

A human-readable description of what this agent does. Surfaced in the dashboard and used to focus plugin attacks. Maximum 500 characters.

Promptfoo plugin IDs to run, for example `["foundation", "harmful", "pii", "bias", "financial"]`. Between 1 and 40 plugins.

Promptfoo strategy IDs, for example `["basic", "jailbreak", "jailbreak:tree", "crescendo"]`. Between 1 and 8 strategies.

Provider type Promptfoo should use, for example `"http"`, `"openai-chat"`, or `"echo"`.

Provider-specific configuration: URL, headers, auth, response parser. Shape depends on `targetProviderId`.

Number of test cases generated per (plugin, strategy) combination, 1–20. Higher values increase coverage and cost.

Optional Trident policy ID to associate with this run for compliance tracking.
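Because test cases are generated per (plugin, strategy) combination, run size grows multiplicatively. The sketch below gives a rough estimate, assuming the total is simply plugins × strategies × tests per combination; the engine may deduplicate or expand cases differently. The `estimateTestCases` helper and its `testsPerCombination` parameter name are hypothetical.

```typescript
function estimateTestCases(
  pluginIds: string[],
  strategyIds: string[],
  testsPerCombination: number,
): number {
  if (pluginIds.length < 1 || pluginIds.length > 40) {
    throw new RangeError("expected between 1 and 40 plugins");
  }
  if (strategyIds.length < 1 || strategyIds.length > 8) {
    throw new RangeError("expected between 1 and 8 strategies");
  }
  if (
    !Number.isInteger(testsPerCombination) ||
    testsPerCombination < 1 ||
    testsPerCombination > 20
  ) {
    throw new RangeError("expected between 1 and 20 tests per combination");
  }
  return pluginIds.length * strategyIds.length * testsPerCombination;
}

const plugins = ["foundation", "harmful", "pii", "bias", "financial"];
const strategies = ["basic", "jailbreak", "jailbreak:tree", "crescendo"];

// 5 plugins x 4 strategies x 5 tests each
console.log(estimateTestCases(plugins, strategies, 5)); // 100
```

At the documented maximums (40 plugins, 8 strategies, 20 tests each) the same arithmetic gives 6,400 test cases, which is worth keeping in mind when budgeting a run.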
Example request
Example response (HTTP 202)
POST /api/public/trident/redteam/garak
Enqueue an NVIDIA Garak static-probe run. Garak fires approximately 2,500 attack variants across four probe suites (encoding, leakreplay, snowball, and glitch), making it suitable for compliance reports that require deterministic, reproducible attack counts.
Endpoint: POST https://app.usetrident.dev/api/public/trident/redteam/garak
Authentication: HTTP Basic — see Authentication
Probe suites
| Probe | What it tests |
|---|---|
| `encoding` | Character encoding tricks (Base64, ROT13, Unicode homoglyphs) used to bypass content filters |
| `leakreplay` | System prompt and training-data leakage |
| `snowball` | Escalating multi-turn requests designed to erode refusal guardrails |
| `glitch` | Unusual token sequences and adversarial suffixes that cause unexpected model behaviour |
Request body
Agent ID. Must match `[a-zA-Z0-9._-]+`.

Scan label surfaced in the dashboard. Maximum 500 characters.

Subset of `["encoding", "leakreplay", "snowball", "glitch"]`. At least one is required.

Base URL of the HTTP endpoint Garak should attack.

Human-readable model identifier, surfaced in the dashboard. Maximum 120 characters.

Path appended to `targetBaseUrl`. Defaults to `/chat`.

Optional bearer token or API key for the target. Encrypted at rest on enqueue and never logged.

JSON body template for each probe request. `$INPUT` is replaced with the Garak-synthesised attack string. Maximum 8 000 characters.

Top-level JSON key or JSONPath (prefix with `$`) that contains the response text. Maximum 200 characters.

Example request
Example response (HTTP 202)
Polling pattern: wait for a campaign to complete
Use the campaign status endpoint to poll until the job reaches a terminal state, then retrieve findings.
TypeScript
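A minimal polling sketch, under stated assumptions: the response fields `state`, `findingCount`, `findingsUrl`, and `failedReason` are the ones named on this page, while the `waitForCampaign` and `httpFetcher` helpers, the default interval and timeout, and the decision to keep polling on `unknown` with zero findings are illustrative choices rather than documented behaviour.

```typescript
// Only the status fields used here; the real response may contain more.
type CampaignStatus = {
  state: string;
  findingCount: number;
  findingsUrl?: string;
  failedReason?: string;
};

type StatusFetcher = (jobId: string) => Promise<CampaignStatus>;

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function waitForCampaign(
  jobId: string,
  fetchStatus: StatusFetcher,
  { intervalMs = 15_000, timeoutMs = 30 * 60_000 } = {},
): Promise<CampaignStatus> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const status = await fetchStatus(jobId);
    if (status.state === "completed") return status;
    if (status.state === "failed") {
      throw new Error(`Campaign ${jobId} failed: ${status.failedReason ?? "no reason given"}`);
    }
    // A finished job may report "unknown"; findings on file imply it ran.
    if (status.state === "unknown" && status.findingCount > 0) return status;
    // Every other case, including "unknown" with zero findings, keeps polling.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`Timed out waiting for campaign ${jobId}`);
    }
    await sleep(intervalMs);
  }
}

// Example fetcher; `authHeader` is the HTTP Basic header value for your credentials.
function httpFetcher(authHeader: string): StatusFetcher {
  return async (jobId) => {
    const url =
      "https://app.usetrident.dev/api/public/trident/redteam/campaign/" +
      encodeURIComponent(jobId);
    const res = await fetch(url, { headers: { Authorization: authHeader } });
    if (!res.ok) throw new Error(`Status request failed: ${res.status}`);
    return (await res.json()) as CampaignStatus;
  };
}
```

Passing the fetcher in as a parameter keeps the loop testable without network access; in CI, the timeout can be set from the expected duration of the chosen scan mode.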