Red-Team API: Trigger Attack Campaigns Programmatically

The Red-Team API lets you launch security attack campaigns against your AI agents programmatically — from CI/CD pipelines, GitHub Actions, SDK calls, or any HTTP client. Trident provides three complementary scan engines: autonomous AI campaigns (recommended for broad coverage), Promptfoo plugin runs (deterministic, plugin-scoped), and Garak static probes (compliance-friendly, high-volume). All red-team endpoints return a run or job ID immediately (HTTP 202 Accepted). You then poll for status and retrieve findings once the run completes.

POST /api/public/trident/redteam/campaign

Enqueue an autonomous Trident-AI red-team campaign against a target agent. The campaign engine selects attack skills based on your chosen scanMode, runs multi-turn adversarial interactions, and files findings directly into your project inbox.

Endpoint: POST https://app.usetrident.dev/api/public/trident/redteam/campaign

Authentication: HTTP Basic — see Authentication

Request body

agentId

string

required

The ID of the agent to attack. Must match [a-zA-Z0-9._-]+ and be registered in your project.

target

object

required

Describes how the campaign runner reaches your agent. Three target kinds are supported:

openai-chat — OpenAI-compatible chat endpoint.

Field	Type	Notes
`kind`	`"openai-chat"`	Required
`baseUrl`	`string` (URL)	Required — your agent’s chat completions URL
`apiKey`	`string`	Optional — bearer token for the endpoint
`model`	`string`	Optional — model ID to pass in the request
`systemPrompt`	`string`	Optional — system prompt to prepend

http-proxy — Arbitrary HTTP endpoint with a templated body.

Field	Type	Notes
`kind`	`"http-proxy"`	Required
`url`	`string` (URL)	Required
`headers`	`object`	Optional key/value request headers
`bodyTemplate`	`string`	Required — `$INPUT` is replaced with the attack text
`responsePath`	`string`	Optional — dot-path to the reply field in the JSON response
`timeoutMs`	`number`	Optional — per-request timeout, max 120 000 ms

echo — No-op target for testing your pipeline.

Field	Type	Notes
`kind`	`"echo"`	Required
`responses`	`object`	Optional map of input → response strings
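
The three target kinds can be modelled on the client as a discriminated union, which lets the compiler catch a missing required field before the request is sent. The following is a minimal sketch, not part of any Trident SDK: the type names and the `checkTarget` helper are hypothetical, and typing `headers` and `responses` as string-to-string maps is an assumption based on the field descriptions above.

```typescript
// Client-side types mirroring the three documented target kinds.
// All names here are illustrative; only the field names come from the tables above.
type OpenAiChatTarget = {
  kind: "openai-chat";
  baseUrl: string;
  apiKey?: string;
  model?: string;
  systemPrompt?: string;
};

type HttpProxyTarget = {
  kind: "http-proxy";
  url: string;
  headers?: Record<string, string>; // assumed shape: header name -> value
  bodyTemplate: string; // $INPUT is replaced with the attack text
  responsePath?: string;
  timeoutMs?: number; // documented maximum: 120 000 ms
};

type EchoTarget = {
  kind: "echo";
  responses?: Record<string, string>; // assumed shape: input -> response
};

type Target = OpenAiChatTarget | HttpProxyTarget | EchoTarget;

// Returns a list of problems; an empty list means nothing obvious is wrong.
function checkTarget(target: Target): string[] {
  const problems: string[] = [];
  switch (target.kind) {
    case "openai-chat":
      if (!target.baseUrl) problems.push("baseUrl is required");
      break;
    case "http-proxy":
      if (!target.url) problems.push("url is required");
      if (!target.bodyTemplate.includes("$INPUT")) {
        problems.push("bodyTemplate should contain $INPUT");
      }
      if (target.timeoutMs !== undefined && target.timeoutMs > 120000) {
        problems.push("timeoutMs exceeds the 120 000 ms maximum");
      }
      break;
    case "echo":
      break; // no required fields beyond kind
  }
  return problems;
}
```

The check is advisory only; the server remains the authority on what it accepts.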

scanMode

string

Preset scan profile. Recommended for CI use because each mode has a calibrated cost envelope and duration estimate.

Mode	Skills	Expected duration	Recommended for
`quick`	~3	< 2 min	PR checks
`standard`	~7	~10 min	Main branch gates
`deep`	~12	~25 min	Release gates
`exhaustive`	~20	~60 min	Periodic full audits

Omit scanMode to configure skills manually via skillIds, maxSteps, and perStepMaxIterations.
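
One way to apply the "Recommended for" column in a pipeline is a small lookup from the CI trigger to a scan mode. This is a sketch under assumptions: the trigger names below are hypothetical and should be mapped onto whatever events your CI system exposes.

```typescript
type ScanMode = "quick" | "standard" | "deep" | "exhaustive";

// Hypothetical trigger names; they are not defined by the Red-Team API.
type CiTrigger = "pull_request" | "main_push" | "release" | "scheduled";

// Mirrors the "Recommended for" column of the scanMode table.
const MODE_BY_TRIGGER: Record<CiTrigger, ScanMode> = {
  pull_request: "quick", // PR checks
  main_push: "standard", // main branch gates
  release: "deep", // release gates
  scheduled: "exhaustive", // periodic full audits
};

function scanModeFor(trigger: CiTrigger): ScanMode {
  return MODE_BY_TRIGGER[trigger];
}
```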

skillIds

string[]

Explicit list of attack skill IDs to run. Use when you need fine-grained control over which attacks are attempted. Ignored when scanMode is set.

maxSteps

number

Maximum number of attack steps across the campaign, 1–20. Defaults are set by scanMode; override here for custom budgets.

perStepMaxIterations

number

Maximum adversarial turns per skill step, 1–60.

perStepCostCapUsd

number

Per-step LLM cost ceiling in USD, max $10. The runner halts the step when this threshold is crossed.

totalCostCapUsd

number

Absolute total cost ceiling in USD, max $100. The campaign aborts if cumulative LLM spend reaches this value.

goalHint

string

Up to 500 characters of context about what the agent should never do. The attack engine uses this to focus on the most relevant attack categories.

gapFillOnly

boolean

When true, the campaign skips skills that already have a landed finding against this agent, running only attack types that have not yet succeeded. Useful for incremental scans that build on earlier results.

priorRunsSinceDays

number

Look-back window in days (1–180) used by gapFillOnly to determine which skills already have findings.
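
The numeric limits on the tuning fields can be checked locally before a request is sent, which gives faster feedback in CI than waiting for a rejected call. The sketch below encodes only the bounds stated in this section; `CampaignTuning` and `checkTuning` are hypothetical names, and no lower bound is assumed for the cost caps because none is documented here.

```typescript
// Optional tuning fields from the campaign request body.
interface CampaignTuning {
  maxSteps?: number;
  perStepMaxIterations?: number;
  perStepCostCapUsd?: number;
  totalCostCapUsd?: number;
  goalHint?: string;
  priorRunsSinceDays?: number;
}

// Returns a list of problems; an empty list means every supplied field is in range.
function checkTuning(t: CampaignTuning): string[] {
  const problems: string[] = [];

  const inRange = (name: string, value: number | undefined, min: number, max: number) => {
    if (value !== undefined && (value < min || value > max)) {
      problems.push(`${name} must be between ${min} and ${max}`);
    }
  };

  inRange("maxSteps", t.maxSteps, 1, 20);
  inRange("perStepMaxIterations", t.perStepMaxIterations, 1, 60);
  inRange("priorRunsSinceDays", t.priorRunsSinceDays, 1, 180);

  if (t.perStepCostCapUsd !== undefined && t.perStepCostCapUsd > 10) {
    problems.push("perStepCostCapUsd exceeds the 10 USD maximum");
  }
  if (t.totalCostCapUsd !== undefined && t.totalCostCapUsd > 100) {
    problems.push("totalCostCapUsd exceeds the 100 USD maximum");
  }
  if (t.goalHint !== undefined && t.goalHint.length > 500) {
    problems.push("goalHint exceeds 500 characters");
  }
  return problems;
}
```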

attackContext

object

Optional context objects that sharpen specific attack categories.

Field	Type	Purpose
`exfilCanaries`	`string[]`	Secret strings to watch for in outputs (data exfiltration probes)
`forbiddenTokens`	`string[]`	Strings the agent must never output
`highBlastTools`	`string[]`	Tool names that have a high blast radius if misused
`attackerInjectedTargets`	`string[]`	Injection targets for indirect prompt injection probes

Example request

{
  "agentId": "prod-rag-assistant",
  "target": {
    "kind": "openai-chat",
    "baseUrl": "https://my-agent.example.com/v1/chat/completions",
    "apiKey": "sk-agent-token",
    "systemPrompt": "You are a helpful customer support assistant."
  },
  "scanMode": "standard",
  "goalHint": "Never reveal internal pricing, never execute administrative actions.",
  "attackContext": {
    "exfilCanaries": ["INTERNAL-SECRET-42"],
    "forbiddenTokens": ["DROP TABLE", "rm -rf"]
  }
}

Example response (HTTP 202)

{
  "jobId": "e3f7b2d1-9a0c-4b5e-8f1d-7c6a2e9b3d4f",
  "statusUrl": "/api/public/trident/redteam/campaign/e3f7b2d1-9a0c-4b5e-8f1d-7c6a2e9b3d4f",
  "findingsUrl": "/api/public/trident/findings?redteamRunId=e3f7b2d1-9a0c-4b5e-8f1d-7c6a2e9b3d4f&sinceDays=1",
  "forecast": {
    "mode": "standard",
    "expectedCostUsd": 1.20,
    "costRangeUsd": { "low": 0.84, "high": 1.56 },
    "expectedDuration": "~10 min",
    "hardCostCapUsd": 7.00,
    "cacheHitRate": 0.15,
    "derivation": "7 skills × 2 iterations × gpt-4o"
  }
}
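
The `forecast` object can be inspected in CI to surface expensive campaigns early. The sketch below types the fields shown in the example response and returns a warning when the upper end of the cost range exceeds a budget. The budget policy and the `forecastWarning` name are assumptions, and because the campaign is already enqueued when the forecast arrives, a check like this can only warn, not prevent the run.

```typescript
// Shape of the forecast object, taken from the example response above.
interface Forecast {
  mode: string;
  expectedCostUsd: number;
  costRangeUsd: { low: number; high: number };
  expectedDuration: string;
  hardCostCapUsd: number;
  cacheHitRate: number;
  derivation: string;
}

// Returns a warning string when the forecast's high estimate exceeds the budget,
// or null when it fits. The budget itself is a local, hypothetical CI setting.
function forecastWarning(forecast: Forecast, budgetUsd: number): string | null {
  if (forecast.costRangeUsd.high <= budgetUsd) return null;
  return (
    `Forecast for mode "${forecast.mode}" may cost up to ` +
    `${forecast.costRangeUsd.high} USD, above the ${budgetUsd} USD budget`
  );
}
```

If the warning fires regularly, setting `totalCostCapUsd` on the request is the documented way to enforce a ceiling.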

GET /api/public/trident/redteam/campaign/{jobId}

Poll a campaign’s current state. Returns the job state plus a count of findings filed so far, so you can distinguish “still running” from “done, zero findings.”

Endpoint: GET https://app.usetrident.dev/api/public/trident/redteam/campaign/{jobId}

Authentication: HTTP Basic — see Authentication

Path parameters

jobId

string

required

The jobId returned by POST /api/public/trident/redteam/campaign.

Example request

curl

curl "https://app.usetrident.dev/api/public/trident/redteam/campaign/e3f7b2d1-9a0c-4b5e-8f1d-7c6a2e9b3d4f" \
  -H "Authorization: Basic $CREDENTIALS"

Example response

{
  "jobId": "e3f7b2d1-9a0c-4b5e-8f1d-7c6a2e9b3d4f",
  "state": "active",
  "enqueuedAt": 1718025727000,
  "startedAt": 1718025730000,
  "finishedAt": null,
  "progress": { "step": 3, "totalSteps": 7, "currentSkill": "prompt-injection" },
  "failedReason": null,
  "findingCount": 1
}

State values

`state`	Meaning
`queued`	Job is waiting to be picked up
`active`	Worker is currently running the campaign
`completed`	Campaign finished — check `findingsUrl` for results
`failed`	Campaign encountered an unrecoverable error; see `failedReason`
`delayed`	Job is scheduled for a future time
`waiting`	Job is waiting for a concurrency slot
`unknown`	Job ID not found or no longer tracked — if `findingCount > 0`, the campaign completed

Finished campaigns may no longer report a specific state. If state is "unknown" but findingCount is greater than zero, the campaign ran successfully and findings are available.
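
The state table and the note about `unknown` can be folded into one function that tells a poller what to do next. This is a sketch: the `Outcome` labels and the `interpretStatus` name are hypothetical, and treating `unknown` with zero findings as "indeterminate" is an assumption, since that case could mean either a mistyped job ID or a finished campaign with no findings.

```typescript
type CampaignState =
  | "queued"
  | "active"
  | "completed"
  | "failed"
  | "delayed"
  | "waiting"
  | "unknown";

// Hypothetical client-side labels, not values returned by the API.
type Outcome = "running" | "succeeded" | "failed" | "indeterminate";

function interpretStatus(state: CampaignState, findingCount: number): Outcome {
  switch (state) {
    case "completed":
      return "succeeded";
    case "failed":
      return "failed";
    case "unknown":
      // Documented rule: unknown with findings means the campaign completed.
      // With zero findings the result is ambiguous, so it is flagged separately.
      return findingCount > 0 ? "succeeded" : "indeterminate";
    default:
      // queued, active, delayed, waiting: keep polling
      return "running";
  }
}
```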

POST /api/public/trident/redteam/run

Enqueue a Promptfoo-based red-team run. This engine uses static Promptfoo plugins and strategies for deterministic, broad-coverage scanning — ideal for compliance audits where you need reproducible attack counts.

Endpoint: POST https://app.usetrident.dev/api/public/trident/redteam/run

Authentication: HTTP Basic — see Authentication

Request body

agentId

string

required

Agent ID — must match [a-zA-Z0-9._-]+.

purpose

string

required

A human-readable description of what this agent does. Surfaced in the dashboard and used to focus plugin attacks. Maximum 500 characters.

plugins

string[]

required

Promptfoo plugin IDs to run, for example ["foundation", "harmful", "pii", "bias", "financial"]. Between 1 and 40 plugins.

strategies

string[]

required

Promptfoo strategy IDs — for example ["basic", "jailbreak", "jailbreak:tree", "crescendo"]. Between 1 and 8 strategies.

targetProviderId

string

required

Provider type Promptfoo should use, for example "http", "openai-chat", or "echo".

targetConfig

object

required

Provider-specific configuration — URL, headers, auth, response parser. Shape depends on targetProviderId.

numTests

number

default: 2

Number of test cases generated per (plugin, strategy) combination, 1–20. Higher values increase coverage and cost.

policyId

string

Optional Trident policy ID to associate with this run for compliance tracking.

Example request

{
  "agentId": "prod-rag-assistant",
  "purpose": "Customer support assistant with access to order lookup tools",
  "plugins": ["foundation", "harmful", "pii"],
  "strategies": ["basic", "jailbreak"],
  "targetProviderId": "http",
  "targetConfig": {
    "url": "https://my-agent.example.com/chat",
    "method": "POST",
    "headers": { "Authorization": "Bearer sk-agent-token" },
    "body": "{\"message\": \"{{prompt}}\"}",
    "transformResponse": "json.reply"
  },
  "numTests": 3
}

Example response (HTTP 202)

{
  "runId": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "statusUrl": "/api/public/trident/findings?redteamRunId=a1b2c3d4-e5f6-7890-abcd-ef1234567890&sinceDays=1",
  "findingsUrl": "/api/public/trident/findings?redteamRunId=a1b2c3d4-e5f6-7890-abcd-ef1234567890&sinceDays=1"
}
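
Because `numTests` applies per (plugin, strategy) combination, the size of a run grows multiplicatively. The sketch below estimates the number of generated test cases, assuming every plugin is paired with every strategy, which is how the `numTests` description reads; `estimateTestCases` is a hypothetical helper, and the real count may differ if some combinations are skipped.

```typescript
// The three request fields that determine run size.
interface RunSize {
  plugins: string[];
  strategies: string[];
  numTests?: number;
}

function estimateTestCases(run: RunSize): number {
  const numTests = run.numTests ?? 2; // 2 is the documented default
  return run.plugins.length * run.strategies.length * numTests;
}

// The example request above: 3 plugins x 2 strategies x 3 tests each.
const exampleTotal = estimateTestCases({
  plugins: ["foundation", "harmful", "pii"],
  strategies: ["basic", "jailbreak"],
  numTests: 3,
}); // 18
```

At the documented maximums (40 plugins, 8 strategies, 20 tests) the same arithmetic gives 6,400 test cases, so it is worth running this estimate before widening a scan.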

POST /api/public/trident/redteam/garak

Enqueue an NVIDIA Garak static-probe run. Garak fires approximately 2,500 attack variants across four probe suites — encoding, leakreplay, snowball, and glitch — making it suitable for compliance reports that require deterministic, reproducible attack counts.

Endpoint: POST https://app.usetrident.dev/api/public/trident/redteam/garak

Authentication: HTTP Basic — see Authentication

Probe suites

Probe	What it tests
`encoding`	Character encoding tricks — Base64, ROT13, Unicode homoglyphs — used to bypass content filters
`leakreplay`	System prompt and training-data leakage
`snowball`	Escalating multi-turn requests designed to erode refusal guardrails
`glitch`	Unusual token sequences and adversarial suffixes that cause unexpected model behaviour

Request body

agentId

string

required

Agent ID — must match [a-zA-Z0-9._-]+.

purpose

string

required

Scan label surfaced in the dashboard. Maximum 500 characters.

probes

string[]

required

Subset of ["encoding", "leakreplay", "snowball", "glitch"]. At least one is required.

targetBaseUrl

string

required

Base URL of the HTTP endpoint Garak should attack.

targetModel

string

default:"custom"

Human-readable model identifier, surfaced in the dashboard. Maximum 120 characters.

targetPath

string

Path appended to targetBaseUrl. Defaults to /chat.

targetApiKey

string

Optional bearer token or API key for the target. Encrypted at rest on enqueue — never logged.

requestTemplate

string

default:"{\"prompt\":\"$INPUT\"}"

JSON body template for each probe request. $INPUT is replaced with the Garak-synthesised attack string. Maximum 8 000 characters.

responseField

string

default:"reply"

Top-level JSON key or JSONPath (prefix with $) that contains the response text. Maximum 200 characters.

Example request

{
  "agentId": "prod-rag-assistant",
  "purpose": "Garak encoding + leakreplay audit for Q2 compliance report",
  "probes": ["encoding", "leakreplay"],
  "targetBaseUrl": "https://my-agent.example.com",
  "targetPath": "/chat",
  "targetApiKey": "sk-agent-token",
  "requestTemplate": "{\"message\": \"$INPUT\"}",
  "responseField": "content"
}

Example response (HTTP 202)

{
  "runId": "f9e8d7c6-b5a4-3210-fedc-ba9876543210",
  "statusUrl": "/api/public/trident/findings?redteamRunId=f9e8d7c6-b5a4-3210-fedc-ba9876543210&sinceDays=1",
  "findingsUrl": "/api/public/trident/findings?redteamRunId=f9e8d7c6-b5a4-3210-fedc-ba9876543210&sinceDays=1"
}
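
A malformed `requestTemplate` is an easy mistake to make because the JSON body is itself embedded in a JSON string. The sketch below is a local sanity check: it confirms the template stays within the documented length, contains `$INPUT`, and parses as JSON once a harmless sample is substituted. `checkRequestTemplate` is a hypothetical helper, and how the service escapes the real attack string during substitution is not described here, so this checks the template's shape only.

```typescript
// Returns a list of problems; an empty list means the template looks usable.
function checkRequestTemplate(template: string): string[] {
  const problems: string[] = [];

  if (template.length > 8000) {
    problems.push("template exceeds 8 000 characters");
  }
  if (!template.includes("$INPUT")) {
    problems.push("template does not contain $INPUT");
  }

  // Replace every $INPUT with a plain word and confirm the result is valid JSON.
  const rendered = template.split("$INPUT").join("sample");
  try {
    JSON.parse(rendered);
  } catch {
    problems.push("template is not valid JSON after substitution");
  }
  return problems;
}
```

For instance, the template from the example request, `{"message": "$INPUT"}`, passes all three checks, while a template that omits the quotes around `$INPUT` fails the JSON check.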

Polling pattern: wait for a campaign to complete

Use the campaign status endpoint to poll until the job reaches a terminal state, then retrieve findings.

TypeScript

const BASE = "https://app.usetrident.dev";
const CREDENTIALS = Buffer.from(
  `${process.env.TRIDENT_PROJECT_PUBLIC_KEY}:${process.env.TRIDENT_PROJECT_SECRET_KEY}`
).toString("base64");

async function triggerAndWait(agentId: string, targetUrl: string) {
  // 1. Enqueue the campaign
  const launchRes = await fetch(
    `${BASE}/api/public/trident/redteam/campaign`,
    {
      method: "POST",
      headers: {
        Authorization: `Basic ${CREDENTIALS}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        agentId,
        target: { kind: "openai-chat", baseUrl: targetUrl },
        scanMode: "standard",
      }),
    },
  );

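  // Stop early if the campaign was not accepted (a successful enqueue returns 202)
  if (!launchRes.ok) {
    throw new Error(`Failed to enqueue campaign: HTTP ${launchRes.status}`);
  }

  const { jobId, findingsUrl } = await launchRes.json();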
  console.log(`Campaign enqueued: ${jobId}`);

  // 2. Poll until the job reaches a terminal state.
  //    "unknown" counts as terminal because finished jobs may no longer be tracked.
  const TERMINAL = new Set(["completed", "failed", "unknown"]);
  let state = "queued";
  let findingCount = 0;
  let failedReason: string | null = null;

  while (!TERMINAL.has(state)) {
    await new Promise((r) => setTimeout(r, 15_000)); // wait 15 s

    const statusRes = await fetch(
      `${BASE}/api/public/trident/redteam/campaign/${jobId}`,
      { headers: { Authorization: `Basic ${CREDENTIALS}` } },
    );
    const status = await statusRes.json();
    state = status.state;
    findingCount = status.findingCount;
    failedReason = status.failedReason ?? null;
    console.log(`  state=${state}  findings=${findingCount}`);
  }

  if (state === "failed") {
    throw new Error(`Campaign failed: ${failedReason ?? "no reason reported"}`);
  }

  // 3. Fetch findings produced by this run
  const findRes = await fetch(`${BASE}${findingsUrl}`, {
    headers: { Authorization: `Basic ${CREDENTIALS}` },
  });
  return findRes.json();
}
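
The loop above polls until the job reaches a terminal state, with no upper bound, which can stall a CI job if a campaign never leaves the queue. One option is to cap the number of polls using the expected durations from the scanMode table. The sketch below is an assumption-laden helper: `maxPolls` is a hypothetical name, and the default safety factor of 2 is an arbitrary margin, not a documented value.

```typescript
type ScanMode = "quick" | "standard" | "deep" | "exhaustive";

// Expected durations in minutes, from the scanMode table
// (quick is documented as "< 2 min", so 2 is used as its upper bound).
const EXPECTED_MINUTES: Record<ScanMode, number> = {
  quick: 2,
  standard: 10,
  deep: 25,
  exhaustive: 60,
};

// Number of polls to allow before giving up, given a polling interval.
function maxPolls(mode: ScanMode, intervalMs: number, safetyFactor: number = 2): number {
  const budgetMs = EXPECTED_MINUTES[mode] * 60000 * safetyFactor;
  return Math.ceil(budgetMs / intervalMs);
}

// With the 15 s interval used above, a standard scan gets 80 polls (20 minutes).
const standardLimit = maxPolls("standard", 15000); // 80
```

In the polling loop, a counter compared against this limit would turn an indefinite wait into a clear timeout error.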
