AI red teaming & evaluation

Break your AI before attackersBreak your AI before attackers

TrustTest is a red-teaming and evaluation framework that attacks your LLMs and agents with state-of-the-art adversarial techniques, then grades how they hold up

Get a demo

Read the docs

TrustTest red teaming and evaluation flow: any target through attack and evaluation to verdict

Trusted by the world’s leading companies

The problem

AI isn't deterministic. Testing it the old way doesn't work.

The prompt is the new attack surface, and it shifts constantly. Traditional QA and manual red teaming can't keep pace.

Non-deterministic behaviour

Outputs change with phrasing, context and model version — pass once ≠ safe forever.

Language is the exploit

Prompt injections and data leaks bypass infrastructure entirely.

Manual red teaming doesn't scale

A human tester runs a finite set of probes. TrustTest runs thousands, continuously.

The problem

AI isn't deterministic. Testing it the old way doesn't work.

The prompt is the new attack surface, and it shifts constantly. Traditional QA and manual red teaming can't keep pace.

Non-deterministic behaviour

Outputs change with phrasing, context and model version — pass once ≠ safe forever.

Language is the exploit

Prompt injections and data leaks bypass infrastructure entirely.

Manual red teaming doesn't scale

A human tester runs a finite set of probes. TrustTest runs thousands, continuously.

How it works

From a target to graded evidence, in one loopFrom a target to graded evidence, in one loop

Connect

Point TrustTest at any target through one unified interface — your model, an agent, or an HTTP API.

Generate

Tests are generated automatically across scenarios — no hand-written prompt suites to maintain.

Attack

State-of-the-art algorithmic probes run adversarial attacks to test robustness and safety.

Evaluate

Versatile evaluators score every response into a traceable verdict — locally or on the platform.

Connect

Point TrustTest at any target through one unified interface — your model, an agent, or an HTTP API.

Generate

Tests are generated automatically across scenarios — no hand-written prompt suites to maintain.

Attack

State-of-the-art algorithmic probes run adversarial attacks to test robustness and safety.

Evaluate

Versatile evaluators score every response into a traceable verdict — locally or on the platform.

Developer-native

A framework, not a black boxA framework, not a black box

TrustTest is code-first. Define a target, pick your languages, and run a full red-teaming pass in a few lines — version it in git, drop it into CI, and gate releases on the results. No portal lock-in, no waiting on a vendor to schedule a test.

TrustTest red_team.py code sample with Python SDK, CI/CD, local or platform and any provider

Capabilities

Everything you need to stress-test an AI systemEverything you need to stress-test an AI system

Test any LLM

One unified interface for your own model or any third-party API — swap targets without rewriting tests.

Automatic test generation

Generate tests across a wide range of scenarios and edge cases — coverage that doesn't depend on hand-written suites.

SOTA red-teaming attacks

Built-in, state-of-the-art algorithmic attacks probe model robustness and safety the way real adversaries do.

Versatile probes & evaluators

Evaluate behaviour from every angle — a deep library of probes and evaluators, extensible with your own.

Red teaming + functional evals

Adversarial attacks and functional quality checks in one framework — cleanly separating cases, evaluators and scenarios.

Full traceability

Track, record and analyse every test, run, evaluator and scenario — locally or via the integrated platform.

Capabilities

Everything you need to stress-test an AI systemEverything you need to stress-test an AI system

Test any LLM

One unified interface for your own model or any third-party API — swap targets without rewriting tests.

Automatic test generation

Generate tests across a wide range of scenarios and edge cases — coverage that doesn't depend on hand-written suites.

SOTA red-teaming attacks

Built-in, state-of-the-art algorithmic attacks probe model robustness and safety the way real adversaries do.

Versatile probes & evaluators

Evaluate behaviour from every angle — a deep library of probes and evaluators, extensible with your own.

Red teaming + functional evals

Adversarial attacks and functional quality checks in one framework — cleanly separating cases, evaluators and scenarios.

Full traceability

Track, record and analyse every test, run, evaluator and scenario — locally or via the integrated platform.

Catalog

One taxonomy of attacks and failure modesOne taxonomy of attacks and failure modes

Prompt injection

Indirect injection

PII & data leakage

System-prompt extraction

Toxicity

Hallucination

Bias & fairness

Off-topic / misuse

Agent & tool abuse

Custom probes

Prompt injection

Indirect injection

PII & data leakage

System-prompt extraction

Toxicity

Hallucination

Bias & fairness

Off-topic / misuse

Agent & tool abuse

Custom probes

Lifecycle

Test on every change — not once a quarterTest on every change — not once a quarter

TrustTest meets your AI where it already lives. Run a pass from your terminal while you build, wire it into CI/CD so a failed attack blocks the merge, or schedule continuous runs on the platform. Every surface uses the same tests, evaluators and verdicts.

TrustTest lifecycle: build, test every change via local CLI, CI/CD gate or platform, ship and protect with TrustGuard

Traceability

Every run, recorded and comparableEvery run, recorded and comparable

Each pass is captured in full: which tests ran, which attacks landed, how every evaluator scored, and how results moved against the last run. Defensible evidence for stakeholders and auditors.

TrustTest run report dashboard: tests run, findings, attack success rate, evaluators, findings by attack type and run-over-run trend

Customers

Trusted by security leaders

Juan Manuel Sanchez-Quinza

With NeuralTrust we stress-tested our chatbot with GenAI ‘SOFia,’ validating a safe go-live that meets financial-sector security and regulatory standards.

Director of Transformation, ABANCA

Customers

Trusted by security leaders

Juan Manuel Sanchez-Quinza

With NeuralTrust we stress-tested our chatbot with GenAI ‘SOFia,’ validating a safe go-live that meets financial-sector security and regulatory standards.

Director of Transformation, ABANCA

Benchmarks