New: The detectors that scored perfect collapsed the hardest under attack.
Services

Put an independent number on your detector.

Whether you build detection or buy it, the question is the same: what does it actually catch under adversarial conditions? We answer it two ways, on the same dataset and the same methodology.

What we offer

Two ways to red-team a detector.

Both engagements run on the same dataset and the same methodology. They differ in who initiates them, who owns the customer relationship, and what the report says on the cover.

01

Evaluation

You submit a detector. We red-team it.

An independent red team against your own model. We attack your detector with the fraud that is actually circulating, under the conditions a real platform imposes, then return a verdict: pass, conditional, or fail. Where it breaks, you get the exact recipe that beat it, so you can fix it.

Initiated by: The detection vendor
Duration: 4 to 6 weeks, fixed scope
Deliverable: Red-team report with a verdict, per-group results, and the recipes that broke it
Used for: Procurement evidence, marketing-claim validation, pre-release QA
Request an evaluation

02

Co-delivered

Your red team brings us in for the technical layer.

A partnership with red-team and security-awareness firms. The partner runs the engagement and keeps the customer; we add an independent review of the detection technology in scope, so the end customer leaves with one joint report covering both the human and the technical layer.

Initiated by: A red team or security-awareness partner
Duration: Matches the host engagement
Deliverable: Single joint report covering human and technical layers
Used for: Enterprise security audits, joint customer engagements
Become a partner

Versus the alternatives

Why not just use an internal team or a generalist pentest?

Both have their place. Neither is an independent adversary with a deepfake-specific corpus and per-group reporting. That gap is what we fill.

Capability | Margen | Internal red team | Generalist pentest

  • Independent of the vendor under test
  • Deepfake-specific attack corpus
  • Per-group fairness breakdown
  • Platform-realistic conditions
  • Pre-registered, reproducible method
  • Hands back the breaking recipe

Legend: Yes / Partial / No

Where to start

From a free benchmark to a standing red team.

Most buyers start with the public benchmark, then commission an evaluation of the detector they actually run. The white-glove tier is for teams that need ongoing, tailored coverage.

Public benchmark

Free

See how the market's detectors hold up.

  • Our published results on leading open-source detectors.
  • Per-group breakdowns and platform-condition drops.
  • A starting point for narrowing a shortlist.
Read the research

Most requested

Evaluation

Per engagement

An independent grade for one detector.

  • Your model run through our full benchmark and attack suite.
  • Pass, conditional, or fail across groups and conditions.
  • Where it fails, the exact recipe that broke it, so you can fix it.
Request an evaluation

White-glove red team

Custom

Ongoing, tailored adversarial coverage.

  • A sequestered attack set built for your exact threat.
  • Co-delivered alongside your red-team or security partner.
  • Repeat engagements as new generators emerge.
Talk to us

Tell us which mode fits.

Submitting a model for evaluation, or extending a red-team engagement to the detection layer? Either way, start here.