We break the deepfake detectors your fraud defense relies on.
Margen is the independent red team for your deepfake defense. We attack the detection you have bought or built with the fraud that is actually circulating, then tell you where it holds, where it fails, and what to do about it.
Best fit for organizations holding large amounts of biometric data.
Detector report card
vendor-x v4.2
- Unseen generators: Fails
- Platform-realistic conditions: Degrades
- Per-group fairness: Gap found
- Clean benchmark: Passes
Illustrative. Real engagements report confidence intervals and the exact failure path.
Results reported against these external standards, not certified against them
- ISO/IEC 30107-3
- ISO/IEC 19795
- ISO/IEC 24027
- NIST AI RMF
- EU AI Act
At scale, no one is checking by hand. The system is.
High-volume identity flows such as onboarding, verification, and remote interviews move far more traffic than any team can review by hand. And the best fakes now slip past trained reviewers anyway. The human layer cannot hold the line.
So your detection system is the fraud-prevention layer. These systems run several checks at once, and a weak spot at any one of them lets the fraud through.
That technical control is what we evaluate. We find where it fails, before an attacker does.
The core challenge
Why a strong-looking detector still lets fraud through.
Trained on yesterday
Detectors learn from the generators that existed when they were built. New models ship every month, and the detector has never seen them.
Tested in a lab
Vendor numbers come from pristine images. Real fraud arrives compressed, resized, and re-encoded by the platforms it passes through.
Measured on the average
A strong overall score can hide groups the detector barely catches. The average looks fine while a whole subgroup is an open lane.
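A minimal sketch of how an average can mask a weak subgroup. The group names and counts below are invented for illustration and are not Margen results; the helper `detection_rates` is hypothetical.

```python
from collections import defaultdict

# Hypothetical outcomes for fake samples only: (group, was_the_fake_detected).
# All counts are made up for illustration.
samples = (
    [("group_a", True)] * 480 + [("group_a", False)] * 20
    + [("group_b", True)] * 470 + [("group_b", False)] * 30
    + [("group_c", True)] * 20 + [("group_c", False)] * 30
)

def detection_rates(rows):
    """Return (overall_rate, {group: rate}) from (group, detected) pairs."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for group, detected in rows:
        totals[group] += 1
        hits[group] += int(detected)
    overall = sum(hits.values()) / sum(totals.values())
    per_group = {g: hits[g] / totals[g] for g in totals}
    return overall, per_group

overall, per_group = detection_rates(samples)
print(f"overall: {overall:.3f}")
for group, rate in sorted(per_group.items()):
    print(f"{group}: {rate:.3f}")
```

With these invented counts the overall detection rate is about 0.92, while `group_c` is caught only 40% of the time. The small group barely moves the average, which is why the breakdown has to be reported separately.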
The problem, measured
The detectors that scored perfect collapsed the hardest.
Detection score, where 1.00 is perfect and 0.50 is a coin flip.
Clean lab test
1.00 score
Two open-source detectors that hit a perfect score on a clean test.
Real conditions
0.34 score
The same two, re-tested against fresh attacks and the compression real platforms apply. Six other detectors slipped too, but far less.
Source: Margen open-source detector benchmark · 14 detectors
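The scale described above (1.00 perfect, 0.50 a coin flip) behaves like ROC AUC, although the page does not name the metric. Assuming an AUC-like score, the sketch below shows how a value under 0.50 can arise: the detector ranks fakes below genuine samples more often than not. The scores are hypothetical, and `roc_auc` is an illustrative helper, not Margen's pipeline.

```python
def roc_auc(fake_scores, real_scores):
    """Probability that a random fake outscores a random real sample.

    Higher score means "more likely fake". Ties count as half a win.
    """
    wins = 0.0
    for f in fake_scores:
        for r in real_scores:
            if f > r:
                wins += 1.0
            elif f == r:
                wins += 0.5
    return wins / (len(fake_scores) * len(real_scores))

# Hypothetical detector outputs on genuine samples.
reals = [0.1, 0.2, 0.3]

# Clean test: every fake scores above every real sample.
clean_fakes = [0.9, 0.8, 0.95]
clean_auc = roc_auc(clean_fakes, reals)

# Unseen, recompressed fakes: most now score below the real samples.
shifted_fakes = [0.15, 0.05, 0.4]
shifted_auc = roc_auc(shifted_fakes, reals)

print(f"clean: {clean_auc:.2f}, shifted: {shifted_auc:.2f}")
```

Under this reading, a score below 0.50 means the detector's ordering is worse than guessing on those samples, not merely uninformative.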
What we test
We test against attacks your detector has never seen.
Fraudsters do not use last year's models. We evaluate against generators held out of your training and add new ones as threats emerge, so the score reflects tomorrow's attack.
Two ways to red-team a detector.
Both engagements run on the same dataset and the same methodology. They differ in who initiates them, who owns the customer relationship, and what the report says on the cover.
01
You submit a detector. We red-team it.
An independent red team against your own model. We attack your detector with the fraud that is actually circulating, under the conditions a real platform imposes, then return a verdict: pass, conditional, or fail. Where it breaks, you get the exact recipe that beat it, so you can fix it.
- Initiated by: The detection vendor
- Duration: 4 to 6 weeks, fixed scope
- Deliverable: Red-team report with a verdict, per-group results, and the recipes that broke it
- Used for: Procurement evidence, marketing-claim validation, pre-release QA
02
Your red team brings us in for the technical layer.
A partnership with red-team and security-awareness firms. The partner runs the engagement and keeps the customer; we add an independent review of the detection technology in scope, so the end customer leaves with one joint report covering both the human and the technical layer.
- Initiated by: A red team or security-awareness partner
- Duration: Matches the host engagement
- Deliverable: Single joint report covering human and technical layers
- Used for: Enterprise security audits, joint customer engagements
Five teams, one measurement layer.
“The third-party red team that helps you close the deal.”
Your buyers ask for proof that goes beyond your own benchmark. We are the independent red team that supplies it: an evaluation grounded in a corpus your team did not assemble and a method your team did not design, so the number holds up in the room where the deal is won.
What we measure for them
- Per-group performance. Demographic and platform breakdowns of every score.
- Bypass recipes. Every failure annotated with the recipe that surfaced it.
- Pre-release QA. A second pair of eyes before you ship.
Three commitments the measurement layer cannot exist without.
01 / Coverage
Coverage that tracks the threat.
Our evaluation corpus expands toward the frontier generators adversaries are adopting, so the benchmark keeps pace with the attack.
02 / Method
Reproducible methodology.
Every claim is backed by a dataset, a documented pipeline that recompresses media the way platforms do, and a pre-registered statistical methodology. Corpus access is available on request, and anyone who has it can re-run the results independently.
03 / Fit
Context-fit evaluation.
We tailor the assessment to the threat the enterprise actually faces. The methodology is rigorous within each context, not generic across them.
Not an internal team. Not a generalist pentest.
A detector is a specialized control, and evaluating it takes a specialized, independent adversary. Here is how an engagement compares to the alternatives most teams reach for first.
| Capability | Margen | Internal red team | Generalist pentest |
|---|---|---|---|
| Independent of the vendor under test | |||
| Deepfake-specific attack corpus | |||
| Per-group fairness breakdown | |||
| Platform-realistic conditions | |||
| Pre-registered, reproducible method | |||
| Hands back the breaking recipe | |||
- 14 detectors evaluated
- 12 demographic groups covered per detector
- 1.00 to 0.34: top detectors, from perfect to below a coin flip
- 0 detectors we sell, by design
Near-perfect on paper, near-random under attack.
99% to a coin flip
Detectors that scored near-perfect on a clean benchmark dropped to barely better than a coin flip once we sent attacks they had not seen, re-compressed the way a real platform would. The pitch number was not the production number.
Read the research
Finding
Hidden blind spots
Strong overall scores concealed groups the same detector barely caught. For a buyer that is both an open fraud lane and a fairness liability, invisible until someone stratifies the results.
Read the research
Finding
Stale by the quarter
Detectors that lead today degrade as new generators ship. A validation from six months ago does not tell you whether you are covered against the attacks circulating now.
Read the research
Synthetic
We build the impersonations your detector has to catch.
A detector is only proven on real and fake side by side. We build the hard half: frontier-quality fakes, including impersonations made to pass as a real person. Your detector is scored on both, so the number reflects whether it can still tell a genuine face from the attack built to wear it. The faces shown here are the synthetic attack side.
Find your blind spot before someone else does.
Submit a model for evaluation, or add the detection layer to a red-team engagement. We return a per-group report card showing both kinds of mistake (fakes that got through and genuine users wrongly flagged) with the margin of error on each, and where a detector fails, the recipe that broke it. For buyers, we can point you to the detection that actually holds for your use case.
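As a rough illustration of what a margin of error on each kind of mistake can look like, the sketch below computes a Wilson score interval for two error rates in one group. The choice of the Wilson interval is an assumption made here for illustration; the page does not say which interval Margen reports, and the counts are hypothetical.

```python
import math

def wilson_interval(errors, n, z=1.96):
    """Wilson score interval for the rate errors/n (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = errors / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)

# Hypothetical counts for a single group.
missed_fakes, total_fakes = 30, 50    # fakes the detector let through
false_alarms, total_reals = 4, 200    # genuine samples wrongly flagged

for label, k, n in [
    ("missed fakes", missed_fakes, total_fakes),
    ("false alarms", false_alarms, total_reals),
]:
    low, high = wilson_interval(k, n)
    print(f"{label}: {k / n:.3f} (95% interval {low:.3f} to {high:.3f})")
```

Small groups produce wide intervals, which is the practical reason to report the interval next to each per-group rate instead of the point estimate alone.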