Continuous measurement of the detection layer inside your liveness flow.
Your detection layer is one decision in a longer pipeline, but it is the decision attackers will probe hardest. We test it against the attacks adversaries are adopting and the compression every upload in your funnel goes through.
At onboarding scale, no one reviews the selfie. The system does.
Identity flows move far more traffic than any team can check by hand, so the detection layer is the only control standing in front of the fraud. Here is where it has to hold.
Millions of sign-ups, zero reviewers.
At onboarding scale, no analyst reviews each selfie or liveness capture. The detection layer is the only thing standing between a synthetic face and a funded account. When it misses, the fraudulent account is already open.
Synthetic
The faces attacking your funnel look exactly this real.
Two kinds of fake hit an identity funnel: a synthetic identity with no real person behind it, opening a fraudulent account, and an impersonation that wears a real customer's face to take one over. We build both at frontier quality and score your detection layer on genuine faces and these fakes together, so the number reflects what your funnel will actually meet.
The problem, measured
The detectors that scored perfect collapsed the hardest.
Detection score, where 1.00 is perfect and 0.50 is a coin flip.
Clean lab test
1.00 score
Two open-source detectors that hit a perfect score on a clean test.
Real conditions
0.34 score
The same two, re-tested against fresh attacks and the compression real platforms apply. Six other detectors slipped too, but far less.
Source: Margen open-source detector benchmark · 14 detectors
Platform compression
1.00
perfect score, clean test
What a detector scores on clean, original footage.
0.34
below a coin flip
On the same footage after the compression every upload goes through. A perfect score is 1.00; a coin flip is 0.50.
From our open benchmark: detectors that scored a perfect 1.00 on clean footage dropped to 0.34, below a coin flip, once the video was compressed the way platforms compress every upload.
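One way to read a score where 1.00 is perfect and 0.50 is a coin flip is as a ranking score: the chance that a randomly chosen fake is rated more suspicious than a randomly chosen genuine face. The sketch below assumes that interpretation, which this page does not state. The function name and every number in it are made up for illustration; none of it is benchmark data.

```python
def ranking_score(genuine_scores, fake_scores):
    """Fraction of (fake, genuine) pairs where the fake gets the higher
    'looks fake' score. Ties count as half. 1.0 is perfect, 0.5 is chance."""
    if not genuine_scores or not fake_scores:
        raise ValueError("need at least one genuine and one fake score")
    wins = 0.0
    for fake in fake_scores:
        for genuine in genuine_scores:
            if fake > genuine:
                wins += 1.0
            elif fake == genuine:
                wins += 0.5
    return wins / (len(fake_scores) * len(genuine_scores))


# Hypothetical detector outputs (higher = more likely fake).
clean_genuine = [0.05, 0.10, 0.20]
clean_fake = [0.80, 0.90, 0.95]

# Hypothetical outputs for the same clips after re-encoding: the detector
# now rates most genuine clips as more suspicious than the fakes.
compressed_genuine = [0.60, 0.70, 0.40]
compressed_fake = [0.30, 0.50, 0.65]

clean = ranking_score(clean_genuine, clean_fake)
compressed = ranking_score(compressed_genuine, compressed_fake)
print(f"clean: {clean:.2f}, compressed: {compressed:.2f}")
```

Under this reading, a score below 0.50 means the detector is not merely guessing: on the degraded footage it tends to rank genuine faces as more suspicious than fakes.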
Read the benchmark
Re-tested every quarter, against what is new.
Attack tools and the way apps compress video keep changing. We re-run the evaluation every quarter, so your result reflects what is circulating now, not what was true the day you signed.
New attack tools
Each quarter we add the latest face-swap and generation tools attackers have picked up.
Your real upload formats
We score on the compression and capture paths your live traffic actually carries, not a lab sample.
The groups that were weakest
We re-check where detection struggled last time, so a fix in one place does not open a gap in another.
Each cycle ends with a short report: what changed, what held, and what slipped.
Everything procurement asks for is in the box.
Per-group performance
Results broken out by demographic and platform, with the worst case shown.
Both kinds of mistake
Fakes let through and real users wrongly blocked, with the margin of error on each.
Bypass recipes
Every failure annotated with the recipe that surfaced it.
Methodology, documented
Public, versioned, and signed by the lead researcher.
Platform-realistic conditions
Scored under the re-encoding your deployment actually applies.
Audit-ready exhibits
Findings packaged to hand to a board or a regulator.
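For the "Both kinds of mistake" item above, a rough sketch of how two error rates and their margins of error could be computed. It assumes a 95% Wilson score interval, which is a common choice but is not stated to be the method used here. The function names and the counts are hypothetical.

```python
import math


def wilson_interval(errors, total, z=1.96):
    """Wilson score interval for an error rate (z=1.96 is roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = errors / total
    denom = 1.0 + z * z / total
    centre = (p + z * z / (2.0 * total)) / denom
    half = z * math.sqrt(p * (1.0 - p) / total + z * z / (4.0 * total * total)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def error_report(fakes_let_through, total_fakes, real_blocked, total_real):
    """Both kinds of mistake, each with its own interval."""
    return {
        "fakes_let_through": (fakes_let_through / total_fakes,
                              wilson_interval(fakes_let_through, total_fakes)),
        "real_users_blocked": (real_blocked / total_real,
                               wilson_interval(real_blocked, total_real)),
    }


# Hypothetical counts, for illustration only.
report = error_report(fakes_let_through=12, total_fakes=400,
                      real_blocked=9, total_real=1500)
for name, (rate, (low, high)) in report.items():
    print(f"{name}: {rate:.3%} (95% interval {low:.3%} to {high:.3%})")
```

Reporting the two rates separately matters because a single accuracy figure can hide a trade: a detector can block almost every fake simply by also blocking many real users.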
One measurement layer, every side of it.
Whichever side you are on, the same arms race runs underneath. See how we serve the rest of the market, or go straight to scoping your own evaluation.
See where the detection in your funnel stands.
Tell us the attacks and conditions your funnel carries. We will scope an evaluation against the fraud actually hitting it.