New · The detectors that posted perfect scores collapsed the hardest under attack.
Buyer · High-volume detection, real candidates, real funnels.

Detection accuracy measured against the attacks your funnel actually sees.

Synthetic candidates are already interviewing their way into real jobs. We evaluate your detection layer against the personas, accents, and lighting conditions that real interview traffic carries.

Where the volume removes the human check

A synthetic candidate only has to clear one live interview.

Hiring funnels move thousands of candidates, and no recruiter verifies that each face on the call is real. The detection layer is the check that has to hold across every one of these flows.

A real-time face swap on the call.

A proxy candidate runs a real-time face swap during the live interview, so the person on screen is not the person who will take the job. With no recruiter verifying each face at interview volume, the detection layer is the only check.

real-time face swap · proxy candidate · interview fraud

What a pooled average hides

One strong number can hide a group on which the detection system does worse than guessing.

A detection system can post a healthy overall accuracy while specific groups perform far worse. We report performance per group so the weakest one is visible, not averaged away.

            Female-presenting    Male-presenting
Light       Healthy              Healthy
Medium      Healthy              Watch
Dark        Weak                 Coin flip

Relative detection confidence: Higher to Lower

Fig. Illustrative schematic. Per-group results are reported in the benchmark; exact cell values are pending and not shown here.

The evidence

The average hides the gap

What you were told

one overall number

A single strong accuracy figure across all candidates.

What holds in your funnel

split by group

Groups on which the detector does worse than guessing, hidden inside a healthy-looking average.

Our per-group breakdown shows some groups falling below a coin flip while the overall number still looks healthy.
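To make the arithmetic concrete, here is a minimal sketch of how a pooled accuracy figure can sit next to a group that scores below chance. Everything in it is an assumption made for illustration: the group names, the counts, and the helper functions `accuracy_by_group` and `toy_records` are hypothetical, and the numbers are invented, not benchmark results.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return (pooled_accuracy, {group: accuracy}).

    Each record is a (group, is_fake, flagged_fake) tuple.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for group, is_fake, flagged_fake in records:
        totals[group] += 1
        # A decision is correct when the flag matches the ground truth.
        correct[group] += int(is_fake == flagged_fake)
    per_group = {g: correct[g] / totals[g] for g in totals}
    pooled = sum(correct.values()) / sum(totals.values())
    return pooled, per_group


def toy_records(group, n, n_correct):
    """Build n invented records for one group, n_correct of them right."""
    records = []
    for i in range(n):
        is_fake = i % 2 == 0
        flagged_fake = is_fake if i < n_correct else not is_fake
        records.append((group, is_fake, flagged_fake))
    return records


# Invented counts: a large group the detector handles well,
# and a small group it handles worse than a coin flip.
records = toy_records("group_a", 900, 855) + toy_records("group_b", 100, 45)

pooled, per_group = accuracy_by_group(records)
worst = min(per_group, key=per_group.get)

print(f"pooled accuracy: {pooled:.2f}")
for group, acc in sorted(per_group.items()):
    print(f"{group}: {acc:.2f}")
print(f"worst group: {worst}")
```

Because the larger group dominates the total, the smaller group's below-chance accuracy barely moves the pooled figure. Reporting the worst group alongside the average is what makes it visible.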

Read the benchmark
What we measure for you

The deliverables you get back

Interview-realistic conditions

Evaluated against what real interview traffic carries.

Geographic and accent drift

Performance reported per group.

Vendor-selection evidence

Reports that hold up in procurement conversations.

The core challenge

A high average is not the same as protection for every candidate.

Why a strong-looking detector still lets fraud through.

Trained on yesterday

Detectors learn from the generators that existed when they were built. New models ship every month, and the detector has never seen them.

Tested in a lab

Vendor numbers come from pristine images. Real fraud arrives compressed, resized, and re-encoded by the platforms it passes through.

Measured on the average

A strong overall score can hide groups the detector barely catches. The average looks fine while a whole subgroup is an open lane.

What you get

Everything procurement asks for is in the box.

  • Per-group performance

    Results broken out by demographic and platform, with the worst case shown.

  • Both kinds of mistake

    Fakes let through and real users wrongly blocked, with the margin of error on each.

  • Bypass recipes

    Every failure annotated with the recipe that surfaced it.

  • Methodology, documented

    Public, versioned, and signed by the lead researcher.

  • Platform-realistic conditions

    Scored under the re-encoding your deployment actually applies.

  • Audit-ready exhibits

    Findings packaged to hand to a board or a regulator.
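As an illustration of "both kinds of mistake, with the margin of error on each", the sketch below computes a miss rate (fakes let through) and a false-block rate (real users wrongly blocked), each with a 95% Wilson score interval. The Wilson interval is one common choice and is an assumption here; the page does not say which interval the reports use. The function names and the toy counts are hypothetical and invented for the example.

```python
import math


def wilson_interval(errors, n, z=1.96):
    """95% Wilson score interval for an error proportion errors / n."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = errors / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def error_rates(records):
    """Return both error rates with intervals.

    Each record is an (is_fake, flagged_fake) tuple. The input must
    contain at least one fake and at least one real sample.
    """
    fakes = sum(1 for is_fake, _ in records if is_fake)
    reals = len(records) - fakes
    missed = sum(1 for is_fake, flagged in records if is_fake and not flagged)
    blocked = sum(1 for is_fake, flagged in records if not is_fake and flagged)
    return {
        "miss_rate": (missed / fakes, wilson_interval(missed, fakes)),
        "false_block_rate": (blocked / reals, wilson_interval(blocked, reals)),
    }


# Invented counts: 200 fakes with 20 missed, 800 real users with 8 blocked.
toy = (
    [(True, False)] * 20 + [(True, True)] * 180
    + [(False, True)] * 8 + [(False, False)] * 792
)

for name, (rate, (low, high)) in error_rates(toy).items():
    print(f"{name}: {rate:.3f} (95% interval {low:.3f} to {high:.3f})")
```

Keeping the two rates separate matters because they trade off against each other: a detector tuned to block more fakes will usually block more real candidates too, and a single accuracy number hides which way it leans.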

See if your detection catches the synthetic candidate.

Tell us how candidates reach you. We will scope an evaluation against interview-realistic attacks, broken out by group.