New: The detectors that scored perfect collapsed the hardest under attack.
Trust

Neutrality you can check.

Every result we publish is built to be checked, not taken on faith. Four mechanisms let any independent reader verify our numbers against external references: public benchmarks, full methodology transparency, anonymized reporting, and named external standards.

The basis

Four ways to check our work yourself.

Public benchmarks

We run and publish our own benchmark of the market's detectors under platform-realistic conditions: open proof of our methodology. Private engagement results are never part of it.

Reproducible by anyone

Metric definitions, cell-axis definitions, the perturbation pipeline, and dataset cards are documented. Anyone can re-run the results independently; corpus access is available on request.

Anonymized by default

Detectors are reported anonymized unless the vendor chooses to self-reveal. Placement and evaluation outcomes cannot be purchased.

Reported against external standards

Results are reported against named external standards by number, so a reader can check our work against an independent reference.

What we are

Margen is the measurement layer between detection vendors and the buyers who depend on them.

An independent assessment layer.

Third-party, with no detection product of our own to sell.

A reproducible methodology.

Every claim is backed by a dataset, a pipeline, and a margin of error.

Adversarially honest.

We test under the hardest conditions buyers face in production.

External standards

Reported against named external standards.

Where a standard exists, we report against it so a reader can compare our numbers to an independent definition.

  • ISO/IEC 30107-3: Presentation attack detection reporting (APCER / BPCER).
  • ISO/IEC 19795: Biometric performance testing and reporting.
  • ISO/IEC 24027: Bias in AI systems and AI-aided decision making.
  • NIST AI RMF: AI risk management framework alignment.
  • EU AI Act: Transparency and risk obligations for AI systems.
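
For readers who want to see what the two error rates named under ISO/IEC 30107-3 measure, here is a minimal Python sketch. It is an illustration under stated assumptions, not our pipeline: the `Presentation` record, the function names, and the sample counts are all hypothetical. It also treats every attack as one pooled group, whereas the standard defines APCER per attack instrument species.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Presentation:
    """Hypothetical record: one sample shown to a detector."""
    is_attack: bool          # ground truth: True for an attack presentation
    flagged_as_attack: bool  # the detector's decision


def apcer(presentations):
    """Share of attack presentations the detector accepted as bona fide."""
    attacks = [p for p in presentations if p.is_attack]
    if not attacks:
        raise ValueError("no attack presentations in the sample")
    missed = sum(1 for p in attacks if not p.flagged_as_attack)
    return missed / len(attacks)


def bpcer(presentations):
    """Share of bona fide presentations the detector rejected as attacks."""
    bona_fide = [p for p in presentations if not p.is_attack]
    if not bona_fide:
        raise ValueError("no bona fide presentations in the sample")
    rejected = sum(1 for p in bona_fide if p.flagged_as_attack)
    return rejected / len(bona_fide)


# Made-up counts for illustration only: 10 attacks (2 missed),
# 10 bona fide presentations (1 wrongly rejected).
sample = (
    [Presentation(True, True)] * 8
    + [Presentation(True, False)] * 2
    + [Presentation(False, False)] * 9
    + [Presentation(False, True)] * 1
)

print(f"APCER = {apcer(sample):.2f}, BPCER = {bpcer(sample):.2f}")
```

The two rates pull against each other: a detector tuned to miss fewer attacks usually rejects more genuine presentations, which is why both are reported together.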
No conflict of interest

We sell both the ruler and the test. Here is the firewall.

We publish a public benchmark and we run private evaluations for the vendors and buyers measured against it. That is the same competence at two trust levels, not a conflict, as long as the line between them is clear.

The public methodology is fixed and open, so no one can buy a better position on it. Private engagement findings stay confidential and are never fed back into the public benchmark to favor a paying vendor. Placement and outcomes cannot be purchased.

Your data

A secure engagement by default.

The engagement never has to leave your system. We bring our tools into your environment and run the whole evaluation there, so your models and data never move.

When you share data with us, it stays inside the engagement. It is never redistributed, published, or fed into the public benchmark.

Verify it yourself.

Read the pre-registration, the methodology, and the published findings, then bring us a detector to measure.