Benchmark · Synthetic Face v1

Real and AI-generated face crops, across demographics and platform conditions.

A data card for the Synthetic Face v1 benchmark: real (bona-fide) and AI-generated face crops for evaluating detectors across 12 demographic cells and a range of clean and perturbed conditions.

What it is

A fairness- and robustness-aware detector test set.

Synthetic Face v1 pairs real (bona-fide) face crops with AI-generated face crops so a detector can be scored on both sides of the decision. Every item is grouped by demographic cell and by platform condition, so results break down by group and by condition rather than collapsing to a single average.
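As a minimal sketch of that breakdown, the function below reports both error rates per demographic cell and condition. It assumes each scored item is a dict with the keys skin_tone, gender, perturbation, kind ("real" or "fake"), and predicted_fake (your detector's boolean output). That record layout is hypothetical, chosen for illustration; it is not the API's response schema.

```python
from collections import defaultdict


def _ratio(errors, total):
    # None marks a group that has no items of that kind.
    return errors / total if total else None


def error_rates_by_group(records):
    """Return error rates per (skin_tone, gender, perturbation) group.

    Each record is a dict with hypothetical keys: skin_tone, gender,
    perturbation, kind ("real" or "fake"), predicted_fake (bool).
    """
    # counts[group][kind] = [errors, total]
    counts = defaultdict(lambda: {"real": [0, 0], "fake": [0, 0]})
    for r in records:
        group = (r["skin_tone"], r["gender"], r["perturbation"])
        is_fake = r["kind"] == "fake"
        wrong = r["predicted_fake"] != is_fake
        slot = counts[group][r["kind"]]
        slot[0] += int(wrong)
        slot[1] += 1

    report = {}
    for group, by_kind in counts.items():
        report[group] = {
            # Share of real crops flagged as fake.
            "false_positive_rate": _ratio(*by_kind["real"]),
            # Share of AI-generated crops passed as real.
            "miss_rate": _ratio(*by_kind["fake"]),
        }
    return report
```

Keeping the two rates separate per group avoids a single accuracy figure hiding a detector that does well on one kind and poorly on the other.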

Access is via the Margen platform. See API keys to get started, or the API docs for how to pull the benchmark.

At a glance

Specifications.

Kinds: real and fake (AI-generated)
Demographic cells: 12 (6 skin-tone bands x 2 genders)
Queryable dimensions: skin_tone, gender, kind, generator, perturbation (alias condition), layer, base_id, source_real_id
Item and per-cell counts: returned live by the API. Read the total field of the GET /items response for any cell filter, and call GET /catalog for the current dimensions and values.

Format: JPEG
Color space: 8-bit sRGB
Geometry: face-centered, delivered at native resolution (not resized to a fixed size); resize in your own preprocessing
Delivery: one signed URL per item via GET /download/{item_id}
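The endpoints named above can be addressed with a few small helpers. This is a sketch under stated assumptions: the base URL is a placeholder, filters are assumed to be passed as query parameters named after the queryable dimensions, and authentication is left out because its scheme is not described here. Check each of these against the API docs.

```python
from urllib.parse import quote, urlencode

# Hypothetical base URL; substitute the one given in the API docs.
BASE_URL = "https://api.example.com"


def items_url(**filters):
    """Build a GET /items URL from filters such as skin_tone or gender.

    Assumes filters are sent as query parameters (not confirmed here).
    """
    # Drop unset filters and sort for a stable query string.
    params = sorted((k, v) for k, v in filters.items() if v is not None)
    query = urlencode(params)
    return f"{BASE_URL}/items" + (f"?{query}" if query else "")


def catalog_url():
    """URL for GET /catalog, which lists current dimensions and values."""
    return f"{BASE_URL}/catalog"


def download_url(item_id):
    """URL for GET /download/{item_id}, which yields one signed URL."""
    # Percent-encode the id so it is safe as a single path segment.
    return f"{BASE_URL}/download/{quote(str(item_id), safe='')}"


def read_total(items_response):
    """Return the live item count from a parsed GET /items response."""
    return items_response["total"]
```

For example, items_url(skin_tone="dark", gender="female", kind="fake") builds the request for one cell's AI-generated items, and read_total applied to the parsed JSON body gives the current count for that filter.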

Demographic cells

12 cells: 6 skin-tone bands by 2 genders.

Skin tone uses a 6-band ordinal scale (very_light to dark), reconciled against Individual Typology Angle (ITA degrees). Each band is crossed with two genders for 12 cells total.

Example synthetic faces shown: very light skin tone (female, male) and intermediate skin tone (female, male).

Per-cell item counts are returned by the API: call GET /items with a skin_tone and gender filter and read the total field.
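To collect the count for every cell, one option is to cross the skin-tone and gender values and look each pair up in turn. In the sketch below, fetch_total is a hypothetical caller-supplied function that performs GET /items with the two filters and returns the total field; the lists of values are assumed to come from GET /catalog rather than being hard-coded.

```python
from itertools import product


def per_cell_totals(fetch_total, skin_tones, genders):
    """Map each (skin_tone, gender) cell to its current item count.

    fetch_total is a hypothetical callable taking skin_tone and gender
    keyword arguments and returning the total field of GET /items.
    """
    return {
        (tone, gender): fetch_total(skin_tone=tone, gender=gender)
        for tone, gender in product(skin_tones, genders)
    }
```

With 6 skin-tone bands and 2 genders the result has 12 entries, one per demographic cell.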

Conditions

Clean, single transforms, and platform re-encode pipelines.

Each face is available clean and under a set of perturbations. Layer 1 applies a single transform. Layer 2 models the re-encode pipelines of Facebook, Instagram, TikTok, and X, and a recropped variant re-frames the Layer 2 output on the face.

clean

Clean

The unperturbed crop, as the baseline condition.

layer1

Layer 1: single transform

One transform applied on its own: JPEG-quality reduction, blur, noise, or resize.

layer2

Layer 2: platform pipeline

A social-platform re-encode pipeline that models what Facebook, Instagram, TikTok, and X do to an uploaded image.

layer2_recropped

Layer 2: recropped

The platform-pipeline output re-cropped to the face, matching how media is often re-framed after upload.
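Because clean is the baseline condition, one way to read robustness is as the loss in accuracy relative to clean for each other layer. The sketch below assumes the four identifiers above are the values of the layer dimension, and that you have already reduced each scored item to a (layer, correct) pair; both are illustrative assumptions.

```python
from collections import defaultdict

# Assumed values of the layer dimension, taken from the identifiers above.
LAYERS = ("clean", "layer1", "layer2", "layer2_recropped")


def accuracy_drop_by_layer(results):
    """Return accuracy lost relative to clean, per layer.

    results is an iterable of (layer, correct) pairs; correct is a bool.
    """
    hits = defaultdict(int)
    seen = defaultdict(int)
    for layer, correct in results:
        hits[layer] += int(correct)
        seen[layer] += 1
    if not seen["clean"]:
        raise ValueError("need at least one clean result as the baseline")
    baseline = hits["clean"] / seen["clean"]
    # Positive values mean the detector is less accurate than on clean crops.
    return {
        layer: baseline - hits[layer] / seen[layer]
        for layer in LAYERS
        if seen[layer]
    }
```

Layers with no scored items are left out of the result instead of being reported as zero.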

Generation

Diffusion-based synthesis with identity conditioning.

The AI-generated faces are produced by diffusion-based synthesis with identity conditioning (SDXL + InstantID), targeted to fill each demographic cell.

Labeling

Reviewed for quality.

Every item carries a real-versus-fake label and a demographic-cell assignment. Labels are reviewed for quality before an item enters the benchmark, so the ground truth you score against is consistent.

Skin tone is assigned on a 6-band ordinal scale (very_light to dark) and reconciled against Individual Typology Angle (ITA degrees), a standard colorimetric measure, so cells stay comparable across items.
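For reference, ITA is computed from CIELAB lightness L* and the yellow-blue component b* as arctan((L* - 50) / b*), expressed in degrees. The sketch below computes it and maps it to six bands using the cut-offs conventionally used with ITA. Two things are assumptions here: that this benchmark uses those cut-offs, and the four band names between very_light and dark, which this card does not list.

```python
import math

# Conventional ITA cut-offs in degrees; each lower bound is exclusive.
# Names other than very_light and dark are assumed, not taken from the card.
ITA_BANDS = [
    (55.0, "very_light"),
    (41.0, "light"),
    (28.0, "intermediate"),
    (10.0, "tan"),
    (-30.0, "brown"),
]


def ita_degrees(l_star, b_star):
    """Individual Typology Angle from CIELAB L* and b*."""
    # atan2 equals arctan((L* - 50) / b*) for b* > 0 and avoids
    # division by zero when b* is 0.
    return math.degrees(math.atan2(l_star - 50.0, b_star))


def ita_band(ita):
    """Map an ITA value in degrees to one of six ordinal bands."""
    for lower, name in ITA_BANDS:
        if ita > lower:
            return name
    return "dark"
```

For example, L* = 70 and b* = 20 give an ITA of 45 degrees, which falls in the second-lightest band under these cut-offs.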

Licensing

Licensed for commercial evaluation use.

All imagery is obtained and licensed for commercial use, including building and evaluating detection systems. Real faces are licensed from commercial stock-media providers; synthetic faces are generated in house. Each delivered item is licensed for your evaluation use under the Margen data license.

See the Attack Data overview for the product and licensing summary.

How to pull it.

Access is via the Margen platform. Generate an API key, then follow the API docs to pull the catalog and score your detector against it.
