Check detectors from our latest benchmark

Pick a detector and a condition to see how it scores on our corpus, down to a single demographic cell. This is the same measurement we run in a full evaluation. Full results are on our benchmark page.

Query the benchmark

Make a selection and press Run.

AUC is a 0 to 1 score for how well a detector tells fake images from real ones. Picture showing it one fake photo and one real photo: AUC is how often it rates the fake as the more likely fake. 1.0 means it always gets it right, 0.5 is a coin flip (no better than guessing), and below 0.5 means it is getting them backwards.
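
The pairwise picture above can be sketched in a few lines of Python. This is a minimal illustration, not the benchmark's own code: the function name `pairwise_auc`, the example scores, and the choice to count ties as half a win are assumptions made for the example.

```python
from itertools import product


def pairwise_auc(fake_scores, real_scores):
    """Fraction of (fake, real) pairs in which the fake gets the higher score.

    Scores are the detector's "how likely fake" outputs. Ties count as half
    a win, which is a common convention (an assumption here).
    """
    if not fake_scores or not real_scores:
        raise ValueError("need at least one fake and one real score")
    wins = 0.0
    for fake, real in product(fake_scores, real_scores):
        if fake > real:
            wins += 1.0   # detector ranked the fake as more likely fake
        elif fake == real:
            wins += 0.5   # tie: no better than a coin flip on this pair
    return wins / (len(fake_scores) * len(real_scores))


# Hypothetical detector outputs for three fake and three real images.
fakes = [0.9, 0.8, 0.4]
reals = [0.7, 0.3, 0.2]
auc = pairwise_auc(fakes, reals)  # 8 of the 9 pairs are ranked correctly
```

Swapping the two lists gives 1 minus the original score, which is why a value below 0.5 means the detector is ranking images backwards.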

Get involved

Put in a request

Public model

We'll run it and put it up on the interactive.

Private model

Share it privately, as weights or an API key.

Data and methodology

Self-service on your own model.

Request a generator

Suggest one to add to the interactive.

Want the full report on your detector?

The interactive shows one score at a time. A full evaluation runs your model across the entire sequestered corpus and returns a per-group report card: AUC with DeLong confidence intervals, how it holds up under each platform condition, and, where it fails, the recipe that broke it.
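
For readers curious what "AUC with DeLong confidence intervals" involves, here is a rough sketch of the standard DeLong variance estimate for a single AUC. It is not the evaluation pipeline itself: the function name `delong_ci`, the 95% normal approximation (`z=1.96`), and the clipping of the interval to the 0 to 1 range are assumptions made for the example.

```python
import math


def _psi(fake, real):
    # 1 if the fake is ranked above the real image, 0.5 on a tie, else 0.
    return 1.0 if fake > real else 0.5 if fake == real else 0.0


def delong_ci(fake_scores, real_scores, z=1.96):
    """Return (auc, lower, upper) using the DeLong variance estimate.

    z=1.96 gives an approximate 95% interval (assumed level).
    """
    m, n = len(fake_scores), len(real_scores)
    if m < 2 or n < 2:
        raise ValueError("need at least two fake and two real scores")
    # Structural components: each image's average win rate against
    # every image of the other class.
    v10 = [sum(_psi(f, r) for r in real_scores) / n for f in fake_scores]
    v01 = [sum(_psi(f, r) for f in fake_scores) / m for r in real_scores]
    auc = sum(v10) / m
    # Sample variances of the components around the AUC.
    s10 = sum((v - auc) ** 2 for v in v10) / (m - 1)
    s01 = sum((v - auc) ** 2 for v in v01) / (n - 1)
    se = math.sqrt(s10 / m + s01 / n)
    # Clipping to [0, 1] is a presentation choice, not part of DeLong's method.
    return auc, max(0.0, auc - z * se), min(1.0, auc + z * se)


# Hypothetical scores; with so few images the interval is very wide.
auc, lower, upper = delong_ci([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
```

The interval narrows as a group contains more images, which is why a score for a small demographic cell should be read together with its interval.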

Request an evaluation

How it works