Tasfi Bench methodology

A public, false-pass-weighted benchmark for Islamic AI verification.

Most AI evaluations measure capability. Tasfi Bench measures the opposite: whether a verifier catches plausible-looking wrong answers before they reach a Muslim user. The fixture suite is biased toward false-pass cases on purpose. A false pass on Islamic content is worse than a false warning, because the cost compounds across the Ummah every time an unverified answer ships.

Fixture taxonomy

What the benchmark fixtures actually test.

The current suite runs 420 cases against the controlled source bundle. The weighting reflects the failure modes that real Islamic AI products produce in the wild, not the ones that are easiest to catch.

Core fixtures

Valid Quran citations with matching quotes. Valid sahih hadith citations. Properly scoped fiqh answers. These prove the verifier does not over-block when the input is honest.

False-pass-heavy

Fabricated Quran references, Quran quote mismatches, fabricated hadith citations, hadith quote mismatches, non-four-madhab authority claims, fiqh overconfidence, prompt injection attempts. 320 of 420 cases sit here on purpose.

Real-source overclaim

Real Quran or hadith references attached to answer text that exaggerates, generalizes, or rephrases the source in a way that changes the religious claim. 70 cases. These are the hardest to catch and the most consequential.
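
One way to read "false-pass-weighted" is as an asymmetric penalty over verifier decisions. The sketch below is a minimal illustration, not Tasfi Bench's actual scoring code: the ScoredCase shape, the summarize helper, and the falsePassWeight default of 5 are all assumptions introduced here. It treats core fixtures as cases that should pass and every other fixture group as cases where a pass decision counts as a false pass.

```typescript
// Hypothetical shapes for illustration; the real fixture schema may differ.
type Decision = "pass" | "warn" | "fail";
type FixtureGroup = "core" | "false_pass_heavy" | "real_source_overclaim";

interface ScoredCase {
  group: FixtureGroup;
  decision: Decision; // what the verifier under test returned
}

interface Summary {
  falsePasses: number;   // bad input that the verifier let through
  falseWarnings: number; // honest input that the verifier flagged or blocked
  penalty: number;       // weighted total, lower is better
}

// falsePassWeight is an assumed knob, not a published Tasfi Bench value.
function summarize(cases: ScoredCase[], falsePassWeight = 5): Summary {
  let falsePasses = 0;
  let falseWarnings = 0;
  for (const c of cases) {
    const shouldPass = c.group === "core";
    if (!shouldPass && c.decision === "pass") falsePasses += 1;
    if (shouldPass && c.decision !== "pass") falseWarnings += 1;
  }
  return {
    falsePasses,
    falseWarnings,
    penalty: falsePasses * falsePassWeight + falseWarnings,
  };
}

// Four made-up cases: one over-block on core, one false pass on an overclaim.
const example = summarize([
  { group: "core", decision: "pass" },
  { group: "core", decision: "warn" },
  { group: "false_pass_heavy", decision: "fail" },
  { group: "real_source_overclaim", decision: "pass" },
]);
console.log(example);
```

Keeping the two error counts separate, and combining them only in the final penalty, lets a reader see whether a verifier earned a low penalty by catching bad answers or simply by blocking everything.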

Risk categories

Every flag is bucketed before it becomes a number.

| Category | What it covers | Default action |
| --- | --- | --- |
| claim_support | Answer text is not supported by a valid, in-scope source. | fail or warn |
| policy | Hadith authenticity, Quran reference validity, or madhab scope rule. | fail or warn |
| review_escalation | High-stakes religious domain, unclear authority, or fatwa-shaped framing. | warn and route to scholar review |
| madhab_scope | Authority cited outside the four Sunni madhabs in product scope. | fail |
| clinical_boundary | Medical, clinical, or healthcare-shaped religious claim that crosses Tasfi's parked-boundary rules. | warn or fail |
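
The category table can be expressed as a lookup that a report generator consults before counting flags. This is a sketch under assumptions: the category names and default actions come from the table, but the DEFAULT_ACTIONS map, the routeToScholarReview field, and both helper functions are hypothetical names chosen for illustration.

```typescript
type RiskCategory =
  | "claim_support"
  | "policy"
  | "review_escalation"
  | "madhab_scope"
  | "clinical_boundary";

type FlagAction = "warn" | "fail";

interface DefaultAction {
  allowed: FlagAction[];          // actions the table lists for the category
  routeToScholarReview: boolean;  // assumed field name for the escalation step
}

const DEFAULT_ACTIONS: Record<RiskCategory, DefaultAction> = {
  claim_support: { allowed: ["fail", "warn"], routeToScholarReview: false },
  policy: { allowed: ["fail", "warn"], routeToScholarReview: false },
  review_escalation: { allowed: ["warn"], routeToScholarReview: true },
  madhab_scope: { allowed: ["fail"], routeToScholarReview: false },
  clinical_boundary: { allowed: ["warn", "fail"], routeToScholarReview: false },
};

// True when the action matches one of the table's defaults for the category.
function isDefaultAction(category: RiskCategory, action: FlagAction): boolean {
  return DEFAULT_ACTIONS[category].allowed.indexOf(action) !== -1;
}

// Bucket raw flags by category so each bucket can be reported as a number.
function countByCategory(flags: RiskCategory[]): Record<RiskCategory, number> {
  const counts: Record<RiskCategory, number> = {
    claim_support: 0,
    policy: 0,
    review_escalation: 0,
    madhab_scope: 0,
    clinical_boundary: 0,
  };
  for (const flag of flags) counts[flag] += 1;
  return counts;
}

const buckets = countByCategory(["policy", "madhab_scope", "policy"]);
console.log(buckets.policy, isDefaultAction("madhab_scope", "warn"));
```

Typing the map as Record<RiskCategory, DefaultAction> means the compiler rejects a category that has no default action, which keeps the code and the table from drifting apart.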

Source bundle pinning

Every score is tied to a specific bundle checksum.

The verifier and the benchmark run against the same approved source bundle. The bundle has an id, a version, and a SHA-256 checksum. A score is meaningless without those three values. The bundle metadata travels with every Trust Receipt the verifier emits.

See the Trust Receipt schema for the exact field shape.

Bundle metadata
{
  "id": "tasfi-v1-controlled",
  "version": "tasfi-v1-controlled-0.3.0",
  "sha256": "cd9a8b99e98f9ea0e2ce2cc3ed893610535818a55143cb58a4f93108184f37c2"
}
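
A consumer of scores might check the pin before comparing two results. The sketch below assumes only the three fields shown above; the isWellFormed and sameBundle helpers are hypothetical, and the sketch does not recompute the checksum from the bundle contents or reproduce the actual Trust Receipt schema.

```typescript
interface BundleMetadata {
  id: string;
  version: string;
  sha256: string;
}

// Values copied from the bundle metadata shown above.
const PINNED_BUNDLE: BundleMetadata = {
  id: "tasfi-v1-controlled",
  version: "tasfi-v1-controlled-0.3.0",
  sha256: "cd9a8b99e98f9ea0e2ce2cc3ed893610535818a55143cb58a4f93108184f37c2",
};

// A SHA-256 digest in hex is exactly 64 characters; lowercase is assumed here.
function isWellFormed(bundle: BundleMetadata): boolean {
  return (
    bundle.id.length > 0 &&
    bundle.version.length > 0 &&
    /^[0-9a-f]{64}$/.test(bundle.sha256)
  );
}

// Two scores are comparable only if all three values match.
function sameBundle(a: BundleMetadata, b: BundleMetadata): boolean {
  return a.id === b.id && a.version === b.version && a.sha256 === b.sha256;
}

// Made-up example of a score produced against a different bundle version.
const otherRun: BundleMetadata = {
  id: PINNED_BUNDLE.id,
  version: "tasfi-v1-controlled-0.2.0",
  sha256: PINNED_BUNDLE.sha256,
};

console.log(isWellFormed(PINNED_BUNDLE), sameBundle(PINNED_BUNDLE, otherRun));
```

Comparing all three values, and not the checksum alone, follows the statement above that a score is meaningless without the id, the version, and the checksum together.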

Live evidence

Current run, against the current bundle.

The panel below renders the latest generated benchmark report. It is regenerated by npm run benchmark and npm run guard:scorecard.

Leaderboard

Compare verifiers, not models.

Tasfi Bench is a benchmark for source-grounded verifiers, not for answer generators. Any verifier that consumes the same inputs and emits a pass, warn, or fail decision can be scored. To submit a verifier for evaluation, contact the founder.
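
The scoring contract described above amounts to a function from an input to one of three decisions. The following sketch assumes a hypothetical VerifierInput shape, since the exact inputs are not specified here, and uses a deliberately naive toy verifier purely to show the plumbing.

```typescript
type VerifierDecision = "pass" | "warn" | "fail";

// Hypothetical input shape; the real benchmark inputs may carry other fields.
interface VerifierInput {
  answerText: string;
  citedReferences: string[];
}

// Anything with this signature could be run against the suite.
type Verifier = (input: VerifierInput) => VerifierDecision;

// Toy verifier for illustration only: warns when an answer cites nothing.
const citationRequired: Verifier = (input) =>
  input.citedReferences.length === 0 ? "warn" : "pass";

// Run one verifier over a list of inputs and collect its decisions in order.
function runSuite(verifier: Verifier, inputs: VerifierInput[]): VerifierDecision[] {
  return inputs.map((input) => verifier(input));
}

const decisions = runSuite(citationRequired, [
  { answerText: "An answer with a reference.", citedReferences: ["example-ref-1"] },
  { answerText: "An answer with no reference.", citedReferences: [] },
]);
console.log(decisions);
```

A real verifier would likely be asynchronous and would consult the pinned source bundle. The synchronous signature here only keeps the example short.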

What this benchmark does not prove

Honest limits, named up front.

Not religious correctness

The benchmark tests deterministic source handling. It does not certify any answer as religiously correct, and it does not replace scholarly review.

Not coverage of every source

Approved sources are scoped. Internal Maktaba lifecycle gates apply. The benchmark cannot score what the bundle does not contain.

Not protection against eval gaming

The public suite is auditable on purpose. A separate private holdout set, regenerated quarterly, is the structural defense against verifiers trained against the public cases.

Why frame the category

If no one publishes the benchmark, no one is accountable.

Tasfi Bench is the public reliability scoreboard for Islamic AI verification. We publish the methodology and the fixture taxonomy so the category has a real test, not a marketing claim. New entries are welcome. The bundle, the suite, and the source policy are the same for everyone who shows up.