Conformance

Benchmarks

Open suites that any legal-AI system can run against the standards. Results are versioned against pinned standards releases, so they stay comparable and trust survives engine updates.

Suite 01 · Drafting

Gold Proof Set

Manually verified reasoning examples used as a regression baseline. Detects logical degradation between engine versions.
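A minimal sketch of how such a regression check could work, assuming each gold example pairs a question with one manually verified conclusion. The `GoldExample` type, its fields, and the `engine` callable are hypothetical names for illustration, not part of the published suite.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class GoldExample:
    example_id: str           # hypothetical identifier
    question: str
    expected_conclusion: str  # the manually verified conclusion


def find_regressions(gold: Iterable[GoldExample],
                     engine: Callable[[str], str]) -> List[str]:
    """Return ids of gold examples whose verified conclusion the engine no longer reproduces."""
    return [ex.example_id for ex in gold
            if engine(ex.question) != ex.expected_conclusion]
```

Running the same gold set against two engine versions and comparing the returned ids would surface degradation introduced between them.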

Suite 02 · Drafting

Temporal Drift

Probes status-at-time correctness across statute amendments, repeals, and supersession events.
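One way to picture a status-at-time probe, assuming a provision's history is a list of dated events. `ProvisionEvent`, the event kinds, and `status_at` are illustrative assumptions rather than the suite's actual data model.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional


@dataclass(frozen=True)
class ProvisionEvent:
    effective: date
    kind: str  # assumed vocabulary: "enacted", "amended", "repealed", "superseded"


def status_at(events: Iterable[ProvisionEvent], when: date) -> Optional[str]:
    """Return the kind of the latest event effective on or before `when`.

    Returns None when no event has taken effect yet.
    """
    current = None
    for ev in sorted(events, key=lambda e: e.effective):
        if ev.effective <= when:
            current = ev.kind
    return current
```

A probe would then ask the engine about the provision as of a given date and compare its answer with `status_at` for the same date.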

Suite 03 · Drafting

Jurisdiction Swap

Holds facts constant; varies jurisdiction. Confirms that outputs change where the governing law differs and stay the same everywhere else.
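A swap check might look like the following sketch, assuming the engine returns a flat dictionary of output fields. The `engine(facts, jurisdiction)` signature and the field names are hypothetical.

```python
from typing import Any, Callable, Dict, Iterable, Set


def changed_fields(a: Dict[str, Any], b: Dict[str, Any]) -> Set[str]:
    """Return the keys whose values differ between two outputs."""
    return {k for k in a.keys() | b.keys() if a.get(k) != b.get(k)}


def swap_passes(engine: Callable[[Any, str], Dict[str, Any]],
                facts: Any, jur_a: str, jur_b: str,
                expected_changes: Iterable[str]) -> bool:
    """True when exactly the expected fields change between the two jurisdictions."""
    out_a = engine(facts, jur_a)  # same facts,
    out_b = engine(facts, jur_b)  # different jurisdiction
    return changed_fields(out_a, out_b) == set(expected_changes)
```

Requiring an exact match catches both failure modes: a field that should have changed but did not, and a field that changed without a legal reason.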

Suite 04 · Drafting

Citation Replay

Verifies every cited authority resolves to a stable AuthorityRef and survives republication.
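As a rough illustration, a replay could resolve each citation against two snapshots, one taken before and one after republication. Treating an AuthorityRef as an opaque string, and the resolver callables themselves, are assumptions made for this sketch.

```python
from typing import Callable, Iterable, List, Optional

Resolver = Callable[[str], Optional[str]]  # citation text -> AuthorityRef id, or None


def replay_failures(citations: Iterable[str],
                    resolve_before: Resolver,
                    resolve_after: Resolver) -> List[str]:
    """Return citations that fail to resolve, or whose AuthorityRef changed on republication."""
    failures = []
    for citation in citations:
        before = resolve_before(citation)
        after = resolve_after(citation)
        if before is None or before != after:
            failures.append(citation)
    return failures
```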

Suite 05 · Drafting

Authority Resolution

Tests resolver accuracy for ambiguous, partial, and historical citations.
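Resolver accuracy is easier to interpret when it is reported per citation category. The sketch below assumes test cases arrive as `(category, citation, expected_ref)` tuples; that shape and the `resolve` callable are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Optional, Tuple

Case = Tuple[str, str, str]  # (category, citation text, expected AuthorityRef id)


def accuracy_by_category(cases: Iterable[Case],
                         resolve: Callable[[str], Optional[str]]) -> Dict[str, float]:
    """Return the fraction of correctly resolved citations for each category."""
    hits: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for category, citation, expected in cases:
        totals[category] += 1
        if resolve(citation) == expected:
            hits[category] += 1
    return {category: hits[category] / totals[category] for category in totals}
```

Keeping the categories separate prevents strong results on easy citations from masking weak results on historical ones.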

Suite 06 · Drafting

Defeater Handling

Confirms exceptions, conflicts, rebuttals, and superior authority are modeled rather than collapsed.
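To make "modeled rather than collapsed" concrete, a check could require the engine's output to list each defeater as a structured entry instead of folding it into the conclusion. The output shape (a `defeaters` list of dictionaries with a `kind` key) and the kind labels are assumptions for illustration only.

```python
from typing import Any, Dict, Set

# Assumed labels mirroring the four defeater types named above.
DEFEATER_KINDS = {"exception", "conflict", "rebuttal", "superior_authority"}


def missing_defeaters(output: Dict[str, Any], expected: Set[str]) -> Set[str]:
    """Return expected defeater kinds that the output does not model explicitly."""
    modeled = {d["kind"] for d in output.get("defeaters", [])}
    return expected - modeled
```

An output that reaches the right conclusion but returns a non-empty set here has collapsed a defeater it should have represented.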

Suite 07 · Drafting

Prompt Stability

Surface-form perturbations should not flip a conclusion when underlying facts and authorities are unchanged.
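A stability probe can be sketched as follows, assuming the engine maps a prompt string to a conclusion string. The three perturbations shown (casing, padding, a polite prefix) are illustrative choices, not the suite's actual perturbation set.

```python
from typing import Callable, Iterable, List


def surface_variants(prompt: str) -> List[str]:
    """Produce rewordings that leave the facts and authorities untouched."""
    return [
        prompt.upper(),               # casing change
        "  " + prompt + "  ",         # extra whitespace
        "Please answer: " + prompt,   # polite prefix
    ]


def unstable_variants(engine: Callable[[str], str],
                      prompt: str,
                      variants: Iterable[str]) -> List[str]:
    """Return the variants whose conclusion differs from the baseline prompt's."""
    baseline = engine(prompt)
    return [v for v in variants if engine(v) != baseline]
```

An empty result means no tested perturbation flipped the conclusion; it does not prove stability against perturbations outside the tested set.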