Name the atomic claim
A proof starts with claim, target, setup/actions, evidence, and verdict. Before/after change proof is only one pattern built from smaller claims.
Riddle Proof proves claims about a running browser or app target. A ratchet keeps those claims small, reusable, and honest: each run says what was proved, which evidence role it used, what changed, and what remains outside the verdict.
A proof starts with claim, target, setup/actions, evidence, and verdict. Before/after change proof is only one pattern built from smaller claims.
Start with a fast current-target audit. If it cannot support the claim, classify the weak layer before editing anything.
Prefer profile JSON, proof-pack thresholds, fixture data, or a tiny app proof contract before touching Riddle Proof core.
A pass is not the end. Ratchets turn a useful receipt into the next more specific claim: wider viewport matrix, interaction proof, or reference/candidate comparison.
Core wording: Riddle Proof proves claims about a running browser/app target. A before/after change proof is one pattern built from smaller proof claims, not the core primitive.
Audit one deployed or preview target. No implementation diff is required. Useful for route, layout, browser-health, app-contract, and metric receipts.
Compare reference evidence with candidate evidence. Use this when the claim is about a release change, regression fix, or measured visual/behavioral delta.
Capture pre-action and post-action evidence inside one proof script. Useful for clicks, playback, drag gestures, forms, and control changes.
Every run should record the smallest weak layer before a fix. That keeps architecture work out of normal profile calibration and keeps product failures from being misfiled as proof noise.
product_regressionproof_insufficientprofile_calibrationapp_contract_gapruntime_environment_blockedneeds_human_review{
"claim": "The current target renders a healthy mix window",
"target": "/games/drum-sequencer",
"evidence_role_pattern": "current_target",
"setup_actions": [
"capture app contract",
"prepare audio sources",
"render offline metrics"
],
"verdict": "passed",
"does_not_prove": [
"subjective mix quality",
"every song section"
]
}Reusable profiles, thresholds, examples, and human-review rubrics. Packs should say which evidence-role pattern they use and what they do not prove.
Tiny, deliberate browser globals or debug endpoints that expose redacted state and receipts an agent can inspect without guessing from DOM text alone.
Run locally with Playwright for control, or use Riddle hosted when you want managed browsers, durable artifacts, preview handling, and agent handoff.