Riddle Proof
Proof Ratchets

Turn one browser proof into the next sharper question.

Riddle Proof proves claims about a running browser or app target. A ratchet keeps those claims small, reusable, and honest: each run says what was proved, which evidence role it used, what changed, and what remains outside the verdict.

5 atomic proof parts: claim, target, setup/actions, evidence, verdict.
3 evidence-role patterns: current target, reference/candidate, and interaction snapshots.
1 layer changed per run, the smallest available: profile, pack, contract, fixture, runner, or app.
0 assumed pass value. The value is the next better proof question.

The atomic proof model.

claim - target - evidence
Step 1: Name the atomic claim

A proof starts with claim, target, setup/actions, evidence, and verdict. Before/after change proof is only one pattern built from smaller claims.
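As a rough illustration, the five atomic parts can be held in one small record. This is a minimal sketch, not Riddle Proof's actual schema: the class name `AtomicProof`, the `missing_parts` helper, and the list-of-strings shape for evidence are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AtomicProof:
    """Hypothetical container for the five atomic proof parts."""
    claim: str
    target: str
    setup_actions: List[str]
    evidence: List[str]
    verdict: str
    # Extra bookkeeping: what stays outside the verdict.
    does_not_prove: List[str] = field(default_factory=list)

    def missing_parts(self) -> List[str]:
        """Return the names of atomic parts that are still empty."""
        parts = {
            "claim": self.claim,
            "target": self.target,
            "setup_actions": self.setup_actions,
            "evidence": self.evidence,
            "verdict": self.verdict,
        }
        return [name for name, value in parts.items() if not value]


# A proof that has been named and set up but not yet run.
proof = AtomicProof(
    claim="The current target renders a healthy mix window",
    target="/games/drum-sequencer",
    setup_actions=["capture app contract"],
    evidence=[],
    verdict="",
)
print(proof.missing_parts())
```

Checking for empty parts before a run is one way to keep a claim from being reported with no evidence or verdict attached.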

Step 2: Run the smallest useful profile

Start with a fast current-target audit. If it cannot support the claim, classify the weak layer before editing anything.

Step 3: Change the narrowest layer

Prefer profile JSON, proof-pack thresholds, fixture data, or a tiny app proof contract before touching Riddle Proof core.
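One way to make the "narrowest layer first" preference explicit is a fixed ordering and a helper that picks the first match. The ordering below follows the sentence above and is an assumption, as are the layer labels and the function name `narrowest_layer`.

```python
# Assumed ordering, narrowest first, taken from the order of the sentence above.
LAYERS_NARROW_TO_WIDE = ["profile", "pack", "fixture", "contract", "core"]


def narrowest_layer(candidates):
    """Return the narrowest layer among the candidate layers that could fix a run."""
    known = [layer for layer in LAYERS_NARROW_TO_WIDE if layer in candidates]
    if not known:
        raise ValueError(f"no known layer in {sorted(candidates)}")
    return known[0]


# If either a fixture tweak or a core change would work, prefer the fixture.
print(narrowest_layer({"core", "fixture"}))
```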

Step 4: Ask the next sharper question

A pass is not the end. Ratchets turn a useful receipt into the next more specific claim: a wider viewport matrix, an interaction proof, or a reference/candidate comparison.

Core wording: Riddle Proof proves claims about a running browser/app target. A before/after change proof is one pattern built from smaller proof claims, not the core primitive.

Evidence-role patterns.

name the shape

current_target

Audit one deployed or preview target. No implementation diff is required. Useful for route, layout, browser-health, app-contract, and metric receipts.

reference_candidate

Compare reference evidence with candidate evidence. Use this when the claim is about a release change, regression fix, or measured visual/behavioral delta.

interaction_snapshots

Capture pre-action and post-action evidence inside one proof script. Useful for clicks, playback, drag gestures, forms, and control changes.
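The interaction_snapshots shape can be sketched without a browser: capture, act, capture again, and keep both snapshots in one result. The helper name `snapshot_around`, the result keys, and the stand-in `player` dictionary are all hypothetical. In a real proof script the `capture` and `action` callables would wrap browser calls.

```python
from typing import Any, Callable, Dict


def snapshot_around(capture: Callable[[], Any],
                    action: Callable[[], None]) -> Dict[str, Any]:
    """Capture evidence before and after one action inside a single script."""
    pre = capture()
    action()
    post = capture()
    return {"pre_action": pre, "post_action": post, "changed": pre != post}


# Stand-in for app state; a real script would read it from the running target.
player = {"playing": False}

result = snapshot_around(
    capture=lambda: dict(player),              # copy so later changes do not leak in
    action=lambda: player.update(playing=True),
)
print(result)
```

Copying the state at capture time matters here: without the copy, both snapshots would point at the same mutated object and the comparison would always report no change.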

Classify the weak layer

Every run should record the smallest weak layer before a fix. That keeps architecture work out of normal profile calibration and keeps product failures from being misfiled as proof noise.

  • product_regression
  • proof_insufficient
  • profile_calibration
  • app_contract_gap
  • runtime_environment_blocked
  • needs_human_review

Example receipt:

{
  "claim": "The current target renders a healthy mix window",
  "target": "/games/drum-sequencer",
  "evidence_role_pattern": "current_target",
  "setup_actions": [
    "capture app contract",
    "prepare audio sources",
    "render offline metrics"
  ],
  "verdict": "passed",
  "does_not_prove": [
    "subjective mix quality",
    "every song section"
  ]
}
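
A receipt like the one above can be sanity-checked before it is trusted. The sketch below assumes the receipt keys shown in the JSON, plus one hypothetical field, `weak_layer`, that a non-passing receipt would carry. The function name `check_receipt` and the problem messages are illustrative only.

```python
import json

EVIDENCE_ROLES = {"current_target", "reference_candidate", "interaction_snapshots"}
WEAK_LAYERS = {
    "product_regression", "proof_insufficient", "profile_calibration",
    "app_contract_gap", "runtime_environment_blocked", "needs_human_review",
}
REQUIRED_KEYS = ["claim", "target", "evidence_role_pattern",
                 "setup_actions", "verdict", "does_not_prove"]


def check_receipt(receipt: dict) -> list:
    """Return a list of problems; an empty list means the receipt looks well formed."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in receipt]
    role = receipt.get("evidence_role_pattern")
    if role is not None and role not in EVIDENCE_ROLES:
        problems.append(f"unknown evidence role: {role}")
    if not receipt.get("does_not_prove"):
        problems.append("does_not_prove should list what stays outside the verdict")
    # Assumption: a non-passing receipt records its weak layer in "weak_layer".
    if receipt.get("verdict") != "passed" and receipt.get("weak_layer") not in WEAK_LAYERS:
        problems.append("non-passing receipt needs a known weak_layer")
    return problems


RECEIPT_JSON = """
{
  "claim": "The current target renders a healthy mix window",
  "target": "/games/drum-sequencer",
  "evidence_role_pattern": "current_target",
  "setup_actions": ["capture app contract", "prepare audio sources", "render offline metrics"],
  "verdict": "passed",
  "does_not_prove": ["subjective mix quality", "every song section"]
}
"""

receipt = json.loads(RECEIPT_JSON)
print(check_receipt(receipt))

# The same receipt marked as failed, without a classified weak layer.
failed = dict(receipt, verdict="failed")
print(check_receipt(failed))
```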

What gets reused.

packs and contracts

Proof packs

Reusable profiles, thresholds, examples, and human-review rubrics. Packs should say which evidence-role pattern they use and what they do not prove.

App contracts

Tiny, deliberate browser globals or debug endpoints that expose redacted state and receipts an agent can inspect without guessing from DOM text alone.

Runner choice

Run locally with Playwright for control, or use Riddle hosted when you want managed browsers, durable artifacts, preview handling, and agent handoff.