Verify

The verify workflow tests a claim. Given a statement like "the rate limiter blocks at 100 req/s" or "this PR fixes the crash on resize", it installs and runs the code in the sandbox, captures evidence with bash, and reports whether the evidence confirms or refutes the claim. It never modifies code.

Permission profile: issues-write — can read repo contents and post a findings comment; it never pushes code.

Pipeline

Verify run + report verdict

What it does

  • Reads the claim (and, for a PR, its description + diff) to decide what would convince a skeptic
  • Installs and runs the code in the sandbox, capturing test output, command stdout, exit codes, or curl responses as evidence
  • For a before/after claim, reproduces on the base ref and shows the change on the head ref
  • Reports a verdict: CONFIRMED, REFUTED, or INCONCLUSIVE. A clearly evidenced "this is broken" is a valid result and is never buried.

Text evidence by default, screenshots when available. The text phase uses bash, file read, and the GitHub tools, with no browser. On a Docker host with the browser-QA image built, a second gated phase drives the claim in a real headless Chromium and posts screenshot evidence on top of the text verdict. Everywhere else that phase is silently skipped, and a UI-only claim is reported INCONCLUSIVE.
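
As a rough illustration of the before/after step, the sketch below captures a command's exit code and stdout as text evidence and maps a base-ref run and a head-ref run to one of the three verdicts. The `Evidence` shape, the `capture` and `before_after_verdict` names, and the rule that a non-zero exit code means "the failure reproduced" are all assumptions made for this example, not the workflow's actual implementation.

```python
from __future__ import annotations

import subprocess
import sys
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    CONFIRMED = "CONFIRMED"
    REFUTED = "REFUTED"
    INCONCLUSIVE = "INCONCLUSIVE"


@dataclass
class Evidence:
    """One captured command run (hypothetical shape)."""
    command: list[str]
    exit_code: int
    stdout: str


def capture(command: list[str]) -> Evidence:
    """Run a command and keep its exit code and stdout as text evidence."""
    result = subprocess.run(command, capture_output=True, text=True)
    return Evidence(command, result.returncode, result.stdout)


def before_after_verdict(base: Evidence, head: Evidence) -> Verdict:
    """Judge a "this PR fixes X" claim from the same reproduction on both refs.

    Assumption: a non-zero exit code means the failure reproduced.
    """
    if base.exit_code == 0:
        # The failure never reproduced on the base ref, so nothing is proven.
        return Verdict.INCONCLUSIVE
    if head.exit_code == 0:
        # Broken on base, passing on head: the evidence supports the claim.
        return Verdict.CONFIRMED
    # Still failing on head: a clearly evidenced "this is broken".
    return Verdict.REFUTED


if __name__ == "__main__":
    # Stand-ins for running one reproduction command on the base and head refs.
    base = capture([sys.executable, "-c", "raise SystemExit(1)"])
    head = capture([sys.executable, "-c", "print('ok')"])
    print(before_after_verdict(base, head).value)
```

Returning INCONCLUSIVE when the base run passes reflects the idea that a fix cannot be confirmed unless the original failure was reproduced first.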

Triggers

  • GitHub: a maintainer comment @last-light verify <claim> on an issue or PR, or a natural-language request the classifier reads as a verify intent ("does this actually fix the crash?")
  • Slack: a message classified as a verify intent against a managed repo
  • CLI: lastlight verify owner/repo#N -- "<claim>"
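
The explicit trigger forms above are regular enough to parse mechanically. The following is a minimal sketch, assuming the comment body contains the literal `@last-light verify` prefix and the CLI target has the `owner/repo#N` shape shown in the list. The regular expressions and the `extract_claim` and `parse_target` helpers are hypothetical, and the natural-language classifier path is not covered.

```python
from __future__ import annotations

import re

# Hypothetical pattern: the mention, the word "verify", then the claim text.
TRIGGER = re.compile(r"@last-light\s+verify\s+(?P<claim>.+)", re.IGNORECASE | re.DOTALL)

# Hypothetical pattern for the CLI target, e.g. "acme/widgets#42".
TARGET = re.compile(r"^(?P<owner>[\w.-]+)/(?P<repo>[\w.-]+)#(?P<number>\d+)$")


def extract_claim(comment_body: str) -> str | None:
    """Return the claim from an explicit verify comment, or None if absent."""
    match = TRIGGER.search(comment_body)
    if match is None:
        return None
    # Drop surrounding whitespace and optional double quotes around the claim.
    claim = match.group("claim").strip().strip('"')
    return claim or None


def parse_target(target: str) -> tuple[str, str, int] | None:
    """Split "owner/repo#N" into (owner, repo, number), or None if malformed."""
    match = TARGET.match(target)
    if match is None:
        return None
    return match.group("owner"), match.group("repo"), int(match.group("number"))


if __name__ == "__main__":
    print(extract_claim("@last-light verify the rate limiter blocks at 100 req/s"))
    print(parse_target("acme/widgets#42"))
```

A comment that does not match the explicit form returns `None` here; in the described workflow such a message could still be picked up by the classifier as a verify intent.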

Skills

This workflow uses the verify skill (the investigator procedure and report shape) plus the building skill (install + run the repo). The gated browser phase additionally uses the browser-qa skill (a bundled Playwright driver) to capture screenshot evidence.