Skip to content

t1k:test

FieldValue
Modulet1k-base
Version2.17.3
Effortmedium
ToolsAgent, Bash, Glob, Grep, Read, Task

Keywords: compile, coverage, failing, flaky, run-tests, test, unit

/t1k:test
[context] OR compile OR coverage OR --flaky OR --diff

Delegate to the registered t1k-tester agent. Never ignore failing tests.

FlagModeBehavior
(default)FullRun entire test suite
--flakyFlaky detectionRe-run failing tests up to 3 times, report retry rate per test
--diffDiff-awareOnly run tests for files changed since main branch
--coverageCoverageRun with coverage reporting, flag uncovered critical paths
  1. Run test suite normally
  2. For any failing test, re-run it up to 3 times in isolation
  3. If test passes on retry: mark as FLAKY (intermittent)
  4. If test fails all retries: mark as FAILING (genuine failure)
  5. Report: flaky tests with retry rate, genuine failures separately
  1. Run git diff --name-only origin/main...HEAD to find changed files
  2. Map changed source files → corresponding test files
  3. Run only mapped test files
  4. If no mapping found: run full suite (fallback)

NEVER IGNORE FAILING TESTS. Fix root causes, not symptoms.

Full-Verify Contract — What “Tests Green” Actually Means

Section titled “Full-Verify Contract — What “Tests Green” Actually Means”

“Tests passed” is insufficient evidence to claim done. Three gate types must all pass before reporting complete:

GateQuestion it answersTypical command
Type-checkDoes it compile?bun run typecheck / tsc --noEmit / cargo check
Lint / FormatDoes it meet project style?bun run lint / biome check / eslint . / ruff check
TestsDoes behavior match spec?bun test / npm test / pytest / cargo test

These gates are orthogonal. A change can pass tests and fail lint (or vice versa). CI runs all three; claiming “done” after running only one or two is a false positive that surfaces as a red PR later.

Before claiming work complete:

  1. Run the kit’s type-check script — read the output
  2. Run the kit’s lint/format script — read the output
  3. Run the kit’s test command — read the output
  4. Only then claim “done” / “green” / “verified”

If the kit has no lint script: note this explicitly (“no lint gate in this project”) so it’s visible rather than silently skipped.

Check package.json:

Terminal window
jq -r '.scripts | to_entries[] | select(.key | test("^(test|lint|typecheck|check|format|build)$")) | "\(.key): \(.value)"' package.json

If a repo has lint + typecheck + test scripts, run all three. Don’t assume one covers the others.

Invoking Watch-Capable Runners (vitest / jest) — foreground + timeout, never background

Section titled “Invoking Watch-Capable Runners (vitest / jest) — foreground + timeout, never background”

Vitest and Jest default to watch mode when invoked bare and never exit — always use the one-shot form (vitest run, jest --ci).

Run the one-shot form in the foreground with a Bash timeout, not as a background job:

Terminal window
CI=1 timeout 300 npx vitest run path/to/suite --reporter=basic # one-shot, exits on its own
  • Don’t background vitest run. Output capture is unreliable under a non-TTY background pipe — the run can finish (or hang) while the captured log stays empty.
  • Force non-interactive output: CI=1 and/or --reporter=basic / --no-color.
  • Genuine-hang fallback: re-run with --pool=forks --no-file-parallelism; kill the stuck process rather than polling its empty log.

MANDATORY — Kill vitest processes on completion (avoid OOM / IDE crash)

Section titled “MANDATORY — Kill vitest processes on completion (avoid OOM / IDE crash)”

--pool=forks (and --pool=threads on older versions) leaves orphaned node workers alive on Windows when the parent exits via timeout, Ctrl-C, or normal completion. Accumulated orphans eat RAM and have crashed the editor / Claude Code host repeatedly.

Every vitest invocation MUST end with a kill sweep — pass, fail, foreground, or background. Sweep = kill the captured PID, kill its child tree, then name-fallback against node.*vitest. Wrap in trap EXIT (Bash) / try-finally (PowerShell) when scripted. Apply after run returns, after timeout 124, and before launching the next vitest.

Snippets: see references/vitest-process-management.md.

MANDATORY — Cap concurrent background vitest runs by device hardware

Section titled “MANDATORY — Cap concurrent background vitest runs by device hardware”

A static cap is wrong — a 4-core/8GB laptop chokes at 2, a 16-core/64GB workstation idles at 2. Compute the cap from the host:

cap = max(1, min(floor(logicalCores / 4), floor(freeMemGB / 4), 6))

cores/4 and freeMemGB/4 because one run already saturates ~25% of cores and 2-4 GB RAM (worker pool ≈ cores/2). Ceiling 6 prevents IDE/host paging. Floor 1 keeps low-end boxes functional.

Compute once per session (hardware doesn’t change). Before launch, count live node.*vitest processes — if ≥ cap, STOP (wait or kill oldest). Prefer foreground + timeout; at most cap - 1 background runs so a foreground slot stays free.

Compute snippets + reference table: see references/vitest-process-management.md.

Self-Teach on CI Gate Failures — MANDATORY

Section titled “Self-Teach on CI Gate Failures — MANDATORY”

Every time a CI quality gate (lint, format, typecheck, test, build, security scan) fails on a PR, you MUST update the responsible skill to prevent the same failure in future sessions. This is the development-principles.md “Update Skills After Every Error” rule applied specifically to CI.

Failure → lesson loop:

  1. CI gate fails on PR
  2. Identify which local-verify command would have caught it (the CI step’s name + what it runs)
  3. If that command is NOT in the skill’s verify checklist → add it
  4. If it IS in the checklist but was skipped → note why, reinforce the checklist
  5. Commit the skill update in the SAME session as the CI fix, not “later”

If you are about to push a fix-up commit for a CI failure without updating a skill, stop and update the skill first. The fix-up without the skill update guarantees the same person (or future you) will hit the same gate again.

PR theonekit-cli#79 (2026-04-21) had bun run typecheck clean + 433 tests green, but CI’s Lint stage failed on 3 OS matrices because biome check wasn’t in the local verify step. Cost: one extra round trip. Had a full-verify contract been followed, the formatter violations would have been caught and fixed locally before push. This skill section exists BECAUSE that happened.

Follow protocol: skills/t1k-cook/references/routing-protocol.md This command uses role: t1k-tester

Follow protocol: skills/t1k-cook/references/activation-protocol.md

  1. Compilation check (read console or build output)
  2. Run tests via registered t1k-tester agent
  3. Analyze test results for failures
  4. If failures → spawn registered t1k-debugger for root cause
  5. Report structured results

Module Context for Tester (if installedModules or modules present in metadata.json)

Section titled “Module Context for Tester (if installedModules or modules present in metadata.json)”

Follow protocol: skills/t1k-cook/references/subagent-injection-protocol.md Before spawning t1k-tester agent, inject:

  • Which module’s files are being tested (from .claude/metadata.json)
  • Module’s test skills if available
  • Boundary: “Test files in module {name} should not test cross-module behavior”

Sub-agent forking: see skills/t1k-architecture/references/fork-hygiene.md.