t1k:test

Field	Value
Module	`t1k-base`
Version	`2.30.4`
Effort	`medium`
Tools	`Agent`, `Bash`, `Glob`, `Grep`, `Read`, `Task`

Keywords: compile, coverage, failing, flaky, run-tests, test, unit

How to invoke

/t1k:test
[context] OR compile OR coverage OR --flaky OR --diff

TheOneKit Test — Test Runner

Delegate to the registered t1k-tester agent. Never ignore failing tests.

Modes

Flag	Mode	Behavior
(default)	Full	Run entire test suite
`--flaky`	Flaky detection	Re-run failing tests up to 3 times, report retry rate per test
`--diff`	Diff-aware	Only run tests for files changed since main branch
`--coverage`	Coverage	Run with coverage reporting, flag uncovered critical paths

Flaky Test Detection (`--flaky`)

Run test suite normally
For any failing test, re-run it up to 3 times in isolation
If test passes on retry: mark as FLAKY (intermittent)
If test fails all retries: mark as FAILING (genuine failure)
Report: flaky tests with retry rate, genuine failures separately
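
The retry loop above can be sketched in shell. This is a minimal sketch under stated assumptions, not the t1k-tester implementation: classify_test and MAX_RETRIES are hypothetical names, and the command passed in stands for whatever runs a single test in isolation.

```shell
# Hypothetical helper: call it only for a test that already failed in the full run.
MAX_RETRIES=3

# classify_test CMD...: re-run CMD up to MAX_RETRIES times and print a verdict.
classify_test() {
  attempt=1
  while [ "$attempt" -le "$MAX_RETRIES" ]; do
    if "$@" >/dev/null 2>&1; then
      # Passed on a retry: intermittent. Report which retry succeeded.
      echo "FLAKY (passed on retry $attempt/$MAX_RETRIES)"
      return 0
    fi
    attempt=$((attempt + 1))
  done
  # Failed every retry: genuine failure.
  echo "FAILING (failed $MAX_RETRIES/$MAX_RETRIES retries)"
  return 1
}
```

For example, `classify_test pytest tests/test_io.py::test_read` (the test path is illustrative) prints one verdict line per test, which keeps flaky tests and genuine failures separable in the final report.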

Diff-Aware Testing (`--diff`)

Run git diff --name-only origin/main...HEAD to find changed files
Map changed source files → corresponding test files
Run only mapped test files
If no mapping found: run full suite (fallback)
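
The mapping and fallback steps above could look like the following sketch. It assumes a co-located naming convention (src/foo.ts is tested by src/foo.test.ts), which is only one possible layout; map_changed_to_tests, select_tests, and the FULL_SUITE marker are hypothetical names.

```shell
# map_changed_to_tests: read changed paths on stdin, print existing test files.
# ASSUMPTION: tests sit next to sources as <name>.test.<ext>; adapt to the project.
map_changed_to_tests() {
  while IFS= read -r file; do
    case "$file" in
      *.test.*) candidate=$file ;;                               # changed test maps to itself
      *.*)      candidate="${file%.*}.test.${file##*.}" ;;       # foo.ts -> foo.test.ts
      *)        continue ;;                                      # no extension: nothing to map
    esac
    # Keep only mappings that exist on disk (also drops deleted files).
    if [ -f "$candidate" ]; then
      echo "$candidate"
    fi
  done | sort -u
}

# select_tests: print the mapped test files, or FULL_SUITE when nothing mapped.
select_tests() {
  tests=$(map_changed_to_tests)
  if [ -z "$tests" ]; then
    echo "FULL_SUITE"
  else
    echo "$tests"
  fi
}
```

Usage: `git diff --name-only origin/main...HEAD | select_tests`, then pass the printed files to the test runner, or run the whole suite when the output is FULL_SUITE.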

Core Principle

NEVER IGNORE FAILING TESTS. Fix root causes, not symptoms.

Full-Verify Contract — What “Tests Green” Actually Means

“Tests passed” is insufficient evidence to claim done. Three gate types must all pass before reporting complete:

Gate	Question it answers	Typical command
Type-check	Does it compile?	`bun run typecheck` / `tsc --noEmit` / `cargo check`
Lint / Format	Does it meet project style?	`bun run lint` / `biome check` / `eslint .` / `ruff check`
Tests	Does behavior match spec?	`bun test` / `npm test` / `pytest` / `cargo test`

These gates are orthogonal. A change can pass tests and fail lint (or vice versa). CI runs all three; claiming “done” after running only one or two is a false positive that surfaces as a red PR later.

Rule

Before claiming work complete:

Run the kit’s type-check script — read the output
Run the kit’s lint/format script — read the output
Run the kit’s test command — read the output
Only then claim “done” / “green” / “verified”

If the kit has no lint script: note this explicitly (“no lint gate in this project”) so it’s visible rather than silently skipped.
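
As an illustration of the rule, the three gates can be chained so that a missing gate is reported rather than silently skipped. This is a sketch only: gate, verdict, and the failures counter are hypothetical names, and the commands passed in are whatever the project actually defines.

```shell
failures=0

# gate NAME [CMD...]: run one gate with its output visible; count failures.
# Calling it with no command records an explicit SKIP for that gate.
gate() {
  name=$1
  shift
  if [ "$#" -eq 0 ]; then
    echo "SKIP: $name (no $name gate in this project)"
    return 0
  fi
  if "$@"; then
    echo "PASS: $name"
  else
    echo "FAIL: $name"
    failures=$((failures + 1))
  fi
}

# verdict: only report "verified" when every gate that ran has passed.
verdict() {
  if [ "$failures" -eq 0 ]; then
    echo "verified"
  else
    echo "NOT done: $failures gate(s) failed"
  fi
}
```

Usage in a bun project might be `gate typecheck bun run typecheck`, `gate lint bun run lint`, `gate tests bun test`, then `verdict`. All gates run even if an earlier one fails, so one pass shows every red gate.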

Common scripts to discover

Check package.json:

jq -r '.scripts | to_entries[] | select(.key | test("^(test|lint|typecheck|check|format|build)$")) | "\(.key): \(.value)"' package.json

If a repo has lint + typecheck + test scripts, run all three. Don’t assume one covers the others.

Invoking Watch-Capable Runners (vitest / jest) — foreground + timeout, never background

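Vitest starts in watch mode when invoked bare in an interactive terminal and then never exits; Jest behaves the same way under --watch / --watchAll, which some project scripts enable by default. Always use the one-shot form (vitest run, jest --ci).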

Run the one-shot form in the foreground with a Bash timeout, not as a background job:

CI=1 timeout 300 npx vitest run path/to/suite --reporter=basic   # one-shot, exits on its own

Don’t background vitest run. Output capture is unreliable under a non-TTY background pipe — the run can finish (or hang) while the captured log stays empty.
Force non-interactive output: CI=1 and/or --reporter=basic / --no-color.
Genuine-hang fallback: re-run with --pool=forks --no-file-parallelism; kill the stuck process rather than polling its empty log.
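
A small wrapper makes the three outcomes of a foreground run explicit. This sketch assumes GNU coreutils timeout, which exits with status 124 when the limit is hit; run_one_shot is a hypothetical name.

```shell
# run_one_shot SECONDS CMD...: foreground run with a hard limit and CI=1 set.
# Prints the command's own output, then one RESULT line; returns CMD's status.
run_one_shot() {
  limit=$1
  shift
  CI=1 timeout "$limit" "$@" && status=0 || status=$?
  case "$status" in
    0)   echo "RESULT: passed" ;;
    124) echo "RESULT: timed out after ${limit}s (likely watch mode or a hang)" ;;
    *)   echo "RESULT: failed (exit $status)" ;;
  esac
  return "$status"
}
```

Usage: `run_one_shot 300 npx vitest run path/to/suite --reporter=basic`. A timed-out result is the cue to try the genuine-hang fallback rather than to wait longer.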

MANDATORY — Kill vitest processes on completion (avoid OOM / IDE crash)

--pool=forks (and --pool=threads on older versions) leaves orphaned node workers alive on Windows when the parent exits via timeout, Ctrl-C, or normal completion. Accumulated orphans eat RAM and have crashed the editor / Claude Code host repeatedly.

Every vitest invocation MUST end with a kill sweep, whether it passed or failed and whether it ran in the foreground or background. Sweep = kill the captured PID, kill its child tree, then fall back to a name match against node.*vitest. Wrap in trap EXIT (Bash) / try-finally (PowerShell) when scripted. Apply after the run returns, after timeout exits with status 124, and before launching the next vitest.

Snippets: see references/vitest-process-management.md.
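
Independently of the snippets in that reference file, the sweep order described above (captured PID, child tree, name fallback) can be sketched as follows. It assumes pgrep and pkill from procps are available, which is typical on Linux and macOS but not on a plain Windows shell; kill_tree and vitest_sweep are hypothetical names.

```shell
# kill_tree PID: terminate PID's descendants first, then PID itself.
# ASSUMPTION: pgrep -P lists direct children; a missing pgrep is treated as "no children".
kill_tree() {
  for child in $(pgrep -P "$1" 2>/dev/null); do
    kill_tree "$child"
  done
  kill "$1" 2>/dev/null || true
}

# vitest_sweep [PID]: kill the captured PID tree if given, then sweep by name.
vitest_sweep() {
  if [ -n "${1:-}" ]; then
    kill_tree "$1"
  fi
  # Name fallback for re-parented orphans. The [n] bracket stops the pattern
  # from matching a shell whose own command line contains this text.
  pkill -f '[n]ode.*vitest' 2>/dev/null || true
}
```

In a script, `trap vitest_sweep EXIT` placed before the vitest line makes the sweep run on normal completion and on error exits alike.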

MANDATORY — Cap concurrent background vitest runs by device hardware

A static cap is wrong — a 4-core/8GB laptop chokes at 2, a 16-core/64GB workstation idles at 2. Compute the cap from the host:

cap = max(1, min(floor(logicalCores / 4), floor(freeMemGB / 4), 6))

cores/4 and freeMemGB/4 because one run already saturates ~25% of cores and 2-4 GB RAM (worker pool ≈ cores/2). Ceiling 6 prevents IDE/host paging. Floor 1 keeps low-end boxes functional.

Compute once per session (hardware doesn’t change). Before launch, count live node.*vitest processes — if ≥ cap, STOP (wait or kill oldest). Prefer foreground + timeout; at most cap - 1 background runs so a foreground slot stays free.

Compute snippets + reference table: see references/vitest-process-management.md.
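
As a worked instance of the formula, the sketch below takes logical cores and free memory (whole GB) as arguments so it stays independent of how the host reports them. compute_cap is a hypothetical name, not the snippet from the reference file.

```shell
# compute_cap LOGICAL_CORES FREE_MEM_GB
# Prints max(1, min(floor(cores / 4), floor(freeMemGB / 4), 6)).
# Shell integer division already floors for non-negative integers.
compute_cap() {
  by_cores=$(( $1 / 4 ))
  by_mem=$(( $2 / 4 ))
  cap=6                                  # ceiling
  if [ "$by_cores" -lt "$cap" ]; then
    cap=$by_cores
  fi
  if [ "$by_mem" -lt "$cap" ]; then
    cap=$by_mem
  fi
  if [ "$cap" -lt 1 ]; then
    cap=1                                # floor
  fi
  echo "$cap"
}
```

With these inputs, `compute_cap 4 8` prints 1 and `compute_cap 16 64` prints 4, which matches the laptop and workstation cases above; `compute_cap 64 256` is held at the ceiling of 6.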

Self-Teach on CI Gate Failures — MANDATORY

Every time a CI quality gate (lint, format, typecheck, test, build, security scan) fails on a PR, you MUST update the responsible skill to prevent the same failure in future sessions. This is the development-principles.md “Update Skills After Every Error” rule applied specifically to CI.

Failure → lesson loop:

CI gate fails on PR
Identify which local-verify command would have caught it (the CI step’s name + what it runs)
If that command is NOT in the skill’s verify checklist → add it
If it IS in the checklist but was skipped → note why, reinforce the checklist
Commit the skill update in the SAME session as the CI fix, not “later”

If you are about to push a fix-up commit for a CI failure without updating a skill, stop and update the skill first. The fix-up without the skill update guarantees the same person (or future you) will hit the same gate again.

Bug trail (why this matters)

PR theonekit-cli#79 (2026-04-21) had bun run typecheck clean + 433 tests green, but CI’s Lint stage failed on 3 OS matrix jobs because biome check wasn’t in the local verify step. Cost: one extra round trip. Had the full-verify contract been followed, the formatter violations would have been caught and fixed locally before push. This skill section exists BECAUSE that happened.

Agent Routing

Follow protocol: skills/t1k-cook/references/routing-protocol.md
This command uses role: t1k-tester

Skill Activation

Follow protocol: skills/t1k-cook/references/activation-protocol.md

Workflow

Compilation check (read console or build output)
Run tests via registered t1k-tester agent
Analyze test results for failures
If failures → spawn registered t1k-debugger for root cause
Report structured results

Module Context for Tester (if `installedModules` or `modules` present in metadata.json)

Follow protocol: skills/t1k-cook/references/subagent-injection-protocol.md
Before spawning the t1k-tester agent, inject:

Which module’s files are being tested (from .claude/metadata.json)
Module’s test skills if available
Boundary: “Test files in module {name} should not test cross-module behavior”

Sub-Agent Fork Hygiene

Sub-agent forking: see skills/t1k-architecture/references/fork-hygiene.md.