Commit Graph

7 Commits

Author SHA1 Message Date
adlee-was-taken
b8bc432175 feat(soak): artifacts, graceful shutdown, health probes, smoke script, v3.3.4
Batched remaining harness tasks (27-30, 33):

Task 27 — Artifact capture on failure: screenshots, HTML snapshots,
game state JSON, and console error tails are captured into
tests/soak/artifacts/<run-id>/ when a scenario throws. Successful
runs get a summary.json. Old runs (>7d) are pruned on startup.

Task 28 — Graceful shutdown: first SIGINT/SIGTERM flips the abort
signal (scenarios finish current turn then unwind). 10s after, a
hard-kill fires if cleanup hangs. Double Ctrl-C = immediate exit.
Exit codes: 0 success, 1 errors, 2 interrupted.

Task 29 — Periodic health probes: every 30s GET /health against the
target server. Three consecutive failures abort the run with
health_fatal, preventing staging outages from being misattributed
to harness bugs. Corrected endpoint from /api/health to /health
per server/routers/health.py.

Task 30 — Smoke test script: tests/soak/scripts/smoke.sh, a 60s
end-to-end canary that health-probes the target, seeds if needed,
and runs one minimal populate game.

Task 33 — Version bump to v3.3.4: both index.html footers (was
v3.1.6), new footer added to admin.html (had none), pyproject.toml.

Also fixes discovered during stress testing:
- SessionPool sets baseURL on all contexts so relative goto('/')
  resolves correctly between games (was "invalid URL" error)
- RoomCoordinator key is now unique per game-start (Date.now
  suffix) so Deferred promises don't carry stale room codes from
  previous games

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 22:57:15 -04:00
adlee-was-taken
d3b468575b feat(soak): per-room watchdog + heartbeat wiring + multi-game lobby fix
Watchdog class with 4 Vitest tests (27 total now), wired into
ctx.heartbeat in the runner. One watchdog per room with a 60s
timeout; firing logs an error, marks the room's dashboard tile
as errored, and triggers the abort signal so the scenario unwinds.
Watchdogs are explicitly stopped in the runner's finally block
so pending timers don't keep the node process alive on exit.

Also fixes a multi-game bug discovered during stress scenario
verification: after a game ends sessions stay parked on the
game_over screen, which hides the lobby and makes a subsequent
#create-room-btn click time out. runOneMultiplayerGame now
navigates every session to / before each game — localStorage
auth persists so nothing re-logs in.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 21:52:49 -04:00
adlee-was-taken
c307027dd0 feat(soak): --watch=tiled launches N headed host windows
SessionPool accepts headedHostCount; when > 0 it launches a second
Chromium in headed mode and creates the first N sessions in it.
Joiners (sessions N..count) stay headless in the main browser.

Each headed context gets a 960×900 viewport — tall enough to show
the full game table (deck + opponent row + own 2×3 card grid +
status area) without clipping. Horizontal tiling still fits two
windows side-by-side on a 1920-wide display.

window.moveTo is kept as a best-effort tile-placement hint, but
viewport from newContext() is what actually sizes the window
(window.resizeTo is a no-op on modern Chromium / Wayland).

Verified: 1-room tiled run plays a full game cleanly; 2-room
parallel tiled had one window get closed mid-run, which is
consistent with a user manually dismissing a window — tiled mode
is a best-effort hands-on debugging aid, not an automation mode.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 21:01:21 -04:00
adlee-was-taken
21fe53eaf7 feat(soak): click-to-watch live video via CDP screencast
Runner creates a Screencaster before sessions are acquired, then
wires its start/stop into DashboardServer.onStartStream / onStopStream
after sessions exist (the handlers close over a sessionsByKey map).
Clicking a player tile in the dashboard starts a CDP screencast on
that session's page, forwards JPEG frames as WS "frame" messages.
Closing the modal (or disconnecting the WS) stops all screencasts.

Verified end-to-end: programmatically connected WS, sent start_stream,
received 5 frames (13.7KB each), sent stop_stream, screencast_stopped
log line fired, run completed cleanly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 20:54:52 -04:00
adlee-was-taken
796de876b7 feat(soak): wire --watch=dashboard in runner
Starts DashboardServer on port 7777 (or --dashboard-port), uses its
reporter as ctx.dashboard, auto-opens the URL via xdg-open/open/start.
Cleans up on exit. WS client connections logged as info events so
you can see when the browser attaches.

Verified: 2-account populate run with --watch=dashboard serves the
static page on :7777, accepts WS connections, cleanly shuts down
when the run completes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 19:05:44 -04:00
adlee-was-taken
d1688aae0b feat(soak): runner.ts end-to-end with --watch=none
First full end-to-end milestone: parses CLI, builds SessionPool +
RoomCoordinator, loads a scenario by name, runs it, reports results,
cleans up. Watch modes other than "none" log a warning and fall back
until Tasks 19-24 implement them.

Smoke test passed against local dev:
  bun run soak -- --scenario=populate --accounts=2 --rooms=1
    --cpus-per-room=0 --games-per-room=1 --holes=1 --watch=none
  → Games completed: 1, Errors: 0, Duration: 78.2s, exit 0

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 18:52:08 -04:00
adlee-was-taken
5478a4299e feat(soak): scaffold tests/soak package
Placeholder runner, tsconfig with @bot alias to tests/e2e/bot,
gitignored .env.stresstest + artifacts. Real behavior follows
in Task 10 onward.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 08:19:09 -04:00