# Golf Soak & UX Test Harness

Standalone Playwright-based runner that drives multiple authenticated browser sessions playing real multiplayer games. Used for:

- **Scoreboard population**: fill staging leaderboards with realistic data
- **Stability stress testing**: hunt race conditions, WebSocket leaks, cleanup bugs
- **Live monitoring**: watch bot sessions play in real time via CDP screencast

## Prerequisites

- [Bun](https://bun.sh/) (or Node.js + npm)
- Chromium browser binary (installed via `bunx playwright install chromium`)
- A running Golf Card Game server (local dev or staging)
- An invite code flagged as `marks_as_test=TRUE` (see [Bring-up](#first-time-setup))

## First-time setup

### 1. Install dependencies

```bash
cd tests/soak
bun install
bunx playwright install chromium
```

### 2. Flag the invite code as test-seed

Any account registered with a test-seed invite gets `is_test_account=TRUE`, which keeps it out of real-user stats and leaderboards.

**Local dev:**

```bash
PGPASSWORD=devpassword psql -h localhost -U golf -d golf <<'SQL'
INSERT INTO invite_codes (code, created_by, expires_at, max_uses, is_active, marks_as_test)
SELECT 'SOAKTEST', id, NOW() + INTERVAL '10 years', 100, TRUE, TRUE
FROM users_v2 LIMIT 1
ON CONFLICT (code) DO UPDATE SET marks_as_test = TRUE;
SQL
```

**Staging:**

```bash
ssh root@129.212.150.189 \
  'docker compose -f /opt/golfgame/docker-compose.staging.yml exec -T postgres psql -U postgres -d golfgame' <<'SQL'
UPDATE invite_codes SET marks_as_test = TRUE WHERE code = '5VC2MCCN';
SQL
```

### 3. Seed test accounts

```bash
# Local dev
TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST bun run seed

# Staging
TEST_URL=https://staging.adlee.work SOAK_INVITE_CODE=5VC2MCCN bun run seed
```

This registers 16 accounts via the invite code and caches their credentials in `.env.stresstest`. It only needs to run once; subsequent runs reuse the cached credentials (logging in again if tokens expire).

### 4. Verify with a smoke test

```bash
# Local dev
TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST bash scripts/smoke.sh
```

Expected: one game plays to completion in ~60 seconds, exits 0.

## Usage

### Populate scoreboards (recommended first run)

```bash
TEST_URL=https://staging.adlee.work SOAK_INVITE_CODE=5VC2MCCN bun run soak -- \
  --scenario=populate \
  --watch=dashboard
```

This runs 4 rooms x 10 games x 9 holes with varied CPU personalities. The dashboard opens automatically at `http://localhost:7777`.

### Quick smoke against staging

```bash
TEST_URL=https://staging.adlee.work SOAK_INVITE_CODE=5VC2MCCN bun run soak -- \
  --scenario=populate \
  --accounts=2 --rooms=1 --cpus-per-room=0 \
  --games-per-room=1 --holes=1 \
  --watch=dashboard
```

### Stress test with chaos injection

```bash
TEST_URL=https://staging.adlee.work SOAK_INVITE_CODE=5VC2MCCN bun run soak -- \
  --scenario=stress \
  --accounts=4 --rooms=1 --games-per-room=5 \
  --watch=dashboard
```

Rapid 1-hole games with random chaos events (rapid clicks, tab blur, brief network outage) injected during gameplay.

### Headless mode (CI / overnight)

```bash
TEST_URL=https://staging.adlee.work SOAK_INVITE_CODE=5VC2MCCN bun run soak -- \
  --scenario=populate --watch=none
```

Outputs structured JSONL to stdout. Pipe to `jq` for filtering:

```bash
bun run soak -- --scenario=populate --watch=none 2>&1 | jq 'select(.msg == "game_complete")'
```

### Tiled mode (native browser windows)

```bash
bun run soak -- --scenario=populate --rooms=2 --watch=tiled
```

Opens visible Chromium windows for each room's host session. Useful for hands-on debugging with DevTools.
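
### Summarizing headless output (sketch)

The `jq` filter in the headless section selects one event type at a time. For an overnight run it can be handy to tally every event type in one pass. The following is a minimal sketch, not part of the harness: the function name `countByMsg` and the sample log lines are hypothetical, and the only assumption about the log format is that each line is a JSON object with a string `msg` field, as the `jq` example implies.

```typescript
// Sketch: tally soak-run log events by their `msg` field.
// Assumption: each log line is a JSON object with a string `msg`
// (the only field the jq example in this README relies on).
function countByMsg(jsonl: string): Map<string, number> {
  const counts = new Map<string, number>();
  for (const line of jsonl.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;

    let record: unknown;
    try {
      record = JSON.parse(trimmed);
    } catch {
      continue; // not JSON, e.g. stray stderr text merged in by 2>&1
    }

    if (typeof record !== "object" || record === null) continue;
    const msg = (record as { msg?: unknown }).msg;
    if (typeof msg !== "string") continue;

    counts.set(msg, (counts.get(msg) ?? 0) + 1);
  }
  return counts;
}

// Made-up sample lines; only the `msg` key and the "game_complete"
// value come from this README, everything else is hypothetical.
const sample = [
  '{"msg":"game_complete","room":"hypothetical-room-1"}',
  "not json",
  '{"msg":"game_complete","room":"hypothetical-room-2"}',
  '{"msg":"hypothetical_other_event"}',
].join("\n");

const counts = countByMsg(sample);
console.log(counts.get("game_complete")); // 2
```

Lines that fail to parse are skipped rather than treated as errors, since the `2>&1` redirect in the `jq` example can mix non-JSON text into the stream.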
## CLI flags

```
--scenario=populate|stress    required; which scenario to run
--accounts=                   total sessions (default: from scenario)
--rooms=                      parallel rooms (default: from scenario)
--cpus-per-room=              CPU opponents per room (default: from scenario)
--games-per-room=             games per room (default: from scenario)
--holes=                      holes per game (default: from scenario)
--watch=none|dashboard|tiled  visualization mode (default: dashboard)
--dashboard-port=             dashboard server port (default: 7777)
--target=                     override TEST_URL env var
--run-id=                     custom run identifier (default: timestamp)
--list                        print available scenarios and exit
--dry-run                     validate config without running
```

`accounts` must be evenly divisible by `rooms`.

## Environment variables

| Variable | Description | Default |
|---|---|---|
| `TEST_URL` | Target server base URL | `http://localhost:8000` |
| `SOAK_INVITE_CODE` | Invite code for account seeding | `SOAKTEST` |
| `SOAK_HOLES` | Environment equivalent of `--holes` | (none) |
| `SOAK_ROOMS` | Environment equivalent of `--rooms` | (none) |
| `SOAK_ACCOUNTS` | Environment equivalent of `--accounts` | (none) |
| `SOAK_CPUS_PER_ROOM` | Environment equivalent of `--cpus-per-room` | (none) |
| `SOAK_GAMES_PER_ROOM` | Environment equivalent of `--games-per-room` | (none) |
| `SOAK_WATCH` | Environment equivalent of `--watch` | (none) |
| `SOAK_DASHBOARD_PORT` | Environment equivalent of `--dashboard-port` | (none) |

Config precedence: CLI flags > env vars > scenario defaults. An env var overrides the scenario default, and a CLI flag overrides both.

## Watch modes

### `dashboard` (default)

Opens `http://localhost:7777` with a live status grid:

- 2x2 room tiles showing phase, current player, move count, progress bar
- Activity log at the bottom
- **Click any player tile** to watch their live session via CDP screencast
- Press Esc or click Close to stop the video feed
- WS connection status indicator

The dashboard runs **locally on your machine**: the runner's headless browsers connect to the target server remotely, while the dashboard UI is served from your workstation.

### `tiled`

Opens native Chromium windows for each room's host session, positioned in a grid. Joiners stay headless.
Useful for interactive debugging with DevTools. The viewport is sized at 960x900 to show the full game table.

### `none`

Pure headless, structured JSONL to stdout. Use for CI, overnight runs, or piping to `jq`.

## Scenarios

### `populate`

Long multi-round games to populate scoreboards with realistic data.

| Setting | Default |
|---|---|
| Accounts | 16 |
| Rooms | 4 |
| CPUs per room | 1 |
| Games per room | 10 |
| Holes | 9 |
| Decks | 2 |
| Think time | 800-2200ms |

### `stress`

Rapid short games with chaos injection for stability testing.

| Setting | Default |
|---|---|
| Accounts | 16 |
| Rooms | 4 |
| CPUs per room | 2 |
| Games per room | 50 |
| Holes | 1 |
| Decks | 1 |
| Think time | 50-150ms |
| Chaos chance | 5% per turn |

Chaos events: `rapid_clicks`, `tab_blur`, `brief_offline`

### Adding new scenarios

Create `scenarios/.ts` exporting a `Scenario` object, then register it in `scenarios/index.ts`. See existing scenarios for the pattern.

## Error handling

- **Per-room isolation**: a failure in one room never unwinds other rooms (`Promise.allSettled`)
- **Watchdog**: 60s per-room timeout; fires if no heartbeat arrives
- **Health probes**: `GET /health` every 30s, 3 consecutive failures = fatal abort
- **Graceful shutdown**: Ctrl-C finishes current turn, then cleans up (10s timeout). Double Ctrl-C = immediate force exit
- **Artifacts**: on failure, screenshots + HTML + game state JSON saved to `artifacts//`. Old artifacts auto-pruned after 7 days
- **Exit codes**: `0` = success, `1` = errors, `2` = interrupted

## Test account filtering

Soak accounts are flagged `is_test_account=TRUE` in the database.
They are:

- **Hidden by default** from public leaderboards and stats (`?include_test=false`)
- **Visible to admins** by default in the admin panel
- **Togglable** via the "Include test accounts" checkbox in the admin panel
- **Badged** with `[Test]` in the admin user list and `[Test-seed]` on the invite code

## Unit tests

```bash
bun run test
```

27 tests covering Deferred, RoomCoordinator, Watchdog, Logger, and Config. Integration-level modules (SessionPool, scenarios, dashboard) are verified by the smoke test and live runs.

## Architecture

```
runner.ts                 CLI entry: parses flags, wires everything, runs scenario
core/
  session-pool.ts         Owns browser contexts, seeds/logs in accounts
  room-coordinator        Deferred-based host→joiners room code handoff
  watchdog.ts             Per-room timeout detector
  screencaster.ts         CDP Page.startScreencast for live video
  logger.ts               Structured JSONL logger with child contexts
  artifacts.ts            Screenshot/HTML/state capture on failure
  types.ts                Scenario/Session/Logger contracts
scenarios/
  populate.ts             Long multi-round games
  stress.ts               Rapid games with chaos injection
  shared/
    multiplayer-game.ts   Shared "play one game" loop
    chaos.ts              Chaos event injector
dashboard/
  server.ts               HTTP + WS server
  index.html              Status grid UI
  dashboard.js            WS client + click-to-watch
scripts/
  seed-accounts.ts        Account seeding CLI
  smoke.sh                End-to-end canary (~60s)
```

Reuses `tests/e2e/bot/golf-bot.ts` unchanged for all game interactions.

## Related docs

- [Design spec](../../docs/superpowers/specs/2026-04-10-multiplayer-soak-test-design.md)
- [Bring-up steps](../../docs/soak-harness-bringup.md)
- [Implementation plan](../../docs/superpowers/plans/2026-04-10-multiplayer-soak-test.md)
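
## Appendix: config resolution sketch

The precedence rule from the configuration sections (CLI flags > env vars > scenario defaults) and the requirement that accounts divide evenly across rooms can be illustrated with a small sketch. This is not the harness's actual `Config` module: the names `RoomConfig`, `pick`, and `resolveConfig` are hypothetical, and only the `SOAK_ACCOUNTS` / `SOAK_ROOMS` variable names and the `populate` defaults (16 accounts, 4 rooms) are taken from this README.

```typescript
// Sketch of the documented precedence: CLI flags > env vars > scenario defaults.
// All type and function names here are hypothetical, not the harness's own.
interface RoomConfig {
  accounts: number;
  rooms: number;
}

// Return the first value that is set, checking CLI, then env, then the default.
function pick(cli: number | undefined, env: string | undefined, fallback: number): number {
  if (cli !== undefined) return cli;
  if (env !== undefined && env !== "") {
    const parsed = Number.parseInt(env, 10);
    if (Number.isNaN(parsed)) throw new Error(`not a number: ${env}`);
    return parsed;
  }
  return fallback;
}

function resolveConfig(
  cli: Partial<RoomConfig>,
  env: Record<string, string | undefined>,
  defaults: RoomConfig,
): RoomConfig {
  const accounts = pick(cli.accounts, env["SOAK_ACCOUNTS"], defaults.accounts);
  const rooms = pick(cli.rooms, env["SOAK_ROOMS"], defaults.rooms);
  // Every room must receive the same number of sessions.
  if (rooms <= 0 || accounts % rooms !== 0) {
    throw new Error(`accounts (${accounts}) must divide evenly across rooms (${rooms})`);
  }
  return { accounts, rooms };
}

const populateDefaults: RoomConfig = { accounts: 16, rooms: 4 };

// The CLI value for accounts wins over SOAK_ACCOUNTS; rooms comes from SOAK_ROOMS.
const resolved = resolveConfig(
  { accounts: 4 },
  { SOAK_ACCOUNTS: "8", SOAK_ROOMS: "2" },
  populateDefaults,
);
console.log(resolved.accounts, resolved.rooms); // 4 2
```

Validating divisibility after merging all three sources, rather than per source, means a combination such as a CLI `--accounts` with an env-provided room count is still checked.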