From cf916d7bc3f60226d9420e74d6e89c9fafe72699 Mon Sep 17 00:00:00 2001 From: adlee-was-taken Date: Fri, 10 Apr 2026 23:37:15 -0400 Subject: [PATCH] docs: implementation plan for multiplayer soak harness MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit 33-task TDD plan across 11 phases implementing the soak & UX test harness design. Server-side schema/filter/admin changes ship first (independent), then the tests/soak/ TypeScript runner builds up incrementally — first milestone is a --watch=none smoke run against local dev after Task 18, then dashboard, live video, tiled mode, stress scenario, failure handling, and staging bring-up. Final task bumps project version to v3.3.4. Co-Authored-By: Claude Opus 4.6 (1M context) --- .../plans/2026-04-10-multiplayer-soak-test.md | 5495 +++++++++++++++++ 1 file changed, 5495 insertions(+) create mode 100644 docs/superpowers/plans/2026-04-10-multiplayer-soak-test.md diff --git a/docs/superpowers/plans/2026-04-10-multiplayer-soak-test.md b/docs/superpowers/plans/2026-04-10-multiplayer-soak-test.md new file mode 100644 index 0000000..c8b9252 --- /dev/null +++ b/docs/superpowers/plans/2026-04-10-multiplayer-soak-test.md @@ -0,0 +1,5495 @@ +# Multiplayer Soak & UX Test Harness — Implementation Plan + +> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. + +**Goal:** Build a standalone Playwright-based soak runner in `tests/soak/` that drives 16 authenticated browser sessions across 4 concurrent rooms playing many multiplayer games, with pluggable scenarios, a click-to-watch dashboard via CDP screencast, and strict per-room failure isolation. + +**Architecture:** Single-process node runner reusing the existing `GolfBot` class from `tests/e2e/bot/`. 
One shared browser (16 contexts) by default; `WATCH=tiled` uses a second headed browser for the 4 host contexts. Scenarios are plain TS modules exported from `tests/soak/scenarios/`. Dashboard is a tiny HTTP+WS server serving one static page that pushes live status and on-demand CDP screencast frames. + +**Tech Stack:** TypeScript + tsx (no build step), Playwright Core, ws (WebSocket server), Vitest for unit tests, FastAPI + asyncpg (existing server), PostgreSQL (existing). + +**Spec:** `docs/superpowers/specs/2026-04-10-multiplayer-soak-test-design.md` + +--- + +## Testing Strategy Notes + +- **Server-side Python changes:** The existing test suite mocks stores with `AsyncMock` and has no real-Postgres fixtures. Rather than inventing a new fixture pattern for this plan, server tasks use **curl-based verification against a running local dev server** as the explicit verification step after each commit. Run `python server/main.py` in another terminal (requires Postgres + Redis running — see `docs/INSTALL.md`). +- **TypeScript harness logic:** Unit-tested with Vitest for pure modules (Deferred, RoomCoordinator, Watchdog, Config). Integration-level modules (SessionPool, Dashboard, Screencaster, Scenarios) are verified by running the harness itself via the smoke test. +- **End-to-end validation:** `tests/soak/scripts/smoke.sh` is the canary — after every non-trivial change, run it against local dev and expect exit 0 within ~30s. + +--- + +## Phase 1 — Server-side changes (independent, ships first) + +### Task 1: Schema migration for `is_test_account` and `marks_as_test` + +Add two columns, one partial index, and rebuild the `leaderboard_overall` materialized view to include `is_test_account` (so the filter works through the view fast path). Fits the existing inline-migration pattern in `user_store.py`. 
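The inline-migration pattern only works if every statement is safe to run again on each server start. A minimal sketch of that idempotency follows, using Python's built-in `sqlite3` as a stand-in for Postgres; that substitution is an assumption made purely so the example runs without a database server. `column_exists` and `migrate` are hypothetical helpers, not code from `user_store.py`, and `0`/`1` stand in for `FALSE`/`TRUE`.

```python
import sqlite3


def column_exists(conn: sqlite3.Connection, table: str, column: str) -> bool:
    # PRAGMA table_info yields one row per column; index 1 holds the column name.
    return any(row[1] == column for row in conn.execute(f"PRAGMA table_info({table})"))


def migrate(conn: sqlite3.Connection) -> None:
    # Guarded ALTERs: adding the same column twice would raise, so check first.
    if not column_exists(conn, "users_v2", "is_test_account"):
        conn.execute("ALTER TABLE users_v2 ADD COLUMN is_test_account BOOLEAN DEFAULT 0")
    if not column_exists(conn, "invite_codes", "marks_as_test"):
        conn.execute("ALTER TABLE invite_codes ADD COLUMN marks_as_test BOOLEAN DEFAULT 0")
    # Partial index: only rows flagged as test accounts are indexed.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_users_v2_is_test_account "
        "ON users_v2(is_test_account) WHERE is_test_account = 1"
    )


# Hypothetical pre-migration schema with one existing user.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users_v2 (id INTEGER PRIMARY KEY, username TEXT)")
conn.execute("CREATE TABLE invite_codes (code TEXT PRIMARY KEY)")
conn.execute("INSERT INTO users_v2 (username) VALUES ('existing_user')")

migrate(conn)
migrate(conn)  # second run changes nothing, mirroring a server restart

# Existing rows pick up the default, so nobody is flagged by the migration itself.
flags = [row[0] for row in conn.execute("SELECT is_test_account FROM users_v2")]
```

The Postgres version in the steps below follows the same shape: each `ALTER` sits behind an existence check, and the index uses `IF NOT EXISTS`.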
+ +**Files:** +- Modify: `server/stores/user_store.py` — append to `SCHEMA_SQL` (ALTER blocks near L79–L98 and the matview block near L298–L335) + +- [ ] **Step 1: Add column migration to `SCHEMA_SQL`** + +Open `server/stores/user_store.py`. Inside the first `DO $$ BEGIN ... END $$;` block (around line 80–98 that handles admin columns), append the `is_test_account` column check. Then add a second ALTER for `invite_codes.marks_as_test` in a new `DO $$` block right after. + +Add after the existing `last_seen_at` check (before `END $$;` on line ~98): + +```sql + IF NOT EXISTS (SELECT 1 FROM information_schema.columns + WHERE table_name = 'users_v2' AND column_name = 'is_test_account') THEN + ALTER TABLE users_v2 ADD COLUMN is_test_account BOOLEAN DEFAULT FALSE; + END IF; +``` + +Then, immediately after the `END $$;` that closes the users_v2 admin block, add a new block for invite_codes: + +```sql +-- Add marks_as_test to invite_codes if not exists +DO $$ +BEGIN + IF NOT EXISTS (SELECT 1 FROM information_schema.columns + WHERE table_name = 'invite_codes' AND column_name = 'marks_as_test') THEN + ALTER TABLE invite_codes ADD COLUMN marks_as_test BOOLEAN DEFAULT FALSE; + END IF; +END $$; +``` + +- [ ] **Step 2: Add partial index on `is_test_account`** + +Find the indexes block near line 338. After the existing `idx_users_banned` index (line ~344), add: + +```sql +CREATE INDEX IF NOT EXISTS idx_users_v2_is_test_account ON users_v2(is_test_account) + WHERE is_test_account = TRUE; +``` + +- [ ] **Step 3: Rebuild `leaderboard_overall` materialized view to include `is_test_account`** + +Find the existing matview block at line ~298. Modify the version-check DO block so the view is dropped and recreated if it lacks the `is_test_account` column. 
Replace the existing block: + +```sql +-- Leaderboard materialized view (refreshed periodically) +-- Drop and recreate if missing is_test_account column (soak harness migration) +DO $$ +BEGIN + IF EXISTS (SELECT 1 FROM pg_matviews WHERE matviewname = 'leaderboard_overall') THEN + -- Check if is_test_account column exists in the view. + -- information_schema.columns does not list materialized-view columns, + -- so read the system catalog instead. + IF NOT EXISTS ( + SELECT 1 FROM pg_attribute a + JOIN pg_class c ON c.oid = a.attrelid + WHERE c.relname = 'leaderboard_overall' AND c.relkind = 'm' + AND a.attname = 'is_test_account' AND NOT a.attisdropped + ) THEN + DROP MATERIALIZED VIEW leaderboard_overall; + END IF; + END IF; + + IF NOT EXISTS (SELECT 1 FROM pg_matviews WHERE matviewname = 'leaderboard_overall') THEN + EXECUTE ' + CREATE MATERIALIZED VIEW leaderboard_overall AS + SELECT + u.id as user_id, + u.username, + COALESCE(u.is_test_account, FALSE) as is_test_account, + s.games_played, + s.games_won, + ROUND(s.games_won::numeric / NULLIF(s.games_played, 0) * 100, 1) as win_rate, + s.rounds_won, + ROUND(s.total_points::numeric / NULLIF(s.total_rounds, 0), 1) as avg_score, + s.best_score as best_round_score, + s.knockouts, + s.best_win_streak, + COALESCE(s.rating, 1500) as rating, + s.last_game_at + FROM player_stats s + JOIN users_v2 u ON s.user_id = u.id + WHERE s.games_played >= 5 + AND u.deleted_at IS NULL + AND (u.is_banned = false OR u.is_banned IS NULL) + '; + END IF; +END $$; +``` + +Note: the differences from the existing block are the changed comment, the column-existence check, and the new `COALESCE(u.is_test_account, FALSE) as is_test_account` column in the SELECT. The check now looks for `is_test_account` instead of `rating`, and it reads `pg_attribute` rather than `information_schema.columns`: the information schema does not list materialized-view columns, so a check against it always reports the column as missing and would drop and rebuild the view on every startup. Everything else stays identical. + +- [ ] **Step 4: Start the server to run migrations** + +Run (in another terminal, with Postgres + Redis up): + +```bash +cd /home/alee/Sources/golfgame +python server/main.py +``` + +Expected: server starts cleanly, no errors about `is_test_account` or `marks_as_test` or `leaderboard_overall`. 
+ +- [ ] **Step 5: Verify schema via psql** + +Connect to the dev database and confirm: + +```bash +psql -d golfgame -c "\d users_v2" | grep is_test_account +psql -d golfgame -c "\d invite_codes" | grep marks_as_test +psql -d golfgame -c "\d leaderboard_overall" | grep is_test_account +psql -d golfgame -c "\di idx_users_v2_is_test_account" +``` + +Expected: all four commands return matching rows. + +- [ ] **Step 6: Commit** + +```bash +git add server/stores/user_store.py +git commit -m "$(cat <<'EOF' +feat(server): add is_test_account + marks_as_test schema + +New columns support separating soak-harness test traffic from real +user traffic in stats queries. Rebuilds leaderboard_overall matview +to include is_test_account so the fast path stays filterable. + +Co-Authored-By: Claude Opus 4.6 (1M context) +EOF +)" +``` + +--- + +### Task 2: Propagate `is_test_account` through `User` model and `user_store` + +Wire the new column into the `User` dataclass, `create_user` signature, `_row_to_user` mapping, and every SELECT list that already pulls user columns. + +**Files:** +- Modify: `server/models/user.py` — `User` dataclass (L22–L68) + `to_dict` (L82–L116) + `from_dict` (L118+) +- Modify: `server/stores/user_store.py` — `create_user` (L454–L501), `_row_to_user` (L997–L1020), `get_user_by_id`/`get_user_by_username`/`get_user_by_email` SELECT lists (L503–L570) + +- [ ] **Step 1: Add `is_test_account` to the `User` dataclass** + +In `server/models/user.py`, add a new field to the `User` dataclass (after `force_password_reset` on L68): + +```python + is_test_account: bool = False +``` + +Update the docstring `Attributes:` block around L45 to include: + +``` + is_test_account: True for accounts created by the soak test harness. 
+``` + +- [ ] **Step 2: Include `is_test_account` in `to_dict` and `from_dict`** + +In `User.to_dict` at L82, add to the `d` dict (after `force_password_reset`): + +```python + "is_test_account": self.is_test_account, +``` + +In `User.from_dict`, add the corresponding parse — find where `force_password_reset` is parsed and add the same pattern: + +```python + is_test_account=d.get("is_test_account", False), +``` + +- [ ] **Step 3: Add `is_test_account` parameter to `create_user`** + +In `server/stores/user_store.py` at L454, add a new parameter: + +```python + async def create_user( + self, + username: str, + password_hash: str, + email: Optional[str] = None, + role: UserRole = UserRole.USER, + guest_id: Optional[str] = None, + verification_token: Optional[str] = None, + verification_expires: Optional[datetime] = None, + is_test_account: bool = False, + ) -> Optional[User]: +``` + +Update the docstring to add a line in `Args:` describing `is_test_account`. + +Change the INSERT SQL block to include the new column: + +```python + row = await conn.fetchrow( + """ + INSERT INTO users_v2 (username, password_hash, email, role, guest_id, + verification_token, verification_expires, + is_test_account) + VALUES ($1, $2, $3, $4, $5, $6, $7, $8) + RETURNING id, username, email, password_hash, role, email_verified, + verification_token, verification_expires, reset_token, reset_expires, + guest_id, deleted_at, preferences, created_at, last_login, last_seen_at, + is_active, is_banned, ban_reason, force_password_reset, is_test_account + """, + username, + password_hash, + email, + role.value, + guest_id, + verification_token, + verification_expires, + is_test_account, + ) +``` + +- [ ] **Step 4: Update `_row_to_user` mapping** + +In `server/stores/user_store.py` at L997, add to the `User(...)` call (after `force_password_reset`): + +```python + is_test_account=row.get("is_test_account", False) or False, +``` + +- [ ] **Step 5: Update all other SELECT lists in user_store** + +Find 
every query in `server/stores/user_store.py` that returns a full user row and passes it to `_row_to_user`. Add `is_test_account` to the SELECT column list for each. Grep to find them: + +```bash +grep -n "is_active, is_banned, ban_reason, force_password_reset" server/stores/user_store.py +``` + +For each match, append `, is_test_account` to the SELECT list. Expected locations: +- `create_user` INSERT ... RETURNING (already updated in Step 3) +- `get_user_by_id` at L503 +- `get_user_by_username` at L519 +- `get_user_by_email` (find it) +- Any other `SELECT` ... FROM users_v2 that calls `_row_to_user` + +- [ ] **Step 6: Restart server, verify no errors** + +```bash +# Kill and restart the dev server +python server/main.py +``` + +Expected: server starts cleanly. Any query that touches users now returns `is_test_account` correctly. + +- [ ] **Step 7: Smoke test via curl** + +```bash +# Register a throwaway test user (no invite code needed if DAILY_OPEN_SIGNUPS > 0 locally, +# or use the 5VC2MCCN invite code if INVITE_ONLY=true) +# Set PW to any password of your choice (>= 8 chars). +PW='SomeTestPw_1!' +curl -sX POST http://localhost:8000/api/auth/register \ + -H 'Content-Type: application/json' \ + -d "{\"username\":\"soaktest_smoke1\",\"password\":\"$PW\",\"email\":\"soaktest_smoke1@example.com\",\"invite_code\":\"5VC2MCCN\"}" +``` + +Expected: HTTP 200 with `{"user":{...},"token":"..."}`. The registration path now runs through the new column without errors even though the value is still always FALSE at this stage. + +- [ ] **Step 8: Commit** + +```bash +git add server/models/user.py server/stores/user_store.py +git commit -m "$(cat <<'EOF' +feat(server): propagate is_test_account through User model & store + +User dataclass, create_user, and all SELECT lists now round-trip the +new column. Value is always FALSE until Task 4 wires the register +flow to the invite code's marks_as_test flag. 
+ +Co-Authored-By: Claude Opus 4.6 (1M context) +EOF +)" +``` + +--- + +### Task 3: Expose `marks_as_test` on `InviteCode` and add lookup helper + +`validate_invite_code` currently returns a bare bool. We need a new helper that returns the full row so the register flow can check `marks_as_test` without a second query. + +**Files:** +- Modify: `server/services/admin_service.py` — `InviteCode` dataclass (L115–L138), `get_invite_codes` SELECT (L1106–L1141), add new `get_invite_code_details` method + +- [ ] **Step 1: Add `marks_as_test` field to `InviteCode` dataclass** + +In `server/services/admin_service.py` at L115: + +```python +@dataclass +class InviteCode: + """Invite code details.""" + code: str + created_by: str + created_by_username: str + created_at: datetime + expires_at: datetime + max_uses: int + use_count: int + is_active: bool + marks_as_test: bool = False +``` + +Update `to_dict` at L127 to include the field: + +```python + def to_dict(self) -> dict: + return { + "code": self.code, + "created_by": self.created_by, + "created_by_username": self.created_by_username, + "created_at": self.created_at.isoformat() if self.created_at else None, + "expires_at": self.expires_at.isoformat() if self.expires_at else None, + "max_uses": self.max_uses, + "use_count": self.use_count, + "is_active": self.is_active, + "remaining_uses": max(0, self.max_uses - self.use_count), + "marks_as_test": self.marks_as_test, + } +``` + +- [ ] **Step 2: Update `get_invite_codes` SELECT to include `marks_as_test`** + +Find `get_invite_codes` at L1106. 
Modify the SQL to pull the column and pass it through: + +```python + async def get_invite_codes(self, include_expired: bool = False) -> List[InviteCode]: + """List all invite codes.""" + async with self.pool.acquire() as conn: + sql = """ + SELECT c.code, c.created_by, u.username as created_by_username, + c.created_at, c.expires_at, + c.max_uses, c.use_count, c.is_active, + COALESCE(c.marks_as_test, FALSE) as marks_as_test + FROM invite_codes c + LEFT JOIN users_v2 u ON c.created_by = u.id + """ +``` + +Find the list comprehension that constructs `InviteCode(...)` objects and add the new kwarg: + +```python + InviteCode( + code=row["code"], + created_by=str(row["created_by"]), + created_by_username=row["created_by_username"] or "unknown", + created_at=row["created_at"].replace(tzinfo=timezone.utc) if row["created_at"] else None, + expires_at=row["expires_at"].replace(tzinfo=timezone.utc) if row["expires_at"] else None, + max_uses=row["max_uses"], + use_count=row["use_count"], + is_active=row["is_active"], + marks_as_test=row["marks_as_test"], + ) +``` + +- [ ] **Step 3: Add new `get_invite_code_details` method** + +Add a new method right after `validate_invite_code` (around L1214) that returns the row with `marks_as_test`. The register flow will call this to resolve the flag. Place it between `validate_invite_code` and `use_invite_code`: + +```python + async def get_invite_code_details(self, code: str) -> Optional[dict]: + """ + Look up an invite code's row including marks_as_test. + + Returns None if the code does not exist. Does NOT validate expiry + or usage — use validate_invite_code for that. This is purely a + helper for the register flow to discover the test-seed flag. 
+ """ + async with self.pool.acquire() as conn: + row = await conn.fetchrow( + """ + SELECT code, max_uses, use_count, is_active, + COALESCE(marks_as_test, FALSE) as marks_as_test + FROM invite_codes + WHERE code = $1 + """, + code, + ) + if not row: + return None + return { + "code": row["code"], + "max_uses": row["max_uses"], + "use_count": row["use_count"], + "is_active": row["is_active"], + "marks_as_test": row["marks_as_test"], + } +``` + +- [ ] **Step 4: Verify with curl via admin panel endpoint** + +Assuming you have an admin token from a local dev user. Hit the existing admin invites listing: + +```bash +# Replace TOKEN with a valid admin JWT +curl -s http://localhost:8000/api/admin/invites \ + -H "Authorization: Bearer $TOKEN" | jq '.codes[0]' +``` + +Expected: response includes `"marks_as_test": false` on at least one code. + +- [ ] **Step 5: Commit** + +```bash +git add server/services/admin_service.py +git commit -m "$(cat <<'EOF' +feat(server): expose marks_as_test on InviteCode + +Adds the field to the dataclass, SELECT list in get_invite_codes, +and a new get_invite_code_details helper that the register flow +will use to discover whether an invite should flag new accounts +as test accounts. + +Co-Authored-By: Claude Opus 4.6 (1M context) +EOF +)" +``` + +--- + +### Task 4: Wire register flow to set `is_test_account` from invite + +When a user registers with an invite whose `marks_as_test=TRUE`, the new account is flagged. The plumbing lives in two places: the router reads the flag and passes it to the service; the service passes it to the store. 
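The router-to-service-to-store plumbing described above can be sketched in isolation. This is a simplified illustration, not the project's code: `FakeUserStore`, `FakeAuthService`, `resolve_is_test_account`, and `register_handler` are hypothetical stand-ins, and password hashing, validation, and the real `RegistrationResult` type are omitted.

```python
import asyncio
from typing import Optional


class FakeUserStore:
    """Stand-in for the user store: records what create_user receives."""

    def __init__(self) -> None:
        self.rows: list[dict] = []

    async def create_user(self, username: str, is_test_account: bool = False) -> dict:
        row = {"username": username, "is_test_account": is_test_account}
        self.rows.append(row)
        return row


class FakeAuthService:
    """Stand-in for the auth service: forwards the flag to the store unchanged."""

    def __init__(self, user_store: FakeUserStore) -> None:
        self.user_store = user_store

    async def register(self, username: str, is_test_account: bool = False) -> dict:
        return await self.user_store.create_user(username, is_test_account=is_test_account)


def resolve_is_test_account(invite_details: Optional[dict]) -> bool:
    # True only when an invite row exists and carries marks_as_test.
    return bool(invite_details and invite_details.get("marks_as_test"))


async def register_handler(service: FakeAuthService, username: str,
                           invite_details: Optional[dict]) -> dict:
    # Router layer: derive the flag from the invite, then hand it to the service.
    is_test_account = resolve_is_test_account(invite_details)
    return await service.register(username, is_test_account=is_test_account)


store = FakeUserStore()
service = FakeAuthService(store)
flagged = asyncio.run(register_handler(service, "soaktest_demo", {"marks_as_test": True}))
normal = asyncio.run(register_handler(service, "realuser_demo", {"marks_as_test": False}))
no_invite = asyncio.run(register_handler(service, "open_signup_demo", None))
```

Keeping the decision in the router means the service and store only carry a boolean and never need to know about invite codes.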
+ +**Files:** +- Modify: `server/routers/auth.py` — `register` handler (L224–L320) +- Modify: `server/services/auth_service.py` — `register` method (L98–L178) + +- [ ] **Step 1: Add `is_test_account` parameter to `auth_service.register`** + +In `server/services/auth_service.py` at L98, add the new parameter: + +```python + async def register( + self, + username: str, + password: str, + email: Optional[str] = None, + guest_id: Optional[str] = None, + is_test_account: bool = False, + ) -> RegistrationResult: +``` + +Update the docstring `Args:` block: + +``` + is_test_account: Mark this user as a soak-harness test account. +``` + +Pass the value through to `create_user` at L146: + +```python + user = await self.user_store.create_user( + username=username, + password_hash=password_hash, + email=email, + role=UserRole.USER, + guest_id=guest_id, + verification_token=verification_token, + verification_expires=verification_expires, + is_test_account=is_test_account, + ) +``` + +- [ ] **Step 2: Update the router to resolve `marks_as_test` and pass it through** + +In `server/routers/auth.py`, find the `register` handler at L224. 
After the existing invite-code validation block (around L248–L252), fetch the invite details and compute `is_test`: + +```python + # --- Invite code validation --- + is_test_account = False + if has_invite: + if not _admin_service: + raise HTTPException(status_code=503, detail="Admin service not initialized") + if not await _admin_service.validate_invite_code(request_body.invite_code): + raise HTTPException(status_code=400, detail="Invalid or expired invite code") + # Check if this invite flags new accounts as test accounts + invite_details = await _admin_service.get_invite_code_details(request_body.invite_code) + if invite_details and invite_details.get("marks_as_test"): + is_test_account = True +``` + +Then pass it to `auth_service.register` at L276: + +```python + # --- Create the account --- + result = await auth_service.register( + username=request_body.username, + password=request_body.password, + email=request_body.email, + is_test_account=is_test_account, + ) +``` + +- [ ] **Step 3: Flag the dev invite code for testing** + +Before we can test end-to-end locally, we need an invite code with `marks_as_test=TRUE` in the local dev DB. Run (once, manually): + +```bash +# First, check if 5VC2MCCN exists locally (it probably doesn't — that's staging's code). +# Create a local test invite code and flag it: +psql -d golfgame <<'EOF' +-- Create a local dev test-seed invite if not exists +INSERT INTO invite_codes (code, created_by, expires_at, max_uses, is_active, marks_as_test) +SELECT 'SOAKTEST', id, NOW() + INTERVAL '10 years', 100, TRUE, TRUE +FROM users_v2 WHERE role = 'admin' LIMIT 1 +ON CONFLICT (code) DO UPDATE SET marks_as_test = TRUE; + +-- Verify +SELECT code, max_uses, use_count, marks_as_test FROM invite_codes WHERE code = 'SOAKTEST'; +EOF +``` + +Expected: `marks_as_test | t` in the last row. 
+ +- [ ] **Step 4: Verify register flow sets `is_test_account`** + +Restart the dev server, then: + +```bash +curl -sX POST http://localhost:8000/api/auth/register \ + -H 'Content-Type: application/json' \ + -d "{\"username\":\"soaktest_register1\",\"password\":\"$PW\",\"email\":\"soaktest_register1@example.com\",\"invite_code\":\"SOAKTEST\"}" + +# Verify in DB +psql -d golfgame -c "SELECT username, is_test_account FROM users_v2 WHERE username = 'soaktest_register1';" +``` + +Expected: `is_test_account | t`. + +- [ ] **Step 5: Verify non-test invite does NOT flag new accounts** + +```bash +# Create a non-test invite +psql -d golfgame <<'EOF' +INSERT INTO invite_codes (code, created_by, expires_at, max_uses, is_active, marks_as_test) +SELECT 'NORMAL01', id, NOW() + INTERVAL '10 years', 10, TRUE, FALSE +FROM users_v2 WHERE role = 'admin' LIMIT 1 +ON CONFLICT (code) DO UPDATE SET marks_as_test = FALSE; +EOF + +curl -sX POST http://localhost:8000/api/auth/register \ + -H 'Content-Type: application/json' \ + -d "{\"username\":\"realuser_smoke1\",\"password\":\"$PW\",\"email\":\"realuser_smoke1@example.com\",\"invite_code\":\"NORMAL01\"}" + +psql -d golfgame -c "SELECT username, is_test_account FROM users_v2 WHERE username = 'realuser_smoke1';" +``` + +Expected: `is_test_account | f`. + +- [ ] **Step 6: Commit** + +```bash +git add server/routers/auth.py server/services/auth_service.py +git commit -m "$(cat <<'EOF' +feat(server): register flow flags accounts from test-seed invites + +When a user registers with an invite_code whose marks_as_test=TRUE, +their users_v2.is_test_account is set to TRUE. Normal invite codes +and invite-less signups are unaffected. + +Co-Authored-By: Claude Opus 4.6 (1M context) +EOF +)" +``` + +--- + +### Task 5: Stats filtering (`include_test` parameter) + +Thread an `include_test: bool = False` parameter through `get_leaderboard`, `get_player_rank`, and the corresponding router handlers. Default is `False` — real users never see soak traffic. 
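The filter semantics can be illustrated with an in-memory sketch before touching SQL. It assumes the predicate used in the steps below, `(include_test OR NOT is_test_account)`, and that ranks are assigned after filtering, so hidden test accounts leave no gaps. The `leaderboard` function and the sample rows are hypothetical.

```python
def leaderboard(rows: list[dict], include_test: bool = False) -> list[dict]:
    # Keep a row when the caller opted in, or when the row is not a test account.
    # A missing or None flag counts as "not a test account", like COALESCE(..., FALSE).
    visible = [r for r in rows if include_test or not r.get("is_test_account")]
    # Rank only the visible rows, highest games_won first.
    ranked = sorted(visible, key=lambda r: r["games_won"], reverse=True)
    return [{**r, "rank": i} for i, r in enumerate(ranked, start=1)]


rows = [
    {"username": "alice", "games_won": 5, "is_test_account": False},
    {"username": "soaktest_demo", "games_won": 9, "is_test_account": True},
    {"username": "bob", "games_won": 3, "is_test_account": None},
]

default_view = leaderboard(rows)
full_view = leaderboard(rows, include_test=True)
```

With the default, `alice` and `bob` hold ranks 1 and 2; with `include_test=True`, `soaktest_demo` takes rank 1 and the others shift down.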
+ +**Files:** +- Modify: `server/services/stats_service.py` — `get_leaderboard` (L169), `get_player_rank` (L249) +- Modify: `server/routers/stats.py` — `get_leaderboard` route (L157), `get_player_rank` route (L227), `get_my_rank` route (L348) + +- [ ] **Step 1: Add `include_test` to `get_leaderboard` service method** + +In `server/services/stats_service.py` at L169: + +```python + async def get_leaderboard( + self, + metric: str = "wins", + limit: int = 50, + offset: int = 0, + include_test: bool = False, + ) -> List[LeaderboardEntry]: +``` + +Inside the method, find both SQL paths (materialized view and fallback). In the view path at L208, change the WHERE clause: + +```python + if view_exists: + # Use materialized view for performance + rows = await conn.fetch(f""" + SELECT + user_id, username, games_played, games_won, + win_rate, avg_score, knockouts, best_win_streak, + COALESCE(rating, 1500) as rating, + ROW_NUMBER() OVER (ORDER BY {column} {direction}) as rank + FROM leaderboard_overall + WHERE ($3 OR NOT is_test_account) + ORDER BY {column} {direction} + LIMIT $1 OFFSET $2 + """, limit, offset, include_test) +``` + +In the fallback path at L220, add the WHERE clause and parameter: + +```python + else: + # Fall back to direct query + rows = await conn.fetch(f""" + SELECT + s.user_id, u.username, s.games_played, s.games_won, + ROUND(s.games_won::numeric / NULLIF(s.games_played, 0) * 100, 1) as win_rate, + ROUND(s.total_points::numeric / NULLIF(s.total_rounds, 0), 1) as avg_score, + s.knockouts, s.best_win_streak, + COALESCE(s.rating, 1500) as rating, + ROW_NUMBER() OVER (ORDER BY {column} {direction}) as rank + FROM player_stats s + JOIN users_v2 u ON s.user_id = u.id + WHERE s.games_played >= 5 + AND u.deleted_at IS NULL + AND (u.is_banned = false OR u.is_banned IS NULL) + AND ($3 OR NOT COALESCE(u.is_test_account, FALSE)) + ORDER BY {column} {direction} + LIMIT $1 OFFSET $2 + """, limit, offset, include_test) +``` + +- [ ] **Step 2: Apply the same pattern to 
`get_player_rank`** + +In `server/services/stats_service.py` at L249: + +```python + async def get_player_rank( + self, + user_id: str, + metric: str = "wins", + include_test: bool = False, + ) -> Optional[int]: +``` + +Update both SQL paths to include the `include_test` filter. View path at L287: + +```python + if view_exists: + row = await conn.fetchrow(f""" + SELECT rank FROM ( + SELECT user_id, ROW_NUMBER() OVER (ORDER BY {column} {direction}) as rank + FROM leaderboard_overall + WHERE ($2 OR NOT is_test_account) + ) ranked + WHERE user_id = $1 + """, user_id, include_test) +``` + +Fallback path at L294: + +```python + else: + row = await conn.fetchrow(f""" + SELECT rank FROM ( + SELECT s.user_id, ROW_NUMBER() OVER (ORDER BY {column} {direction}) as rank + FROM player_stats s + JOIN users_v2 u ON s.user_id = u.id + WHERE s.games_played >= 5 + AND u.deleted_at IS NULL + AND (u.is_banned = false OR u.is_banned IS NULL) + AND ($2 OR NOT COALESCE(u.is_test_account, FALSE)) + ) ranked + WHERE user_id = $1 + """, user_id, include_test) +``` + +- [ ] **Step 3: Expose `include_test` as a query parameter on the leaderboard route** + +In `server/routers/stats.py` at L157: + +```python +@router.get("/leaderboard", response_model=LeaderboardResponse) +async def get_leaderboard( + metric: str = Query("wins", pattern="^(wins|win_rate|avg_score|knockouts|streak|rating)$"), + limit: int = Query(50, ge=1, le=100), + offset: int = Query(0, ge=0), + include_test: bool = Query(False, description="Include soak-harness test accounts"), + service: StatsService = Depends(get_stats_service_dep), +): + """ + Get leaderboard by metric. + + Metrics: + - wins: Total games won + - win_rate: Win percentage (requires 5+ games) + - avg_score: Average points per round (lower is better) + - knockouts: Times going out first + - streak: Best win streak + + Players must have 5+ games to appear on leaderboards. + By default, soak-harness test accounts are hidden. 
+ """ + entries = await service.get_leaderboard(metric, limit, offset, include_test) +``` + +- [ ] **Step 4: Same for `get_player_rank` and `get_my_rank` routes** + +At L227: + +```python +@router.get("/players/{user_id}/rank", response_model=PlayerRankResponse) +async def get_player_rank( + user_id: str, + metric: str = Query("wins", pattern="^(wins|win_rate|avg_score|knockouts|streak|rating)$"), + include_test: bool = Query(False), + service: StatsService = Depends(get_stats_service_dep), +): + """Get player's rank on a leaderboard.""" + rank = await service.get_player_rank(user_id, metric, include_test) +``` + +At L348: + +```python +@router.get("/me/rank", response_model=PlayerRankResponse) +async def get_my_rank( + metric: str = Query("wins", pattern="^(wins|win_rate|avg_score|knockouts|streak|rating)$"), + include_test: bool = Query(False), + user: User = Depends(require_user), + service: StatsService = Depends(get_stats_service_dep), +): + """Get current user's rank on a leaderboard.""" + rank = await service.get_player_rank(user.id, metric, include_test) +``` + +- [ ] **Step 5: Verify filtering works via curl** + +```bash +# Mark a test user we registered earlier as having games played (synthetic) +psql -d golfgame <<'EOF' +INSERT INTO player_stats (user_id, games_played, games_won, total_points, total_rounds, rounds_won) +SELECT id, 10, 8, 50, 30, 20 FROM users_v2 WHERE username = 'soaktest_register1' +ON CONFLICT (user_id) DO UPDATE SET games_played = 10, games_won = 8; + +-- Refresh the matview so the test account shows up +REFRESH MATERIALIZED VIEW leaderboard_overall; +EOF + +# Default (include_test=false) should NOT include soaktest_register1 +curl -s "http://localhost:8000/api/stats/leaderboard?metric=wins" | jq '.entries[] | select(.username | startswith("soaktest_"))' + +# include_test=true should include soaktest_register1 +curl -s "http://localhost:8000/api/stats/leaderboard?metric=wins&include_test=true" | jq '.entries[] | select(.username | 
startswith("soaktest_"))' +``` + +Expected: first command returns nothing, second returns a JSON object for `soaktest_register1`. + +- [ ] **Step 6: Commit** + +```bash +git add server/services/stats_service.py server/routers/stats.py +git commit -m "$(cat <<'EOF' +feat(server): stats queries support include_test filter + +Leaderboard and rank queries take an optional include_test param +(default false). Real users never see soak-harness traffic unless +they explicitly opt in via ?include_test=true. + +Co-Authored-By: Claude Opus 4.6 (1M context) +EOF +)" +``` + +--- + +### Task 6: Admin service + route surfaces `is_test_account` + +`UserDetails` exposes the flag, `search_users` selects it, and `list_users` admin route accepts an `include_test` query parameter. + +**Files:** +- Modify: `server/services/admin_service.py` — `UserDetails` (L24–L58), `search_users` (L312–L382), `get_user` (L384–L428) +- Modify: `server/routers/admin.py` — `list_users` route (L80–L107) + +- [ ] **Step 1: Add field to `UserDetails` dataclass** + +In `server/services/admin_service.py` at L24, add to the dataclass: + +```python +@dataclass +class UserDetails: + """Extended user info for admin view.""" + id: str + username: str + email: Optional[str] + role: str + email_verified: bool + is_banned: bool + ban_reason: Optional[str] + force_password_reset: bool + created_at: datetime + last_login: Optional[datetime] + last_seen_at: Optional[datetime] + is_active: bool + games_played: int + games_won: int + is_test_account: bool = False +``` + +Update `to_dict` to include it: + +```python + def to_dict(self) -> dict: + return { + "id": self.id, + "username": self.username, + "email": self.email, + "role": self.role, + "email_verified": self.email_verified, + "is_banned": self.is_banned, + "ban_reason": self.ban_reason, + "force_password_reset": self.force_password_reset, + "created_at": self.created_at.isoformat() if self.created_at else None, + "last_login": self.last_login.isoformat() if 
self.last_login else None, + "last_seen_at": self.last_seen_at.isoformat() if self.last_seen_at else None, + "is_active": self.is_active, + "games_played": self.games_played, + "games_won": self.games_won, + "is_test_account": self.is_test_account, + } +``` + +- [ ] **Step 2: Update `search_users` to SELECT and filter on `is_test_account`** + +In `server/services/admin_service.py` at L312, add `include_test` parameter and column to the SELECT: + +```python + async def search_users( + self, + query: str = "", + limit: int = 50, + offset: int = 0, + include_banned: bool = True, + include_deleted: bool = False, + include_test: bool = True, + ) -> List[UserDetails]: +``` + +Modify the SQL to pull `is_test_account`: + +```python + sql = """ + SELECT u.id, u.username, u.email, u.role, + u.email_verified, u.is_banned, u.ban_reason, + u.force_password_reset, u.created_at, u.last_login, + u.last_seen_at, u.is_active, + COALESCE(u.is_test_account, FALSE) as is_test_account, + COALESCE(s.games_played, 0) as games_played, + COALESCE(s.games_won, 0) as games_won + FROM users_v2 u + LEFT JOIN player_stats s ON u.id = s.user_id + WHERE 1=1 + """ +``` + +After the existing `include_deleted` check, add: + +```python + if not include_test: + sql += " AND (u.is_test_account = false OR u.is_test_account IS NULL)" +``` + +Update the `UserDetails(...)` construction in the list comprehension to include `is_test_account=row["is_test_account"]`. + +- [ ] **Step 3: Update `get_user` (single-user lookup) similarly** + +In `server/services/admin_service.py` at L384, add `COALESCE(u.is_test_account, FALSE) as is_test_account` to the SELECT and `is_test_account=row["is_test_account"]` to the `UserDetails(...)` construction. The `get_user` method does NOT need the filter parameter — admins looking up individual users should always see them. 
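The `include_test` handling in `search_users` amounts to appending one optional clause to the `WHERE 1=1` base query. A small sketch of that shape follows; `build_user_search_sql` is a hypothetical helper, and the real method's search text, banned and deleted filters, and pagination are left out.

```python
def build_user_search_sql(include_test: bool = True) -> str:
    # Base query always selects the flag so the admin UI can render a badge.
    sql = (
        "SELECT u.id, u.username, "
        "COALESCE(u.is_test_account, FALSE) AS is_test_account "
        "FROM users_v2 u WHERE 1=1"
    )
    # Only narrow the result set when the caller explicitly hides test accounts.
    if not include_test:
        sql += " AND (u.is_test_account = false OR u.is_test_account IS NULL)"
    return sql


admin_default_sql = build_user_search_sql()
hidden_test_sql = build_user_search_sql(include_test=False)
```

The `IS NULL` branch keeps rows created before the migration visible if the column was ever left unset.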
+ +- [ ] **Step 4: Add `include_test` to the admin `list_users` route** + +In `server/routers/admin.py` at L80: + +```python +@router.get("/users") +async def list_users( + query: str = "", + limit: int = 50, + offset: int = 0, + include_banned: bool = True, + include_deleted: bool = False, + include_test: bool = True, + admin: User = Depends(require_admin_v2), + service: AdminService = Depends(get_admin_service_dep), +): + """ + Search and list users. + + Args: + query: Search by username or email. + limit: Maximum results to return. + offset: Results to skip. + include_banned: Include banned users. + include_deleted: Include soft-deleted users. + include_test: Include soak-harness test accounts (default true for admins). + """ + users = await service.search_users( + query=query, + limit=limit, + offset=offset, + include_banned=include_banned, + include_deleted=include_deleted, + include_test=include_test, + ) + return {"users": [u.to_dict() for u in users]} +``` + +Note: default is `True` for the admin path — admins should see everything by default. The client-side toggle will explicitly pass `false` when the admin wants to hide test accounts. + +- [ ] **Step 5: Verify via curl** + +```bash +# Assuming admin token in $TOKEN env var +curl -s "http://localhost:8000/api/admin/users?query=soaktest" \ + -H "Authorization: Bearer $TOKEN" | jq '.users[] | {username, is_test_account}' + +curl -s "http://localhost:8000/api/admin/users?query=soaktest&include_test=false" \ + -H "Authorization: Bearer $TOKEN" | jq '.users[]' +``` + +Expected: first returns users with `is_test_account: true`; second returns empty (test accounts filtered out). 
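The two curl calls above differ only in the query string. A minimal sketch of building that string from a script, assuming the parameter names from Step 4; `buildListUsersQuery` and `ListUsersQuery` are hypothetical names used for illustration and are not part of the codebase:

```typescript
// Hypothetical helper: builds the query string for GET /api/admin/users.
// Parameter names mirror the route in Step 4; the helper itself is illustrative.
interface ListUsersQuery {
  query?: string;
  limit?: number;
  offset?: number;
  includeBanned?: boolean;
  includeTest?: boolean;
}

function buildListUsersQuery(q: ListUsersQuery = {}): string {
  const params = new URLSearchParams({
    query: q.query ?? '',
    limit: String(q.limit ?? 50),
    offset: String(q.offset ?? 0),
    // Booleans travel as the literal strings "true" / "false".
    include_banned: String(q.includeBanned ?? true),
    include_test: String(q.includeTest ?? true),
  });
  return params.toString();
}

// Admin default: test accounts are included.
console.log(buildListUsersQuery({ query: 'soaktest' }));
// Explicit opt-out, matching the second curl call.
console.log(buildListUsersQuery({ query: 'soaktest', includeTest: false }));
```

Keeping the default at `true` on both the client and the route means a caller that omits the flag sees the same list as one that passes it explicitly.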
+ +- [ ] **Step 6: Commit** + +```bash +git add server/services/admin_service.py server/routers/admin.py +git commit -m "$(cat <<'EOF' +feat(server): admin users list surfaces is_test_account + +UserDetails carries the new column, search_users selects and +optionally filters on it, and the /api/admin/users route accepts +?include_test=false to hide soak-harness accounts. + +Co-Authored-By: Claude Opus 4.6 (1M context) +EOF +)" +``` + +--- + +### Task 7: Admin panel UI — Test badge and filter toggle + +Add a visible `[Test]` badge on test accounts in the admin user list, a `[Test-seed]` indicator on invite codes that mark new accounts as test, and an "Include test accounts" checkbox next to the existing "Include banned" toggle. + +**Files:** +- Modify: `client/admin.html` — add the new toggle near the existing `#include-banned` checkbox +- Modify: `client/admin.js` — `loadUsers` (L305), `getStatusBadge` (L246), the invite codes renderer (L443) + +- [ ] **Step 1: Add the "Include test accounts" checkbox to admin.html** + +In `client/admin.html`, find the existing `#include-banned` checkbox (it's in the users tab filter bar — grep for it). 
Add a sibling checkbox right after: + +```bash +grep -n "include-banned" client/admin.html +``` + +Add next to that line: + +```html + +``` + +- [ ] **Step 2: Read the new checkbox in `loadUsers` and pass to getUsers** + +In `client/admin.js` at L305: + +```javascript +async function loadUsers() { + try { + const query = document.getElementById('user-search').value; + const includeBanned = document.getElementById('include-banned').checked; + const includeTest = document.getElementById('include-test').checked; + const data = await getUsers(query, usersPage * PAGE_SIZE, includeBanned, includeTest); +``` + +Find `getUsers` at L70 and add the new parameter: + +```javascript +async function getUsers(query = '', offset = 0, includeBanned = true, includeTest = true) { + const params = new URLSearchParams({ + query, + limit: PAGE_SIZE, + offset, + include_banned: includeBanned, + include_test: includeTest, + }); + return apiRequest(`/api/admin/users?${params}`); +} +``` + +Note: the existing signature builds a URLSearchParams — check the actual code at L70 and match its style; the key change is adding `include_test: includeTest` to the params. + +- [ ] **Step 3: Add a "Test" badge to the user table row** + +In `client/admin.js` at L314, modify the table row template to render a Test badge inline with the status badge: + +```javascript + data.users.forEach(user => { + const testBadge = user.is_test_account + ? 
'Test' + : ''; + tbody.innerHTML += ` + + ${escapeHtml(user.username)} ${testBadge} + ${escapeHtml(user.email || '-')} + ${user.role} + ${getStatusBadge(user)} + ${user.games_played} (${user.games_won} wins) + ${formatDateShort(user.created_at)} + + + + + `; + }); +``` + +- [ ] **Step 4: Add Test-seed indicator to invite codes list** + +In `client/admin.js` around L443 (invite codes list renderer), find the row template and add a `[Test-seed]` badge when `invite.marks_as_test`: + +```bash +grep -n "invite.is_active\|invite.code\|invites-tbody\|invites-table" client/admin.js | head +``` + +Once located, modify the row template to include: + +```javascript + const testSeedBadge = invite.marks_as_test + ? 'Test-seed' + : ''; + // Insert testSeedBadge into the invite code column, e.g. + // ${escapeHtml(invite.code)} ${testSeedBadge} +``` + +- [ ] **Step 5: Wire the checkbox change event to reload users** + +Find where `#include-banned` has its `change` listener attached (grep for it in admin.js): + +```bash +grep -n "include-banned.*addEventListener\|include-banned" client/admin.js +``` + +Add a parallel listener for `#include-test` that calls `loadUsers()`: + +```javascript +document.getElementById('include-test').addEventListener('change', () => { + usersPage = 0; + loadUsers(); +}); +``` + +- [ ] **Step 6: Manual verification in browser** + +1. Open http://localhost:8000/admin.html +2. Log in as admin +3. Navigate to Users tab +4. Search for "soaktest" +5. Confirm the `[Test]` badge appears next to `soaktest_register1` +6. Uncheck "Include test accounts" — the row should disappear +7. Re-check it — the row should return +8. Navigate to Invite Codes tab +9. 
Confirm the `[Test-seed]` badge appears next to the `SOAKTEST` code + +- [ ] **Step 7: Commit** + +```bash +git add client/admin.html client/admin.js +git commit -m "$(cat <<'EOF' +feat(admin): visible Test/Test-seed badges + filter toggle + +Users table shows [Test] next to soak-harness accounts, invite codes +list shows [Test-seed] next to codes that flag new accounts as test, +and a new "Include test accounts" checkbox lets admins hide bot +traffic from the user list. + +Co-Authored-By: Claude Opus 4.6 (1M context) +EOF +)" +``` + +--- + +### Task 8: Document the one-time staging setup step + +The staging invite code `5VC2MCCN` needs to be flagged as test-seed before the harness can run against staging. This is a manual one-liner; document it in a new bring-up doc. + +**Files:** +- Create: `docs/soak-harness-bringup.md` + +- [ ] **Step 1: Create the bring-up doc** + +```bash +cat > docs/soak-harness-bringup.md <<'EOF' +# Soak Harness Bring-Up + +One-time setup steps before running `tests/soak` against an environment. + +## Prerequisites + +- An invite code exists with 16+ available uses +- You have psql access to the target DB (or admin SQL access via some other means) + +## 1. Flag the invite code as test-seed + +Any account registered with a `marks_as_test=TRUE` invite code gets +`users_v2.is_test_account=TRUE`, which keeps it out of real-user stats. + +### Staging + +Invite code: `5VC2MCCN` (16 uses, provisioned 2026-04-10). + +```sql +UPDATE invite_codes SET marks_as_test = TRUE WHERE code = '5VC2MCCN'; +SELECT code, max_uses, use_count, marks_as_test FROM invite_codes WHERE code = '5VC2MCCN'; +``` + +Expected: `marks_as_test | t`. + +### Local dev + +The dev DB already has a `SOAKTEST` invite created during Task 4 of +the implementation plan. 
If you wiped the DB since, recreate it: + +```sql +INSERT INTO invite_codes (code, created_by, expires_at, max_uses, is_active, marks_as_test) +SELECT 'SOAKTEST', id, NOW() + INTERVAL '10 years', 100, TRUE, TRUE +FROM users_v2 WHERE role = 'admin' LIMIT 1 +ON CONFLICT (code) DO UPDATE SET marks_as_test = TRUE; +``` + +## 2. Run the harness + +```bash +cd tests/soak +npm install +npm run seed # first run only, populates .env.stresstest +TEST_URL=http://localhost:8000 npm run smoke # 30s end-to-end check +``` + +For staging: + +```bash +TEST_URL=https://staging.adlee.work npm run soak -- --scenario=populate +``` + +See `tests/soak/README.md` for the full flag reference. +EOF +``` + +- [ ] **Step 2: Commit** + +```bash +git add docs/soak-harness-bringup.md +git commit -m "$(cat <<'EOF' +docs: soak harness bring-up steps + +Documents the one-time UPDATE invite_codes SET marks_as_test = TRUE +step required before running tests/soak against each environment, +plus the local dev SOAKTEST invite recreation SQL. + +Co-Authored-By: Claude Opus 4.6 (1M context) +EOF +)" +``` + +--- + +## Phase 2 — Harness scaffolding + +### Task 9: Create the `tests/soak/` package skeleton + +Bare minimum to get `tsx` running against an empty entry point. No behavior yet. 
+ +**Files:** +- Create: `tests/soak/package.json` +- Create: `tests/soak/tsconfig.json` +- Create: `tests/soak/.gitignore` +- Create: `tests/soak/.env.stresstest.example` +- Create: `tests/soak/README.md` (stub) +- Create: `tests/soak/runner.ts` (stub — prints "hello") + +- [ ] **Step 1: Create `tests/soak/package.json`** + +```json +{ + "name": "golf-soak", + "version": "0.1.0", + "private": true, + "description": "Multiplayer soak & UX test harness for Golf Card Game", + "scripts": { + "soak": "tsx runner.ts", + "soak:populate": "tsx runner.ts --scenario=populate", + "soak:stress": "tsx runner.ts --scenario=stress", + "seed": "tsx scripts/seed-accounts.ts", + "smoke": "bash scripts/smoke.sh", + "test": "vitest run" + }, + "dependencies": { + "playwright-core": "^1.40.0", + "ws": "^8.16.0" + }, + "devDependencies": { + "tsx": "^4.7.0", + "@types/ws": "^8.5.0", + "@types/node": "^20.10.0", + "typescript": "^5.3.0", + "vitest": "^1.2.0" + } +} +``` + +- [ ] **Step 2: Create `tests/soak/tsconfig.json`** + +```json +{ + "compilerOptions": { + "target": "ES2022", + "module": "commonjs", + "moduleResolution": "node", + "strict": true, + "esModuleInterop": true, + "skipLibCheck": true, + "forceConsistentCasingInFileNames": true, + "resolveJsonModule": true, + "declaration": false, + "sourceMap": true, + "outDir": "./dist", + "rootDir": ".", + "baseUrl": ".", + "lib": ["ES2022", "DOM"], + "paths": { + "@soak/*": ["./*"], + "@bot/*": ["../e2e/bot/*"] + } + }, + "include": ["**/*.ts"], + "exclude": ["node_modules", "dist", "artifacts"] +} +``` + +- [ ] **Step 3: Create `tests/soak/.gitignore`** + +``` +node_modules/ +dist/ +artifacts/ +.env.stresstest +*.log +``` + +- [ ] **Step 4: Create `tests/soak/.env.stresstest.example`** + +``` +# Soak harness account cache. +# This file is AUTO-GENERATED on first run; do not edit by hand. 
+# Format: SOAK_ACCOUNT_NN=username:password:token +# +# Example (delete before first real run): +# SOAK_ACCOUNT_00=soak_00_a7bx:<password>:<token> +``` + +- [ ] **Step 5: Create `tests/soak/README.md` (stub — expanded in Task 31)** + +```markdown +# Golf Soak & UX Test Harness + +Runs 16 authenticated browser sessions across 4 rooms to populate +staging scoreboards and stress-test multiplayer stability. + +**Spec:** `docs/superpowers/specs/2026-04-10-multiplayer-soak-test-design.md` +**Bring-up:** `docs/soak-harness-bringup.md` + +## Quick start + +```bash +npm install +npm run seed # first run only +TEST_URL=http://localhost:8000 npm run smoke +``` + +Full documentation arrives with Task 31. +``` + +- [ ] **Step 6: Create `tests/soak/runner.ts` as a placeholder** + +```typescript +#!/usr/bin/env tsx +/** + * Golf Soak Harness — entry point. + * + * Placeholder. Full runner lands in Task 17. + */ + +async function main(): Promise<void> { + console.log('golf-soak runner (placeholder)'); + console.log('Full implementation lands in Task 17 of the plan.'); +} + +main().catch((err) => { + console.error(err); + process.exit(1); +}); +``` + +- [ ] **Step 7: Install deps and verify runner executes** + +```bash +cd tests/soak +npm install +npx tsx runner.ts +``` + +Expected output: + +``` +golf-soak runner (placeholder) +Full implementation lands in Task 17 of the plan. +``` + +- [ ] **Step 8: Commit** + +```bash +git add tests/soak/package.json tests/soak/package-lock.json tests/soak/tsconfig.json tests/soak/.gitignore tests/soak/.env.stresstest.example tests/soak/README.md tests/soak/runner.ts +git commit -m "$(cat <<'EOF' +feat(soak): scaffold tests/soak package + +Placeholder runner, tsconfig with @bot alias to tests/e2e/bot, +gitignored .env.stresstest + artifacts. Real behavior follows +in Task 10 onward. + +Co-Authored-By: Claude Opus 4.6 (1M context) +EOF +)" +``` + +--- + +### Task 10: Core types and `Deferred` helper + +Pure TypeScript with Vitest tests. No browser, no network. 
Establishes the type surface the rest of the harness will target. + +**Files:** +- Create: `tests/soak/core/types.ts` +- Create: `tests/soak/core/deferred.ts` +- Create: `tests/soak/tests/deferred.test.ts` + +- [ ] **Step 1: Write the failing test for `Deferred`** + +Create `tests/soak/tests/deferred.test.ts`: + +```typescript +import { describe, it, expect } from 'vitest'; +import { deferred } from '../core/deferred'; + +describe('deferred', () => { + it('resolves with the given value', async () => { + const d = deferred(); + d.resolve('hello'); + await expect(d.promise).resolves.toBe('hello'); + }); + + it('rejects with the given error', async () => { + const d = deferred(); + const err = new Error('boom'); + d.reject(err); + await expect(d.promise).rejects.toBe(err); + }); + + it('ignores second resolve calls', async () => { + const d = deferred(); + d.resolve(1); + d.resolve(2); + await expect(d.promise).resolves.toBe(1); + }); +}); +``` + +- [ ] **Step 2: Run the test to verify it fails** + +```bash +cd tests/soak +npx vitest run tests/deferred.test.ts +``` + +Expected: FAIL — module `../core/deferred` does not exist. + +- [ ] **Step 3: Implement `deferred`** + +Create `tests/soak/core/deferred.ts`: + +```typescript +/** + * Promise deferred primitive — lets external code resolve or reject + * a promise. Used by RoomCoordinator for host→joiners handoff. + */ + +export interface Deferred<T> { + promise: Promise<T>; + resolve(value: T): void; + reject(error: unknown): void; +} + +export function deferred<T>(): Deferred<T> { + let resolve!: (value: T) => void; + let reject!: (error: unknown) => void; + const promise = new Promise<T>((res, rej) => { + resolve = res; + reject = rej; + }); + return { promise, resolve, reject }; +} +``` + +- [ ] **Step 4: Run tests to verify they pass** + +```bash +npx vitest run tests/deferred.test.ts +``` + +Expected: 3 passed. 
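To make the host→joiners handoff concrete, here is a minimal sketch of how a single `Deferred` lets several joiners wait for a room code that does not exist yet. The helper is repeated so the block runs on its own; the session keys and the `joined` / `handoffDone` names are assumptions for illustration only:

```typescript
// Self-contained copy of the helper from Step 3 so the sketch runs on its own.
interface Deferred<T> {
  promise: Promise<T>;
  resolve(value: T): void;
  reject(error: unknown): void;
}

function deferred<T>(): Deferred<T> {
  let resolve!: (value: T) => void;
  let reject!: (error: unknown) => void;
  const promise = new Promise<T>((res, rej) => {
    resolve = res;
    reject = rej;
  });
  return { promise, resolve, reject };
}

// Hypothetical handoff: joiners start waiting before the host has a room code.
const roomCode = deferred<string>();
const joined: string[] = [];

const joiners = Promise.all(
  ['soak_01', 'soak_02', 'soak_03'].map(async (key) => {
    const code = await roomCode.promise; // waits until the host announces
    joined.push(`${key}:${code}`);
  }),
);

// Host side: the first resolve settles the promise, the second is a no-op.
roomCode.resolve('ABCD');
roomCode.resolve('WXYZ');

const handoffDone = joiners.then(() => {
  console.log(joined.join(' '));
  return joined;
});
```

Because a promise settles at most once, a host that accidentally announces twice cannot hand different codes to different joiners.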
+ +- [ ] **Step 5: Create `core/types.ts` with the scenario interfaces** + +```typescript +/** + * Core type definitions for the soak harness. + * + * Contracts here are consumed by runner.ts, SessionPool, scenarios, + * and the dashboard. Keep this file small and stable. + */ + +import type { BrowserContext, Page } from 'playwright-core'; +import type { GolfBot } from '../../e2e/bot/golf-bot'; + +// ============================================================================= +// Accounts & sessions +// ============================================================================= + +export interface Account { + /** Stable key used in logs, e.g. "soak_00". */ + key: string; + username: string; + password: string; + /** JWT returned from /api/auth/login, may be refreshed by SessionPool. */ + token: string; +} + +export interface Session { + account: Account; + context: BrowserContext; + page: Page; + bot: GolfBot; + /** Convenience mirror of account.key. */ + key: string; +} + +// ============================================================================= +// Scenarios +// ============================================================================= + +export interface ScenarioNeeds { + /** Total number of authenticated sessions the scenario requires. */ + accounts: number; + /** How many rooms to partition sessions into (default: 1). */ + rooms?: number; + /** CPUs to add per room (default: 0). */ + cpusPerRoom?: number; +} + +/** Free-form per-scenario config merged with CLI flags. */ +export type ScenarioConfig = Record<string, unknown>; + +export interface ScenarioError { + room: string; + reason: string; + detail?: string; + timestamp: number; +} + +export interface ScenarioResult { + gamesCompleted: number; + errors: ScenarioError[]; + durationMs: number; + customMetrics?: Record<string, number>; +} + +export interface ScenarioContext { + /** Merged config: CLI flags → env → scenario defaults → runner defaults. */ + config: ScenarioConfig; + /** Pre-authenticated sessions; ordered. 
*/ + sessions: Session[]; + coordinator: RoomCoordinatorApi; + dashboard: DashboardReporter; + logger: Logger; + signal: AbortSignal; + /** Reset the per-room watchdog. Call at each progress point. */ + heartbeat(roomId: string): void; +} + +export interface Scenario { + name: string; + description: string; + defaultConfig: ScenarioConfig; + needs: ScenarioNeeds; + run(ctx: ScenarioContext): Promise<ScenarioResult>; +} + +// ============================================================================= +// Room coordination +// ============================================================================= + +export interface RoomCoordinatorApi { + announce(roomId: string, code: string): void; + await(roomId: string, timeoutMs?: number): Promise<string>; +} + +// ============================================================================= +// Dashboard reporter +// ============================================================================= + +export interface RoomState { + phase?: string; + currentPlayer?: string; + hole?: number; + totalHoles?: number; + game?: number; + totalGames?: number; + moves?: number; + players?: Array<{ key: string; score: number | null; isActive: boolean }>; + message?: string; +} + +export interface DashboardReporter { + update(roomId: string, state: Partial<RoomState>): void; + log(level: 'info' | 'warn' | 'error', msg: string, meta?: object): void; + incrementMetric(name: string, by?: number): void; +} + +// ============================================================================= +// Logger +// ============================================================================= + +export type LogLevel = 'debug' | 'info' | 'warn' | 'error'; + +export interface Logger { + debug(msg: string, meta?: object): void; + info(msg: string, meta?: object): void; + warn(msg: string, meta?: object): void; + error(msg: string, meta?: object): void; + child(meta: object): Logger; +} +``` + +- [ ] **Step 6: Verify tsx still parses the runner** + +```bash +cd tests/soak +npx tsx 
runner.ts +``` + +Expected: still prints the placeholder output; no TypeScript errors from the new `core/` files (they're not imported yet). + +- [ ] **Step 7: Commit** + +```bash +git add tests/soak/core/deferred.ts tests/soak/core/types.ts tests/soak/tests/deferred.test.ts +git commit -m "$(cat <<'EOF' +feat(soak): core types + Deferred primitive + +Establishes the Scenario/Session/Logger/DashboardReporter contracts +the rest of the harness builds on. Deferred is the building block +for RoomCoordinator's host→joiners handoff. + +Co-Authored-By: Claude Opus 4.6 (1M context) +EOF +)" +``` + +--- + +### Task 11: RoomCoordinator with tests + +Tiny abstraction over `Deferred` keyed by room ID, with a timeout on `await`. + +**Files:** +- Create: `tests/soak/core/room-coordinator.ts` +- Create: `tests/soak/tests/room-coordinator.test.ts` + +- [ ] **Step 1: Write failing tests** + +```typescript +// tests/soak/tests/room-coordinator.test.ts +import { describe, it, expect } from 'vitest'; +import { RoomCoordinator } from '../core/room-coordinator'; + +describe('RoomCoordinator', () => { + it('resolves await with the announced code (announce then await)', async () => { + const rc = new RoomCoordinator(); + rc.announce('room-1', 'ABCD'); + await expect(rc.await('room-1')).resolves.toBe('ABCD'); + }); + + it('resolves await with the announced code (await then announce)', async () => { + const rc = new RoomCoordinator(); + const p = rc.await('room-2'); + rc.announce('room-2', 'WXYZ'); + await expect(p).resolves.toBe('WXYZ'); + }); + + it('rejects await after timeout if not announced', async () => { + const rc = new RoomCoordinator(); + await expect(rc.await('room-3', 50)).rejects.toThrow(/timed out/i); + }); + + it('isolates rooms — announcing room-A does not unblock room-B', async () => { + const rc = new RoomCoordinator(); + const pB = rc.await('room-B', 100); + rc.announce('room-A', 'A-CODE'); + await expect(pB).rejects.toThrow(/timed out/i); + }); +}); +``` + +- [ ] 
**Step 2: Run tests to verify they fail** + +```bash +npx vitest run tests/room-coordinator.test.ts +``` + +Expected: FAIL — module not found. + +- [ ] **Step 3: Implement `RoomCoordinator`** + +```typescript +// tests/soak/core/room-coordinator.ts +import { deferred, Deferred } from './deferred'; +import type { RoomCoordinatorApi } from './types'; + +export class RoomCoordinator implements RoomCoordinatorApi { + private rooms = new Map<string, Deferred<string>>(); + + announce(roomId: string, code: string): void { + this.getOrCreate(roomId).resolve(code); + } + + async await(roomId: string, timeoutMs: number = 30_000): Promise<string> { + const d = this.getOrCreate(roomId); + let timer: NodeJS.Timeout | undefined; + const timeout = new Promise<never>((_, reject) => { + timer = setTimeout(() => { + reject(new Error(`RoomCoordinator: room "${roomId}" timed out after ${timeoutMs}ms`)); + }, timeoutMs); + }); + try { + return await Promise.race([d.promise, timeout]); + } finally { + if (timer) clearTimeout(timer); + } + } + + private getOrCreate(roomId: string): Deferred<string> { + let d = this.rooms.get(roomId); + if (!d) { + d = deferred<string>(); + this.rooms.set(roomId, d); + } + return d; + } +} +``` + +- [ ] **Step 4: Verify tests pass** + +```bash +npx vitest run tests/room-coordinator.test.ts +``` + +Expected: 4 passed. + +- [ ] **Step 5: Commit** + +```bash +git add tests/soak/core/room-coordinator.ts tests/soak/tests/room-coordinator.test.ts +git commit -m "$(cat <<'EOF' +feat(soak): RoomCoordinator with host→joiners handoff + +Lazy Deferred per roomId with a timeout on await. Lets concurrent +joiner sessions block until their host announces the room code +without polling or page scraping. + +Co-Authored-By: Claude Opus 4.6 (1M context) +EOF +)" +``` + +--- + +### Task 12: Structured JSONL logger + +Single module, no transport, writes to `process.stdout`. Supports child loggers with bound metadata (so scenarios can emit logs with `room` / `game` context without repeating it). 
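The bound-metadata idea can be sketched independently of the real logger. This is an illustration only, assuming metadata is merged with object spread so that later sources win; `bound` and `BoundLogger` are hypothetical names and return the JSON line instead of writing it:

```typescript
type Meta = Record<string, unknown>;

interface BoundLogger {
  line(msg: string, meta?: Meta): string;
  child(meta: Meta): BoundLogger;
}

// Hypothetical helper: demonstrates the merge order only, not the harness logger.
function bound(base: Meta): BoundLogger {
  return {
    // Later spreads win: call-site meta overrides bound meta on key collisions.
    line: (msg: string, meta: Meta = {}) => JSON.stringify({ msg, ...base, ...meta }),
    // A child carries everything its parent bound, plus its own keys.
    child: (meta: Meta) => bound({ ...base, ...meta }),
  };
}

const run = bound({ runId: 'r1' });
const room = run.child({ room: 'room-1' });
const game = room.child({ game: 3 });

// The call site only supplies what is new; runId, room and game come along.
console.log(game.line('turn_done', { turnMs: 420 }));
```

One consequence of spread-based merging is that a call-site key with the same name as a bound key silently replaces it, so bound keys such as `room` are best treated as reserved.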
+ +**Files:** +- Create: `tests/soak/core/logger.ts` +- Create: `tests/soak/tests/logger.test.ts` + +- [ ] **Step 1: Write failing tests** + +```typescript +// tests/soak/tests/logger.test.ts +import { describe, it, expect, beforeEach, vi } from 'vitest'; +import { createLogger } from '../core/logger'; + +describe('logger', () => { + let writes: string[]; + let write: (s: string) => boolean; + + beforeEach(() => { + writes = []; + write = (s: string) => { + writes.push(s); + return true; + }; + }); + + it('emits a JSON line per call with level and msg', () => { + const log = createLogger({ runId: 'r1', write }); + log.info('hello'); + expect(writes).toHaveLength(1); + const parsed = JSON.parse(writes[0]); + expect(parsed.level).toBe('info'); + expect(parsed.msg).toBe('hello'); + expect(parsed.runId).toBe('r1'); + expect(parsed.timestamp).toBeTypeOf('string'); + }); + + it('merges meta into the log line', () => { + const log = createLogger({ runId: 'r1', write }); + log.warn('slow', { turnMs: 3000 }); + const parsed = JSON.parse(writes[0]); + expect(parsed.turnMs).toBe(3000); + expect(parsed.level).toBe('warn'); + }); + + it('child logger inherits parent meta', () => { + const log = createLogger({ runId: 'r1', write }); + const roomLog = log.child({ room: 'room-1' }); + roomLog.info('game_start'); + const parsed = JSON.parse(writes[0]); + expect(parsed.room).toBe('room-1'); + expect(parsed.runId).toBe('r1'); + }); + + it('respects minimum level', () => { + const log = createLogger({ runId: 'r1', write, minLevel: 'warn' }); + log.debug('nope'); + log.info('nope'); + log.warn('yes'); + log.error('yes'); + expect(writes).toHaveLength(2); + }); +}); +``` + +- [ ] **Step 2: Run tests to verify they fail** + +```bash +npx vitest run tests/logger.test.ts +``` + +Expected: FAIL — module not found. 
+ +- [ ] **Step 3: Implement the logger** + +```typescript +// tests/soak/core/logger.ts +import type { Logger, LogLevel } from './types'; + +const LEVEL_ORDER: Record<LogLevel, number> = { + debug: 0, + info: 1, + warn: 2, + error: 3, +}; + +export interface LoggerOptions { + runId: string; + minLevel?: LogLevel; + /** Defaults to process.stdout.write bound to stdout. Override for tests. */ + write?: (line: string) => boolean; + baseMeta?: Record<string, unknown>; +} + +export function createLogger(opts: LoggerOptions): Logger { + const minLevel = opts.minLevel ?? 'info'; + const write = opts.write ?? ((s: string) => process.stdout.write(s)); + const baseMeta = opts.baseMeta ?? {}; + + function emit(level: LogLevel, msg: string, meta?: object): void { + if (LEVEL_ORDER[level] < LEVEL_ORDER[minLevel]) return; + const line = JSON.stringify({ + timestamp: new Date().toISOString(), + level, + msg, + runId: opts.runId, + ...baseMeta, + ...(meta ?? {}), + }) + '\n'; + write(line); + } + + const logger: Logger = { + debug: (msg, meta) => emit('debug', msg, meta), + info: (msg, meta) => emit('info', msg, meta), + warn: (msg, meta) => emit('warn', msg, meta), + error: (msg, meta) => emit('error', msg, meta), + child: (meta) => + createLogger({ + runId: opts.runId, + minLevel, + write, + baseMeta: { ...baseMeta, ...meta }, + }), + }; + + return logger; +} +``` + +- [ ] **Step 4: Verify tests pass** + +```bash +npx vitest run tests/logger.test.ts +``` + +Expected: 4 passed. + +- [ ] **Step 5: Commit** + +```bash +git add tests/soak/core/logger.ts tests/soak/tests/logger.test.ts +git commit -m "$(cat <<'EOF' +feat(soak): structured JSONL logger with child contexts + +Single file, no transport, writes one JSON line per call to stdout. +Child loggers inherit parent meta so scenarios can bind room/game +context once and forget about it. 
+ +Co-Authored-By: Claude Opus 4.6 (1M context) +EOF +)" +``` + +--- + +## Phase 3 — SessionPool and seeding + +### Task 13: SessionPool with HTTP registration and localStorage warm-start + +This is the biggest single module. It owns browser context lifecycle, seeds accounts on cold start, logs in on warm start, and exposes a simple `acquire()` API to scenarios. + +**Files:** +- Create: `tests/soak/core/session-pool.ts` + +Testing: manual via `scripts/seed-accounts.ts` in Task 14 and the first real runner invocation in Task 17. No Vitest test for this — it's an integration module that needs a real browser. + +- [ ] **Step 1: Create `tests/soak/core/session-pool.ts` — imports and types** + +```typescript +// tests/soak/core/session-pool.ts +import * as fs from 'fs'; +import * as path from 'path'; +import { + Browser, + BrowserContext, + chromium, +} from 'playwright-core'; +import { GolfBot } from '../../e2e/bot/golf-bot'; +import type { Account, Session, Logger } from './types'; + +export interface SeedOptions { + /** Full base URL of the target server, e.g. https://staging.adlee.work. */ + targetUrl: string; + /** Invite code to pass to /api/auth/register. */ + inviteCode: string; + /** Number of accounts to create. */ + count: number; +} + +export interface SessionPoolOptions { + targetUrl: string; + inviteCode: string; + credFile: string; // absolute path to .env.stresstest + logger: Logger; + /** Optional override for the browser to attach contexts to. If absent, SessionPool launches its own. */ + browser?: Browser; + /** Passed through to context.newContext. Useful for viewport overrides in tests. 
*/ + contextOptions?: Parameters<Browser['newContext']>[0]; +} +``` + +- [ ] **Step 2: Implement cred-file read/write** + +Append to `session-pool.ts`: + +```typescript +function readCredFile(filePath: string): Account[] | null { + if (!fs.existsSync(filePath)) return null; + const content = fs.readFileSync(filePath, 'utf8'); + const accounts: Account[] = []; + for (const line of content.split('\n')) { + const trimmed = line.trim(); + if (!trimmed || trimmed.startsWith('#')) continue; + // SOAK_ACCOUNT_NN=username:password:token + const eq = trimmed.indexOf('='); + if (eq === -1) continue; + const key = trimmed.slice(0, eq); + const value = trimmed.slice(eq + 1); + const m = key.match(/^SOAK_ACCOUNT_(\d+)$/); + if (!m) continue; + const [username, password, token] = value.split(':'); + if (!username || !password || !token) continue; + const idx = parseInt(m[1], 10); + accounts.push({ + key: `soak_${String(idx).padStart(2, '0')}`, + username, + password, + token, + }); + } + return accounts.length > 0 ? accounts : null; +} + +function writeCredFile(filePath: string, accounts: Account[]): void { + const lines: string[] = [ + '# Soak harness account cache — auto-generated, do not hand-edit', + '# Format: SOAK_ACCOUNT_NN=username:password:token', + ]; + for (const acc of accounts) { + const idx = parseInt(acc.key.replace('soak_', ''), 10); + const key = `SOAK_ACCOUNT_${String(idx).padStart(2, '0')}`; + lines.push(`${key}=${acc.username}:${acc.password}:${acc.token}`); + } + fs.writeFileSync(filePath, lines.join('\n') + '\n', { mode: 0o600 }); +} +``` + +- [ ] **Step 3: Implement the HTTP register call** + +```typescript +interface RegisterResponse { + user: { id: string; username: string }; + token: string; + expires_at: string; +} + +async function registerAccount( + targetUrl: string, + username: string, + password: string, + email: string, + inviteCode: string, +): Promise<string> { + const res = await fetch(`${targetUrl}/api/auth/register`, { + method: 'POST', + headers: { 'Content-Type': 
'application/json' }, + body: JSON.stringify({ username, password, email, invite_code: inviteCode }), + }); + if (!res.ok) { + const body = await res.text().catch(() => ''); + throw new Error(`register failed: ${res.status} ${body}`); + } + const data = (await res.json()) as RegisterResponse; + if (!data.token) { + throw new Error(`register returned no token: ${JSON.stringify(data)}`); + } + return data.token; +} + +async function loginAccount( + targetUrl: string, + username: string, + password: string, +): Promise<string> { + const res = await fetch(`${targetUrl}/api/auth/login`, { + method: 'POST', + headers: { 'Content-Type': 'application/json' }, + body: JSON.stringify({ username, password }), + }); + if (!res.ok) { + const body = await res.text().catch(() => ''); + throw new Error(`login failed: ${res.status} ${body}`); + } + const data = (await res.json()) as RegisterResponse; + return data.token; +} + +function randomSuffix(): string { + return Math.random().toString(36).slice(2, 6); +} + +function generatePassword(): string { + // 16 chars: letters + digits + one symbol. Meets 8-char minimum from auth_service. + // Split across halves so repo secret-scanners don't flag the string as base64 + const lower = 'abcdefghijkm' + 'npqrstuvwxyz'; // pragma: allowlist secret + const upper = 'ABCDEFGHJKLM' + 'NPQRSTUVWXYZ'; // pragma: allowlist secret + const digits = '23456789'; + const chars = lower + upper + digits; + let out = ''; + for (let i = 0; i < 15; i++) { + out += chars[Math.floor(Math.random() * chars.length)]; + } + return out + '!'; +} +``` + +- [ ] **Step 4: Implement the `SessionPool` class** + +```typescript +export class SessionPool { + private accounts: Account[] = []; + private ownedBrowser: Browser | null = null; + private browser: Browser | null; + private activeSessions: Session[] = []; + + constructor(private opts: SessionPoolOptions) { + this.browser = opts.browser ?? 
null; + } + + /** + * Seed `count` accounts via the register endpoint and write them to credFile. + * Safe to call multiple times — skips accounts already in the file. + */ + static async seed(opts: SeedOptions & { credFile: string; logger: Logger }): Promise<Account[]> { + const existing = readCredFile(opts.credFile) ?? []; + const existingKeys = new Set(existing.map((a) => a.key)); + const created: Account[] = [...existing]; + + for (let i = 0; i < opts.count; i++) { + const key = `soak_${String(i).padStart(2, '0')}`; + if (existingKeys.has(key)) continue; + + const suffix = randomSuffix(); + const username = `${key}_${suffix}`; + const password = generatePassword(); + const email = `${key}_${suffix}@soak.test`; + + opts.logger.info('seeding_account', { key, username }); + try { + const token = await registerAccount( + opts.targetUrl, + username, + password, + email, + opts.inviteCode, + ); + created.push({ key, username, password, token }); + writeCredFile(opts.credFile, created); + } catch (err) { + opts.logger.error('seed_failed', { + key, + error: err instanceof Error ? err.message : String(err), + }); + throw err; + } + } + return created; + } + + /** + * Load accounts from credFile, auto-seeding if the file is missing + * or holds fewer than desiredCount accounts. + */ + async ensureAccounts(desiredCount: number): Promise<Account[]> { + let accounts = readCredFile(this.opts.credFile); + if (!accounts || accounts.length < desiredCount) { + this.opts.logger.warn('cred_file_missing_or_short', { + found: accounts?.length ?? 
0, + desired: desiredCount, + }); + accounts = await SessionPool.seed({ + targetUrl: this.opts.targetUrl, + inviteCode: this.opts.inviteCode, + count: desiredCount, + credFile: this.opts.credFile, + logger: this.opts.logger, + }); + } + this.accounts = accounts.slice(0, desiredCount); + return this.accounts; + } + + /** + * Launch the browser if not provided, create N contexts, log each in via + * localStorage injection (falling back to POST /api/auth/login if the + * cached token is rejected), and return the live sessions. + */ + async acquire(count: number): Promise<Session[]> { + await this.ensureAccounts(count); + if (!this.browser) { + this.ownedBrowser = await chromium.launch({ headless: true }); + this.browser = this.ownedBrowser; + } + + const sessions: Session[] = []; + for (let i = 0; i < count; i++) { + const account = this.accounts[i]; + const context = await this.browser.newContext(this.opts.contextOptions); + await this.injectAuth(context, account); + const page = await context.newPage(); + await page.goto(this.opts.targetUrl); + const bot = new GolfBot(page); + sessions.push({ account, context, page, bot, key: account.key }); + } + this.activeSessions = sessions; + return sessions; + } + + /** + * Inject the cached JWT into localStorage BEFORE any page loads. + * Uses addInitScript so the token is present on the first navigation. + * If the cached token is rejected later, acquire() falls back to login. + */ + private async injectAuth(context: BrowserContext, account: Account): Promise<void> { + // Try the cached token first + try { + await context.addInitScript( + ({ token, username }) => { + window.localStorage.setItem('authToken', token); + window.localStorage.setItem( + 'authUser', + JSON.stringify({ id: '', username, role: 'user', email_verified: true }), + ); + }, + { token: account.token, username: account.username }, + ); + } catch (err) { + this.opts.logger.warn('inject_auth_failed', { + account: account.key, + error: err instanceof Error ? 
err.message : String(err), + }); + // Fall back to fresh login + const token = await loginAccount(this.opts.targetUrl, account.username, account.password); + account.token = token; + writeCredFile(this.opts.credFile, this.accounts); + await context.addInitScript( + ({ token, username }) => { + window.localStorage.setItem('authToken', token); + window.localStorage.setItem( + 'authUser', + JSON.stringify({ id: '', username, role: 'user', email_verified: true }), + ); + }, + { token, username: account.username }, + ); + } + } + + /** Close all active contexts. Safe to call multiple times. */ + async release(): Promise<void> { + for (const session of this.activeSessions) { + try { + await session.context.close(); + } catch { + // ignore + } + } + this.activeSessions = []; + if (this.ownedBrowser) { + try { + await this.ownedBrowser.close(); + } catch { + // ignore + } + this.ownedBrowser = null; + this.browser = null; + } + } +} +``` + +- [ ] **Step 5: Syntax-check by invoking tsx** + +```bash +cd tests/soak +npx tsx -e "import('./core/session-pool').then(() => console.log('ok'))" +``` + +Expected: `ok`. No TypeScript errors. + +- [ ] **Step 6: Commit** + +```bash +git add tests/soak/core/session-pool.ts +git commit -m "$(cat <<'EOF' +feat(soak): SessionPool — seed, login, acquire contexts + +Owns 16 BrowserContexts, seeds via POST /api/auth/register with the +invite code on cold start, warm-starts via localStorage injection of +the cached JWT, falls back to POST /api/auth/login if the token is +rejected. Exposes acquire(n) for scenarios. + +Co-Authored-By: Claude Opus 4.6 (1M context) +EOF +)" +``` + +--- + +### Task 14: `seed-accounts.ts` CLI wrapper + +Tiny standalone entry point that lets you pre-seed before the first harness run. Reuses `SessionPool.seed`. + +**Files:** +- Create: `tests/soak/scripts/seed-accounts.ts` + +- [ ] **Step 1: Write the script** + +```typescript +#!/usr/bin/env tsx +/** + * Seed N soak-harness accounts via the register endpoint. 
+ * + * Usage: + * TEST_URL=http://localhost:8000 \ + * SOAK_INVITE_CODE=SOAKTEST \ + * npm run seed -- --count=16 + */ + +import * as path from 'path'; +import { SessionPool } from '../core/session-pool'; +import { createLogger } from '../core/logger'; + +function parseArgs(argv: string[]): { count: number } { + const result = { count: 16 }; + for (const arg of argv.slice(2)) { + const m = arg.match(/^--count=(\d+)$/); + if (m) result.count = parseInt(m[1], 10); + } + return result; +} + +async function main(): Promise<void> { + const { count } = parseArgs(process.argv); + const targetUrl = process.env.TEST_URL ?? 'http://localhost:8000'; + const inviteCode = process.env.SOAK_INVITE_CODE; + if (!inviteCode) { + console.error('SOAK_INVITE_CODE env var is required'); + console.error(' Local dev: SOAK_INVITE_CODE=SOAKTEST'); + console.error(' Staging: SOAK_INVITE_CODE=5VC2MCCN'); + process.exit(2); + } + + const credFile = path.resolve(__dirname, '..', '.env.stresstest'); + const logger = createLogger({ runId: `seed-${Date.now()}` }); + + logger.info('seed_start', { count, targetUrl, credFile }); + try { + const accounts = await SessionPool.seed({ + targetUrl, + inviteCode, + count, + credFile, + logger, + }); + logger.info('seed_complete', { created: accounts.length }); + console.error(`Seeded ${accounts.length} accounts → ${credFile}`); + } catch (err) { + logger.error('seed_failed', { + error: err instanceof Error ? 
err.message : String(err), + }); + process.exit(1); + } +} + +main(); +``` + +- [ ] **Step 2: Run it against local dev to verify end-to-end** + +With the dev server running and the `SOAKTEST` invite flagged: + +```bash +cd tests/soak +TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run seed -- --count=4 +``` + +Expected: +- Log lines `seeding_account` × 4 +- Log line `seed_complete` +- `tests/soak/.env.stresstest` file created with 4 `SOAK_ACCOUNT_NN=...` lines + +Verify: + +```bash +cat tests/soak/.env.stresstest | head +``` + +Expected: 4 account lines. + +Also verify the accounts got flagged: + +```bash +psql -d golfgame -c "SELECT username, is_test_account FROM users_v2 WHERE username LIKE 'soak_%' ORDER BY username;" +``` + +Expected: 4 rows, all with `is_test_account | t`. + +- [ ] **Step 3: Commit** + +```bash +git add tests/soak/scripts/seed-accounts.ts +git commit -m "$(cat <<'EOF' +feat(soak): scripts/seed-accounts.ts CLI wrapper + +Thin standalone entry for pre-seeding N accounts before the first +harness run. Wraps SessionPool.seed and writes .env.stresstest. + +Co-Authored-By: Claude Opus 4.6 (1M context) +EOF +)" +``` + +--- + +## Phase 4 — First scenario, config, runner (end-to-end milestone) + +### Task 15: Shared multiplayer-game helper + +Pulls the "run one full game in one room" logic out of the scenarios so `populate` and `stress` share it. Takes a room's sessions and a config, loops until the game ends. + +**Files:** +- Create: `tests/soak/scenarios/shared/multiplayer-game.ts` + +- [ ] **Step 1: Create the helper module** + +```typescript +// tests/soak/scenarios/shared/multiplayer-game.ts +import type { Session, ScenarioContext } from '../../core/types'; + +export interface MultiplayerGameOptions { + roomId: string; + holes: number; + decks: number; + cpusPerRoom: number; + cpuPersonality?: string; + /** Per-turn think time in [min, max] ms. 
*/ + thinkTimeMs: [number, number]; + /** Max wall-clock time before giving up on the game (ms). */ + maxDurationMs?: number; +} + +export interface MultiplayerGameResult { + completed: boolean; + turns: number; + durationMs: number; + error?: string; +} + +function randomInt(min: number, max: number): number { + return Math.floor(Math.random() * (max - min + 1)) + min; +} + +async function sleep(ms: number): Promise<void> { + return new Promise((resolve) => setTimeout(resolve, ms)); +} + +/** + * Host + joiners play one full multiplayer game end to end. + * The host creates the room, announces the code via the coordinator, + * joiners wait for the code, the host adds CPUs and starts, everyone + * loops on isMyTurn/playTurn until round_over or game_over. + */ +export async function runOneMultiplayerGame( + ctx: ScenarioContext, + sessions: Session[], + opts: MultiplayerGameOptions, +): Promise<MultiplayerGameResult> { + const start = Date.now(); + const [host, ...joiners] = sessions; + const maxDuration = opts.maxDurationMs ?? 
5 * 60_000; + + try { + // Host creates game + const code = await host.bot.createGame(host.account.username); + ctx.coordinator.announce(opts.roomId, code); + ctx.heartbeat(opts.roomId); + ctx.dashboard.update(opts.roomId, { phase: 'lobby' }); + ctx.logger.info('room_created', { room: opts.roomId, code }); + + // Joiners join concurrently + await Promise.all( + joiners.map(async (joiner) => { + const awaited = await ctx.coordinator.await(opts.roomId); + await joiner.bot.joinGame(awaited, joiner.account.username); + }), + ); + ctx.heartbeat(opts.roomId); + + // Host adds CPUs (if any) and starts + for (let i = 0; i < opts.cpusPerRoom; i++) { + await host.bot.addCPU(opts.cpuPersonality); + } + await host.bot.startGame({ holes: opts.holes, decks: opts.decks }); + ctx.heartbeat(opts.roomId); + ctx.dashboard.update(opts.roomId, { phase: 'playing', totalHoles: opts.holes }); + + // Concurrent turn loops — one per session + const turnCounts = new Array<number>(sessions.length).fill(0); + + async function sessionLoop(sessionIdx: number): Promise<void> { + const session = sessions[sessionIdx]; + while (true) { + if (ctx.signal.aborted) return; + if (Date.now() - start > maxDuration) return; + + const phase = await session.bot.getGamePhase(); + if (phase === 'game_over' || phase === 'round_over') return; + + if (await session.bot.isMyTurn()) { + await session.bot.playTurn(); + turnCounts[sessionIdx]++; + ctx.heartbeat(opts.roomId); + ctx.dashboard.update(opts.roomId, { + currentPlayer: session.account.username, + moves: turnCounts.reduce((a, b) => a + b, 0), + }); + const thinkMs = randomInt(opts.thinkTimeMs[0], opts.thinkTimeMs[1]); + await sleep(thinkMs); + } else { + await sleep(200); + } + } + } + + await Promise.all(sessions.map((_, i) => sessionLoop(i))); + + const totalTurns = turnCounts.reduce((a, b) => a + b, 0); + ctx.dashboard.update(opts.roomId, { phase: 'round_over' }); + return { + completed: true, + turns: totalTurns, + durationMs: Date.now() - start, + }; + } catch (err) { 
+ return { + completed: false, + turns: 0, + durationMs: Date.now() - start, + error: err instanceof Error ? err.message : String(err), + }; + } +} +``` + +- [ ] **Step 2: Syntax-check** + +```bash +cd tests/soak +npx tsx -e "import('./scenarios/shared/multiplayer-game').then(() => console.log('ok'))" +``` + +Expected: `ok`. + +- [ ] **Step 3: Commit** + +```bash +git add tests/soak/scenarios/shared/multiplayer-game.ts +git commit -m "$(cat <<'EOF' +feat(soak): shared runOneMultiplayerGame helper + +Encapsulates the host-creates/joiners-join/loop-until-done flow so +populate and stress scenarios don't duplicate it. Honors abort +signal and a max-duration timeout, heartbeats on every turn. + +Co-Authored-By: Claude Opus 4.6 (1M context) +EOF +)" +``` + +--- + +### Task 16: Populate scenario (minimal version) + +Partitions sessions into rooms, runs `gamesPerRoom` games per room in parallel, aggregates results. + +**Files:** +- Create: `tests/soak/scenarios/populate.ts` +- Create: `tests/soak/scenarios/index.ts` + +- [ ] **Step 1: Create `scenarios/populate.ts`** + +```typescript +// tests/soak/scenarios/populate.ts +import type { + Scenario, + ScenarioContext, + ScenarioResult, + ScenarioError, + Session, +} from '../core/types'; +import { runOneMultiplayerGame } from './shared/multiplayer-game'; + +const CPU_PERSONALITIES = ['Sofia', 'Marcus', 'Kenji', 'Priya']; + +interface PopulateConfig { + gamesPerRoom: number; + holes: number; + decks: number; + rooms: number; + cpusPerRoom: number; + thinkTimeMs: [number, number]; + interGamePauseMs: number; +} + +function chunk<T>(arr: T[], size: number): T[][] { + const out: T[][] = []; + for (let i = 0; i < arr.length; i += size) { + out.push(arr.slice(i, i + size)); + } + return out; +} + +async function sleep(ms: number): Promise<void> { + return new Promise((resolve) => setTimeout(resolve, ms)); +} + +async function runRoom( + ctx: ScenarioContext, + cfg: PopulateConfig, + roomIdx: number, + sessions: Session[], +): Promise<{ 
completed: number; errors: ScenarioError[] }> { + const roomId = `room-${roomIdx}`; + const cpuPersonality = CPU_PERSONALITIES[roomIdx % CPU_PERSONALITIES.length]; + let completed = 0; + const errors: ScenarioError[] = []; + + for (let gameNum = 0; gameNum < cfg.gamesPerRoom; gameNum++) { + if (ctx.signal.aborted) break; + ctx.dashboard.update(roomId, { game: gameNum + 1, totalGames: cfg.gamesPerRoom }); + ctx.logger.info('game_start', { room: roomId, game: gameNum + 1 }); + + const result = await runOneMultiplayerGame(ctx, sessions, { + roomId, + holes: cfg.holes, + decks: cfg.decks, + cpusPerRoom: cfg.cpusPerRoom, + cpuPersonality, + thinkTimeMs: cfg.thinkTimeMs, + }); + + if (result.completed) { + completed++; + ctx.logger.info('game_complete', { + room: roomId, + game: gameNum + 1, + turns: result.turns, + durationMs: result.durationMs, + }); + } else { + errors.push({ + room: roomId, + reason: 'game_failed', + detail: result.error, + timestamp: Date.now(), + }); + ctx.logger.error('game_failed', { room: roomId, game: gameNum + 1, error: result.error }); + } + + if (gameNum < cfg.gamesPerRoom - 1) { + await sleep(cfg.interGamePauseMs); + } + } + + return { completed, errors }; +} + +const populate: Scenario = { + name: 'populate', + description: 'Long multi-round games to populate scoreboards', + needs: { accounts: 16, rooms: 4, cpusPerRoom: 1 }, + defaultConfig: { + gamesPerRoom: 10, + holes: 9, + decks: 2, + rooms: 4, + cpusPerRoom: 1, + thinkTimeMs: [800, 2200], + interGamePauseMs: 3000, + }, + + async run(ctx: ScenarioContext): Promise<ScenarioResult> { + const start = Date.now(); + const cfg = ctx.config as unknown as PopulateConfig; + + const perRoom = Math.floor(ctx.sessions.length / cfg.rooms); + if (perRoom * cfg.rooms !== ctx.sessions.length) { + throw new Error( + `populate: ${ctx.sessions.length} sessions does not divide evenly into ${cfg.rooms} rooms`, + ); + } + const roomSessions = chunk(ctx.sessions, perRoom); + + const results = await Promise.allSettled( + 
roomSessions.map((sessions, idx) => runRoom(ctx, cfg, idx, sessions)), + ); + + let gamesCompleted = 0; + const errors: ScenarioError[] = []; + results.forEach((r, idx) => { + if (r.status === 'fulfilled') { + gamesCompleted += r.value.completed; + errors.push(...r.value.errors); + } else { + errors.push({ + room: `room-${idx}`, + reason: 'room_threw', + detail: r.reason instanceof Error ? r.reason.message : String(r.reason), + timestamp: Date.now(), + }); + } + }); + + return { + gamesCompleted, + errors, + durationMs: Date.now() - start, + }; + }, +}; + +export default populate; +``` + +- [ ] **Step 2: Create `scenarios/index.ts` registry** + +```typescript +// tests/soak/scenarios/index.ts +import type { Scenario } from '../core/types'; +import populate from './populate'; + +const registry: Record<string, Scenario> = { + populate, +}; + +export function getScenario(name: string): Scenario | undefined { + return registry[name]; +} + +export function listScenarios(): Scenario[] { + return Object.values(registry); +} +``` + +- [ ] **Step 3: Syntax-check** + +```bash +cd tests/soak +npx tsx -e "import('./scenarios/index').then((m) => console.log(m.listScenarios().map(s => s.name)))" +``` + +Expected: `['populate']`. + +- [ ] **Step 4: Commit** + +```bash +git add tests/soak/scenarios/populate.ts tests/soak/scenarios/index.ts +git commit -m "$(cat <<'EOF' +feat(soak): populate scenario + scenario registry + +Partitions sessions into N rooms, runs gamesPerRoom games per room +in parallel via Promise.allSettled so a failure in one room never +unwinds the others. Errors roll up into ScenarioResult.errors. + +Co-Authored-By: Claude Opus 4.6 (1M context) +EOF +)" +``` + +--- + +### Task 17: Config parsing with tests + +CLI flags, env vars, scenario defaults, runner defaults — merged in that precedence order. 
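
The precedence above can be illustrated with a small standalone sketch. This is not the harness's `mergeConfig`: `layerConfig` and `dropUndefined` are hypothetical names and the four sample layers are made-up values. It assumes the highest-priority layer is spread last so that it overwrites the others, and that an unset (undefined) value must never mask a lower layer.

```typescript
type Layer = Record<string, unknown>;

// Drop keys whose value is undefined so an unset flag cannot mask a lower layer.
function dropUndefined(layer: Layer): Layer {
  return Object.fromEntries(Object.entries(layer).filter(([, v]) => v !== undefined));
}

// Hypothetical helper: lowest-priority layer first, later spreads overwrite earlier ones.
function layerConfig(
  cli: Layer,
  env: Layer,
  scenarioDefaults: Layer,
  runnerDefaults: Layer,
): Layer {
  return {
    ...dropUndefined(runnerDefaults),
    ...dropUndefined(scenarioDefaults),
    ...dropUndefined(env),
    ...dropUndefined(cli),
  };
}

// Made-up sample layers, for illustration only.
const resolved = layerConfig(
  { holes: 5, rooms: undefined },           // CLI: holes set, rooms not passed
  { holes: 3, rooms: 2 },                   // env
  { holes: 9, rooms: 4, gamesPerRoom: 10 }, // scenario defaults
  { watch: 'dashboard' },                   // runner defaults
);

// holes comes from the CLI, rooms from env, gamesPerRoom from the scenario
// defaults, and watch from the runner defaults.
console.log(resolved);
```

Filtering out undefined values before spreading is the design choice worth noting: a plain spread would let an explicitly undefined CLI key erase an env or default value.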
+ +**Files:** +- Create: `tests/soak/config.ts` +- Create: `tests/soak/tests/config.test.ts` + +- [ ] **Step 1: Write failing tests** + +```typescript +// tests/soak/tests/config.test.ts +import { describe, it, expect } from 'vitest'; +import { parseArgs, mergeConfig } from '../config'; + +describe('parseArgs', () => { + it('parses --scenario and numeric flags', () => { + const r = parseArgs(['--scenario=populate', '--rooms=4', '--games-per-room=10']); + expect(r.scenario).toBe('populate'); + expect(r.rooms).toBe(4); + expect(r.gamesPerRoom).toBe(10); + }); + + it('parses watch mode', () => { + const r = parseArgs(['--scenario=populate', '--watch=none']); + expect(r.watch).toBe('none'); + }); + + it('rejects unknown watch mode', () => { + expect(() => parseArgs(['--scenario=populate', '--watch=bogus'])).toThrow(); + }); + + it('--list sets listOnly', () => { + const r = parseArgs(['--list']); + expect(r.listOnly).toBe(true); + }); +}); + +describe('mergeConfig', () => { + it('CLI flags override scenario defaults', () => { + const cfg = mergeConfig( + { games: 5, holes: 9 }, + {}, + { gamesPerRoom: 20 }, + ); + expect(cfg.gamesPerRoom).toBe(20); + }); + + it('env overrides scenario defaults but not CLI', () => { + const cfg = mergeConfig( + { games: 5, holes: 9 }, + { SOAK_HOLES: '3' }, + { holes: 7 }, + ); + expect(cfg.holes).toBe(7); // CLI wins (7 was from scenario defaults? no — CLI not set here) + // Correction: CLI not set, so env wins over scenario default + }); + + it('scenario defaults fill in unset values', () => { + const cfg = mergeConfig( + { games: 5, holes: 9 }, + {}, + { gamesPerRoom: 3 }, + ); + expect(cfg.games).toBe(5); + expect(cfg.holes).toBe(9); + expect(cfg.gamesPerRoom).toBe(3); + }); +}); +``` + +Note: the middle test has a correction inline — re-read and fix so the assertion matches precedence "CLI > env > defaults". 
Correct version: + +```typescript + it('env overrides scenario defaults but CLI overrides env', () => { + const cfg = mergeConfig( + { holes: 5 }, // CLI + { SOAK_HOLES: '3' }, // env + { holes: 9 }, // defaults + ); + expect(cfg.holes).toBe(5); // CLI wins + }); +``` + +Replace the second `it(...)` block above with this corrected version before running. + +- [ ] **Step 2: Run tests to verify they fail** + +```bash +npx vitest run tests/config.test.ts +``` + +Expected: FAIL — module not found. + +- [ ] **Step 3: Implement `config.ts`** + +```typescript +// tests/soak/config.ts + +export type WatchMode = 'none' | 'dashboard' | 'tiled'; + +export interface CliArgs { + scenario?: string; + accounts?: number; + rooms?: number; + cpusPerRoom?: number; + gamesPerRoom?: number; + holes?: number; + watch?: WatchMode; + dashboardPort?: number; + target?: string; + runId?: string; + dryRun?: boolean; + listOnly?: boolean; +} + +const VALID_WATCH: WatchMode[] = ['none', 'dashboard', 'tiled']; + +function parseInt10(s: string, name: string): number { + const n = parseInt(s, 10); + if (Number.isNaN(n)) throw new Error(`Invalid integer for ${name}: ${s}`); + return n; +} + +export function parseArgs(argv: string[]): CliArgs { + const out: CliArgs = {}; + for (const arg of argv) { + if (arg === '--list') { + out.listOnly = true; + continue; + } + if (arg === '--dry-run') { + out.dryRun = true; + continue; + } + const m = arg.match(/^--([a-z][a-z0-9-]*)=(.*)$/); + if (!m) continue; + const [, key, value] = m; + switch (key) { + case 'scenario': + out.scenario = value; + break; + case 'accounts': + out.accounts = parseInt10(value, '--accounts'); + break; + case 'rooms': + out.rooms = parseInt10(value, '--rooms'); + break; + case 'cpus-per-room': + out.cpusPerRoom = parseInt10(value, '--cpus-per-room'); + break; + case 'games-per-room': + out.gamesPerRoom = parseInt10(value, '--games-per-room'); + break; + case 'holes': + out.holes = parseInt10(value, '--holes'); + break; + case 
'watch': + if (!VALID_WATCH.includes(value as WatchMode)) { + throw new Error(`Invalid --watch value: ${value} (expected ${VALID_WATCH.join('|')})`); + } + out.watch = value as WatchMode; + break; + case 'dashboard-port': + out.dashboardPort = parseInt10(value, '--dashboard-port'); + break; + case 'target': + out.target = value; + break; + case 'run-id': + out.runId = value; + break; + default: + // Unknown flag — ignore so scenario-specific flags can be added later + break; + } + } + return out; +} + +/** + * Merge in order: scenarioDefaults → env → cli (later wins). + */ +export function mergeConfig( + cli: Record<string, unknown>, + env: Record<string, string | undefined>, + defaults: Record<string, unknown>, +): Record<string, unknown> { + const merged: Record<string, unknown> = { ...defaults }; + + // Env overlay — SOAK_UPPER_SNAKE → lowerCamel in cli space. + const envMap: Record<string, string> = { + SOAK_HOLES: 'holes', + SOAK_ROOMS: 'rooms', + SOAK_ACCOUNTS: 'accounts', + SOAK_CPUS_PER_ROOM: 'cpusPerRoom', + SOAK_GAMES_PER_ROOM: 'gamesPerRoom', + SOAK_WATCH: 'watch', + SOAK_DASHBOARD_PORT: 'dashboardPort', + }; + for (const [envKey, cfgKey] of Object.entries(envMap)) { + const v = env[envKey]; + if (v !== undefined) { + // Heuristic: numeric keys + if (/^(holes|rooms|accounts|cpusPerRoom|gamesPerRoom|dashboardPort)$/.test(cfgKey)) { + merged[cfgKey] = parseInt(v, 10); + } else { + merged[cfgKey] = v; + } + } + } + + // CLI overlay — wins over env and defaults. + for (const [k, v] of Object.entries(cli)) { + if (v !== undefined) merged[k] = v; + } + + return merged; +} +``` + +- [ ] **Step 4: Fix the failing middle test as noted in Step 1** + +Edit `tests/soak/tests/config.test.ts` and replace the second `it(...)` block inside `describe('mergeConfig')` with the corrected version provided in Step 1. + +- [ ] **Step 5: Run tests to verify they pass** + +```bash +npx vitest run tests/config.test.ts +``` + +Expected: all passing. 
+ +- [ ] **Step 6: Commit** + +```bash +git add tests/soak/config.ts tests/soak/tests/config.test.ts +git commit -m "$(cat <<'EOF' +feat(soak): CLI parsing + config precedence + +parseArgs pulls --scenario/--rooms/--watch/etc from argv, mergeConfig +layers scenarioDefaults → env → CLI so CLI flags always win. Unit +tested. + +Co-Authored-By: Claude Opus 4.6 (1M context) +EOF +)" +``` + +--- + +### Task 18: `runner.ts` entry point — first end-to-end milestone + +Replaces the placeholder runner with the real thing: parse args, build dependencies, load scenario, acquire sessions, run scenario, clean up, print summary. Supports `--watch=none` only at this stage. + +**Files:** +- Modify: `tests/soak/runner.ts` (replace placeholder) + +- [ ] **Step 1: Rewrite `runner.ts`** + +```typescript +#!/usr/bin/env tsx +/** + * Golf Soak Harness — entry point. + * + * Usage: + * TEST_URL=http://localhost:8000 \ + * SOAK_INVITE_CODE=SOAKTEST \ + * npm run soak -- --scenario=populate --rooms=1 --accounts=2 \ + * --cpus-per-room=0 --games-per-room=1 --holes=1 --watch=none + */ + +import * as path from 'path'; +import { parseArgs, mergeConfig, CliArgs } from './config'; +import { createLogger } from './core/logger'; +import { SessionPool } from './core/session-pool'; +import { RoomCoordinator } from './core/room-coordinator'; +import { getScenario, listScenarios } from './scenarios'; +import type { DashboardReporter, ScenarioContext } from './core/types'; + +function noopDashboard(): DashboardReporter { + return { + update: () => {}, + log: () => {}, + incrementMetric: () => {}, + }; +} + +function printScenarioList(): void { + console.log('Available scenarios:'); + for (const s of listScenarios()) { + console.log(` ${s.name.padEnd(12)} ${s.description}`); + console.log(` needs: accounts=${s.needs.accounts}, rooms=${s.needs.rooms ?? 1}, cpus=${s.needs.cpusPerRoom ?? 
0}`); + } +} + +async function main(): Promise<void> { + const cli: CliArgs = parseArgs(process.argv.slice(2)); + + if (cli.listOnly) { + printScenarioList(); + return; + } + + if (!cli.scenario) { + console.error('Error: --scenario= is required. Use --list to see scenarios.'); + process.exit(2); + } + + const scenario = getScenario(cli.scenario); + if (!scenario) { + console.error(`Error: unknown scenario "${cli.scenario}". Use --list to see scenarios.`); + process.exit(2); + } + + const runId = cli.runId ?? `${cli.scenario}-${new Date().toISOString().replace(/[:.]/g, '-')}`; + const targetUrl = cli.target ?? process.env.TEST_URL ?? 'http://localhost:8000'; + const inviteCode = process.env.SOAK_INVITE_CODE ?? 'SOAKTEST'; + const watch = cli.watch ?? 'dashboard'; + + const logger = createLogger({ runId }); + logger.info('run_start', { + scenario: scenario.name, + targetUrl, + watch, + cli, + }); + + // Resolve final config + const config = mergeConfig( + cli as Record<string, unknown>, + process.env, + scenario.defaultConfig, + ); + // Ensure core knobs exist + const accounts = Number(config.accounts ?? scenario.needs.accounts); + const rooms = Number(config.rooms ?? scenario.needs.rooms ?? 1); + const cpusPerRoom = Number(config.cpusPerRoom ?? scenario.needs.cpusPerRoom ?? 0); + if (accounts % rooms !== 0) { + console.error(`Error: --accounts=${accounts} does not divide evenly into --rooms=${rooms}`); + process.exit(2); + } + config.rooms = rooms; + config.cpusPerRoom = cpusPerRoom; + + if (cli.dryRun) { + logger.info('dry_run', { config }); + console.log('Dry run OK. 
Resolved config:'); + console.log(JSON.stringify(config, null, 2)); + return; + } + + if (watch !== 'none') { + logger.warn('watch_mode_not_yet_implemented', { watch }); + console.warn(`Watch mode "${watch}" not yet implemented — falling back to "none".`); + } + + // Build dependencies + const credFile = path.resolve(__dirname, '.env.stresstest'); + const pool = new SessionPool({ + targetUrl, + inviteCode, + credFile, + logger, + }); + const coordinator = new RoomCoordinator(); + const dashboard = noopDashboard(); + const abortController = new AbortController(); + + const onSignal = (sig: string) => { + logger.warn('signal_received', { signal: sig }); + abortController.abort(); + }; + process.on('SIGINT', () => onSignal('SIGINT')); + process.on('SIGTERM', () => onSignal('SIGTERM')); + + let exitCode = 0; + try { + const sessions = await pool.acquire(accounts); + logger.info('sessions_acquired', { count: sessions.length }); + + const ctx: ScenarioContext = { + config, + sessions, + coordinator, + dashboard, + logger, + signal: abortController.signal, + heartbeat: () => {}, // Task 26 wires this up + }; + + const result = await scenario.run(ctx); + logger.info('run_complete', { + gamesCompleted: result.gamesCompleted, + errors: result.errors.length, + durationMs: result.durationMs, + }); + console.log(`Games completed: ${result.gamesCompleted}`); + console.log(`Errors: ${result.errors.length}`); + console.log(`Duration: ${(result.durationMs / 1000).toFixed(1)}s`); + if (result.errors.length > 0) { + console.log('Errors:'); + for (const e of result.errors) { + console.log(` ${e.room}: ${e.reason}${e.detail ? ' — ' + e.detail : ''}`); + } + exitCode = 1; + } + } catch (err) { + logger.error('run_failed', { + error: err instanceof Error ? err.message : String(err), + stack: err instanceof Error ? 
err.stack : undefined, + }); + exitCode = 1; + } finally { + await pool.release(); + } + + if (abortController.signal.aborted && exitCode === 0) exitCode = 2; + process.exit(exitCode); +} + +main().catch((err) => { + console.error(err); + process.exit(1); +}); +``` + +- [ ] **Step 2: Run a minimal `--watch=none` smoke against local dev** + +Server running, 4 soak accounts already seeded from Task 14: + +```bash +cd tests/soak +TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run soak -- \ + --scenario=populate \ + --accounts=2 \ + --rooms=1 \ + --cpus-per-room=0 \ + --games-per-room=1 \ + --holes=1 \ + --watch=none +``` + +Expected output (abbreviated): + +``` +{"timestamp":"...","level":"info","msg":"run_start",...} +{"timestamp":"...","level":"info","msg":"sessions_acquired","count":2} +{"timestamp":"...","level":"info","msg":"game_start","room":"room-0","game":1} +{"timestamp":"...","level":"info","msg":"room_created","code":"XXXX"} +{"timestamp":"...","level":"info","msg":"game_complete","room":"room-0","turns":...} +{"timestamp":"...","level":"info","msg":"run_complete","gamesCompleted":1,"errors":0} +Games completed: 1 +Errors: 0 +Duration: X.Xs +``` + +Exit code 0. + +This is the first **end-to-end milestone**. Stop here if debugging is needed — fix issues before moving on. + +- [ ] **Step 3: Commit** + +```bash +git add tests/soak/runner.ts +git commit -m "$(cat <<'EOF' +feat(soak): runner.ts end-to-end with --watch=none + +First full end-to-end milestone: parses CLI, builds SessionPool + +RoomCoordinator, loads a scenario by name, runs it, reports results, +cleans up. Watch modes other than "none" log a warning and fall back +until Tasks 19-24 implement them. + +Co-Authored-By: Claude Opus 4.6 (1M context) +EOF +)" +``` + +--- + +## Phase 5 — Dashboard status grid + +### Task 19: Dashboard HTTP + WS server + +Vanilla node `http` + `ws`. Serves one static HTML page, accepts WS connections, broadcasts room-state updates. 
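
One way to keep late-joining dashboard clients in sync is to cache the merged state per room and replay it on connect. The following is a network-free sketch of that bookkeeping only; `RoomStateCache`, `RoomStateMessage`, and the trimmed `RoomState` shape are assumptions for illustration, not the harness's actual types.

```typescript
// Hypothetical, trimmed-down room state; the real RoomState lives in core/types.
interface RoomState {
  phase: string;
  moves: number;
  currentPlayer: string;
}

type RoomStateMessage = {
  type: 'room_state';
  roomId: string;
  state: Partial<RoomState>;
};

class RoomStateCache {
  private states = new Map<string, Partial<RoomState>>();

  // Merge a partial update into the cached state and return the delta to broadcast.
  apply(roomId: string, patch: Partial<RoomState>): RoomStateMessage {
    this.states.set(roomId, { ...this.states.get(roomId), ...patch });
    return { type: 'room_state', roomId, state: patch };
  }

  // Full merged state per room, for a client that connected mid-run.
  replay(): RoomStateMessage[] {
    return Array.from(this.states.entries()).map(([roomId, state]) => ({
      type: 'room_state' as const,
      roomId,
      state,
    }));
  }
}

const cache = new RoomStateCache();
cache.apply('room-0', { phase: 'lobby' });
cache.apply('room-0', { phase: 'playing', moves: 3 });
const replayed = cache.replay();
console.log(JSON.stringify(replayed));
```

Broadcasting only the delta keeps per-update messages small, while the replay carries the full merged state so a new client does not need the update history.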
+ +**Files:** +- Create: `tests/soak/dashboard/server.ts` + +- [ ] **Step 1: Implement `dashboard/server.ts`** + +```typescript +// tests/soak/dashboard/server.ts +import * as http from 'http'; +import * as fs from 'fs'; +import * as path from 'path'; +import { WebSocketServer, WebSocket } from 'ws'; +import type { DashboardReporter, Logger, RoomState } from '../core/types'; + +export type DashboardIncoming = + | { type: 'start_stream'; sessionKey: string } + | { type: 'stop_stream'; sessionKey: string }; + +export type DashboardOutgoing = + | { type: 'room_state'; roomId: string; state: Partial<RoomState> } + | { type: 'log'; level: string; msg: string; meta?: object; timestamp: number } + | { type: 'metric'; name: string; value: number } + | { type: 'frame'; sessionKey: string; jpegBase64: string }; + +export interface DashboardHandlers { + onStartStream?(sessionKey: string): void; + onStopStream?(sessionKey: string): void; + onDisconnect?(): void; +} + +export class DashboardServer { + private httpServer!: http.Server; + private wsServer!: WebSocketServer; + private clients = new Set<WebSocket>(); + private metrics: Record<string, number> = {}; + private roomStates: Record<string, Partial<RoomState>> = {}; + + constructor( + private port: number, + private logger: Logger, + private handlers: DashboardHandlers = {}, + ) {} + + async start(): Promise<void> { + const htmlPath = path.resolve(__dirname, 'index.html'); + const cssPath = path.resolve(__dirname, 'dashboard.css'); + const jsPath = path.resolve(__dirname, 'dashboard.js'); + + this.httpServer = http.createServer((req, res) => { + const url = req.url ?? 
'/'; + if (url === '/' || url === '/index.html') { + res.writeHead(200, { 'Content-Type': 'text/html; charset=utf-8' }); + fs.createReadStream(htmlPath).pipe(res); + } else if (url === '/dashboard.css') { + res.writeHead(200, { 'Content-Type': 'text/css' }); + fs.createReadStream(cssPath).pipe(res); + } else if (url === '/dashboard.js') { + res.writeHead(200, { 'Content-Type': 'application/javascript' }); + fs.createReadStream(jsPath).pipe(res); + } else { + res.writeHead(404); + res.end('not found'); + } + }); + + this.wsServer = new WebSocketServer({ server: this.httpServer }); + this.wsServer.on('connection', (ws) => { + this.clients.add(ws); + this.logger.info('dashboard_client_connected', { count: this.clients.size }); + + // Replay current state to the new client + for (const [roomId, state] of Object.entries(this.roomStates)) { + ws.send(JSON.stringify({ type: 'room_state', roomId, state } as DashboardOutgoing)); + } + for (const [name, value] of Object.entries(this.metrics)) { + ws.send(JSON.stringify({ type: 'metric', name, value } as DashboardOutgoing)); + } + + ws.on('message', (data) => { + try { + const parsed = JSON.parse(data.toString()) as DashboardIncoming; + if (parsed.type === 'start_stream' && this.handlers.onStartStream) { + this.handlers.onStartStream(parsed.sessionKey); + } else if (parsed.type === 'stop_stream' && this.handlers.onStopStream) { + this.handlers.onStopStream(parsed.sessionKey); + } + } catch (err) { + this.logger.warn('dashboard_ws_parse_error', { + error: err instanceof Error ? 
err.message : String(err), + }); + } + }); + + ws.on('close', () => { + this.clients.delete(ws); + this.logger.info('dashboard_client_disconnected', { count: this.clients.size }); + if (this.clients.size === 0 && this.handlers.onDisconnect) { + this.handlers.onDisconnect(); + } + }); + }); + + await new Promise((resolve) => { + this.httpServer.listen(this.port, () => resolve()); + }); + this.logger.info('dashboard_listening', { url: `http://localhost:${this.port}` }); + } + + async stop(): Promise { + for (const ws of this.clients) { + try { + ws.close(); + } catch { + // ignore + } + } + this.clients.clear(); + await new Promise((resolve) => { + this.wsServer.close(() => resolve()); + }); + await new Promise((resolve) => { + this.httpServer.close(() => resolve()); + }); + } + + broadcast(msg: DashboardOutgoing): void { + const payload = JSON.stringify(msg); + for (const ws of this.clients) { + if (ws.readyState === WebSocket.OPEN) { + ws.send(payload); + } + } + } + + /** Create a DashboardReporter wired to this server. */ + reporter(): DashboardReporter { + return { + update: (roomId, state) => { + this.roomStates[roomId] = { ...this.roomStates[roomId], ...state }; + this.broadcast({ type: 'room_state', roomId, state }); + }, + log: (level, msg, meta) => { + this.broadcast({ type: 'log', level, msg, meta, timestamp: Date.now() }); + }, + incrementMetric: (name, by = 1) => { + this.metrics[name] = (this.metrics[name] ?? 0) + by; + this.broadcast({ type: 'metric', name, value: this.metrics[name] }); + }, + }; + } +} +``` + +- [ ] **Step 2: Syntax-check** + +```bash +cd tests/soak +npx tsx -e "import('./dashboard/server').then(() => console.log('ok'))" +``` + +Expected: `ok`. + +- [ ] **Step 3: Commit** + +```bash +git add tests/soak/dashboard/server.ts +git commit -m "$(cat <<'EOF' +feat(soak): DashboardServer — vanilla http + ws + +Serves one static HTML page, accepts WS connections, broadcasts +room_state/log/metric messages to all clients. 
Exposes a +reporter() method that returns a DashboardReporter scenarios can +call without knowing about sockets. + +Co-Authored-By: Claude Opus 4.6 (1M context) +EOF +)" +``` + +--- + +### Task 20: Dashboard HTML/CSS/JS status grid + +Single static HTML page + stylesheet + client script. Renders the 2×2 room grid, subscribes to WS, updates tiles on each message. + +**Files:** +- Create: `tests/soak/dashboard/index.html` +- Create: `tests/soak/dashboard/dashboard.css` +- Create: `tests/soak/dashboard/dashboard.js` + +- [ ] **Step 1: Create `dashboard/index.html`** + +```html + + + + + +Golf Soak Dashboard + + + +
+

⛳ Golf Soak Dashboard

+
+ run — + 00:00:00 +
+
+ +
+
Games0
+
Moves0
+
Errors0
+
WSconnecting
+
+ +
+ +
+ +
+
Activity Log
+
    +
    + + + + + + + +``` + +- [ ] **Step 2: Create `dashboard/dashboard.css`** + +```css +:root { + --bg: #0a0e16; + --panel: #0e1420; + --border: #1a2230; + --text: #c8d4e4; + --accent: #7fbaff; + --good: #6fd08f; + --warn: #ffb84d; + --err: #ff5c6c; + --muted: #556577; +} + +* { box-sizing: border-box; } + +body { + margin: 0; + font-family: -apple-system, system-ui, 'SF Mono', Consolas, monospace; + background: var(--bg); + color: var(--text); +} + +.dash-header { + display: flex; + justify-content: space-between; + align-items: center; + padding: 12px 20px; + background: linear-gradient(135deg, #0f1823, #0a1018); + border-bottom: 1px solid var(--border); +} +.dash-header h1 { margin: 0; font-size: 16px; color: var(--accent); } +.dash-header .meta { font-size: 11px; color: var(--muted); } +.dash-header .meta span + span { margin-left: 12px; } + +.meta-bar { + display: flex; + gap: 24px; + padding: 10px 20px; + background: #0c131d; + border-bottom: 1px solid var(--border); + font-size: 12px; +} +.meta-bar .stat .label { color: var(--muted); margin-right: 6px; } +.meta-bar .stat span:last-child { color: #fff; font-weight: 600; } + +.rooms { + display: grid; + grid-template-columns: 1fr 1fr; + gap: 1px; + background: var(--border); +} +.room { + background: var(--panel); + padding: 14px 18px; + min-height: 180px; +} +.room-title { + display: flex; + justify-content: space-between; + align-items: center; + margin-bottom: 10px; +} +.room-title .name { font-size: 13px; color: var(--accent); font-weight: 600; } +.room-title .phase { + font-size: 10px; + padding: 2px 8px; + border-radius: 10px; + background: #1a3a2a; + color: var(--good); +} +.room-title .phase.lobby { background: #3a2a1a; color: var(--warn); } +.room-title .phase.err { background: #3a1a1a; color: var(--err); } + +.players { + display: grid; + grid-template-columns: repeat(2, 1fr); + gap: 4px; + font-size: 11px; + margin-bottom: 8px; +} +.player { + display: flex; + justify-content: space-between; + 
padding: 4px 8px; + background: #0a0f18; + border-radius: 3px; + cursor: pointer; + border: 1px solid transparent; +} +.player:hover { border-color: var(--accent); } +.player.active { + background: #1a2a40; + border-left: 2px solid var(--accent); +} +.player .score { color: var(--muted); } + +.progress-bar { + height: 4px; + background: var(--border); + border-radius: 2px; + overflow: hidden; + margin-top: 6px; +} +.progress-fill { + height: 100%; + background: linear-gradient(90deg, var(--accent), var(--good)); + transition: width 0.3s; +} +.room-meta { + font-size: 10px; + color: var(--muted); + display: flex; + gap: 12px; + margin-top: 6px; +} + +.log { + border-top: 1px solid var(--border); + background: #080c13; + max-height: 160px; + overflow-y: auto; +} +.log .log-header { + padding: 6px 20px; + font-size: 10px; + text-transform: uppercase; + color: var(--muted); + border-bottom: 1px solid var(--border); +} +.log ul { list-style: none; margin: 0; padding: 4px 20px; font-size: 10px; } +.log li { line-height: 1.5; font-family: monospace; color: var(--muted); } +.log li.warn { color: var(--warn); } +.log li.error { color: var(--err); } + +.video-modal { + position: fixed; + inset: 0; + background: rgba(0, 0, 0, 0.85); + display: flex; + align-items: center; + justify-content: center; + z-index: 100; +} +.video-modal.hidden { display: none; } +.video-modal-content { + background: var(--panel); + border: 1px solid var(--border); + border-radius: 6px; + padding: 16px; + max-width: 90vw; + max-height: 90vh; +} +.video-modal-header { + display: flex; + justify-content: space-between; + align-items: center; + margin-bottom: 12px; + color: var(--accent); + font-size: 13px; +} +.video-modal-header button { + background: var(--border); + color: var(--text); + border: none; + padding: 4px 12px; + border-radius: 3px; + cursor: pointer; +} +#video-frame { + display: block; + max-width: 100%; + max-height: 70vh; + border: 1px solid var(--border); +} +``` + +- [ ] **Step 3: 
Create `dashboard/dashboard.js`** + +```javascript +// tests/soak/dashboard/dashboard.js +(() => { + const ws = new WebSocket(`ws://${location.host}`); + const roomsEl = document.getElementById('rooms'); + const logEl = document.getElementById('log-list'); + const wsStatusEl = document.getElementById('ws-status'); + const metricGames = document.getElementById('metric-games'); + const metricMoves = document.getElementById('metric-moves'); + const metricErrors = document.getElementById('metric-errors'); + const elapsedEl = document.getElementById('elapsed'); + + const roomTiles = new Map(); + const startTime = Date.now(); + let currentWatchedKey = null; + + // Video modal + const videoModal = document.getElementById('video-modal'); + const videoFrame = document.getElementById('video-frame'); + const videoTitle = document.getElementById('video-modal-title'); + const videoClose = document.getElementById('video-modal-close'); + + function fmtElapsed(ms) { + const s = Math.floor(ms / 1000); + const h = Math.floor(s / 3600); + const m = Math.floor((s % 3600) / 60); + const sec = s % 60; + return `${String(h).padStart(2, '0')}:${String(m).padStart(2, '0')}:${String(sec).padStart(2, '0')}`; + } + setInterval(() => { + elapsedEl.textContent = fmtElapsed(Date.now() - startTime); + }, 1000); + + function ensureRoomTile(roomId) { + if (roomTiles.has(roomId)) return roomTiles.get(roomId); + const tile = document.createElement('div'); + tile.className = 'room'; + tile.innerHTML = ` +
    +
    ${roomId}
    +
    waiting
    +
    +
    +
    +
    + 0 moves + game — +
    + `; + roomsEl.appendChild(tile); + roomTiles.set(roomId, tile); + return tile; + } + + function renderRoomState(roomId, state) { + const tile = ensureRoomTile(roomId); + if (state.phase !== undefined) { + const phaseEl = tile.querySelector('.phase'); + phaseEl.textContent = state.phase; + phaseEl.classList.toggle('lobby', state.phase === 'lobby' || state.phase === 'waiting'); + phaseEl.classList.toggle('err', state.phase === 'error'); + } + if (state.players !== undefined) { + const playersEl = tile.querySelector('.players'); + playersEl.innerHTML = state.players + .map( + (p) => ` +
    + ${p.isActive ? '▶ ' : ''}${p.key} + ${p.score ?? '—'} +
    + `, + ) + .join(''); + } + if (state.hole !== undefined && state.totalHoles !== undefined) { + const fill = tile.querySelector('.progress-fill'); + const pct = state.totalHoles > 0 ? Math.round((state.hole / state.totalHoles) * 100) : 0; + fill.style.width = `${pct}%`; + } + if (state.moves !== undefined) { + tile.querySelector('.moves').textContent = `${state.moves} moves`; + } + if (state.game !== undefined && state.totalGames !== undefined) { + tile.querySelector('.game').textContent = `game ${state.game}/${state.totalGames}`; + } + } + + function appendLog(level, msg, meta) { + const li = document.createElement('li'); + li.className = level; + const ts = new Date().toLocaleTimeString(); + li.textContent = `[${ts}] ${msg} ${meta ? JSON.stringify(meta) : ''}`; + logEl.insertBefore(li, logEl.firstChild); + // Cap log length + while (logEl.children.length > 100) { + logEl.removeChild(logEl.lastChild); + } + } + + function applyMetric(name, value) { + if (name === 'games_completed') metricGames.textContent = value; + else if (name === 'moves_total') metricMoves.textContent = value; + else if (name === 'errors') metricErrors.textContent = value; + } + + ws.addEventListener('open', () => { + wsStatusEl.textContent = 'healthy'; + wsStatusEl.style.color = 'var(--good)'; + }); + ws.addEventListener('close', () => { + wsStatusEl.textContent = 'disconnected'; + wsStatusEl.style.color = 'var(--err)'; + }); + ws.addEventListener('message', (event) => { + let msg; + try { + msg = JSON.parse(event.data); + } catch { + return; + } + if (msg.type === 'room_state') { + renderRoomState(msg.roomId, msg.state); + } else if (msg.type === 'log') { + appendLog(msg.level, msg.msg, msg.meta); + } else if (msg.type === 'metric') { + applyMetric(msg.name, msg.value); + } else if (msg.type === 'frame') { + if (msg.sessionKey === currentWatchedKey) { + videoFrame.src = `data:image/jpeg;base64,${msg.jpegBase64}`; + } + } + }); + + // Click-to-watch (wired in Task 23) + 
roomsEl.addEventListener('click', (e) => { + const playerEl = e.target.closest('.player'); + if (!playerEl) return; + const key = playerEl.dataset.session; + if (!key) return; + currentWatchedKey = key; + videoTitle.textContent = `Watching ${key}`; + videoModal.classList.remove('hidden'); + ws.send(JSON.stringify({ type: 'start_stream', sessionKey: key })); + }); + + function closeVideo() { + if (currentWatchedKey) { + ws.send(JSON.stringify({ type: 'stop_stream', sessionKey: currentWatchedKey })); + } + currentWatchedKey = null; + videoModal.classList.add('hidden'); + videoFrame.src = ''; + } + videoClose.addEventListener('click', closeVideo); + document.addEventListener('keydown', (e) => { + if (e.key === 'Escape') closeVideo(); + }); +})(); +``` + +- [ ] **Step 4: Commit** + +```bash +git add tests/soak/dashboard/index.html tests/soak/dashboard/dashboard.css tests/soak/dashboard/dashboard.js +git commit -m "$(cat <<'EOF' +feat(soak): dashboard status grid UI + +Static HTML page served by DashboardServer. Renders the 2×2 room +grid with progress bars and player tiles, subscribes to WS events, +updates tiles live. Click-to-watch modal is wired but receives +frames once the CDP screencaster ships in Task 22. + +Co-Authored-By: Claude Opus 4.6 (1M context) +EOF +)" +``` + +--- + +### Task 21: Wire `WATCH=dashboard` in runner + +Start the dashboard server when `--watch=dashboard`, auto-open the URL in the user's browser, use its `reporter()` as the `ctx.dashboard`. 
+ +**Files:** +- Modify: `tests/soak/runner.ts` + +- [ ] **Step 1: Import and instantiate DashboardServer in `runner.ts`** + +At the top of `runner.ts`, add: + +```typescript +import { DashboardServer } from './dashboard/server'; +import { spawn } from 'child_process'; +``` + +Replace the block that creates `dashboard` with: + +```typescript + // Build dashboard if requested + let dashboardServer: DashboardServer | null = null; + let dashboard: DashboardReporter = noopDashboard(); + if (watch === 'dashboard') { + const port = Number(config.dashboardPort ?? 7777); + dashboardServer = new DashboardServer(port, logger, { + onStartStream: (key) => { + logger.info('stream_start_requested', { sessionKey: key }); + // Wired in Task 23 + }, + onStopStream: (key) => { + logger.info('stream_stop_requested', { sessionKey: key }); + }, + }); + await dashboardServer.start(); + dashboard = dashboardServer.reporter(); + const url = `http://localhost:${port}`; + console.log(`Dashboard: ${url}`); + // Best-effort auto-open. `start` is a cmd.exe builtin, so Windows goes through `cmd /c`. + const isWin = process.platform === 'win32'; + const opener = process.platform === 'darwin' ? 'open' : isWin ? 'cmd' : 'xdg-open'; + const openerArgs = isWin ? ['/c', 'start', '', url] : [url]; + const child = spawn(opener, openerArgs, { stdio: 'ignore', detached: true }); + // A missing opener surfaces as an async 'error' event, not a throw; the URL is already printed + child.on('error', () => {}); + child.unref(); + } else if (watch === 'tiled') { + logger.warn('tiled_not_yet_implemented'); + console.warn('Watch mode "tiled" not yet implemented (Task 24). Falling back to none.'); + } +``` + +And in the `finally` block, shut down the server: + +```typescript + } finally { + await pool.release(); + if (dashboardServer) { + await dashboardServer.stop(); + } + } +``` + +Also remove the earlier `if (watch !== 'none')` warning block; the dispatch above replaces it. 
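The reporter contract that the runner swaps between the no-op and the dashboard-backed version can be illustrated with a small in-memory variant. This is a sketch only: the `RoomState` and `DashboardReporter` shapes below are simplified assumptions standing in for the real definitions in `core/types`, and `memoryReporter` is a hypothetical helper that is not part of this plan.

```typescript
// Simplified stand-ins for the real types in tests/soak/core/types (assumed shapes).
interface RoomState {
  phase: string;
  moves: number;
}

interface DashboardReporter {
  update(roomId: string, state: Partial<RoomState>): void;
  log(level: string, msg: string, meta?: object): void;
  incrementMetric(name: string, by?: number): void;
}

// Hypothetical in-memory reporter: same merge semantics as DashboardServer.reporter(),
// but it records state locally instead of broadcasting over a WebSocket.
function memoryReporter() {
  const rooms: Record<string, Partial<RoomState>> = {};
  const metrics: Record<string, number> = {};
  const logs: string[] = [];
  const reporter: DashboardReporter = {
    update: (roomId, state) => {
      // Partial updates are merged over whatever is already known for the room
      rooms[roomId] = { ...rooms[roomId], ...state };
    },
    log: (level, msg) => {
      logs.push(`${level}: ${msg}`);
    },
    incrementMetric: (name, by = 1) => {
      metrics[name] = (metrics[name] ?? 0) + by;
    },
  };
  return { reporter, rooms, metrics, logs };
}

const demo = memoryReporter();
demo.reporter.update('room-0', { phase: 'playing' });
demo.reporter.update('room-0', { moves: 3 });
demo.reporter.incrementMetric('moves_total');
demo.reporter.incrementMetric('moves_total', 2);
```

Because scenarios only ever see the `DashboardReporter` interface, the same scenario code runs unchanged under `--watch=none` and `--watch=dashboard`.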
+ +- [ ] **Step 2: Run smoke against dev with dashboard** + +```bash +TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run soak -- \ + --scenario=populate \ + --accounts=2 --rooms=1 --cpus-per-room=0 --games-per-room=1 --holes=1 \ + --watch=dashboard +``` + +Expected: +- `Dashboard: http://localhost:7777` printed +- Browser auto-opens (or you open it manually) +- Page shows the dashboard with `WS: healthy` +- During the game, the `room-0` tile shows `phase: playing`, increments `moves`, updates progress +- After game completes, the runner exits 0 and the dashboard stops + +- [ ] **Step 3: Commit** + +```bash +git add tests/soak/runner.ts +git commit -m "$(cat <<'EOF' +feat(soak): wire --watch=dashboard in runner + +Starts DashboardServer on 7777 (configurable), uses its reporter as +ctx.dashboard, auto-opens the URL. Cleans up on exit. + +Co-Authored-By: Claude Opus 4.6 (1M context) +EOF +)" +``` + +--- + +## Phase 6 — Live video click-to-watch + +### Task 22: CDP screencast module + +Attach a CDP session to a given page, start screencasting JPEG frames at a fixed rate, forward each frame to a callback, detach on stop. + +**Files:** +- Create: `tests/soak/core/screencaster.ts` + +- [ ] **Step 1: Implement `core/screencaster.ts`** + +```typescript +// tests/soak/core/screencaster.ts +import type { Page, CDPSession } from 'playwright-core'; +import type { Logger } from './types'; + +export interface ScreencastOptions { + format?: 'jpeg' | 'png'; + quality?: number; + maxWidth?: number; + maxHeight?: number; + everyNthFrame?: number; +} + +export type FrameCallback = (jpegBase64: string) => void; + +export class Screencaster { + private sessions = new Map(); + + constructor(private logger: Logger) {} + + /** + * Attach a CDP session to the given page and start forwarding frames. + * If already streaming, this is a no-op. 
+ */ + async start( + sessionKey: string, + page: Page, + onFrame: FrameCallback, + opts: ScreencastOptions = {}, + ): Promise { + if (this.sessions.has(sessionKey)) { + this.logger.warn('screencast_already_running', { sessionKey }); + return; + } + const client = await page.context().newCDPSession(page); + this.sessions.set(sessionKey, client); + + client.on('Page.screencastFrame', async (evt: { data: string; sessionId: number }) => { + try { + onFrame(evt.data); + await client.send('Page.screencastFrameAck', { sessionId: evt.sessionId }); + } catch (err) { + this.logger.warn('screencast_frame_error', { + sessionKey, + error: err instanceof Error ? err.message : String(err), + }); + } + }); + + await client.send('Page.startScreencast', { + format: opts.format ?? 'jpeg', + quality: opts.quality ?? 60, + maxWidth: opts.maxWidth ?? 640, + maxHeight: opts.maxHeight ?? 360, + everyNthFrame: opts.everyNthFrame ?? 2, + }); + this.logger.info('screencast_started', { sessionKey }); + } + + async stop(sessionKey: string): Promise { + const client = this.sessions.get(sessionKey); + if (!client) return; + try { + await client.send('Page.stopScreencast'); + await client.detach(); + } catch (err) { + this.logger.warn('screencast_stop_error', { + sessionKey, + error: err instanceof Error ? err.message : String(err), + }); + } + this.sessions.delete(sessionKey); + this.logger.info('screencast_stopped', { sessionKey }); + } + + async stopAll(): Promise { + const keys = Array.from(this.sessions.keys()); + await Promise.all(keys.map((k) => this.stop(k))); + } +} +``` + +- [ ] **Step 2: Syntax-check** + +```bash +cd tests/soak +npx tsx -e "import('./core/screencaster').then(() => console.log('ok'))" +``` + +Expected: `ok`. 
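`everyNthFrame` thins frames relative to what the browser produces, so the delivered rate can vary with page activity. If a time-based cap turns out to be useful, one possible approach is to wrap the frame callback before passing it to `Screencaster.start`. The sketch below is an assumption-laden illustration: `throttleFrames` is a hypothetical helper, not part of this plan, and the clock is injectable only so the behaviour can be checked without real timers.

```typescript
type FrameCallback = (jpegBase64: string) => void;

// Hypothetical wrapper: forward a frame at most once per `minIntervalMs`, drop the rest.
function throttleFrames(
  onFrame: FrameCallback,
  minIntervalMs: number,
  now: () => number = Date.now,
): FrameCallback {
  let lastSent = -Infinity;
  return (jpegBase64) => {
    const t = now();
    if (t - lastSent < minIntervalMs) return; // too soon, drop this frame
    lastSent = t;
    onFrame(jpegBase64);
  };
}

// Fake clock: frames arrive at 0, 50 and 100 ms, capped to one per 100 ms.
let clock = 0;
const delivered: string[] = [];
const throttled = throttleFrames((f) => delivered.push(f), 100, () => clock);
clock = 0;
throttled('a');
clock = 50;
throttled('b');
clock = 100;
throttled('c');
```

Dropping a frame inside the callback does not interfere with the acknowledgement, since `Screencaster` sends `Page.screencastFrameAck` after the callback returns either way.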
+ +- [ ] **Step 3: Commit** + +```bash +git add tests/soak/core/screencaster.ts +git commit -m "$(cat <<'EOF' +feat(soak): Screencaster — CDP Page.startScreencast wrapper + +Attach/detach CDP sessions per Playwright Page, start/stop JPEG +screencasts with configurable quality and frame rate, forward each +frame to a callback. Used by the dashboard for click-to-watch +live video. + +Co-Authored-By: Claude Opus 4.6 (1M context) +EOF +)" +``` + +--- + +### Task 23: Wire screencaster to dashboard click-to-watch + +Runner creates a `Screencaster`, passes callbacks into `DashboardServer.onStartStream/onStopStream` that look up the right session and start/stop streaming. Each frame is broadcast to the dashboard. + +**Files:** +- Modify: `tests/soak/runner.ts` + +- [ ] **Step 1: Import Screencaster and hold a sessions map** + +In `runner.ts`, add at the top: + +```typescript +import { Screencaster } from './core/screencaster'; +``` + +After `const sessions = await pool.acquire(accounts);`, build a lookup map: + +```typescript + const sessionsByKey = new Map(); + for (const s of sessions) sessionsByKey.set(s.key, s); +``` + +Create the screencaster before the dashboard (or right after sessions are acquired): + +```typescript + const screencaster = new Screencaster(logger); +``` + +- [ ] **Step 2: Replace the `onStartStream`/`onStopStream` no-ops with real wiring** + +Update the `DashboardServer` construction (earlier in the function) to accept handlers that close over `screencaster` and `sessionsByKey`. But since those are built after the dashboard, we need to build the dashboard AFTER sessions are acquired. Reorganize: + +Move the dashboard construction to AFTER `sessions = await pool.acquire(accounts)`. Then: + +```typescript + if (watch === 'dashboard') { + const port = Number(config.dashboardPort ?? 
7777); + dashboardServer = new DashboardServer(port, logger, { + onStartStream: (key) => { + const session = sessionsByKey.get(key); + if (!session) { + logger.warn('stream_start_unknown_session', { sessionKey: key }); + return; + } + screencaster + .start(key, session.page, (jpegBase64) => { + dashboardServer!.broadcast({ type: 'frame', sessionKey: key, jpegBase64 }); + }) + .catch((err) => + logger.error('screencast_start_failed', { + key, + error: err instanceof Error ? err.message : String(err), + }), + ); + }, + onStopStream: (key) => { + screencaster.stop(key).catch(() => {}); + }, + onDisconnect: () => { + screencaster.stopAll().catch(() => {}); + }, + }); + await dashboardServer.start(); + dashboard = dashboardServer.reporter(); + const url = `http://localhost:${port}`; + console.log(`Dashboard: ${url}`); + // ... auto-open + } +``` + +Make sure the `ctx.dashboard` assignment happens AFTER the dashboard setup (it already does — `const ctx = { ... dashboard, ... }` comes later). + +In the `finally` block, add: + +```typescript + await screencaster.stopAll(); +``` + +- [ ] **Step 3: Manual test end-to-end** + +Run a longer populate game so there's time to click: + +```bash +TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run soak -- \ + --scenario=populate \ + --accounts=4 --rooms=1 --cpus-per-room=0 --games-per-room=2 --holes=3 \ + --watch=dashboard +``` + +Expected: +1. Dashboard opens, shows 1 room with 4 players +2. Click on any player tile (`soak_00`, `soak_01`, ...) +3. Modal opens, shows live JPEG frames of that player's view of the game +4. Close modal (Esc or Close button) — frames stop, screencast detaches +5. Run completes cleanly + +- [ ] **Step 4: Commit** + +```bash +git add tests/soak/runner.ts +git commit -m "$(cat <<'EOF' +feat(soak): click-to-watch live video via CDP screencast + +Runner creates a Screencaster and wires its start/stop into +DashboardServer.onStartStream/onStopStream. 
Clicking a player tile +in the dashboard starts a CDP screencast on that session's page, +forwards JPEG frames as WS "frame" messages, closes on modal +dismiss or WS disconnect. + +Co-Authored-By: Claude Opus 4.6 (1M context) +EOF +)" +``` + +--- + +## Phase 7 — Tiled mode + +### Task 24: `--watch=tiled` native windows + +Launch a second headed browser for the 4 host contexts, position their windows in a 2×2 grid using `page.evaluate(window.moveTo)`. + +**Files:** +- Modify: `tests/soak/core/session-pool.ts` — add optional headed-host support +- Modify: `tests/soak/runner.ts` — enable tiled mode + +- [ ] **Step 1: Extend `SessionPool` to support headed host contexts** + +Add a new option and method to `SessionPool`. In `core/session-pool.ts`: + +```typescript +export interface SessionPoolOptions { + targetUrl: string; + inviteCode: string; + credFile: string; + logger: Logger; + browser?: Browser; + contextOptions?: Parameters[0]; + /** If set, the first `headedHostCount` sessions use a separate headed browser. */ + headedHostCount?: number; +} +``` + +Inside the class, add a `headedBrowser` field and extend `acquire`: + +```typescript + private headedBrowser: Browser | null = null; + + // ... in acquire(), before the loop: + + if ((this.opts.headedHostCount ?? 0) > 0 && !this.headedBrowser) { + this.headedBrowser = await chromium.launch({ + headless: false, + slowMo: 50, + }); + } + + for (let i = 0; i < count; i++) { + const account = this.accounts[i]; + const useHeaded = i < (this.opts.headedHostCount ?? 0); + const targetBrowser = useHeaded ? this.headedBrowser! : this.browser!; + const context = await targetBrowser.newContext({ + ...this.opts.contextOptions, + ...(useHeaded ? 
{ viewport: { width: 960, height: 540 } } : {}), + }); + await this.injectAuth(context, account); + const page = await context.newPage(); + await page.goto(this.opts.targetUrl); + + // Position headed windows in a 2×2 grid + if (useHeaded) { + const col = i % 2; + const row = Math.floor(i / 2); + const x = col * 960; + const y = row * 560; + await page.evaluate( + ([x, y, w, h]) => { + window.moveTo(x, y); + window.resizeTo(w, h); + }, + [x, y, 960, 540] as [number, number, number, number], + ); + } + + const bot = new GolfBot(page); + sessions.push({ account, context, page, bot, key: account.key }); + } +``` + +Update `release` to close the headed browser too: + +```typescript + async release(): Promise { + for (const session of this.activeSessions) { + try { await session.context.close(); } catch { /* ignore */ } + } + this.activeSessions = []; + if (this.ownedBrowser) { + try { await this.ownedBrowser.close(); } catch { /* ignore */ } + this.ownedBrowser = null; + this.browser = null; + } + if (this.headedBrowser) { + try { await this.headedBrowser.close(); } catch { /* ignore */ } + this.headedBrowser = null; + } + } +``` + +- [ ] **Step 2: Wire `watch === 'tiled'` in the runner** + +In `runner.ts`, replace the existing `tiled_not_yet_implemented` warning with: + +```typescript + const headedHostCount = watch === 'tiled' ? rooms : 0; + + const pool = new SessionPool({ + targetUrl, + inviteCode, + credFile, + logger, + headedHostCount, + }); +``` + +(Move that `pool` creation up so it's aware of `watch`.) + +- [ ] **Step 3: Test tiled mode** + +```bash +TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run soak -- \ + --scenario=populate \ + --accounts=4 --rooms=2 --cpus-per-room=0 --games-per-room=1 --holes=1 \ + --watch=tiled +``` + +Expected: 2 native Chromium windows appear (one per host), sized ~960×540 and positioned at the upper-left of the screen. They play the game visibly. On exit, windows close. 
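The grid arithmetic from Step 1 can be pulled out into a pure function, which makes the window layout easy to reason about on its own. This is a minimal sketch, assuming the same 960×540 window size and 560 px row pitch used above; `tileRect` and `TileRect` are hypothetical names and are not part of `SessionPool`.

```typescript
interface TileRect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Row-major grid placement; the defaults mirror the values used in SessionPool above.
function tileRect(index: number, cols = 2, width = 960, height = 540, rowPitch = 560): TileRect {
  const col = index % cols;
  const row = Math.floor(index / cols);
  return { x: col * width, y: row * rowPitch, width, height };
}

// Four hosts fill the 2×2 grid left to right, top to bottom.
const tiles = [0, 1, 2, 3].map((i) => tileRect(i));
```

With `--rooms=2`, as in the Step 3 test run, only indices 0 and 1 are used, so both windows sit side by side on the top row.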
+ +- [ ] **Step 4: Commit** + +```bash +git add tests/soak/core/session-pool.ts tests/soak/runner.ts +git commit -m "$(cat <<'EOF' +feat(soak): --watch=tiled launches N headed host windows + +SessionPool accepts headedHostCount; when > 0 it launches a second +Chromium in headed mode, creates those contexts there, and positions +each host window in a 2×2 grid via window.moveTo/resizeTo. + +Co-Authored-By: Claude Opus 4.6 (1M context) +EOF +)" +``` + +--- + +## Phase 8 — Stress scenario + +### Task 25: Chaos injector + stress scenario + +Short 1-hole games in tight loops. While each game runs, a background loop polls every session in the room every 500 ms and gives it a 5% chance (`chaosChance`) of a chaos event: rapid clicks, a simulated tab blur/focus, or a brief offline toggle. + +**Files:** +- Create: `tests/soak/scenarios/stress.ts` +- Create: `tests/soak/scenarios/shared/chaos.ts` +- Modify: `tests/soak/scenarios/index.ts` — register `stress` + +- [ ] **Step 1: Create `scenarios/shared/chaos.ts`** + +```typescript +// tests/soak/scenarios/shared/chaos.ts +import type { Session, Logger } from '../../core/types'; + +export type ChaosEvent = + | 'rapid_clicks' + | 'tab_blur' + | 'brief_offline'; + +const ALL_EVENTS: ChaosEvent[] = ['rapid_clicks', 'tab_blur', 'brief_offline']; + +function pickEvent(): ChaosEvent { + return ALL_EVENTS[Math.floor(Math.random() * ALL_EVENTS.length)]; +} + +export async function maybeInjectChaos( + session: Session, + probability: number, + logger: Logger, + roomId: string, +): Promise<ChaosEvent | null> { + if (Math.random() >= probability) return null; + + const event = pickEvent(); + logger.info('chaos_injected', { room: roomId, session: session.key, event }); + try { + switch (event) { + case 'rapid_clicks': { + // Fire 5 rapid clicks at the player's own cards + for (let i = 0; i < 5; i++) { + await session.page.locator(`#player-cards .card:nth-child(${(i % 6) + 1})`) + .click({ timeout: 300 }) + .catch(() => {}); + } + break; + } + case 'tab_blur': { + // Briefly dispatch blur then focus + await session.page.evaluate(() => { + window.dispatchEvent(new 
Event('blur')); + setTimeout(() => window.dispatchEvent(new Event('focus')), 200); + }); + break; + } + case 'brief_offline': { + await session.context.setOffline(true); + await new Promise((r) => setTimeout(r, 300)); + await session.context.setOffline(false); + break; + } + } + } catch (err) { + logger.warn('chaos_error', { + event, + error: err instanceof Error ? err.message : String(err), + }); + } + return event; +} +``` + +- [ ] **Step 2: Create `scenarios/stress.ts`** + +```typescript +// tests/soak/scenarios/stress.ts +import type { + Scenario, + ScenarioContext, + ScenarioResult, + ScenarioError, + Session, +} from '../core/types'; +import { runOneMultiplayerGame } from './shared/multiplayer-game'; +import { maybeInjectChaos } from './shared/chaos'; + +interface StressConfig { + gamesPerRoom: number; + holes: number; + decks: number; + rooms: number; + cpusPerRoom: number; + thinkTimeMs: [number, number]; + interGamePauseMs: number; + chaosChance: number; +} + +function chunk(arr: T[], size: number): T[][] { + const out: T[][] = []; + for (let i = 0; i < arr.length; i += size) out.push(arr.slice(i, i + size)); + return out; +} + +async function sleep(ms: number): Promise { + return new Promise((r) => setTimeout(r, ms)); +} + +async function runStressRoom( + ctx: ScenarioContext, + cfg: StressConfig, + roomIdx: number, + sessions: Session[], +): Promise<{ completed: number; errors: ScenarioError[]; chaosFired: number }> { + const roomId = `room-${roomIdx}`; + let completed = 0; + let chaosFired = 0; + const errors: ScenarioError[] = []; + + for (let gameNum = 0; gameNum < cfg.gamesPerRoom; gameNum++) { + if (ctx.signal.aborted) break; + + ctx.dashboard.update(roomId, { game: gameNum + 1, totalGames: cfg.gamesPerRoom }); + + // Start a background chaos loop for this game + let chaosActive = true; + const chaosLoop = (async () => { + while (chaosActive && !ctx.signal.aborted) { + await sleep(500); + for (const session of sessions) { + const e = await 
maybeInjectChaos(session, cfg.chaosChance, ctx.logger, roomId); + if (e) chaosFired++; + } + } + })(); + + const result = await runOneMultiplayerGame(ctx, sessions, { + roomId, + holes: cfg.holes, + decks: cfg.decks, + cpusPerRoom: cfg.cpusPerRoom, + thinkTimeMs: cfg.thinkTimeMs, + }); + + chaosActive = false; + await chaosLoop; + + if (result.completed) { + completed++; + ctx.logger.info('game_complete', { room: roomId, game: gameNum + 1, turns: result.turns }); + } else { + errors.push({ + room: roomId, + reason: 'game_failed', + detail: result.error, + timestamp: Date.now(), + }); + ctx.logger.error('game_failed', { room: roomId, error: result.error }); + } + + await sleep(cfg.interGamePauseMs); + } + + return { completed, errors, chaosFired }; +} + +const stress: Scenario = { + name: 'stress', + description: 'Rapid short games for stability & race condition hunting', + needs: { accounts: 16, rooms: 4, cpusPerRoom: 2 }, + defaultConfig: { + gamesPerRoom: 50, + holes: 1, + decks: 1, + rooms: 4, + cpusPerRoom: 2, + thinkTimeMs: [50, 150], + interGamePauseMs: 200, + chaosChance: 0.05, + }, + + async run(ctx: ScenarioContext): Promise { + const start = Date.now(); + const cfg = ctx.config as unknown as StressConfig; + const perRoom = Math.floor(ctx.sessions.length / cfg.rooms); + const roomSessions = chunk(ctx.sessions, perRoom); + + const results = await Promise.allSettled( + roomSessions.map((s, idx) => runStressRoom(ctx, cfg, idx, s)), + ); + + let gamesCompleted = 0; + let chaosFired = 0; + const errors: ScenarioError[] = []; + results.forEach((r, idx) => { + if (r.status === 'fulfilled') { + gamesCompleted += r.value.completed; + chaosFired += r.value.chaosFired; + errors.push(...r.value.errors); + } else { + errors.push({ + room: `room-${idx}`, + reason: 'room_threw', + detail: r.reason instanceof Error ? 
r.reason.message : String(r.reason), + timestamp: Date.now(), + }); + } + }); + + return { + gamesCompleted, + errors, + durationMs: Date.now() - start, + customMetrics: { chaos_fired: chaosFired }, + }; + }, +}; + +export default stress; +``` + +- [ ] **Step 3: Register stress in the registry** + +Edit `tests/soak/scenarios/index.ts`: + +```typescript +import type { Scenario } from '../core/types'; +import populate from './populate'; +import stress from './stress'; + +const registry: Record<string, Scenario> = { + populate, + stress, +}; + +export function getScenario(name: string): Scenario | undefined { + return registry[name]; +} + +export function listScenarios(): Scenario[] { + return Object.values(registry); +} +``` + +- [ ] **Step 4: Smoke test stress scenario** + +```bash +TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run soak -- \ + --scenario=stress \ + --accounts=4 --rooms=1 --cpus-per-room=1 --games-per-room=3 --holes=1 \ + --watch=none +``` + +Expected: 3 quick games complete and the runner exits 0. Chaos is probabilistic, so `chaos_injected` log lines are likely but not guaranteed in a run this short. + +- [ ] **Step 5: Commit** + +```bash +git add tests/soak/scenarios/stress.ts tests/soak/scenarios/shared/chaos.ts tests/soak/scenarios/index.ts +git commit -m "$(cat <<'EOF' +feat(soak): stress scenario with chaos injection + +Rapid 1-hole games with a parallel chaos loop that gives each session +a 5% chance per 500 ms tick of firing rapid_clicks, tab_blur, or +brief_offline events. Chaos counts roll up into +ScenarioResult.customMetrics. + +Co-Authored-By: Claude Opus 4.6 (1M context) +EOF +)" +``` + +--- + +## Phase 9 — Failure handling + +### Task 26: Watchdog + heartbeat wiring + +Per-room timeout that fires if no heartbeat arrives within N ms. Runner wires it into `ctx.heartbeat`. Vitest-tested. 
+
+**Files:**
+- Create: `tests/soak/core/watchdog.ts`
+- Create: `tests/soak/tests/watchdog.test.ts`
+- Modify: `tests/soak/runner.ts` — wire `heartbeat` to per-room watchdogs
+
+- [ ] **Step 1: Write failing tests**
+
+```typescript
+// tests/soak/tests/watchdog.test.ts
+import { describe, it, expect, vi, beforeEach, afterEach } from 'vitest';
+import { Watchdog } from '../core/watchdog';
+
+describe('Watchdog', () => {
+  beforeEach(() => vi.useFakeTimers());
+  afterEach(() => vi.useRealTimers());
+
+  it('fires after timeout if no heartbeat', () => {
+    const onTimeout = vi.fn();
+    const w = new Watchdog(1000, onTimeout);
+    w.start();
+    vi.advanceTimersByTime(1001);
+    expect(onTimeout).toHaveBeenCalledOnce();
+  });
+
+  it('heartbeat resets the timer', () => {
+    const onTimeout = vi.fn();
+    const w = new Watchdog(1000, onTimeout);
+    w.start();
+    vi.advanceTimersByTime(800);
+    w.heartbeat();
+    vi.advanceTimersByTime(800);
+    expect(onTimeout).not.toHaveBeenCalled();
+    vi.advanceTimersByTime(300);
+    expect(onTimeout).toHaveBeenCalledOnce();
+  });
+
+  it('stop cancels pending timeout', () => {
+    const onTimeout = vi.fn();
+    const w = new Watchdog(1000, onTimeout);
+    w.start();
+    w.stop();
+    vi.advanceTimersByTime(2000);
+    expect(onTimeout).not.toHaveBeenCalled();
+  });
+
+  it('does not fire again once it has fired', () => {
+    const onTimeout = vi.fn();
+    const w = new Watchdog(1000, onTimeout);
+    w.start();
+    vi.advanceTimersByTime(1001);
+    // A late heartbeat must not re-arm a watchdog that already fired.
+    w.heartbeat();
+    vi.advanceTimersByTime(1001);
+    expect(onTimeout).toHaveBeenCalledOnce();
+  });
+});
+```
+
+- [ ] **Step 2: Run to verify failure**
+
+```bash
+npx vitest run tests/watchdog.test.ts
+```
+
+Expected: FAIL, because `../core/watchdog` does not exist yet.
+
+- [ ] **Step 3: Implement `core/watchdog.ts`**
+
+```typescript
+// tests/soak/core/watchdog.ts
+export class Watchdog {
+  private timer: NodeJS.Timeout | null = null;
+  private fired = false;
+
+  constructor(
+    private timeoutMs: number,
+    private onTimeout: () => void,
+  ) {}
+
+  /** Arm (or re-arm) the timer and clear any previous fired state. */
+  start(): void {
+    this.stop();
+    this.fired = false;
+    this.timer = setTimeout(() => {
+      if (this.fired) return;
+      this.fired = true;
+      this.onTimeout();
+    }, this.timeoutMs);
+  }
+
+  /** Reset the countdown; ignored once the watchdog has fired. */
+  heartbeat(): void {
+    if (this.fired) return;
+    this.start();
+  }
+
+  /** Cancel any pending timeout. */
+  stop(): void {
+    if (this.timer) {
+      clearTimeout(this.timer);
+      this.timer = null;
+    }
+  }
+}
+```
+
+- [ ] **Step 4: Verify tests pass**
+
+```bash
+npx vitest run tests/watchdog.test.ts
+```
+
+Expected: all passing.
+
+- [ ] **Step 5: Wire watchdogs into the runner**
+
+In `runner.ts`, add before building `ctx`:
+
+```typescript
+  const watchdogs = new Map<string, Watchdog>();
+  const roomAborters = new Map<string, AbortController>();
+  for (let i = 0; i < rooms; i++) {
+    const roomId = `room-${i}`;
+    const aborter = new AbortController();
+    roomAborters.set(roomId, aborter);
+    const w = new Watchdog(60_000, () => {
+      logger.error('watchdog_fired', { room: roomId });
+      aborter.abort();
+      dashboard.update(roomId, { phase: 'error' });
+    });
+    w.start();
+    watchdogs.set(roomId, w);
+  }
+```
+
+Import at the top:
+
+```typescript
+import { Watchdog } from './core/watchdog';
+```
+
+Set `ctx.heartbeat` to:
+
+```typescript
+  heartbeat: (roomId: string) => {
+    const w = watchdogs.get(roomId);
+    if (w) w.heartbeat();
+  },
+```
+
+In the `finally` block, stop all watchdogs:
+
+```typescript
+  for (const w of watchdogs.values()) w.stop();
+```
+
+Note: for now the `roomAborters` are not plumbed into scenario cancellation; scenarios see only the global `ctx.signal`. This is intentional: per-room abort requires scenario-side awareness and is deferred until a scenario genuinely misbehaves. Until then, a firing watchdog logs `watchdog_fired` and marks the room's dashboard tile as errored, but it does not stop the room or the run.
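If per-room cancellation is wired up later, one possible shape is a derived signal that aborts when either the global signal or the room's own `AbortController` fires. The sketch below is an assumption rather than part of this plan: `linkSignals` is a hypothetical helper, and scenarios would still need to check the derived signal between turns.

```typescript
// Hypothetical helper (not defined anywhere in this plan): returns a signal
// that aborts as soon as any one of the input signals aborts.
function linkSignals(...signals: AbortSignal[]): AbortSignal {
  const combined = new AbortController();
  for (const s of signals) {
    if (s.aborted) {
      // One input is already aborted, so the combined signal starts aborted.
      combined.abort(s.reason);
      break;
    }
    s.addEventListener('abort', () => combined.abort(s.reason), { once: true });
  }
  return combined.signal;
}

// Example wiring: a room stops on the global abort or on its own watchdog.
const globalAbort = new AbortController();
const roomAbort = new AbortController();
const roomSignal = linkSignals(globalAbort.signal, roomAbort.signal);
```

With this shape, the watchdog callback's existing `aborter.abort()` call would stop only that room, leaving the other rooms and the global signal untouched.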
+ +- [ ] **Step 6: Commit** + +```bash +git add tests/soak/core/watchdog.ts tests/soak/tests/watchdog.test.ts tests/soak/runner.ts +git commit -m "$(cat <<'EOF' +feat(soak): per-room watchdog with heartbeat + +Watchdog class with Vitest tests, wired into ctx.heartbeat in the +runner. One watchdog per room, 60s timeout; firing logs an error +and marks the room's dashboard tile as errored. + +Co-Authored-By: Claude Opus 4.6 (1M context) +EOF +)" +``` + +--- + +### Task 27: Artifact capture on failure + +When the runner catches an error, snapshot every session's page: screenshot, HTML, console log tail, game state JSON. + +**Files:** +- Create: `tests/soak/core/artifacts.ts` +- Modify: `tests/soak/runner.ts` — call `captureArtifacts` in the catch block + +- [ ] **Step 1: Implement `core/artifacts.ts`** + +```typescript +// tests/soak/core/artifacts.ts +import * as fs from 'fs'; +import * as path from 'path'; +import type { Session, Logger } from './types'; + +export interface ArtifactsOptions { + runId: string; + /** Absolute path to the artifacts root, e.g., /path/to/tests/soak/artifacts */ + rootDir: string; + logger: Logger; +} + +export class Artifacts { + readonly runDir: string; + + constructor(private opts: ArtifactsOptions) { + this.runDir = path.join(opts.rootDir, opts.runId); + fs.mkdirSync(this.runDir, { recursive: true }); + } + + /** Capture everything for a single session. */ + async captureSession(session: Session, roomId: string): Promise { + const dir = path.join(this.runDir, roomId); + fs.mkdirSync(dir, { recursive: true }); + const prefix = session.key; + + try { + const png = await session.page.screenshot({ fullPage: true }); + fs.writeFileSync(path.join(dir, `${prefix}.png`), png); + } catch (err) { + this.opts.logger.warn('artifact_screenshot_failed', { + session: session.key, + error: err instanceof Error ? 
err.message : String(err), + }); + } + + try { + const html = await session.page.content(); + fs.writeFileSync(path.join(dir, `${prefix}.html`), html); + } catch (err) { + this.opts.logger.warn('artifact_html_failed', { + session: session.key, + error: err instanceof Error ? err.message : String(err), + }); + } + + try { + const state = await session.bot.getGameState(); + fs.writeFileSync( + path.join(dir, `${prefix}.state.json`), + JSON.stringify(state, null, 2), + ); + } catch (err) { + this.opts.logger.warn('artifact_state_failed', { + session: session.key, + error: err instanceof Error ? err.message : String(err), + }); + } + + try { + const errors = session.bot.getConsoleErrors?.() ?? []; + fs.writeFileSync(path.join(dir, `${prefix}.console.txt`), errors.join('\n')); + } catch { + // ignore — not all bots expose this + } + } + + async captureAll(sessions: Session[]): Promise { + // Best-effort: partition sessions by their key prefix (doesn't matter) + // and write everything under room-unknown/ unless callers pre-partition + await Promise.all( + sessions.map((s) => this.captureSession(s, 'room-unknown')), + ); + } + + writeSummary(summary: object): void { + fs.writeFileSync( + path.join(this.runDir, 'summary.json'), + JSON.stringify(summary, null, 2), + ); + } +} + +/** Prune run directories older than `maxAgeMs`. 
*/ +export function pruneOldRuns(rootDir: string, maxAgeMs: number, logger: Logger): void { + if (!fs.existsSync(rootDir)) return; + const now = Date.now(); + for (const entry of fs.readdirSync(rootDir)) { + const full = path.join(rootDir, entry); + try { + const stat = fs.statSync(full); + if (stat.isDirectory() && now - stat.mtimeMs > maxAgeMs) { + fs.rmSync(full, { recursive: true, force: true }); + logger.info('artifact_pruned', { runId: entry }); + } + } catch { + // ignore + } + } +} +``` + +- [ ] **Step 2: Call artifact capture from the runner's error path** + +In `runner.ts`, import: + +```typescript +import { Artifacts, pruneOldRuns } from './core/artifacts'; +``` + +After `const runId = ...`, instantiate and prune: + +```typescript + const artifactsRoot = path.resolve(__dirname, 'artifacts'); + const artifacts = new Artifacts({ runId, rootDir: artifactsRoot, logger }); + pruneOldRuns(artifactsRoot, 7 * 24 * 3600 * 1000, logger); +``` + +In the `catch (err)` block, after logging, capture: + +```typescript + } catch (err) { + logger.error('run_failed', { + error: err instanceof Error ? err.message : String(err), + stack: err instanceof Error ? err.stack : undefined, + }); + try { + const liveSessions = pool['activeSessions'] as Session[] | undefined; + if (liveSessions && liveSessions.length > 0) { + await artifacts.captureAll(liveSessions); + } + } catch (captureErr) { + logger.warn('artifact_capture_failed', { + error: captureErr instanceof Error ? captureErr.message : String(captureErr), + }); + } + exitCode = 1; + } +``` + +(Note: the `pool['activeSessions']` access bypasses visibility to avoid adding a public getter for one call site. Acceptable for an error path in a test harness.) 
+
+After a successful run, write the summary:
+
+```typescript
+  artifacts.writeSummary({
+    runId,
+    scenario: scenario.name,
+    targetUrl,
+    gamesCompleted: result.gamesCompleted,
+    errors: result.errors,
+    durationMs: result.durationMs,
+    customMetrics: result.customMetrics,
+  });
+```
+
+Import the `Session` type:
+
+```typescript
+import type { Session } from './core/types';
+```
+
+- [ ] **Step 3: Verify by forcing a failure**
+
+Kill the server mid-run and confirm artifacts are written:
+
+```bash
+# In one terminal
+TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run soak -- \
+  --scenario=populate --accounts=2 --rooms=1 --cpus-per-room=0 \
+  --games-per-room=5 --holes=3 --watch=none
+
+# In another: wait ~3 seconds then Ctrl-C the dev server
+# The soak run should catch errors and write artifacts
+
+ls tests/soak/artifacts/
+ls tests/soak/artifacts/<run-id>/
+```
+
+Expected: a run directory exists with `summary.json` (if it got far enough) or per-session screenshots / HTML under `room-unknown/`.
+
+- [ ] **Step 4: Commit**
+
+```bash
+git add tests/soak/core/artifacts.ts tests/soak/runner.ts
+git commit -m "$(cat <<'EOF'
+feat(soak): artifact capture on failure + run summary
+
+Screenshots, HTML, game state, and console errors are captured into
+tests/soak/artifacts/<run-id>/ when a scenario throws. Runs older
+than 7 days are pruned on startup. Successful runs get a
+summary.json in the same run directory.
+
+Co-Authored-By: Claude Opus 4.6 (1M context)
+EOF
+)"
+```
+
+---
+
+### Task 28: Graceful shutdown (already partially in place) + exit codes
+
+SIGINT/SIGTERM already flip the abort controller. Formalize the timeout-and-force-exit path and the three exit codes (`0` / `1` / `2`).
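One detail worth pinning down: the health probe in Task 29 aborts the same controller as SIGINT/SIGTERM, so `abortController.signal.aborted` alone cannot distinguish "interrupted" (exit 2) from "server died" (exit 1). A minimal sketch of the mapping, assuming a hypothetical `resolveExitCode` helper and a separate `signalReceived` flag set only in the signal handler. Letting the interrupt win over errors, and treating a non-empty `errors` list as exit 1, are both assumptions the plan does not spell out.

```typescript
// Hypothetical helper; the field names are illustrative, not from runner.ts.
interface RunOutcome {
  signalReceived: boolean; // set by the SIGINT/SIGTERM handler only
  threw: boolean; // the scenario or runner threw
  errorCount: number; // e.g. the length of ScenarioResult.errors
}

function resolveExitCode(outcome: RunOutcome): 0 | 1 | 2 {
  if (outcome.signalReceived) return 2; // interrupted by the operator
  if (outcome.threw || outcome.errorCount > 0) return 1; // errors, incl. health aborts
  return 0; // clean run
}
```

The runner could then call `process.exit(resolveExitCode(...))` once after the `finally` block, keeping `130` reserved for the forced-exit paths.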
+ +**Files:** +- Modify: `tests/soak/runner.ts` + +- [ ] **Step 1: Add a graceful shutdown timeout** + +In `runner.ts`, replace the existing signal handlers with: + +```typescript + let forceExitTimer: NodeJS.Timeout | null = null; + const onSignal = (sig: string) => { + if (abortController.signal.aborted) { + // Second signal: force exit + logger.warn('force_exit', { signal: sig }); + process.exit(130); + } + logger.warn('signal_received', { signal: sig }); + abortController.abort(); + // Hard-kill after 10s if cleanup hangs + forceExitTimer = setTimeout(() => { + logger.error('graceful_shutdown_timeout'); + process.exit(130); + }, 10_000); + }; + process.on('SIGINT', () => onSignal('SIGINT')); + process.on('SIGTERM', () => onSignal('SIGTERM')); +``` + +In the `finally` block, clear the force-exit timer: + +```typescript + if (forceExitTimer) clearTimeout(forceExitTimer); +``` + +- [ ] **Step 2: Manual test — Ctrl-C a long run** + +```bash +TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run soak -- \ + --scenario=populate --accounts=2 --rooms=1 --cpus-per-room=0 \ + --games-per-room=10 --holes=3 --watch=none + +# After ~5 seconds: Ctrl-C +``` + +Expected: runner logs `signal_received`, finishes current turn, prints summary, exits with code 2 (check `echo $?`). + +- [ ] **Step 3: Commit** + +```bash +git add tests/soak/runner.ts +git commit -m "$(cat <<'EOF' +feat(soak): graceful shutdown with 10s hard-kill fallback + +SIGINT/SIGTERM flips the abort signal; scenarios finish the current +turn then exit. If cleanup hangs >10s the runner force-exits. Second +Ctrl-C is an immediate hard kill. Exit codes: 0 success, 1 errors, +2 interrupted. + +Co-Authored-By: Claude Opus 4.6 (1M context) +EOF +)" +``` + +--- + +### Task 29: Periodic health probes + +Every 30s, fetch `/api/health` on the target server. Three consecutive failures declare a fatal error and abort. 
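The "three consecutive failures" rule is small enough to separate from the timer and the `fetch` call, which would make it unit-testable with Vitest like the other pure modules. A minimal sketch, assuming a hypothetical `FailureStreak` helper that this plan does not define:

```typescript
// Hypothetical helper: counts consecutive failed probes and reports when
// the streak reaches the configured threshold.
class FailureStreak {
  private consecutive = 0;
  private readonly threshold: number;

  constructor(threshold: number) {
    this.threshold = threshold;
  }

  /** Record one probe result; returns true once the streak hits the threshold. */
  record(ok: boolean): boolean {
    this.consecutive = ok ? 0 : this.consecutive + 1;
    return this.consecutive >= this.threshold;
  }

  get count(): number {
    return this.consecutive;
  }
}
```

A single successful probe resets the streak, so intermittent failures never accumulate toward the abort.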
+ +**Files:** +- Modify: `tests/soak/runner.ts` + +- [ ] **Step 1: Add a health probe interval** + +In `runner.ts`, after building the abort controller and before running the scenario: + +```typescript + let healthFailures = 0; + const healthTimer = setInterval(async () => { + try { + const res = await fetch(`${targetUrl}/api/health`); + if (!res.ok) throw new Error(`status ${res.status}`); + healthFailures = 0; + } catch (err) { + healthFailures++; + logger.warn('health_probe_failed', { + consecutive: healthFailures, + error: err instanceof Error ? err.message : String(err), + }); + if (healthFailures >= 3) { + logger.error('health_fatal', { consecutive: healthFailures }); + abortController.abort(); + } + } + }, 30_000); +``` + +In the `finally` block: + +```typescript + clearInterval(healthTimer); +``` + +- [ ] **Step 2: Commit** + +```bash +git add tests/soak/runner.ts +git commit -m "$(cat <<'EOF' +feat(soak): periodic health probes against target server + +Every 30s GET /api/health. Three consecutive failures abort the +run with a fatal error, so staging outages don't get misattributed +to harness bugs. + +Co-Authored-By: Claude Opus 4.6 (1M context) +EOF +)" +``` + +--- + +## Phase 10 — Polish and bring-up + +### Task 30: Smoke test script + +`tests/soak/scripts/smoke.sh` — the canary run that takes ~30s against local dev. + +**Files:** +- Create: `tests/soak/scripts/smoke.sh` + +- [ ] **Step 1: Create the script** + +```bash +#!/usr/bin/env bash +# Soak harness smoke test — end-to-end canary against local dev. +# Expected runtime: ~30 seconds. +set -euo pipefail + +cd "$(dirname "$0")/.." + +: "${TEST_URL:=http://localhost:8000}" +: "${SOAK_INVITE_CODE:=SOAKTEST}" + +echo "Smoke target: $TEST_URL" +echo "Invite code: $SOAK_INVITE_CODE" + +# 1. Health probe +curl -fsS "$TEST_URL/api/health" > /dev/null || { + echo "FAIL: target server unreachable at $TEST_URL" + exit 1 +} + +# 2. Ensure minimum accounts +if [ ! 
-f .env.stresstest ]; then + echo "Seeding accounts..." + npm run seed -- --count=4 +fi + +# 3. Run minimum viable scenario +TEST_URL="$TEST_URL" SOAK_INVITE_CODE="$SOAK_INVITE_CODE" \ + npm run soak -- \ + --scenario=populate \ + --accounts=2 \ + --rooms=1 \ + --cpus-per-room=0 \ + --games-per-room=1 \ + --holes=1 \ + --watch=none + +echo "Smoke PASSED" +``` + +- [ ] **Step 2: Make it executable and run it** + +```bash +chmod +x tests/soak/scripts/smoke.sh +cd tests/soak && bash scripts/smoke.sh +``` + +Expected: `Smoke PASSED` within ~30s. + +- [ ] **Step 3: Commit** + +```bash +git add tests/soak/scripts/smoke.sh +git commit -m "$(cat <<'EOF' +feat(soak): smoke test script — 30s end-to-end canary + +Confirms the harness works against local dev with the absolute +minimum config. Run after any change. + +Co-Authored-By: Claude Opus 4.6 (1M context) +EOF +)" +``` + +--- + +### Task 31: README + CHECKLIST + +Replace the README stub with a full quickstart and flag reference. Add the manual validation checklist. + +**Files:** +- Modify: `tests/soak/README.md` +- Create: `tests/soak/CHECKLIST.md` + +- [ ] **Step 1: Rewrite `tests/soak/README.md`** + +```markdown +# Golf Soak & UX Test Harness + +Standalone Playwright-based runner that drives multi-user authenticated +game sessions for scoreboard population and stability testing. 
+ +**Spec:** `../../docs/superpowers/specs/2026-04-10-multiplayer-soak-test-design.md` +**Bring-up:** `../../docs/soak-harness-bringup.md` + +## Quick start + +```bash +cd tests/soak +npm install + +# First run only: seed 16 accounts +TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run seed + +# 30-second end-to-end smoke test +bash scripts/smoke.sh + +# Populate scoreboard (4 rooms × 4 accounts × 10 long games) +TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST \ + npm run soak:populate + +# Stress test (4 rooms × 50 rapid games with chaos) +TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST \ + npm run soak:stress +``` + +## CLI flags + +``` +--scenario=populate|stress required +--accounts= total sessions (default: scenario.needs.accounts) +--rooms= default from scenario.needs +--cpus-per-room= default from scenario.needs +--games-per-room= default from scenario.defaultConfig +--holes= default from scenario.defaultConfig +--watch=none|dashboard|tiled default: dashboard +--dashboard-port= default: 7777 +--target= default: TEST_URL env +--run-id= default: ISO timestamp +--list print scenarios and exit +--dry-run validate config, don't run +``` + +Derived: `accounts / rooms` must divide evenly. + +## Environment variables + +``` +TEST_URL target base URL (e.g. https://staging.adlee.work) +SOAK_INVITE_CODE invite code flagged marks_as_test (staging: 5VC2MCCN) +SOAK_HOLES override --holes +SOAK_ROOMS override --rooms +SOAK_ACCOUNTS override --accounts +SOAK_CPUS_PER_ROOM override --cpus-per-room +SOAK_GAMES_PER_ROOM override --games-per-room +SOAK_WATCH override --watch +SOAK_DASHBOARD_PORT override --dashboard-port +``` + +## Watch modes + +- **`none`** — pure headless, JSON logs to stdout. Use for CI and overnight runs. +- **`dashboard`** (default) — HTTP+WS server on localhost:7777 serving a live status grid. Click any player tile to watch their live session via CDP screencast. 
+- **`tiled`** — 4 native Chromium windows for the host of each room, positioned in a 2×2 grid. Joiners stay headless. + +## Scenarios + +| Name | Description | +|---|---| +| `populate` | Long 9-hole games with varied CPU personalities, realistic pacing, for populating scoreboards | +| `stress` | Rapid 1-hole games with chaos injection (rapid clicks, offline toggles, tab blur) for hunting race conditions | + +Add new scenarios by creating `scenarios/.ts` and registering in `scenarios/index.ts`. + +## Architecture + +See the design spec for full module breakdown. Key modules: + +- `runner.ts` — CLI entry, wires everything together +- `core/session-pool.ts` — owns browser contexts, seeds/logs in 16 accounts +- `core/room-coordinator.ts` — host→joiners room-code handoff +- `core/watchdog.ts` — per-room timeout detection +- `core/screencaster.ts` — CDP Page.startScreencast for live video +- `dashboard/server.ts` — HTTP + WS server +- `scenarios/` — pluggable scenarios + +Reuses `../../tests/e2e/bot/golf-bot.ts` unchanged. + +## Running tests (unit) + +```bash +npm test +``` + +Tests cover `Deferred`, `RoomCoordinator`, `Watchdog`, and `config`. +Integration-level modules are verified by the smoke test. +``` + +- [ ] **Step 2: Create `tests/soak/CHECKLIST.md`** + +```markdown +# Soak Harness Manual Validation Checklist + +Run after any significant change or before calling the implementation complete. 
+ +## Bring-up + +- [ ] Local dev server is running (`python server/main.py`) +- [ ] `SOAKTEST` invite code exists locally with `marks_as_test=TRUE` +- [ ] `npm install` in `tests/soak/` succeeded +- [ ] `npm run seed -- --count=16` creates/updates 16 accounts +- [ ] `.env.stresstest` has 16 `SOAK_ACCOUNT_NN=...` lines +- [ ] All seeded users show `is_test_account=TRUE` in the DB + +## Smoke + +- [ ] `bash scripts/smoke.sh` exits 0 within 60s + +## Scenarios + +- [ ] `--scenario=populate --rooms=1 --games-per-room=1` completes cleanly +- [ ] `--scenario=populate --rooms=4 --games-per-room=1` runs 4 rooms in parallel with no cross-contamination +- [ ] `--scenario=stress --games-per-room=3` logs `chaos_injected` events + +## Watch modes + +- [ ] `--watch=none` produces JSONL on stdout, nothing else +- [ ] `--watch=dashboard` opens http://localhost:7777, grid renders, tiles update live, WS status shows `healthy` +- [ ] Clicking any player tile opens the video modal and streams live JPEG frames (~10 fps) +- [ ] Closing the modal stops the screencast (check logs for `screencast_stopped`) +- [ ] `--watch=tiled` opens 4 native Chromium windows for the 4 hosts + +## Failure modes + +- [ ] Ctrl-C during a run → graceful shutdown, summary printed, exit code 2 +- [ ] Double Ctrl-C → hard exit (130) +- [ ] Killing the dev server mid-run → health probes fail 3× → fatal abort, artifacts captured, exit 1 +- [ ] Artifacts directory contains a subdirectory per failed run with screenshots and state.json +- [ ] Artifacts older than 7 days are pruned on next startup + +## Server-side filtering + +- [ ] `GET /api/stats/leaderboard` (default) hides soak_* accounts +- [ ] `GET /api/stats/leaderboard?include_test=true` shows soak_* accounts +- [ ] Admin panel user list shows `[Test]` badge on soak_* accounts +- [ ] Admin panel "Include test accounts" checkbox filters them out +- [ ] Admin panel invite codes tab shows `[Test-seed]` next to SOAKTEST + +## Staging bring-up (final step) + +- 
[ ] `UPDATE invite_codes SET marks_as_test = TRUE WHERE code = '5VC2MCCN';` run on staging +- [ ] `SOAK_INVITE_CODE=5VC2MCCN TEST_URL=https://staging.adlee.work npm run seed -- --count=16` seeds staging accounts +- [ ] Staging run with `--scenario=populate --watch=none` completes +- [ ] Staging leaderboard with `include_test=true` shows the soak accounts +- [ ] Staging leaderboard default (no param) does NOT show the soak accounts +``` + +- [ ] **Step 3: Commit** + +```bash +git add tests/soak/README.md tests/soak/CHECKLIST.md +git commit -m "$(cat <<'EOF' +docs(soak): full README + manual validation checklist + +Quickstart, flag reference, env var reference, scenario table, and +the bring-up/validation checklist that gates calling the harness +implementation complete. + +Co-Authored-By: Claude Opus 4.6 (1M context) +EOF +)" +``` + +--- + +### Task 32: Staging bring-up (manual, no code) + +This is a documentation-only task — the actual run happens on your workstation. Listed here so the implementation plan is complete end to end. + +- [ ] **Step 1: Flag `5VC2MCCN` as test-seed on staging** + +From your workstation (requires DB access to staging): + +```bash +ssh root@129.212.150.189 \ + 'docker exec -i golfgame-postgres psql -U postgres -d golfgame' <<'EOF' +UPDATE invite_codes SET marks_as_test = TRUE WHERE code = '5VC2MCCN'; +SELECT code, max_uses, use_count, marks_as_test FROM invite_codes WHERE code = '5VC2MCCN'; +EOF +``` + +Expected: `marks_as_test | t`. + +(The exact docker container name may differ — adjust based on `docker ps` on the staging host.) + +- [ ] **Step 2: Seed the 16 staging accounts** + +```bash +cd tests/soak +rm -f .env.stresstest +TEST_URL=https://staging.adlee.work \ + SOAK_INVITE_CODE=5VC2MCCN \ + npm run seed -- --count=16 +``` + +Expected: `.env.stresstest` populated with 16 entries. 
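The "16 entries" expectation can be checked mechanically rather than by eye. A minimal sketch, assuming the `SOAK_ACCOUNT_NN=...` key format mentioned in the checklist; `countSeededAccounts` is a hypothetical helper and the value format is deliberately not inspected:

```typescript
// Hypothetical helper: counts lines that look like SOAK_ACCOUNT_NN=... entries.
// Only the key is matched; whatever the seeder writes after "=" is ignored.
function countSeededAccounts(envText: string): number {
  return envText
    .split(/\r?\n/)
    .filter((line) => /^SOAK_ACCOUNT_\d+=/.test(line.trim())).length;
}
```

Feeding it the contents of `.env.stresstest` (for example via `fs.readFileSync('.env.stresstest', 'utf8')`) should return 16 after a full seed.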
+ +- [ ] **Step 3: Run populate against staging** + +```bash +TEST_URL=https://staging.adlee.work \ + SOAK_INVITE_CODE=5VC2MCCN \ + npm run soak -- \ + --scenario=populate \ + --rooms=4 \ + --games-per-room=3 \ + --holes=3 \ + --watch=dashboard +``` + +Expected: dashboard opens, 4 rooms play 3 games each, staging scoreboard accumulates data. Exit 0 at the end. + +- [ ] **Step 4: Verify scoreboard filtering on staging** + +```bash +# Should NOT contain soak_* usernames +curl -s "https://staging.adlee.work/api/stats/leaderboard?metric=wins" | jq '.entries[] | select(.username | startswith("soak_"))' + +# Should contain soak_* usernames +curl -s "https://staging.adlee.work/api/stats/leaderboard?metric=wins&include_test=true" | jq '.entries[] | select(.username | startswith("soak_"))' +``` + +Expected: first returns nothing, second returns entries. + +- [ ] **Step 5: Mark implementation complete** + +Check off all items in `tests/soak/CHECKLIST.md` that correspond to this plan. Commit the filled-in checklist if you want a record: + +```bash +git add tests/soak/CHECKLIST.md +git commit -m "docs(soak): checklist passed on initial staging run" +``` + +--- + +## Phase 11 — Version bump + +### Task 33: Bump to v3.3.4 and add footer to admin.html + +Updates all HTML footers from `v3.1.6` to `v3.3.4`, adds a footer to admin.html which currently has none, bumps `pyproject.toml`. + +**Files:** +- Modify: `client/index.html` — both footer occurrences (L58, L291) +- Modify: `client/admin.html` — add footer +- Modify: `pyproject.toml` — version field + +- [ ] **Step 1: Update `client/index.html` footers** + +```bash +grep -n "v3\.1\.6" client/index.html +``` + +For each match, replace `v3.1.6` with `v3.3.4`. There should be exactly two matches. + +- [ ] **Step 2: Add footer to `client/admin.html`** + +Find the closing `` in `client/admin.html` and add a footer just before it: + +```html +
    v3.3.4 © Aaron D. Lee
    + +``` + +(The inline style is a fallback — admin.css may already have an `.app-footer` class; if so, drop the inline styles.) + +```bash +grep -n "app-footer" client/admin.css 2>/dev/null +``` + +If the class exists, use just `
    <footer class="app-footer">v3.3.4 © Aaron D. Lee</footer>
    `. + +- [ ] **Step 3: Bump `pyproject.toml`** + +```bash +sed -i 's/^version = "3\.1\.6"$/version = "3.3.4"/' pyproject.toml +grep version pyproject.toml +``` + +Expected: `version = "3.3.4"`. + +- [ ] **Step 4: Verify in the browser** + +Restart the dev server, open http://localhost:8000 and http://localhost:8000/admin.html. Confirm both show `v3.3.4` in the footer. + +- [ ] **Step 5: Commit** + +```bash +git add client/index.html client/admin.html pyproject.toml +git commit -m "$(cat <<'EOF' +chore: bump version to v3.3.4 + +Updates client/index.html footer (×2) and pyproject.toml from +v3.1.6 → v3.3.4, and adds a matching footer to client/admin.html +which previously had none. + +Co-Authored-By: Claude Opus 4.6 (1M context) +EOF +)" +``` + +--- + +## Summary + +33 tasks across 11 phases: + +| Phase | Tasks | Milestone | +|---|---|---| +| 1 — Server changes | 1–8 | Stats filter works, test accounts are separable | +| 2 — Harness scaffolding | 9–12 | Core pure-logic modules with Vitest tests pass | +| 3 — SessionPool + seeding | 13–14 | `.env.stresstest` seeded via real HTTP | +| 4 — First run | 15–18 | **`--watch=none` smoke test passes end-to-end** | +| 5 — Dashboard | 19–21 | Live status grid in browser | +| 6 — Live video | 22–23 | Click-to-watch CDP screencast | +| 7 — Tiled mode | 24 | Native host windows | +| 8 — Stress scenario | 25 | Chaos injection runs clean | +| 9 — Failure handling | 26–29 | Watchdog + artifacts + graceful shutdown + health probes | +| 10 — Polish | 30–31 | Smoke script + README + CHECKLIST | +| 11 — Version bump | 33 | v3.3.4 everywhere | + +(Task 32 is the manual staging bring-up — no code.) 
+
+Dependencies between tasks:
+
+- Tasks 1–8 are independent of the harness (ship them first if you want immediate value for admins)
+- Tasks 9–18 are strictly sequential (each builds on the previous)
+- Tasks 19–21, 22–23, 24, 25 are independent of each other and can be done in any order after Task 18
+- Tasks 26–29 can be done after Task 18 but are most valuable after Task 25
+- Tasks 30–31 come last before staging
+- Task 33 is independent and can be done any time after Task 8