From cf916d7bc3f60226d9420e74d6e89c9fafe72699 Mon Sep 17 00:00:00 2001
From: adlee-was-taken <aaron.daniel.lee@gmail.com>
Date: Fri, 10 Apr 2026 23:37:15 -0400
Subject: [PATCH] docs: implementation plan for multiplayer soak harness
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

33-task TDD plan across 11 phases implementing the soak & UX test
harness design. Server-side schema/filter/admin changes ship first
(independent), then the tests/soak/ TypeScript runner builds up
incrementally — first milestone is a --watch=none smoke run against
local dev after Task 18, then dashboard, live video, tiled mode,
stress scenario, failure handling, and staging bring-up. Final
task bumps project version to v3.3.4.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---
 .../plans/2026-04-10-multiplayer-soak-test.md | 5495 +++++++++++++++++
 1 file changed, 5495 insertions(+)
 create mode 100644 docs/superpowers/plans/2026-04-10-multiplayer-soak-test.md

diff --git a/docs/superpowers/plans/2026-04-10-multiplayer-soak-test.md b/docs/superpowers/plans/2026-04-10-multiplayer-soak-test.md
new file mode 100644
index 0000000..c8b9252
--- /dev/null
+++ b/docs/superpowers/plans/2026-04-10-multiplayer-soak-test.md
@@ -0,0 +1,5495 @@
+# Multiplayer Soak & UX Test Harness — Implementation Plan
+
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+
+**Goal:** Build a standalone Playwright-based soak runner in `tests/soak/` that drives 16 authenticated browser sessions across 4 concurrent rooms playing many multiplayer games, with pluggable scenarios, a click-to-watch dashboard via CDP screencast, and strict per-room failure isolation.
+
+**Architecture:** Single-process node runner reusing the existing `GolfBot` class from `tests/e2e/bot/`. One shared browser (16 contexts) by default; `WATCH=tiled` uses a second headed browser for the 4 host contexts. Scenarios are plain TS modules exported from `tests/soak/scenarios/`. Dashboard is a tiny HTTP+WS server serving one static page that pushes live status and on-demand CDP screencast frames.
+
+**Tech Stack:** TypeScript + tsx (no build step), Playwright Core, ws (WebSocket server), Vitest for unit tests, FastAPI + asyncpg (existing server), PostgreSQL (existing).
+
+**Spec:** `docs/superpowers/specs/2026-04-10-multiplayer-soak-test-design.md`
+
+---
+
+## Testing Strategy Notes
+
+- **Server-side Python changes:** The existing test suite mocks stores with `AsyncMock` and has no real-Postgres fixtures. Rather than inventing a new fixture pattern for this plan, server tasks use **curl-based verification against a running local dev server** as the explicit verification step before each commit. Run `python server/main.py` in another terminal (requires Postgres + Redis running; see `docs/INSTALL.md`).
+- **TypeScript harness logic:** Unit-tested with Vitest for pure modules (Deferred, RoomCoordinator, Watchdog, Config). Integration-level modules (SessionPool, Dashboard, Screencaster, Scenarios) are verified by running the harness itself via the smoke test.
+- **End-to-end validation:** `tests/soak/scripts/smoke.sh` is the canary — after every non-trivial change, run it against local dev and expect exit 0 within ~30s.
+
+---
+
+## Phase 1 — Server-side changes (independent, ships first)
+
+### Task 1: Schema migration for `is_test_account` and `marks_as_test`
+
+Add two columns and one partial index, then rebuild the `leaderboard_overall` materialized view to include `is_test_account` (so the filter works through the view fast path). This fits the existing inline-migration pattern in `user_store.py`.
+
+**Files:**
+- Modify: `server/stores/user_store.py` — append to `SCHEMA_SQL` (ALTER blocks near L79–L98 and the matview block near L298–L335)
+
+- [ ] **Step 1: Add column migration to `SCHEMA_SQL`**
+
+Open `server/stores/user_store.py`. Inside the first `DO $$ BEGIN ... END $$;` block (around lines 80–98, the one that handles admin columns), append the `is_test_account` column check. Then add a second ALTER for `invite_codes.marks_as_test` in a new `DO $$` block right after.
+
+Add after the existing `last_seen_at` check (before `END $$;` on line ~98):
+
+```sql
+    IF NOT EXISTS (SELECT 1 FROM information_schema.columns
+                   WHERE table_name = 'users_v2' AND column_name = 'is_test_account') THEN
+        ALTER TABLE users_v2 ADD COLUMN is_test_account BOOLEAN DEFAULT FALSE;
+    END IF;
+```
+
+Then, immediately after the `END $$;` that closes the users_v2 admin block, add a new block for invite_codes:
+
+```sql
+-- Add marks_as_test to invite_codes if not exists
+DO $$
+BEGIN
+    IF NOT EXISTS (SELECT 1 FROM information_schema.columns
+                   WHERE table_name = 'invite_codes' AND column_name = 'marks_as_test') THEN
+        ALTER TABLE invite_codes ADD COLUMN marks_as_test BOOLEAN DEFAULT FALSE;
+    END IF;
+END $$;
+```
+
+- [ ] **Step 2: Add partial index on `is_test_account`**
+
+Find the indexes block near line 338. After the existing `idx_users_banned` index (line ~344), add:
+
+```sql
+CREATE INDEX IF NOT EXISTS idx_users_v2_is_test_account ON users_v2(is_test_account)
+    WHERE is_test_account = TRUE;
+```
+
+- [ ] **Step 3: Rebuild `leaderboard_overall` materialized view to include `is_test_account`**
+
+Find the existing matview block at line ~298. Modify the version-check DO block so the view is dropped and recreated if it lacks the `is_test_account` column. Replace the existing block with:
+
+```sql
+-- Leaderboard materialized view (refreshed periodically)
+-- Drop and recreate if missing is_test_account column (soak harness migration)
+DO $$
+BEGIN
+    IF EXISTS (SELECT 1 FROM pg_matviews WHERE matviewname = 'leaderboard_overall') THEN
+        -- Check pg_attribute: information_schema.columns does not list materialized views
+        IF NOT EXISTS (
+            SELECT 1 FROM pg_attribute
+            WHERE attrelid = 'leaderboard_overall'::regclass AND attname = 'is_test_account'
+        ) THEN
+            DROP MATERIALIZED VIEW leaderboard_overall;
+        END IF;
+    END IF;
+
+    IF NOT EXISTS (SELECT 1 FROM pg_matviews WHERE matviewname = 'leaderboard_overall') THEN
+        EXECUTE '
+            CREATE MATERIALIZED VIEW leaderboard_overall AS
+            SELECT
+                u.id as user_id,
+                u.username,
+                COALESCE(u.is_test_account, FALSE) as is_test_account,
+                s.games_played,
+                s.games_won,
+                ROUND(s.games_won::numeric / NULLIF(s.games_played, 0) * 100, 1) as win_rate,
+                s.rounds_won,
+                ROUND(s.total_points::numeric / NULLIF(s.total_rounds, 0), 1) as avg_score,
+                s.best_score as best_round_score,
+                s.knockouts,
+                s.best_win_streak,
+                COALESCE(s.rating, 1500) as rating,
+                s.last_game_at
+            FROM player_stats s
+            JOIN users_v2 u ON s.user_id = u.id
+            WHERE s.games_played >= 5
+            AND u.deleted_at IS NULL
+            AND (u.is_banned = false OR u.is_banned IS NULL)
+        ';
+    END IF;
+END $$;
+```
+
+Note: the only differences from the existing block are the changed comment, the changed column-existence check (`is_test_account` instead of `rating`), and the new `COALESCE(u.is_test_account, FALSE) as is_test_account` column in the SELECT. Everything else stays identical.
+
+- [ ] **Step 4: Start the server to run migrations**
+
+Run (in another terminal, with Postgres + Redis up):
+
+```bash
+cd /home/alee/Sources/golfgame
+python server/main.py
+```
+
+Expected: server starts cleanly, no errors about `is_test_account` or `marks_as_test` or `leaderboard_overall`.
+
+- [ ] **Step 5: Verify schema via psql**
+
+Connect to the dev database and confirm:
+
+```bash
+psql -d golfgame -c "\d users_v2" | grep is_test_account
+psql -d golfgame -c "\d invite_codes" | grep marks_as_test
+psql -d golfgame -c "\d leaderboard_overall" | grep is_test_account
+psql -d golfgame -c "\di idx_users_v2_is_test_account"
+```
+
+Expected: all four commands return matching rows.
+
+- [ ] **Step 6: Commit**
+
+```bash
+git add server/stores/user_store.py
+git commit -m "$(cat <<'EOF'
+feat(server): add is_test_account + marks_as_test schema
+
+New columns support separating soak-harness test traffic from real
+user traffic in stats queries. Rebuilds leaderboard_overall matview
+to include is_test_account so the fast path stays filterable.
+
+Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+EOF
+)"
+```
+
+---
+
+### Task 2: Propagate `is_test_account` through `User` model and `user_store`
+
+Wire the new column into the `User` dataclass, `create_user` signature, `_row_to_user` mapping, and every SELECT list that already pulls user columns.
+
+**Files:**
+- Modify: `server/models/user.py` — `User` dataclass (L22–L68) + `to_dict` (L82–L116) + `from_dict` (L118+)
+- Modify: `server/stores/user_store.py` — `create_user` (L454–L501), `_row_to_user` (L997–L1020), `get_user_by_id`/`get_user_by_username`/`get_user_by_email` SELECT lists (L503–L570)
+
+- [ ] **Step 1: Add `is_test_account` to the `User` dataclass**
+
+In `server/models/user.py`, add a new field to the `User` dataclass (after `force_password_reset` on L68):
+
+```python
+    is_test_account: bool = False
+```
+
+Update the docstring `Attributes:` block around L45 to include:
+
+```
+        is_test_account: True for accounts created by the soak test harness.
+```
+
+- [ ] **Step 2: Include `is_test_account` in `to_dict` and `from_dict`**
+
+In `User.to_dict` at L82, add to the `d` dict (after `force_password_reset`):
+
+```python
+            "is_test_account": self.is_test_account,
+```
+
+In `User.from_dict`, add the corresponding parse — find where `force_password_reset` is parsed and add the same pattern:
+
+```python
+            is_test_account=d.get("is_test_account", False),
+```
+
+- [ ] **Step 3: Add `is_test_account` parameter to `create_user`**
+
+In `server/stores/user_store.py` at L454, add a new parameter:
+
+```python
+    async def create_user(
+        self,
+        username: str,
+        password_hash: str,
+        email: Optional[str] = None,
+        role: UserRole = UserRole.USER,
+        guest_id: Optional[str] = None,
+        verification_token: Optional[str] = None,
+        verification_expires: Optional[datetime] = None,
+        is_test_account: bool = False,
+    ) -> Optional[User]:
+```
+
+Update the docstring to add a line in `Args:` describing `is_test_account`.
+
+Change the INSERT SQL block to include the new column:
+
+```python
+                row = await conn.fetchrow(
+                    """
+                    INSERT INTO users_v2 (username, password_hash, email, role, guest_id,
+                                          verification_token, verification_expires,
+                                          is_test_account)
+                    VALUES ($1, $2, $3, $4, $5, $6, $7, $8)
+                    RETURNING id, username, email, password_hash, role, email_verified,
+                              verification_token, verification_expires, reset_token, reset_expires,
+                              guest_id, deleted_at, preferences, created_at, last_login, last_seen_at,
+                              is_active, is_banned, ban_reason, force_password_reset, is_test_account
+                    """,
+                    username,
+                    password_hash,
+                    email,
+                    role.value,
+                    guest_id,
+                    verification_token,
+                    verification_expires,
+                    is_test_account,
+                )
+```
+
+- [ ] **Step 4: Update `_row_to_user` mapping**
+
+In `server/stores/user_store.py` at L997, add to the `User(...)` call (after `force_password_reset`):
+
+```python
+            is_test_account=row.get("is_test_account", False) or False,
+```
+
+- [ ] **Step 5: Update all other SELECT lists in user_store**
+
+Find every query in `server/stores/user_store.py` that returns a full user row and passes it to `_row_to_user`. Add `is_test_account` to the SELECT column list for each. Grep to find them:
+
+```bash
+grep -n "is_active, is_banned, ban_reason, force_password_reset" server/stores/user_store.py
+```
+
+For each match, append `, is_test_account` to the SELECT list. Expected locations:
+- `create_user` INSERT ... RETURNING (already updated in Step 3)
+- `get_user_by_id` at L503
+- `get_user_by_username` at L519
+- `get_user_by_email` (find it)
+- Any other `SELECT` ... FROM users_v2 that calls `_row_to_user`
+
+- [ ] **Step 6: Restart server, verify no errors**
+
+```bash
+# Kill and restart the dev server
+python server/main.py
+```
+
+Expected: server starts cleanly. Any query that touches users now returns `is_test_account` correctly.
+
+- [ ] **Step 7: Smoke test via curl**
+
+```bash
+# Register a throwaway test user. 5VC2MCCN is staging's invite code and probably does not
+# exist locally: swap in a valid local code, or drop invite_code if DAILY_OPEN_SIGNUPS > 0.
+# Set PW to any password of your choice (>= 8 chars).
+PW='SomeTestPw_1!'
+curl -sX POST http://localhost:8000/api/auth/register \
+  -H 'Content-Type: application/json' \
+  -d "{\"username\":\"soaktest_smoke1\",\"password\":\"$PW\",\"email\":\"soaktest_smoke1@example.com\",\"invite_code\":\"5VC2MCCN\"}"
+```
+
+Expected: HTTP 200 with `{"user":{...},"token":"..."}`. The registration path now runs through the new column without errors even though the value is still always FALSE at this stage.
+
+- [ ] **Step 8: Commit**
+
+```bash
+git add server/models/user.py server/stores/user_store.py
+git commit -m "$(cat <<'EOF'
+feat(server): propagate is_test_account through User model & store
+
+User dataclass, create_user, and all SELECT lists now round-trip the
+new column. Value is always FALSE until Task 4 wires the register
+flow to the invite code's marks_as_test flag.
+
+Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+EOF
+)"
+```
+
+---
+
+### Task 3: Expose `marks_as_test` on `InviteCode` and add lookup helper
+
+`validate_invite_code` currently returns a bare bool. We need a new helper that returns the code's row, including `marks_as_test`, so the register flow can read the flag once the code has been validated.
+
+**Files:**
+- Modify: `server/services/admin_service.py` — `InviteCode` dataclass (L115–L138), `get_invite_codes` SELECT (L1106–L1141), add new `get_invite_code_details` method
+
+- [ ] **Step 1: Add `marks_as_test` field to `InviteCode` dataclass**
+
+In `server/services/admin_service.py` at L115:
+
+```python
+@dataclass
+class InviteCode:
+    """Invite code details."""
+    code: str
+    created_by: str
+    created_by_username: str
+    created_at: datetime
+    expires_at: datetime
+    max_uses: int
+    use_count: int
+    is_active: bool
+    marks_as_test: bool = False
+```
+
+Update `to_dict` at L127 to include the field:
+
+```python
+    def to_dict(self) -> dict:
+        return {
+            "code": self.code,
+            "created_by": self.created_by,
+            "created_by_username": self.created_by_username,
+            "created_at": self.created_at.isoformat() if self.created_at else None,
+            "expires_at": self.expires_at.isoformat() if self.expires_at else None,
+            "max_uses": self.max_uses,
+            "use_count": self.use_count,
+            "is_active": self.is_active,
+            "remaining_uses": max(0, self.max_uses - self.use_count),
+            "marks_as_test": self.marks_as_test,
+        }
+```
+
+- [ ] **Step 2: Update `get_invite_codes` SELECT to include `marks_as_test`**
+
+Find `get_invite_codes` at L1106. Modify the SQL to pull the column and pass it through:
+
+```python
+    async def get_invite_codes(self, include_expired: bool = False) -> List[InviteCode]:
+        """List all invite codes."""
+        async with self.pool.acquire() as conn:
+            sql = """
+                SELECT c.code, c.created_by, u.username as created_by_username,
+                       c.created_at, c.expires_at,
+                       c.max_uses, c.use_count, c.is_active,
+                       COALESCE(c.marks_as_test, FALSE) as marks_as_test
+                FROM invite_codes c
+                LEFT JOIN users_v2 u ON c.created_by = u.id
+            """
+```
+
+Find the list comprehension that constructs `InviteCode(...)` objects and add the new kwarg:
+
+```python
+                InviteCode(
+                    code=row["code"],
+                    created_by=str(row["created_by"]),
+                    created_by_username=row["created_by_username"] or "unknown",
+                    created_at=row["created_at"].replace(tzinfo=timezone.utc) if row["created_at"] else None,
+                    expires_at=row["expires_at"].replace(tzinfo=timezone.utc) if row["expires_at"] else None,
+                    max_uses=row["max_uses"],
+                    use_count=row["use_count"],
+                    is_active=row["is_active"],
+                    marks_as_test=row["marks_as_test"],
+                )
+```
+
+- [ ] **Step 3: Add new `get_invite_code_details` method**
+
+Add a new method between `validate_invite_code` (around L1214) and `use_invite_code` that returns the row with `marks_as_test`. The register flow will call this to resolve the flag:
+
+```python
+    async def get_invite_code_details(self, code: str) -> Optional[dict]:
+        """
+        Look up an invite code's row including marks_as_test.
+
+        Returns None if the code does not exist. Does NOT validate expiry
+        or usage — use validate_invite_code for that. This is purely a
+        helper for the register flow to discover the test-seed flag.
+        """
+        async with self.pool.acquire() as conn:
+            row = await conn.fetchrow(
+                """
+                SELECT code, max_uses, use_count, is_active,
+                       COALESCE(marks_as_test, FALSE) as marks_as_test
+                FROM invite_codes
+                WHERE code = $1
+                """,
+                code,
+            )
+            if not row:
+                return None
+            return {
+                "code": row["code"],
+                "max_uses": row["max_uses"],
+                "use_count": row["use_count"],
+                "is_active": row["is_active"],
+                "marks_as_test": row["marks_as_test"],
+            }
+```
+
+- [ ] **Step 4: Verify with curl via admin panel endpoint**
+
+Assuming you have an admin token from a local dev user, hit the existing admin invites listing:
+
+```bash
+# Replace TOKEN with a valid admin JWT
+curl -s http://localhost:8000/api/admin/invites \
+  -H "Authorization: Bearer $TOKEN" | jq '.codes[0]'
+```
+
+Expected: the code printed by `jq` (the first one in the list) includes `"marks_as_test": false`.
+
+- [ ] **Step 5: Commit**
+
+```bash
+git add server/services/admin_service.py
+git commit -m "$(cat <<'EOF'
+feat(server): expose marks_as_test on InviteCode
+
+Adds the field to the dataclass, SELECT list in get_invite_codes,
+and a new get_invite_code_details helper that the register flow
+will use to discover whether an invite should flag new accounts
+as test accounts.
+
+Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+EOF
+)"
+```
+
+---
+
+### Task 4: Wire register flow to set `is_test_account` from invite
+
+When a user registers with an invite whose `marks_as_test=TRUE`, the new account is flagged. The plumbing lives in two places: the router reads the flag and passes it to the service; the service passes it to the store.
+
+**Files:**
+- Modify: `server/routers/auth.py` — `register` handler (L224–L320)
+- Modify: `server/services/auth_service.py` — `register` method (L98–L178)
+
+- [ ] **Step 1: Add `is_test_account` parameter to `auth_service.register`**
+
+In `server/services/auth_service.py` at L98, add the new parameter:
+
+```python
+    async def register(
+        self,
+        username: str,
+        password: str,
+        email: Optional[str] = None,
+        guest_id: Optional[str] = None,
+        is_test_account: bool = False,
+    ) -> RegistrationResult:
+```
+
+Update the docstring `Args:` block:
+
+```
+            is_test_account: Mark this user as a soak-harness test account.
+```
+
+Pass the value through to `create_user` at L146:
+
+```python
+        user = await self.user_store.create_user(
+            username=username,
+            password_hash=password_hash,
+            email=email,
+            role=UserRole.USER,
+            guest_id=guest_id,
+            verification_token=verification_token,
+            verification_expires=verification_expires,
+            is_test_account=is_test_account,
+        )
+```
+
+- [ ] **Step 2: Update the router to resolve `marks_as_test` and pass it through**
+
+In `server/routers/auth.py`, find the `register` handler at L224. Extend the existing invite-code validation block (around L248–L252) so it also fetches the invite details and computes `is_test_account`:
+
+```python
+    # --- Invite code validation ---
+    is_test_account = False
+    if has_invite:
+        if not _admin_service:
+            raise HTTPException(status_code=503, detail="Admin service not initialized")
+        if not await _admin_service.validate_invite_code(request_body.invite_code):
+            raise HTTPException(status_code=400, detail="Invalid or expired invite code")
+        # Check if this invite flags new accounts as test accounts
+        invite_details = await _admin_service.get_invite_code_details(request_body.invite_code)
+        if invite_details and invite_details.get("marks_as_test"):
+            is_test_account = True
+```
+
+Then pass it to `auth_service.register` at L276:
+
+```python
+    # --- Create the account ---
+    result = await auth_service.register(
+        username=request_body.username,
+        password=request_body.password,
+        email=request_body.email,
+        is_test_account=is_test_account,
+    )
+```
+
+- [ ] **Step 3: Flag the dev invite code for testing**
+
+Before we can test end-to-end locally, we need an invite code with `marks_as_test=TRUE` in the local dev DB. Run (once, manually):
+
+```bash
+# First, check if 5VC2MCCN exists locally (it probably doesn't — that's staging's code).
+# Create a local test invite code and flag it:
+psql -d golfgame <<'EOF'
+-- Create a local dev test-seed invite if not exists
+INSERT INTO invite_codes (code, created_by, expires_at, max_uses, is_active, marks_as_test)
+SELECT 'SOAKTEST', id, NOW() + INTERVAL '10 years', 100, TRUE, TRUE
+FROM users_v2 WHERE role = 'admin' LIMIT 1
+ON CONFLICT (code) DO UPDATE SET marks_as_test = TRUE;
+
+-- Verify
+SELECT code, max_uses, use_count, marks_as_test FROM invite_codes WHERE code = 'SOAKTEST';
+EOF
+```
+
+Expected: `marks_as_test | t` in the last row.
+
+- [ ] **Step 4: Verify register flow sets `is_test_account`**
+
+Restart the dev server, then run the following (with `PW` still set from Task 2, Step 7):
+
+```bash
+curl -sX POST http://localhost:8000/api/auth/register \
+  -H 'Content-Type: application/json' \
+  -d "{\"username\":\"soaktest_register1\",\"password\":\"$PW\",\"email\":\"soaktest_register1@example.com\",\"invite_code\":\"SOAKTEST\"}"
+
+# Verify in DB
+psql -d golfgame -c "SELECT username, is_test_account FROM users_v2 WHERE username = 'soaktest_register1';"
+```
+
+Expected: `is_test_account | t`.
+
+- [ ] **Step 5: Verify non-test invite does NOT flag new accounts**
+
+```bash
+# Create a non-test invite
+psql -d golfgame <<'EOF'
+INSERT INTO invite_codes (code, created_by, expires_at, max_uses, is_active, marks_as_test)
+SELECT 'NORMAL01', id, NOW() + INTERVAL '10 years', 10, TRUE, FALSE
+FROM users_v2 WHERE role = 'admin' LIMIT 1
+ON CONFLICT (code) DO UPDATE SET marks_as_test = FALSE;
+EOF
+
+curl -sX POST http://localhost:8000/api/auth/register \
+  -H 'Content-Type: application/json' \
+  -d "{\"username\":\"realuser_smoke1\",\"password\":\"$PW\",\"email\":\"realuser_smoke1@example.com\",\"invite_code\":\"NORMAL01\"}"
+
+psql -d golfgame -c "SELECT username, is_test_account FROM users_v2 WHERE username = 'realuser_smoke1';"
+```
+
+Expected: `is_test_account | f`.
+
+- [ ] **Step 6: Commit**
+
+```bash
+git add server/routers/auth.py server/services/auth_service.py
+git commit -m "$(cat <<'EOF'
+feat(server): register flow flags accounts from test-seed invites
+
+When a user registers with an invite_code whose marks_as_test=TRUE,
+their users_v2.is_test_account is set to TRUE. Normal invite codes
+and invite-less signups are unaffected.
+
+Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+EOF
+)"
+```
+
+---
+
+### Task 5: Stats filtering (`include_test` parameter)
+
+Thread an `include_test: bool = False` parameter through `get_leaderboard`, `get_player_rank`, and the corresponding router handlers. Default is `False` — real users never see soak traffic.
+
+**Files:**
+- Modify: `server/services/stats_service.py` — `get_leaderboard` (L169), `get_player_rank` (L249)
+- Modify: `server/routers/stats.py` — `get_leaderboard` route (L157), `get_player_rank` route (L227), `get_my_rank` route (L348)
+
+- [ ] **Step 1: Add `include_test` to `get_leaderboard` service method**
+
+In `server/services/stats_service.py` at L169:
+
+```python
+    async def get_leaderboard(
+        self,
+        metric: str = "wins",
+        limit: int = 50,
+        offset: int = 0,
+        include_test: bool = False,
+    ) -> List[LeaderboardEntry]:
+```
+
+Inside the method, find both SQL paths (materialized view and fallback). In the view path at L208, change the WHERE clause:
+
+```python
+            if view_exists:
+                # Use materialized view for performance
+                rows = await conn.fetch(f"""
+                    SELECT
+                        user_id, username, games_played, games_won,
+                        win_rate, avg_score, knockouts, best_win_streak,
+                        COALESCE(rating, 1500) as rating,
+                        ROW_NUMBER() OVER (ORDER BY {column} {direction}) as rank
+                    FROM leaderboard_overall
+                    WHERE ($3 OR NOT is_test_account)
+                    ORDER BY {column} {direction}
+                    LIMIT $1 OFFSET $2
+                """, limit, offset, include_test)
+```
+
+In the fallback path at L220, add the WHERE clause and parameter:
+
+```python
+            else:
+                # Fall back to direct query
+                rows = await conn.fetch(f"""
+                    SELECT
+                        s.user_id, u.username, s.games_played, s.games_won,
+                        ROUND(s.games_won::numeric / NULLIF(s.games_played, 0) * 100, 1) as win_rate,
+                        ROUND(s.total_points::numeric / NULLIF(s.total_rounds, 0), 1) as avg_score,
+                        s.knockouts, s.best_win_streak,
+                        COALESCE(s.rating, 1500) as rating,
+                        ROW_NUMBER() OVER (ORDER BY {column} {direction}) as rank
+                    FROM player_stats s
+                    JOIN users_v2 u ON s.user_id = u.id
+                    WHERE s.games_played >= 5
+                    AND u.deleted_at IS NULL
+                    AND (u.is_banned = false OR u.is_banned IS NULL)
+                    AND ($3 OR NOT COALESCE(u.is_test_account, FALSE))
+                    ORDER BY {column} {direction}
+                    LIMIT $1 OFFSET $2
+                """, limit, offset, include_test)
+```
+
+- [ ] **Step 2: Apply the same pattern to `get_player_rank`**
+
+In `server/services/stats_service.py` at L249:
+
+```python
+    async def get_player_rank(
+        self,
+        user_id: str,
+        metric: str = "wins",
+        include_test: bool = False,
+    ) -> Optional[int]:
+```
+
+Update both SQL paths to include the `include_test` filter. View path at L287:
+
+```python
+            if view_exists:
+                row = await conn.fetchrow(f"""
+                    SELECT rank FROM (
+                        SELECT user_id, ROW_NUMBER() OVER (ORDER BY {column} {direction}) as rank
+                        FROM leaderboard_overall
+                        WHERE ($2 OR NOT is_test_account)
+                    ) ranked
+                    WHERE user_id = $1
+                """, user_id, include_test)
+```
+
+Fallback path at L294:
+
+```python
+            else:
+                row = await conn.fetchrow(f"""
+                    SELECT rank FROM (
+                        SELECT s.user_id, ROW_NUMBER() OVER (ORDER BY {column} {direction}) as rank
+                        FROM player_stats s
+                        JOIN users_v2 u ON s.user_id = u.id
+                        WHERE s.games_played >= 5
+                        AND u.deleted_at IS NULL
+                        AND (u.is_banned = false OR u.is_banned IS NULL)
+                        AND ($2 OR NOT COALESCE(u.is_test_account, FALSE))
+                    ) ranked
+                    WHERE user_id = $1
+                """, user_id, include_test)
+```
+
+- [ ] **Step 3: Expose `include_test` as a query parameter on the leaderboard route**
+
+In `server/routers/stats.py` at L157:
+
+```python
+@router.get("/leaderboard", response_model=LeaderboardResponse)
+async def get_leaderboard(
+    metric: str = Query("wins", pattern="^(wins|win_rate|avg_score|knockouts|streak|rating)$"),
+    limit: int = Query(50, ge=1, le=100),
+    offset: int = Query(0, ge=0),
+    include_test: bool = Query(False, description="Include soak-harness test accounts"),
+    service: StatsService = Depends(get_stats_service_dep),
+):
+    """
+    Get leaderboard by metric.
+
+    Metrics:
+    - wins: Total games won
+    - win_rate: Win percentage (requires 5+ games)
+    - avg_score: Average points per round (lower is better)
+    - knockouts: Times going out first
+    - streak: Best win streak
+
+    Players must have 5+ games to appear on leaderboards.
+    By default, soak-harness test accounts are hidden.
+    """
+    entries = await service.get_leaderboard(metric, limit, offset, include_test)
+```
+
+- [ ] **Step 4: Same for `get_player_rank` and `get_my_rank` routes**
+
+At L227:
+
+```python
+@router.get("/players/{user_id}/rank", response_model=PlayerRankResponse)
+async def get_player_rank(
+    user_id: str,
+    metric: str = Query("wins", pattern="^(wins|win_rate|avg_score|knockouts|streak|rating)$"),
+    include_test: bool = Query(False),
+    service: StatsService = Depends(get_stats_service_dep),
+):
+    """Get player's rank on a leaderboard."""
+    rank = await service.get_player_rank(user_id, metric, include_test)
+```
+
+At L348:
+
+```python
+@router.get("/me/rank", response_model=PlayerRankResponse)
+async def get_my_rank(
+    metric: str = Query("wins", pattern="^(wins|win_rate|avg_score|knockouts|streak|rating)$"),
+    include_test: bool = Query(False),
+    user: User = Depends(require_user),
+    service: StatsService = Depends(get_stats_service_dep),
+):
+    """Get current user's rank on a leaderboard."""
+    rank = await service.get_player_rank(user.id, metric, include_test)
+```
+
+- [ ] **Step 5: Verify filtering works via curl**
+
+```bash
+# Mark a test user we registered earlier as having games played (synthetic)
+psql -d golfgame <<'EOF'
+INSERT INTO player_stats (user_id, games_played, games_won, total_points, total_rounds, rounds_won)
+SELECT id, 10, 8, 50, 30, 20 FROM users_v2 WHERE username = 'soaktest_register1'
+ON CONFLICT (user_id) DO UPDATE SET games_played = 10, games_won = 8;
+
+-- Refresh the matview so the test account shows up
+REFRESH MATERIALIZED VIEW leaderboard_overall;
+EOF
+
+# Default (include_test=false) should NOT include soaktest_register1
+curl -s "http://localhost:8000/api/stats/leaderboard?metric=wins" | jq '.entries[] | select(.username | startswith("soaktest_"))'
+
+# include_test=true should include soaktest_register1
+curl -s "http://localhost:8000/api/stats/leaderboard?metric=wins&include_test=true" | jq '.entries[] | select(.username | startswith("soaktest_"))'
+```
+
+Expected: first command returns nothing, second returns a JSON object for `soaktest_register1`.
+
+- [ ] **Step 6: Commit**
+
+```bash
+git add server/services/stats_service.py server/routers/stats.py
+git commit -m "$(cat <<'EOF'
+feat(server): stats queries support include_test filter
+
+Leaderboard and rank queries take an optional include_test param
+(default false). Real users never see soak-harness traffic unless
+they explicitly opt in via ?include_test=true.
+
+Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+EOF
+)"
+```
+
+---
+
+### Task 6: Admin service + route surfaces `is_test_account`
+
+`UserDetails` exposes the flag, `search_users` selects it, and the `list_users` admin route accepts an `include_test` query parameter.
+
+**Files:**
+- Modify: `server/services/admin_service.py` — `UserDetails` (L24–L58), `search_users` (L312–L382), `get_user` (L384–L428)
+- Modify: `server/routers/admin.py` — `list_users` route (L80–L107)
+
+- [ ] **Step 1: Add field to `UserDetails` dataclass**
+
+In `server/services/admin_service.py` at L24, add to the dataclass:
+
+```python
+@dataclass
+class UserDetails:
+    """Extended user info for admin view."""
+    id: str
+    username: str
+    email: Optional[str]
+    role: str
+    email_verified: bool
+    is_banned: bool
+    ban_reason: Optional[str]
+    force_password_reset: bool
+    created_at: datetime
+    last_login: Optional[datetime]
+    last_seen_at: Optional[datetime]
+    is_active: bool
+    games_played: int
+    games_won: int
+    is_test_account: bool = False
+```
+
+Update `to_dict` to include it:
+
+```python
+    def to_dict(self) -> dict:
+        return {
+            "id": self.id,
+            "username": self.username,
+            "email": self.email,
+            "role": self.role,
+            "email_verified": self.email_verified,
+            "is_banned": self.is_banned,
+            "ban_reason": self.ban_reason,
+            "force_password_reset": self.force_password_reset,
+            "created_at": self.created_at.isoformat() if self.created_at else None,
+            "last_login": self.last_login.isoformat() if self.last_login else None,
+            "last_seen_at": self.last_seen_at.isoformat() if self.last_seen_at else None,
+            "is_active": self.is_active,
+            "games_played": self.games_played,
+            "games_won": self.games_won,
+            "is_test_account": self.is_test_account,
+        }
+```
+
+- [ ] **Step 2: Update `search_users` to SELECT and filter on `is_test_account`**
+
+In `server/services/admin_service.py` at L312, add an `include_test` parameter to the signature and add the column to the SELECT:
+
+```python
+    async def search_users(
+        self,
+        query: str = "",
+        limit: int = 50,
+        offset: int = 0,
+        include_banned: bool = True,
+        include_deleted: bool = False,
+        include_test: bool = True,
+    ) -> List[UserDetails]:
+```
+
+Modify the SQL to pull `is_test_account`:
+
+```python
+            sql = """
+                SELECT u.id, u.username, u.email, u.role,
+                       u.email_verified, u.is_banned, u.ban_reason,
+                       u.force_password_reset, u.created_at, u.last_login,
+                       u.last_seen_at, u.is_active,
+                       COALESCE(u.is_test_account, FALSE) as is_test_account,
+                       COALESCE(s.games_played, 0) as games_played,
+                       COALESCE(s.games_won, 0) as games_won
+                FROM users_v2 u
+                LEFT JOIN player_stats s ON u.id = s.user_id
+                WHERE 1=1
+            """
+```
+
+After the existing `include_deleted` check, add:
+
+```python
+            if not include_test:
+                sql += " AND (u.is_test_account = false OR u.is_test_account IS NULL)"
+```
+
+Update the `UserDetails(...)` construction in the list comprehension to include `is_test_account=row["is_test_account"]`.
+
+- [ ] **Step 3: Update `get_user` (single-user lookup) similarly**
+
+In `server/services/admin_service.py` at L384, add `COALESCE(u.is_test_account, FALSE) as is_test_account` to the SELECT and `is_test_account=row["is_test_account"]` to the `UserDetails(...)` construction. The `get_user` method does NOT need the filter parameter — admins looking up individual users should always see them.
+
+- [ ] **Step 4: Add `include_test` to the admin `list_users` route**
+
+In `server/routers/admin.py` at L80:
+
+```python
+@router.get("/users")
+async def list_users(
+    query: str = "",
+    limit: int = 50,
+    offset: int = 0,
+    include_banned: bool = True,
+    include_deleted: bool = False,
+    include_test: bool = True,
+    admin: User = Depends(require_admin_v2),
+    service: AdminService = Depends(get_admin_service_dep),
+):
+    """
+    Search and list users.
+
+    Args:
+        query: Search by username or email.
+        limit: Maximum results to return.
+        offset: Results to skip.
+        include_banned: Include banned users.
+        include_deleted: Include soft-deleted users.
+        include_test: Include soak-harness test accounts (default true for admins).
+    """
+    users = await service.search_users(
+        query=query,
+        limit=limit,
+        offset=offset,
+        include_banned=include_banned,
+        include_deleted=include_deleted,
+        include_test=include_test,
+    )
+    return {"users": [u.to_dict() for u in users]}
+```
+
+Note: default is `True` for the admin path — admins should see everything by default. The client-side toggle will explicitly pass `false` when the admin wants to hide test accounts.
+
+- [ ] **Step 5: Verify via curl**
+
+```bash
+# Assuming admin token in $TOKEN env var
+curl -s "http://localhost:8000/api/admin/users?query=soaktest" \
+  -H "Authorization: Bearer $TOKEN" | jq '.users[] | {username, is_test_account}'
+
+curl -s "http://localhost:8000/api/admin/users?query=soaktest&include_test=false" \
+  -H "Authorization: Bearer $TOKEN" | jq '.users[]'
+```
+
+Expected: first returns users with `is_test_account: true`; second returns empty (test accounts filtered out).
+
+- [ ] **Step 6: Commit**
+
+```bash
+git add server/services/admin_service.py server/routers/admin.py
+git commit -m "$(cat <<'EOF'
+feat(server): admin users list surfaces is_test_account
+
+UserDetails carries the new column, search_users selects and
+optionally filters on it, and the /api/admin/users route accepts
+?include_test=false to hide soak-harness accounts.
+
+Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+EOF
+)"
+```
+
+---
+
+### Task 7: Admin panel UI — Test badge and filter toggle
+
+Add a visible `[Test]` badge on test accounts in the admin user list, a `[Test-seed]` indicator on invite codes that mark new accounts as test, and an "Include test accounts" checkbox next to the existing "Include banned" toggle.
+
+**Files:**
+- Modify: `client/admin.html` — add the new toggle near the existing `#include-banned` checkbox
+- Modify: `client/admin.js` — `loadUsers` (L305), `getStatusBadge` (L246), the invite codes renderer (L443)
+
+- [ ] **Step 1: Add the "Include test accounts" checkbox to admin.html**
+
+In `client/admin.html`, locate the existing `#include-banned` checkbox in the users tab filter bar:
+
+```bash
+grep -n "include-banned" client/admin.html
+```
+
+Add a sibling checkbox right after it. It is `checked` by default so the initial page load matches the server default (`include_test=true`) and the manual verification in Step 6, which starts with test accounts visible:
+
+```html
+<label>
+  <input type="checkbox" id="include-test" checked />
+  Include test accounts
+</label>
+```
+
+- [ ] **Step 2: Read the new checkbox in `loadUsers` and pass to getUsers**
+
+In `client/admin.js` at L305:
+
+```javascript
+async function loadUsers() {
+    try {
+        const query = document.getElementById('user-search').value;
+        const includeBanned = document.getElementById('include-banned').checked;
+        const includeTest = document.getElementById('include-test').checked;
+        const data = await getUsers(query, usersPage * PAGE_SIZE, includeBanned, includeTest);
+```
+
+Find `getUsers` at L70 and add the new parameter:
+
+```javascript
+async function getUsers(query = '', offset = 0, includeBanned = true, includeTest = true) {
+    const params = new URLSearchParams({
+        query,
+        limit: PAGE_SIZE,
+        offset,
+        include_banned: includeBanned,
+        include_test: includeTest,
+    });
+    return apiRequest(`/api/admin/users?${params}`);
+}
+```
+
+Note: the existing signature builds a URLSearchParams — check the actual code at L70 and match its style; the key change is adding `include_test: includeTest` to the params.
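+
+The serialization that `getUsers` relies on can be sketched in isolation. This is an illustration in TypeScript, not the code in `client/admin.js`: `buildUserQuery` is a hypothetical helper and the `PAGE_SIZE` value of 50 is assumed. It shows that booleans reach the server as the literal strings `true` and `false`.
+
+```typescript
+// Sketch only: buildUserQuery is a hypothetical helper, not the real getUsers.
+const PAGE_SIZE = 50; // assumed page size, for illustration
+
+function buildUserQuery(
+  query: string,
+  offset: number,
+  includeBanned: boolean,
+  includeTest: boolean,
+): string {
+  // URLSearchParams stores strings, so numbers and booleans are converted
+  // explicitly; booleans become the text "true" or "false".
+  const params = new URLSearchParams({
+    query,
+    limit: String(PAGE_SIZE),
+    offset: String(offset),
+    include_banned: String(includeBanned),
+    include_test: String(includeTest),
+  });
+  return `/api/admin/users?${params.toString()}`;
+}
+
+console.log(buildUserQuery('soaktest', 0, true, false));
+// prints "/api/admin/users?query=soaktest&limit=50&offset=0&include_banned=true&include_test=false"
+```
+
+Passing the flag explicitly on every request keeps the client independent of whatever default the route declares.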
+
+- [ ] **Step 3: Add a "Test" badge to the user table row**
+
+In `client/admin.js` at L314, modify the table row template to render a Test badge inline with the status badge:
+
+```javascript
+        data.users.forEach(user => {
+            const testBadge = user.is_test_account
+                ? '<span class="badge badge-info" title="Soak harness test account">Test</span>'
+                : '';
+            tbody.innerHTML += `
+                <tr>
+                    <td>${escapeHtml(user.username)} ${testBadge}</td>
+                    <td>${escapeHtml(user.email || '-')}</td>
+                    <td><span class="badge badge-${user.role === 'admin' ? 'info' : 'muted'}">${user.role}</span></td>
+                    <td>${getStatusBadge(user)}</td>
+                    <td>${user.games_played} (${user.games_won} wins)</td>
+                    <td>${formatDateShort(user.created_at)}</td>
+                    <td>
+                        <button class="btn btn-small" data-action="view-user" data-id="${user.id}">View</button>
+                    </td>
+                </tr>
+            `;
+        });
+```
+
+- [ ] **Step 4: Add Test-seed indicator to invite codes list**
+
+In `client/admin.js` around L443 (invite codes list renderer), find the row template so a `[Test-seed]` badge can be added when `invite.marks_as_test` is true:
+
+```bash
+grep -n "invite.is_active\|invite.code\|invites-tbody\|invites-table" client/admin.js | head
+```
+
+Once located, modify the row template to include:
+
+```javascript
+            const testSeedBadge = invite.marks_as_test
+                ? '<span class="badge badge-info" title="Creates test accounts">Test-seed</span>'
+                : '';
+            // Insert testSeedBadge into the invite code column, e.g.
+            // <td>${escapeHtml(invite.code)} ${testSeedBadge}</td>
+```
+
+- [ ] **Step 5: Wire the checkbox change event to reload users**
+
+Find where `#include-banned` has its `change` listener attached (grep for it in admin.js):
+
+```bash
+grep -n "include-banned" client/admin.js
+```
+
+Add a parallel listener for `#include-test` that calls `loadUsers()`:
+
+```javascript
+document.getElementById('include-test').addEventListener('change', () => {
+    usersPage = 0;
+    loadUsers();
+});
+```
+
+- [ ] **Step 6: Manual verification in browser**
+
+1. Open http://localhost:8000/admin.html
+2. Log in as admin
+3. Navigate to Users tab
+4. Search for "soaktest"
+5. Confirm the `[Test]` badge appears next to `soaktest_register1`
+6. Uncheck "Include test accounts" — the row should disappear
+7. Re-check it — the row should return
+8. Navigate to Invite Codes tab
+9. Confirm the `[Test-seed]` badge appears next to the `SOAKTEST` code
+
+- [ ] **Step 7: Commit**
+
+```bash
+git add client/admin.html client/admin.js
+git commit -m "$(cat <<'EOF'
+feat(admin): visible Test/Test-seed badges + filter toggle
+
+Users table shows [Test] next to soak-harness accounts, invite codes
+list shows [Test-seed] next to codes that flag new accounts as test,
+and a new "Include test accounts" checkbox lets admins hide bot
+traffic from the user list.
+
+Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+EOF
+)"
+```
+
+---
+
+### Task 8: Document the one-time staging setup step
+
+The staging invite code `5VC2MCCN` needs to be flagged as test-seed before the harness can run against staging. This is a manual one-liner; document it in a new bring-up doc.
+
+**Files:**
+- Create: `docs/soak-harness-bringup.md`
+
+- [ ] **Step 1: Create the bring-up doc**
+
+````bash
+cat > docs/soak-harness-bringup.md <<'EOF'
+# Soak Harness Bring-Up
+
+One-time setup steps before running `tests/soak` against an environment.
+
+## Prerequisites
+
+- An invite code exists with 16+ available uses
+- You have psql access to the target DB (or admin SQL access via some other means)
+
+## 1. Flag the invite code as test-seed
+
+Any account registered with a `marks_as_test=TRUE` invite code gets
+`users_v2.is_test_account=TRUE`, which keeps it out of real-user stats.
+
+### Staging
+
+Invite code: `5VC2MCCN` (16 uses, provisioned 2026-04-10).
+
+```sql
+UPDATE invite_codes SET marks_as_test = TRUE WHERE code = '5VC2MCCN';
+SELECT code, max_uses, use_count, marks_as_test FROM invite_codes WHERE code = '5VC2MCCN';
+```
+
+Expected: `marks_as_test | t`.
+
+### Local dev
+
+The dev DB already has a `SOAKTEST` invite created during Task 4 of
+the implementation plan. If you wiped the DB since, recreate it:
+
+```sql
+INSERT INTO invite_codes (code, created_by, expires_at, max_uses, is_active, marks_as_test)
+SELECT 'SOAKTEST', id, NOW() + INTERVAL '10 years', 100, TRUE, TRUE
+FROM users_v2 WHERE role = 'admin' LIMIT 1
+ON CONFLICT (code) DO UPDATE SET marks_as_test = TRUE;
+```
+
+## 2. Run the harness
+
+```bash
+cd tests/soak
+npm install
+npm run seed                                  # first run only, populates .env.stresstest
+TEST_URL=http://localhost:8000 npm run smoke  # 30s end-to-end check
+```
+
+For staging:
+
+```bash
+TEST_URL=https://staging.adlee.work npm run soak -- --scenario=populate
+```
+
+See `tests/soak/README.md` for the full flag reference.
+EOF
+````
+
+- [ ] **Step 2: Commit**
+
+```bash
+git add docs/soak-harness-bringup.md
+git commit -m "$(cat <<'EOF'
+docs: soak harness bring-up steps
+
+Documents the one-time UPDATE invite_codes SET marks_as_test = TRUE
+step required before running tests/soak against each environment,
+plus the local dev SOAKTEST invite recreation SQL.
+
+Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+EOF
+)"
+```
+
+---
+
+## Phase 2 — Harness scaffolding
+
+### Task 9: Create the `tests/soak/` package skeleton
+
+Bare minimum to get `tsx` running against an empty entry point. No behavior yet.
+
+**Files:**
+- Create: `tests/soak/package.json`
+- Create: `tests/soak/tsconfig.json`
+- Create: `tests/soak/.gitignore`
+- Create: `tests/soak/.env.stresstest.example`
+- Create: `tests/soak/README.md` (stub)
+- Create: `tests/soak/runner.ts` (stub that prints a placeholder message)
+
+- [ ] **Step 1: Create `tests/soak/package.json`**
+
+```json
+{
+  "name": "golf-soak",
+  "version": "0.1.0",
+  "private": true,
+  "description": "Multiplayer soak & UX test harness for Golf Card Game",
+  "scripts": {
+    "soak": "tsx runner.ts",
+    "soak:populate": "tsx runner.ts --scenario=populate",
+    "soak:stress": "tsx runner.ts --scenario=stress",
+    "seed": "tsx scripts/seed-accounts.ts",
+    "smoke": "bash scripts/smoke.sh",
+    "test": "vitest run"
+  },
+  "dependencies": {
+    "playwright-core": "^1.40.0",
+    "ws": "^8.16.0"
+  },
+  "devDependencies": {
+    "tsx": "^4.7.0",
+    "@types/ws": "^8.5.0",
+    "@types/node": "^20.10.0",
+    "typescript": "^5.3.0",
+    "vitest": "^1.2.0"
+  }
+}
+```
+
+- [ ] **Step 2: Create `tests/soak/tsconfig.json`**
+
+```json
+{
+  "compilerOptions": {
+    "target": "ES2022",
+    "module": "commonjs",
+    "moduleResolution": "node",
+    "strict": true,
+    "esModuleInterop": true,
+    "skipLibCheck": true,
+    "forceConsistentCasingInFileNames": true,
+    "resolveJsonModule": true,
+    "declaration": false,
+    "sourceMap": true,
+    "outDir": "./dist",
+    "rootDir": ".",
+    "baseUrl": ".",
+    "lib": ["ES2022", "DOM"],
+    "paths": {
+      "@soak/*": ["./*"],
+      "@bot/*": ["../e2e/bot/*"]
+    }
+  },
+  "include": ["**/*.ts"],
+  "exclude": ["node_modules", "dist", "artifacts"]
+}
+```
+
+- [ ] **Step 3: Create `tests/soak/.gitignore`**
+
+```
+node_modules/
+dist/
+artifacts/
+.env.stresstest
+*.log
+```
+
+- [ ] **Step 4: Create `tests/soak/.env.stresstest.example`**
+
+```
+# Soak harness account cache.
+# This file is AUTO-GENERATED on first run; do not edit by hand.
+# Format: SOAK_ACCOUNT_NN=username:password:token
+#
+# Example (delete before first real run):
+# SOAK_ACCOUNT_00=soak_00_a7bx:<generated-password>:<jwt-token>
+```
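+
+The `SOAK_ACCOUNT_NN=username:password:token` format above could be read back along these lines. This is a minimal sketch, not the SessionPool loader: `parseAccountLines` is a hypothetical name, the `soak_NN` key derivation is an assumption, and the parser assumes usernames and tokens never contain `:` (a generated password might).
+
+```typescript
+// Hypothetical parser for the SOAK_ACCOUNT_NN=username:password:token format.
+// Assumption: usernames and JWTs never contain ':'; passwords may.
+
+interface Account {
+  key: string;
+  username: string;
+  password: string;
+  token: string;
+}
+
+function parseAccountLines(text: string): Account[] {
+  const accounts: Account[] = [];
+  for (const raw of text.split('\n')) {
+    const line = raw.trim();
+    // Skip blanks and comments such as the header in the example file.
+    if (line === '' || line.startsWith('#')) continue;
+    const match = /^SOAK_ACCOUNT_(\d+)=(.*)$/.exec(line);
+    if (!match) continue;
+    const [, index, value] = match;
+    const first = value.indexOf(':');
+    const last = value.lastIndexOf(':');
+    // Two distinct separators are needed to recover three fields.
+    if (first === -1 || first === last) continue;
+    accounts.push({
+      key: `soak_${index}`, // assumed key scheme
+      username: value.slice(0, first),
+      password: value.slice(first + 1, last),
+      token: value.slice(last + 1),
+    });
+  }
+  return accounts;
+}
+
+// Made-up sample input; the credentials are placeholders.
+const sample = [
+  '# Soak harness account cache.',
+  'SOAK_ACCOUNT_00=soak_00_a7bx:p@ss:word:aaa.bbb.ccc',
+  'not a valid line',
+].join('\n');
+
+console.log(parseAccountLines(sample));
+```
+
+Splitting on the first and last colon rather than on every colon is what lets a password containing `:` survive the round trip.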
+
+- [ ] **Step 5: Create `tests/soak/README.md` (stub — expanded in Task 31)**
+
+````markdown
+# Golf Soak & UX Test Harness
+
+Runs 16 authenticated browser sessions across 4 rooms to populate
+staging scoreboards and stress-test multiplayer stability.
+
+**Spec:** `docs/superpowers/specs/2026-04-10-multiplayer-soak-test-design.md`
+**Bring-up:** `docs/soak-harness-bringup.md`
+
+## Quick start
+
+```bash
+npm install
+npm run seed                    # first run only
+TEST_URL=http://localhost:8000 npm run smoke
+```
+
+Full documentation arrives with Task 31.
+````
+
+- [ ] **Step 6: Create `tests/soak/runner.ts` as a placeholder**
+
+```typescript
+#!/usr/bin/env tsx
+/**
+ * Golf Soak Harness — entry point.
+ *
+ * Placeholder. Full runner lands in Task 17.
+ */
+
+async function main(): Promise<void> {
+  console.log('golf-soak runner (placeholder)');
+  console.log('Full implementation lands in Task 17 of the plan.');
+}
+
+main().catch((err) => {
+  console.error(err);
+  process.exit(1);
+});
+```
+
+- [ ] **Step 7: Install deps and verify runner executes**
+
+```bash
+cd tests/soak
+npm install
+npx tsx runner.ts
+```
+
+Expected output:
+
+```
+golf-soak runner (placeholder)
+Full implementation lands in Task 17 of the plan.
+```
+
+- [ ] **Step 8: Commit**
+
+```bash
+git add tests/soak/package.json tests/soak/package-lock.json tests/soak/tsconfig.json tests/soak/.gitignore tests/soak/.env.stresstest.example tests/soak/README.md tests/soak/runner.ts
+git commit -m "$(cat <<'EOF'
+feat(soak): scaffold tests/soak package
+
+Placeholder runner, tsconfig with @bot alias to tests/e2e/bot,
+gitignored .env.stresstest + artifacts. Real behavior follows
+in Task 10 onward.
+
+Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+EOF
+)"
+```
+
+---
+
+### Task 10: Core types and `Deferred` helper
+
+Pure TypeScript with Vitest tests. No browser, no network. Establishes the type surface the rest of the harness will target.
+
+**Files:**
+- Create: `tests/soak/core/types.ts`
+- Create: `tests/soak/core/deferred.ts`
+- Create: `tests/soak/tests/deferred.test.ts`
+
+- [ ] **Step 1: Write the failing test for `Deferred`**
+
+Create `tests/soak/tests/deferred.test.ts`:
+
+```typescript
+import { describe, it, expect } from 'vitest';
+import { deferred } from '../core/deferred';
+
+describe('deferred', () => {
+  it('resolves with the given value', async () => {
+    const d = deferred<string>();
+    d.resolve('hello');
+    await expect(d.promise).resolves.toBe('hello');
+  });
+
+  it('rejects with the given error', async () => {
+    const d = deferred<string>();
+    const err = new Error('boom');
+    d.reject(err);
+    await expect(d.promise).rejects.toBe(err);
+  });
+
+  it('ignores second resolve calls', async () => {
+    const d = deferred<number>();
+    d.resolve(1);
+    d.resolve(2);
+    await expect(d.promise).resolves.toBe(1);
+  });
+});
+```
+
+- [ ] **Step 2: Run the test to verify it fails**
+
+```bash
+cd tests/soak
+npx vitest run tests/deferred.test.ts
+```
+
+Expected: FAIL — module `../core/deferred` does not exist.
+
+- [ ] **Step 3: Implement `deferred`**
+
+Create `tests/soak/core/deferred.ts`:
+
+```typescript
+/**
+ * Promise deferred primitive — lets external code resolve or reject
+ * a promise. Used by RoomCoordinator for host→joiners handoff.
+ */
+
+export interface Deferred<T> {
+  promise: Promise<T>;
+  resolve(value: T): void;
+  reject(error: unknown): void;
+}
+
+export function deferred<T>(): Deferred<T> {
+  let resolve!: (value: T) => void;
+  let reject!: (error: unknown) => void;
+  const promise = new Promise<T>((res, rej) => {
+    resolve = res;
+    reject = rej;
+  });
+  return { promise, resolve, reject };
+}
+```
+
+- [ ] **Step 4: Run tests to verify they pass**
+
+```bash
+npx vitest run tests/deferred.test.ts
+```
+
+Expected: 3 passed.
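+
+To make the intended use concrete, here is a small sketch of the host-to-joiners handoff that `deferred` enables. It inlines a copy of the helper so it runs on its own; `demoHandoff` and the joiner names are hypothetical and are not part of the harness.
+
+```typescript
+// Inline copy of the deferred() helper so the sketch is self-contained.
+interface Deferred<T> {
+  promise: Promise<T>;
+  resolve(value: T): void;
+  reject(error: unknown): void;
+}
+
+function deferred<T>(): Deferred<T> {
+  let resolve!: (value: T) => void;
+  let reject!: (error: unknown) => void;
+  const promise = new Promise<T>((res, rej) => {
+    resolve = res;
+    reject = rej;
+  });
+  return { promise, resolve, reject };
+}
+
+// Hypothetical handoff: two joiners wait for a room code the host produces later.
+async function demoHandoff(): Promise<string[]> {
+  const roomCode = deferred<string>();
+  const joiners = ['joiner-1', 'joiner-2'].map(async (name) => {
+    const code = await roomCode.promise; // waits until the host resolves
+    return `${name} joined ${code}`;
+  });
+  roomCode.resolve('ABCD'); // host announces the code
+  roomCode.resolve('ZZZZ'); // ignored: a promise settles only once
+  return Promise.all(joiners);
+}
+
+demoHandoff().then((lines) => console.log(lines));
+```
+
+Any number of awaiters can share one `promise`, which is why a single `Deferred` per room is enough for all joiners.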
+
+- [ ] **Step 5: Create `core/types.ts` with the scenario interfaces**
+
+```typescript
+/**
+ * Core type definitions for the soak harness.
+ *
+ * Contracts here are consumed by runner.ts, SessionPool, scenarios,
+ * and the dashboard. Keep this file small and stable.
+ */
+
+import type { BrowserContext, Page } from 'playwright-core';
+import type { GolfBot } from '../../e2e/bot/golf-bot';
+
+// =============================================================================
+// Accounts & sessions
+// =============================================================================
+
+export interface Account {
+  /** Stable key used in logs, e.g. "soak_00". */
+  key: string;
+  username: string;
+  password: string;
+  /** JWT returned from /api/auth/login, may be refreshed by SessionPool. */
+  token: string;
+}
+
+export interface Session {
+  account: Account;
+  context: BrowserContext;
+  page: Page;
+  bot: GolfBot;
+  /** Convenience mirror of account.key. */
+  key: string;
+}
+
+// =============================================================================
+// Scenarios
+// =============================================================================
+
+export interface ScenarioNeeds {
+  /** Total number of authenticated sessions the scenario requires. */
+  accounts: number;
+  /** How many rooms to partition sessions into (default: 1). */
+  rooms?: number;
+  /** CPUs to add per room (default: 0). */
+  cpusPerRoom?: number;
+}
+
+/** Free-form per-scenario config merged with CLI flags. */
+export type ScenarioConfig = Record<string, unknown>;
+
+export interface ScenarioError {
+  room: string;
+  reason: string;
+  detail?: string;
+  timestamp: number;
+}
+
+export interface ScenarioResult {
+  gamesCompleted: number;
+  errors: ScenarioError[];
+  durationMs: number;
+  customMetrics?: Record<string, number>;
+}
+
+export interface ScenarioContext {
+  /** Merged config: CLI flags → env → scenario defaults → runner defaults. */
+  config: ScenarioConfig;
+  /** Pre-authenticated sessions; ordered. */
+  sessions: Session[];
+  coordinator: RoomCoordinatorApi;
+  dashboard: DashboardReporter;
+  logger: Logger;
+  signal: AbortSignal;
+  /** Reset the per-room watchdog. Call at each progress point. */
+  heartbeat(roomId: string): void;
+}
+
+export interface Scenario {
+  name: string;
+  description: string;
+  defaultConfig: ScenarioConfig;
+  needs: ScenarioNeeds;
+  run(ctx: ScenarioContext): Promise<ScenarioResult>;
+}
+
+// =============================================================================
+// Room coordination
+// =============================================================================
+
+export interface RoomCoordinatorApi {
+  announce(roomId: string, code: string): void;
+  await(roomId: string, timeoutMs?: number): Promise<string>;
+}
+
+// =============================================================================
+// Dashboard reporter
+// =============================================================================
+
+export interface RoomState {
+  phase?: string;
+  currentPlayer?: string;
+  hole?: number;
+  totalHoles?: number;
+  game?: number;
+  totalGames?: number;
+  moves?: number;
+  players?: Array<{ key: string; score: number | null; isActive: boolean }>;
+  message?: string;
+}
+
+export interface DashboardReporter {
+  update(roomId: string, state: Partial<RoomState>): void;
+  log(level: 'info' | 'warn' | 'error', msg: string, meta?: object): void;
+  incrementMetric(name: string, by?: number): void;
+}
+
+// =============================================================================
+// Logger
+// =============================================================================
+
+export type LogLevel = 'debug' | 'info' | 'warn' | 'error';
+
+export interface Logger {
+  debug(msg: string, meta?: object): void;
+  info(msg: string, meta?: object): void;
+  warn(msg: string, meta?: object): void;
+  error(msg: string, meta?: object): void;
+  child(meta: object): Logger;
+}
+```
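+
+`ScenarioNeeds` describes how many sessions a scenario wants and how many rooms to split them into. One way that split might look is sketched below. `partitionIntoRooms` is a hypothetical helper, and both the even-division requirement and the "first member is the host" convention are assumptions made for illustration; the plan's own partitioning may differ.
+
+```typescript
+// Hypothetical helper: split an ordered list into `rooms` contiguous groups.
+function partitionIntoRooms<T>(items: T[], rooms: number): T[][] {
+  if (!Number.isInteger(rooms) || rooms < 1) {
+    throw new Error(`rooms must be a positive integer, got ${rooms}`);
+  }
+  // Assumption: sessions must divide evenly so every room has the same size.
+  if (items.length % rooms !== 0) {
+    throw new Error(`${items.length} sessions do not divide evenly into ${rooms} rooms`);
+  }
+  const perRoom = items.length / rooms;
+  const groups: T[][] = [];
+  for (let i = 0; i < rooms; i++) {
+    groups.push(items.slice(i * perRoom, (i + 1) * perRoom));
+  }
+  return groups;
+}
+
+// 16 session keys split across 4 rooms; index 0 of each group could act as host.
+const keys = Array.from({ length: 16 }, (_, i) => `soak_${String(i).padStart(2, '0')}`);
+const grouped = partitionIntoRooms(keys, 4);
+console.log(grouped.map((g) => g[0]));
+// prints [ 'soak_00', 'soak_04', 'soak_08', 'soak_12' ]
+```
+
+Contiguous slices keep the account-to-room mapping stable between runs as long as `sessions` stays ordered.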
+
+- [ ] **Step 6: Verify tsx still parses the runner**
+
+```bash
+cd tests/soak
+npx tsx runner.ts
+```
+
+Expected: still prints the placeholder output. The new `core/` files are not imported yet, and `tsx` strips types without type-checking, so this step only confirms that the runner still executes.
+
+- [ ] **Step 7: Commit**
+
+```bash
+git add tests/soak/core/deferred.ts tests/soak/core/types.ts tests/soak/tests/deferred.test.ts
+git commit -m "$(cat <<'EOF'
+feat(soak): core types + Deferred primitive
+
+Establishes the Scenario/Session/Logger/DashboardReporter contracts
+the rest of the harness builds on. Deferred is the building block
+for RoomCoordinator's host→joiners handoff.
+
+Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+EOF
+)"
+```
+
+---
+
+### Task 11: RoomCoordinator with tests
+
+Tiny abstraction over `Deferred` keyed by room ID, with a timeout on `await`.
+
+**Files:**
+- Create: `tests/soak/core/room-coordinator.ts`
+- Create: `tests/soak/tests/room-coordinator.test.ts`
+
+- [ ] **Step 1: Write failing tests**
+
+```typescript
+// tests/soak/tests/room-coordinator.test.ts
+import { describe, it, expect } from 'vitest';
+import { RoomCoordinator } from '../core/room-coordinator';
+
+describe('RoomCoordinator', () => {
+  it('resolves await with the announced code (announce then await)', async () => {
+    const rc = new RoomCoordinator();
+    rc.announce('room-1', 'ABCD');
+    await expect(rc.await('room-1')).resolves.toBe('ABCD');
+  });
+
+  it('resolves await with the announced code (await then announce)', async () => {
+    const rc = new RoomCoordinator();
+    const p = rc.await('room-2');
+    rc.announce('room-2', 'WXYZ');
+    await expect(p).resolves.toBe('WXYZ');
+  });
+
+  it('rejects await after timeout if not announced', async () => {
+    const rc = new RoomCoordinator();
+    await expect(rc.await('room-3', 50)).rejects.toThrow(/timed out/i);
+  });
+
+  it('isolates rooms — announcing room-A does not unblock room-B', async () => {
+    const rc = new RoomCoordinator();
+    const pB = rc.await('room-B', 100);
+    rc.announce('room-A', 'A-CODE');
+    await expect(pB).rejects.toThrow(/timed out/i);
+  });
+});
+```
+
+- [ ] **Step 2: Run tests to verify they fail**
+
+```bash
+npx vitest run tests/room-coordinator.test.ts
+```
+
+Expected: FAIL — module not found.
+
+- [ ] **Step 3: Implement `RoomCoordinator`**
+
+```typescript
+// tests/soak/core/room-coordinator.ts
+import { deferred, Deferred } from './deferred';
+import type { RoomCoordinatorApi } from './types';
+
+export class RoomCoordinator implements RoomCoordinatorApi {
+  private rooms = new Map<string, Deferred<string>>();
+
+  announce(roomId: string, code: string): void {
+    this.getOrCreate(roomId).resolve(code);
+  }
+
+  async await(roomId: string, timeoutMs: number = 30_000): Promise<string> {
+    const d = this.getOrCreate(roomId);
+    let timer: NodeJS.Timeout | undefined;
+    const timeout = new Promise<never>((_, reject) => {
+      timer = setTimeout(() => {
+        reject(new Error(`RoomCoordinator: room "${roomId}" timed out after ${timeoutMs}ms`));
+      }, timeoutMs);
+    });
+    try {
+      return await Promise.race([d.promise, timeout]);
+    } finally {
+      if (timer) clearTimeout(timer);
+    }
+  }
+
+  private getOrCreate(roomId: string): Deferred<string> {
+    let d = this.rooms.get(roomId);
+    if (!d) {
+      d = deferred<string>();
+      this.rooms.set(roomId, d);
+    }
+    return d;
+  }
+}
+```
+
+- [ ] **Step 4: Verify tests pass**
+
+```bash
+npx vitest run tests/room-coordinator.test.ts
+```
+
+Expected: 4 passed.
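+
+The timeout in `await` is a race between the room's promise and a timer that is always cleared afterwards. The same pattern can be sketched as a standalone helper. `withTimeout` is a hypothetical name that does not appear in the harness, and it uses `ReturnType<typeof setTimeout>` instead of `NodeJS.Timeout` so the sketch does not depend on Node typings.
+
+```typescript
+// Hypothetical generic helper; RoomCoordinator inlines the same pattern.
+async function withTimeout<T>(work: Promise<T>, timeoutMs: number, label: string): Promise<T> {
+  let timer: ReturnType<typeof setTimeout> | undefined;
+  const timeout = new Promise<never>((_, reject) => {
+    timer = setTimeout(() => {
+      reject(new Error(`${label} timed out after ${timeoutMs}ms`));
+    }, timeoutMs);
+  });
+  try {
+    // Whichever promise settles first decides the outcome.
+    return await Promise.race([work, timeout]);
+  } finally {
+    // Always clear the timer so a fast result does not leave it pending.
+    if (timer !== undefined) clearTimeout(timer);
+  }
+}
+
+async function demo(): Promise<void> {
+  const fast = await withTimeout(Promise.resolve('ABCD'), 1000, 'room-1');
+  console.log(fast);
+  const never = new Promise<string>(() => {});
+  try {
+    await withTimeout(never, 20, 'room-2');
+  } catch (err) {
+    console.log((err as Error).message);
+  }
+}
+
+void demo();
+```
+
+Clearing the timer in `finally` matters in a long soak run: without it, every successful `await` would leave a live 30-second timer behind.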
+
+- [ ] **Step 5: Commit**
+
+```bash
+git add tests/soak/core/room-coordinator.ts tests/soak/tests/room-coordinator.test.ts
+git commit -m "$(cat <<'EOF'
+feat(soak): RoomCoordinator with host→joiners handoff
+
+Lazy Deferred per roomId with a timeout on await. Lets concurrent
+joiner sessions block until their host announces the room code
+without polling or page scraping.
+
+Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+EOF
+)"
+```
+
+---
+
+### Task 12: Structured JSONL logger
+
+Single module, no transport, writes to `process.stdout`. Supports child loggers with bound metadata (so scenarios can emit logs with `room` / `game` context without repeating it).
+
+**Files:**
+- Create: `tests/soak/core/logger.ts`
+- Create: `tests/soak/tests/logger.test.ts`
+
+- [ ] **Step 1: Write failing tests**
+
+```typescript
+// tests/soak/tests/logger.test.ts
+import { describe, it, expect, beforeEach } from 'vitest';
+import { createLogger } from '../core/logger';
+
+describe('logger', () => {
+  let writes: string[];
+  let write: (s: string) => boolean;
+
+  beforeEach(() => {
+    writes = [];
+    write = (s: string) => {
+      writes.push(s);
+      return true;
+    };
+  });
+
+  it('emits a JSON line per call with level and msg', () => {
+    const log = createLogger({ runId: 'r1', write });
+    log.info('hello');
+    expect(writes).toHaveLength(1);
+    const parsed = JSON.parse(writes[0]);
+    expect(parsed.level).toBe('info');
+    expect(parsed.msg).toBe('hello');
+    expect(parsed.runId).toBe('r1');
+    expect(parsed.timestamp).toBeTypeOf('string');
+  });
+
+  it('merges meta into the log line', () => {
+    const log = createLogger({ runId: 'r1', write });
+    log.warn('slow', { turnMs: 3000 });
+    const parsed = JSON.parse(writes[0]);
+    expect(parsed.turnMs).toBe(3000);
+    expect(parsed.level).toBe('warn');
+  });
+
+  it('child logger inherits parent meta', () => {
+    const log = createLogger({ runId: 'r1', write });
+    const roomLog = log.child({ room: 'room-1' });
+    roomLog.info('game_start');
+    const parsed = JSON.parse(writes[0]);
+    expect(parsed.room).toBe('room-1');
+    expect(parsed.runId).toBe('r1');
+  });
+
+  it('respects minimum level', () => {
+    const log = createLogger({ runId: 'r1', write, minLevel: 'warn' });
+    log.debug('nope');
+    log.info('nope');
+    log.warn('yes');
+    log.error('yes');
+    expect(writes).toHaveLength(2);
+  });
+});
+```
+
+- [ ] **Step 2: Run tests to verify they fail**
+
+```bash
+npx vitest run tests/logger.test.ts
+```
+
+Expected: FAIL — module not found.
+
+- [ ] **Step 3: Implement the logger**
+
+```typescript
+// tests/soak/core/logger.ts
+import type { Logger, LogLevel } from './types';
+
+const LEVEL_ORDER: Record<LogLevel, number> = {
+  debug: 0,
+  info: 1,
+  warn: 2,
+  error: 3,
+};
+
+export interface LoggerOptions {
+  runId: string;
+  minLevel?: LogLevel;
+  /** Defaults to process.stdout.write bound to stdout. Override for tests. */
+  write?: (line: string) => boolean;
+  baseMeta?: Record<string, unknown>;
+}
+
+export function createLogger(opts: LoggerOptions): Logger {
+  const minLevel = opts.minLevel ?? 'info';
+  const write = opts.write ?? ((s: string) => process.stdout.write(s));
+  const baseMeta = opts.baseMeta ?? {};
+
+  function emit(level: LogLevel, msg: string, meta?: object): void {
+    if (LEVEL_ORDER[level] < LEVEL_ORDER[minLevel]) return;
+    const line = JSON.stringify({
+      timestamp: new Date().toISOString(),
+      level,
+      msg,
+      runId: opts.runId,
+      ...baseMeta,
+      ...(meta ?? {}),
+    }) + '\n';
+    write(line);
+  }
+
+  const logger: Logger = {
+    debug: (msg, meta) => emit('debug', msg, meta),
+    info: (msg, meta) => emit('info', msg, meta),
+    warn: (msg, meta) => emit('warn', msg, meta),
+    error: (msg, meta) => emit('error', msg, meta),
+    child: (meta) =>
+      createLogger({
+        runId: opts.runId,
+        minLevel,
+        write,
+        baseMeta: { ...baseMeta, ...meta },
+      }),
+  };
+
+  return logger;
+}
+```
+
+- [ ] **Step 4: Verify tests pass**
+
+```bash
+npx vitest run tests/logger.test.ts
+```
+
+Expected: 4 passed.
+
+- [ ] **Step 5: Commit**
+
+```bash
+git add tests/soak/core/logger.ts tests/soak/tests/logger.test.ts
+git commit -m "$(cat <<'EOF'
+feat(soak): structured JSONL logger with child contexts
+
+Single file, no transport, writes one JSON line per call to stdout.
+Child loggers inherit parent meta so scenarios can bind room/game
+context once and forget about it.
+
+Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+EOF
+)"
+```
+
+---
+
+## Phase 3 — SessionPool and seeding
+
+### Task 13: SessionPool with HTTP registration and localStorage warm-start
+
+This is the biggest single module. It owns browser context lifecycle, seeds accounts on cold start, logs in on warm start, and exposes a simple `acquire()` API to scenarios.
+
+**Files:**
+- Create: `tests/soak/core/session-pool.ts`
+
+Testing: manual via `scripts/seed-accounts.ts` in Task 14 and the first real runner invocation in Task 18. No Vitest test for this: it's an integration module that needs a real browser.
+
+- [ ] **Step 1: Create `tests/soak/core/session-pool.ts` — imports and types**
+
+```typescript
+// tests/soak/core/session-pool.ts
+import * as fs from 'fs';
+import * as path from 'path';
+import {
+  Browser,
+  BrowserContext,
+  chromium,
+} from 'playwright-core';
+import { GolfBot } from '../../e2e/bot/golf-bot';
+import type { Account, Session, Logger } from './types';
+
+export interface SeedOptions {
+  /** Full base URL of the target server, e.g. https://staging.adlee.work. */
+  targetUrl: string;
+  /** Invite code to pass to /api/auth/register. */
+  inviteCode: string;
+  /** Number of accounts to create. */
+  count: number;
+}
+
+export interface SessionPoolOptions {
+  targetUrl: string;
+  inviteCode: string;
+  credFile: string;   // absolute path to .env.stresstest
+  logger: Logger;
+  /** Optional override for the browser to attach contexts to. If absent, SessionPool launches its own. */
+  browser?: Browser;
+  /** Passed through to browser.newContext. Useful for viewport overrides in tests. */
+  contextOptions?: Parameters<Browser['newContext']>[0];
+}
+```
+
+- [ ] **Step 2: Implement cred-file read/write**
+
+Append to `session-pool.ts`:
+
+```typescript
+function readCredFile(filePath: string): Account[] | null {
+  if (!fs.existsSync(filePath)) return null;
+  const content = fs.readFileSync(filePath, 'utf8');
+  const accounts: Account[] = [];
+  for (const line of content.split('\n')) {
+    const trimmed = line.trim();
+    if (!trimmed || trimmed.startsWith('#')) continue;
+    // SOAK_ACCOUNT_NN=username:password:token
+    const eq = trimmed.indexOf('=');
+    if (eq === -1) continue;
+    const key = trimmed.slice(0, eq);
+    const value = trimmed.slice(eq + 1);
+    const m = key.match(/^SOAK_ACCOUNT_(\d+)$/);
+    if (!m) continue;
+    const [username, password, token] = value.split(':');
+    if (!username || !password || !token) continue;
+    const idx = parseInt(m[1], 10);
+    accounts.push({
+      key: `soak_${String(idx).padStart(2, '0')}`,
+      username,
+      password,
+      token,
+    });
+  }
+  return accounts.length > 0 ? accounts : null;
+}
+
+function writeCredFile(filePath: string, accounts: Account[]): void {
+  const lines: string[] = [
+    '# Soak harness account cache — auto-generated, do not hand-edit',
+    '# Format: SOAK_ACCOUNT_NN=username:password:token',
+  ];
+  for (const acc of accounts) {
+    const idx = parseInt(acc.key.replace('soak_', ''), 10);
+    const key = `SOAK_ACCOUNT_${String(idx).padStart(2, '0')}`;
+    lines.push(`${key}=${acc.username}:${acc.password}:${acc.token}`);
+  }
+  fs.writeFileSync(filePath, lines.join('\n') + '\n', { mode: 0o600 });
+}
+```
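+
+For reference, a minimal standalone sketch of the line format these two helpers agree on. `formatCredLine` and `parseCredLine` are hypothetical names used only for illustration (the module above keeps this logic inline), and the sample credentials are made up.
+
+```typescript
+// Illustrative only: mirrors the SOAK_ACCOUNT_NN=username:password:token format.
+interface CredAccount {
+  key: string;
+  username: string;
+  password: string;
+  token: string;
+}
+
+// Hypothetical helper: serialise one account to a cred-file line.
+function formatCredLine(acc: CredAccount): string {
+  const idx = parseInt(acc.key.replace('soak_', ''), 10);
+  const envKey = `SOAK_ACCOUNT_${String(idx).padStart(2, '0')}`;
+  return `${envKey}=${acc.username}:${acc.password}:${acc.token}`;
+}
+
+// Hypothetical helper: parse one line back, or return null if it is not an account line.
+function parseCredLine(line: string): CredAccount | null {
+  const m = line.trim().match(/^SOAK_ACCOUNT_(\d+)=([^:]+):([^:]+):([^:]+)$/);
+  if (!m) return null;
+  const idx = parseInt(m[1], 10);
+  return {
+    key: `soak_${String(idx).padStart(2, '0')}`,
+    username: m[2],
+    password: m[3],
+    token: m[4],
+  };
+}
+
+const sample: CredAccount = {
+  key: 'soak_03',
+  username: 'soak_03_ab12',
+  password: 'not-a-real-password!',
+  token: 'header.payload.signature',
+};
+const line = formatCredLine(sample);
+const roundTripped = parseCredLine(line);
+console.log(line);
+console.log(roundTripped);
+```
+
+Because `:` is the field separator, none of the three fields may contain a colon. The usernames and passwords generated in Step 3 satisfy this, as does a standard dot-separated JWT.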
+
+- [ ] **Step 3: Implement the HTTP register call**
+
+```typescript
+interface RegisterResponse {
+  user: { id: string; username: string };
+  token: string;
+  expires_at: string;
+}
+
+async function registerAccount(
+  targetUrl: string,
+  username: string,
+  password: string,
+  email: string,
+  inviteCode: string,
+): Promise<string> {
+  const res = await fetch(`${targetUrl}/api/auth/register`, {
+    method: 'POST',
+    headers: { 'Content-Type': 'application/json' },
+    body: JSON.stringify({ username, password, email, invite_code: inviteCode }),
+  });
+  if (!res.ok) {
+    const body = await res.text().catch(() => '<no body>');
+    throw new Error(`register failed: ${res.status} ${body}`);
+  }
+  const data = (await res.json()) as RegisterResponse;
+  if (!data.token) {
+    throw new Error(`register returned no token: ${JSON.stringify(data)}`);
+  }
+  return data.token;
+}
+
+async function loginAccount(
+  targetUrl: string,
+  username: string,
+  password: string,
+): Promise<string> {
+  const res = await fetch(`${targetUrl}/api/auth/login`, {
+    method: 'POST',
+    headers: { 'Content-Type': 'application/json' },
+    body: JSON.stringify({ username, password }),
+  });
+  if (!res.ok) {
+    const body = await res.text().catch(() => '<no body>');
+    throw new Error(`login failed: ${res.status} ${body}`);
+  }
+  const data = (await res.json()) as RegisterResponse;
+  if (!data.token) {
+    throw new Error(`login returned no token: ${JSON.stringify(data)}`);
+  }
+  return data.token;
+}
+
+function randomSuffix(): string {
+  return Math.random().toString(36).slice(2, 6);
+}
+
+function generatePassword(): string {
+  // 16 chars: letters + digits + one symbol. Meets 8-char minimum from auth_service.
+  // Split across halves so repo secret-scanners don't flag the string as base64
+  const lower = 'abcdefghijkm' + 'npqrstuvwxyz'; // pragma: allowlist secret
+  const upper = 'ABCDEFGHJKLM' + 'NPQRSTUVWXYZ'; // pragma: allowlist secret
+  const digits = '23456789';
+  const chars = lower + upper + digits;
+  let out = '';
+  for (let i = 0; i < 15; i++) {
+    out += chars[Math.floor(Math.random() * chars.length)];
+  }
+  return out + '!';
+}
+```
+
+- [ ] **Step 4: Implement the `SessionPool` class**
+
+```typescript
+export class SessionPool {
+  private accounts: Account[] = [];
+  private ownedBrowser: Browser | null = null;
+  private browser: Browser | null;
+  private activeSessions: Session[] = [];
+
+  constructor(private opts: SessionPoolOptions) {
+    this.browser = opts.browser ?? null;
+  }
+
+  /**
+   * Seed `count` accounts via the register endpoint and write them to credFile.
+   * Safe to call multiple times — skips accounts already in the file.
+   */
+  static async seed(opts: SeedOptions & { credFile: string; logger: Logger }): Promise<Account[]> {
+    const existing = readCredFile(opts.credFile) ?? [];
+    const existingKeys = new Set(existing.map((a) => a.key));
+    const created: Account[] = [...existing];
+
+    for (let i = 0; i < opts.count; i++) {
+      const key = `soak_${String(i).padStart(2, '0')}`;
+      if (existingKeys.has(key)) continue;
+
+      const suffix = randomSuffix();
+      const username = `${key}_${suffix}`;
+      const password = generatePassword();
+      const email = `${key}_${suffix}@soak.test`;
+
+      opts.logger.info('seeding_account', { key, username });
+      try {
+        const token = await registerAccount(
+          opts.targetUrl,
+          username,
+          password,
+          email,
+          opts.inviteCode,
+        );
+        created.push({ key, username, password, token });
+        writeCredFile(opts.credFile, created);
+      } catch (err) {
+        opts.logger.error('seed_failed', {
+          key,
+          error: err instanceof Error ? err.message : String(err),
+        });
+        throw err;
+      }
+    }
+    return created;
+  }
+
+  /**
+   * Load accounts from credFile, auto-seeding if the file is missing or
+   * holds fewer than `desiredCount` accounts.
+   */
+  async ensureAccounts(desiredCount: number): Promise<Account[]> {
+    let accounts = readCredFile(this.opts.credFile);
+    if (!accounts || accounts.length < desiredCount) {
+      this.opts.logger.warn('cred_file_missing_or_short', {
+        found: accounts?.length ?? 0,
+        desired: desiredCount,
+      });
+      accounts = await SessionPool.seed({
+        targetUrl: this.opts.targetUrl,
+        inviteCode: this.opts.inviteCode,
+        count: desiredCount,
+        credFile: this.opts.credFile,
+        logger: this.opts.logger,
+      });
+    }
+    this.accounts = accounts.slice(0, desiredCount);
+    return this.accounts;
+  }
+
+  /**
+   * Launch the browser if not provided, create N contexts, inject each
+   * account's cached token via localStorage (see injectAuth for the
+   * POST /api/auth/login fallback), and return the live sessions.
+   */
+  async acquire(count: number): Promise<Session[]> {
+    await this.ensureAccounts(count);
+    if (!this.browser) {
+      this.ownedBrowser = await chromium.launch({ headless: true });
+      this.browser = this.ownedBrowser;
+    }
+
+    const sessions: Session[] = [];
+    for (let i = 0; i < count; i++) {
+      const account = this.accounts[i];
+      const context = await this.browser.newContext(this.opts.contextOptions);
+      await this.injectAuth(context, account);
+      const page = await context.newPage();
+      await page.goto(this.opts.targetUrl);
+      const bot = new GolfBot(page);
+      sessions.push({ account, context, page, bot, key: account.key });
+    }
+    this.activeSessions = sessions;
+    return sessions;
+  }
+
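+  /**
+   * Inject the cached JWT into localStorage BEFORE any page loads.
+   * Uses addInitScript so the token is present on the first navigation.
+   * NOTE: addInitScript only registers the script; it cannot tell whether the
+   * server will accept the token. The catch branch below therefore fires only
+   * if registering the script throws. A token rejected by the server is not
+   * detected here and would need a check after navigation.
+   */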
+  private async injectAuth(context: BrowserContext, account: Account): Promise<void> {
+    // Try the cached token first
+    try {
+      await context.addInitScript(
+        ({ token, username }) => {
+          window.localStorage.setItem('authToken', token);
+          window.localStorage.setItem(
+            'authUser',
+            JSON.stringify({ id: '', username, role: 'user', email_verified: true }),
+          );
+        },
+        { token: account.token, username: account.username },
+      );
+    } catch (err) {
+      this.opts.logger.warn('inject_auth_failed', {
+        account: account.key,
+        error: err instanceof Error ? err.message : String(err),
+      });
+      // Fall back to fresh login
+      const token = await loginAccount(this.opts.targetUrl, account.username, account.password);
+      account.token = token;
+      writeCredFile(this.opts.credFile, this.accounts);
+      await context.addInitScript(
+        ({ token, username }) => {
+          window.localStorage.setItem('authToken', token);
+          window.localStorage.setItem(
+            'authUser',
+            JSON.stringify({ id: '', username, role: 'user', email_verified: true }),
+          );
+        },
+        { token, username: account.username },
+      );
+    }
+  }
+
+  /** Close all active contexts. Safe to call multiple times. */
+  async release(): Promise<void> {
+    for (const session of this.activeSessions) {
+      try {
+        await session.context.close();
+      } catch {
+        // ignore
+      }
+    }
+    this.activeSessions = [];
+    if (this.ownedBrowser) {
+      try {
+        await this.ownedBrowser.close();
+      } catch {
+        // ignore
+      }
+      this.ownedBrowser = null;
+      this.browser = null;
+    }
+  }
+}
+```
+
+- [ ] **Step 5: Syntax-check by invoking tsx**
+
+```bash
+cd tests/soak
+npx tsx -e "import('./core/session-pool').then(() => console.log('ok'))"
+```
+
+Expected: `ok`. Note that tsx transpiles without type-checking, so this confirms only that the module parses and its imports resolve.
+
+- [ ] **Step 6: Commit**
+
+```bash
+git add tests/soak/core/session-pool.ts
+git commit -m "$(cat <<'EOF'
+feat(soak): SessionPool — seed, login, acquire contexts
+
+Owns 16 BrowserContexts, seeds via POST /api/auth/register with the
+invite code on cold start, warm-starts via localStorage injection of
+the cached JWT, with a POST /api/auth/login fallback path in
+injectAuth. Exposes acquire(n) for scenarios.
+
+Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+EOF
+)"
+```
+
+---
+
+### Task 14: `seed-accounts.ts` CLI wrapper
+
+Tiny standalone entry point that lets you pre-seed before the first harness run. Reuses `SessionPool.seed`.
+
+**Files:**
+- Create: `tests/soak/scripts/seed-accounts.ts`
+
+- [ ] **Step 1: Write the script**
+
+```typescript
+#!/usr/bin/env tsx
+/**
+ * Seed N soak-harness accounts via the register endpoint.
+ *
+ * Usage:
+ *   TEST_URL=http://localhost:8000 \
+ *   SOAK_INVITE_CODE=SOAKTEST \
+ *     npm run seed -- --count=16
+ */
+
+import * as path from 'path';
+import { SessionPool } from '../core/session-pool';
+import { createLogger } from '../core/logger';
+
+function parseArgs(argv: string[]): { count: number } {
+  const result = { count: 16 };
+  for (const arg of argv.slice(2)) {
+    const m = arg.match(/^--count=(\d+)$/);
+    if (m) result.count = parseInt(m[1], 10);
+  }
+  return result;
+}
+
+async function main(): Promise<void> {
+  const { count } = parseArgs(process.argv);
+  const targetUrl = process.env.TEST_URL ?? 'http://localhost:8000';
+  const inviteCode = process.env.SOAK_INVITE_CODE;
+  if (!inviteCode) {
+    console.error('SOAK_INVITE_CODE env var is required');
+    console.error('  Local dev: SOAK_INVITE_CODE=SOAKTEST');
+    console.error('  Staging:   SOAK_INVITE_CODE=5VC2MCCN');
+    process.exit(2);
+  }
+
+  const credFile = path.resolve(__dirname, '..', '.env.stresstest');
+  const logger = createLogger({ runId: `seed-${Date.now()}` });
+
+  logger.info('seed_start', { count, targetUrl, credFile });
+  try {
+    const accounts = await SessionPool.seed({
+      targetUrl,
+      inviteCode,
+      count,
+      credFile,
+      logger,
+    });
+    logger.info('seed_complete', { created: accounts.length });
+    console.error(`Seeded ${accounts.length} accounts → ${credFile}`);
+  } catch (err) {
+    logger.error('seed_failed', {
+      error: err instanceof Error ? err.message : String(err),
+    });
+    process.exit(1);
+  }
+}
+
+main();
+```
+
+- [ ] **Step 2: Run it against local dev to verify end-to-end**
+
+With the dev server running and the `SOAKTEST` invite flagged:
+
+```bash
+cd tests/soak
+TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run seed -- --count=4
+```
+
+Expected:
+- Log lines `seeding_account` × 4
+- Log line `seed_complete`
+- `tests/soak/.env.stresstest` file created with 4 `SOAK_ACCOUNT_NN=...` lines
+
+Verify:
+
+```bash
+cat tests/soak/.env.stresstest | head
+```
+
+Expected: 4 account lines.
+
+Also verify the accounts got flagged:
+
+```bash
+psql -d golfgame -c "SELECT username, is_test_account FROM users_v2 WHERE username LIKE 'soak_%' ORDER BY username;"
+```
+
+Expected: 4 rows, all with `is_test_account | t`.
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add tests/soak/scripts/seed-accounts.ts
+git commit -m "$(cat <<'EOF'
+feat(soak): scripts/seed-accounts.ts CLI wrapper
+
+Thin standalone entry for pre-seeding N accounts before the first
+harness run. Wraps SessionPool.seed and writes .env.stresstest.
+
+Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+EOF
+)"
+```
+
+---
+
+## Phase 4 — First scenario, config, runner (end-to-end milestone)
+
+### Task 15: Shared multiplayer-game helper
+
+Pulls the "run one full game in one room" logic out of the scenarios so `populate` and `stress` share it. Takes a room's sessions and a config, loops until the game ends.
+
+**Files:**
+- Create: `tests/soak/scenarios/shared/multiplayer-game.ts`
+
+- [ ] **Step 1: Create the helper module**
+
+```typescript
+// tests/soak/scenarios/shared/multiplayer-game.ts
+import type { Session, ScenarioContext } from '../../core/types';
+
+export interface MultiplayerGameOptions {
+  roomId: string;
+  holes: number;
+  decks: number;
+  cpusPerRoom: number;
+  cpuPersonality?: string;
+  /** Per-turn think time in [min, max] ms. */
+  thinkTimeMs: [number, number];
+  /** Max wall-clock time before giving up on the game (ms). */
+  maxDurationMs?: number;
+}
+
+export interface MultiplayerGameResult {
+  completed: boolean;
+  turns: number;
+  durationMs: number;
+  error?: string;
+}
+
+function randomInt(min: number, max: number): number {
+  return Math.floor(Math.random() * (max - min + 1)) + min;
+}
+
+async function sleep(ms: number): Promise<void> {
+  return new Promise((resolve) => setTimeout(resolve, ms));
+}
+
+/**
+ * Host + joiners play one full multiplayer game end to end.
+ * The host creates the room, announces the code via the coordinator,
+ * joiners wait for the code, the host adds CPUs and starts, everyone
+ * loops on isMyTurn/playTurn until round_over or game_over.
+ */
+export async function runOneMultiplayerGame(
+  ctx: ScenarioContext,
+  sessions: Session[],
+  opts: MultiplayerGameOptions,
+): Promise<MultiplayerGameResult> {
+  const start = Date.now();
+  const [host, ...joiners] = sessions;
+  const maxDuration = opts.maxDurationMs ?? 5 * 60_000;
+
+  try {
+    // Host creates game
+    const code = await host.bot.createGame(host.account.username);
+    ctx.coordinator.announce(opts.roomId, code);
+    ctx.heartbeat(opts.roomId);
+    ctx.dashboard.update(opts.roomId, { phase: 'lobby' });
+    ctx.logger.info('room_created', { room: opts.roomId, code });
+
+    // Joiners join concurrently
+    await Promise.all(
+      joiners.map(async (joiner) => {
+        const awaited = await ctx.coordinator.await(opts.roomId);
+        await joiner.bot.joinGame(awaited, joiner.account.username);
+      }),
+    );
+    ctx.heartbeat(opts.roomId);
+
+    // Host adds CPUs (if any) and starts
+    for (let i = 0; i < opts.cpusPerRoom; i++) {
+      await host.bot.addCPU(opts.cpuPersonality);
+    }
+    await host.bot.startGame({ holes: opts.holes, decks: opts.decks });
+    ctx.heartbeat(opts.roomId);
+    ctx.dashboard.update(opts.roomId, { phase: 'playing', totalHoles: opts.holes });
+
+    // Concurrent turn loops, one per session
+    const turnCounts = new Array(sessions.length).fill(0);
+    // Reasons any loop stopped early ('aborted' or 'timeout'); empty on a normal finish.
+    const earlyStops: string[] = [];
+
+    async function sessionLoop(sessionIdx: number): Promise<void> {
+      const session = sessions[sessionIdx];
+      while (true) {
+        if (ctx.signal.aborted) {
+          earlyStops.push('aborted');
+          return;
+        }
+
+        const phase = await session.bot.getGamePhase();
+        if (phase === 'game_over' || phase === 'round_over') return;
+
+        // Checked after the phase so a game that just finished is not reported as a timeout.
+        if (Date.now() - start > maxDuration) {
+          earlyStops.push('timeout');
+          return;
+        }
+
+        if (await session.bot.isMyTurn()) {
+          await session.bot.playTurn();
+          turnCounts[sessionIdx]++;
+          ctx.heartbeat(opts.roomId);
+          ctx.dashboard.update(opts.roomId, {
+            currentPlayer: session.account.username,
+            moves: turnCounts.reduce((a, b) => a + b, 0),
+          });
+          const thinkMs = randomInt(opts.thinkTimeMs[0], opts.thinkTimeMs[1]);
+          await sleep(thinkMs);
+        } else {
+          await sleep(200);
+        }
+      }
+    }
+
+    await Promise.all(sessions.map((_, i) => sessionLoop(i)));
+
+    const totalTurns = turnCounts.reduce((a, b) => a + b, 0);
+    if (earlyStops.length > 0) {
+      // An aborted or timed-out game must not be counted as completed.
+      return {
+        completed: false,
+        turns: totalTurns,
+        durationMs: Date.now() - start,
+        error: earlyStops[0],
+      };
+    }
+    ctx.dashboard.update(opts.roomId, { phase: 'round_over' });
+    return {
+      completed: true,
+      turns: totalTurns,
+      durationMs: Date.now() - start,
+    };
+  } catch (err) {
+    return {
+      completed: false,
+      turns: 0,
+      durationMs: Date.now() - start,
+      error: err instanceof Error ? err.message : String(err),
+    };
+  }
+}
+```
+
+- [ ] **Step 2: Syntax-check**
+
+```bash
+cd tests/soak
+npx tsx -e "import('./scenarios/shared/multiplayer-game').then(() => console.log('ok'))"
+```
+
+Expected: `ok`.
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add tests/soak/scenarios/shared/multiplayer-game.ts
+git commit -m "$(cat <<'EOF'
+feat(soak): shared runOneMultiplayerGame helper
+
+Encapsulates the host-creates/joiners-join/loop-until-done flow so
+populate and stress scenarios don't duplicate it. Honors abort
+signal and a max-duration timeout, heartbeats on every turn.
+
+Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+EOF
+)"
+```
+
+---
+
+### Task 16: Populate scenario (minimal version)
+
+Partitions sessions into rooms, runs `gamesPerRoom` games per room in parallel, aggregates results.
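+
+The isolation property is the important part, so here is a small self-contained sketch of it before the real module. It is not the scenario itself: `fakeRoom` is a hypothetical stand-in that fails on purpose for one room, and the session names are made up. It shows that `Promise.allSettled` lets the other rooms finish and report.
+
+```typescript
+// Hypothetical stand-in for running all games in one room; room 2 fails on purpose.
+async function fakeRoom(roomIdx: number, members: string[]): Promise<number> {
+  if (roomIdx === 2) throw new Error(`room-${roomIdx} lost its host`);
+  return members.length; // pretend each member finished one game
+}
+
+function chunk<T>(arr: T[], size: number): T[][] {
+  const out: T[][] = [];
+  for (let i = 0; i < arr.length; i += size) out.push(arr.slice(i, i + size));
+  return out;
+}
+
+async function demo(): Promise<{ ok: number; failed: string[] }> {
+  const sessions = Array.from({ length: 16 }, (_, i) => `soak_${String(i).padStart(2, '0')}`);
+  const rooms = chunk(sessions, 4); // 4 rooms of 4 sessions each
+  const settled = await Promise.allSettled(rooms.map((m, idx) => fakeRoom(idx, m)));
+  let ok = 0;
+  const failed: string[] = [];
+  settled.forEach((r, idx) => {
+    if (r.status === 'fulfilled') ok += r.value;
+    else failed.push(`room-${idx}: ${(r.reason as Error).message}`);
+  });
+  return { ok, failed };
+}
+
+demo().then((summary) => console.log(summary));
+```
+
+With `Promise.all` instead, the rejection from room 2 would surface immediately and the results of the other three rooms would be lost to the caller.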
+
+**Files:**
+- Create: `tests/soak/scenarios/populate.ts`
+- Create: `tests/soak/scenarios/index.ts`
+
+- [ ] **Step 1: Create `scenarios/populate.ts`**
+
+```typescript
+// tests/soak/scenarios/populate.ts
+import type {
+  Scenario,
+  ScenarioContext,
+  ScenarioResult,
+  ScenarioError,
+  Session,
+} from '../core/types';
+import { runOneMultiplayerGame } from './shared/multiplayer-game';
+
+const CPU_PERSONALITIES = ['Sofia', 'Marcus', 'Kenji', 'Priya'];
+
+interface PopulateConfig {
+  gamesPerRoom: number;
+  holes: number;
+  decks: number;
+  rooms: number;
+  cpusPerRoom: number;
+  thinkTimeMs: [number, number];
+  interGamePauseMs: number;
+}
+
+function chunk<T>(arr: T[], size: number): T[][] {
+  const out: T[][] = [];
+  for (let i = 0; i < arr.length; i += size) {
+    out.push(arr.slice(i, i + size));
+  }
+  return out;
+}
+
+async function sleep(ms: number): Promise<void> {
+  return new Promise((resolve) => setTimeout(resolve, ms));
+}
+
+async function runRoom(
+  ctx: ScenarioContext,
+  cfg: PopulateConfig,
+  roomIdx: number,
+  sessions: Session[],
+): Promise<{ completed: number; errors: ScenarioError[] }> {
+  const roomId = `room-${roomIdx}`;
+  const cpuPersonality = CPU_PERSONALITIES[roomIdx % CPU_PERSONALITIES.length];
+  let completed = 0;
+  const errors: ScenarioError[] = [];
+
+  for (let gameNum = 0; gameNum < cfg.gamesPerRoom; gameNum++) {
+    if (ctx.signal.aborted) break;
+    ctx.dashboard.update(roomId, { game: gameNum + 1, totalGames: cfg.gamesPerRoom });
+    ctx.logger.info('game_start', { room: roomId, game: gameNum + 1 });
+
+    const result = await runOneMultiplayerGame(ctx, sessions, {
+      roomId,
+      holes: cfg.holes,
+      decks: cfg.decks,
+      cpusPerRoom: cfg.cpusPerRoom,
+      cpuPersonality,
+      thinkTimeMs: cfg.thinkTimeMs,
+    });
+
+    if (result.completed) {
+      completed++;
+      ctx.logger.info('game_complete', {
+        room: roomId,
+        game: gameNum + 1,
+        turns: result.turns,
+        durationMs: result.durationMs,
+      });
+    } else {
+      errors.push({
+        room: roomId,
+        reason: 'game_failed',
+        detail: result.error,
+        timestamp: Date.now(),
+      });
+      ctx.logger.error('game_failed', { room: roomId, game: gameNum + 1, error: result.error });
+    }
+
+    if (gameNum < cfg.gamesPerRoom - 1) {
+      await sleep(cfg.interGamePauseMs);
+    }
+  }
+
+  return { completed, errors };
+}
+
+const populate: Scenario = {
+  name: 'populate',
+  description: 'Long multi-round games to populate scoreboards',
+  needs: { accounts: 16, rooms: 4, cpusPerRoom: 1 },
+  defaultConfig: {
+    gamesPerRoom: 10,
+    holes: 9,
+    decks: 2,
+    rooms: 4,
+    cpusPerRoom: 1,
+    thinkTimeMs: [800, 2200],
+    interGamePauseMs: 3000,
+  },
+
+  async run(ctx: ScenarioContext): Promise<ScenarioResult> {
+    const start = Date.now();
+    const cfg = ctx.config as unknown as PopulateConfig;
+
+    const perRoom = Math.floor(ctx.sessions.length / cfg.rooms);
+    if (perRoom * cfg.rooms !== ctx.sessions.length) {
+      throw new Error(
+        `populate: ${ctx.sessions.length} sessions does not divide evenly into ${cfg.rooms} rooms`,
+      );
+    }
+    const roomSessions = chunk(ctx.sessions, perRoom);
+
+    const results = await Promise.allSettled(
+      roomSessions.map((sessions, idx) => runRoom(ctx, cfg, idx, sessions)),
+    );
+
+    let gamesCompleted = 0;
+    const errors: ScenarioError[] = [];
+    results.forEach((r, idx) => {
+      if (r.status === 'fulfilled') {
+        gamesCompleted += r.value.completed;
+        errors.push(...r.value.errors);
+      } else {
+        errors.push({
+          room: `room-${idx}`,
+          reason: 'room_threw',
+          detail: r.reason instanceof Error ? r.reason.message : String(r.reason),
+          timestamp: Date.now(),
+        });
+      }
+    });
+
+    return {
+      gamesCompleted,
+      errors,
+      durationMs: Date.now() - start,
+    };
+  },
+};
+
+export default populate;
+```
+
+- [ ] **Step 2: Create `scenarios/index.ts` registry**
+
+```typescript
+// tests/soak/scenarios/index.ts
+import type { Scenario } from '../core/types';
+import populate from './populate';
+
+const registry: Record<string, Scenario> = {
+  populate,
+};
+
+export function getScenario(name: string): Scenario | undefined {
+  return registry[name];
+}
+
+export function listScenarios(): Scenario[] {
+  return Object.values(registry);
+}
+```
+
+- [ ] **Step 3: Syntax-check**
+
+```bash
+cd tests/soak
+npx tsx -e "import('./scenarios/index').then((m) => console.log(m.listScenarios().map(s => s.name)))"
+```
+
+Expected: `[ 'populate' ]`.
+
+- [ ] **Step 4: Commit**
+
+```bash
+git add tests/soak/scenarios/populate.ts tests/soak/scenarios/index.ts
+git commit -m "$(cat <<'EOF'
+feat(soak): populate scenario + scenario registry
+
+Partitions sessions into N rooms, runs gamesPerRoom games per room
+in parallel via Promise.allSettled so a failure in one room never
+unwinds the others. Errors roll up into ScenarioResult.errors.
+
+Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+EOF
+)"
+```
+
+---
+
+### Task 17: Config parsing with tests
+
+CLI flags, env vars, scenario defaults, runner defaults — merged in that precedence order.
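+
+As a rough illustration of that precedence across three of those layers (a sketch, not the `mergeConfig` implemented below): later spreads win, so defaults are applied first and CLI flags last. `layerConfig` and `defined` are hypothetical helpers and the values are made up.
+
+```typescript
+type Layer = Record<string, unknown>;
+
+// Hypothetical helper: drop undefined entries so an unset flag never masks a lower layer.
+function defined(layer: Layer): Layer {
+  return Object.fromEntries(Object.entries(layer).filter(([, v]) => v !== undefined));
+}
+
+// Lowest precedence first; each later layer overrides the ones before it.
+function layerConfig(defaults: Layer, env: Layer, cli: Layer): Layer {
+  return { ...defined(defaults), ...defined(env), ...defined(cli) };
+}
+
+const merged = layerConfig(
+  { holes: 9, rooms: 4, gamesPerRoom: 10 }, // scenario defaults
+  { holes: 3, rooms: 2 },                   // from env
+  { holes: 5, rooms: undefined },           // from CLI (rooms not passed)
+);
+// holes comes from the CLI, rooms from env, gamesPerRoom from the defaults.
+console.log(merged);
+```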
+
+**Files:**
+- Create: `tests/soak/config.ts`
+- Create: `tests/soak/tests/config.test.ts`
+
+- [ ] **Step 1: Write failing tests**
+
+```typescript
+// tests/soak/tests/config.test.ts
+import { describe, it, expect } from 'vitest';
+import { parseArgs, mergeConfig } from '../config';
+
+describe('parseArgs', () => {
+  it('parses --scenario and numeric flags', () => {
+    const r = parseArgs(['--scenario=populate', '--rooms=4', '--games-per-room=10']);
+    expect(r.scenario).toBe('populate');
+    expect(r.rooms).toBe(4);
+    expect(r.gamesPerRoom).toBe(10);
+  });
+
+  it('parses watch mode', () => {
+    const r = parseArgs(['--scenario=populate', '--watch=none']);
+    expect(r.watch).toBe('none');
+  });
+
+  it('rejects unknown watch mode', () => {
+    expect(() => parseArgs(['--scenario=populate', '--watch=bogus'])).toThrow();
+  });
+
+  it('--list sets listOnly', () => {
+    const r = parseArgs(['--list']);
+    expect(r.listOnly).toBe(true);
+  });
+});
+
+describe('mergeConfig', () => {
+  // Argument order is (cli, env, defaults).
+  it('CLI flags override scenario defaults', () => {
+    const cfg = mergeConfig(
+      { gamesPerRoom: 20 },             // CLI
+      {},                               // env
+      { gamesPerRoom: 5, holes: 9 },    // defaults
+    );
+    expect(cfg.gamesPerRoom).toBe(20);
+  });
+
+  it('env overrides scenario defaults but CLI overrides env', () => {
+    const envOnly = mergeConfig({}, { SOAK_HOLES: '3' }, { holes: 9 });
+    expect(envOnly.holes).toBe(3);      // env beats defaults
+
+    const withCli = mergeConfig({ holes: 5 }, { SOAK_HOLES: '3' }, { holes: 9 });
+    expect(withCli.holes).toBe(5);      // CLI beats env
+  });
+
+  it('scenario defaults fill in unset values', () => {
+    const cfg = mergeConfig(
+      { gamesPerRoom: 3 },              // CLI
+      {},                               // env
+      { games: 5, holes: 9 },           // defaults
+    );
+    expect(cfg.games).toBe(5);
+    expect(cfg.holes).toBe(9);
+    expect(cfg.gamesPerRoom).toBe(3);
+  });
+});
+```
+
+- [ ] **Step 2: Run tests to verify they fail**
+
+```bash
+npx vitest run tests/config.test.ts
+```
+
+Expected: FAIL — module not found.
+
+- [ ] **Step 3: Implement `config.ts`**
+
+```typescript
+// tests/soak/config.ts
+
+export type WatchMode = 'none' | 'dashboard' | 'tiled';
+
+export interface CliArgs {
+  scenario?: string;
+  accounts?: number;
+  rooms?: number;
+  cpusPerRoom?: number;
+  gamesPerRoom?: number;
+  holes?: number;
+  watch?: WatchMode;
+  dashboardPort?: number;
+  target?: string;
+  runId?: string;
+  dryRun?: boolean;
+  listOnly?: boolean;
+}
+
+const VALID_WATCH: WatchMode[] = ['none', 'dashboard', 'tiled'];
+
+function parseInt10(s: string, name: string): number {
+  const n = parseInt(s, 10);
+  if (Number.isNaN(n)) throw new Error(`Invalid integer for ${name}: ${s}`);
+  return n;
+}
+
+export function parseArgs(argv: string[]): CliArgs {
+  const out: CliArgs = {};
+  for (const arg of argv) {
+    if (arg === '--list') {
+      out.listOnly = true;
+      continue;
+    }
+    if (arg === '--dry-run') {
+      out.dryRun = true;
+      continue;
+    }
+    const m = arg.match(/^--([a-z][a-z0-9-]*)=(.*)$/);
+    if (!m) continue;
+    const [, key, value] = m;
+    switch (key) {
+      case 'scenario':
+        out.scenario = value;
+        break;
+      case 'accounts':
+        out.accounts = parseInt10(value, '--accounts');
+        break;
+      case 'rooms':
+        out.rooms = parseInt10(value, '--rooms');
+        break;
+      case 'cpus-per-room':
+        out.cpusPerRoom = parseInt10(value, '--cpus-per-room');
+        break;
+      case 'games-per-room':
+        out.gamesPerRoom = parseInt10(value, '--games-per-room');
+        break;
+      case 'holes':
+        out.holes = parseInt10(value, '--holes');
+        break;
+      case 'watch':
+        if (!VALID_WATCH.includes(value as WatchMode)) {
+          throw new Error(`Invalid --watch value: ${value} (expected ${VALID_WATCH.join('|')})`);
+        }
+        out.watch = value as WatchMode;
+        break;
+      case 'dashboard-port':
+        out.dashboardPort = parseInt10(value, '--dashboard-port');
+        break;
+      case 'target':
+        out.target = value;
+        break;
+      case 'run-id':
+        out.runId = value;
+        break;
+      default:
+        // Unknown flag — ignore so scenario-specific flags can be added later
+        break;
+    }
+  }
+  return out;
+}
+
+/**
+ * Merge in order: scenarioDefaults → env → cli (later wins).
+ */
+export function mergeConfig(
+  cli: Record<string, unknown>,
+  env: Record<string, string | undefined>,
+  defaults: Record<string, unknown>,
+): Record<string, unknown> {
+  const merged: Record<string, unknown> = { ...defaults };
+
+  // Env overlay — SOAK_UPPER_SNAKE → lowerCamel in cli space.
+  const envMap: Record<string, string> = {
+    SOAK_HOLES: 'holes',
+    SOAK_ROOMS: 'rooms',
+    SOAK_ACCOUNTS: 'accounts',
+    SOAK_CPUS_PER_ROOM: 'cpusPerRoom',
+    SOAK_GAMES_PER_ROOM: 'gamesPerRoom',
+    SOAK_WATCH: 'watch',
+    SOAK_DASHBOARD_PORT: 'dashboardPort',
+  };
+  for (const [envKey, cfgKey] of Object.entries(envMap)) {
+    const v = env[envKey];
+    if (v !== undefined) {
+      // Heuristic: numeric keys
+      if (/^(holes|rooms|accounts|cpusPerRoom|gamesPerRoom|dashboardPort)$/.test(cfgKey)) {
+        merged[cfgKey] = parseInt10(v, envKey);
+      } else {
+        merged[cfgKey] = v;
+      }
+    }
+  }
+
+  // CLI overlay — wins over env and defaults.
+  for (const [k, v] of Object.entries(cli)) {
+    if (v !== undefined) merged[k] = v;
+  }
+
+  return merged;
+}
+```
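+
+The precedence rule can be illustrated in isolation. The snippet below is a minimal sketch, not the harness code: `layer` is a hypothetical helper that applies the same later-wins rule to three plain objects, skipping `undefined` values so that an unset CLI flag does not clobber an env or default value.
+
+```typescript
+// Hypothetical helper for illustration only; mirrors the later-wins rule.
+type Layer = Record<string, unknown>;
+
+function layer(...layers: Layer[]): Layer {
+  const out: Layer = {};
+  for (const l of layers) {
+    for (const [k, v] of Object.entries(l)) {
+      // Skip undefined so an unset flag never overrides a lower layer.
+      if (v !== undefined) out[k] = v;
+    }
+  }
+  return out;
+}
+
+const defaults = { holes: 9, rooms: 4, watch: 'dashboard' };
+const env = { holes: 3 }; // e.g. parsed from SOAK_HOLES=3
+const cli = { holes: undefined, rooms: 1 }; // --rooms=1 passed, --holes not passed
+
+const merged = layer(defaults, env, cli);
+console.log(merged);
+```
+
+Here `holes` comes from the env layer, `rooms` from the CLI layer, and `watch` falls through from the defaults.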
+
+- [ ] **Step 4: Fix the failing middle test as noted in Step 1**
+
+Edit `tests/soak/tests/config.test.ts` and replace the second `it(...)` block inside `describe('mergeConfig')` with the corrected version provided in Step 1.
+
+- [ ] **Step 5: Run tests to verify they pass**
+
+```bash
+npx vitest run tests/config.test.ts
+```
+
+Expected: all passing.
+
+- [ ] **Step 6: Commit**
+
+```bash
+git add tests/soak/config.ts tests/soak/tests/config.test.ts
+git commit -m "$(cat <<'EOF'
+feat(soak): CLI parsing + config precedence
+
+parseArgs pulls --scenario/--rooms/--watch/etc from argv, mergeConfig
+layers scenarioDefaults → env → CLI so CLI flags always win. Unit
+tested.
+
+Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+EOF
+)"
+```
+
+---
+
+### Task 18: `runner.ts` entry point — first end-to-end milestone
+
+Replaces the placeholder runner with the real thing: parse args, build dependencies, load scenario, acquire sessions, run scenario, clean up, print summary. Supports `--watch=none` only at this stage.
+
+**Files:**
+- Modify: `tests/soak/runner.ts` (replace placeholder)
+
+- [ ] **Step 1: Rewrite `runner.ts`**
+
+```typescript
+#!/usr/bin/env tsx
+/**
+ * Golf Soak Harness — entry point.
+ *
+ * Usage:
+ *   TEST_URL=http://localhost:8000 \
+ *   SOAK_INVITE_CODE=SOAKTEST \
+ *     npm run soak -- --scenario=populate --rooms=1 --accounts=2 \
+ *       --cpus-per-room=0 --games-per-room=1 --holes=1 --watch=none
+ */
+
+import * as path from 'path';
+import { parseArgs, mergeConfig, CliArgs } from './config';
+import { createLogger } from './core/logger';
+import { SessionPool } from './core/session-pool';
+import { RoomCoordinator } from './core/room-coordinator';
+import { getScenario, listScenarios } from './scenarios';
+import type { DashboardReporter, ScenarioContext } from './core/types';
+
+function noopDashboard(): DashboardReporter {
+  return {
+    update: () => {},
+    log: () => {},
+    incrementMetric: () => {},
+  };
+}
+
+function printScenarioList(): void {
+  console.log('Available scenarios:');
+  for (const s of listScenarios()) {
+    console.log(`  ${s.name.padEnd(12)} ${s.description}`);
+    console.log(`    needs: accounts=${s.needs.accounts}, rooms=${s.needs.rooms ?? 1}, cpus=${s.needs.cpusPerRoom ?? 0}`);
+  }
+}
+
+async function main(): Promise<void> {
+  const cli: CliArgs = parseArgs(process.argv.slice(2));
+
+  if (cli.listOnly) {
+    printScenarioList();
+    return;
+  }
+
+  if (!cli.scenario) {
+    console.error('Error: --scenario=<name> is required. Use --list to see scenarios.');
+    process.exit(2);
+  }
+
+  const scenario = getScenario(cli.scenario);
+  if (!scenario) {
+    console.error(`Error: unknown scenario "${cli.scenario}". Use --list to see scenarios.`);
+    process.exit(2);
+  }
+
+  const runId = cli.runId ?? `${cli.scenario}-${new Date().toISOString().replace(/[:.]/g, '-')}`;
+  const targetUrl = cli.target ?? process.env.TEST_URL ?? 'http://localhost:8000';
+  const inviteCode = process.env.SOAK_INVITE_CODE ?? 'SOAKTEST';
+  const watch = cli.watch ?? 'dashboard';
+
+  const logger = createLogger({ runId });
+  logger.info('run_start', {
+    scenario: scenario.name,
+    targetUrl,
+    watch,
+    cli,
+  });
+
+  // Resolve final config
+  const config = mergeConfig(
+    cli as Record<string, unknown>,
+    process.env,
+    scenario.defaultConfig,
+  );
+  // Ensure core knobs exist
+  const accounts = Number(config.accounts ?? scenario.needs.accounts);
+  const rooms = Number(config.rooms ?? scenario.needs.rooms ?? 1);
+  const cpusPerRoom = Number(config.cpusPerRoom ?? scenario.needs.cpusPerRoom ?? 0);
+  if (accounts % rooms !== 0) {
+    console.error(`Error: --accounts=${accounts} does not divide evenly into --rooms=${rooms}`);
+    process.exit(2);
+  }
+  config.rooms = rooms;
+  config.cpusPerRoom = cpusPerRoom;
+
+  if (cli.dryRun) {
+    logger.info('dry_run', { config });
+    console.log('Dry run OK. Resolved config:');
+    console.log(JSON.stringify(config, null, 2));
+    return;
+  }
+
+  if (watch !== 'none') {
+    logger.warn('watch_mode_not_yet_implemented', { watch });
+    console.warn(`Watch mode "${watch}" not yet implemented — falling back to "none".`);
+  }
+
+  // Build dependencies
+  const credFile = path.resolve(__dirname, '.env.stresstest');
+  const pool = new SessionPool({
+    targetUrl,
+    inviteCode,
+    credFile,
+    logger,
+  });
+  const coordinator = new RoomCoordinator();
+  const dashboard = noopDashboard();
+  const abortController = new AbortController();
+
+  const onSignal = (sig: string) => {
+    logger.warn('signal_received', { signal: sig });
+    abortController.abort();
+  };
+  process.on('SIGINT', () => onSignal('SIGINT'));
+  process.on('SIGTERM', () => onSignal('SIGTERM'));
+
+  let exitCode = 0;
+  try {
+    const sessions = await pool.acquire(accounts);
+    logger.info('sessions_acquired', { count: sessions.length });
+
+    const ctx: ScenarioContext = {
+      config,
+      sessions,
+      coordinator,
+      dashboard,
+      logger,
+      signal: abortController.signal,
+      heartbeat: () => {}, // Task 26 wires this up
+    };
+
+    const result = await scenario.run(ctx);
+    logger.info('run_complete', {
+      gamesCompleted: result.gamesCompleted,
+      errors: result.errors.length,
+      durationMs: result.durationMs,
+    });
+    console.log(`Games completed: ${result.gamesCompleted}`);
+    console.log(`Errors:          ${result.errors.length}`);
+    console.log(`Duration:        ${(result.durationMs / 1000).toFixed(1)}s`);
+    if (result.errors.length > 0) {
+      console.log('Errors:');
+      for (const e of result.errors) {
+        console.log(`  ${e.room}: ${e.reason}${e.detail ? ' — ' + e.detail : ''}`);
+      }
+      exitCode = 1;
+    }
+  } catch (err) {
+    logger.error('run_failed', {
+      error: err instanceof Error ? err.message : String(err),
+      stack: err instanceof Error ? err.stack : undefined,
+    });
+    exitCode = 1;
+  } finally {
+    await pool.release();
+  }
+
+  if (abortController.signal.aborted && exitCode === 0) exitCode = 2;
+  process.exit(exitCode);
+}
+
+main().catch((err) => {
+  console.error(err);
+  process.exit(1);
+});
+```
+
+- [ ] **Step 2: Run a minimal `--watch=none` smoke against local dev**
+
+With the dev server running and the 4 soak accounts from Task 14 already seeded, run:
+
+```bash
+cd tests/soak
+TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run soak -- \
+  --scenario=populate \
+  --accounts=2 \
+  --rooms=1 \
+  --cpus-per-room=0 \
+  --games-per-room=1 \
+  --holes=1 \
+  --watch=none
+```
+
+Expected output (abbreviated):
+
+```
+{"timestamp":"...","level":"info","msg":"run_start",...}
+{"timestamp":"...","level":"info","msg":"sessions_acquired","count":2}
+{"timestamp":"...","level":"info","msg":"game_start","room":"room-0","game":1}
+{"timestamp":"...","level":"info","msg":"room_created","code":"XXXX"}
+{"timestamp":"...","level":"info","msg":"game_complete","room":"room-0","turns":...}
+{"timestamp":"...","level":"info","msg":"run_complete","gamesCompleted":1,"errors":0}
+Games completed: 1
+Errors:          0
+Duration:        X.Xs
+```
+
+Exit code 0.
+
+This is the first **end-to-end milestone**. Stop here if debugging is needed — fix issues before moving on.
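+
+The exit codes used by `runner.ts` follow a small convention: 0 for a clean run, 1 when the scenario reports errors or throws, and 2 for usage errors or a signal-aborted run that otherwise had no errors. The sketch below restates the post-run part of that convention as a pure function; `resolveExitCode` and its field names are hypothetical and do not exist in the harness.
+
+```typescript
+// Hypothetical restatement of the runner's exit-code rules.
+interface RunOutcome {
+  errorCount: number; // result.errors.length
+  threw: boolean; // scenario.run() rejected
+  aborted: boolean; // SIGINT/SIGTERM fired the AbortController
+}
+
+function resolveExitCode(o: RunOutcome): 0 | 1 | 2 {
+  // Errors win over abort, matching `aborted && exitCode === 0` in the runner.
+  if (o.threw || o.errorCount > 0) return 1;
+  if (o.aborted) return 2;
+  return 0;
+}
+
+const clean = resolveExitCode({ errorCount: 0, threw: false, aborted: false });
+const failedAndAborted = resolveExitCode({ errorCount: 2, threw: false, aborted: true });
+const abortedOnly = resolveExitCode({ errorCount: 0, threw: false, aborted: true });
+console.log({ clean, failedAndAborted, abortedOnly });
+```
+
+When checking the smoke run, `echo $?` right after the command shows which of these cases was hit.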
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add tests/soak/runner.ts
+git commit -m "$(cat <<'EOF'
+feat(soak): runner.ts end-to-end with --watch=none
+
+First full end-to-end milestone: parses CLI, builds SessionPool +
+RoomCoordinator, loads a scenario by name, runs it, reports results,
+cleans up. Watch modes other than "none" log a warning and fall back
+until Tasks 19-24 implement them.
+
+Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+EOF
+)"
+```
+
+---
+
+## Phase 5 — Dashboard status grid
+
+### Task 19: Dashboard HTTP + WS server
+
+Vanilla Node `http` + `ws`. Serves the static dashboard assets (one HTML page plus its CSS and JS), accepts WS connections, and broadcasts room-state, log, and metric updates.
+
+**Files:**
+- Create: `tests/soak/dashboard/server.ts`
+
+- [ ] **Step 1: Implement `dashboard/server.ts`**
+
+```typescript
+// tests/soak/dashboard/server.ts
+import * as http from 'http';
+import * as fs from 'fs';
+import * as path from 'path';
+import { WebSocketServer, WebSocket } from 'ws';
+import type { DashboardReporter, Logger, RoomState } from '../core/types';
+
+export type DashboardIncoming =
+  | { type: 'start_stream'; sessionKey: string }
+  | { type: 'stop_stream'; sessionKey: string };
+
+export type DashboardOutgoing =
+  | { type: 'room_state'; roomId: string; state: Partial<RoomState> }
+  | { type: 'log'; level: string; msg: string; meta?: object; timestamp: number }
+  | { type: 'metric'; name: string; value: number }
+  | { type: 'frame'; sessionKey: string; jpegBase64: string };
+
+export interface DashboardHandlers {
+  onStartStream?(sessionKey: string): void;
+  onStopStream?(sessionKey: string): void;
+  onDisconnect?(): void;
+}
+
+export class DashboardServer {
+  private httpServer!: http.Server;
+  private wsServer!: WebSocketServer;
+  private clients = new Set<WebSocket>();
+  private metrics: Record<string, number> = {};
+  private roomStates: Record<string, Partial<RoomState>> = {};
+
+  constructor(
+    private port: number,
+    private logger: Logger,
+    private handlers: DashboardHandlers = {},
+  ) {}
+
+  async start(): Promise<void> {
+    const htmlPath = path.resolve(__dirname, 'index.html');
+    const cssPath = path.resolve(__dirname, 'dashboard.css');
+    const jsPath = path.resolve(__dirname, 'dashboard.js');
+
+    this.httpServer = http.createServer((req, res) => {
+      const url = req.url ?? '/';
+      if (url === '/' || url === '/index.html') {
+        res.writeHead(200, { 'Content-Type': 'text/html; charset=utf-8' });
+        fs.createReadStream(htmlPath).pipe(res);
+      } else if (url === '/dashboard.css') {
+        res.writeHead(200, { 'Content-Type': 'text/css' });
+        fs.createReadStream(cssPath).pipe(res);
+      } else if (url === '/dashboard.js') {
+        res.writeHead(200, { 'Content-Type': 'application/javascript' });
+        fs.createReadStream(jsPath).pipe(res);
+      } else {
+        res.writeHead(404);
+        res.end('not found');
+      }
+    });
+
+    this.wsServer = new WebSocketServer({ server: this.httpServer });
+    this.wsServer.on('connection', (ws) => {
+      this.clients.add(ws);
+      this.logger.info('dashboard_client_connected', { count: this.clients.size });
+
+      // Replay current state to the new client
+      for (const [roomId, state] of Object.entries(this.roomStates)) {
+        ws.send(JSON.stringify({ type: 'room_state', roomId, state } as DashboardOutgoing));
+      }
+      for (const [name, value] of Object.entries(this.metrics)) {
+        ws.send(JSON.stringify({ type: 'metric', name, value } as DashboardOutgoing));
+      }
+
+      ws.on('message', (data) => {
+        try {
+          const parsed = JSON.parse(data.toString()) as DashboardIncoming;
+          if (parsed.type === 'start_stream' && this.handlers.onStartStream) {
+            this.handlers.onStartStream(parsed.sessionKey);
+          } else if (parsed.type === 'stop_stream' && this.handlers.onStopStream) {
+            this.handlers.onStopStream(parsed.sessionKey);
+          }
+        } catch (err) {
+          this.logger.warn('dashboard_ws_parse_error', {
+            error: err instanceof Error ? err.message : String(err),
+          });
+        }
+      });
+
+      ws.on('close', () => {
+        this.clients.delete(ws);
+        this.logger.info('dashboard_client_disconnected', { count: this.clients.size });
+        if (this.clients.size === 0 && this.handlers.onDisconnect) {
+          this.handlers.onDisconnect();
+        }
+      });
+    });
+
+    await new Promise<void>((resolve) => {
+      this.httpServer.listen(this.port, () => resolve());
+    });
+    this.logger.info('dashboard_listening', { url: `http://localhost:${this.port}` });
+  }
+
+  async stop(): Promise<void> {
+    for (const ws of this.clients) {
+      try {
+        ws.close();
+      } catch {
+        // ignore
+      }
+    }
+    this.clients.clear();
+    await new Promise<void>((resolve) => {
+      this.wsServer.close(() => resolve());
+    });
+    await new Promise<void>((resolve) => {
+      this.httpServer.close(() => resolve());
+    });
+  }
+
+  broadcast(msg: DashboardOutgoing): void {
+    const payload = JSON.stringify(msg);
+    for (const ws of this.clients) {
+      if (ws.readyState === WebSocket.OPEN) {
+        ws.send(payload);
+      }
+    }
+  }
+
+  /** Create a DashboardReporter wired to this server. */
+  reporter(): DashboardReporter {
+    return {
+      update: (roomId, state) => {
+        this.roomStates[roomId] = { ...this.roomStates[roomId], ...state };
+        this.broadcast({ type: 'room_state', roomId, state });
+      },
+      log: (level, msg, meta) => {
+        this.broadcast({ type: 'log', level, msg, meta, timestamp: Date.now() });
+      },
+      incrementMetric: (name, by = 1) => {
+        this.metrics[name] = (this.metrics[name] ?? 0) + by;
+        this.broadcast({ type: 'metric', name, value: this.metrics[name] });
+      },
+    };
+  }
+}
+```
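+
+The message handler above casts the parsed JSON to `DashboardIncoming` without checking its shape. If stricter handling is wanted, a small runtime guard is one option. This is a sketch under that assumption: `parseIncoming` is a hypothetical helper, and the type is redeclared locally so the snippet stands alone.
+
+```typescript
+type DashboardIncoming =
+  | { type: 'start_stream'; sessionKey: string }
+  | { type: 'stop_stream'; sessionKey: string };
+
+// Returns a validated message, or null for anything malformed.
+function parseIncoming(raw: string): DashboardIncoming | null {
+  let value: unknown;
+  try {
+    value = JSON.parse(raw);
+  } catch {
+    return null; // not JSON at all
+  }
+  if (typeof value !== 'object' || value === null) return null;
+
+  const { type, sessionKey } = value as { type?: unknown; sessionKey?: unknown };
+  if (typeof sessionKey !== 'string' || sessionKey.length === 0) return null;
+  if (type === 'start_stream') return { type: 'start_stream', sessionKey };
+  if (type === 'stop_stream') return { type: 'stop_stream', sessionKey };
+  return null; // unknown message type
+}
+
+console.log(parseIncoming('{"type":"start_stream","sessionKey":"soak_00"}'));
+console.log(parseIncoming('{"type":"reboot","sessionKey":"soak_00"}'));
+```
+
+The handler could then dispatch only when the result is non-null and log a warning otherwise, which keeps malformed input away from `onStartStream` and `onStopStream`.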
+
+- [ ] **Step 2: Syntax-check**
+
+```bash
+cd tests/soak
+npx tsx -e "import('./dashboard/server').then(() => console.log('ok'))"
+```
+
+Expected: `ok`.
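+
+`reporter().update` shallow-merges each partial update into the stored room state, and that stored state is what a newly connected client receives on replay. The sketch below shows the merge behaviour on its own; `RoomStateSketch` is a simplified, hypothetical stand-in for the real `RoomState` in `core/types`.
+
+```typescript
+// Simplified stand-in for RoomState; field names follow what dashboard.js reads.
+interface RoomStateSketch {
+  phase?: string;
+  moves?: number;
+  hole?: number;
+  totalHoles?: number;
+}
+
+const roomStates: Record<string, RoomStateSketch> = {};
+
+function update(roomId: string, patch: RoomStateSketch): void {
+  // Later patches overwrite only the keys they carry.
+  roomStates[roomId] = { ...roomStates[roomId], ...patch };
+}
+
+update('room-0', { phase: 'lobby', totalHoles: 9 });
+update('room-0', { phase: 'playing', hole: 1 });
+update('room-0', { moves: 12 });
+
+const replayed = roomStates['room-0'];
+console.log(replayed);
+```
+
+Because the merge is shallow, an array-valued field such as `players` is replaced wholesale by each update that includes it, so scenarios should send the full player list every time.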
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add tests/soak/dashboard/server.ts
+git commit -m "$(cat <<'EOF'
+feat(soak): DashboardServer — vanilla http + ws
+
+Serves one static HTML page, accepts WS connections, broadcasts
+room_state/log/metric messages to all clients. Exposes a
+reporter() method that returns a DashboardReporter scenarios can
+call without knowing about sockets.
+
+Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+EOF
+)"
+```
+
+---
+
+### Task 20: Dashboard HTML/CSS/JS status grid
+
+Single static HTML page + stylesheet + client script. Renders the 2×2 room grid, subscribes to WS, updates tiles on each message.
+
+**Files:**
+- Create: `tests/soak/dashboard/index.html`
+- Create: `tests/soak/dashboard/dashboard.css`
+- Create: `tests/soak/dashboard/dashboard.js`
+
+- [ ] **Step 1: Create `dashboard/index.html`**
+
+```html
+<!DOCTYPE html>
+<html lang="en">
+<head>
+<meta charset="UTF-8">
+<meta name="viewport" content="width=device-width, initial-scale=1.0">
+<title>Golf Soak Dashboard</title>
+<link rel="stylesheet" href="/dashboard.css">
+</head>
+<body>
+<header class="dash-header">
+  <h1>⛳ Golf Soak Dashboard</h1>
+  <div class="meta">
+    <span id="run-id">run —</span>
+    <span id="elapsed">00:00:00</span>
+  </div>
+</header>
+
+<div class="meta-bar">
+  <div class="stat"><span class="label">Games</span><span id="metric-games">0</span></div>
+  <div class="stat"><span class="label">Moves</span><span id="metric-moves">0</span></div>
+  <div class="stat"><span class="label">Errors</span><span id="metric-errors">0</span></div>
+  <div class="stat"><span class="label">WS</span><span id="ws-status">connecting</span></div>
+</div>
+
+<div class="rooms" id="rooms">
+  <!-- Room tiles injected by dashboard.js -->
+</div>
+
+<section class="log">
+  <div class="log-header">Activity Log</div>
+  <ul id="log-list"></ul>
+</section>
+
+<!-- Modal for focused live video (Task 23) -->
+<div id="video-modal" class="video-modal hidden">
+  <div class="video-modal-content">
+    <div class="video-modal-header">
+      <span id="video-modal-title">Watching —</span>
+      <button id="video-modal-close">Close</button>
+    </div>
+    <img id="video-frame" alt="Live screencast" />
+  </div>
+</div>
+
+<script src="/dashboard.js"></script>
+</body>
+</html>
+```
+
+- [ ] **Step 2: Create `dashboard/dashboard.css`**
+
+```css
+:root {
+  --bg: #0a0e16;
+  --panel: #0e1420;
+  --border: #1a2230;
+  --text: #c8d4e4;
+  --accent: #7fbaff;
+  --good: #6fd08f;
+  --warn: #ffb84d;
+  --err: #ff5c6c;
+  --muted: #556577;
+}
+
+* { box-sizing: border-box; }
+
+body {
+  margin: 0;
+  font-family: -apple-system, system-ui, 'SF Mono', Consolas, monospace;
+  background: var(--bg);
+  color: var(--text);
+}
+
+.dash-header {
+  display: flex;
+  justify-content: space-between;
+  align-items: center;
+  padding: 12px 20px;
+  background: linear-gradient(135deg, #0f1823, #0a1018);
+  border-bottom: 1px solid var(--border);
+}
+.dash-header h1 { margin: 0; font-size: 16px; color: var(--accent); }
+.dash-header .meta { font-size: 11px; color: var(--muted); }
+.dash-header .meta span + span { margin-left: 12px; }
+
+.meta-bar {
+  display: flex;
+  gap: 24px;
+  padding: 10px 20px;
+  background: #0c131d;
+  border-bottom: 1px solid var(--border);
+  font-size: 12px;
+}
+.meta-bar .stat .label { color: var(--muted); margin-right: 6px; }
+.meta-bar .stat span:last-child { color: #fff; font-weight: 600; }
+
+.rooms {
+  display: grid;
+  grid-template-columns: 1fr 1fr;
+  gap: 1px;
+  background: var(--border);
+}
+.room {
+  background: var(--panel);
+  padding: 14px 18px;
+  min-height: 180px;
+}
+.room-title {
+  display: flex;
+  justify-content: space-between;
+  align-items: center;
+  margin-bottom: 10px;
+}
+.room-title .name { font-size: 13px; color: var(--accent); font-weight: 600; }
+.room-title .phase {
+  font-size: 10px;
+  padding: 2px 8px;
+  border-radius: 10px;
+  background: #1a3a2a;
+  color: var(--good);
+}
+.room-title .phase.lobby { background: #3a2a1a; color: var(--warn); }
+.room-title .phase.err { background: #3a1a1a; color: var(--err); }
+
+.players {
+  display: grid;
+  grid-template-columns: repeat(2, 1fr);
+  gap: 4px;
+  font-size: 11px;
+  margin-bottom: 8px;
+}
+.player {
+  display: flex;
+  justify-content: space-between;
+  padding: 4px 8px;
+  background: #0a0f18;
+  border-radius: 3px;
+  cursor: pointer;
+  border: 1px solid transparent;
+}
+.player:hover { border-color: var(--accent); }
+.player.active {
+  background: #1a2a40;
+  border-left: 2px solid var(--accent);
+}
+.player .score { color: var(--muted); }
+
+.progress-bar {
+  height: 4px;
+  background: var(--border);
+  border-radius: 2px;
+  overflow: hidden;
+  margin-top: 6px;
+}
+.progress-fill {
+  height: 100%;
+  background: linear-gradient(90deg, var(--accent), var(--good));
+  transition: width 0.3s;
+}
+.room-meta {
+  font-size: 10px;
+  color: var(--muted);
+  display: flex;
+  gap: 12px;
+  margin-top: 6px;
+}
+
+.log {
+  border-top: 1px solid var(--border);
+  background: #080c13;
+  max-height: 160px;
+  overflow-y: auto;
+}
+.log .log-header {
+  padding: 6px 20px;
+  font-size: 10px;
+  text-transform: uppercase;
+  color: var(--muted);
+  border-bottom: 1px solid var(--border);
+}
+.log ul { list-style: none; margin: 0; padding: 4px 20px; font-size: 10px; }
+.log li { line-height: 1.5; font-family: monospace; color: var(--muted); }
+.log li.warn { color: var(--warn); }
+.log li.error { color: var(--err); }
+
+.video-modal {
+  position: fixed;
+  inset: 0;
+  background: rgba(0, 0, 0, 0.85);
+  display: flex;
+  align-items: center;
+  justify-content: center;
+  z-index: 100;
+}
+.video-modal.hidden { display: none; }
+.video-modal-content {
+  background: var(--panel);
+  border: 1px solid var(--border);
+  border-radius: 6px;
+  padding: 16px;
+  max-width: 90vw;
+  max-height: 90vh;
+}
+.video-modal-header {
+  display: flex;
+  justify-content: space-between;
+  align-items: center;
+  margin-bottom: 12px;
+  color: var(--accent);
+  font-size: 13px;
+}
+.video-modal-header button {
+  background: var(--border);
+  color: var(--text);
+  border: none;
+  padding: 4px 12px;
+  border-radius: 3px;
+  cursor: pointer;
+}
+#video-frame {
+  display: block;
+  max-width: 100%;
+  max-height: 70vh;
+  border: 1px solid var(--border);
+}
+```
+
+- [ ] **Step 3: Create `dashboard/dashboard.js`**
+
+```javascript
+// tests/soak/dashboard/dashboard.js
+(() => {
+  const ws = new WebSocket(`ws://${location.host}`);
+  const roomsEl = document.getElementById('rooms');
+  const logEl = document.getElementById('log-list');
+  const wsStatusEl = document.getElementById('ws-status');
+  const metricGames = document.getElementById('metric-games');
+  const metricMoves = document.getElementById('metric-moves');
+  const metricErrors = document.getElementById('metric-errors');
+  const elapsedEl = document.getElementById('elapsed');
+
+  const roomTiles = new Map();
+  const startTime = Date.now();
+  let currentWatchedKey = null;
+
+  // Video modal
+  const videoModal = document.getElementById('video-modal');
+  const videoFrame = document.getElementById('video-frame');
+  const videoTitle = document.getElementById('video-modal-title');
+  const videoClose = document.getElementById('video-modal-close');
+
+  function fmtElapsed(ms) {
+    const s = Math.floor(ms / 1000);
+    const h = Math.floor(s / 3600);
+    const m = Math.floor((s % 3600) / 60);
+    const sec = s % 60;
+    return `${String(h).padStart(2, '0')}:${String(m).padStart(2, '0')}:${String(sec).padStart(2, '0')}`;
+  }
+  setInterval(() => {
+    elapsedEl.textContent = fmtElapsed(Date.now() - startTime);
+  }, 1000);
+
+  function ensureRoomTile(roomId) {
+    if (roomTiles.has(roomId)) return roomTiles.get(roomId);
+    const tile = document.createElement('div');
+    tile.className = 'room';
+    tile.innerHTML = `
+      <div class="room-title">
+        <div class="name">${roomId}</div>
+        <div class="phase lobby">waiting</div>
+      </div>
+      <div class="players"></div>
+      <div class="progress-bar"><div class="progress-fill" style="width:0%"></div></div>
+      <div class="room-meta">
+        <span class="moves">0 moves</span>
+        <span class="game">game —</span>
+      </div>
+    `;
+    roomsEl.appendChild(tile);
+    roomTiles.set(roomId, tile);
+    return tile;
+  }
+
+  function renderRoomState(roomId, state) {
+    const tile = ensureRoomTile(roomId);
+    if (state.phase !== undefined) {
+      const phaseEl = tile.querySelector('.phase');
+      phaseEl.textContent = state.phase;
+      phaseEl.classList.toggle('lobby', state.phase === 'lobby' || state.phase === 'waiting');
+      phaseEl.classList.toggle('err', state.phase === 'error');
+    }
+    if (state.players !== undefined) {
+      const playersEl = tile.querySelector('.players');
+      playersEl.innerHTML = state.players
+        .map(
+          (p) => `
+            <div class="player ${p.isActive ? 'active' : ''}" data-session="${p.key}">
+              <span>${p.isActive ? '▶ ' : ''}${p.key}</span>
+              <span class="score">${p.score ?? '—'}</span>
+            </div>
+          `,
+        )
+        .join('');
+    }
+    if (state.hole !== undefined && state.totalHoles !== undefined) {
+      const fill = tile.querySelector('.progress-fill');
+      const pct = state.totalHoles > 0 ? Math.round((state.hole / state.totalHoles) * 100) : 0;
+      fill.style.width = `${pct}%`;
+    }
+    if (state.moves !== undefined) {
+      tile.querySelector('.moves').textContent = `${state.moves} moves`;
+    }
+    if (state.game !== undefined && state.totalGames !== undefined) {
+      tile.querySelector('.game').textContent = `game ${state.game}/${state.totalGames}`;
+    }
+  }
+
+  function appendLog(level, msg, meta) {
+    const li = document.createElement('li');
+    li.className = level;
+    const ts = new Date().toLocaleTimeString();
+    li.textContent = `[${ts}] ${msg} ${meta ? JSON.stringify(meta) : ''}`;
+    logEl.insertBefore(li, logEl.firstChild);
+    // Cap log length
+    while (logEl.children.length > 100) {
+      logEl.removeChild(logEl.lastChild);
+    }
+  }
+
+  function applyMetric(name, value) {
+    if (name === 'games_completed') metricGames.textContent = value;
+    else if (name === 'moves_total') metricMoves.textContent = value;
+    else if (name === 'errors') metricErrors.textContent = value;
+  }
+
+  ws.addEventListener('open', () => {
+    wsStatusEl.textContent = 'healthy';
+    wsStatusEl.style.color = 'var(--good)';
+  });
+  ws.addEventListener('close', () => {
+    wsStatusEl.textContent = 'disconnected';
+    wsStatusEl.style.color = 'var(--err)';
+  });
+  ws.addEventListener('message', (event) => {
+    let msg;
+    try {
+      msg = JSON.parse(event.data);
+    } catch {
+      return;
+    }
+    if (msg.type === 'room_state') {
+      renderRoomState(msg.roomId, msg.state);
+    } else if (msg.type === 'log') {
+      appendLog(msg.level, msg.msg, msg.meta);
+    } else if (msg.type === 'metric') {
+      applyMetric(msg.name, msg.value);
+    } else if (msg.type === 'frame') {
+      if (msg.sessionKey === currentWatchedKey) {
+        videoFrame.src = `data:image/jpeg;base64,${msg.jpegBase64}`;
+      }
+    }
+  });
+
+  // Click-to-watch (wired in Task 23)
+  roomsEl.addEventListener('click', (e) => {
+    const playerEl = e.target.closest('.player');
+    if (!playerEl) return;
+    const key = playerEl.dataset.session;
+    if (!key) return;
+    currentWatchedKey = key;
+    videoTitle.textContent = `Watching ${key}`;
+    videoModal.classList.remove('hidden');
+    ws.send(JSON.stringify({ type: 'start_stream', sessionKey: key }));
+  });
+
+  function closeVideo() {
+    if (currentWatchedKey) {
+      ws.send(JSON.stringify({ type: 'stop_stream', sessionKey: currentWatchedKey }));
+    }
+    currentWatchedKey = null;
+    videoModal.classList.add('hidden');
+    videoFrame.src = '';
+  }
+  videoClose.addEventListener('click', closeVideo);
+  document.addEventListener('keydown', (e) => {
+    if (e.key === 'Escape') closeVideo();
+  });
+})();
+```
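+
+`ensureRoomTile` and `renderRoomState` interpolate `roomId` and `p.key` into `innerHTML`. Both values come from the local runner, so this is low risk here. If those strings could ever contain markup, escaping them first would be one way to harden the templates. A minimal sketch follows, written in TypeScript; `escapeHtml` is a hypothetical helper that is not part of the plan, and the type annotations would need to be dropped to use it in `dashboard.js`.
+
+```typescript
+const HTML_ESCAPES: Record<string, string> = {
+  '&': '&amp;',
+  '<': '&lt;',
+  '>': '&gt;',
+  '"': '&quot;',
+  "'": '&#39;',
+};
+
+// Replace the five characters that are significant in HTML text and attributes.
+function escapeHtml(text: string): string {
+  return text.replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch] ?? ch);
+}
+
+console.log(escapeHtml('room-0 <img src=x onerror="alert(1)">'));
+```
+
+In the templates this would read `${escapeHtml(roomId)}` and `${escapeHtml(p.key)}`.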
+
+- [ ] **Step 4: Commit**
+
+```bash
+git add tests/soak/dashboard/index.html tests/soak/dashboard/dashboard.css tests/soak/dashboard/dashboard.js
+git commit -m "$(cat <<'EOF'
+feat(soak): dashboard status grid UI
+
+Static HTML page served by DashboardServer. Renders the 2×2 room
+grid with progress bars and player tiles, subscribes to WS events,
+updates tiles live. The click-to-watch modal is wired, but it receives
+frames only once the CDP screencaster (Task 22) is hooked up in Task 23.
+
+Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+EOF
+)"
+```
+
+---
+
+### Task 21: Wire `--watch=dashboard` in runner
+
+Start the dashboard server when `--watch=dashboard`, auto-open the URL in the user's browser, use its `reporter()` as the `ctx.dashboard`.
+
+**Files:**
+- Modify: `tests/soak/runner.ts`
+
+- [ ] **Step 1: Import and instantiate DashboardServer in `runner.ts`**
+
+At the top of `runner.ts`, add:
+
+```typescript
+import { DashboardServer } from './dashboard/server';
+import { spawn } from 'child_process';
+```
+
+Replace the block that creates `dashboard` with:
+
+```typescript
+  // Build dashboard if requested
+  let dashboardServer: DashboardServer | null = null;
+  let dashboard: DashboardReporter = noopDashboard();
+  if (watch === 'dashboard') {
+    const port = Number(config.dashboardPort ?? 7777);
+    dashboardServer = new DashboardServer(port, logger, {
+      onStartStream: (key) => {
+        logger.info('stream_start_requested', { sessionKey: key });
+        // Wired in Task 23
+      },
+      onStopStream: (key) => {
+        logger.info('stream_stop_requested', { sessionKey: key });
+      },
+    });
+    await dashboardServer.start();
+    dashboard = dashboardServer.reporter();
+    const url = `http://localhost:${port}`;
+    console.log(`Dashboard: ${url}`);
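+    // Best-effort auto-open. A missing opener surfaces as an async 'error'
+    // event rather than a throw, so listen for it instead of crashing the run.
+    try {
+      // On Windows, `start` is a cmd.exe builtin, not an executable.
+      const [cmd, args]: [string, string[]] =
+        process.platform === 'darwin' ? ['open', [url]]
+        : process.platform === 'win32' ? ['cmd', ['/c', 'start', '', url]]
+        : ['xdg-open', [url]];
+      const child = spawn(cmd, args, { stdio: 'ignore', detached: true });
+      child.on('error', () => {}); // URL is already printed above
+      child.unref();
+    } catch {
+      // If auto-open fails, the URL is already printed
+    }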
+  } else if (watch === 'tiled') {
+    logger.warn('tiled_not_yet_implemented');
+    console.warn('Watch mode "tiled" not yet implemented (Task 24). Falling back to none.');
+  }
+```
+
+And in the `finally` block, shut down the server:
+
+```typescript
+  } finally {
+    await pool.release();
+    if (dashboardServer) {
+      await dashboardServer.stop();
+    }
+  }
+```
+
+Also remove the earlier `if (watch !== 'none')` warning block — it's replaced by the dispatch above.
+
+- [ ] **Step 2: Run smoke against dev with dashboard**
+
+```bash
+TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run soak -- \
+  --scenario=populate \
+  --accounts=2 --rooms=1 --cpus-per-room=0 --games-per-room=1 --holes=1 \
+  --watch=dashboard
+```
+
+Expected:
+- `Dashboard: http://localhost:7777` printed
+- Browser auto-opens (or you open it manually)
+- Page shows the dashboard with `WS: healthy`
+- During the game, the `room-0` tile shows `phase: playing`, increments `moves`, updates progress
+- After game completes, the runner exits 0 and the dashboard stops
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add tests/soak/runner.ts
+git commit -m "$(cat <<'EOF'
+feat(soak): wire --watch=dashboard in runner
+
+Starts DashboardServer on 7777 (configurable), uses its reporter as
+ctx.dashboard, auto-opens the URL. Cleans up on exit.
+
+Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+EOF
+)"
+```
+
+---
+
+## Phase 6 — Live video click-to-watch
+
+### Task 22: CDP screencast module
+
+Attach a CDP session to a given page, start a JPEG screencast (Chrome emits frames as the page repaints; `everyNthFrame` subsamples them), forward each frame to a callback, and detach on stop.
+
+**Files:**
+- Create: `tests/soak/core/screencaster.ts`
+
+- [ ] **Step 1: Implement `core/screencaster.ts`**
+
+```typescript
+// tests/soak/core/screencaster.ts
+import type { Page, CDPSession } from 'playwright-core';
+import type { Logger } from './types';
+
+export interface ScreencastOptions {
+  format?: 'jpeg' | 'png';
+  quality?: number;
+  maxWidth?: number;
+  maxHeight?: number;
+  everyNthFrame?: number;
+}
+
+export type FrameCallback = (jpegBase64: string) => void;
+
+export class Screencaster {
+  private sessions = new Map<string, CDPSession>();
+
+  constructor(private logger: Logger) {}
+
+  /**
+   * Attach a CDP session to the given page and start forwarding frames.
+   * If already streaming, this is a no-op.
+   */
+  async start(
+    sessionKey: string,
+    page: Page,
+    onFrame: FrameCallback,
+    opts: ScreencastOptions = {},
+  ): Promise<void> {
+    if (this.sessions.has(sessionKey)) {
+      this.logger.warn('screencast_already_running', { sessionKey });
+      return;
+    }
+    const client = await page.context().newCDPSession(page);
+    this.sessions.set(sessionKey, client);
+
+    client.on('Page.screencastFrame', async (evt: { data: string; sessionId: number }) => {
+      try {
+        onFrame(evt.data);
+        await client.send('Page.screencastFrameAck', { sessionId: evt.sessionId });
+      } catch (err) {
+        this.logger.warn('screencast_frame_error', {
+          sessionKey,
+          error: err instanceof Error ? err.message : String(err),
+        });
+      }
+    });
+
+    await client.send('Page.startScreencast', {
+      format: opts.format ?? 'jpeg',
+      quality: opts.quality ?? 60,
+      maxWidth: opts.maxWidth ?? 640,
+      maxHeight: opts.maxHeight ?? 360,
+      everyNthFrame: opts.everyNthFrame ?? 2,
+    });
+    this.logger.info('screencast_started', { sessionKey });
+  }
+
+  async stop(sessionKey: string): Promise<void> {
+    const client = this.sessions.get(sessionKey);
+    if (!client) return;
+    try {
+      await client.send('Page.stopScreencast');
+      await client.detach();
+    } catch (err) {
+      this.logger.warn('screencast_stop_error', {
+        sessionKey,
+        error: err instanceof Error ? err.message : String(err),
+      });
+    }
+    this.sessions.delete(sessionKey);
+    this.logger.info('screencast_stopped', { sessionKey });
+  }
+
+  async stopAll(): Promise<void> {
+    const keys = Array.from(this.sessions.keys());
+    await Promise.all(keys.map((k) => this.stop(k)));
+  }
+}
+```
+
+- [ ] **Step 2: Syntax-check**
+
+```bash
+cd tests/soak
+npx tsx -e "import('./core/screencaster').then(() => console.log('ok'))"
+```
+
+Expected: `ok`.
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add tests/soak/core/screencaster.ts
+git commit -m "$(cat <<'EOF'
+feat(soak): Screencaster — CDP Page.startScreencast wrapper
+
+Attach/detach CDP sessions per Playwright Page, start/stop JPEG
+screencasts with configurable quality and frame rate, forward each
+frame to a callback. Used by the dashboard for click-to-watch
+live video.
+
+Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+EOF
+)"
+```
+
+---
+
+### Task 23: Wire screencaster to dashboard click-to-watch
+
+Runner creates a `Screencaster`, passes callbacks into `DashboardServer.onStartStream/onStopStream` that look up the right session and start/stop streaming. Each frame is broadcast to the dashboard.
+
+**Files:**
+- Modify: `tests/soak/runner.ts`
+
+- [ ] **Step 1: Import Screencaster and hold a sessions map**
+
+In `runner.ts`, add at the top:
+
+```typescript
+import { Screencaster } from './core/screencaster';
+```
+
+After `const sessions = await pool.acquire(accounts);`, build a lookup map:
+
+```typescript
+    const sessionsByKey = new Map<string, typeof sessions[number]>();
+    for (const s of sessions) sessionsByKey.set(s.key, s);
+```
+
+Create the screencaster before the dashboard (or right after sessions are acquired):
+
+```typescript
+    const screencaster = new Screencaster(logger);
+```
+
+- [ ] **Step 2: Replace the `onStartStream`/`onStopStream` no-ops with real wiring**
+
+The handlers must close over `screencaster` and `sessionsByKey`, both of which are built after sessions are acquired, so the `DashboardServer` construction (currently earlier in the function) has to move. Relocate it to after `const sessions = await pool.acquire(accounts);` and pass the real handlers:
+
+```typescript
+    if (watch === 'dashboard') {
+      const port = Number(config.dashboardPort ?? 7777);
+      dashboardServer = new DashboardServer(port, logger, {
+        onStartStream: (key) => {
+          const session = sessionsByKey.get(key);
+          if (!session) {
+            logger.warn('stream_start_unknown_session', { sessionKey: key });
+            return;
+          }
+          screencaster
+            .start(key, session.page, (jpegBase64) => {
+              dashboardServer!.broadcast({ type: 'frame', sessionKey: key, jpegBase64 });
+            })
+            .catch((err) =>
+              logger.error('screencast_start_failed', {
+                sessionKey: key,
+                error: err instanceof Error ? err.message : String(err),
+              }),
+            );
+        },
+        onStopStream: (key) => {
+          screencaster.stop(key).catch(() => {});
+        },
+        onDisconnect: () => {
+          screencaster.stopAll().catch(() => {});
+        },
+      });
+      await dashboardServer.start();
+      dashboard = dashboardServer.reporter();
+      const url = `http://localhost:${port}`;
+      console.log(`Dashboard: ${url}`);
+      // ... auto-open
+    }
+```
+
+Make sure the `ctx.dashboard` assignment happens AFTER the dashboard setup (it already does — `const ctx = { ... dashboard, ... }` comes later).
+
+In the `finally` block, add:
+
+```typescript
+    await screencaster.stopAll();
+```
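+
+On the dashboard side, each `frame` message carries one base64 JPEG for one session. A minimal sketch of the client-side handling, assuming only the message shape used in the `broadcast` call above; `isFrameMessage`, `frameToDataUrl`, `latestFrames`, and `handleMessage` are hypothetical names, not part of the existing dashboard page:
+
+```typescript
+// Hypothetical helper names; only the message shape comes from the runner code.
+interface FrameMessage {
+  type: 'frame';
+  sessionKey: string;
+  jpegBase64: string;
+}
+
+// Narrow an arbitrary parsed WS payload to a frame message.
+function isFrameMessage(msg: unknown): msg is FrameMessage {
+  if (typeof msg !== 'object' || msg === null) return false;
+  const m = msg as Record<string, unknown>;
+  return (
+    m.type === 'frame' &&
+    typeof m.sessionKey === 'string' &&
+    typeof m.jpegBase64 === 'string'
+  );
+}
+
+// Build a data URL that can be assigned to an <img> element's src.
+function frameToDataUrl(msg: FrameMessage): string {
+  return `data:image/jpeg;base64,${msg.jpegBase64}`;
+}
+
+// Keep only the newest frame per session so a slow viewer never queues stale frames.
+const latestFrames = new Map<string, string>();
+
+function handleMessage(raw: string): void {
+  const parsed: unknown = JSON.parse(raw);
+  if (isFrameMessage(parsed)) {
+    latestFrames.set(parsed.sessionKey, frameToDataUrl(parsed));
+  }
+}
+```
+
+Non-frame messages (status updates and the like) fall through untouched, so the same WebSocket handler can serve both.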
+
+- [ ] **Step 3: Manual test end-to-end**
+
+Run a longer populate game so there's time to click:
+
+```bash
+TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run soak -- \
+  --scenario=populate \
+  --accounts=4 --rooms=1 --cpus-per-room=0 --games-per-room=2 --holes=3 \
+  --watch=dashboard
+```
+
+Expected:
+1. Dashboard opens, shows 1 room with 4 players
+2. Click on any player tile (`soak_00`, `soak_01`, ...)
+3. Modal opens, shows live JPEG frames of that player's view of the game
+4. Close modal (Esc or Close button) — frames stop, screencast detaches
+5. Run completes cleanly
+
+- [ ] **Step 4: Commit**
+
+```bash
+git add tests/soak/runner.ts
+git commit -m "$(cat <<'EOF'
+feat(soak): click-to-watch live video via CDP screencast
+
+Runner creates a Screencaster and wires its start/stop into
+DashboardServer.onStartStream/onStopStream. Clicking a player tile
+in the dashboard starts a CDP screencast on that session's page,
+forwards JPEG frames as WS "frame" messages, closes on modal
+dismiss or WS disconnect.
+
+Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+EOF
+)"
+```
+
+---
+
+## Phase 7 — Tiled mode
+
+### Task 24: `--watch=tiled` native windows
+
+Launch a second headed browser for the host contexts (one per room, 4 by default) and position their windows in a 2×2 grid by calling `window.moveTo`/`window.resizeTo` through `page.evaluate`.
+
+**Files:**
+- Modify: `tests/soak/core/session-pool.ts` — add optional headed-host support
+- Modify: `tests/soak/runner.ts` — enable tiled mode
+
+- [ ] **Step 1: Extend `SessionPool` to support headed host contexts**
+
+Add a new option and method to `SessionPool`. In `core/session-pool.ts`:
+
+```typescript
+export interface SessionPoolOptions {
+  targetUrl: string;
+  inviteCode: string;
+  credFile: string;
+  logger: Logger;
+  browser?: Browser;
+  contextOptions?: Parameters<Browser['newContext']>[0];
+  /** If set, the first `headedHostCount` sessions use a separate headed browser. */
+  headedHostCount?: number;
+}
+```
+
+Inside the class, add a `headedBrowser` field and extend `acquire`:
+
+```typescript
+  private headedBrowser: Browser | null = null;
+
+  // ... in acquire(), before the loop:
+
+  if ((this.opts.headedHostCount ?? 0) > 0 && !this.headedBrowser) {
+    this.headedBrowser = await chromium.launch({
+      headless: false,
+      slowMo: 50,
+    });
+  }
+
+  for (let i = 0; i < count; i++) {
+    const account = this.accounts[i];
+    const useHeaded = i < (this.opts.headedHostCount ?? 0);
+    const targetBrowser = useHeaded ? this.headedBrowser! : this.browser!;
+    const context = await targetBrowser.newContext({
+      ...this.opts.contextOptions,
+      ...(useHeaded ? { viewport: { width: 960, height: 540 } } : {}),
+    });
+    await this.injectAuth(context, account);
+    const page = await context.newPage();
+    await page.goto(this.opts.targetUrl);
+
+    // Position headed windows in a 2×2 grid
+    if (useHeaded) {
+      const col = i % 2;
+      const row = Math.floor(i / 2);
+      const x = col * 960;
+      const y = row * 560;
+      await page.evaluate(
+        ([x, y, w, h]) => {
+          window.moveTo(x, y);
+          window.resizeTo(w, h);
+        },
+        [x, y, 960, 540] as [number, number, number, number],
+      );
+    }
+
+    const bot = new GolfBot(page);
+    sessions.push({ account, context, page, bot, key: account.key });
+  }
+```
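+
+Chromium generally ignores `window.moveTo`/`window.resizeTo` for windows that were not opened via `window.open`, so the `page.evaluate` call above may have no visible effect. One alternative, sketched here under that assumption, is the CDP `Browser.getWindowForTarget` / `Browser.setWindowBounds` pair. `gridBounds`, `placeWindow`, and the `CdpLike` interface are illustrative names; in the pool, `client` would be the result of `context.newCDPSession(page)` (a cast may be needed to satisfy the structural type).
+
+```typescript
+interface WindowBounds {
+  left: number;
+  top: number;
+  width: number;
+  height: number;
+}
+
+// Minimal structural type so the sketch does not depend on Playwright's types.
+interface CdpLike {
+  send(method: string, params?: Record<string, unknown>): Promise<any>;
+}
+
+// Bounds for window `index` in a grid with `cols` columns.
+// rowGap leaves room for the window title bar (560 - 540 in the plan above).
+function gridBounds(
+  index: number,
+  cols: number,
+  width: number,
+  height: number,
+  rowGap = 20,
+): WindowBounds {
+  const col = index % cols;
+  const row = Math.floor(index / cols);
+  return { left: col * width, top: row * (height + rowGap), width, height };
+}
+
+// Look up the OS window that hosts this CDP target, then move/resize it.
+async function placeWindow(client: CdpLike, bounds: WindowBounds): Promise<void> {
+  const { windowId } = await client.send('Browser.getWindowForTarget');
+  await client.send('Browser.setWindowBounds', { windowId, bounds });
+}
+```
+
+With the values used above, `gridBounds(i, 2, 960, 540)` reproduces the same `x`/`y` arithmetic as the loop.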
+
+Update `release` to close the headed browser too:
+
+```typescript
+  async release(): Promise<void> {
+    for (const session of this.activeSessions) {
+      try { await session.context.close(); } catch { /* ignore */ }
+    }
+    this.activeSessions = [];
+    if (this.ownedBrowser) {
+      try { await this.ownedBrowser.close(); } catch { /* ignore */ }
+      this.ownedBrowser = null;
+      this.browser = null;
+    }
+    if (this.headedBrowser) {
+      try { await this.headedBrowser.close(); } catch { /* ignore */ }
+      this.headedBrowser = null;
+    }
+  }
+```
+
+- [ ] **Step 2: Wire `watch === 'tiled'` in the runner**
+
+In `runner.ts`, replace the existing `tiled_not_yet_implemented` warning with:
+
+```typescript
+  const headedHostCount = watch === 'tiled' ? rooms : 0;
+
+  const pool = new SessionPool({
+    targetUrl,
+    inviteCode,
+    credFile,
+    logger,
+    headedHostCount,
+  });
+```
+
+(Move that `pool` creation up so it's aware of `watch`.)
+
+- [ ] **Step 3: Test tiled mode**
+
+```bash
+TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run soak -- \
+  --scenario=populate \
+  --accounts=4 --rooms=2 --cpus-per-room=0 --games-per-room=1 --holes=1 \
+  --watch=tiled
+```
+
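+Expected: 2 native Chromium windows appear (one per host), sized ~960×540 and placed side by side along the top of the screen (the first two cells of the 2×2 grid). They play the game visibly. On exit, windows close. If Chromium ignores `window.moveTo` for these windows, they open at the default position instead.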
+
+- [ ] **Step 4: Commit**
+
+```bash
+git add tests/soak/core/session-pool.ts tests/soak/runner.ts
+git commit -m "$(cat <<'EOF'
+feat(soak): --watch=tiled launches N headed host windows
+
+SessionPool accepts headedHostCount; when > 0 it launches a second
+Chromium in headed mode, creates those contexts there, and positions
+each host window in a 2×2 grid via window.moveTo/resizeTo.
+
+Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+EOF
+)"
+```
+
+---
+
+## Phase 8 — Stress scenario
+
+### Task 25: Chaos injector + stress scenario
+
+Short 1-hole games in tight loops. A background loop wakes every 500 ms and gives each session a 5% chance per tick of a chaos event (rapid clicks, a brief offline toggle, or a tab blur/focus).
+
+**Files:**
+- Create: `tests/soak/scenarios/stress.ts`
+- Create: `tests/soak/scenarios/shared/chaos.ts`
+- Modify: `tests/soak/scenarios/index.ts` — register `stress`
+
+- [ ] **Step 1: Create `scenarios/shared/chaos.ts`**
+
+```typescript
+// tests/soak/scenarios/shared/chaos.ts
+import type { Session, Logger } from '../../core/types';
+
+export type ChaosEvent =
+  | 'rapid_clicks'
+  | 'tab_blur'
+  | 'brief_offline';
+
+const ALL_EVENTS: ChaosEvent[] = ['rapid_clicks', 'tab_blur', 'brief_offline'];
+
+function pickEvent(): ChaosEvent {
+  return ALL_EVENTS[Math.floor(Math.random() * ALL_EVENTS.length)];
+}
+
+export async function maybeInjectChaos(
+  session: Session,
+  probability: number,
+  logger: Logger,
+  roomId: string,
+): Promise<ChaosEvent | null> {
+  if (Math.random() >= probability) return null;
+
+  const event = pickEvent();
+  logger.info('chaos_injected', { room: roomId, session: session.key, event });
+  try {
+    switch (event) {
+      case 'rapid_clicks': {
+        // Fire 5 rapid clicks at the player's own cards
+        for (let i = 0; i < 5; i++) {
+          await session.page.locator(`#player-cards .card:nth-child(${(i % 6) + 1})`)
+            .click({ timeout: 300 })
+            .catch(() => {});
+        }
+        break;
+      }
+      case 'tab_blur': {
+        // Briefly dispatch blur then focus
+        await session.page.evaluate(() => {
+          window.dispatchEvent(new Event('blur'));
+          setTimeout(() => window.dispatchEvent(new Event('focus')), 200);
+        });
+        break;
+      }
+      case 'brief_offline': {
+        await session.context.setOffline(true);
+        await new Promise((r) => setTimeout(r, 300));
+        await session.context.setOffline(false);
+        break;
+      }
+    }
+  } catch (err) {
+    logger.warn('chaos_error', {
+      event,
+      error: err instanceof Error ? err.message : String(err),
+    });
+  }
+  return event;
+}
+```
+
+- [ ] **Step 2: Create `scenarios/stress.ts`**
+
+```typescript
+// tests/soak/scenarios/stress.ts
+import type {
+  Scenario,
+  ScenarioContext,
+  ScenarioResult,
+  ScenarioError,
+  Session,
+} from '../core/types';
+import { runOneMultiplayerGame } from './shared/multiplayer-game';
+import { maybeInjectChaos } from './shared/chaos';
+
+interface StressConfig {
+  gamesPerRoom: number;
+  holes: number;
+  decks: number;
+  rooms: number;
+  cpusPerRoom: number;
+  thinkTimeMs: [number, number];
+  interGamePauseMs: number;
+  chaosChance: number;
+}
+
+function chunk<T>(arr: T[], size: number): T[][] {
+  const out: T[][] = [];
+  for (let i = 0; i < arr.length; i += size) out.push(arr.slice(i, i + size));
+  return out;
+}
+
+async function sleep(ms: number): Promise<void> {
+  return new Promise((r) => setTimeout(r, ms));
+}
+
+async function runStressRoom(
+  ctx: ScenarioContext,
+  cfg: StressConfig,
+  roomIdx: number,
+  sessions: Session[],
+): Promise<{ completed: number; errors: ScenarioError[]; chaosFired: number }> {
+  const roomId = `room-${roomIdx}`;
+  let completed = 0;
+  let chaosFired = 0;
+  const errors: ScenarioError[] = [];
+
+  for (let gameNum = 0; gameNum < cfg.gamesPerRoom; gameNum++) {
+    if (ctx.signal.aborted) break;
+
+    ctx.dashboard.update(roomId, { game: gameNum + 1, totalGames: cfg.gamesPerRoom });
+
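+    // Start a background chaos loop for this game
+    let chaosActive = true;
+    const chaosLoop = (async () => {
+      while (chaosActive && !ctx.signal.aborted) {
+        await sleep(500);
+        for (const session of sessions) {
+          // Stop promptly once the game has ended or the run is aborted.
+          if (!chaosActive || ctx.signal.aborted) break;
+          const e = await maybeInjectChaos(session, cfg.chaosChance, ctx.logger, roomId);
+          if (e) chaosFired++;
+        }
+      }
+    })();
+
+    // Always stop the chaos loop, even if the game throws; otherwise it
+    // would keep poking these sessions until the whole run is aborted.
+    let result: Awaited<ReturnType<typeof runOneMultiplayerGame>>;
+    try {
+      result = await runOneMultiplayerGame(ctx, sessions, {
+        roomId,
+        holes: cfg.holes,
+        decks: cfg.decks,
+        cpusPerRoom: cfg.cpusPerRoom,
+        thinkTimeMs: cfg.thinkTimeMs,
+      });
+    } finally {
+      chaosActive = false;
+      await chaosLoop;
+    }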
+
+    if (result.completed) {
+      completed++;
+      ctx.logger.info('game_complete', { room: roomId, game: gameNum + 1, turns: result.turns });
+    } else {
+      errors.push({
+        room: roomId,
+        reason: 'game_failed',
+        detail: result.error,
+        timestamp: Date.now(),
+      });
+      ctx.logger.error('game_failed', { room: roomId, error: result.error });
+    }
+
+    await sleep(cfg.interGamePauseMs);
+  }
+
+  return { completed, errors, chaosFired };
+}
+
+const stress: Scenario = {
+  name: 'stress',
+  description: 'Rapid short games for stability & race condition hunting',
+  needs: { accounts: 16, rooms: 4, cpusPerRoom: 2 },
+  defaultConfig: {
+    gamesPerRoom: 50,
+    holes: 1,
+    decks: 1,
+    rooms: 4,
+    cpusPerRoom: 2,
+    thinkTimeMs: [50, 150],
+    interGamePauseMs: 200,
+    chaosChance: 0.05,
+  },
+
+  async run(ctx: ScenarioContext): Promise<ScenarioResult> {
+    const start = Date.now();
+    const cfg = ctx.config as unknown as StressConfig;
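+    const perRoom = Math.floor(ctx.sessions.length / cfg.rooms);
+    if (perRoom < 1) {
+      // chunk() with size 0 would never terminate, so fail fast instead.
+      throw new Error(`stress needs at least ${cfg.rooms} sessions, got ${ctx.sessions.length}`);
+    }
+    // Drop any remainder chunk so there are never more rooms than cfg.rooms.
+    const roomSessions = chunk(ctx.sessions, perRoom).slice(0, cfg.rooms);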
+
+    const results = await Promise.allSettled(
+      roomSessions.map((s, idx) => runStressRoom(ctx, cfg, idx, s)),
+    );
+
+    let gamesCompleted = 0;
+    let chaosFired = 0;
+    const errors: ScenarioError[] = [];
+    results.forEach((r, idx) => {
+      if (r.status === 'fulfilled') {
+        gamesCompleted += r.value.completed;
+        chaosFired += r.value.chaosFired;
+        errors.push(...r.value.errors);
+      } else {
+        errors.push({
+          room: `room-${idx}`,
+          reason: 'room_threw',
+          detail: r.reason instanceof Error ? r.reason.message : String(r.reason),
+          timestamp: Date.now(),
+        });
+      }
+    });
+
+    return {
+      gamesCompleted,
+      errors,
+      durationMs: Date.now() - start,
+      customMetrics: { chaos_fired: chaosFired },
+    };
+  },
+};
+
+export default stress;
+```
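+
+Because the chaos loop ticks on a timer rather than on game turns, the effective event rate depends on session count and game length. A back-of-the-envelope sketch (the helper name `expectedChaosEvents` is hypothetical) that gives an upper bound, since each injected event also takes time and delays the next tick:
+
+```typescript
+// Upper bound on the expected number of chaos events for one game in one room.
+// tickMs mirrors the sleep(500) in the chaos loop.
+function expectedChaosEvents(
+  chaosChance: number,
+  sessionsPerRoom: number,
+  gameDurationMs: number,
+  tickMs = 500,
+): number {
+  const ticks = Math.floor(gameDurationMs / tickMs);
+  return ticks * sessionsPerRoom * chaosChance;
+}
+```
+
+For example, with `chaosChance` 0.05, 4 sessions, and an illustrative 30-second game, this comes to 60 × 4 × 0.05 = 12 events at most.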
+
+- [ ] **Step 3: Register stress in the registry**
+
+Edit `tests/soak/scenarios/index.ts`:
+
+```typescript
+import type { Scenario } from '../core/types';
+import populate from './populate';
+import stress from './stress';
+
+const registry: Record<string, Scenario> = {
+  populate,
+  stress,
+};
+
+export function getScenario(name: string): Scenario | undefined {
+  return registry[name];
+}
+
+export function listScenarios(): Scenario[] {
+  return Object.values(registry);
+}
+```
+
+- [ ] **Step 4: Smoke test stress scenario**
+
+```bash
+TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run soak -- \
+  --scenario=stress \
+  --accounts=4 --rooms=1 --cpus-per-room=1 --games-per-room=3 --holes=1 \
+  --watch=none
+```
+
+Expected: 3 quick games complete, chaos events in logs (look for `chaos_injected`), exit 0.
+
+- [ ] **Step 5: Commit**
+
+```bash
+git add tests/soak/scenarios/stress.ts tests/soak/scenarios/shared/chaos.ts tests/soak/scenarios/index.ts
+git commit -m "$(cat <<'EOF'
+feat(soak): stress scenario with chaos injection
+
+Rapid 1-hole games with a parallel chaos loop that gives each session
+a 5% chance per 500 ms tick of firing rapid_clicks, tab_blur, or
+brief_offline events. Chaos counts roll up into
+ScenarioResult.customMetrics.
+
+Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+EOF
+)"
+```
+
+---
+
+## Phase 9 — Failure handling
+
+### Task 26: Watchdog + heartbeat wiring
+
+Per-room timeout that fires if no heartbeat arrives within N ms. Runner wires it into `ctx.heartbeat`. Vitest-tested.
+
+**Files:**
+- Create: `tests/soak/core/watchdog.ts`
+- Create: `tests/soak/tests/watchdog.test.ts`
+- Modify: `tests/soak/runner.ts` — wire `heartbeat` to per-room watchdogs
+
+- [ ] **Step 1: Write failing tests**
+
+```typescript
+// tests/soak/tests/watchdog.test.ts
+import { describe, it, expect, vi, beforeEach, afterEach } from 'vitest';
+import { Watchdog } from '../core/watchdog';
+
+describe('Watchdog', () => {
+  beforeEach(() => vi.useFakeTimers());
+  afterEach(() => vi.useRealTimers());
+
+  it('fires after timeout if no heartbeat', () => {
+    const onTimeout = vi.fn();
+    const w = new Watchdog(1000, onTimeout);
+    w.start();
+    vi.advanceTimersByTime(1001);
+    expect(onTimeout).toHaveBeenCalledOnce();
+  });
+
+  it('heartbeat resets the timer', () => {
+    const onTimeout = vi.fn();
+    const w = new Watchdog(1000, onTimeout);
+    w.start();
+    vi.advanceTimersByTime(800);
+    w.heartbeat();
+    vi.advanceTimersByTime(800);
+    expect(onTimeout).not.toHaveBeenCalled();
+    vi.advanceTimersByTime(300);
+    expect(onTimeout).toHaveBeenCalledOnce();
+  });
+
+  it('stop cancels pending timeout', () => {
+    const onTimeout = vi.fn();
+    const w = new Watchdog(1000, onTimeout);
+    w.start();
+    w.stop();
+    vi.advanceTimersByTime(2000);
+    expect(onTimeout).not.toHaveBeenCalled();
+  });
+
+  it('ignores heartbeats after it has fired', () => {
+    const onTimeout = vi.fn();
+    const w = new Watchdog(1000, onTimeout);
+    w.start();
+    vi.advanceTimersByTime(1001);
+    w.heartbeat();
+    vi.advanceTimersByTime(1001);
+    expect(onTimeout).toHaveBeenCalledOnce();
+  });
+});
+```
+
+- [ ] **Step 2: Run to verify failure**
+
+```bash
+npx vitest run tests/watchdog.test.ts
+```
+
+Expected: FAIL.
+
+- [ ] **Step 3: Implement `core/watchdog.ts`**
+
+```typescript
+// tests/soak/core/watchdog.ts
+export class Watchdog {
+  private timer: NodeJS.Timeout | null = null;
+  private fired = false;
+
+  constructor(
+    private timeoutMs: number,
+    private onTimeout: () => void,
+  ) {}
+
+  start(): void {
+    this.stop();
+    this.fired = false;
+    this.timer = setTimeout(() => {
+      if (this.fired) return;
+      this.fired = true;
+      this.onTimeout();
+    }, this.timeoutMs);
+  }
+
+  heartbeat(): void {
+    if (this.fired) return;
+    this.start();
+  }
+
+  stop(): void {
+    if (this.timer) {
+      clearTimeout(this.timer);
+      this.timer = null;
+    }
+  }
+}
+```
+
+- [ ] **Step 4: Verify tests pass**
+
+```bash
+npx vitest run tests/watchdog.test.ts
+```
+
+Expected: all passing.
+
+- [ ] **Step 5: Wire watchdogs into the runner**
+
+In `runner.ts`, add before building `ctx`:
+
+```typescript
+    const watchdogs = new Map<string, Watchdog>();
+    const roomAborters = new Map<string, AbortController>();
+    for (let i = 0; i < rooms; i++) {
+      const roomId = `room-${i}`;
+      const aborter = new AbortController();
+      roomAborters.set(roomId, aborter);
+      const w = new Watchdog(60_000, () => {
+        logger.error('watchdog_fired', { room: roomId });
+        aborter.abort();
+        dashboard.update(roomId, { phase: 'error' });
+      });
+      w.start();
+      watchdogs.set(roomId, w);
+    }
+```
+
+Import at the top:
+
+```typescript
+import { Watchdog } from './core/watchdog';
+```
+
+Set `ctx.heartbeat` to:
+
+```typescript
+      heartbeat: (roomId: string) => {
+        const w = watchdogs.get(roomId);
+        if (w) w.heartbeat();
+      },
+```
+
+In the `finally` block, stop all watchdogs:
+
+```typescript
+    for (const w of watchdogs.values()) w.stop();
+```
+
+Note: the `roomAborters` are not yet plumbed into scenario cancellation; scenarios only see the global `ctx.signal`. This is intentional: per-room abort requires scenario-side awareness and is deferred until a scenario genuinely misbehaves. In the meantime a fired watchdog still surfaces a stuck room by logging `watchdog_fired` and marking that room's dashboard tile as errored; it does not stop the run.
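+
+If per-room cancellation is needed later, one possible shape is to hand each room a signal that aborts when either the global signal or that room's `AbortController` fires. A minimal sketch, assuming nothing beyond the standard `AbortController` API; `linkSignals` is a hypothetical helper, and scenarios would have to check the returned signal instead of `ctx.signal`:
+
+```typescript
+// Returns a signal that aborts as soon as any of the input signals aborts.
+function linkSignals(...signals: AbortSignal[]): AbortSignal {
+  const linked = new AbortController();
+  for (const s of signals) {
+    if (s.aborted) {
+      linked.abort();
+      break;
+    }
+    s.addEventListener('abort', () => linked.abort(), { once: true });
+  }
+  return linked.signal;
+}
+```
+
+A room would then receive something like `linkSignals(abortController.signal, aborter.signal)`. Newer Node versions ship `AbortSignal.any`, which covers the same case without a helper.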
+
+- [ ] **Step 6: Commit**
+
+```bash
+git add tests/soak/core/watchdog.ts tests/soak/tests/watchdog.test.ts tests/soak/runner.ts
+git commit -m "$(cat <<'EOF'
+feat(soak): per-room watchdog with heartbeat
+
+Watchdog class with Vitest tests, wired into ctx.heartbeat in the
+runner. One watchdog per room, 60s timeout; firing logs an error
+and marks the room's dashboard tile as errored.
+
+Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+EOF
+)"
+```
+
+---
+
+### Task 27: Artifact capture on failure
+
+When the runner catches an error, snapshot every session's page: screenshot, HTML, console log tail, game state JSON.
+
+**Files:**
+- Create: `tests/soak/core/artifacts.ts`
+- Modify: `tests/soak/runner.ts` — call `captureArtifacts` in the catch block
+
+- [ ] **Step 1: Implement `core/artifacts.ts`**
+
+```typescript
+// tests/soak/core/artifacts.ts
+import * as fs from 'fs';
+import * as path from 'path';
+import type { Session, Logger } from './types';
+
+export interface ArtifactsOptions {
+  runId: string;
+  /** Absolute path to the artifacts root, e.g., /path/to/tests/soak/artifacts */
+  rootDir: string;
+  logger: Logger;
+}
+
+export class Artifacts {
+  readonly runDir: string;
+
+  constructor(private opts: ArtifactsOptions) {
+    this.runDir = path.join(opts.rootDir, opts.runId);
+    fs.mkdirSync(this.runDir, { recursive: true });
+  }
+
+  /** Capture everything for a single session. */
+  async captureSession(session: Session, roomId: string): Promise<void> {
+    const dir = path.join(this.runDir, roomId);
+    fs.mkdirSync(dir, { recursive: true });
+    const prefix = session.key;
+
+    try {
+      const png = await session.page.screenshot({ fullPage: true });
+      fs.writeFileSync(path.join(dir, `${prefix}.png`), png);
+    } catch (err) {
+      this.opts.logger.warn('artifact_screenshot_failed', {
+        session: session.key,
+        error: err instanceof Error ? err.message : String(err),
+      });
+    }
+
+    try {
+      const html = await session.page.content();
+      fs.writeFileSync(path.join(dir, `${prefix}.html`), html);
+    } catch (err) {
+      this.opts.logger.warn('artifact_html_failed', {
+        session: session.key,
+        error: err instanceof Error ? err.message : String(err),
+      });
+    }
+
+    try {
+      const state = await session.bot.getGameState();
+      fs.writeFileSync(
+        path.join(dir, `${prefix}.state.json`),
+        JSON.stringify(state, null, 2),
+      );
+    } catch (err) {
+      this.opts.logger.warn('artifact_state_failed', {
+        session: session.key,
+        error: err instanceof Error ? err.message : String(err),
+      });
+    }
+
+    try {
+      const errors = session.bot.getConsoleErrors?.() ?? [];
+      fs.writeFileSync(path.join(dir, `${prefix}.console.txt`), errors.join('\n'));
+    } catch {
+      // ignore — not all bots expose this
+    }
+  }
+
+  async captureAll(sessions: Session[]): Promise<void> {
+    // Best-effort: the runner's error path has no session-to-room mapping,
+    // so everything lands under room-unknown/. Callers that know the room
+    // should call captureSession() per session instead.
+    await Promise.all(
+      sessions.map((s) => this.captureSession(s, 'room-unknown')),
+    );
+  }
+
+  writeSummary(summary: object): void {
+    fs.writeFileSync(
+      path.join(this.runDir, 'summary.json'),
+      JSON.stringify(summary, null, 2),
+    );
+  }
+}
+
+/** Prune run directories older than `maxAgeMs`. */
+export function pruneOldRuns(rootDir: string, maxAgeMs: number, logger: Logger): void {
+  if (!fs.existsSync(rootDir)) return;
+  const now = Date.now();
+  for (const entry of fs.readdirSync(rootDir)) {
+    const full = path.join(rootDir, entry);
+    try {
+      const stat = fs.statSync(full);
+      if (stat.isDirectory() && now - stat.mtimeMs > maxAgeMs) {
+        fs.rmSync(full, { recursive: true, force: true });
+        logger.info('artifact_pruned', { runId: entry });
+      }
+    } catch {
+      // ignore
+    }
+  }
+}
+```
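+
+`captureAll` files everything under `room-unknown/`. If sessions are assigned to rooms in contiguous, equal-sized chunks (as the `stress` scenario's `chunk` helper does), the room for each session can be recovered from its index. A sketch under that assumption; `roomIdForIndex` is a hypothetical helper, and filing leftover sessions under `room-unknown` is a choice made for this sketch:
+
+```typescript
+// Map a session's position in the acquired list to its room directory name.
+function roomIdForIndex(
+  sessionIndex: number,
+  totalSessions: number,
+  rooms: number,
+): string {
+  const perRoom = Math.floor(totalSessions / rooms);
+  if (perRoom < 1) return 'room-unknown';
+  const roomIdx = Math.floor(sessionIndex / perRoom);
+  // Sessions past the last full chunk do not belong to any configured room.
+  return roomIdx < rooms ? `room-${roomIdx}` : 'room-unknown';
+}
+```
+
+A caller could then use `sessions.map((s, i) => artifacts.captureSession(s, roomIdForIndex(i, sessions.length, rooms)))` in place of `captureAll`.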
+
+- [ ] **Step 2: Call artifact capture from the runner's error path**
+
+In `runner.ts`, import:
+
+```typescript
+import { Artifacts, pruneOldRuns } from './core/artifacts';
+```
+
+After `const runId = ...`, instantiate and prune:
+
+```typescript
+  const artifactsRoot = path.resolve(__dirname, 'artifacts');
+  const artifacts = new Artifacts({ runId, rootDir: artifactsRoot, logger });
+  pruneOldRuns(artifactsRoot, 7 * 24 * 3600 * 1000, logger);
+```
+
+In the `catch (err)` block, after logging, capture:
+
+```typescript
+  } catch (err) {
+    logger.error('run_failed', {
+      error: err instanceof Error ? err.message : String(err),
+      stack: err instanceof Error ? err.stack : undefined,
+    });
+    try {
+      const liveSessions = pool['activeSessions'] as Session[] | undefined;
+      if (liveSessions && liveSessions.length > 0) {
+        await artifacts.captureAll(liveSessions);
+      }
+    } catch (captureErr) {
+      logger.warn('artifact_capture_failed', {
+        error: captureErr instanceof Error ? captureErr.message : String(captureErr),
+      });
+    }
+    exitCode = 1;
+  }
+```
+
+(Note: the `pool['activeSessions']` access bypasses visibility to avoid adding a public getter for one call site. Acceptable for an error path in a test harness.)
+
+After successful run, write the summary:
+
+```typescript
+    artifacts.writeSummary({
+      runId,
+      scenario: scenario.name,
+      targetUrl,
+      gamesCompleted: result.gamesCompleted,
+      errors: result.errors,
+      durationMs: result.durationMs,
+      customMetrics: result.customMetrics,
+    });
+```
+
+Import `Session` type:
+
+```typescript
+import type { Session } from './core/types';
+```
+
+- [ ] **Step 3: Verify by forcing a failure**
+
+Kill the server mid-run and confirm artifacts are written:
+
+```bash
+# In one terminal
+TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run soak -- \
+  --scenario=populate --accounts=2 --rooms=1 --cpus-per-room=0 \
+  --games-per-room=5 --holes=3 --watch=none
+
+# In another: wait ~3 seconds then Ctrl-C the dev server
+# The soak run should catch errors and write artifacts
+
+ls tests/soak/artifacts/
+ls tests/soak/artifacts/<run-id>/
+```
+
+Expected: a run directory exists with `summary.json` (if it got far enough) or per-session screenshots / HTML under `room-unknown/`.
+
+- [ ] **Step 4: Commit**
+
+```bash
+git add tests/soak/core/artifacts.ts tests/soak/runner.ts
+git commit -m "$(cat <<'EOF'
+feat(soak): artifact capture on failure + run summary
+
+Screenshots, HTML, game state, and console errors are captured into
+tests/soak/artifacts/<run-id>/ when a scenario throws. Runs older
+than 7 days are pruned on startup. Successful runs get a
+summary.json inside their run directory.
+
+Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+EOF
+)"
+```
+
+---
+
+### Task 28: Graceful shutdown (already partially in place) + exit codes
+
+SIGINT/SIGTERM already flip the abort controller. Formalize the timeout-and-force-exit path and the three exit codes: `0` (success), `1` (errors), `2` (interrupted). A forced exit (second signal or cleanup timeout) uses `130` instead.
+
+**Files:**
+- Modify: `tests/soak/runner.ts`
+
+- [ ] **Step 1: Add a graceful shutdown timeout**
+
+In `runner.ts`, replace the existing signal handlers with:
+
+```typescript
+  let forceExitTimer: NodeJS.Timeout | null = null;
+  const onSignal = (sig: string) => {
+    if (abortController.signal.aborted) {
+      // Second signal: force exit
+      logger.warn('force_exit', { signal: sig });
+      process.exit(130);
+    }
+    logger.warn('signal_received', { signal: sig });
+    abortController.abort();
+    // Hard-kill after 10s if cleanup hangs
+    forceExitTimer = setTimeout(() => {
+      logger.error('graceful_shutdown_timeout');
+      process.exit(130);
+    }, 10_000);
+  };
+  process.on('SIGINT', () => onSignal('SIGINT'));
+  process.on('SIGTERM', () => onSignal('SIGTERM'));
+```
+
+In the `finally` block, clear the force-exit timer:
+
+```typescript
+    if (forceExitTimer) clearTimeout(forceExitTimer);
+```
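+
+The snippet above only covers the forced-exit path. The mapping to the three regular exit codes is not shown in this task; a minimal sketch of one way to centralize it, where `resolveExitCode` and `RunOutcome` are hypothetical names and letting an interrupt take precedence over scenario errors is an assumption:
+
+```typescript
+interface RunOutcome {
+  interrupted: boolean; // SIGINT/SIGTERM was received
+  threw: boolean;       // the runner's catch block ran
+  errorCount: number;   // result.errors.length, or 0 if there is no result
+}
+
+// 0 = success, 1 = errors, 2 = interrupted.
+function resolveExitCode(outcome: RunOutcome): 0 | 1 | 2 {
+  if (outcome.interrupted) return 2;
+  if (outcome.threw || outcome.errorCount > 0) return 1;
+  return 0;
+}
+```
+
+Tracking `interrupted` in `onSignal` rather than reading `abortController.signal.aborted` keeps other abort sources distinguishable from a user interrupt.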
+
+- [ ] **Step 2: Manual test — Ctrl-C a long run**
+
+```bash
+TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run soak -- \
+  --scenario=populate --accounts=2 --rooms=1 --cpus-per-room=0 \
+  --games-per-room=10 --holes=3 --watch=none
+
+# After ~5 seconds: Ctrl-C
+```
+
+Expected: runner logs `signal_received`, finishes current turn, prints summary, exits with code 2 (check `echo $?`).
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add tests/soak/runner.ts
+git commit -m "$(cat <<'EOF'
+feat(soak): graceful shutdown with 10s hard-kill fallback
+
+SIGINT/SIGTERM flips the abort signal; scenarios finish the current
+turn then exit. If cleanup hangs >10s the runner force-exits. Second
+Ctrl-C is an immediate hard kill. Exit codes: 0 success, 1 errors,
+2 interrupted.
+
+Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+EOF
+)"
+```
+
+---
+
+### Task 29: Periodic health probes
+
+Every 30s, fetch `/api/health` on the target server. Three consecutive failures declare a fatal error and abort.
+
+**Files:**
+- Modify: `tests/soak/runner.ts`
+
+- [ ] **Step 1: Add a health probe interval**
+
+In `runner.ts`, after building the abort controller and before running the scenario:
+
+```typescript
+  let healthFailures = 0;
+  const healthTimer = setInterval(async () => {
+    try {
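+      // Bound each probe so a hung server counts as a failure instead of stalling.
+      const res = await fetch(`${targetUrl}/api/health`, { signal: AbortSignal.timeout(10_000) });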
+      if (!res.ok) throw new Error(`status ${res.status}`);
+      healthFailures = 0;
+    } catch (err) {
+      healthFailures++;
+      logger.warn('health_probe_failed', {
+        consecutive: healthFailures,
+        error: err instanceof Error ? err.message : String(err),
+      });
+      if (healthFailures >= 3) {
+        logger.error('health_fatal', { consecutive: healthFailures });
+        abortController.abort();
+      }
+    }
+  }, 30_000);
+```
+
+In the `finally` block:
+
+```typescript
+    clearInterval(healthTimer);
+```
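+
+The consecutive-failure rule can also be isolated from the timer so it is unit-testable in the same way as the watchdog. A sketch, with `HealthTracker` as a hypothetical class name:
+
+```typescript
+// Counts consecutive probe failures; any success resets the streak.
+class HealthTracker {
+  private consecutive = 0;
+  private readonly threshold: number;
+
+  constructor(threshold = 3) {
+    this.threshold = threshold;
+  }
+
+  /** Record one probe result; returns true when the run should abort. */
+  record(ok: boolean): boolean {
+    this.consecutive = ok ? 0 : this.consecutive + 1;
+    return this.consecutive >= this.threshold;
+  }
+
+  get failures(): number {
+    return this.consecutive;
+  }
+}
+```
+
+The interval callback would then reduce to one `fetch`, one `tracker.record(...)` call, and an `abortController.abort()` when it returns `true`.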
+
+- [ ] **Step 2: Commit**
+
+```bash
+git add tests/soak/runner.ts
+git commit -m "$(cat <<'EOF'
+feat(soak): periodic health probes against target server
+
+Every 30s GET /api/health. Three consecutive failures abort the
+run with a fatal error, so staging outages don't get misattributed
+to harness bugs.
+
+Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+EOF
+)"
+```
+
+---
+
+## Phase 10 — Polish and bring-up
+
+### Task 30: Smoke test script
+
+`tests/soak/scripts/smoke.sh` — the canary run that takes ~30s against local dev.
+
+**Files:**
+- Create: `tests/soak/scripts/smoke.sh`
+
+- [ ] **Step 1: Create the script**
+
+```bash
+#!/usr/bin/env bash
+# Soak harness smoke test — end-to-end canary against local dev.
+# Expected runtime: ~30 seconds.
+set -euo pipefail
+
+cd "$(dirname "$0")/.."
+
+: "${TEST_URL:=http://localhost:8000}"
+: "${SOAK_INVITE_CODE:=SOAKTEST}"
+
+echo "Smoke target: $TEST_URL"
+echo "Invite code:  $SOAK_INVITE_CODE"
+
+# 1. Health probe
+curl -fsS "$TEST_URL/api/health" > /dev/null || {
+  echo "FAIL: target server unreachable at $TEST_URL"
+  exit 1
+}
+
+# 2. Ensure minimum accounts
+if [ ! -f .env.stresstest ]; then
+  echo "Seeding accounts..."
+  npm run seed -- --count=4
+fi
+
+# 3. Run minimum viable scenario
+TEST_URL="$TEST_URL" SOAK_INVITE_CODE="$SOAK_INVITE_CODE" \
+  npm run soak -- \
+    --scenario=populate \
+    --accounts=2 \
+    --rooms=1 \
+    --cpus-per-room=0 \
+    --games-per-room=1 \
+    --holes=1 \
+    --watch=none
+
+echo "Smoke PASSED"
+```
+
+- [ ] **Step 2: Make it executable and run it**
+
+```bash
+chmod +x tests/soak/scripts/smoke.sh
+cd tests/soak && bash scripts/smoke.sh
+```
+
+Expected: `Smoke PASSED` within ~30s.
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add tests/soak/scripts/smoke.sh
+git commit -m "$(cat <<'EOF'
+feat(soak): smoke test script — 30s end-to-end canary
+
+Confirms the harness works against local dev with the absolute
+minimum config. Run after any change.
+
+Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+EOF
+)"
+```
+
+---
+
+### Task 31: README + CHECKLIST
+
+Replace the README stub with a full quickstart and flag reference. Add the manual validation checklist.
+
+**Files:**
+- Modify: `tests/soak/README.md`
+- Create: `tests/soak/CHECKLIST.md`
+
+- [ ] **Step 1: Rewrite `tests/soak/README.md`**
+
+````markdown
+# Golf Soak & UX Test Harness
+
+Standalone Playwright-based runner that drives multi-user authenticated
+game sessions for scoreboard population and stability testing.
+
+**Spec:** `../../docs/superpowers/specs/2026-04-10-multiplayer-soak-test-design.md`
+**Bring-up:** `../../docs/soak-harness-bringup.md`
+
+## Quick start
+
+```bash
+cd tests/soak
+npm install
+
+# First run only: seed 16 accounts
+TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run seed
+
+# 30-second end-to-end smoke test
+bash scripts/smoke.sh
+
+# Populate scoreboard (4 rooms × 4 accounts × 10 long games)
+TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST \
+  npm run soak:populate
+
+# Stress test (4 rooms × 50 rapid games with chaos)
+TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST \
+  npm run soak:stress
+```
+
+## CLI flags
+
+```
+--scenario=populate|stress    required
+--accounts=<n>                total sessions (default: scenario.needs.accounts)
+--rooms=<n>                   default from scenario.needs
+--cpus-per-room=<n>           default from scenario.needs
+--games-per-room=<n>          default from scenario.defaultConfig
+--holes=<n>                   default from scenario.defaultConfig
+--watch=none|dashboard|tiled  default: dashboard
+--dashboard-port=<n>          default: 7777
+--target=<url>                default: TEST_URL env
+--run-id=<string>             default: ISO timestamp
+--list                        print scenarios and exit
+--dry-run                     validate config, don't run
+```
+
+Constraint: `accounts` must be evenly divisible by `rooms`.
+
+## Environment variables
+
+```
+TEST_URL             target base URL (e.g. https://staging.adlee.work)
+SOAK_INVITE_CODE     invite code flagged marks_as_test (staging: 5VC2MCCN)
+SOAK_HOLES           override --holes
+SOAK_ROOMS           override --rooms
+SOAK_ACCOUNTS        override --accounts
+SOAK_CPUS_PER_ROOM   override --cpus-per-room
+SOAK_GAMES_PER_ROOM  override --games-per-room
+SOAK_WATCH           override --watch
+SOAK_DASHBOARD_PORT  override --dashboard-port
+```
+
+## Watch modes
+
+- **`none`** — pure headless, JSON logs to stdout. Use for CI and overnight runs.
+- **`dashboard`** (default) — HTTP+WS server on localhost:7777 serving a live status grid. Click any player tile to watch their live session via CDP screencast.
+- **`tiled`** — 4 native Chromium windows for the host of each room, positioned in a 2×2 grid. Joiners stay headless.
+
+## Scenarios
+
+| Name | Description |
+|---|---|
+| `populate` | Long 9-hole games with varied CPU personalities, realistic pacing, for populating scoreboards |
+| `stress` | Rapid 1-hole games with chaos injection (rapid clicks, offline toggles, tab blur) for hunting race conditions |
+
+Add new scenarios by creating `scenarios/<name>.ts` and registering in `scenarios/index.ts`.
+
+## Architecture
+
+See the design spec for full module breakdown. Key modules:
+
+- `runner.ts` — CLI entry, wires everything together
+- `core/session-pool.ts` — owns browser contexts, seeds/logs in 16 accounts
+- `core/room-coordinator.ts` — host→joiners room-code handoff
+- `core/watchdog.ts` — per-room timeout detection
+- `core/screencaster.ts` — CDP Page.startScreencast for live video
+- `dashboard/server.ts` — HTTP + WS server
+- `scenarios/` — pluggable scenarios
+
+Reuses `../../tests/e2e/bot/golf-bot.ts` unchanged.
+
+## Running tests (unit)
+
+```bash
+npm test
+```
+
+Tests cover `Deferred`, `RoomCoordinator`, `Watchdog`, and `config`.
+Integration-level modules are verified by the smoke test.
+````
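+
+The README's rule that `accounts` must divide evenly across `rooms` can be expressed as a small validation function. The sketch below is illustrative only: `playersPerRoom` and its error messages are hypothetical names, and the harness's own `config` module may already perform an equivalent check.
+
+```typescript
+// Hypothetical validator for the "accounts must divide evenly across rooms" rule.
+// Function name and error messages are illustrative, not taken from the harness.
+function playersPerRoom(accounts: number, rooms: number): number {
+  const valid =
+    Number.isInteger(accounts) && Number.isInteger(rooms) && accounts >= 1 && rooms >= 1;
+  if (!valid) {
+    throw new Error(`accounts and rooms must be positive integers (got ${accounts}, ${rooms})`);
+  }
+  if (accounts % rooms !== 0) {
+    throw new Error(`accounts (${accounts}) must divide evenly across rooms (${rooms})`);
+  }
+  return accounts / rooms;
+}
+```
+
+For the default layout of 16 accounts in 4 rooms this yields 4 sessions per room; the smoke configuration (`--accounts=2 --rooms=1`) yields 2.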
+
+- [ ] **Step 2: Create `tests/soak/CHECKLIST.md`**
+
+```markdown
+# Soak Harness Manual Validation Checklist
+
+Run after any significant change or before calling the implementation complete.
+
+## Bring-up
+
+- [ ] Local dev server is running (`python server/main.py`)
+- [ ] `SOAKTEST` invite code exists locally with `marks_as_test=TRUE`
+- [ ] `npm install` in `tests/soak/` succeeded
+- [ ] `npm run seed -- --count=16` creates/updates 16 accounts
+- [ ] `.env.stresstest` has 16 `SOAK_ACCOUNT_NN=...` lines
+- [ ] All seeded users show `is_test_account=TRUE` in the DB
+
+## Smoke
+
+- [ ] `bash scripts/smoke.sh` exits 0 within 60s
+
+## Scenarios
+
+- [ ] `--scenario=populate --rooms=1 --games-per-room=1` completes cleanly
+- [ ] `--scenario=populate --rooms=4 --games-per-room=1` runs 4 rooms in parallel with no cross-contamination
+- [ ] `--scenario=stress --games-per-room=3` logs `chaos_injected` events
+
+## Watch modes
+
+- [ ] `--watch=none` produces JSONL on stdout, nothing else
+- [ ] `--watch=dashboard` opens http://localhost:7777, grid renders, tiles update live, WS status shows `healthy`
+- [ ] Clicking any player tile opens the video modal and streams live JPEG frames (~10 fps)
+- [ ] Closing the modal stops the screencast (check logs for `screencast_stopped`)
+- [ ] `--watch=tiled` opens 4 native Chromium windows for the 4 hosts
+
+## Failure modes
+
+- [ ] Ctrl-C during a run → graceful shutdown, summary printed, exit code 2
+- [ ] Double Ctrl-C → hard exit (130)
+- [ ] Killing the dev server mid-run → health probes fail 3× → fatal abort, artifacts captured, exit 1
+- [ ] Artifacts directory contains a subdirectory per failed run with screenshots and state.json
+- [ ] Artifacts older than 7 days are pruned on next startup
+
+## Server-side filtering
+
+- [ ] `GET /api/stats/leaderboard` (default) hides soak_* accounts
+- [ ] `GET /api/stats/leaderboard?include_test=true` shows soak_* accounts
+- [ ] Admin panel user list shows `[Test]` badge on soak_* accounts
+- [ ] Admin panel "Include test accounts" checkbox toggles whether soak_* accounts are listed
+- [ ] Admin panel invite codes tab shows `[Test-seed]` next to SOAKTEST
+
+## Staging bring-up (final step)
+
+- [ ] `UPDATE invite_codes SET marks_as_test = TRUE WHERE code = '5VC2MCCN';` run on staging
+- [ ] `SOAK_INVITE_CODE=5VC2MCCN TEST_URL=https://staging.adlee.work npm run seed -- --count=16` seeds staging accounts
+- [ ] Staging run with `--scenario=populate --watch=none` completes
+- [ ] Staging leaderboard with `include_test=true` shows the soak accounts
+- [ ] Staging leaderboard default (no param) does NOT show the soak accounts
+```
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add tests/soak/README.md tests/soak/CHECKLIST.md
+git commit -m "$(cat <<'EOF'
+docs(soak): full README + manual validation checklist
+
+Quickstart, flag reference, env var reference, scenario table, and
+the bring-up/validation checklist that gates calling the harness
+implementation complete.
+
+Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+EOF
+)"
+```
+
+---
+
+### Task 32: Staging bring-up (manual, no code)
+
+This task involves no code changes: the run itself happens on your workstation. It is listed here so the implementation plan is complete end to end.
+
+- [ ] **Step 1: Flag `5VC2MCCN` as test-seed on staging**
+
+From your workstation (requires DB access to staging):
+
+```bash
+ssh root@129.212.150.189 \
+  'docker exec -i golfgame-postgres psql -U postgres -d golfgame' <<'EOF'
+UPDATE invite_codes SET marks_as_test = TRUE WHERE code = '5VC2MCCN';
+SELECT code, max_uses, use_count, marks_as_test FROM invite_codes WHERE code = '5VC2MCCN';
+EOF
+```
+
+Expected: `marks_as_test | t`.
+
+(The exact docker container name may differ — adjust based on `docker ps` on the staging host.)
+
+- [ ] **Step 2: Seed the 16 staging accounts**
+
+```bash
+cd tests/soak
+rm -f .env.stresstest
+TEST_URL=https://staging.adlee.work \
+  SOAK_INVITE_CODE=5VC2MCCN \
+  npm run seed -- --count=16
+```
+
+Expected: `.env.stresstest` populated with 16 entries.
+
+- [ ] **Step 3: Run populate against staging**
+
+```bash
+TEST_URL=https://staging.adlee.work \
+  SOAK_INVITE_CODE=5VC2MCCN \
+  npm run soak -- \
+    --scenario=populate \
+    --rooms=4 \
+    --games-per-room=3 \
+    --holes=3 \
+    --watch=dashboard
+```
+
+Expected: dashboard opens, 4 rooms play 3 games each, staging scoreboard accumulates data. Exit 0 at the end.
+
+- [ ] **Step 4: Verify scoreboard filtering on staging**
+
+```bash
+# Should NOT contain soak_* usernames
+curl -s "https://staging.adlee.work/api/stats/leaderboard?metric=wins" | jq '.entries[] | select(.username | startswith("soak_"))'
+
+# Should contain soak_* usernames
+curl -s "https://staging.adlee.work/api/stats/leaderboard?metric=wins&include_test=true" | jq '.entries[] | select(.username | startswith("soak_"))'
+```
+
+Expected: first returns nothing, second returns entries.
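+
+The same check can be scripted for reuse in the harness. The sketch below is a minimal illustration: the response shape (`entries[].username`) is inferred from the `jq` filters above rather than from the API definition, and `soakUsernames` is a hypothetical helper name.
+
+```typescript
+// Assumed response shape, inferred from the jq filters above.
+interface LeaderboardResponse {
+  entries: { username: string }[];
+}
+
+// Returns the usernames in a leaderboard response that carry the soak_ prefix.
+function soakUsernames(body: LeaderboardResponse): string[] {
+  return body.entries
+    .map((entry) => entry.username)
+    .filter((name) => name.startsWith("soak_"));
+}
+```
+
+Applied to the parsed JSON of the two requests, the default response should produce an empty array and the `include_test=true` response a non-empty one.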
+
+- [ ] **Step 5: Mark implementation complete**
+
+Check off all items in `tests/soak/CHECKLIST.md` that correspond to this plan. Commit the filled-in checklist if you want a record:
+
+```bash
+git add tests/soak/CHECKLIST.md
+git commit -m "docs(soak): checklist passed on initial staging run"
+```
+
+---
+
+## Phase 11 — Version bump
+
+### Task 33: Bump to v3.3.4 and add footer to admin.html
+
+Updates both footers in `client/index.html` from `v3.1.6` to `v3.3.4`, adds a footer to `client/admin.html` (which currently has none), and bumps the version in `pyproject.toml`.
+
+**Files:**
+- Modify: `client/index.html` — both footer occurrences (L58, L291)
+- Modify: `client/admin.html` — add footer
+- Modify: `pyproject.toml` — version field
+
+- [ ] **Step 1: Update `client/index.html` footers**
+
+```bash
+grep -n "v3\.1\.6" client/index.html
+```
+
+For each match, replace `v3.1.6` with `v3.3.4`. There should be exactly two matches.
+
+- [ ] **Step 2: Add footer to `client/admin.html`**
+
+Find the closing `</body>` in `client/admin.html` and add a footer just before it:
+
+```html
+<footer class="app-footer" style="text-align: center; padding: 16px; color: var(--muted, #666); font-size: 12px;">v3.3.4 &copy; Aaron D. Lee</footer>
+</body>
+```
+
+(The inline style is a fallback — admin.css may already have an `.app-footer` class; if so, drop the inline styles.)
+
+```bash
+grep -n "app-footer" client/admin.css 2>/dev/null
+```
+
+If the class exists, use just `<footer class="app-footer">v3.3.4 &copy; Aaron D. Lee</footer>`.
+
+- [ ] **Step 3: Bump `pyproject.toml`**
+
+```bash
+# GNU sed syntax; BSD/macOS sed needs an explicit suffix argument: sed -i '' ...
+sed -i 's/^version = "3\.1\.6"$/version = "3.3.4"/' pyproject.toml
+grep '^version' pyproject.toml
+```
+
+Expected: `version = "3.3.4"`.
+
+- [ ] **Step 4: Verify in the browser**
+
+Restart the dev server, open http://localhost:8000 and http://localhost:8000/admin.html. Confirm both show `v3.3.4` in the footer.
+
+- [ ] **Step 5: Commit**
+
+```bash
+git add client/index.html client/admin.html pyproject.toml
+git commit -m "$(cat <<'EOF'
+chore: bump version to v3.3.4
+
+Updates client/index.html footer (×2) and pyproject.toml from
+v3.1.6 → v3.3.4, and adds a matching footer to client/admin.html
+which previously had none.
+
+Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
+EOF
+)"
+```
+
+---
+
+## Summary
+
+33 tasks across 11 phases:
+
+| Phase | Tasks | Milestone |
+|---|---|---|
+| 1 — Server changes | 1–8 | Stats filter works, test accounts are separable |
+| 2 — Harness scaffolding | 9–12 | Core pure-logic modules with Vitest tests pass |
+| 3 — SessionPool + seeding | 13–14 | `.env.stresstest` seeded via real HTTP |
+| 4 — First run | 15–18 | **`--watch=none` smoke test passes end-to-end** |
+| 5 — Dashboard | 19–21 | Live status grid in browser |
+| 6 — Live video | 22–23 | Click-to-watch CDP screencast |
+| 7 — Tiled mode | 24 | Native host windows |
+| 8 — Stress scenario | 25 | Chaos injection runs clean |
+| 9 — Failure handling | 26–29 | Watchdog + artifacts + graceful shutdown + health probes |
+| 10 — Polish | 30–31 | Smoke script + README + CHECKLIST |
+| 11 — Version bump | 33 | v3.3.4 everywhere |
+
+(Task 32 is the manual staging bring-up — no code.)
+
+Dependencies between tasks:
+
+- Tasks 1–8 are independent of the harness (ship them first if you want immediate value for admins)
+- Tasks 9–18 are strictly sequential (each builds on the previous)
+- Tasks 19–21, 22–23, 24, 25 are independent of each other — can be done in any order after Task 18
+- Tasks 26–29 can be done after Task 18 but are most valuable after Task 25
+- Tasks 30–31 come last, immediately before the staging bring-up (Task 32)
+- Task 33 is independent and can be done any time after Task 8