# Multiplayer Soak & UX Test Harness: Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Build a standalone Playwright-based soak runner in `tests/soak/` that drives 16 authenticated browser sessions across 4 concurrent rooms playing many multiplayer games, with pluggable scenarios, a click-to-watch dashboard via CDP screencast, and strict per-room failure isolation.

**Architecture:** Single-process Node runner reusing the existing `GolfBot` class from `tests/e2e/bot/`. One shared browser (16 contexts) by default; `WATCH=tiled` uses a second headed browser for the 4 host contexts. Scenarios are plain TS modules exported from `tests/soak/scenarios/`. Dashboard is a tiny HTTP+WS server serving one static page that pushes live status and on-demand CDP screencast frames.

**Tech Stack:** TypeScript + tsx (no build step), Playwright Core, ws (WebSocket server), Vitest for unit tests, FastAPI + asyncpg (existing server), PostgreSQL (existing).

**Spec:** `docs/superpowers/specs/2026-04-10-multiplayer-soak-test-design.md`

---

## Testing Strategy Notes

- **Server-side Python changes:** The existing test suite mocks stores with `AsyncMock` and has no real-Postgres fixtures. Rather than inventing a new fixture pattern for this plan, server tasks use **curl-based verification against a running local dev server** as the explicit verification step after each commit. Run `python server/main.py` in another terminal (requires Postgres + Redis running; see `docs/INSTALL.md`).
- **TypeScript harness logic:** Unit-tested with Vitest for pure modules (Deferred, RoomCoordinator, Watchdog, Config). Integration-level modules (SessionPool, Dashboard, Screencaster, Scenarios) are verified by running the harness itself via the smoke test.
- **End-to-end validation:** `tests/soak/scripts/smoke.sh` is the canary: after every non-trivial change, run it against local dev and expect exit 0 within ~30s.

---

## Phase 1: Server-side changes (independent, ships first)

### Task 1: Schema migration for `is_test_account` and `marks_as_test`

Add two columns, one partial index, and rebuild the `leaderboard_overall` materialized view to include `is_test_account` (so the filter works through the view fast path). Fits the existing inline-migration pattern in `user_store.py`.

**Files:**
- Modify: `server/stores/user_store.py`: append to `SCHEMA_SQL` (ALTER blocks near L79–L98 and the matview block near L298–L335)

- [ ] **Step 1: Add column migration to `SCHEMA_SQL`**

Open `server/stores/user_store.py`. Inside the first `DO $$ BEGIN ... END $$;` block (around lines 80–98, the block that handles admin columns), append the `is_test_account` column check. Then add a second ALTER for `invite_codes.marks_as_test` in a new `DO $$` block right after.

Add after the existing `last_seen_at` check (before `END $$;` on line ~98):

```sql
    IF NOT EXISTS (SELECT 1 FROM information_schema.columns
                   WHERE table_name = 'users_v2' AND column_name = 'is_test_account') THEN
        ALTER TABLE users_v2 ADD COLUMN is_test_account BOOLEAN DEFAULT FALSE;
    END IF;
```

Then, immediately after the `END $$;` that closes the users_v2 admin block, add a new block for invite_codes:

```sql
-- Add marks_as_test to invite_codes if not exists
DO $$
BEGIN
    IF NOT EXISTS (SELECT 1 FROM information_schema.columns
                   WHERE table_name = 'invite_codes' AND column_name = 'marks_as_test') THEN
        ALTER TABLE invite_codes ADD COLUMN marks_as_test BOOLEAN DEFAULT FALSE;
    END IF;
END $$;
```

- [ ] **Step 2: Add partial index on `is_test_account`**

Find the indexes block near line 338.
After the existing `idx_users_banned` index (line ~344), add:

```sql
CREATE INDEX IF NOT EXISTS idx_users_v2_is_test_account ON users_v2(is_test_account)
    WHERE is_test_account = TRUE;
```

- [ ] **Step 3: Rebuild `leaderboard_overall` materialized view to include `is_test_account`**

Find the existing matview block at line ~298. Modify the version-check DO block so the view is dropped and recreated if it lacks the `is_test_account` column. Replace the existing block:

```sql
-- Leaderboard materialized view (refreshed periodically)
-- Drop and recreate if missing is_test_account column (soak harness migration)
DO $$
BEGIN
    IF EXISTS (SELECT 1 FROM pg_matviews WHERE matviewname = 'leaderboard_overall') THEN
        -- Check if is_test_account column exists in the view
        IF NOT EXISTS (
            SELECT 1 FROM information_schema.columns
            WHERE table_name = 'leaderboard_overall' AND column_name = 'is_test_account'
        ) THEN
            DROP MATERIALIZED VIEW leaderboard_overall;
        END IF;
    END IF;
    IF NOT EXISTS (SELECT 1 FROM pg_matviews WHERE matviewname = 'leaderboard_overall') THEN
        EXECUTE '
            CREATE MATERIALIZED VIEW leaderboard_overall AS
            SELECT
                u.id as user_id,
                u.username,
                COALESCE(u.is_test_account, FALSE) as is_test_account,
                s.games_played,
                s.games_won,
                ROUND(s.games_won::numeric / NULLIF(s.games_played, 0) * 100, 1) as win_rate,
                s.rounds_won,
                ROUND(s.total_points::numeric / NULLIF(s.total_rounds, 0), 1) as avg_score,
                s.best_score as best_round_score,
                s.knockouts,
                s.best_win_streak,
                COALESCE(s.rating, 1500) as rating,
                s.last_game_at
            FROM player_stats s
            JOIN users_v2 u ON s.user_id = u.id
            WHERE s.games_played >= 5
            AND u.deleted_at IS NULL
            AND (u.is_banned = false OR u.is_banned IS NULL)
        ';
    END IF;
END $$;
```

Note: the only differences from the existing block are the changed comment, the changed column-existence check (`is_test_account` instead of `rating`), and the new `COALESCE(u.is_test_account, FALSE) as is_test_account` column in the SELECT. Everything else stays identical.
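The check-then-alter pattern behind Steps 1 and 2 can be tried out without a running Postgres. The following is a minimal sketch only: it swaps in Python's built-in `sqlite3` for PostgreSQL, so the existence check uses `PRAGMA table_info` instead of `information_schema.columns`, and booleans are written as `0`/`1`. The helper names `column_exists` and `migrate` are hypothetical and do not exist in `user_store.py`.

```python
import sqlite3


def column_exists(conn: sqlite3.Connection, table: str, column: str) -> bool:
    """Return True if `table` already has a column named `column`."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)


def migrate(conn: sqlite3.Connection) -> None:
    """Add is_test_account plus its partial index; safe to run repeatedly."""
    if not column_exists(conn, "users_v2", "is_test_account"):
        # SQLite stores booleans as integers, hence DEFAULT 0 rather than FALSE.
        conn.execute(
            "ALTER TABLE users_v2 ADD COLUMN is_test_account BOOLEAN DEFAULT 0"
        )
    # Partial index: only rows flagged as test accounts are indexed.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_users_v2_is_test_account "
        "ON users_v2(is_test_account) WHERE is_test_account = 1"
    )
    conn.commit()


# Stand-in table for the sketch; the real users_v2 has many more columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users_v2 (id INTEGER PRIMARY KEY, username TEXT)")

migrate(conn)
migrate(conn)  # second run must not raise or duplicate anything

index_count = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master "
    "WHERE type = 'index' AND name = 'idx_users_v2_is_test_account'"
).fetchone()[0]
```

Running the migration twice mirrors what happens on every server restart: the guard makes the second pass a no-op, which is the property the inline-migration pattern depends on.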
- [ ] **Step 4: Start the server to run migrations**

Run (in another terminal, with Postgres + Redis up):

```bash
cd /home/alee/Sources/golfgame
python server/main.py
```

Expected: server starts cleanly, no errors about `is_test_account` or `marks_as_test` or `leaderboard_overall`.

- [ ] **Step 5: Verify schema via psql**

Connect to the dev database and confirm:

```bash
psql -d golfgame -c "\d users_v2" | grep is_test_account
psql -d golfgame -c "\d invite_codes" | grep marks_as_test
psql -d golfgame -c "\d leaderboard_overall" | grep is_test_account
psql -d golfgame -c "\di idx_users_v2_is_test_account"
```

Expected: all four commands return matching rows.

- [ ] **Step 6: Commit**

```bash
git add server/stores/user_store.py
git commit -m "$(cat <<'EOF'
feat(server): add is_test_account + marks_as_test schema

New columns support separating soak-harness test traffic from real
user traffic in stats queries. Rebuilds leaderboard_overall matview
to include is_test_account so the fast path stays filterable.

Co-Authored-By: Claude Opus 4.6 (1M context)
EOF
)"
```

---

### Task 2: Propagate `is_test_account` through `User` model and `user_store`

Wire the new column into the `User` dataclass, `create_user` signature, `_row_to_user` mapping, and every SELECT list that already pulls user columns.

**Files:**
- Modify: `server/models/user.py`: `User` dataclass (L22–L68) + `to_dict` (L82–L116) + `from_dict` (L118+)
- Modify: `server/stores/user_store.py`: `create_user` (L454–L501), `_row_to_user` (L997–L1020), `get_user_by_id`/`get_user_by_username`/`get_user_by_email` SELECT lists (L503–L570)

- [ ] **Step 1: Add `is_test_account` to the `User` dataclass**

In `server/models/user.py`, add a new field to the `User` dataclass (after `force_password_reset` on L68):

```python
    is_test_account: bool = False
```

Update the docstring `Attributes:` block around L45 to include:

```
        is_test_account: True for accounts created by the soak test harness.
```

- [ ] **Step 2: Include `is_test_account` in `to_dict` and `from_dict`**

In `User.to_dict` at L82, add to the `d` dict (after `force_password_reset`):

```python
            "is_test_account": self.is_test_account,
```

In `User.from_dict`, add the corresponding parse: find where `force_password_reset` is parsed and add the same pattern:

```python
            is_test_account=d.get("is_test_account", False),
```

- [ ] **Step 3: Add `is_test_account` parameter to `create_user`**

In `server/stores/user_store.py` at L454, add a new parameter:

```python
    async def create_user(
        self,
        username: str,
        password_hash: str,
        email: Optional[str] = None,
        role: UserRole = UserRole.USER,
        guest_id: Optional[str] = None,
        verification_token: Optional[str] = None,
        verification_expires: Optional[datetime] = None,
        is_test_account: bool = False,
    ) -> Optional[User]:
```

Update the docstring to add a line in `Args:` describing `is_test_account`.

Change the INSERT SQL block to include the new column:

```python
                row = await conn.fetchrow(
                    """
                    INSERT INTO users_v2 (username, password_hash, email, role, guest_id,
                                          verification_token, verification_expires,
                                          is_test_account)
                    VALUES ($1, $2, $3, $4, $5, $6, $7, $8)
                    RETURNING id, username, email, password_hash, role, email_verified,
                              verification_token, verification_expires, reset_token, reset_expires,
                              guest_id, deleted_at, preferences, created_at, last_login, last_seen_at,
                              is_active, is_banned, ban_reason, force_password_reset, is_test_account
                    """,
                    username,
                    password_hash,
                    email,
                    role.value,
                    guest_id,
                    verification_token,
                    verification_expires,
                    is_test_account,
                )
```

- [ ] **Step 4: Update `_row_to_user` mapping**

In `server/stores/user_store.py` at L997, add to the `User(...)` call (after `force_password_reset`):

```python
            is_test_account=row.get("is_test_account", False) or False,
```

- [ ] **Step 5: Update all other SELECT lists in user_store**

Find every query in `server/stores/user_store.py` that returns a full user row and passes it to `_row_to_user`.
Add `is_test_account` to the SELECT column list for each. Grep to find them:

```bash
grep -n "is_active, is_banned, ban_reason, force_password_reset" server/stores/user_store.py
```

For each match, append `, is_test_account` to the SELECT list. Expected locations:
- `create_user` INSERT ... RETURNING (already updated in Step 3)
- `get_user_by_id` at L503
- `get_user_by_username` at L519
- `get_user_by_email` (find it)
- Any other `SELECT` ... FROM users_v2 that calls `_row_to_user`

- [ ] **Step 6: Restart server, verify no errors**

```bash
# Kill and restart the dev server
python server/main.py
```

Expected: server starts cleanly. Any query that touches users now returns `is_test_account` correctly.

- [ ] **Step 7: Smoke test via curl**

```bash
# Register a throwaway test user (no invite code needed if DAILY_OPEN_SIGNUPS > 0 locally,
# or use the 5VC2MCCN invite code if INVITE_ONLY=true)
# Set PW to any password of your choice (>= 8 chars).
PW='SomeTestPw_1!'
curl -sX POST http://localhost:8000/api/auth/register \
  -H 'Content-Type: application/json' \
  -d "{\"username\":\"soaktest_smoke1\",\"password\":\"$PW\",\"email\":\"soaktest_smoke1@example.com\",\"invite_code\":\"5VC2MCCN\"}"
```

Expected: HTTP 200 with `{"user":{...},"token":"..."}`. The registration path now runs through the new column without errors even though the value is still always FALSE at this stage.

- [ ] **Step 8: Commit**

```bash
git add server/models/user.py server/stores/user_store.py
git commit -m "$(cat <<'EOF'
feat(server): propagate is_test_account through User model & store

User dataclass, create_user, and all SELECT lists now round-trip the
new column. Value is always FALSE until Task 4 wires the register
flow to the invite code's marks_as_test flag.

Co-Authored-By: Claude Opus 4.6 (1M context)
EOF
)"
```

---

### Task 3: Expose `marks_as_test` on `InviteCode` and add lookup helper

`validate_invite_code` currently returns a bare bool.
We need a new helper that returns the full row so the register flow can check `marks_as_test` without a second query.

**Files:**
- Modify: `server/services/admin_service.py`: `InviteCode` dataclass (L115–L138), `get_invite_codes` SELECT (L1106–L1141), add new `get_invite_code_details` method

- [ ] **Step 1: Add `marks_as_test` field to `InviteCode` dataclass**

In `server/services/admin_service.py` at L115:

```python
@dataclass
class InviteCode:
    """Invite code details."""
    code: str
    created_by: str
    created_by_username: str
    created_at: datetime
    expires_at: datetime
    max_uses: int
    use_count: int
    is_active: bool
    marks_as_test: bool = False
```

Update `to_dict` at L127 to include the field:

```python
    def to_dict(self) -> dict:
        return {
            "code": self.code,
            "created_by": self.created_by,
            "created_by_username": self.created_by_username,
            "created_at": self.created_at.isoformat() if self.created_at else None,
            "expires_at": self.expires_at.isoformat() if self.expires_at else None,
            "max_uses": self.max_uses,
            "use_count": self.use_count,
            "is_active": self.is_active,
            "remaining_uses": max(0, self.max_uses - self.use_count),
            "marks_as_test": self.marks_as_test,
        }
```

- [ ] **Step 2: Update `get_invite_codes` SELECT to include `marks_as_test`**

Find `get_invite_codes` at L1106.
Modify the SQL to pull the column and pass it through:

```python
    async def get_invite_codes(self, include_expired: bool = False) -> List[InviteCode]:
        """List all invite codes."""
        async with self.pool.acquire() as conn:
            sql = """
                SELECT c.code, c.created_by, u.username as created_by_username,
                       c.created_at, c.expires_at, c.max_uses, c.use_count, c.is_active,
                       COALESCE(c.marks_as_test, FALSE) as marks_as_test
                FROM invite_codes c
                LEFT JOIN users_v2 u ON c.created_by = u.id
            """
```

Find the list comprehension that constructs `InviteCode(...)` objects and add the new kwarg:

```python
                InviteCode(
                    code=row["code"],
                    created_by=str(row["created_by"]),
                    created_by_username=row["created_by_username"] or "unknown",
                    created_at=row["created_at"].replace(tzinfo=timezone.utc) if row["created_at"] else None,
                    expires_at=row["expires_at"].replace(tzinfo=timezone.utc) if row["expires_at"] else None,
                    max_uses=row["max_uses"],
                    use_count=row["use_count"],
                    is_active=row["is_active"],
                    marks_as_test=row["marks_as_test"],
                )
```

- [ ] **Step 3: Add new `get_invite_code_details` method**

Add a new method right after `validate_invite_code` (around L1214) that returns the row with `marks_as_test`. The register flow will call this to resolve the flag. Place it between `validate_invite_code` and `use_invite_code`:

```python
    async def get_invite_code_details(self, code: str) -> Optional[dict]:
        """
        Look up an invite code's row including marks_as_test.

        Returns None if the code does not exist. Does NOT validate expiry
        or usage; use validate_invite_code for that. This is purely a
        helper for the register flow to discover the test-seed flag.
        """
        async with self.pool.acquire() as conn:
            row = await conn.fetchrow(
                """
                SELECT code, max_uses, use_count, is_active,
                       COALESCE(marks_as_test, FALSE) as marks_as_test
                FROM invite_codes
                WHERE code = $1
                """,
                code,
            )
            if not row:
                return None
            return {
                "code": row["code"],
                "max_uses": row["max_uses"],
                "use_count": row["use_count"],
                "is_active": row["is_active"],
                "marks_as_test": row["marks_as_test"],
            }
```

- [ ] **Step 4: Verify with curl via admin panel endpoint**

Assuming you have an admin token from a local dev user, hit the existing admin invites listing:

```bash
# Replace TOKEN with a valid admin JWT
curl -s http://localhost:8000/api/admin/invites \
  -H "Authorization: Bearer $TOKEN" | jq '.codes[0]'
```

Expected: response includes `"marks_as_test": false` on at least one code.

- [ ] **Step 5: Commit**

```bash
git add server/services/admin_service.py
git commit -m "$(cat <<'EOF'
feat(server): expose marks_as_test on InviteCode

Adds the field to the dataclass, SELECT list in get_invite_codes,
and a new get_invite_code_details helper that the register flow
will use to discover whether an invite should flag new accounts
as test accounts.

Co-Authored-By: Claude Opus 4.6 (1M context)
EOF
)"
```

---

### Task 4: Wire register flow to set `is_test_account` from invite

When a user registers with an invite whose `marks_as_test=TRUE`, the new account is flagged. The plumbing lives in two places: the router reads the flag and passes it to the service; the service passes it to the store.
**Files:**
- Modify: `server/routers/auth.py`: `register` handler (L224–L320)
- Modify: `server/services/auth_service.py`: `register` method (L98–L178)

- [ ] **Step 1: Add `is_test_account` parameter to `auth_service.register`**

In `server/services/auth_service.py` at L98, add the new parameter:

```python
    async def register(
        self,
        username: str,
        password: str,
        email: Optional[str] = None,
        guest_id: Optional[str] = None,
        is_test_account: bool = False,
    ) -> RegistrationResult:
```

Update the docstring `Args:` block:

```
            is_test_account: Mark this user as a soak-harness test account.
```

Pass the value through to `create_user` at L146:

```python
        user = await self.user_store.create_user(
            username=username,
            password_hash=password_hash,
            email=email,
            role=UserRole.USER,
            guest_id=guest_id,
            verification_token=verification_token,
            verification_expires=verification_expires,
            is_test_account=is_test_account,
        )
```

- [ ] **Step 2: Update the router to resolve `marks_as_test` and pass it through**

In `server/routers/auth.py`, find the `register` handler at L224.
After the existing invite-code validation block (around L248–L252), fetch the invite details and compute `is_test_account`:

```python
    # --- Invite code validation ---
    is_test_account = False
    if has_invite:
        if not _admin_service:
            raise HTTPException(status_code=503, detail="Admin service not initialized")
        if not await _admin_service.validate_invite_code(request_body.invite_code):
            raise HTTPException(status_code=400, detail="Invalid or expired invite code")
        # Check if this invite flags new accounts as test accounts
        invite_details = await _admin_service.get_invite_code_details(request_body.invite_code)
        if invite_details and invite_details.get("marks_as_test"):
            is_test_account = True
```

Then pass it to `auth_service.register` at L276:

```python
    # --- Create the account ---
    result = await auth_service.register(
        username=request_body.username,
        password=request_body.password,
        email=request_body.email,
        is_test_account=is_test_account,
    )
```

- [ ] **Step 3: Flag the dev invite code for testing**

Before we can test end-to-end locally, we need an invite code with `marks_as_test=TRUE` in the local dev DB. Run (once, manually):

```bash
# First, check if 5VC2MCCN exists locally (it probably doesn't; that's staging's code).
# Create a local test invite code and flag it:
psql -d golfgame <<'EOF'
-- Create a local dev test-seed invite if not exists
INSERT INTO invite_codes (code, created_by, expires_at, max_uses, is_active, marks_as_test)
SELECT 'SOAKTEST', id, NOW() + INTERVAL '10 years', 100, TRUE, TRUE
FROM users_v2 WHERE role = 'admin' LIMIT 1
ON CONFLICT (code) DO UPDATE SET marks_as_test = TRUE;

-- Verify
SELECT code, max_uses, use_count, marks_as_test FROM invite_codes WHERE code = 'SOAKTEST';
EOF
```

Expected: `marks_as_test | t` in the last row.
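The router-to-service-to-store hand-off from Steps 1 and 2 can be sketched with in-memory fakes, which makes the data flow easy to follow before touching the real code. This is an illustration under stated assumptions, not the project's implementation: `FakeUser`, `FakeUserStore`, `FakeAuthService`, `INVITES`, and `register_handler` are all hypothetical stand-ins, and validation, password hashing, and HTTP handling are left out.

```python
import asyncio
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class FakeUser:
    """Hypothetical stand-in for the User dataclass."""
    username: str
    is_test_account: bool = False


class FakeUserStore:
    """Hypothetical in-memory replacement for user_store."""

    def __init__(self) -> None:
        self.users: Dict[str, FakeUser] = {}

    async def create_user(self, username: str, is_test_account: bool = False) -> FakeUser:
        user = FakeUser(username=username, is_test_account=is_test_account)
        self.users[username] = user
        return user


class FakeAuthService:
    """Hypothetical service layer: forwards the flag to the store unchanged."""

    def __init__(self, store: FakeUserStore) -> None:
        self.store = store

    async def register(self, username: str, is_test_account: bool = False) -> FakeUser:
        return await self.store.create_user(username, is_test_account=is_test_account)


# Stand-in for the invite_codes table, keyed by code.
INVITES = {
    "SOAKTEST": {"marks_as_test": True},
    "NORMAL01": {"marks_as_test": False},
}


async def register_handler(
    service: FakeAuthService, username: str, invite_code: Optional[str]
) -> FakeUser:
    """Router layer: resolve the invite's flag, then delegate to the service."""
    is_test_account = False
    if invite_code is not None:
        details = INVITES.get(invite_code)
        if details and details.get("marks_as_test"):
            is_test_account = True
    return await service.register(username, is_test_account=is_test_account)


async def demo():
    service = FakeAuthService(FakeUserStore())
    flagged = await register_handler(service, "soaktest_a", "SOAKTEST")
    normal = await register_handler(service, "realuser_a", "NORMAL01")
    open_signup = await register_handler(service, "realuser_b", None)
    return flagged, normal, open_signup


flagged, normal, open_signup = asyncio.run(demo())
```

Only the router decides the flag's value; the service and store pass it through with a `False` default, so existing callers that omit the argument keep their current behavior.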
- [ ] **Step 4: Verify register flow sets `is_test_account`**

Restart the dev server, then:

```bash
curl -sX POST http://localhost:8000/api/auth/register \
  -H 'Content-Type: application/json' \
  -d "{\"username\":\"soaktest_register1\",\"password\":\"$PW\",\"email\":\"soaktest_register1@example.com\",\"invite_code\":\"SOAKTEST\"}"

# Verify in DB
psql -d golfgame -c "SELECT username, is_test_account FROM users_v2 WHERE username = 'soaktest_register1';"
```

Expected: `is_test_account | t`.

- [ ] **Step 5: Verify non-test invite does NOT flag new accounts**

```bash
# Create a non-test invite
psql -d golfgame <<'EOF'
INSERT INTO invite_codes (code, created_by, expires_at, max_uses, is_active, marks_as_test)
SELECT 'NORMAL01', id, NOW() + INTERVAL '10 years', 10, TRUE, FALSE
FROM users_v2 WHERE role = 'admin' LIMIT 1
ON CONFLICT (code) DO UPDATE SET marks_as_test = FALSE;
EOF

curl -sX POST http://localhost:8000/api/auth/register \
  -H 'Content-Type: application/json' \
  -d "{\"username\":\"realuser_smoke1\",\"password\":\"$PW\",\"email\":\"realuser_smoke1@example.com\",\"invite_code\":\"NORMAL01\"}"

psql -d golfgame -c "SELECT username, is_test_account FROM users_v2 WHERE username = 'realuser_smoke1';"
```

Expected: `is_test_account | f`.

- [ ] **Step 6: Commit**

```bash
git add server/routers/auth.py server/services/auth_service.py
git commit -m "$(cat <<'EOF'
feat(server): register flow flags accounts from test-seed invites

When a user registers with an invite_code whose marks_as_test=TRUE,
their users_v2.is_test_account is set to TRUE. Normal invite codes
and invite-less signups are unaffected.

Co-Authored-By: Claude Opus 4.6 (1M context)
EOF
)"
```

---

### Task 5: Stats filtering (`include_test` parameter)

Thread an `include_test: bool = False` parameter through `get_leaderboard`, `get_player_rank`, and the corresponding router handlers. Default is `False`, so real users never see soak traffic.
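The filter this task threads through the stats queries is a single boolean predicate of the form `(include_test OR NOT is_test_account)`. The sketch below shows how that predicate behaves, using Python's built-in `sqlite3` in place of asyncpg and PostgreSQL (so the placeholder is `?` rather than `$3`). The `leaderboard` table, its rows, and this standalone `get_leaderboard` function are hypothetical and exist only for the illustration.

```python
import sqlite3
from typing import List

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE leaderboard (username TEXT, games_won INTEGER, is_test_account INTEGER)"
)
conn.executemany(
    "INSERT INTO leaderboard VALUES (?, ?, ?)",
    [
        ("alice", 12, 0),
        ("soaktest_1", 40, 1),  # soak-harness account
        ("bob", 7, 0),
    ],
)


def get_leaderboard(include_test: bool = False) -> List[str]:
    """Return usernames ordered by wins, hiding test accounts unless asked."""
    rows = conn.execute(
        """
        SELECT username
        FROM leaderboard
        WHERE (? OR NOT is_test_account)
        ORDER BY games_won DESC
        """,
        (int(include_test),),  # bind as 0/1 because SQLite has no boolean type
    ).fetchall()
    return [row[0] for row in rows]


default_view = get_leaderboard()
full_view = get_leaderboard(include_test=True)
```

When the parameter is true the predicate short-circuits to true for every row; when it is false only rows with `is_test_account` unset survive. That lets one query text serve both cases without string concatenation.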
**Files:**
- Modify: `server/services/stats_service.py`: `get_leaderboard` (L169), `get_player_rank` (L249)
- Modify: `server/routers/stats.py`: `get_leaderboard` route (L157), `get_player_rank` route (L227), `get_my_rank` route (L348)

- [ ] **Step 1: Add `include_test` to `get_leaderboard` service method**

In `server/services/stats_service.py` at L169:

```python
    async def get_leaderboard(
        self,
        metric: str = "wins",
        limit: int = 50,
        offset: int = 0,
        include_test: bool = False,
    ) -> List[LeaderboardEntry]:
```

Inside the method, find both SQL paths (materialized view and fallback). In the view path at L208, change the WHERE clause:

```python
            if view_exists:
                # Use materialized view for performance
                rows = await conn.fetch(f"""
                    SELECT
                        user_id, username, games_played, games_won,
                        win_rate, avg_score, knockouts, best_win_streak,
                        COALESCE(rating, 1500) as rating,
                        ROW_NUMBER() OVER (ORDER BY {column} {direction}) as rank
                    FROM leaderboard_overall
                    WHERE ($3 OR NOT is_test_account)
                    ORDER BY {column} {direction}
                    LIMIT $1 OFFSET $2
                """, limit, offset, include_test)
```

In the fallback path at L220, add the WHERE clause and parameter:

```python
            else:
                # Fall back to direct query
                rows = await conn.fetch(f"""
                    SELECT
                        s.user_id, u.username, s.games_played, s.games_won,
                        ROUND(s.games_won::numeric / NULLIF(s.games_played, 0) * 100, 1) as win_rate,
                        ROUND(s.total_points::numeric / NULLIF(s.total_rounds, 0), 1) as avg_score,
                        s.knockouts, s.best_win_streak,
                        COALESCE(s.rating, 1500) as rating,
                        ROW_NUMBER() OVER (ORDER BY {column} {direction}) as rank
                    FROM player_stats s
                    JOIN users_v2 u ON s.user_id = u.id
                    WHERE s.games_played >= 5
                    AND u.deleted_at IS NULL
                    AND (u.is_banned = false OR u.is_banned IS NULL)
                    AND ($3 OR NOT COALESCE(u.is_test_account, FALSE))
                    ORDER BY {column} {direction}
                    LIMIT $1 OFFSET $2
                """, limit, offset, include_test)
```

- [ ] **Step 2: Apply the same pattern to `get_player_rank`**

In `server/services/stats_service.py` at L249:

```python
    async def get_player_rank(
        self,
        user_id: str,
        metric: str = "wins",
        include_test: bool = False,
    ) -> Optional[int]:
```

Update both SQL paths to include the `include_test` filter. View path at L287:

```python
            if view_exists:
                row = await conn.fetchrow(f"""
                    SELECT rank FROM (
                        SELECT user_id, ROW_NUMBER() OVER (ORDER BY {column} {direction}) as rank
                        FROM leaderboard_overall
                        WHERE ($2 OR NOT is_test_account)
                    ) ranked
                    WHERE user_id = $1
                """, user_id, include_test)
```

Fallback path at L294:

```python
            else:
                row = await conn.fetchrow(f"""
                    SELECT rank FROM (
                        SELECT s.user_id, ROW_NUMBER() OVER (ORDER BY {column} {direction}) as rank
                        FROM player_stats s
                        JOIN users_v2 u ON s.user_id = u.id
                        WHERE s.games_played >= 5
                        AND u.deleted_at IS NULL
                        AND (u.is_banned = false OR u.is_banned IS NULL)
                        AND ($2 OR NOT COALESCE(u.is_test_account, FALSE))
                    ) ranked
                    WHERE user_id = $1
                """, user_id, include_test)
```

- [ ] **Step 3: Expose `include_test` as a query parameter on the leaderboard route**

In `server/routers/stats.py` at L157:

```python
@router.get("/leaderboard", response_model=LeaderboardResponse)
async def get_leaderboard(
    metric: str = Query("wins", pattern="^(wins|win_rate|avg_score|knockouts|streak|rating)$"),
    limit: int = Query(50, ge=1, le=100),
    offset: int = Query(0, ge=0),
    include_test: bool = Query(False, description="Include soak-harness test accounts"),
    service: StatsService = Depends(get_stats_service_dep),
):
    """
    Get leaderboard by metric.

    Metrics:
    - wins: Total games won
    - win_rate: Win percentage (requires 5+ games)
    - avg_score: Average points per round (lower is better)
    - knockouts: Times going out first
    - streak: Best win streak

    Players must have 5+ games to appear on leaderboards.
    By default, soak-harness test accounts are hidden.
    """
    entries = await service.get_leaderboard(metric, limit, offset, include_test)
```

- [ ] **Step 4: Same for `get_player_rank` and `get_my_rank` routes**

At L227:

```python
@router.get("/players/{user_id}/rank", response_model=PlayerRankResponse)
async def get_player_rank(
    user_id: str,
    metric: str = Query("wins", pattern="^(wins|win_rate|avg_score|knockouts|streak|rating)$"),
    include_test: bool = Query(False),
    service: StatsService = Depends(get_stats_service_dep),
):
    """Get player's rank on a leaderboard."""
    rank = await service.get_player_rank(user_id, metric, include_test)
```

At L348:

```python
@router.get("/me/rank", response_model=PlayerRankResponse)
async def get_my_rank(
    metric: str = Query("wins", pattern="^(wins|win_rate|avg_score|knockouts|streak|rating)$"),
    include_test: bool = Query(False),
    user: User = Depends(require_user),
    service: StatsService = Depends(get_stats_service_dep),
):
    """Get current user's rank on a leaderboard."""
    rank = await service.get_player_rank(user.id, metric, include_test)
```

- [ ] **Step 5: Verify filtering works via curl**

```bash
# Mark a test user we registered earlier as having games played (synthetic)
psql -d golfgame <<'EOF'
INSERT INTO player_stats (user_id, games_played, games_won, total_points, total_rounds, rounds_won)
SELECT id, 10, 8, 50, 30, 20 FROM users_v2 WHERE username = 'soaktest_register1'
ON CONFLICT (user_id) DO UPDATE SET games_played = 10, games_won = 8;

-- Refresh the matview so the test account shows up
REFRESH MATERIALIZED VIEW leaderboard_overall;
EOF

# Default (include_test=false) should NOT include soaktest_register1
curl -s "http://localhost:8000/api/stats/leaderboard?metric=wins" | jq '.entries[] | select(.username | startswith("soaktest_"))'

# include_test=true should include soaktest_register1
curl -s "http://localhost:8000/api/stats/leaderboard?metric=wins&include_test=true" | jq '.entries[] | select(.username | startswith("soaktest_"))'
```

Expected: first command returns nothing, second
returns a JSON object for `soaktest_register1`.

- [ ] **Step 6: Commit**

```bash
git add server/services/stats_service.py server/routers/stats.py
git commit -m "$(cat <<'EOF'
feat(server): stats queries support include_test filter

Leaderboard and rank queries take an optional include_test param
(default false). Real users never see soak-harness traffic unless
they explicitly opt in via ?include_test=true.

Co-Authored-By: Claude Opus 4.6 (1M context)
EOF
)"
```

---

### Task 6: Admin service + route surface `is_test_account`

`UserDetails` exposes the flag, `search_users` selects it, and the `list_users` admin route accepts an `include_test` query parameter.

**Files:**
- Modify: `server/services/admin_service.py`: `UserDetails` (L24–L58), `search_users` (L312–L382), `get_user` (L384–L428)
- Modify: `server/routers/admin.py`: `list_users` route (L80–L107)

- [ ] **Step 1: Add field to `UserDetails` dataclass**

In `server/services/admin_service.py` at L24, add to the dataclass:

```python
@dataclass
class UserDetails:
    """Extended user info for admin view."""
    id: str
    username: str
    email: Optional[str]
    role: str
    email_verified: bool
    is_banned: bool
    ban_reason: Optional[str]
    force_password_reset: bool
    created_at: datetime
    last_login: Optional[datetime]
    last_seen_at: Optional[datetime]
    is_active: bool
    games_played: int
    games_won: int
    is_test_account: bool = False
```

Update `to_dict` to include it:

```python
    def to_dict(self) -> dict:
        return {
            "id": self.id,
            "username": self.username,
            "email": self.email,
            "role": self.role,
            "email_verified": self.email_verified,
            "is_banned": self.is_banned,
            "ban_reason": self.ban_reason,
            "force_password_reset": self.force_password_reset,
            "created_at": self.created_at.isoformat() if self.created_at else None,
            "last_login": self.last_login.isoformat() if self.last_login else None,
            "last_seen_at": self.last_seen_at.isoformat() if self.last_seen_at else None,
            "is_active": self.is_active,
            "games_played": self.games_played,
            "games_won":
                self.games_won,
            "is_test_account": self.is_test_account,
        }
```

- [ ] **Step 2: Update `search_users` to SELECT and filter on `is_test_account`**

In `server/services/admin_service.py` at L312, add `include_test` parameter and column to the SELECT:

```python
    async def search_users(
        self,
        query: str = "",
        limit: int = 50,
        offset: int = 0,
        include_banned: bool = True,
        include_deleted: bool = False,
        include_test: bool = True,
    ) -> List[UserDetails]:
```

Modify the SQL to pull `is_test_account`:

```python
            sql = """
                SELECT u.id, u.username, u.email, u.role,
                       u.email_verified, u.is_banned, u.ban_reason,
                       u.force_password_reset, u.created_at, u.last_login,
                       u.last_seen_at, u.is_active,
                       COALESCE(u.is_test_account, FALSE) as is_test_account,
                       COALESCE(s.games_played, 0) as games_played,
                       COALESCE(s.games_won, 0) as games_won
                FROM users_v2 u
                LEFT JOIN player_stats s ON u.id = s.user_id
                WHERE 1=1
            """
```

After the existing `include_deleted` check, add:

```python
            if not include_test:
                sql += " AND (u.is_test_account = false OR u.is_test_account IS NULL)"
```

Update the `UserDetails(...)` construction in the list comprehension to include `is_test_account=row["is_test_account"]`.

- [ ] **Step 3: Update `get_user` (single-user lookup) similarly**

In `server/services/admin_service.py` at L384, add `COALESCE(u.is_test_account, FALSE) as is_test_account` to the SELECT and `is_test_account=row["is_test_account"]` to the `UserDetails(...)` construction. The `get_user` method does NOT need the filter parameter: admins looking up individual users should always see them.

- [ ] **Step 4: Add `include_test` to the admin `list_users` route**

In `server/routers/admin.py` at L80:

```python
@router.get("/users")
async def list_users(
    query: str = "",
    limit: int = 50,
    offset: int = 0,
    include_banned: bool = True,
    include_deleted: bool = False,
    include_test: bool = True,
    admin: User = Depends(require_admin_v2),
    service: AdminService = Depends(get_admin_service_dep),
):
    """
    Search and list users.
    Args:
        query: Search by username or email.
        limit: Maximum results to return.
        offset: Results to skip.
        include_banned: Include banned users.
        include_deleted: Include soft-deleted users.
        include_test: Include soak-harness test accounts (default true for admins).
    """
    users = await service.search_users(
        query=query,
        limit=limit,
        offset=offset,
        include_banned=include_banned,
        include_deleted=include_deleted,
        include_test=include_test,
    )
    return {"users": [u.to_dict() for u in users]}
```

Note: default is `True` for the admin path, since admins should see everything by default. The client-side toggle will explicitly pass `false` when the admin wants to hide test accounts.

- [ ] **Step 5: Verify via curl**

```bash
# Assuming admin token in $TOKEN env var
curl -s "http://localhost:8000/api/admin/users?query=soaktest" \
  -H "Authorization: Bearer $TOKEN" | jq '.users[] | {username, is_test_account}'

curl -s "http://localhost:8000/api/admin/users?query=soaktest&include_test=false" \
  -H "Authorization: Bearer $TOKEN" | jq '.users[]'
```

Expected: first returns users with `is_test_account: true`; second returns empty (test accounts filtered out).

- [ ] **Step 6: Commit**

```bash
git add server/services/admin_service.py server/routers/admin.py
git commit -m "$(cat <<'EOF'
feat(server): admin users list surfaces is_test_account

UserDetails carries the new column, search_users selects and
optionally filters on it, and the /api/admin/users route accepts
?include_test=false to hide soak-harness accounts.

Co-Authored-By: Claude Opus 4.6 (1M context)
EOF
)"
```

---

### Task 7: Admin panel UI: Test badge and filter toggle

Add a visible `[Test]` badge on test accounts in the admin user list, a `[Test-seed]` indicator on invite codes that mark new accounts as test, and an "Include test accounts" checkbox next to the existing "Include banned" toggle.
**Files:** - Modify: `client/admin.html` — add the new toggle near the existing `#include-banned` checkbox - Modify: `client/admin.js` — `loadUsers` (L305), `getStatusBadge` (L246), the invite codes renderer (L443) - [ ] **Step 1: Add the "Include test accounts" checkbox to admin.html** In `client/admin.html`, find the existing `#include-banned` checkbox (it's in the users tab filter bar — grep for it). Add a sibling checkbox right after: ```bash grep -n "include-banned" client/admin.html ``` Add next to that line: ```html ``` - [ ] **Step 2: Read the new checkbox in `loadUsers` and pass to getUsers** In `client/admin.js` at L305: ```javascript async function loadUsers() { try { const query = document.getElementById('user-search').value; const includeBanned = document.getElementById('include-banned').checked; const includeTest = document.getElementById('include-test').checked; const data = await getUsers(query, usersPage * PAGE_SIZE, includeBanned, includeTest); ``` Find `getUsers` at L70 and add the new parameter: ```javascript async function getUsers(query = '', offset = 0, includeBanned = true, includeTest = true) { const params = new URLSearchParams({ query, limit: PAGE_SIZE, offset, include_banned: includeBanned, include_test: includeTest, }); return apiRequest(`/api/admin/users?${params}`); } ``` Note: the existing signature builds a URLSearchParams — check the actual code at L70 and match its style; the key change is adding `include_test: includeTest` to the params. - [ ] **Step 3: Add a "Test" badge to the user table row** In `client/admin.js` at L314, modify the table row template to render a Test badge inline with the status badge: ```javascript data.users.forEach(user => { const testBadge = user.is_test_account ? 
'Test' : ''; tbody.innerHTML += ` ${escapeHtml(user.username)} ${testBadge} ${escapeHtml(user.email || '-')} ${user.role} ${getStatusBadge(user)} ${user.games_played} (${user.games_won} wins) ${formatDateShort(user.created_at)} `; }); ``` - [ ] **Step 4: Add Test-seed indicator to invite codes list** In `client/admin.js` around L443 (invite codes list renderer), find the row template and add a `[Test-seed]` badge when `invite.marks_as_test`: ```bash grep -n "invite.is_active\|invite.code\|invites-tbody\|invites-table" client/admin.js | head ``` Once located, modify the row template to include: ```javascript const testSeedBadge = invite.marks_as_test ? 'Test-seed' : ''; // Insert testSeedBadge into the invite code column, e.g. // ${escapeHtml(invite.code)} ${testSeedBadge} ``` - [ ] **Step 5: Wire the checkbox change event to reload users** Find where `#include-banned` has its `change` listener attached (grep for it in admin.js): ```bash grep -n "include-banned.*addEventListener\|include-banned" client/admin.js ``` Add a parallel listener for `#include-test` that calls `loadUsers()`: ```javascript document.getElementById('include-test').addEventListener('change', () => { usersPage = 0; loadUsers(); }); ``` - [ ] **Step 6: Manual verification in browser** 1. Open http://localhost:8000/admin.html 2. Log in as admin 3. Navigate to Users tab 4. Search for "soaktest" 5. Confirm the `[Test]` badge appears next to `soaktest_register1` 6. Uncheck "Include test accounts" — the row should disappear 7. Re-check it — the row should return 8. Navigate to Invite Codes tab 9. 
Confirm the `[Test-seed]` badge appears next to the `SOAKTEST` code - [ ] **Step 7: Commit** ```bash git add client/admin.html client/admin.js git commit -m "$(cat <<'EOF' feat(admin): visible Test/Test-seed badges + filter toggle Users table shows [Test] next to soak-harness accounts, invite codes list shows [Test-seed] next to codes that flag new accounts as test, and a new "Include test accounts" checkbox lets admins hide bot traffic from the user list. Co-Authored-By: Claude Opus 4.6 (1M context) EOF )" ``` --- ### Task 8: Document the one-time staging setup step The staging invite code `5VC2MCCN` needs to be flagged as test-seed before the harness can run against staging. This is a manual one-liner; document it in a new bring-up doc. **Files:** - Create: `docs/soak-harness-bringup.md` - [ ] **Step 1: Create the bring-up doc** ```bash cat > docs/soak-harness-bringup.md <<'EOF' # Soak Harness Bring-Up One-time setup steps before running `tests/soak` against an environment. ## Prerequisites - An invite code exists with 16+ available uses - You have psql access to the target DB (or admin SQL access via some other means) ## 1. Flag the invite code as test-seed Any account registered with a `marks_as_test=TRUE` invite code gets `users_v2.is_test_account=TRUE`, which keeps it out of real-user stats. ### Staging Invite code: `5VC2MCCN` (16 uses, provisioned 2026-04-10). ```sql UPDATE invite_codes SET marks_as_test = TRUE WHERE code = '5VC2MCCN'; SELECT code, max_uses, use_count, marks_as_test FROM invite_codes WHERE code = '5VC2MCCN'; ``` Expected: `marks_as_test | t`. ### Local dev The dev DB already has a `SOAKTEST` invite created during Task 4 of the implementation plan. 
If you wiped the DB since, recreate it: ```sql INSERT INTO invite_codes (code, created_by, expires_at, max_uses, is_active, marks_as_test) SELECT 'SOAKTEST', id, NOW() + INTERVAL '10 years', 100, TRUE, TRUE FROM users_v2 WHERE role = 'admin' LIMIT 1 ON CONFLICT (code) DO UPDATE SET marks_as_test = TRUE; ``` ## 2. Run the harness ```bash cd tests/soak npm install npm run seed # first run only, populates .env.stresstest TEST_URL=http://localhost:8000 npm run smoke # 30s end-to-end check ``` For staging: ```bash TEST_URL=https://staging.adlee.work npm run soak -- --scenario=populate ``` See `tests/soak/README.md` for the full flag reference. EOF ``` - [ ] **Step 2: Commit** ```bash git add docs/soak-harness-bringup.md git commit -m "$(cat <<'EOF' docs: soak harness bring-up steps Documents the one-time UPDATE invite_codes SET marks_as_test = TRUE step required before running tests/soak against each environment, plus the local dev SOAKTEST invite recreation SQL. Co-Authored-By: Claude Opus 4.6 (1M context) EOF )" ``` --- ## Phase 2 — Harness scaffolding ### Task 9: Create the `tests/soak/` package skeleton Bare minimum to get `tsx` running against an empty entry point. No behavior yet. 
**Files:** - Create: `tests/soak/package.json` - Create: `tests/soak/tsconfig.json` - Create: `tests/soak/.gitignore` - Create: `tests/soak/.env.stresstest.example` - Create: `tests/soak/README.md` (stub) - Create: `tests/soak/runner.ts` (stub — prints "hello") - [ ] **Step 1: Create `tests/soak/package.json`** ```json { "name": "golf-soak", "version": "0.1.0", "private": true, "description": "Multiplayer soak & UX test harness for Golf Card Game", "scripts": { "soak": "tsx runner.ts", "soak:populate": "tsx runner.ts --scenario=populate", "soak:stress": "tsx runner.ts --scenario=stress", "seed": "tsx scripts/seed-accounts.ts", "smoke": "bash scripts/smoke.sh", "test": "vitest run" }, "dependencies": { "playwright-core": "^1.40.0", "ws": "^8.16.0" }, "devDependencies": { "tsx": "^4.7.0", "@types/ws": "^8.5.0", "@types/node": "^20.10.0", "typescript": "^5.3.0", "vitest": "^1.2.0" } } ``` - [ ] **Step 2: Create `tests/soak/tsconfig.json`** ```json { "compilerOptions": { "target": "ES2022", "module": "commonjs", "moduleResolution": "node", "strict": true, "esModuleInterop": true, "skipLibCheck": true, "forceConsistentCasingInFileNames": true, "resolveJsonModule": true, "declaration": false, "sourceMap": true, "outDir": "./dist", "rootDir": ".", "baseUrl": ".", "lib": ["ES2022", "DOM"], "paths": { "@soak/*": ["./*"], "@bot/*": ["../e2e/bot/*"] } }, "include": ["**/*.ts"], "exclude": ["node_modules", "dist", "artifacts"] } ``` - [ ] **Step 3: Create `tests/soak/.gitignore`** ``` node_modules/ dist/ artifacts/ .env.stresstest *.log ``` - [ ] **Step 4: Create `tests/soak/.env.stresstest.example`** ``` # Soak harness account cache. # This file is AUTO-GENERATED on first run; do not edit by hand. 
# Format: SOAK_ACCOUNT_NN=username:password:token # # Example (delete before first real run): # SOAK_ACCOUNT_00=soak_00_a7bx:<password>:<token> ``` - [ ] **Step 5: Create `tests/soak/README.md` (stub — expanded in Task 31)** ```markdown # Golf Soak & UX Test Harness Runs 16 authenticated browser sessions across 4 rooms to populate staging scoreboards and stress-test multiplayer stability. **Spec:** `docs/superpowers/specs/2026-04-10-multiplayer-soak-test-design.md` **Bring-up:** `docs/soak-harness-bringup.md` ## Quick start ```bash npm install npm run seed # first run only TEST_URL=http://localhost:8000 npm run smoke ``` Full documentation arrives with Task 31. ``` - [ ] **Step 6: Create `tests/soak/runner.ts` as a placeholder** ```typescript #!/usr/bin/env tsx /** * Golf Soak Harness — entry point. * * Placeholder. Full runner lands in Task 17. */ async function main(): Promise<void> { console.log('golf-soak runner (placeholder)'); console.log('Full implementation lands in Task 17 of the plan.'); } main().catch((err) => { console.error(err); process.exit(1); }); ``` - [ ] **Step 7: Install deps and verify runner executes** ```bash cd tests/soak npm install npx tsx runner.ts ``` Expected output: ``` golf-soak runner (placeholder) Full implementation lands in Task 17 of the plan. ``` - [ ] **Step 8: Commit** ```bash git add tests/soak/package.json tests/soak/package-lock.json tests/soak/tsconfig.json tests/soak/.gitignore tests/soak/.env.stresstest.example tests/soak/README.md tests/soak/runner.ts git commit -m "$(cat <<'EOF' feat(soak): scaffold tests/soak package Placeholder runner, tsconfig with @bot alias to tests/e2e/bot, gitignored .env.stresstest + artifacts. Real behavior follows in Task 10 onward. Co-Authored-By: Claude Opus 4.6 (1M context) EOF )" ``` --- ### Task 10: Core types and `Deferred` helper Pure TypeScript with Vitest tests. No browser, no network. Establishes the type surface the rest of the harness will target.
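As a rough illustration of the handoff that the `Deferred` helper in this task is meant to support, the sketch below inlines a stripped-down deferred and has a "joiner" wait on a room code that a "host" supplies later. The names `sketchDeferred` and `demoHandoff`, the 10 ms delay, and the `ABCD` code are made up for illustration; the real helper and its tests are defined in the steps that follow.

```typescript
// Illustrative only: a stripped-down deferred, inlined so the sketch is self-contained.
interface SketchDeferred<T> {
  promise: Promise<T>;
  resolve(value: T): void;
}

function sketchDeferred<T>(): SketchDeferred<T> {
  let resolve!: (value: T) => void;
  const promise = new Promise<T>((res) => {
    resolve = res;
  });
  return { promise, resolve };
}

// Hypothetical handoff: the joiner awaits a room code the host has not produced yet.
async function demoHandoff(): Promise<string> {
  const roomCode = sketchDeferred<string>();
  const joiner = roomCode.promise.then((code) => `joined ${code}`);
  // The host resolves later, from outside the promise executor.
  setTimeout(() => roomCode.resolve('ABCD'), 10);
  return joiner;
}
```

The point of the pattern is that the joiner never polls: it simply awaits a promise whose resolver is held by someone else.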
**Files:** - Create: `tests/soak/core/types.ts` - Create: `tests/soak/core/deferred.ts` - Create: `tests/soak/tests/deferred.test.ts` - [ ] **Step 1: Write the failing test for `Deferred`** Create `tests/soak/tests/deferred.test.ts`: ```typescript import { describe, it, expect } from 'vitest'; import { deferred } from '../core/deferred'; describe('deferred', () => { it('resolves with the given value', async () => { const d = deferred<string>(); d.resolve('hello'); await expect(d.promise).resolves.toBe('hello'); }); it('rejects with the given error', async () => { const d = deferred<string>(); const err = new Error('boom'); d.reject(err); await expect(d.promise).rejects.toBe(err); }); it('ignores second resolve calls', async () => { const d = deferred<number>(); d.resolve(1); d.resolve(2); await expect(d.promise).resolves.toBe(1); }); }); ``` - [ ] **Step 2: Run the test to verify it fails** ```bash cd tests/soak npx vitest run tests/deferred.test.ts ``` Expected: FAIL — module `../core/deferred` does not exist. - [ ] **Step 3: Implement `deferred`** Create `tests/soak/core/deferred.ts`: ```typescript /** * Promise deferred primitive — lets external code resolve or reject * a promise. Used by RoomCoordinator for host→joiners handoff. */ export interface Deferred<T> { promise: Promise<T>; resolve(value: T): void; reject(error: unknown): void; } export function deferred<T>(): Deferred<T> { let resolve!: (value: T) => void; let reject!: (error: unknown) => void; const promise = new Promise<T>((res, rej) => { resolve = res; reject = rej; }); return { promise, resolve, reject }; } ``` - [ ] **Step 4: Run tests to verify they pass** ```bash npx vitest run tests/deferred.test.ts ``` Expected: 3 passed. - [ ] **Step 5: Create `core/types.ts` with the scenario interfaces** ```typescript /** * Core type definitions for the soak harness. * * Contracts here are consumed by runner.ts, SessionPool, scenarios, * and the dashboard. Keep this file small and stable.
*/ import type { BrowserContext, Page } from 'playwright-core'; import type { GolfBot } from '../../e2e/bot/golf-bot'; // ============================================================================= // Accounts & sessions // ============================================================================= export interface Account { /** Stable key used in logs, e.g. "soak_00". */ key: string; username: string; password: string; /** JWT returned from /api/auth/login, may be refreshed by SessionPool. */ token: string; } export interface Session { account: Account; context: BrowserContext; page: Page; bot: GolfBot; /** Convenience mirror of account.key. */ key: string; } // ============================================================================= // Scenarios // ============================================================================= export interface ScenarioNeeds { /** Total number of authenticated sessions the scenario requires. */ accounts: number; /** How many rooms to partition sessions into (default: 1). */ rooms?: number; /** CPUs to add per room (default: 0). */ cpusPerRoom?: number; } /** Free-form per-scenario config merged with CLI flags. */ export type ScenarioConfig = Record<string, unknown>; export interface ScenarioError { room: string; reason: string; detail?: string; timestamp: number; } export interface ScenarioResult { gamesCompleted: number; errors: ScenarioError[]; durationMs: number; customMetrics?: Record<string, number>; } export interface ScenarioContext { /** Merged config: CLI flags → env → scenario defaults → runner defaults. */ config: ScenarioConfig; /** Pre-authenticated sessions; ordered. */ sessions: Session[]; coordinator: RoomCoordinatorApi; dashboard: DashboardReporter; logger: Logger; signal: AbortSignal; /** Reset the per-room watchdog. Call at each progress point.
*/ heartbeat(roomId: string): void; } export interface Scenario { name: string; description: string; defaultConfig: ScenarioConfig; needs: ScenarioNeeds; run(ctx: ScenarioContext): Promise<ScenarioResult>; } // ============================================================================= // Room coordination // ============================================================================= export interface RoomCoordinatorApi { announce(roomId: string, code: string): void; await(roomId: string, timeoutMs?: number): Promise<string>; } // ============================================================================= // Dashboard reporter // ============================================================================= export interface RoomState { phase?: string; currentPlayer?: string; hole?: number; totalHoles?: number; game?: number; totalGames?: number; moves?: number; players?: Array<{ key: string; score: number | null; isActive: boolean }>; message?: string; } export interface DashboardReporter { update(roomId: string, state: Partial<RoomState>): void; log(level: 'info' | 'warn' | 'error', msg: string, meta?: object): void; incrementMetric(name: string, by?: number): void; } // ============================================================================= // Logger // ============================================================================= export type LogLevel = 'debug' | 'info' | 'warn' | 'error'; export interface Logger { debug(msg: string, meta?: object): void; info(msg: string, meta?: object): void; warn(msg: string, meta?: object): void; error(msg: string, meta?: object): void; child(meta: object): Logger; } ``` - [ ] **Step 6: Verify tsx still parses the runner** ```bash cd tests/soak npx tsx runner.ts ``` Expected: still prints the placeholder output; no TypeScript errors from the new `core/` files (they're not imported yet).
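To make the contracts in `core/types.ts` a little more concrete, here is a minimal sketch of what a scenario module could look like. It is an illustration under stated assumptions, not one of the plan's scenarios: the `noop` scenario, the trimmed-down `Mini*` types, the `room-0` ID, and the `fakeCtx` object are all hypothetical stand-ins so the block runs without Playwright.

```typescript
// Trimmed, hypothetical mirrors of the core types so this sketch has no imports.
interface MiniResult {
  gamesCompleted: number;
  errors: { room: string; reason: string; timestamp: number }[];
  durationMs: number;
}

interface MiniContext {
  config: Record<string, unknown>;
  signal: AbortSignal;
  heartbeat(roomId: string): void;
}

interface MiniScenario {
  name: string;
  description: string;
  defaultConfig: Record<string, unknown>;
  needs: { accounts: number; rooms?: number; cpusPerRoom?: number };
  run(ctx: MiniContext): Promise<MiniResult>;
}

// A do-nothing scenario: "plays" config.games rounds, heartbeating once per round.
const noopScenario: MiniScenario = {
  name: 'noop',
  description: 'Illustrative scenario that only exercises the contract',
  defaultConfig: { games: 3 },
  needs: { accounts: 4, rooms: 1 },
  async run(ctx) {
    const start = Date.now();
    const games = Number(ctx.config.games ?? 0);
    let gamesCompleted = 0;
    for (let i = 0; i < games && !ctx.signal.aborted; i++) {
      ctx.heartbeat('room-0'); // reset the watchdog at each progress point
      gamesCompleted++;
    }
    return { gamesCompleted, errors: [], durationMs: Date.now() - start };
  },
};

// Hypothetical driver: merge scenario defaults with overrides and run once.
async function runNoop(
  overrides: Record<string, unknown> = {},
): Promise<{ result: MiniResult; beats: number }> {
  let beats = 0;
  const fakeCtx: MiniContext = {
    config: { ...noopScenario.defaultConfig, ...overrides },
    signal: new AbortController().signal,
    heartbeat: () => {
      beats++;
    },
  };
  const result = await noopScenario.run(fakeCtx);
  return { result, beats };
}
```

Calling `runNoop({ games: 2 })` overrides the scenario default, which mirrors the "CLI flags over scenario defaults" merge order described in the `ScenarioContext.config` comment.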
- [ ] **Step 7: Commit** ```bash git add tests/soak/core/deferred.ts tests/soak/core/types.ts tests/soak/tests/deferred.test.ts git commit -m "$(cat <<'EOF' feat(soak): core types + Deferred primitive Establishes the Scenario/Session/Logger/DashboardReporter contracts the rest of the harness builds on. Deferred is the building block for RoomCoordinator's host→joiners handoff. Co-Authored-By: Claude Opus 4.6 (1M context) EOF )" ``` --- ### Task 11: RoomCoordinator with tests Tiny abstraction over `Deferred` keyed by room ID, with a timeout on `await`. **Files:** - Create: `tests/soak/core/room-coordinator.ts` - Create: `tests/soak/tests/room-coordinator.test.ts` - [ ] **Step 1: Write failing tests** ```typescript // tests/soak/tests/room-coordinator.test.ts import { describe, it, expect } from 'vitest'; import { RoomCoordinator } from '../core/room-coordinator'; describe('RoomCoordinator', () => { it('resolves await with the announced code (announce then await)', async () => { const rc = new RoomCoordinator(); rc.announce('room-1', 'ABCD'); await expect(rc.await('room-1')).resolves.toBe('ABCD'); }); it('resolves await with the announced code (await then announce)', async () => { const rc = new RoomCoordinator(); const p = rc.await('room-2'); rc.announce('room-2', 'WXYZ'); await expect(p).resolves.toBe('WXYZ'); }); it('rejects await after timeout if not announced', async () => { const rc = new RoomCoordinator(); await expect(rc.await('room-3', 50)).rejects.toThrow(/timed out/i); }); it('isolates rooms — announcing room-A does not unblock room-B', async () => { const rc = new RoomCoordinator(); const pB = rc.await('room-B', 100); rc.announce('room-A', 'A-CODE'); await expect(pB).rejects.toThrow(/timed out/i); }); }); ``` - [ ] **Step 2: Run tests to verify they fail** ```bash npx vitest run tests/room-coordinator.test.ts ``` Expected: FAIL — module not found. 
- [ ] **Step 3: Implement `RoomCoordinator`** ```typescript // tests/soak/core/room-coordinator.ts import { deferred, Deferred } from './deferred'; import type { RoomCoordinatorApi } from './types'; export class RoomCoordinator implements RoomCoordinatorApi { private rooms = new Map<string, Deferred<string>>(); announce(roomId: string, code: string): void { this.getOrCreate(roomId).resolve(code); } async await(roomId: string, timeoutMs: number = 30_000): Promise<string> { const d = this.getOrCreate(roomId); let timer: NodeJS.Timeout | undefined; const timeout = new Promise<never>((_, reject) => { timer = setTimeout(() => { reject(new Error(`RoomCoordinator: room "${roomId}" timed out after ${timeoutMs}ms`)); }, timeoutMs); }); try { return await Promise.race([d.promise, timeout]); } finally { if (timer) clearTimeout(timer); } } private getOrCreate(roomId: string): Deferred<string> { let d = this.rooms.get(roomId); if (!d) { d = deferred<string>(); this.rooms.set(roomId, d); } return d; } } ``` - [ ] **Step 4: Verify tests pass** ```bash npx vitest run tests/room-coordinator.test.ts ``` Expected: 4 passed. - [ ] **Step 5: Commit** ```bash git add tests/soak/core/room-coordinator.ts tests/soak/tests/room-coordinator.test.ts git commit -m "$(cat <<'EOF' feat(soak): RoomCoordinator with host→joiners handoff Lazy Deferred per roomId with a timeout on await. Lets concurrent joiner sessions block until their host announces the room code without polling or page scraping. Co-Authored-By: Claude Opus 4.6 (1M context) EOF )" ``` --- ### Task 12: Structured JSONL logger Single module, no transport, writes to `process.stdout`. Supports child loggers with bound metadata (so scenarios can emit logs with `room` / `game` context without repeating it).
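The child-logger idea can be sketched in a few lines. This is a simplified illustration rather than the implementation from the steps below: `makeLog`, `SketchLog`, and the in-memory `lines` buffer are hypothetical names, and level filtering and timestamps are left out to keep the focus on metadata binding.

```typescript
type Meta = Record<string, unknown>;

interface SketchLog {
  info(msg: string, meta?: Meta): void;
  child(meta: Meta): SketchLog;
}

// Collects output in memory so the sketch needs no stdout wiring.
const lines: string[] = [];

function makeLog(bound: Meta = {}): SketchLog {
  return {
    info(msg, meta = {}) {
      // Bound metadata first, then per-call metadata, so a call can override it.
      lines.push(JSON.stringify({ level: 'info', msg, ...bound, ...meta }));
    },
    child(meta) {
      // A child is a new logger whose bound metadata extends the parent's.
      return makeLog({ ...bound, ...meta });
    },
  };
}

const root = makeLog({ runId: 'r1' });
const roomLog = root.child({ room: 'room-1' });
const gameLog = roomLog.child({ game: 2 });
gameLog.info('turn_done', { turnMs: 120 });
```

After this runs, `lines` holds a single JSON line carrying `runId`, `room`, `game`, and `turnMs` together, even though the call site only passed `turnMs`.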
**Files:** - Create: `tests/soak/core/logger.ts` - Create: `tests/soak/tests/logger.test.ts` - [ ] **Step 1: Write failing tests** ```typescript // tests/soak/tests/logger.test.ts import { describe, it, expect, beforeEach, vi } from 'vitest'; import { createLogger } from '../core/logger'; describe('logger', () => { let writes: string[]; let write: (s: string) => boolean; beforeEach(() => { writes = []; write = (s: string) => { writes.push(s); return true; }; }); it('emits a JSON line per call with level and msg', () => { const log = createLogger({ runId: 'r1', write }); log.info('hello'); expect(writes).toHaveLength(1); const parsed = JSON.parse(writes[0]); expect(parsed.level).toBe('info'); expect(parsed.msg).toBe('hello'); expect(parsed.runId).toBe('r1'); expect(parsed.timestamp).toBeTypeOf('string'); }); it('merges meta into the log line', () => { const log = createLogger({ runId: 'r1', write }); log.warn('slow', { turnMs: 3000 }); const parsed = JSON.parse(writes[0]); expect(parsed.turnMs).toBe(3000); expect(parsed.level).toBe('warn'); }); it('child logger inherits parent meta', () => { const log = createLogger({ runId: 'r1', write }); const roomLog = log.child({ room: 'room-1' }); roomLog.info('game_start'); const parsed = JSON.parse(writes[0]); expect(parsed.room).toBe('room-1'); expect(parsed.runId).toBe('r1'); }); it('respects minimum level', () => { const log = createLogger({ runId: 'r1', write, minLevel: 'warn' }); log.debug('nope'); log.info('nope'); log.warn('yes'); log.error('yes'); expect(writes).toHaveLength(2); }); }); ``` - [ ] **Step 2: Run tests to verify they fail** ```bash npx vitest run tests/logger.test.ts ``` Expected: FAIL — module not found. 
- [ ] **Step 3: Implement the logger** ```typescript // tests/soak/core/logger.ts import type { Logger, LogLevel } from './types'; const LEVEL_ORDER: Record<LogLevel, number> = { debug: 0, info: 1, warn: 2, error: 3, }; export interface LoggerOptions { runId: string; minLevel?: LogLevel; /** Defaults to process.stdout.write bound to stdout. Override for tests. */ write?: (line: string) => boolean; baseMeta?: Record<string, unknown>; } export function createLogger(opts: LoggerOptions): Logger { const minLevel = opts.minLevel ?? 'info'; const write = opts.write ?? ((s: string) => process.stdout.write(s)); const baseMeta = opts.baseMeta ?? {}; function emit(level: LogLevel, msg: string, meta?: object): void { if (LEVEL_ORDER[level] < LEVEL_ORDER[minLevel]) return; const line = JSON.stringify({ timestamp: new Date().toISOString(), level, msg, runId: opts.runId, ...baseMeta, ...(meta ?? {}), }) + '\n'; write(line); } const logger: Logger = { debug: (msg, meta) => emit('debug', msg, meta), info: (msg, meta) => emit('info', msg, meta), warn: (msg, meta) => emit('warn', msg, meta), error: (msg, meta) => emit('error', msg, meta), child: (meta) => createLogger({ runId: opts.runId, minLevel, write, baseMeta: { ...baseMeta, ...meta }, }), }; return logger; } ``` - [ ] **Step 4: Verify tests pass** ```bash npx vitest run tests/logger.test.ts ``` Expected: 4 passed. - [ ] **Step 5: Commit** ```bash git add tests/soak/core/logger.ts tests/soak/tests/logger.test.ts git commit -m "$(cat <<'EOF' feat(soak): structured JSONL logger with child contexts Single file, no transport, writes one JSON line per call to stdout. Child loggers inherit parent meta so scenarios can bind room/game context once and forget about it. Co-Authored-By: Claude Opus 4.6 (1M context) EOF )" ``` --- ## Phase 3 — SessionPool and seeding ### Task 13: SessionPool with HTTP registration and localStorage warm-start This is the biggest single module.
It owns browser context lifecycle, seeds accounts on cold start, logs in on warm start, and exposes a simple `acquire()` API to scenarios. **Files:** - Create: `tests/soak/core/session-pool.ts` Testing: manual via `scripts/seed-accounts.ts` in Task 14 and the first real runner invocation in Task 17. No Vitest test for this — it's an integration module that needs a real browser. - [ ] **Step 1: Create `tests/soak/core/session-pool.ts` — imports and types** ```typescript // tests/soak/core/session-pool.ts import * as fs from 'fs'; import * as path from 'path'; import { Browser, BrowserContext, chromium, } from 'playwright-core'; import { GolfBot } from '../../e2e/bot/golf-bot'; import type { Account, Session, Logger } from './types'; export interface SeedOptions { /** Full base URL of the target server, e.g. https://staging.adlee.work. */ targetUrl: string; /** Invite code to pass to /api/auth/register. */ inviteCode: string; /** Number of accounts to create. */ count: number; } export interface SessionPoolOptions { targetUrl: string; inviteCode: string; credFile: string; // absolute path to .env.stresstest logger: Logger; /** Optional override for the browser to attach contexts to. If absent, SessionPool launches its own. */ browser?: Browser; /** Passed through to context.newContext. Useful for viewport overrides in tests. 
*/ contextOptions?: Parameters<Browser['newContext']>[0]; } ``` - [ ] **Step 2: Implement cred-file read/write** Append to `session-pool.ts`: ```typescript function readCredFile(filePath: string): Account[] | null { if (!fs.existsSync(filePath)) return null; const content = fs.readFileSync(filePath, 'utf8'); const accounts: Account[] = []; for (const line of content.split('\n')) { const trimmed = line.trim(); if (!trimmed || trimmed.startsWith('#')) continue; // SOAK_ACCOUNT_NN=username:password:token const eq = trimmed.indexOf('='); if (eq === -1) continue; const key = trimmed.slice(0, eq); const value = trimmed.slice(eq + 1); const m = key.match(/^SOAK_ACCOUNT_(\d+)$/); if (!m) continue; const [username, password, token] = value.split(':'); if (!username || !password || !token) continue; const idx = parseInt(m[1], 10); accounts.push({ key: `soak_${String(idx).padStart(2, '0')}`, username, password, token, }); } return accounts.length > 0 ? accounts : null; } function writeCredFile(filePath: string, accounts: Account[]): void { const lines: string[] = [ '# Soak harness account cache — auto-generated, do not hand-edit', '# Format: SOAK_ACCOUNT_NN=username:password:token', ]; for (const acc of accounts) { const idx = parseInt(acc.key.replace('soak_', ''), 10); const key = `SOAK_ACCOUNT_${String(idx).padStart(2, '0')}`; lines.push(`${key}=${acc.username}:${acc.password}:${acc.token}`); } fs.writeFileSync(filePath, lines.join('\n') + '\n', { mode: 0o600 }); } ``` - [ ] **Step 3: Implement the HTTP register call** ```typescript interface RegisterResponse { user: { id: string; username: string }; token: string; expires_at: string; } async function registerAccount( targetUrl: string, username: string, password: string, email: string, inviteCode: string, ): Promise<string> { const res = await fetch(`${targetUrl}/api/auth/register`, { method: 'POST', headers: { 'Content-Type': 'application/json' }, body: JSON.stringify({ username, password, email, invite_code: inviteCode }), }); if (!res.ok) { const
body = await res.text().catch(() => ''); throw new Error(`register failed: ${res.status} ${body}`); } const data = (await res.json()) as RegisterResponse; if (!data.token) { throw new Error(`register returned no token: ${JSON.stringify(data)}`); } return data.token; } async function loginAccount( targetUrl: string, username: string, password: string, ): Promise<string> { const res = await fetch(`${targetUrl}/api/auth/login`, { method: 'POST', headers: { 'Content-Type': 'application/json' }, body: JSON.stringify({ username, password }), }); if (!res.ok) { const body = await res.text().catch(() => ''); throw new Error(`login failed: ${res.status} ${body}`); } const data = (await res.json()) as RegisterResponse; return data.token; } function randomSuffix(): string { return Math.random().toString(36).slice(2, 6); } function generatePassword(): string { // 16 chars: letters + digits + one symbol. Meets 8-char minimum from auth_service. // Split across halves so repo secret-scanners don't flag the string as base64 const lower = 'abcdefghijkm' + 'npqrstuvwxyz'; // pragma: allowlist secret const upper = 'ABCDEFGHJKLM' + 'NPQRSTUVWXYZ'; // pragma: allowlist secret const digits = '23456789'; const chars = lower + upper + digits; let out = ''; for (let i = 0; i < 15; i++) { out += chars[Math.floor(Math.random() * chars.length)]; } return out + '!'; } ``` - [ ] **Step 4: Implement the `SessionPool` class** ```typescript export class SessionPool { private accounts: Account[] = []; private ownedBrowser: Browser | null = null; private browser: Browser | null; private activeSessions: Session[] = []; constructor(private opts: SessionPoolOptions) { this.browser = opts.browser ?? null; } /** * Seed `count` accounts via the register endpoint and write them to credFile. * Safe to call multiple times — skips accounts already in the file. */ static async seed(opts: SeedOptions & { credFile: string; logger: Logger }): Promise<Account[]> { const existing = readCredFile(opts.credFile) ??
[]; const existingKeys = new Set(existing.map((a) => a.key)); const created: Account[] = [...existing]; for (let i = 0; i < opts.count; i++) { const key = `soak_${String(i).padStart(2, '0')}`; if (existingKeys.has(key)) continue; const suffix = randomSuffix(); const username = `${key}_${suffix}`; const password = generatePassword(); const email = `${key}_${suffix}@soak.test`; opts.logger.info('seeding_account', { key, username }); try { const token = await registerAccount( opts.targetUrl, username, password, email, opts.inviteCode, ); created.push({ key, username, password, token }); writeCredFile(opts.credFile, created); } catch (err) { opts.logger.error('seed_failed', { key, error: err instanceof Error ? err.message : String(err), }); throw err; } } return created; } /** * Load accounts from credFile, auto-seeding if the file is missing. */ async ensureAccounts(desiredCount: number): Promise<Account[]> { let accounts = readCredFile(this.opts.credFile); if (!accounts || accounts.length < desiredCount) { this.opts.logger.warn('cred_file_missing_or_short', { found: accounts?.length ?? 0, desired: desiredCount, }); accounts = await SessionPool.seed({ targetUrl: this.opts.targetUrl, inviteCode: this.opts.inviteCode, count: desiredCount, credFile: this.opts.credFile, logger: this.opts.logger, }); } this.accounts = accounts.slice(0, desiredCount); return this.accounts; } /** * Launch the browser if not provided, create N contexts, log each in via * localStorage injection (falling back to POST /api/auth/login if the * cached token is rejected), and return the live sessions.
*/ async acquire(count: number): Promise<Session[]> { await this.ensureAccounts(count); if (!this.browser) { this.ownedBrowser = await chromium.launch({ headless: true }); this.browser = this.ownedBrowser; } const sessions: Session[] = []; for (let i = 0; i < count; i++) { const account = this.accounts[i]; const context = await this.browser.newContext(this.opts.contextOptions); await this.injectAuth(context, account); const page = await context.newPage(); await page.goto(this.opts.targetUrl); const bot = new GolfBot(page); sessions.push({ account, context, page, bot, key: account.key }); } this.activeSessions = sessions; return sessions; } /** * Inject the cached JWT into localStorage BEFORE any page loads. * Uses addInitScript so the token is present on the first navigation. * If the cached token is rejected later, acquire() falls back to login. */ private async injectAuth(context: BrowserContext, account: Account): Promise<void> { // Try the cached token first try { await context.addInitScript( ({ token, username }) => { window.localStorage.setItem('authToken', token); window.localStorage.setItem( 'authUser', JSON.stringify({ id: '', username, role: 'user', email_verified: true }), ); }, { token: account.token, username: account.username }, ); } catch (err) { this.opts.logger.warn('inject_auth_failed', { account: account.key, error: err instanceof Error ? err.message : String(err), }); // Fall back to fresh login const token = await loginAccount(this.opts.targetUrl, account.username, account.password); account.token = token; writeCredFile(this.opts.credFile, this.accounts); await context.addInitScript( ({ token, username }) => { window.localStorage.setItem('authToken', token); window.localStorage.setItem( 'authUser', JSON.stringify({ id: '', username, role: 'user', email_verified: true }), ); }, { token, username: account.username }, ); } } /** Close all active contexts. Safe to call multiple times.
*/ async release(): Promise<void> { for (const session of this.activeSessions) { try { await session.context.close(); } catch { // ignore } } this.activeSessions = []; if (this.ownedBrowser) { try { await this.ownedBrowser.close(); } catch { // ignore } this.ownedBrowser = null; this.browser = null; } } } ``` - [ ] **Step 5: Syntax-check by invoking tsx** ```bash cd tests/soak npx tsx -e "import('./core/session-pool').then(() => console.log('ok'))" ``` Expected: `ok`. No TypeScript errors. - [ ] **Step 6: Commit** ```bash git add tests/soak/core/session-pool.ts git commit -m "$(cat <<'EOF' feat(soak): SessionPool — seed, login, acquire contexts Owns 16 BrowserContexts, seeds via POST /api/auth/register with the invite code on cold start, warm-starts via localStorage injection of the cached JWT, falls back to POST /api/auth/login if the token is rejected. Exposes acquire(n) for scenarios. Co-Authored-By: Claude Opus 4.6 (1M context) EOF )" ``` --- ### Task 14: `seed-accounts.ts` CLI wrapper Tiny standalone entry point that lets you pre-seed before the first harness run. Reuses `SessionPool.seed`. **Files:** - Create: `tests/soak/scripts/seed-accounts.ts` - [ ] **Step 1: Write the script** ```typescript #!/usr/bin/env tsx /** * Seed N soak-harness accounts via the register endpoint. * * Usage: * TEST_URL=http://localhost:8000 \ * SOAK_INVITE_CODE=SOAKTEST \ * npm run seed -- --count=16 */ import * as path from 'path'; import { SessionPool } from '../core/session-pool'; import { createLogger } from '../core/logger'; function parseArgs(argv: string[]): { count: number } { const result = { count: 16 }; for (const arg of argv.slice(2)) { const m = arg.match(/^--count=(\d+)$/); if (m) result.count = parseInt(m[1], 10); } return result; } async function main(): Promise<void> { const { count } = parseArgs(process.argv); const targetUrl = process.env.TEST_URL ??
'http://localhost:8000'; const inviteCode = process.env.SOAK_INVITE_CODE; if (!inviteCode) { console.error('SOAK_INVITE_CODE env var is required'); console.error(' Local dev: SOAK_INVITE_CODE=SOAKTEST'); console.error(' Staging: SOAK_INVITE_CODE=5VC2MCCN'); process.exit(2); } const credFile = path.resolve(__dirname, '..', '.env.stresstest'); const logger = createLogger({ runId: `seed-${Date.now()}` }); logger.info('seed_start', { count, targetUrl, credFile }); try { const accounts = await SessionPool.seed({ targetUrl, inviteCode, count, credFile, logger, }); logger.info('seed_complete', { created: accounts.length }); console.error(`Seeded ${accounts.length} accounts → ${credFile}`); } catch (err) { logger.error('seed_failed', { error: err instanceof Error ? err.message : String(err), }); process.exit(1); } } main(); ``` - [ ] **Step 2: Run it against local dev to verify end-to-end** With the dev server running and the `SOAKTEST` invite flagged: ```bash cd tests/soak TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run seed -- --count=4 ``` Expected: - Log lines `seeding_account` × 4 - Log line `seed_complete` - `tests/soak/.env.stresstest` file created with 4 `SOAK_ACCOUNT_NN=...` lines Verify: ```bash cat tests/soak/.env.stresstest | head ``` Expected: 4 account lines. Also verify the accounts got flagged: ```bash psql -d golfgame -c "SELECT username, is_test_account FROM users_v2 WHERE username LIKE 'soak_%' ORDER BY username;" ``` Expected: 4 rows, all with `is_test_account | t`. - [ ] **Step 3: Commit** ```bash git add tests/soak/scripts/seed-accounts.ts git commit -m "$(cat <<'EOF' feat(soak): scripts/seed-accounts.ts CLI wrapper Thin standalone entry for pre-seeding N accounts before the first harness run. Wraps SessionPool.seed and writes .env.stresstest. 

Co-Authored-By: Claude Opus 4.6 (1M context)
EOF
)"
```

---

## Phase 4 — First scenario, config, runner (end-to-end milestone)

### Task 15: Shared multiplayer-game helper

Pulls the "run one full game in one room" logic out of the scenarios so `populate` and `stress` share it. Takes a room's sessions and a config, loops until the game ends.

**Files:**
- Create: `tests/soak/scenarios/shared/multiplayer-game.ts`

- [ ] **Step 1: Create the helper module**

```typescript
// tests/soak/scenarios/shared/multiplayer-game.ts
import type { Session, ScenarioContext } from '../../core/types';

export interface MultiplayerGameOptions {
  roomId: string;
  holes: number;
  decks: number;
  cpusPerRoom: number;
  cpuPersonality?: string;
  /** Per-turn think time in [min, max] ms. */
  thinkTimeMs: [number, number];
  /** Max wall-clock time before giving up on the game (ms). */
  maxDurationMs?: number;
}

export interface MultiplayerGameResult {
  completed: boolean;
  turns: number;
  durationMs: number;
  error?: string;
}

function randomInt(min: number, max: number): number {
  return Math.floor(Math.random() * (max - min + 1)) + min;
}

async function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

/**
 * Host + joiners play one full multiplayer game end to end.
 * The host creates the room, announces the code via the coordinator,
 * joiners wait for the code, the host adds CPUs and starts, everyone
 * loops on isMyTurn/playTurn until round_over or game_over.
 */
export async function runOneMultiplayerGame(
  ctx: ScenarioContext,
  sessions: Session[],
  opts: MultiplayerGameOptions,
): Promise<MultiplayerGameResult> {
  const start = Date.now();
  const [host, ...joiners] = sessions;
  const maxDuration = opts.maxDurationMs ?? 5 * 60_000;

  try {
    // Host creates game
    const code = await host.bot.createGame(host.account.username);
    ctx.coordinator.announce(opts.roomId, code);
    ctx.heartbeat(opts.roomId);
    ctx.dashboard.update(opts.roomId, { phase: 'lobby' });
    ctx.logger.info('room_created', { room: opts.roomId, code });

    // Joiners join concurrently
    await Promise.all(
      joiners.map(async (joiner) => {
        const awaited = await ctx.coordinator.await(opts.roomId);
        await joiner.bot.joinGame(awaited, joiner.account.username);
      }),
    );
    ctx.heartbeat(opts.roomId);

    // Host adds CPUs (if any) and starts
    for (let i = 0; i < opts.cpusPerRoom; i++) {
      await host.bot.addCPU(opts.cpuPersonality);
    }
    await host.bot.startGame({ holes: opts.holes, decks: opts.decks });
    ctx.heartbeat(opts.roomId);
    ctx.dashboard.update(opts.roomId, { phase: 'playing', totalHoles: opts.holes });

    // Concurrent turn loops — one per session
    const turnCounts = new Array<number>(sessions.length).fill(0);

    async function sessionLoop(sessionIdx: number): Promise<void> {
      const session = sessions[sessionIdx];
      while (true) {
        if (ctx.signal.aborted) return;
        if (Date.now() - start > maxDuration) return;

        const phase = await session.bot.getGamePhase();
        if (phase === 'game_over' || phase === 'round_over') return;

        if (await session.bot.isMyTurn()) {
          await session.bot.playTurn();
          turnCounts[sessionIdx]++;
          ctx.heartbeat(opts.roomId);
          ctx.dashboard.update(opts.roomId, {
            currentPlayer: session.account.username,
            moves: turnCounts.reduce((a, b) => a + b, 0),
          });
          const thinkMs = randomInt(opts.thinkTimeMs[0], opts.thinkTimeMs[1]);
          await sleep(thinkMs);
        } else {
          await sleep(200);
        }
      }
    }

    await Promise.all(sessions.map((_, i) => sessionLoop(i)));

    const totalTurns = turnCounts.reduce((a, b) => a + b, 0);
    ctx.dashboard.update(opts.roomId, { phase: 'round_over' });
    return {
      completed: true,
      turns: totalTurns,
      durationMs: Date.now() - start,
    };
  } catch (err) {
    return {
      completed: false,
      turns: 0,
      durationMs: Date.now() - start,
      error: err instanceof Error ? err.message : String(err),
    };
  }
}
```

- [ ] **Step 2: Syntax-check**

```bash
cd tests/soak
npx tsx -e "import('./scenarios/shared/multiplayer-game').then(() => console.log('ok'))"
```

Expected: `ok`.

- [ ] **Step 3: Commit**

```bash
git add tests/soak/scenarios/shared/multiplayer-game.ts
git commit -m "$(cat <<'EOF'
feat(soak): shared runOneMultiplayerGame helper

Encapsulates the host-creates/joiners-join/loop-until-done flow so
populate and stress scenarios don't duplicate it. Honors abort signal
and a max-duration timeout, heartbeats on every turn.

Co-Authored-By: Claude Opus 4.6 (1M context)
EOF
)"
```

---

### Task 16: Populate scenario (minimal version)

Partitions sessions into rooms, runs `gamesPerRoom` games per room in parallel, aggregates results.

**Files:**
- Create: `tests/soak/scenarios/populate.ts`
- Create: `tests/soak/scenarios/index.ts`

- [ ] **Step 1: Create `scenarios/populate.ts`**

```typescript
// tests/soak/scenarios/populate.ts
import type {
  Scenario,
  ScenarioContext,
  ScenarioResult,
  ScenarioError,
  Session,
} from '../core/types';
import { runOneMultiplayerGame } from './shared/multiplayer-game';

const CPU_PERSONALITIES = ['Sofia', 'Marcus', 'Kenji', 'Priya'];

interface PopulateConfig {
  gamesPerRoom: number;
  holes: number;
  decks: number;
  rooms: number;
  cpusPerRoom: number;
  thinkTimeMs: [number, number];
  interGamePauseMs: number;
}

function chunk<T>(arr: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < arr.length; i += size) {
    out.push(arr.slice(i, i + size));
  }
  return out;
}

async function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

async function runRoom(
  ctx: ScenarioContext,
  cfg: PopulateConfig,
  roomIdx: number,
  sessions: Session[],
): Promise<{ completed: number; errors: ScenarioError[] }> {
  const roomId = `room-${roomIdx}`;
  const cpuPersonality = CPU_PERSONALITIES[roomIdx % CPU_PERSONALITIES.length];
  let completed = 0;
  const errors: ScenarioError[] = [];

  for (let gameNum = 0; gameNum < cfg.gamesPerRoom; gameNum++) {
    if (ctx.signal.aborted) break;

    ctx.dashboard.update(roomId, { game: gameNum + 1, totalGames: cfg.gamesPerRoom });
    ctx.logger.info('game_start', { room: roomId, game: gameNum + 1 });

    const result = await runOneMultiplayerGame(ctx, sessions, {
      roomId,
      holes: cfg.holes,
      decks: cfg.decks,
      cpusPerRoom: cfg.cpusPerRoom,
      cpuPersonality,
      thinkTimeMs: cfg.thinkTimeMs,
    });

    if (result.completed) {
      completed++;
      ctx.logger.info('game_complete', {
        room: roomId,
        game: gameNum + 1,
        turns: result.turns,
        durationMs: result.durationMs,
      });
    } else {
      errors.push({
        room: roomId,
        reason: 'game_failed',
        detail: result.error,
        timestamp: Date.now(),
      });
      ctx.logger.error('game_failed', { room: roomId, game: gameNum + 1, error: result.error });
    }

    if (gameNum < cfg.gamesPerRoom - 1) {
      await sleep(cfg.interGamePauseMs);
    }
  }

  return { completed, errors };
}

const populate: Scenario = {
  name: 'populate',
  description: 'Long multi-round games to populate scoreboards',
  needs: { accounts: 16, rooms: 4, cpusPerRoom: 1 },
  defaultConfig: {
    gamesPerRoom: 10,
    holes: 9,
    decks: 2,
    rooms: 4,
    cpusPerRoom: 1,
    thinkTimeMs: [800, 2200],
    interGamePauseMs: 3000,
  },

  async run(ctx: ScenarioContext): Promise<ScenarioResult> {
    const start = Date.now();
    const cfg = ctx.config as unknown as PopulateConfig;

    const perRoom = Math.floor(ctx.sessions.length / cfg.rooms);
    if (perRoom * cfg.rooms !== ctx.sessions.length) {
      throw new Error(
        `populate: ${ctx.sessions.length} sessions does not divide evenly into ${cfg.rooms} rooms`,
      );
    }
    const roomSessions = chunk(ctx.sessions, perRoom);

    const results = await Promise.allSettled(
      roomSessions.map((sessions, idx) => runRoom(ctx, cfg, idx, sessions)),
    );

    let gamesCompleted = 0;
    const errors: ScenarioError[] = [];
    results.forEach((r, idx) => {
      if (r.status === 'fulfilled') {
        gamesCompleted += r.value.completed;
        errors.push(...r.value.errors);
      } else {
        errors.push({
          room: `room-${idx}`,
          reason: 'room_threw',
          detail: r.reason instanceof Error ? r.reason.message : String(r.reason),
          timestamp: Date.now(),
        });
      }
    });

    return {
      gamesCompleted,
      errors,
      durationMs: Date.now() - start,
    };
  },
};

export default populate;
```

- [ ] **Step 2: Create `scenarios/index.ts` registry**

```typescript
// tests/soak/scenarios/index.ts
import type { Scenario } from '../core/types';
import populate from './populate';

const registry: Record<string, Scenario> = {
  populate,
};

export function getScenario(name: string): Scenario | undefined {
  return registry[name];
}

export function listScenarios(): Scenario[] {
  return Object.values(registry);
}
```

- [ ] **Step 3: Syntax-check**

```bash
cd tests/soak
npx tsx -e "import('./scenarios/index').then((m) => console.log(m.listScenarios().map(s => s.name)))"
```

Expected: `[ 'populate' ]`.

- [ ] **Step 4: Commit**

```bash
git add tests/soak/scenarios/populate.ts tests/soak/scenarios/index.ts
git commit -m "$(cat <<'EOF'
feat(soak): populate scenario + scenario registry

Partitions sessions into N rooms, runs gamesPerRoom games per room in
parallel via Promise.allSettled so a failure in one room never unwinds
the others. Errors roll up into ScenarioResult.errors.

Co-Authored-By: Claude Opus 4.6 (1M context)
EOF
)"
```

---

### Task 17: Config parsing with tests

CLI flags, env vars, scenario defaults, runner defaults — merged in that precedence order, with earlier sources winning.
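
The precedence rule can be pictured as an overlay applied from lowest to highest priority, where a layer only overwrites the keys it actually sets. The block below is a minimal sketch of that idea, not the harness's `mergeConfig`; `layerConfig` and every sample value are hypothetical names chosen for illustration.

```typescript
// Hypothetical helper (not part of the harness): overlay config layers,
// lowest precedence first. Keys whose value is undefined are skipped so an
// unset CLI flag cannot erase a value supplied by a lower-precedence layer.
type Layer = Record<string, unknown>;

function layerConfig(...layersLowToHigh: Layer[]): Layer {
  const merged: Layer = {};
  for (const layer of layersLowToHigh) {
    for (const [key, value] of Object.entries(layer)) {
      if (value !== undefined) merged[key] = value;
    }
  }
  return merged;
}

// Illustrative values only.
const runnerDefaults = { rooms: 1, holes: 9, watch: 'dashboard' };
const scenarioDefaults = { rooms: 4, holes: 9 };
const fromEnv = { holes: 3 };
const fromCli = { rooms: 2, holes: undefined };

const resolved = layerConfig(runnerDefaults, scenarioDefaults, fromEnv, fromCli);
console.log(resolved);
```

In this sample, `rooms` comes from the CLI layer, `holes` from the env layer (the CLI left it unset), and `watch` falls through to the runner default.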
**Files:**
- Create: `tests/soak/config.ts`
- Create: `tests/soak/tests/config.test.ts`

- [ ] **Step 1: Write failing tests**

`mergeConfig` takes its arguments in the order `(cli, env, defaults)`, so every test passes them in that order.

```typescript
// tests/soak/tests/config.test.ts
import { describe, it, expect } from 'vitest';
import { parseArgs, mergeConfig } from '../config';

describe('parseArgs', () => {
  it('parses --scenario and numeric flags', () => {
    const r = parseArgs(['--scenario=populate', '--rooms=4', '--games-per-room=10']);
    expect(r.scenario).toBe('populate');
    expect(r.rooms).toBe(4);
    expect(r.gamesPerRoom).toBe(10);
  });

  it('parses watch mode', () => {
    const r = parseArgs(['--scenario=populate', '--watch=none']);
    expect(r.watch).toBe('none');
  });

  it('rejects unknown watch mode', () => {
    expect(() => parseArgs(['--scenario=populate', '--watch=bogus'])).toThrow();
  });

  it('--list sets listOnly', () => {
    const r = parseArgs(['--list']);
    expect(r.listOnly).toBe(true);
  });
});

describe('mergeConfig', () => {
  it('CLI flags override scenario defaults', () => {
    const cfg = mergeConfig(
      { gamesPerRoom: 20 }, // CLI
      {}, // env
      { gamesPerRoom: 5, holes: 9 }, // defaults
    );
    expect(cfg.gamesPerRoom).toBe(20);
  });

  it('env overrides scenario defaults but CLI overrides env', () => {
    const cfg = mergeConfig(
      { holes: 5 }, // CLI
      { SOAK_HOLES: '3' }, // env
      { holes: 9 }, // defaults
    );
    expect(cfg.holes).toBe(5); // CLI wins
  });

  it('scenario defaults fill in unset values', () => {
    const cfg = mergeConfig(
      { gamesPerRoom: 3 }, // CLI
      {}, // env
      { games: 5, holes: 9 }, // defaults
    );
    expect(cfg.games).toBe(5);
    expect(cfg.holes).toBe(9);
    expect(cfg.gamesPerRoom).toBe(3);
  });
});
```

- [ ] **Step 2: Run tests to verify they fail**

```bash
cd tests/soak
npx vitest run tests/config.test.ts
```

Expected: FAIL — module not found.

- [ ] **Step 3: Implement `config.ts`**

```typescript
// tests/soak/config.ts
export type WatchMode = 'none' | 'dashboard' | 'tiled';

export interface CliArgs {
  scenario?: string;
  accounts?: number;
  rooms?: number;
  cpusPerRoom?: number;
  gamesPerRoom?: number;
  holes?: number;
  watch?: WatchMode;
  dashboardPort?: number;
  target?: string;
  runId?: string;
  dryRun?: boolean;
  listOnly?: boolean;
}

const VALID_WATCH: WatchMode[] = ['none', 'dashboard', 'tiled'];

function parseInt10(s: string, name: string): number {
  const n = parseInt(s, 10);
  if (Number.isNaN(n)) throw new Error(`Invalid integer for ${name}: ${s}`);
  return n;
}

export function parseArgs(argv: string[]): CliArgs {
  const out: CliArgs = {};
  for (const arg of argv) {
    if (arg === '--list') {
      out.listOnly = true;
      continue;
    }
    if (arg === '--dry-run') {
      out.dryRun = true;
      continue;
    }
    const m = arg.match(/^--([a-z][a-z0-9-]*)=(.*)$/);
    if (!m) continue;
    const [, key, value] = m;
    switch (key) {
      case 'scenario':
        out.scenario = value;
        break;
      case 'accounts':
        out.accounts = parseInt10(value, '--accounts');
        break;
      case 'rooms':
        out.rooms = parseInt10(value, '--rooms');
        break;
      case 'cpus-per-room':
        out.cpusPerRoom = parseInt10(value, '--cpus-per-room');
        break;
      case 'games-per-room':
        out.gamesPerRoom = parseInt10(value, '--games-per-room');
        break;
      case 'holes':
        out.holes = parseInt10(value, '--holes');
        break;
      case 'watch':
        if (!VALID_WATCH.includes(value as WatchMode)) {
          throw new Error(`Invalid --watch value: ${value} (expected ${VALID_WATCH.join('|')})`);
        }
        out.watch = value as WatchMode;
        break;
      case 'dashboard-port':
        out.dashboardPort = parseInt10(value, '--dashboard-port');
        break;
      case 'target':
        out.target = value;
        break;
      case 'run-id':
        out.runId = value;
        break;
      default:
        // Unknown flag — ignore so scenario-specific flags can be added later
        break;
    }
  }
  return out;
}

/**
 * Merge in order: scenarioDefaults → env → cli (later wins).
 */
export function mergeConfig(
  cli: Record<string, unknown>,
  env: Record<string, string | undefined>,
  defaults: Record<string, unknown>,
): Record<string, unknown> {
  const merged: Record<string, unknown> = { ...defaults };

  // Env overlay — SOAK_UPPER_SNAKE → lowerCamel in cli space.
  const envMap: Record<string, string> = {
    SOAK_HOLES: 'holes',
    SOAK_ROOMS: 'rooms',
    SOAK_ACCOUNTS: 'accounts',
    SOAK_CPUS_PER_ROOM: 'cpusPerRoom',
    SOAK_GAMES_PER_ROOM: 'gamesPerRoom',
    SOAK_WATCH: 'watch',
    SOAK_DASHBOARD_PORT: 'dashboardPort',
  };
  for (const [envKey, cfgKey] of Object.entries(envMap)) {
    const v = env[envKey];
    if (v !== undefined) {
      // Heuristic: numeric keys. parseInt10 rejects non-numeric values
      // instead of silently storing NaN.
      if (/^(holes|rooms|accounts|cpusPerRoom|gamesPerRoom|dashboardPort)$/.test(cfgKey)) {
        merged[cfgKey] = parseInt10(v, envKey);
      } else {
        merged[cfgKey] = v;
      }
    }
  }

  // CLI overlay — wins over env and defaults.
  for (const [k, v] of Object.entries(cli)) {
    if (v !== undefined) merged[k] = v;
  }

  return merged;
}
```

- [ ] **Step 4: Run tests to verify they pass**

```bash
cd tests/soak
npx vitest run tests/config.test.ts
```

Expected: all passing.

- [ ] **Step 5: Commit**

```bash
git add tests/soak/config.ts tests/soak/tests/config.test.ts
git commit -m "$(cat <<'EOF'
feat(soak): CLI parsing + config precedence

parseArgs pulls --scenario/--rooms/--watch/etc from argv, mergeConfig
layers scenarioDefaults → env → CLI so CLI flags always win. Unit tested.

Co-Authored-By: Claude Opus 4.6 (1M context)
EOF
)"
```

---

### Task 18: `runner.ts` entry point — first end-to-end milestone

Replaces the placeholder runner with the real thing: parse args, build dependencies, load scenario, acquire sessions, run scenario, clean up, print summary. Supports `--watch=none` only at this stage.

**Files:**
- Modify: `tests/soak/runner.ts` (replace placeholder)

- [ ] **Step 1: Rewrite `runner.ts`**

```typescript
#!/usr/bin/env tsx
/**
 * Golf Soak Harness — entry point.
 *
 * Usage:
 *   TEST_URL=http://localhost:8000 \
 *   SOAK_INVITE_CODE=SOAKTEST \
 *   npm run soak -- --scenario=populate --rooms=1 --accounts=2 \
 *     --cpus-per-room=0 --games-per-room=1 --holes=1 --watch=none
 */
import * as path from 'path';
import { parseArgs, mergeConfig, type CliArgs } from './config';
import { createLogger } from './core/logger';
import { SessionPool } from './core/session-pool';
import { RoomCoordinator } from './core/room-coordinator';
import { getScenario, listScenarios } from './scenarios';
import type { DashboardReporter, ScenarioContext } from './core/types';

function noopDashboard(): DashboardReporter {
  return {
    update: () => {},
    log: () => {},
    incrementMetric: () => {},
  };
}

function printScenarioList(): void {
  console.log('Available scenarios:');
  for (const s of listScenarios()) {
    console.log(` ${s.name.padEnd(12)} ${s.description}`);
    console.log(` needs: accounts=${s.needs.accounts}, rooms=${s.needs.rooms ?? 1}, cpus=${s.needs.cpusPerRoom ?? 0}`);
  }
}

async function main(): Promise<void> {
  const cli: CliArgs = parseArgs(process.argv.slice(2));

  if (cli.listOnly) {
    printScenarioList();
    return;
  }

  if (!cli.scenario) {
    console.error('Error: --scenario=<name> is required. Use --list to see scenarios.');
    process.exit(2);
  }

  const scenario = getScenario(cli.scenario);
  if (!scenario) {
    console.error(`Error: unknown scenario "${cli.scenario}". Use --list to see scenarios.`);
    process.exit(2);
  }

  const runId = cli.runId ?? `${cli.scenario}-${new Date().toISOString().replace(/[:.]/g, '-')}`;
  const targetUrl = cli.target ?? process.env.TEST_URL ?? 'http://localhost:8000';
  const inviteCode = process.env.SOAK_INVITE_CODE ?? 'SOAKTEST';
  const watch = cli.watch ?? 'dashboard';

  const logger = createLogger({ runId });
  logger.info('run_start', {
    scenario: scenario.name,
    targetUrl,
    watch,
    cli,
  });

  // Resolve final config
  const config = mergeConfig(
    cli as Record<string, unknown>,
    process.env,
    scenario.defaultConfig,
  );
  // Ensure core knobs exist
  const accounts = Number(config.accounts ?? scenario.needs.accounts);
  const rooms = Number(config.rooms ?? scenario.needs.rooms ?? 1);
  const cpusPerRoom = Number(config.cpusPerRoom ?? scenario.needs.cpusPerRoom ?? 0);
  if (accounts % rooms !== 0) {
    console.error(`Error: --accounts=${accounts} does not divide evenly into --rooms=${rooms}`);
    process.exit(2);
  }
  config.rooms = rooms;
  config.cpusPerRoom = cpusPerRoom;

  if (cli.dryRun) {
    logger.info('dry_run', { config });
    console.log('Dry run OK. Resolved config:');
    console.log(JSON.stringify(config, null, 2));
    return;
  }

  if (watch !== 'none') {
    logger.warn('watch_mode_not_yet_implemented', { watch });
    console.warn(`Watch mode "${watch}" not yet implemented — falling back to "none".`);
  }

  // Build dependencies
  const credFile = path.resolve(__dirname, '.env.stresstest');
  const pool = new SessionPool({
    targetUrl,
    inviteCode,
    credFile,
    logger,
  });
  const coordinator = new RoomCoordinator();
  const dashboard = noopDashboard();
  const abortController = new AbortController();

  const onSignal = (sig: string) => {
    logger.warn('signal_received', { signal: sig });
    abortController.abort();
  };
  process.on('SIGINT', () => onSignal('SIGINT'));
  process.on('SIGTERM', () => onSignal('SIGTERM'));

  let exitCode = 0;
  try {
    const sessions = await pool.acquire(accounts);
    logger.info('sessions_acquired', { count: sessions.length });

    const ctx: ScenarioContext = {
      config,
      sessions,
      coordinator,
      dashboard,
      logger,
      signal: abortController.signal,
      heartbeat: () => {}, // Task 26 wires this up
    };

    const result = await scenario.run(ctx);

    logger.info('run_complete', {
      gamesCompleted: result.gamesCompleted,
      errors: result.errors.length,
      durationMs: result.durationMs,
    });
    console.log(`Games completed: ${result.gamesCompleted}`);
    console.log(`Errors: ${result.errors.length}`);
    console.log(`Duration: ${(result.durationMs / 1000).toFixed(1)}s`);

    if (result.errors.length > 0) {
      console.log('Errors:');
      for (const e of result.errors) {
        console.log(` ${e.room}: ${e.reason}${e.detail ? ' — ' + e.detail : ''}`);
      }
      exitCode = 1;
    }
  } catch (err) {
    logger.error('run_failed', {
      error: err instanceof Error ? err.message : String(err),
      stack: err instanceof Error ? err.stack : undefined,
    });
    exitCode = 1;
  } finally {
    await pool.release();
  }

  if (abortController.signal.aborted && exitCode === 0) exitCode = 2;
  process.exit(exitCode);
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
```

- [ ] **Step 2: Run a minimal `--watch=none` smoke against local dev**

Server running, 4 soak accounts already seeded from Task 14:

```bash
cd tests/soak
TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run soak -- \
  --scenario=populate \
  --accounts=2 \
  --rooms=1 \
  --cpus-per-room=0 \
  --games-per-room=1 \
  --holes=1 \
  --watch=none
```

Expected output (abbreviated):

```
{"timestamp":"...","level":"info","msg":"run_start",...}
{"timestamp":"...","level":"info","msg":"sessions_acquired","count":2}
{"timestamp":"...","level":"info","msg":"game_start","room":"room-0","game":1}
{"timestamp":"...","level":"info","msg":"room_created","code":"XXXX"}
{"timestamp":"...","level":"info","msg":"game_complete","room":"room-0","turns":...}
{"timestamp":"...","level":"info","msg":"run_complete","gamesCompleted":1,"errors":0}
Games completed: 1
Errors: 0
Duration: X.Xs
```

Exit code 0. This is the first **end-to-end milestone**. Stop here if debugging is needed — fix issues before moving on.
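
One way to make the check above mechanical is to parse the JSON log lines and look at the `run_complete` record. The block below is a minimal sketch under the assumption that the logger writes one JSON object per line with `msg`, `gamesCompleted`, and `errors` fields, as in the abbreviated output; `checkSmokeLog` and `LogRecord` are hypothetical names, not part of the harness.

```typescript
// Hypothetical shape of the fields this check reads from each log line.
interface LogRecord {
  msg?: string;
  gamesCompleted?: number;
  errors?: number;
}

// Hypothetical helper: true when the captured output contains a
// run_complete record with the expected game count and zero errors.
function checkSmokeLog(output: string, expectedGames: number): boolean {
  const records: LogRecord[] = [];
  for (const line of output.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed.startsWith('{')) continue; // skip the plain-text summary lines
    try {
      records.push(JSON.parse(trimmed) as LogRecord);
    } catch {
      // ignore lines that are not valid JSON
    }
  }
  const done = records.find((r) => r.msg === 'run_complete');
  return done !== undefined && done.gamesCompleted === expectedGames && done.errors === 0;
}

// Sample input modelled on the abbreviated output above.
const sample = [
  '{"level":"info","msg":"sessions_acquired","count":2}',
  '{"level":"info","msg":"run_complete","gamesCompleted":1,"errors":0}',
  'Games completed: 1',
].join('\n');

console.log(checkSmokeLog(sample, 1));
```

The process exit code remains the primary signal; a check like this is only useful when the output has been captured to a file for later inspection.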
- [ ] **Step 3: Commit**

```bash
git add tests/soak/runner.ts
git commit -m "$(cat <<'EOF'
feat(soak): runner.ts end-to-end with --watch=none

First full end-to-end milestone: parses CLI, builds SessionPool +
RoomCoordinator, loads a scenario by name, runs it, reports results,
cleans up. Watch modes other than "none" log a warning and fall back
until Tasks 19-24 implement them.

Co-Authored-By: Claude Opus 4.6 (1M context)
EOF
)"
```

---

## Phase 5 — Dashboard status grid

### Task 19: Dashboard HTTP + WS server

Vanilla node `http` + `ws`. Serves one static HTML page, accepts WS connections, broadcasts room-state updates.

**Files:**
- Create: `tests/soak/dashboard/server.ts`

- [ ] **Step 1: Implement `dashboard/server.ts`**

```typescript
// tests/soak/dashboard/server.ts
import * as http from 'http';
import * as fs from 'fs';
import * as path from 'path';
import { WebSocketServer, WebSocket } from 'ws';
import type { DashboardReporter, Logger, RoomState } from '../core/types';

export type DashboardIncoming =
  | { type: 'start_stream'; sessionKey: string }
  | { type: 'stop_stream'; sessionKey: string };

export type DashboardOutgoing =
  | { type: 'room_state'; roomId: string; state: Partial<RoomState> }
  | { type: 'log'; level: string; msg: string; meta?: object; timestamp: number }
  | { type: 'metric'; name: string; value: number }
  | { type: 'frame'; sessionKey: string; jpegBase64: string };

export interface DashboardHandlers {
  onStartStream?(sessionKey: string): void;
  onStopStream?(sessionKey: string): void;
  onDisconnect?(): void;
}

export class DashboardServer {
  private httpServer!: http.Server;
  private wsServer!: WebSocketServer;
  private clients = new Set<WebSocket>();
  private metrics: Record<string, number> = {};
  private roomStates: Record<string, Partial<RoomState>> = {};

  constructor(
    private port: number,
    private logger: Logger,
    private handlers: DashboardHandlers = {},
  ) {}

  async start(): Promise<void> {
    const htmlPath = path.resolve(__dirname, 'index.html');
    const cssPath = path.resolve(__dirname, 'dashboard.css');
    const jsPath = path.resolve(__dirname, 'dashboard.js');

    this.httpServer = http.createServer((req, res) => {
      const url = req.url ?? '/';
      if (url === '/' || url === '/index.html') {
        res.writeHead(200, { 'Content-Type': 'text/html; charset=utf-8' });
        fs.createReadStream(htmlPath).pipe(res);
      } else if (url === '/dashboard.css') {
        res.writeHead(200, { 'Content-Type': 'text/css' });
        fs.createReadStream(cssPath).pipe(res);
      } else if (url === '/dashboard.js') {
        res.writeHead(200, { 'Content-Type': 'application/javascript' });
        fs.createReadStream(jsPath).pipe(res);
      } else {
        res.writeHead(404);
        res.end('not found');
      }
    });

    this.wsServer = new WebSocketServer({ server: this.httpServer });
    this.wsServer.on('connection', (ws) => {
      this.clients.add(ws);
      this.logger.info('dashboard_client_connected', { count: this.clients.size });

      // Replay current state to the new client
      for (const [roomId, state] of Object.entries(this.roomStates)) {
        ws.send(JSON.stringify({ type: 'room_state', roomId, state } as DashboardOutgoing));
      }
      for (const [name, value] of Object.entries(this.metrics)) {
        ws.send(JSON.stringify({ type: 'metric', name, value } as DashboardOutgoing));
      }

      ws.on('message', (data) => {
        try {
          const parsed = JSON.parse(data.toString()) as DashboardIncoming;
          if (parsed.type === 'start_stream' && this.handlers.onStartStream) {
            this.handlers.onStartStream(parsed.sessionKey);
          } else if (parsed.type === 'stop_stream' && this.handlers.onStopStream) {
            this.handlers.onStopStream(parsed.sessionKey);
          }
        } catch (err) {
          this.logger.warn('dashboard_ws_parse_error', {
            error: err instanceof Error ? err.message : String(err),
          });
        }
      });

      ws.on('close', () => {
        this.clients.delete(ws);
        this.logger.info('dashboard_client_disconnected', { count: this.clients.size });
        if (this.clients.size === 0 && this.handlers.onDisconnect) {
          this.handlers.onDisconnect();
        }
      });
    });

    await new Promise<void>((resolve) => {
      this.httpServer.listen(this.port, () => resolve());
    });
    this.logger.info('dashboard_listening', { url: `http://localhost:${this.port}` });
  }

  async stop(): Promise<void> {
    for (const ws of this.clients) {
      try {
        ws.close();
      } catch {
        // ignore
      }
    }
    this.clients.clear();
    await new Promise<void>((resolve) => {
      this.wsServer.close(() => resolve());
    });
    await new Promise<void>((resolve) => {
      this.httpServer.close(() => resolve());
    });
  }

  broadcast(msg: DashboardOutgoing): void {
    const payload = JSON.stringify(msg);
    for (const ws of this.clients) {
      if (ws.readyState === WebSocket.OPEN) {
        ws.send(payload);
      }
    }
  }

  /** Create a DashboardReporter wired to this server. */
  reporter(): DashboardReporter {
    return {
      update: (roomId, state) => {
        this.roomStates[roomId] = { ...this.roomStates[roomId], ...state };
        this.broadcast({ type: 'room_state', roomId, state });
      },
      log: (level, msg, meta) => {
        this.broadcast({ type: 'log', level, msg, meta, timestamp: Date.now() });
      },
      incrementMetric: (name, by = 1) => {
        this.metrics[name] = (this.metrics[name] ?? 0) + by;
        this.broadcast({ type: 'metric', name, value: this.metrics[name] });
      },
    };
  }
}
```

- [ ] **Step 2: Syntax-check**

```bash
cd tests/soak
npx tsx -e "import('./dashboard/server').then(() => console.log('ok'))"
```

Expected: `ok`.

- [ ] **Step 3: Commit**

```bash
git add tests/soak/dashboard/server.ts
git commit -m "$(cat <<'EOF'
feat(soak): DashboardServer — vanilla http + ws

Serves one static HTML page, accepts WS connections, broadcasts
room_state/log/metric messages to all clients. Exposes a reporter()
method that returns a DashboardReporter scenarios can call without
knowing about sockets.
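
Co-Authored-By: Claude Opus 4.6 (1M context)
EOF
)"
```

---

### Task 20: Dashboard HTML/CSS/JS status grid

Single static HTML page + stylesheet + client script. Renders the 2×2 room grid, subscribes to WS, updates tiles on each message.

**Files:**
- Create: `tests/soak/dashboard/index.html`
- Create: `tests/soak/dashboard/dashboard.css`
- Create: `tests/soak/dashboard/dashboard.js`

- [ ] **Step 1: Create `dashboard/index.html`**

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Golf Soak Dashboard</title>
  <link rel="stylesheet" href="/dashboard.css">
</head>
<body>
  <header class="dash-header">
    <h1>⛳ Golf Soak Dashboard</h1>
    <div class="meta">
      <span>run —</span><span id="elapsed">00:00:00</span>
    </div>
  </header>

  <div class="meta-bar">
    <div class="stat"><span class="label">Games</span><span id="metric-games">0</span></div>
    <div class="stat"><span class="label">Moves</span><span id="metric-moves">0</span></div>
    <div class="stat"><span class="label">Errors</span><span id="metric-errors">0</span></div>
    <div class="stat"><span class="label">WS</span><span id="ws-status">connecting</span></div>
  </div>

  <div class="rooms" id="rooms"></div>

  <div class="log">
    <div class="log-header">Activity Log</div>
    <ul id="log-list"></ul>
  </div>

  <div class="video-modal hidden" id="video-modal">
    <div class="video-modal-content">
      <div class="video-modal-header">
        <span id="video-modal-title"></span>
        <!-- button label is a placeholder -->
        <button id="video-modal-close">Close</button>
      </div>
      <img id="video-frame" alt="">
    </div>
  </div>

  <script src="/dashboard.js"></script>
</body>
</html>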
```

- [ ] **Step 2: Create `dashboard/dashboard.css`**

```css
:root {
  --bg: #0a0e16;
  --panel: #0e1420;
  --border: #1a2230;
  --text: #c8d4e4;
  --accent: #7fbaff;
  --good: #6fd08f;
  --warn: #ffb84d;
  --err: #ff5c6c;
  --muted: #556577;
}
* { box-sizing: border-box; }
body {
  margin: 0;
  font-family: -apple-system, system-ui, 'SF Mono', Consolas, monospace;
  background: var(--bg);
  color: var(--text);
}
.dash-header {
  display: flex;
  justify-content: space-between;
  align-items: center;
  padding: 12px 20px;
  background: linear-gradient(135deg, #0f1823, #0a1018);
  border-bottom: 1px solid var(--border);
}
.dash-header h1 { margin: 0; font-size: 16px; color: var(--accent); }
.dash-header .meta { font-size: 11px; color: var(--muted); }
.dash-header .meta span + span { margin-left: 12px; }
.meta-bar {
  display: flex;
  gap: 24px;
  padding: 10px 20px;
  background: #0c131d;
  border-bottom: 1px solid var(--border);
  font-size: 12px;
}
.meta-bar .stat .label { color: var(--muted); margin-right: 6px; }
.meta-bar .stat span:last-child { color: #fff; font-weight: 600; }
.rooms {
  display: grid;
  grid-template-columns: 1fr 1fr;
  gap: 1px;
  background: var(--border);
}
.room { background: var(--panel); padding: 14px 18px; min-height: 180px; }
.room-title {
  display: flex;
  justify-content: space-between;
  align-items: center;
  margin-bottom: 10px;
}
.room-title .name { font-size: 13px; color: var(--accent); font-weight: 600; }
.room-title .phase {
  font-size: 10px;
  padding: 2px 8px;
  border-radius: 10px;
  background: #1a3a2a;
  color: var(--good);
}
.room-title .phase.lobby { background: #3a2a1a; color: var(--warn); }
.room-title .phase.err { background: #3a1a1a; color: var(--err); }
.players {
  display: grid;
  grid-template-columns: repeat(2, 1fr);
  gap: 4px;
  font-size: 11px;
  margin-bottom: 8px;
}
.player {
  display: flex;
  justify-content: space-between;
  padding: 4px 8px;
  background: #0a0f18;
  border-radius: 3px;
  cursor: pointer;
  border: 1px solid transparent;
}
.player:hover { border-color: var(--accent); }
.player.active { background: #1a2a40; border-left: 2px solid var(--accent); }
.player .score { color: var(--muted); }
.progress-bar {
  height: 4px;
  background: var(--border);
  border-radius: 2px;
  overflow: hidden;
  margin-top: 6px;
}
.progress-fill {
  height: 100%;
  background: linear-gradient(90deg, var(--accent), var(--good));
  transition: width 0.3s;
}
.room-meta {
  font-size: 10px;
  color: var(--muted);
  display: flex;
  gap: 12px;
  margin-top: 6px;
}
.log {
  border-top: 1px solid var(--border);
  background: #080c13;
  max-height: 160px;
  overflow-y: auto;
}
.log .log-header {
  padding: 6px 20px;
  font-size: 10px;
  text-transform: uppercase;
  color: var(--muted);
  border-bottom: 1px solid var(--border);
}
.log ul {
  list-style: none;
  margin: 0;
  padding: 4px 20px;
  font-size: 10px;
}
.log li { line-height: 1.5; font-family: monospace; color: var(--muted); }
.log li.warn { color: var(--warn); }
.log li.error { color: var(--err); }
.video-modal {
  position: fixed;
  inset: 0;
  background: rgba(0, 0, 0, 0.85);
  display: flex;
  align-items: center;
  justify-content: center;
  z-index: 100;
}
.video-modal.hidden { display: none; }
.video-modal-content {
  background: var(--panel);
  border: 1px solid var(--border);
  border-radius: 6px;
  padding: 16px;
  max-width: 90vw;
  max-height: 90vh;
}
.video-modal-header {
  display: flex;
  justify-content: space-between;
  align-items: center;
  margin-bottom: 12px;
  color: var(--accent);
  font-size: 13px;
}
.video-modal-header button {
  background: var(--border);
  color: var(--text);
  border: none;
  padding: 4px 12px;
  border-radius: 3px;
  cursor: pointer;
}
#video-frame {
  display: block;
  max-width: 100%;
  max-height: 70vh;
  border: 1px solid var(--border);
}
```

- [ ] **Step 3: Create `dashboard/dashboard.js`**

```javascript
// tests/soak/dashboard/dashboard.js
(() => {
  const ws = new WebSocket(`ws://${location.host}`);
  const roomsEl = document.getElementById('rooms');
  const logEl = document.getElementById('log-list');
  const wsStatusEl = document.getElementById('ws-status');
  const metricGames = document.getElementById('metric-games');
  const metricMoves = document.getElementById('metric-moves');
  const metricErrors = document.getElementById('metric-errors');
  const elapsedEl = document.getElementById('elapsed');

  const roomTiles = new Map();
  const startTime = Date.now();
  let currentWatchedKey = null;

  // Video modal
  const videoModal = document.getElementById('video-modal');
  const videoFrame = document.getElementById('video-frame');
  const videoTitle = document.getElementById('video-modal-title');
  const videoClose = document.getElementById('video-modal-close');

  function fmtElapsed(ms) {
    const s = Math.floor(ms / 1000);
    const h = Math.floor(s / 3600);
    const m = Math.floor((s % 3600) / 60);
    const sec = s % 60;
    return `${String(h).padStart(2, '0')}:${String(m).padStart(2, '0')}:${String(sec).padStart(2, '0')}`;
  }

  setInterval(() => {
    elapsedEl.textContent = fmtElapsed(Date.now() - startTime);
  }, 1000);

  function ensureRoomTile(roomId) {
    if (roomTiles.has(roomId)) return roomTiles.get(roomId);
    const tile = document.createElement('div');
    tile.className = 'room';
    tile.innerHTML = `
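      <div class="room-title">
        <div class="name">${roomId}</div>
        <div class="phase lobby">waiting</div>
      </div>
      <div class="players"></div>
      <div class="progress-bar"><div class="progress-fill" style="width: 0%"></div></div>
      <div class="room-meta">
        <span class="moves">0 moves</span>
        <span class="game">game —</span>
      </div>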
    `; roomsEl.appendChild(tile); roomTiles.set(roomId, tile); return tile; } function renderRoomState(roomId, state) { const tile = ensureRoomTile(roomId); if (state.phase !== undefined) { const phaseEl = tile.querySelector('.phase'); phaseEl.textContent = state.phase; phaseEl.classList.toggle('lobby', state.phase === 'lobby' || state.phase === 'waiting'); phaseEl.classList.toggle('err', state.phase === 'error'); } if (state.players !== undefined) { const playersEl = tile.querySelector('.players'); playersEl.innerHTML = state.players .map( (p) => `
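          <div class="player ${p.isActive ? 'active' : ''}" data-session="${p.key}">
            <span>${p.isActive ? '▶ ' : ''}${p.key}</span>
            <span class="score">${p.score ?? '—'}</span>
          </div>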
    `, ) .join(''); } if (state.hole !== undefined && state.totalHoles !== undefined) { const fill = tile.querySelector('.progress-fill'); const pct = state.totalHoles > 0 ? Math.round((state.hole / state.totalHoles) * 100) : 0; fill.style.width = `${pct}%`; } if (state.moves !== undefined) { tile.querySelector('.moves').textContent = `${state.moves} moves`; } if (state.game !== undefined && state.totalGames !== undefined) { tile.querySelector('.game').textContent = `game ${state.game}/${state.totalGames}`; } } function appendLog(level, msg, meta) { const li = document.createElement('li'); li.className = level; const ts = new Date().toLocaleTimeString(); li.textContent = `[${ts}] ${msg} ${meta ? JSON.stringify(meta) : ''}`; logEl.insertBefore(li, logEl.firstChild); // Cap log length while (logEl.children.length > 100) { logEl.removeChild(logEl.lastChild); } } function applyMetric(name, value) { if (name === 'games_completed') metricGames.textContent = value; else if (name === 'moves_total') metricMoves.textContent = value; else if (name === 'errors') metricErrors.textContent = value; } ws.addEventListener('open', () => { wsStatusEl.textContent = 'healthy'; wsStatusEl.style.color = 'var(--good)'; }); ws.addEventListener('close', () => { wsStatusEl.textContent = 'disconnected'; wsStatusEl.style.color = 'var(--err)'; }); ws.addEventListener('message', (event) => { let msg; try { msg = JSON.parse(event.data); } catch { return; } if (msg.type === 'room_state') { renderRoomState(msg.roomId, msg.state); } else if (msg.type === 'log') { appendLog(msg.level, msg.msg, msg.meta); } else if (msg.type === 'metric') { applyMetric(msg.name, msg.value); } else if (msg.type === 'frame') { if (msg.sessionKey === currentWatchedKey) { videoFrame.src = `data:image/jpeg;base64,${msg.jpegBase64}`; } } }); // Click-to-watch (wired in Task 23) roomsEl.addEventListener('click', (e) => { const playerEl = e.target.closest('.player'); if (!playerEl) return; const key = 
playerEl.dataset.session; if (!key) return; currentWatchedKey = key; videoTitle.textContent = `Watching ${key}`; videoModal.classList.remove('hidden'); ws.send(JSON.stringify({ type: 'start_stream', sessionKey: key })); }); function closeVideo() { if (currentWatchedKey) { ws.send(JSON.stringify({ type: 'stop_stream', sessionKey: currentWatchedKey })); } currentWatchedKey = null; videoModal.classList.add('hidden'); videoFrame.src = ''; } videoClose.addEventListener('click', closeVideo); document.addEventListener('keydown', (e) => { if (e.key === 'Escape') closeVideo(); }); })(); ``` - [ ] **Step 4: Commit** ```bash git add tests/soak/dashboard/index.html tests/soak/dashboard/dashboard.css tests/soak/dashboard/dashboard.js git commit -m "$(cat <<'EOF' feat(soak): dashboard status grid UI Static HTML page served by DashboardServer. Renders the 2×2 room grid with progress bars and player tiles, subscribes to WS events, updates tiles live. Click-to-watch modal is wired but receives frames once the CDP screencaster ships in Task 22. Co-Authored-By: Claude Opus 4.6 (1M context) EOF )" ``` --- ### Task 21: Wire `WATCH=dashboard` in runner Start the dashboard server when `--watch=dashboard`, auto-open the URL in the user's browser, use its `reporter()` as the `ctx.dashboard`. **Files:** - Modify: `tests/soak/runner.ts` - [ ] **Step 1: Import and instantiate DashboardServer in `runner.ts`** At the top of `runner.ts`, add: ```typescript import { DashboardServer } from './dashboard/server'; import { spawn } from 'child_process'; ``` Replace the block that creates `dashboard` with: ```typescript // Build dashboard if requested let dashboardServer: DashboardServer | null = null; let dashboard: DashboardReporter = noopDashboard(); if (watch === 'dashboard') { const port = Number(config.dashboardPort ?? 
7777);
  dashboardServer = new DashboardServer(port, logger, {
    onStartStream: (_key) => {
      logger.info('stream_start_requested', { sessionKey: _key });
      // Wired in Task 23
    },
    onStopStream: (_key) => {
      logger.info('stream_stop_requested', { sessionKey: _key });
    },
  });
  await dashboardServer.start();
  dashboard = dashboardServer.reporter();
  const url = `http://localhost:${port}`;
  console.log(`Dashboard: ${url}`);

  // Best-effort auto-open. On Windows `start` is a cmd.exe builtin, not an
  // executable, so it is invoked through `cmd /c` (the '' is the window title).
  const [opener, openerArgs]: [string, string[]] =
    process.platform === 'darwin'
      ? ['open', [url]]
      : process.platform === 'win32'
        ? ['cmd', ['/c', 'start', '', url]]
        : ['xdg-open', [url]];
  try {
    const child = spawn(opener, openerArgs, { stdio: 'ignore', detached: true });
    // A missing opener is reported via an async 'error' event, not a throw;
    // swallow it because the URL is already printed.
    child.on('error', () => {});
    child.unref();
  } catch {
    // If auto-open fails, the URL is already printed
  }
} else if (watch === 'tiled') {
  logger.warn('tiled_not_yet_implemented');
  console.warn('Watch mode "tiled" not yet implemented (Task 24). Falling back to none.');
}
```

And in the `finally` block, shut down the server:

```typescript
} finally {
  await pool.release();
  if (dashboardServer) {
    await dashboardServer.stop();
  }
}
```

Also remove the earlier `if (watch !== 'none')` warning block — it's replaced by the dispatch above.

- [ ] **Step 2: Run smoke against dev with dashboard**

```bash
TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run soak -- \
  --scenario=populate \
  --accounts=2 --rooms=1 --cpus-per-room=0 --games-per-room=1 --holes=1 \
  --watch=dashboard
```

Expected:
- `Dashboard: http://localhost:7777` printed
- Browser auto-opens (or you open it manually)
- Page shows the dashboard with `WS: healthy`
- During the game, the `room-0` tile shows `phase: playing`, increments `moves`, updates progress
- After game completes, the runner exits 0 and the dashboard stops

- [ ] **Step 3: Commit**

```bash
git add tests/soak/runner.ts
git commit -m "$(cat <<'EOF'
feat(soak): wire --watch=dashboard in runner

Starts DashboardServer on 7777 (configurable), uses its reporter as
ctx.dashboard, auto-opens the URL. Cleans up on exit.

Co-Authored-By: Claude Opus 4.6 (1M context)
EOF
)"
```

---

## Phase 6 — Live video click-to-watch

### Task 22: CDP screencast module

Attach a CDP session to a given page, start screencasting JPEG frames at a fixed rate, forward each frame to a callback, detach on stop.

**Files:**
- Create: `tests/soak/core/screencaster.ts`

- [ ] **Step 1: Implement `core/screencaster.ts`**

```typescript
// tests/soak/core/screencaster.ts
import type { Page, CDPSession } from 'playwright-core';
import type { Logger } from './types';

export interface ScreencastOptions {
  format?: 'jpeg' | 'png';
  quality?: number;
  maxWidth?: number;
  maxHeight?: number;
  everyNthFrame?: number;
}

export type FrameCallback = (jpegBase64: string) => void;

export class Screencaster {
  // One CDP session per streamed page, keyed by session key
  private sessions = new Map<string, CDPSession>();

  constructor(private logger: Logger) {}

  /**
   * Attach a CDP session to the given page and start forwarding frames.
   * If already streaming, this is a no-op.
   */
  async start(
    sessionKey: string,
    page: Page,
    onFrame: FrameCallback,
    opts: ScreencastOptions = {},
  ): Promise<void> {
    if (this.sessions.has(sessionKey)) {
      this.logger.warn('screencast_already_running', { sessionKey });
      return;
    }

    const client = await page.context().newCDPSession(page);
    this.sessions.set(sessionKey, client);

    client.on('Page.screencastFrame', async (evt: { data: string; sessionId: number }) => {
      try {
        onFrame(evt.data);
        // Chromium stops sending frames until the previous one is acknowledged
        await client.send('Page.screencastFrameAck', { sessionId: evt.sessionId });
      } catch (err) {
        this.logger.warn('screencast_frame_error', {
          sessionKey,
          error: err instanceof Error ? err.message : String(err),
        });
      }
    });

    await client.send('Page.startScreencast', {
      format: opts.format ?? 'jpeg',
      quality: opts.quality ?? 60,
      maxWidth: opts.maxWidth ?? 640,
      maxHeight: opts.maxHeight ?? 360,
      everyNthFrame: opts.everyNthFrame ?? 2,
    });
    this.logger.info('screencast_started', { sessionKey });
  }

  async stop(sessionKey: string): Promise<void> {
    const client = this.sessions.get(sessionKey);
    if (!client) return;
    try {
      await client.send('Page.stopScreencast');
      await client.detach();
    } catch (err) {
      this.logger.warn('screencast_stop_error', {
        sessionKey,
        error: err instanceof Error ? err.message : String(err),
      });
    }
    this.sessions.delete(sessionKey);
    this.logger.info('screencast_stopped', { sessionKey });
  }

  async stopAll(): Promise<void> {
    const keys = Array.from(this.sessions.keys());
    await Promise.all(keys.map((k) => this.stop(k)));
  }
}
```

- [ ] **Step 2: Syntax-check**

```bash
cd tests/soak
npx tsx -e "import('./core/screencaster').then(() => console.log('ok'))"
```

Expected: `ok`.

- [ ] **Step 3: Commit**

```bash
git add tests/soak/core/screencaster.ts
git commit -m "$(cat <<'EOF'
feat(soak): Screencaster — CDP Page.startScreencast wrapper

Attach/detach CDP sessions per Playwright Page, start/stop JPEG
screencasts with configurable quality and frame rate, forward each
frame to a callback. Used by the dashboard for click-to-watch live
video.

Co-Authored-By: Claude Opus 4.6 (1M context)
EOF
)"
```

---

### Task 23: Wire screencaster to dashboard click-to-watch

Runner creates a `Screencaster`, passes callbacks into `DashboardServer.onStartStream/onStopStream` that look up the right session and start/stop streaming. Each frame is broadcast to the dashboard.
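
The dashboard page sends two control messages over the WebSocket, `start_stream` and `stop_stream`, each carrying a `sessionKey`. Below is a minimal sketch of validating those messages before they reach the stream handlers. It assumes the server receives raw JSON strings; `parseControlMessage` is a hypothetical helper for illustration, not part of `DashboardServer`.

```typescript
// Control messages sent by dashboard.js (shapes taken from the client code).
type ControlMessage =
  | { type: 'start_stream'; sessionKey: string }
  | { type: 'stop_stream'; sessionKey: string };

// Hypothetical helper: returns null for anything that is not a
// well-formed control message, so callers can ignore it safely.
function parseControlMessage(raw: string): ControlMessage | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return null; // not JSON
  }
  if (typeof parsed !== 'object' || parsed === null) return null;

  const { type, sessionKey } = parsed as Record<string, unknown>;
  if (typeof sessionKey !== 'string' || sessionKey.length === 0) return null;
  if (type === 'start_stream' || type === 'stop_stream') {
    return { type, sessionKey };
  }
  return null; // unknown message type
}
```

Rejecting malformed input up front keeps the `onStartStream`/`onStopStream` callbacks free of defensive checks beyond the session lookup.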

**Files:**
- Modify: `tests/soak/runner.ts`

- [ ] **Step 1: Import Screencaster and hold a sessions map**

In `runner.ts`, add at the top:

```typescript
import { Screencaster } from './core/screencaster';
```

After `const sessions = await pool.acquire(accounts);`, build a lookup map:

```typescript
const sessionsByKey = new Map<string, (typeof sessions)[number]>();
for (const s of sessions) sessionsByKey.set(s.key, s);
```

Create the screencaster before the dashboard (or right after sessions are acquired):

```typescript
const screencaster = new Screencaster(logger);
```

- [ ] **Step 2: Replace the `onStartStream`/`onStopStream` no-ops with real wiring**

The `DashboardServer` handlers need to close over `screencaster` and `sessionsByKey`, both of which only exist once sessions are acquired. Move the dashboard construction to AFTER `sessions = await pool.acquire(accounts)`, then replace it with:

```typescript
if (watch === 'dashboard') {
  const port = Number(config.dashboardPort ?? 7777);
  dashboardServer = new DashboardServer(port, logger, {
    onStartStream: (key) => {
      const session = sessionsByKey.get(key);
      if (!session) {
        logger.warn('stream_start_unknown_session', { sessionKey: key });
        return;
      }
      screencaster
        .start(key, session.page, (jpegBase64) => {
          dashboardServer!.broadcast({ type: 'frame', sessionKey: key, jpegBase64 });
        })
        .catch((err) =>
          logger.error('screencast_start_failed', {
            sessionKey: key,
            error: err instanceof Error ? err.message : String(err),
          }),
        );
    },
    onStopStream: (key) => {
      screencaster.stop(key).catch(() => {});
    },
    onDisconnect: () => {
      screencaster.stopAll().catch(() => {});
    },
  });
  await dashboardServer.start();
  dashboard = dashboardServer.reporter();
  const url = `http://localhost:${port}`;
  console.log(`Dashboard: ${url}`);
  // ... auto-open
}
```

Make sure the `ctx.dashboard` assignment happens AFTER the dashboard setup (it already does — `const ctx = { ... dashboard, ... }` comes later).

In the `finally` block, add:

```typescript
await screencaster.stopAll();
```

- [ ] **Step 3: Manual test end-to-end**

Run a longer populate game so there's time to click:

```bash
TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run soak -- \
  --scenario=populate \
  --accounts=4 --rooms=1 --cpus-per-room=0 --games-per-room=2 --holes=3 \
  --watch=dashboard
```

Expected:
1. Dashboard opens, shows 1 room with 4 players
2. Click on any player tile (`soak_00`, `soak_01`, ...)
3. Modal opens, shows live JPEG frames of that player's view of the game
4. Close modal (Esc or Close button) — frames stop, screencast detaches
5. Run completes cleanly

- [ ] **Step 4: Commit**

```bash
git add tests/soak/runner.ts
git commit -m "$(cat <<'EOF'
feat(soak): click-to-watch live video via CDP screencast

Runner creates a Screencaster and wires its start/stop into
DashboardServer.onStartStream/onStopStream. Clicking a player tile
in the dashboard starts a CDP screencast on that session's page,
forwards JPEG frames as WS "frame" messages, closes on modal dismiss
or WS disconnect.

Co-Authored-By: Claude Opus 4.6 (1M context)
EOF
)"
```

---

## Phase 7 — Tiled mode

### Task 24: `--watch=tiled` native windows

Launch a second headed browser for the 4 host contexts, position their windows in a 2×2 grid using `page.evaluate(window.moveTo)`.

**Files:**
- Modify: `tests/soak/core/session-pool.ts` — add optional headed-host support
- Modify: `tests/soak/runner.ts` — enable tiled mode

- [ ] **Step 1: Extend `SessionPool` to support headed host contexts**

Add a new option and method to `SessionPool`. In `core/session-pool.ts`:

```typescript
export interface SessionPoolOptions {
  targetUrl: string;
  inviteCode: string;
  credFile: string;
  logger: Logger;
  browser?: Browser;
  contextOptions?: Parameters<Browser['newContext']>[0];
  /** If set, the first `headedHostCount` sessions use a separate headed browser. */
  headedHostCount?: number;
}
```

Inside the class, add a `headedBrowser` field and extend `acquire`:

```typescript
private headedBrowser: Browser | null = null;

// ... in acquire(), before the loop:
if ((this.opts.headedHostCount ?? 0) > 0 && !this.headedBrowser) {
  this.headedBrowser = await chromium.launch({
    headless: false,
    slowMo: 50,
  });
}

for (let i = 0; i < count; i++) {
  const account = this.accounts[i];
  const useHeaded = i < (this.opts.headedHostCount ?? 0);
  const targetBrowser = useHeaded ? this.headedBrowser! : this.browser!;

  const context = await targetBrowser.newContext({
    ...this.opts.contextOptions,
    ...(useHeaded ? { viewport: { width: 960, height: 540 } } : {}),
  });
  await this.injectAuth(context, account);
  const page = await context.newPage();
  await page.goto(this.opts.targetUrl);

  // Position headed windows in a 2×2 grid
  if (useHeaded) {
    const col = i % 2;
    const row = Math.floor(i / 2);
    const x = col * 960;
    const y = row * 560;
    await page.evaluate(
      ([x, y, w, h]) => {
        window.moveTo(x, y);
        window.resizeTo(w, h);
      },
      [x, y, 960, 540] as [number, number, number, number],
    );
  }

  const bot = new GolfBot(page);
  sessions.push({ account, context, page, bot, key: account.key });
}
```

Update `release` to close the headed browser too:

```typescript
async release(): Promise<void> {
  for (const session of this.activeSessions) {
    try { await session.context.close(); } catch { /* ignore */ }
  }
  this.activeSessions = [];
  if (this.ownedBrowser) {
    try { await this.ownedBrowser.close(); } catch { /* ignore */ }
    this.ownedBrowser = null;
    this.browser = null;
  }
  if (this.headedBrowser) {
    try { await this.headedBrowser.close(); } catch { /* ignore */ }
    this.headedBrowser = null;
  }
}
```

- [ ] **Step 2: Wire `watch === 'tiled'` in the runner**

In `runner.ts`, replace the existing `tiled_not_yet_implemented` warning with:

```typescript
const headedHostCount = watch === 'tiled' ? rooms : 0;
const pool = new SessionPool({
  targetUrl,
  inviteCode,
  credFile,
  logger,
  headedHostCount,
});
```

(Move that `pool` creation up so it's aware of `watch`.)

- [ ] **Step 3: Test tiled mode**

```bash
TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run soak -- \
  --scenario=populate \
  --accounts=4 --rooms=2 --cpus-per-room=0 --games-per-room=1 --holes=1 \
  --watch=tiled
```

Expected: 2 native Chromium windows appear (one per host), sized ~960×540 and positioned at the upper-left of the screen. They play the game visibly. On exit, windows close. If the windows stack on top of each other instead of tiling, Chromium has likely ignored `window.moveTo`/`resizeTo`, which it tends to do for windows not opened by script; setting bounds over CDP (`Browser.getWindowForTarget` + `Browser.setWindowBounds`) is a possible fallback.

- [ ] **Step 4: Commit**

```bash
git add tests/soak/core/session-pool.ts tests/soak/runner.ts
git commit -m "$(cat <<'EOF'
feat(soak): --watch=tiled launches N headed host windows

SessionPool accepts headedHostCount; when > 0 it launches a second
Chromium in headed mode, creates those contexts there, and positions
each host window in a 2×2 grid via window.moveTo/resizeTo.

Co-Authored-By: Claude Opus 4.6 (1M context)
EOF
)"
```

---

## Phase 8 — Stress scenario

### Task 25: Chaos injector + stress scenario

Short 1-hole games in tight loops. A background chaos loop ticks roughly every 500 ms and gives each session a 5% chance per tick of receiving a chaos event (rapid clicks, brief offline toggle, tab blur).
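
To judge whether a short smoke run should show any `chaos_injected` log lines at all, it helps to estimate how many events the chaos loop produces. The sketch below is a back-of-the-envelope calculation, not harness code. It assumes the loop in `stress.ts` rolls once per session per tick and that ticks take exactly `tickMs`. Real ticks run longer because each injected event takes time, so treat the result as an upper bound. `expectedChaosEvents` is a hypothetical helper.

```typescript
// Expected number of chaos events for one room over `durationMs`,
// given an independent roll of `probability` per session per tick.
function expectedChaosEvents(
  probability: number,
  sessions: number,
  tickMs: number,
  durationMs: number,
): number {
  const ticks = Math.floor(durationMs / tickMs);
  return probability * sessions * ticks;
}

// Defaults from the stress scenario: 5% chance, 500 ms tick.
// Four sessions over one minute: 0.05 * 4 * 120 rolls, about 24 events.
const perMinute = expectedChaosEvents(0.05, 4, 500, 60_000);
console.log(`~${perMinute.toFixed(1)} chaos events per room-minute`);
```

At these defaults a three-game smoke run lasting only a few seconds may legitimately log zero chaos events, so an empty log there is not necessarily a bug.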

**Files:**
- Create: `tests/soak/scenarios/stress.ts`
- Create: `tests/soak/scenarios/shared/chaos.ts`
- Modify: `tests/soak/scenarios/index.ts` — register `stress`

- [ ] **Step 1: Create `scenarios/shared/chaos.ts`**

```typescript
// tests/soak/scenarios/shared/chaos.ts
import type { Session, Logger } from '../../core/types';

export type ChaosEvent =
  | 'rapid_clicks'
  | 'tab_blur'
  | 'brief_offline';

const ALL_EVENTS: ChaosEvent[] = ['rapid_clicks', 'tab_blur', 'brief_offline'];

function pickEvent(): ChaosEvent {
  return ALL_EVENTS[Math.floor(Math.random() * ALL_EVENTS.length)];
}

/** Rolls once; returns the injected event, or null if the roll missed. */
export async function maybeInjectChaos(
  session: Session,
  probability: number,
  logger: Logger,
  roomId: string,
): Promise<ChaosEvent | null> {
  if (Math.random() >= probability) return null;

  const event = pickEvent();
  logger.info('chaos_injected', { room: roomId, session: session.key, event });

  try {
    switch (event) {
      case 'rapid_clicks': {
        // Fire 5 rapid clicks at the player's own cards
        for (let i = 0; i < 5; i++) {
          await session.page.locator(`#player-cards .card:nth-child(${(i % 6) + 1})`)
            .click({ timeout: 300 })
            .catch(() => {});
        }
        break;
      }
      case 'tab_blur': {
        // Briefly dispatch blur then focus
        await session.page.evaluate(() => {
          window.dispatchEvent(new Event('blur'));
          setTimeout(() => window.dispatchEvent(new Event('focus')), 200);
        });
        break;
      }
      case 'brief_offline': {
        await session.context.setOffline(true);
        await new Promise((r) => setTimeout(r, 300));
        await session.context.setOffline(false);
        break;
      }
    }
  } catch (err) {
    logger.warn('chaos_error', {
      event,
      error: err instanceof Error ? err.message : String(err),
    });
  }

  return event;
}
```

- [ ] **Step 2: Create `scenarios/stress.ts`**

```typescript
// tests/soak/scenarios/stress.ts
import type {
  Scenario,
  ScenarioContext,
  ScenarioResult,
  ScenarioError,
  Session,
} from '../core/types';
import { runOneMultiplayerGame } from './shared/multiplayer-game';
import { maybeInjectChaos } from './shared/chaos';

interface StressConfig {
  gamesPerRoom: number;
  holes: number;
  decks: number;
  rooms: number;
  cpusPerRoom: number;
  thinkTimeMs: [number, number];
  interGamePauseMs: number;
  chaosChance: number;
}

function chunk<T>(arr: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < arr.length; i += size) out.push(arr.slice(i, i + size));
  return out;
}

async function sleep(ms: number): Promise<void> {
  return new Promise((r) => setTimeout(r, ms));
}

async function runStressRoom(
  ctx: ScenarioContext,
  cfg: StressConfig,
  roomIdx: number,
  sessions: Session[],
): Promise<{ completed: number; errors: ScenarioError[]; chaosFired: number }> {
  const roomId = `room-${roomIdx}`;
  let completed = 0;
  let chaosFired = 0;
  const errors: ScenarioError[] = [];

  for (let gameNum = 0; gameNum < cfg.gamesPerRoom; gameNum++) {
    if (ctx.signal.aborted) break;

    ctx.dashboard.update(roomId, { game: gameNum + 1, totalGames: cfg.gamesPerRoom });

    // Start a background chaos loop for this game
    let chaosActive = true;
    const chaosLoop = (async () => {
      while (chaosActive && !ctx.signal.aborted) {
        await sleep(500);
        for (const session of sessions) {
          const e = await maybeInjectChaos(session, cfg.chaosChance, ctx.logger, roomId);
          if (e) chaosFired++;
        }
      }
    })();

    let result: Awaited<ReturnType<typeof runOneMultiplayerGame>>;
    try {
      result = await runOneMultiplayerGame(ctx, sessions, {
        roomId,
        holes: cfg.holes,
        decks: cfg.decks,
        cpusPerRoom: cfg.cpusPerRoom,
        thinkTimeMs: cfg.thinkTimeMs,
      });
    } finally {
      // Always stop the chaos loop, even if the game throws
      chaosActive = false;
      await chaosLoop;
    }

    if (result.completed) {
      completed++;
      ctx.logger.info('game_complete', { room: roomId, game: gameNum + 1, turns: result.turns });
    } else {
      errors.push({
        room: roomId,
        reason: 'game_failed',
        detail: result.error,
        timestamp: Date.now(),
      });
      ctx.logger.error('game_failed', { room: roomId, error: result.error });
    }

    await sleep(cfg.interGamePauseMs);
  }

  return { completed, errors, chaosFired };
}

const stress: Scenario = {
  name: 'stress',
  description: 'Rapid short games for stability & race condition hunting',
  needs: { accounts: 16, rooms: 4, cpusPerRoom: 2 },
  defaultConfig: {
    gamesPerRoom: 50,
    holes: 1,
    decks: 1,
    rooms: 4,
    cpusPerRoom: 2,
    thinkTimeMs: [50, 150],
    interGamePauseMs: 200,
    chaosChance: 0.05,
  },

  async run(ctx: ScenarioContext): Promise<ScenarioResult> {
    const start = Date.now();
    const cfg = ctx.config as unknown as StressConfig;

    // At least 1 per room: a chunk size of 0 would make chunk() loop forever
    const perRoom = Math.max(1, Math.floor(ctx.sessions.length / cfg.rooms));
    const roomSessions = chunk(ctx.sessions, perRoom);

    // allSettled keeps one room's failure from taking down the others
    const results = await Promise.allSettled(
      roomSessions.map((s, idx) => runStressRoom(ctx, cfg, idx, s)),
    );

    let gamesCompleted = 0;
    let chaosFired = 0;
    const errors: ScenarioError[] = [];
    results.forEach((r, idx) => {
      if (r.status === 'fulfilled') {
        gamesCompleted += r.value.completed;
        chaosFired += r.value.chaosFired;
        errors.push(...r.value.errors);
      } else {
        errors.push({
          room: `room-${idx}`,
          reason: 'room_threw',
          detail: r.reason instanceof Error ? r.reason.message : String(r.reason),
          timestamp: Date.now(),
        });
      }
    });

    return {
      gamesCompleted,
      errors,
      durationMs: Date.now() - start,
      customMetrics: { chaos_fired: chaosFired },
    };
  },
};

export default stress;
```

- [ ] **Step 3: Register stress in the registry**

Edit `tests/soak/scenarios/index.ts`:

```typescript
import type { Scenario } from '../core/types';
import populate from './populate';
import stress from './stress';

const registry: Record<string, Scenario> = {
  populate,
  stress,
};

export function getScenario(name: string): Scenario | undefined {
  return registry[name];
}

export function listScenarios(): Scenario[] {
  return Object.values(registry);
}
```

- [ ] **Step 4: Smoke test stress scenario**

```bash
TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run soak -- \
  --scenario=stress \
  --accounts=4 --rooms=1 --cpus-per-room=1 --games-per-room=3 --holes=1 \
  --watch=none
```

Expected: 3 quick games complete, chaos events in logs (look for `chaos_injected`), exit 0.

- [ ] **Step 5: Commit**

```bash
git add tests/soak/scenarios/stress.ts tests/soak/scenarios/shared/chaos.ts tests/soak/scenarios/index.ts
git commit -m "$(cat <<'EOF'
feat(soak): stress scenario with chaos injection

Rapid 1-hole games with a parallel chaos loop that gives each session
a 5% chance per tick of firing rapid_clicks, tab_blur, or
brief_offline events. Chaos counts roll up into
ScenarioResult.customMetrics.

Co-Authored-By: Claude Opus 4.6 (1M context)
EOF
)"
```

---

## Phase 9 — Failure handling

### Task 26: Watchdog + heartbeat wiring

Per-room timeout that fires if no heartbeat arrives within N ms. Runner wires it into `ctx.heartbeat`. Vitest-tested.

**Files:**
- Create: `tests/soak/core/watchdog.ts`
- Create: `tests/soak/tests/watchdog.test.ts`
- Modify: `tests/soak/runner.ts` — wire `heartbeat` to per-room watchdogs

- [ ] **Step 1: Write failing tests**

```typescript
// tests/soak/tests/watchdog.test.ts
import { describe, it, expect, vi, beforeEach, afterEach } from 'vitest';
import { Watchdog } from '../core/watchdog';

describe('Watchdog', () => {
  beforeEach(() => vi.useFakeTimers());
  afterEach(() => vi.useRealTimers());

  it('fires after timeout if no heartbeat', () => {
    const onTimeout = vi.fn();
    const w = new Watchdog(1000, onTimeout);
    w.start();
    vi.advanceTimersByTime(1001);
    expect(onTimeout).toHaveBeenCalledOnce();
  });

  it('heartbeat resets the timer', () => {
    const onTimeout = vi.fn();
    const w = new Watchdog(1000, onTimeout);
    w.start();
    vi.advanceTimersByTime(800);
    w.heartbeat();
    vi.advanceTimersByTime(800);
    expect(onTimeout).not.toHaveBeenCalled();
    vi.advanceTimersByTime(300);
    expect(onTimeout).toHaveBeenCalledOnce();
  });

  it('stop cancels pending timeout', () => {
    const onTimeout = vi.fn();
    const w = new Watchdog(1000, onTimeout);
    w.start();
    w.stop();
    vi.advanceTimersByTime(2000);
    expect(onTimeout).not.toHaveBeenCalled();
  });

  it('does not re-arm or fire again after it has fired', () => {
    const onTimeout = vi.fn();
    const w = new Watchdog(1000, onTimeout);
    w.start();
    vi.advanceTimersByTime(1001);
    w.heartbeat();
    vi.advanceTimersByTime(1001);
    expect(onTimeout).toHaveBeenCalledOnce();
  });
});
```

- [ ] **Step 2: Run to verify failure**

```bash
npx vitest run tests/watchdog.test.ts
```

Expected: FAIL, because `../core/watchdog` does not exist yet.

- [ ] **Step 3: Implement `core/watchdog.ts`**

```typescript
// tests/soak/core/watchdog.ts
export class Watchdog {
  private timer: NodeJS.Timeout | null = null;
  private fired = false;

  constructor(
    private timeoutMs: number,
    private onTimeout: () => void,
  ) {}

  /** Arm (or re-arm) the timer and clear any previous fired state. */
  start(): void {
    this.stop();
    this.fired = false;
    this.timer = setTimeout(() => {
      if (this.fired) return;
      this.fired = true;
      this.onTimeout();
    }, this.timeoutMs);
  }

  /** Reset the timer. Ignored once the watchdog has fired. */
  heartbeat(): void {
    if (this.fired) return;
    this.start();
  }

  stop(): void {
    if (this.timer) {
      clearTimeout(this.timer);
      this.timer = null;
    }
  }
}
```

- [ ] **Step 4: Verify tests pass**

```bash
npx vitest run tests/watchdog.test.ts
```

Expected: all passing.

- [ ] **Step 5: Wire watchdogs into the runner**

In `runner.ts`, add before building `ctx`:

```typescript
const watchdogs = new Map<string, Watchdog>();
const roomAborters = new Map<string, AbortController>();
for (let i = 0; i < rooms; i++) {
  const roomId = `room-${i}`;
  const aborter = new AbortController();
  roomAborters.set(roomId, aborter);
  const w = new Watchdog(60_000, () => {
    logger.error('watchdog_fired', { room: roomId });
    aborter.abort();
    dashboard.update(roomId, { phase: 'error' });
  });
  w.start();
  watchdogs.set(roomId, w);
}
```

Import at the top:

```typescript
import { Watchdog } from './core/watchdog';
```

Set `ctx.heartbeat` to:

```typescript
heartbeat: (roomId: string) => {
  const w = watchdogs.get(roomId);
  if (w) w.heartbeat();
},
```

In the `finally` block, stop all watchdogs:

```typescript
for (const w of watchdogs.values()) w.stop();
```

Note: for now the `roomAborters` aren't plumbed into scenario cancellation, so scenarios see the global `ctx.signal` only. This is intentional; per-room abort requires scenario-side awareness and is deferred until a scenario genuinely misbehaves. The watchdog still detects a stuck room: it logs `watchdog_fired` and flips that room's dashboard tile to `error`, but it does not stop the scenario.
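
If per-room cancellation is plumbed in later, one possible shape is to hand each room a signal that aborts when either the global controller or that room's controller aborts. The following is a minimal sketch under that assumption. `linkSignals` is a hypothetical helper that does not exist in the harness, and it uses only the standard `AbortController` API.

```typescript
// Returns a signal that aborts as soon as any of the input signals aborts.
function linkSignals(...signals: AbortSignal[]): AbortSignal {
  const linked = new AbortController();
  for (const s of signals) {
    if (s.aborted) {
      linked.abort();
      break;
    }
    // { once: true } drops the listener after the first abort
    s.addEventListener('abort', () => linked.abort(), { once: true });
  }
  return linked.signal;
}

// Usage: a room loop would check roomSignal.aborted instead of ctx.signal.aborted.
const globalAbort = new AbortController();
const roomAbort = new AbortController();
const roomSignal = linkSignals(globalAbort.signal, roomAbort.signal);

roomAbort.abort(); // e.g. called from the watchdog's onTimeout
console.log(roomSignal.aborted, globalAbort.signal.aborted);
```

With this shape a fired watchdog would cancel only its own room, while Ctrl-C on the global controller would still stop every room.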

- [ ] **Step 6: Commit**

```bash
git add tests/soak/core/watchdog.ts tests/soak/tests/watchdog.test.ts tests/soak/runner.ts
git commit -m "$(cat <<'EOF'
feat(soak): per-room watchdog with heartbeat

Watchdog class with Vitest tests, wired into ctx.heartbeat in the
runner. One watchdog per room, 60s timeout; firing logs an error and
marks the room's dashboard tile as errored.

Co-Authored-By: Claude Opus 4.6 (1M context)
EOF
)"
```

---

### Task 27: Artifact capture on failure

When the runner catches an error, snapshot every session's page: screenshot, HTML, console log tail, game state JSON.

**Files:**
- Create: `tests/soak/core/artifacts.ts`
- Modify: `tests/soak/runner.ts` — call `artifacts.captureAll` in the catch block

- [ ] **Step 1: Implement `core/artifacts.ts`**

```typescript
// tests/soak/core/artifacts.ts
import * as fs from 'fs';
import * as path from 'path';
import type { Session, Logger } from './types';

export interface ArtifactsOptions {
  runId: string;
  /** Absolute path to the artifacts root, e.g., /path/to/tests/soak/artifacts */
  rootDir: string;
  logger: Logger;
}

export class Artifacts {
  readonly runDir: string;

  constructor(private opts: ArtifactsOptions) {
    this.runDir = path.join(opts.rootDir, opts.runId);
    fs.mkdirSync(this.runDir, { recursive: true });
  }

  /** Capture everything for a single session. Each artifact is best-effort. */
  async captureSession(session: Session, roomId: string): Promise<void> {
    const dir = path.join(this.runDir, roomId);
    fs.mkdirSync(dir, { recursive: true });
    const prefix = session.key;

    try {
      const png = await session.page.screenshot({ fullPage: true });
      fs.writeFileSync(path.join(dir, `${prefix}.png`), png);
    } catch (err) {
      this.opts.logger.warn('artifact_screenshot_failed', {
        session: session.key,
        error: err instanceof Error ? err.message : String(err),
      });
    }

    try {
      const html = await session.page.content();
      fs.writeFileSync(path.join(dir, `${prefix}.html`), html);
    } catch (err) {
      this.opts.logger.warn('artifact_html_failed', {
        session: session.key,
        error: err instanceof Error ? err.message : String(err),
      });
    }

    try {
      const state = await session.bot.getGameState();
      fs.writeFileSync(
        path.join(dir, `${prefix}.state.json`),
        JSON.stringify(state, null, 2),
      );
    } catch (err) {
      this.opts.logger.warn('artifact_state_failed', {
        session: session.key,
        error: err instanceof Error ? err.message : String(err),
      });
    }

    try {
      const errors = session.bot.getConsoleErrors?.() ?? [];
      fs.writeFileSync(path.join(dir, `${prefix}.console.txt`), errors.join('\n'));
    } catch {
      // ignore — not all bots expose this
    }
  }

  async captureAll(sessions: Session[]): Promise<void> {
    // Best-effort: the room of each session is not known here, so everything
    // is written under room-unknown/. Callers that know the room mapping can
    // call captureSession() per session instead.
    await Promise.all(
      sessions.map((s) => this.captureSession(s, 'room-unknown')),
    );
  }

  writeSummary(summary: object): void {
    fs.writeFileSync(
      path.join(this.runDir, 'summary.json'),
      JSON.stringify(summary, null, 2),
    );
  }
}

/** Prune run directories older than `maxAgeMs`. */
export function pruneOldRuns(rootDir: string, maxAgeMs: number, logger: Logger): void {
  if (!fs.existsSync(rootDir)) return;
  const now = Date.now();
  for (const entry of fs.readdirSync(rootDir)) {
    const full = path.join(rootDir, entry);
    try {
      const stat = fs.statSync(full);
      if (stat.isDirectory() && now - stat.mtimeMs > maxAgeMs) {
        fs.rmSync(full, { recursive: true, force: true });
        logger.info('artifact_pruned', { runId: entry });
      }
    } catch {
      // ignore
    }
  }
}
```

- [ ] **Step 2: Call artifact capture from the runner's error path**

In `runner.ts`, import:

```typescript
import { Artifacts, pruneOldRuns } from './core/artifacts';
```

After `const runId = ...`, instantiate and prune:

```typescript
const artifactsRoot = path.resolve(__dirname, 'artifacts');
const artifacts = new Artifacts({ runId, rootDir: artifactsRoot, logger });
pruneOldRuns(artifactsRoot, 7 * 24 * 3600 * 1000, logger);
```

In the `catch (err)` block, after logging, capture:

```typescript
} catch (err) {
  logger.error('run_failed', {
    error: err instanceof Error ? err.message : String(err),
    stack: err instanceof Error ? err.stack : undefined,
  });
  try {
    const liveSessions = pool['activeSessions'] as Session[] | undefined;
    if (liveSessions && liveSessions.length > 0) {
      await artifacts.captureAll(liveSessions);
    }
  } catch (captureErr) {
    logger.warn('artifact_capture_failed', {
      error: captureErr instanceof Error ? captureErr.message : String(captureErr),
    });
  }
  exitCode = 1;
}
```

(Note: the `pool['activeSessions']` access bypasses visibility to avoid adding a public getter for one call site. Acceptable for an error path in a test harness.)
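
`captureAll` files every session under `room-unknown/`. If per-room folders turn out to be useful when reading artifacts, the room can be derived from a session's position, mirroring how the scenarios chunk `ctx.sessions` into rooms. The sketch below is one possible derivation, not harness code. `roomIdForIndex` is a hypothetical helper, and clamping leftover sessions into the last room is an assumption.

```typescript
// Maps a session's index in the acquired list to a room id such as "room-2".
// Assumes sessions are assigned to rooms in contiguous, equally sized chunks.
function roomIdForIndex(sessionIndex: number, sessionCount: number, rooms: number): string {
  const perRoom = Math.max(1, Math.floor(sessionCount / rooms));
  // Assumption: sessions beyond rooms * perRoom are folded into the last room
  const roomIdx = Math.min(rooms - 1, Math.floor(sessionIndex / perRoom));
  return `room-${roomIdx}`;
}

// Default layout from the plan: 16 sessions across 4 rooms.
for (const i of [0, 3, 4, 15]) {
  console.log(`session ${i} -> ${roomIdForIndex(i, 16, 4)}`);
}
```

A caller could then invoke `captureSession(session, roomIdForIndex(...))` per session rather than `captureAll`.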
After a successful run, write the summary:

```typescript
artifacts.writeSummary({
  runId,
  scenario: scenario.name,
  targetUrl,
  gamesCompleted: result.gamesCompleted,
  errors: result.errors,
  durationMs: result.durationMs,
  customMetrics: result.customMetrics,
});
```

Import `Session` type:

```typescript
import type { Session } from './core/types';
```

- [ ] **Step 3: Verify by forcing a failure**

Kill the server mid-run and confirm artifacts are written:

```bash
# In one terminal
TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run soak -- \
  --scenario=populate --accounts=2 --rooms=1 --cpus-per-room=0 \
  --games-per-room=5 --holes=3 --watch=none

# In another: wait ~3 seconds then Ctrl-C the dev server
# The soak run should catch errors and write artifacts
ls tests/soak/artifacts/
ls tests/soak/artifacts//
```

Expected: a run directory exists with `summary.json` (if it got far enough) or per-session screenshots / HTML under `room-unknown/`.

- [ ] **Step 4: Commit**

```bash
git add tests/soak/core/artifacts.ts tests/soak/runner.ts
git commit -m "$(cat <<'EOF'
feat(soak): artifact capture on failure + run summary

Screenshots, HTML, game state, and console errors are captured into
tests/soak/artifacts// when a scenario throws. Runs older than 7 days
are pruned on startup. Successful runs get a summary.json next to the
artifacts dir.

Co-Authored-By: Claude Opus 4.6 (1M context)
EOF
)"
```

---

### Task 28: Graceful shutdown (already partially in place) + exit codes

SIGINT/SIGTERM already flip the abort controller. Formalize the timeout-and-force-exit path and the three exit codes (`0` / `1` / `2`).
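The three exit codes can be pinned down as a small pure function. The sketch below is illustrative only: `RunOutcome` and `exitCodeFor` are hypothetical names, and the precedence (an operator interrupt wins over accumulated errors) is an assumption that the plan does not state. The `130` used by the force-exit paths is a separate hard kill and is not modeled here.

```typescript
// Hypothetical outcome shape; the real runner may track this in local variables.
interface RunOutcome {
  interruptedBySignal: boolean; // SIGINT/SIGTERM was received
  errorCount: number;           // scenario errors, including a fatal health abort
}

// Maps a finished run to the plan's exit codes: 0 success, 1 errors, 2 interrupted.
function exitCodeFor(outcome: RunOutcome): 0 | 1 | 2 {
  // Assumption: an interrupt takes precedence over any errors seen so far.
  if (outcome.interruptedBySignal) return 2;
  return outcome.errorCount > 0 ? 1 : 0;
}

console.log(exitCodeFor({ interruptedBySignal: false, errorCount: 0 }));
```

Keeping the mapping in one function would let it be covered by a Vitest unit test alongside the other pure modules.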
**Files:**
- Modify: `tests/soak/runner.ts`

- [ ] **Step 1: Add a graceful shutdown timeout**

In `runner.ts`, replace the existing signal handlers with:

```typescript
let forceExitTimer: NodeJS.Timeout | null = null;

const onSignal = (sig: string) => {
  if (abortController.signal.aborted) {
    // Second signal: force exit
    logger.warn('force_exit', { signal: sig });
    process.exit(130);
  }
  logger.warn('signal_received', { signal: sig });
  abortController.abort();
  // Hard-kill after 10s if cleanup hangs
  forceExitTimer = setTimeout(() => {
    logger.error('graceful_shutdown_timeout');
    process.exit(130);
  }, 10_000);
};

process.on('SIGINT', () => onSignal('SIGINT'));
process.on('SIGTERM', () => onSignal('SIGTERM'));
```

In the `finally` block, clear the force-exit timer:

```typescript
if (forceExitTimer) clearTimeout(forceExitTimer);
```

- [ ] **Step 2: Manual test — Ctrl-C a long run**

```bash
TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run soak -- \
  --scenario=populate --accounts=2 --rooms=1 --cpus-per-room=0 \
  --games-per-room=10 --holes=3 --watch=none
# After ~5 seconds: Ctrl-C
```

Expected: runner logs `signal_received`, finishes current turn, prints summary, exits with code 2 (check `echo $?`).

- [ ] **Step 3: Commit**

```bash
git add tests/soak/runner.ts
git commit -m "$(cat <<'EOF'
feat(soak): graceful shutdown with 10s hard-kill fallback

SIGINT/SIGTERM flips the abort signal; scenarios finish the current
turn then exit. If cleanup hangs >10s the runner force-exits. Second
Ctrl-C is an immediate hard kill. Exit codes: 0 success, 1 errors,
2 interrupted.

Co-Authored-By: Claude Opus 4.6 (1M context)
EOF
)"
```

---

### Task 29: Periodic health probes

Every 30s, fetch `/api/health` on the target server. Three consecutive failures declare a fatal error and abort.
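The "three consecutive failures" rule is easy to isolate from the network call. The following is a sketch, not the runner's code: `createFailureTracker` is a hypothetical helper that resets on any success and reports when the threshold is reached.

```typescript
// Hypothetical helper: counts consecutive probe failures and resets on success.
function createFailureTracker(threshold: number) {
  let consecutive = 0;
  return {
    // Records one probe result; returns true once `threshold` failures occur in a row.
    record(ok: boolean): boolean {
      consecutive = ok ? 0 : consecutive + 1;
      return consecutive >= threshold;
    },
    current(): number {
      return consecutive;
    },
  };
}

const tracker = createFailureTracker(3);
const results = [false, false, true, false, false, false].map((ok) => tracker.record(ok));
console.log(results);
```

In this run the success in the third position resets the count, so only the final probe crosses the threshold.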
**Files:**
- Modify: `tests/soak/runner.ts`

- [ ] **Step 1: Add a health probe interval**

In `runner.ts`, after building the abort controller and before running the scenario:

```typescript
let healthFailures = 0;
const healthTimer = setInterval(async () => {
  try {
    const res = await fetch(`${targetUrl}/api/health`);
    if (!res.ok) throw new Error(`status ${res.status}`);
    healthFailures = 0;
  } catch (err) {
    healthFailures++;
    logger.warn('health_probe_failed', {
      consecutive: healthFailures,
      error: err instanceof Error ? err.message : String(err),
    });
    if (healthFailures >= 3) {
      logger.error('health_fatal', { consecutive: healthFailures });
      abortController.abort();
    }
  }
}, 30_000);
```

In the `finally` block:

```typescript
clearInterval(healthTimer);
```

- [ ] **Step 2: Commit**

```bash
git add tests/soak/runner.ts
git commit -m "$(cat <<'EOF'
feat(soak): periodic health probes against target server

Every 30s GET /api/health. Three consecutive failures abort the run
with a fatal error, so staging outages don't get misattributed to
harness bugs.

Co-Authored-By: Claude Opus 4.6 (1M context)
EOF
)"
```

---

## Phase 10 — Polish and bring-up

### Task 30: Smoke test script

`tests/soak/scripts/smoke.sh` — the canary run that takes ~30s against local dev.

**Files:**
- Create: `tests/soak/scripts/smoke.sh`

- [ ] **Step 1: Create the script**

```bash
#!/usr/bin/env bash
# Soak harness smoke test — end-to-end canary against local dev.
# Expected runtime: ~30 seconds.

set -euo pipefail

cd "$(dirname "$0")/.."

: "${TEST_URL:=http://localhost:8000}"
: "${SOAK_INVITE_CODE:=SOAKTEST}"
# Export so the defaults also reach `npm run seed` below
export TEST_URL SOAK_INVITE_CODE

echo "Smoke target: $TEST_URL"
echo "Invite code: $SOAK_INVITE_CODE"

# 1. Health probe
curl -fsS "$TEST_URL/api/health" > /dev/null || {
  echo "FAIL: target server unreachable at $TEST_URL"
  exit 1
}

# 2. Ensure minimum accounts
if [ ! -f .env.stresstest ]; then
  echo "Seeding accounts..."
  npm run seed -- --count=4
fi

# 3. Run minimum viable scenario
TEST_URL="$TEST_URL" SOAK_INVITE_CODE="$SOAK_INVITE_CODE" \
  npm run soak -- \
  --scenario=populate \
  --accounts=2 \
  --rooms=1 \
  --cpus-per-room=0 \
  --games-per-room=1 \
  --holes=1 \
  --watch=none

echo "Smoke PASSED"
```

- [ ] **Step 2: Make it executable and run it**

```bash
chmod +x tests/soak/scripts/smoke.sh
cd tests/soak && bash scripts/smoke.sh
```

Expected: `Smoke PASSED` within ~30s.

- [ ] **Step 3: Commit**

```bash
git add tests/soak/scripts/smoke.sh
git commit -m "$(cat <<'EOF'
feat(soak): smoke test script — 30s end-to-end canary

Confirms the harness works against local dev with the absolute
minimum config. Run after any change.

Co-Authored-By: Claude Opus 4.6 (1M context)
EOF
)"
```

---

### Task 31: README + CHECKLIST

Replace the README stub with a full quickstart and flag reference. Add the manual validation checklist.

**Files:**
- Modify: `tests/soak/README.md`
- Create: `tests/soak/CHECKLIST.md`

- [ ] **Step 1: Rewrite `tests/soak/README.md`**

```markdown
# Golf Soak & UX Test Harness

Standalone Playwright-based runner that drives multi-user authenticated game sessions for scoreboard population and stability testing.
**Spec:** `../../docs/superpowers/specs/2026-04-10-multiplayer-soak-test-design.md`

**Bring-up:** `../../docs/soak-harness-bringup.md`

## Quick start

```bash
cd tests/soak
npm install

# First run only: seed 16 accounts
TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run seed

# 30-second end-to-end smoke test
bash scripts/smoke.sh

# Populate scoreboard (4 rooms × 4 accounts × 10 long games)
TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST \
  npm run soak:populate

# Stress test (4 rooms × 50 rapid games with chaos)
TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST \
  npm run soak:stress
```

## CLI flags

```
--scenario=populate|stress    required
--accounts=                   total sessions (default: scenario.needs.accounts)
--rooms=                      default from scenario.needs
--cpus-per-room=              default from scenario.needs
--games-per-room=             default from scenario.defaultConfig
--holes=                      default from scenario.defaultConfig
--watch=none|dashboard|tiled  default: dashboard
--dashboard-port=             default: 7777
--target=                     default: TEST_URL env
--run-id=                     default: ISO timestamp
--list                        print scenarios and exit
--dry-run                     validate config, don't run
```

Derived: `accounts / rooms` must divide evenly.

## Environment variables

```
TEST_URL              target base URL (e.g. https://staging.adlee.work)
SOAK_INVITE_CODE      invite code flagged marks_as_test (staging: 5VC2MCCN)
SOAK_HOLES            override --holes
SOAK_ROOMS            override --rooms
SOAK_ACCOUNTS         override --accounts
SOAK_CPUS_PER_ROOM    override --cpus-per-room
SOAK_GAMES_PER_ROOM   override --games-per-room
SOAK_WATCH            override --watch
SOAK_DASHBOARD_PORT   override --dashboard-port
```

## Watch modes

- **`none`** — pure headless, JSON logs to stdout. Use for CI and overnight runs.
- **`dashboard`** (default) — HTTP+WS server on localhost:7777 serving a live status grid. Click any player tile to watch their live session via CDP screencast.
- **`tiled`** — 4 native Chromium windows for the host of each room, positioned in a 2×2 grid. Joiners stay headless.
## Scenarios

| Name | Description |
|---|---|
| `populate` | Long 9-hole games with varied CPU personalities, realistic pacing, for populating scoreboards |
| `stress` | Rapid 1-hole games with chaos injection (rapid clicks, offline toggles, tab blur) for hunting race conditions |

Add new scenarios by creating `scenarios/.ts` and registering in `scenarios/index.ts`.

## Architecture

See the design spec for full module breakdown. Key modules:

- `runner.ts` — CLI entry, wires everything together
- `core/session-pool.ts` — owns browser contexts, seeds/logs in 16 accounts
- `core/room-coordinator.ts` — host→joiners room-code handoff
- `core/watchdog.ts` — per-room timeout detection
- `core/screencaster.ts` — CDP Page.startScreencast for live video
- `dashboard/server.ts` — HTTP + WS server
- `scenarios/` — pluggable scenarios

Reuses `../../tests/e2e/bot/golf-bot.ts` unchanged.

## Running tests (unit)

```bash
npm test
```

Tests cover `Deferred`, `RoomCoordinator`, `Watchdog`, and `config`. Integration-level modules are verified by the smoke test.
```

- [ ] **Step 2: Create `tests/soak/CHECKLIST.md`**

```markdown
# Soak Harness Manual Validation Checklist

Run after any significant change or before calling the implementation complete.
## Bring-up

- [ ] Local dev server is running (`python server/main.py`)
- [ ] `SOAKTEST` invite code exists locally with `marks_as_test=TRUE`
- [ ] `npm install` in `tests/soak/` succeeded
- [ ] `npm run seed -- --count=16` creates/updates 16 accounts
- [ ] `.env.stresstest` has 16 `SOAK_ACCOUNT_NN=...` lines
- [ ] All seeded users show `is_test_account=TRUE` in the DB

## Smoke

- [ ] `bash scripts/smoke.sh` exits 0 within 60s

## Scenarios

- [ ] `--scenario=populate --rooms=1 --games-per-room=1` completes cleanly
- [ ] `--scenario=populate --rooms=4 --games-per-room=1` runs 4 rooms in parallel with no cross-contamination
- [ ] `--scenario=stress --games-per-room=3` logs `chaos_injected` events

## Watch modes

- [ ] `--watch=none` produces JSONL on stdout, nothing else
- [ ] `--watch=dashboard` opens http://localhost:7777, grid renders, tiles update live, WS status shows `healthy`
- [ ] Clicking any player tile opens the video modal and streams live JPEG frames (~10 fps)
- [ ] Closing the modal stops the screencast (check logs for `screencast_stopped`)
- [ ] `--watch=tiled` opens 4 native Chromium windows for the 4 hosts

## Failure modes

- [ ] Ctrl-C during a run → graceful shutdown, summary printed, exit code 2
- [ ] Double Ctrl-C → hard exit (130)
- [ ] Killing the dev server mid-run → health probes fail 3× → fatal abort, artifacts captured, exit 1
- [ ] Artifacts directory contains a subdirectory per failed run with screenshots and state.json
- [ ] Artifacts older than 7 days are pruned on next startup

## Server-side filtering

- [ ] `GET /api/stats/leaderboard` (default) hides soak_* accounts
- [ ] `GET /api/stats/leaderboard?include_test=true` shows soak_* accounts
- [ ] Admin panel user list shows `[Test]` badge on soak_* accounts
- [ ] Admin panel "Include test accounts" checkbox filters them out
- [ ] Admin panel invite codes tab shows `[Test-seed]` next to SOAKTEST

## Staging bring-up (final step)

- [ ] `UPDATE invite_codes SET marks_as_test = TRUE WHERE code = '5VC2MCCN';` run on staging
- [ ] `SOAK_INVITE_CODE=5VC2MCCN TEST_URL=https://staging.adlee.work npm run seed -- --count=16` seeds staging accounts
- [ ] Staging run with `--scenario=populate --watch=none` completes
- [ ] Staging leaderboard with `include_test=true` shows the soak accounts
- [ ] Staging leaderboard default (no param) does NOT show the soak accounts
```

- [ ] **Step 3: Commit**

```bash
git add tests/soak/README.md tests/soak/CHECKLIST.md
git commit -m "$(cat <<'EOF'
docs(soak): full README + manual validation checklist

Quickstart, flag reference, env var reference, scenario table, and
the bring-up/validation checklist that gates calling the harness
implementation complete.

Co-Authored-By: Claude Opus 4.6 (1M context)
EOF
)"
```

---

### Task 32: Staging bring-up (manual, no code)

This is a documentation-only task — the actual run happens on your workstation. Listed here so the implementation plan is complete end to end.

- [ ] **Step 1: Flag `5VC2MCCN` as test-seed on staging**

From your workstation (requires DB access to staging):

```bash
ssh root@129.212.150.189 \
  'docker exec -i golfgame-postgres psql -U postgres -d golfgame' <<'EOF'
UPDATE invite_codes SET marks_as_test = TRUE WHERE code = '5VC2MCCN';
SELECT code, max_uses, use_count, marks_as_test FROM invite_codes WHERE code = '5VC2MCCN';
EOF
```

Expected: `marks_as_test | t`.

(The exact docker container name may differ — adjust based on `docker ps` on the staging host.)

- [ ] **Step 2: Seed the 16 staging accounts**

```bash
cd tests/soak
rm -f .env.stresstest
TEST_URL=https://staging.adlee.work \
  SOAK_INVITE_CODE=5VC2MCCN \
  npm run seed -- --count=16
```

Expected: `.env.stresstest` populated with 16 entries.
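The "16 entries" expectation can be checked mechanically instead of by eye. This is a rough sketch under one assumption taken from the checklist: each account occupies a single `SOAK_ACCOUNT_NN=...` line. The helper name `countSoakAccounts` and the placeholder values are hypothetical, since the real value format is not described in this plan.

```typescript
// Assumption: one `SOAK_ACCOUNT_NN=<value>` entry per line, with NN being digits.
function countSoakAccounts(envText: string): number {
  return envText
    .split(/\r?\n/)
    .filter((line) => /^SOAK_ACCOUNT_\d+=.+/.test(line.trim()))
    .length;
}

// Placeholder content; a real file would be read with fs.readFileSync.
const sample = [
  '# seeded by npm run seed',
  'SOAK_ACCOUNT_00=placeholder-a',
  'SOAK_ACCOUNT_01=placeholder-b',
  '',
].join('\n');

console.log(countSoakAccounts(sample));
```

Against a real seed run, the count would be compared with the `--count` value passed to `npm run seed`.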
- [ ] **Step 3: Run populate against staging**

```bash
TEST_URL=https://staging.adlee.work \
  SOAK_INVITE_CODE=5VC2MCCN \
  npm run soak -- \
  --scenario=populate \
  --rooms=4 \
  --games-per-room=3 \
  --holes=3 \
  --watch=dashboard
```

Expected: dashboard opens, 4 rooms play 3 games each, staging scoreboard accumulates data. Exit 0 at the end.

- [ ] **Step 4: Verify scoreboard filtering on staging**

```bash
# Should NOT contain soak_* usernames
curl -s "https://staging.adlee.work/api/stats/leaderboard?metric=wins" |
  jq '.entries[] | select(.username | startswith("soak_"))'

# Should contain soak_* usernames
curl -s "https://staging.adlee.work/api/stats/leaderboard?metric=wins&include_test=true" |
  jq '.entries[] | select(.username | startswith("soak_"))'
```

Expected: first returns nothing, second returns entries.

- [ ] **Step 5: Mark implementation complete**

Check off all items in `tests/soak/CHECKLIST.md` that correspond to this plan. Commit the filled-in checklist if you want a record:

```bash
git add tests/soak/CHECKLIST.md
git commit -m "docs(soak): checklist passed on initial staging run"
```

---

## Phase 11 — Version bump

### Task 33: Bump to v3.3.4 and add footer to admin.html

Updates all HTML footers from `v3.1.6` to `v3.3.4`, adds a footer to admin.html which currently has none, bumps `pyproject.toml`.

**Files:**
- Modify: `client/index.html` — both footer occurrences (L58, L291)
- Modify: `client/admin.html` — add footer
- Modify: `pyproject.toml` — version field

- [ ] **Step 1: Update `client/index.html` footers**

```bash
grep -n "v3\.1\.6" client/index.html
```

For each match, replace `v3.1.6` with `v3.3.4`. There should be exactly two matches.

- [ ] **Step 2: Add footer to `client/admin.html`**

Find the closing `` in `client/admin.html` and add a footer just before it:

```html
    v3.3.4 © Aaron D. Lee
```

(The inline style is a fallback — admin.css may already have an `.app-footer` class; if so, drop the inline styles.)

```bash
grep -n "app-footer" client/admin.css 2>/dev/null
```

If the class exists, use just `
    v3.3.4 © Aaron D. Lee
`.

- [ ] **Step 3: Bump `pyproject.toml`**

```bash
sed -i 's/^version = "3\.1\.6"$/version = "3.3.4"/' pyproject.toml
grep version pyproject.toml
```

Expected: `version = "3.3.4"`.

- [ ] **Step 4: Verify in the browser**

Restart the dev server, open http://localhost:8000 and http://localhost:8000/admin.html. Confirm both show `v3.3.4` in the footer.

- [ ] **Step 5: Commit**

```bash
git add client/index.html client/admin.html pyproject.toml
git commit -m "$(cat <<'EOF'
chore: bump version to v3.3.4

Updates client/index.html footer (×2) and pyproject.toml from
v3.1.6 → v3.3.4, and adds a matching footer to client/admin.html
which previously had none.

Co-Authored-By: Claude Opus 4.6 (1M context)
EOF
)"
```

---

## Summary

33 tasks across 11 phases:

| Phase | Tasks | Milestone |
|---|---|---|
| 1 — Server changes | 1–8 | Stats filter works, test accounts are separable |
| 2 — Harness scaffolding | 9–12 | Core pure-logic modules with Vitest tests pass |
| 3 — SessionPool + seeding | 13–14 | `.env.stresstest` seeded via real HTTP |
| 4 — First run | 15–18 | **`--watch=none` smoke test passes end-to-end** |
| 5 — Dashboard | 19–21 | Live status grid in browser |
| 6 — Live video | 22–23 | Click-to-watch CDP screencast |
| 7 — Tiled mode | 24 | Native host windows |
| 8 — Stress scenario | 25 | Chaos injection runs clean |
| 9 — Failure handling | 26–29 | Watchdog + artifacts + graceful shutdown + health probes |
| 10 — Polish | 30–31 | Smoke script + README + CHECKLIST |
| 11 — Version bump | 33 | v3.3.4 everywhere |

(Task 32 is the manual staging bring-up — no code.)
Dependencies between tasks:

- Tasks 1–8 are independent of the harness (ship them first if you want immediate value for admins)
- Tasks 9–18 are strictly sequential (each builds on the previous)
- Tasks 19–21, 22–23, 24, 25 are independent of each other — can be done in any order after Task 18
- Tasks 26–29 can be done after Task 18 but are most valuable after Task 25
- Tasks 30–31 come last before staging
- Task 33 is independent and can be done any time after Task 8