33-task TDD plan across 11 phases implementing the soak & UX test harness design. Server-side schema/filter/admin changes ship first (independent), then the tests/soak/ TypeScript runner builds up incrementally — first milestone is a --watch=none smoke run against local dev after Task 18, then dashboard, live video, tiled mode, stress scenario, failure handling, and staging bring-up. Final task bumps project version to v3.3.4. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
# Multiplayer Soak & UX Test Harness — Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Build a standalone Playwright-based soak runner in `tests/soak/` that drives 16 authenticated browser sessions across 4 concurrent rooms playing many multiplayer games, with pluggable scenarios, a click-to-watch dashboard via CDP screencast, and strict per-room failure isolation.

**Architecture:** Single-process Node runner reusing the existing `GolfBot` class from `tests/e2e/bot/`. One shared browser (16 contexts) by default; `WATCH=tiled` uses a second headed browser for the 4 host contexts. Scenarios are plain TS modules exported from `tests/soak/scenarios/`. Dashboard is a tiny HTTP+WS server serving one static page that pushes live status and on-demand CDP screencast frames.

**Tech Stack:** TypeScript + tsx (no build step), Playwright Core, ws (WebSocket server), Vitest for unit tests, FastAPI + asyncpg (existing server), PostgreSQL (existing).

**Spec:** `docs/superpowers/specs/2026-04-10-multiplayer-soak-test-design.md`

---

## Testing Strategy Notes

- **Server-side Python changes:** The existing test suite mocks stores with `AsyncMock` and has no real-Postgres fixtures. Rather than inventing a new fixture pattern for this plan, server tasks use **curl-based verification against a running local dev server** as the explicit verification step after each commit. Run `python server/main.py` in another terminal (requires Postgres + Redis running; see `docs/INSTALL.md`).
- **TypeScript harness logic:** Unit-tested with Vitest for pure modules (Deferred, RoomCoordinator, Watchdog, Config). Integration-level modules (SessionPool, Dashboard, Screencaster, Scenarios) are verified by running the harness itself via the smoke test.
- **End-to-end validation:** `tests/soak/scripts/smoke.sh` is the canary: after every non-trivial change, run it against local dev and expect exit 0 within ~30s.

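
For context, the `AsyncMock` pattern mentioned in the first bullet looks roughly like the sketch below. `AuthService` here is a hypothetical, trimmed-down stand-in (not the project's real class), and the store is a bare `AsyncMock`. The mock can confirm that a flag such as `is_test_account` is passed along, but no SQL ever runs, which is why the server tasks rely on curl checks against a live dev server instead.

```python
import asyncio
from unittest.mock import AsyncMock


class AuthService:
    """Hypothetical stand-in for the real auth service (assumption)."""

    def __init__(self, user_store):
        self.user_store = user_store

    async def register(self, username, password_hash, is_test_account=False):
        # Pass the flag straight through to the store.
        return await self.user_store.create_user(
            username=username,
            password_hash=password_hash,
            is_test_account=is_test_account,
        )


async def run_mocked_register():
    store = AsyncMock()  # every attribute becomes an awaitable mock
    store.create_user.return_value = {
        "username": "soaktest_1",
        "is_test_account": True,
    }
    service = AuthService(store)
    user = await service.register("soaktest_1", "hash", is_test_account=True)

    # The mock proves the call shape, but a typo in the real INSERT
    # column list would go unnoticed because no query is executed.
    store.create_user.assert_awaited_once_with(
        username="soaktest_1", password_hash="hash", is_test_account=True
    )
    return user


if __name__ == "__main__":
    print(asyncio.run(run_mocked_register()))
```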
---
## Phase 1 — Server-side changes (independent, ships first)
### Task 1: Schema migration for `is_test_account` and `marks_as_test`

Add two columns and one partial index, and rebuild the `leaderboard_overall` materialized view to include `is_test_account` (so the filter works through the view fast path). This fits the existing inline-migration pattern in `user_store.py`.

**Files:**
- Modify: `server/stores/user_store.py`: append to `SCHEMA_SQL` (ALTER blocks near L79–L98 and the matview block near L298–L335)

- [ ] **Step 1: Add column migration to `SCHEMA_SQL`**

Open `server/stores/user_store.py`. Inside the first `DO $$ BEGIN ... END $$;` block (around lines 80–98, which handles admin columns), append the `is_test_account` column check. Then add a second ALTER for `invite_codes.marks_as_test` in a new `DO $$` block right after.

Add after the existing `last_seen_at` check (before `END $$;` on line ~98):

```sql
    IF NOT EXISTS (SELECT 1 FROM information_schema.columns
                   WHERE table_name = 'users_v2' AND column_name = 'is_test_account') THEN
        ALTER TABLE users_v2 ADD COLUMN is_test_account BOOLEAN DEFAULT FALSE;
    END IF;
```

Then, immediately after the `END $$;` that closes the users_v2 admin block, add a new block for invite_codes:

```sql
-- Add marks_as_test to invite_codes if not exists
DO $$
BEGIN
    IF NOT EXISTS (SELECT 1 FROM information_schema.columns
                   WHERE table_name = 'invite_codes' AND column_name = 'marks_as_test') THEN
        ALTER TABLE invite_codes ADD COLUMN marks_as_test BOOLEAN DEFAULT FALSE;
    END IF;
END $$;
```

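
The check-then-alter pattern above makes the migration safe to run on every server start. The sketch below illustrates the same idea with Python's built-in `sqlite3` so it can be run without Postgres. This is an analogy, not the project's code: SQLite has no `information_schema`, so `PRAGMA table_info` stands in for the column lookup, and `add_column_if_missing` is a hypothetical helper name.

```python
import sqlite3


def add_column_if_missing(conn, table: str, column: str, ddl: str) -> bool:
    """Add `column` to `table` only when it is absent; return True if added."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {ddl}")
    return True


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users_v2 (id INTEGER PRIMARY KEY, username TEXT)")
    # Running the migration twice must not fail or duplicate the column.
    first = add_column_if_missing(conn, "users_v2", "is_test_account", "BOOLEAN DEFAULT 0")
    second = add_column_if_missing(conn, "users_v2", "is_test_account", "BOOLEAN DEFAULT 0")
    columns = [row[1] for row in conn.execute("PRAGMA table_info(users_v2)")]
    return first, second, columns


if __name__ == "__main__":
    print(demo())
```

The second call is a no-op, which is the property the `IF NOT EXISTS` guards give the real `SCHEMA_SQL` blocks.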
- [ ] **Step 2: Add partial index on `is_test_account`**

Find the indexes block near line 338. After the existing `idx_users_banned` index (line ~344), add:

```sql
CREATE INDEX IF NOT EXISTS idx_users_v2_is_test_account ON users_v2(is_test_account)
    WHERE is_test_account = TRUE;
```

- [ ] **Step 3: Rebuild `leaderboard_overall` materialized view to include `is_test_account`**

Find the existing matview block at line ~298. Modify the version-check DO block so the view is dropped and recreated if it lacks the `is_test_account` column. Replace the existing block:

```sql
-- Leaderboard materialized view (refreshed periodically)
-- Drop and recreate if missing is_test_account column (soak harness migration)
DO $$
BEGIN
    IF EXISTS (SELECT 1 FROM pg_matviews WHERE matviewname = 'leaderboard_overall') THEN
        -- Check if is_test_account column exists in the view
        IF NOT EXISTS (
            SELECT 1 FROM information_schema.columns
            WHERE table_name = 'leaderboard_overall' AND column_name = 'is_test_account'
        ) THEN
            DROP MATERIALIZED VIEW leaderboard_overall;
        END IF;
    END IF;

    IF NOT EXISTS (SELECT 1 FROM pg_matviews WHERE matviewname = 'leaderboard_overall') THEN
        EXECUTE '
            CREATE MATERIALIZED VIEW leaderboard_overall AS
            SELECT
                u.id as user_id,
                u.username,
                COALESCE(u.is_test_account, FALSE) as is_test_account,
                s.games_played,
                s.games_won,
                ROUND(s.games_won::numeric / NULLIF(s.games_played, 0) * 100, 1) as win_rate,
                s.rounds_won,
                ROUND(s.total_points::numeric / NULLIF(s.total_rounds, 0), 1) as avg_score,
                s.best_score as best_round_score,
                s.knockouts,
                s.best_win_streak,
                COALESCE(s.rating, 1500) as rating,
                s.last_game_at
            FROM player_stats s
            JOIN users_v2 u ON s.user_id = u.id
            WHERE s.games_played >= 5
            AND u.deleted_at IS NULL
            AND (u.is_banned = false OR u.is_banned IS NULL)
        ';
    END IF;
END $$;
```

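Note: the only differences from the existing block are the changed comment, the changed column-existence check (`is_test_account` instead of `rating`), and the new `COALESCE(u.is_test_account, FALSE) as is_test_account` column in the SELECT. Everything else stays identical.

Caveat: in PostgreSQL, `information_schema.columns` does not appear to list materialized-view columns, so this existence check may always succeed and rebuild the view on every startup. If that cost matters, consider checking `pg_attribute` instead.
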
- [ ] **Step 4: Start the server to run migrations**

Run (in another terminal, with Postgres + Redis up):

```bash
cd /home/alee/Sources/golfgame
python server/main.py
```

Expected: the server starts cleanly, with no errors mentioning `is_test_account`, `marks_as_test`, or `leaderboard_overall`.

- [ ] **Step 5: Verify schema via psql**

Connect to the dev database and confirm:

```bash
psql -d golfgame -c "\d users_v2" | grep is_test_account
psql -d golfgame -c "\d invite_codes" | grep marks_as_test
psql -d golfgame -c "\d leaderboard_overall" | grep is_test_account
psql -d golfgame -c "\di idx_users_v2_is_test_account"
```

Expected: all four commands return matching rows.

- [ ] **Step 6: Commit**

```bash
git add server/stores/user_store.py
git commit -m "$(cat <<'EOF'
feat(server): add is_test_account + marks_as_test schema

New columns support separating soak-harness test traffic from real
user traffic in stats queries. Rebuilds leaderboard_overall matview
to include is_test_account so the fast path stays filterable.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```

---

### Task 2: Propagate `is_test_account` through `User` model and `user_store`

Wire the new column into the `User` dataclass, `create_user` signature, `_row_to_user` mapping, and every SELECT list that already pulls user columns.

**Files:**
- Modify: `server/models/user.py`: `User` dataclass (L22–L68) + `to_dict` (L82–L116) + `from_dict` (L118+)
- Modify: `server/stores/user_store.py`: `create_user` (L454–L501), `_row_to_user` (L997–L1020), `get_user_by_id`/`get_user_by_username`/`get_user_by_email` SELECT lists (L503–L570)

- [ ] **Step 1: Add `is_test_account` to the `User` dataclass**

In `server/models/user.py`, add a new field to the `User` dataclass (after `force_password_reset` on L68):

```python
is_test_account: bool = False
```

Update the docstring `Attributes:` block around L45 to include:

```
is_test_account: True for accounts created by the soak test harness.
```

- [ ] **Step 2: Include `is_test_account` in `to_dict` and `from_dict`**

In `User.to_dict` at L82, add to the `d` dict (after `force_password_reset`):

```python
"is_test_account": self.is_test_account,
```

In `User.from_dict`, add the corresponding parse. Find where `force_password_reset` is parsed and follow the same pattern:

```python
is_test_account=d.get("is_test_account", False),
```

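
The point of the `d.get("is_test_account", False)` default is that dictionaries serialized before this change still deserialize cleanly. A minimal sketch of that round trip follows; the two-field `User` here is a hypothetical stand-in for the real dataclass, which has many more fields.

```python
from dataclasses import dataclass


@dataclass
class User:
    """Hypothetical two-field stand-in for the real User dataclass."""

    username: str
    is_test_account: bool = False

    def to_dict(self) -> dict:
        return {
            "username": self.username,
            "is_test_account": self.is_test_account,
        }

    @classmethod
    def from_dict(cls, d: dict) -> "User":
        # .get() with a default keeps payloads written before the
        # migration (no is_test_account key) loadable.
        return cls(
            username=d["username"],
            is_test_account=d.get("is_test_account", False),
        )


if __name__ == "__main__":
    legacy = User.from_dict({"username": "alice"})
    flagged = User.from_dict(User("soaktest_1", True).to_dict())
    print(legacy, flagged)
```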
- [ ] **Step 3: Add `is_test_account` parameter to `create_user`**

In `server/stores/user_store.py` at L454, add a new parameter:

```python
async def create_user(
    self,
    username: str,
    password_hash: str,
    email: Optional[str] = None,
    role: UserRole = UserRole.USER,
    guest_id: Optional[str] = None,
    verification_token: Optional[str] = None,
    verification_expires: Optional[datetime] = None,
    is_test_account: bool = False,
) -> Optional[User]:
```

Update the docstring to add a line in `Args:` describing `is_test_account`.

Change the INSERT SQL block to include the new column:

```python
row = await conn.fetchrow(
    """
    INSERT INTO users_v2 (username, password_hash, email, role, guest_id,
                          verification_token, verification_expires,
                          is_test_account)
    VALUES ($1, $2, $3, $4, $5, $6, $7, $8)
    RETURNING id, username, email, password_hash, role, email_verified,
              verification_token, verification_expires, reset_token, reset_expires,
              guest_id, deleted_at, preferences, created_at, last_login, last_seen_at,
              is_active, is_banned, ban_reason, force_password_reset, is_test_account
    """,
    username,
    password_hash,
    email,
    role.value,
    guest_id,
    verification_token,
    verification_expires,
    is_test_account,
)
```

- [ ] **Step 4: Update `_row_to_user` mapping**

In `server/stores/user_store.py` at L997, add to the `User(...)` call (after `force_password_reset`):

```python
is_test_account=row.get("is_test_account", False) or False,
```

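
The `.get(..., False) or False` idiom covers two distinct cases: a query that does not select the column at all, and a row where the column is SQL `NULL` (which arrives as `None`). A small sketch, using plain dicts as a stand-in for database rows and a hypothetical helper name `read_test_flag`:

```python
def read_test_flag(row) -> bool:
    """Normalize the flag to a real bool for missing, NULL, or set values."""
    # .get() handles a missing key; `or False` turns None into False.
    return row.get("is_test_account", False) or False


if __name__ == "__main__":
    print(read_test_flag({}))                         # column not selected
    print(read_test_flag({"is_test_account": None}))  # column is NULL
    print(read_test_flag({"is_test_account": True}))  # flagged account
```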
- [ ] **Step 5: Update all other SELECT lists in user_store**

Find every query in `server/stores/user_store.py` that returns a full user row and passes it to `_row_to_user`. Add `is_test_account` to the SELECT column list for each. Grep to find them:

```bash
grep -n "is_active, is_banned, ban_reason, force_password_reset" server/stores/user_store.py
```

For each match, append `, is_test_account` to the SELECT list. Expected locations:
- `create_user` INSERT ... RETURNING (already updated in Step 3)
- `get_user_by_id` at L503
- `get_user_by_username` at L519
- `get_user_by_email` (find it)
- Any other `SELECT ... FROM users_v2` query whose row is passed to `_row_to_user`

- [ ] **Step 6: Restart server, verify no errors**

```bash
# Kill and restart the dev server
python server/main.py
```

Expected: server starts cleanly. Any query that touches users now returns `is_test_account` correctly.

- [ ] **Step 7: Smoke test via curl**

```bash
# Register a throwaway test user (no invite code needed if DAILY_OPEN_SIGNUPS > 0 locally,
# or use the 5VC2MCCN invite code if INVITE_ONLY=true)
# Set PW to any password of your choice (>= 8 chars).
PW='SomeTestPw_1!'
curl -sX POST http://localhost:8000/api/auth/register \
  -H 'Content-Type: application/json' \
  -d "{\"username\":\"soaktest_smoke1\",\"password\":\"$PW\",\"email\":\"soaktest_smoke1@example.com\",\"invite_code\":\"5VC2MCCN\"}"
```

Expected: HTTP 200 with `{"user":{...},"token":"..."}`. The registration path now runs through the new column without errors even though the value is still always FALSE at this stage.

- [ ] **Step 8: Commit**

```bash
git add server/models/user.py server/stores/user_store.py
git commit -m "$(cat <<'EOF'
feat(server): propagate is_test_account through User model & store

User dataclass, create_user, and all SELECT lists now round-trip the
new column. Value is always FALSE until Task 4 wires the register
flow to the invite code's marks_as_test flag.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```

---

### Task 3: Expose `marks_as_test` on `InviteCode` and add lookup helper

`validate_invite_code` currently returns a bare bool. We need a new helper that returns the invite row, including `marks_as_test`, so the register flow can read the flag once the code has been validated.

**Files:**
- Modify: `server/services/admin_service.py`: `InviteCode` dataclass (L115–L138), `get_invite_codes` SELECT (L1106–L1141), add new `get_invite_code_details` method

- [ ] **Step 1: Add `marks_as_test` field to `InviteCode` dataclass**

In `server/services/admin_service.py` at L115:

```python
@dataclass
class InviteCode:
    """Invite code details."""
    code: str
    created_by: str
    created_by_username: str
    created_at: datetime
    expires_at: datetime
    max_uses: int
    use_count: int
    is_active: bool
    marks_as_test: bool = False
```

Update `to_dict` at L127 to include the field:

```python
def to_dict(self) -> dict:
    return {
        "code": self.code,
        "created_by": self.created_by,
        "created_by_username": self.created_by_username,
        "created_at": self.created_at.isoformat() if self.created_at else None,
        "expires_at": self.expires_at.isoformat() if self.expires_at else None,
        "max_uses": self.max_uses,
        "use_count": self.use_count,
        "is_active": self.is_active,
        "remaining_uses": max(0, self.max_uses - self.use_count),
        "marks_as_test": self.marks_as_test,
    }
```

- [ ] **Step 2: Update `get_invite_codes` SELECT to include `marks_as_test`**

Find `get_invite_codes` at L1106. Modify the SQL to pull the column and pass it through:

```python
async def get_invite_codes(self, include_expired: bool = False) -> List[InviteCode]:
    """List all invite codes."""
    async with self.pool.acquire() as conn:
        sql = """
            SELECT c.code, c.created_by, u.username as created_by_username,
                   c.created_at, c.expires_at,
                   c.max_uses, c.use_count, c.is_active,
                   COALESCE(c.marks_as_test, FALSE) as marks_as_test
            FROM invite_codes c
            LEFT JOIN users_v2 u ON c.created_by = u.id
        """
```

Find the list comprehension that constructs `InviteCode(...)` objects and add the new kwarg:

```python
InviteCode(
    code=row["code"],
    created_by=str(row["created_by"]),
    created_by_username=row["created_by_username"] or "unknown",
    created_at=row["created_at"].replace(tzinfo=timezone.utc) if row["created_at"] else None,
    expires_at=row["expires_at"].replace(tzinfo=timezone.utc) if row["expires_at"] else None,
    max_uses=row["max_uses"],
    use_count=row["use_count"],
    is_active=row["is_active"],
    marks_as_test=row["marks_as_test"],
)
```

- [ ] **Step 3: Add new `get_invite_code_details` method**

Add a new method between `validate_invite_code` (around L1214) and `use_invite_code` that returns the row with `marks_as_test`. The register flow will call this to resolve the flag:

```python
async def get_invite_code_details(self, code: str) -> Optional[dict]:
    """
    Look up an invite code's row including marks_as_test.

    Returns None if the code does not exist. Does NOT validate expiry
    or usage; use validate_invite_code for that. This is purely a
    helper for the register flow to discover the test-seed flag.
    """
    async with self.pool.acquire() as conn:
        row = await conn.fetchrow(
            """
            SELECT code, max_uses, use_count, is_active,
                   COALESCE(marks_as_test, FALSE) as marks_as_test
            FROM invite_codes
            WHERE code = $1
            """,
            code,
        )
        if not row:
            return None
        return {
            "code": row["code"],
            "max_uses": row["max_uses"],
            "use_count": row["use_count"],
            "is_active": row["is_active"],
            "marks_as_test": row["marks_as_test"],
        }
```

- [ ] **Step 4: Verify with curl via admin panel endpoint**

Assuming you have an admin token from a local dev user, hit the existing admin invites listing:

```bash
# Replace TOKEN with a valid admin JWT
curl -s http://localhost:8000/api/admin/invites \
  -H "Authorization: Bearer $TOKEN" | jq '.codes[0]'
```

Expected: response includes `"marks_as_test": false` on at least one code.

- [ ] **Step 5: Commit**

```bash
git add server/services/admin_service.py
git commit -m "$(cat <<'EOF'
feat(server): expose marks_as_test on InviteCode

Adds the field to the dataclass, SELECT list in get_invite_codes,
and a new get_invite_code_details helper that the register flow
will use to discover whether an invite should flag new accounts
as test accounts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```

---

### Task 4: Wire register flow to set `is_test_account` from invite

When a user registers with an invite whose `marks_as_test=TRUE`, the new account is flagged. The plumbing lives in two places: the router reads the flag and passes it to the service; the service passes it to the store.

**Files:**
- Modify: `server/routers/auth.py`: `register` handler (L224–L320)
- Modify: `server/services/auth_service.py`: `register` method (L98–L178)

- [ ] **Step 1: Add `is_test_account` parameter to `auth_service.register`**

In `server/services/auth_service.py` at L98, add the new parameter:

```python
async def register(
    self,
    username: str,
    password: str,
    email: Optional[str] = None,
    guest_id: Optional[str] = None,
    is_test_account: bool = False,
) -> RegistrationResult:
```

Update the docstring `Args:` block:

```
is_test_account: Mark this user as a soak-harness test account.
```

Pass the value through to `create_user` at L146:

```python
user = await self.user_store.create_user(
    username=username,
    password_hash=password_hash,
    email=email,
    role=UserRole.USER,
    guest_id=guest_id,
    verification_token=verification_token,
    verification_expires=verification_expires,
    is_test_account=is_test_account,
)
```

- [ ] **Step 2: Update the router to resolve `marks_as_test` and pass it through**

In `server/routers/auth.py`, find the `register` handler at L224. After the existing invite-code validation block (around L248–L252), fetch the invite details and compute `is_test_account`:

```python
# --- Invite code validation ---
is_test_account = False
if has_invite:
    if not _admin_service:
        raise HTTPException(status_code=503, detail="Admin service not initialized")
    if not await _admin_service.validate_invite_code(request_body.invite_code):
        raise HTTPException(status_code=400, detail="Invalid or expired invite code")
    # Check if this invite flags new accounts as test accounts
    invite_details = await _admin_service.get_invite_code_details(request_body.invite_code)
    if invite_details and invite_details.get("marks_as_test"):
        is_test_account = True
```

Then pass it to `auth_service.register` at L276:

```python
# --- Create the account ---
result = await auth_service.register(
    username=request_body.username,
    password=request_body.password,
    email=request_body.email,
    is_test_account=is_test_account,
)
```

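
The decision logic in this step can be exercised in isolation. The sketch below is an assumption-laden illustration, not the project's router: `FakeAdminService` is a hypothetical in-memory stand-in for the two admin-service methods, and `resolve_is_test_account` is a hypothetical helper that mirrors the validate-then-lookup order, raising `ValueError` where the real handler raises `HTTPException`.

```python
import asyncio
from typing import Optional


class FakeAdminService:
    """Hypothetical in-memory stand-in for the invite-code methods."""

    def __init__(self, codes: dict):
        self._codes = codes

    async def validate_invite_code(self, code: str) -> bool:
        row = self._codes.get(code)
        return bool(row and row["is_active"] and row["use_count"] < row["max_uses"])

    async def get_invite_code_details(self, code: str) -> Optional[dict]:
        return self._codes.get(code)


async def resolve_is_test_account(admin, invite_code: Optional[str]) -> bool:
    """Validate the invite first, then read its marks_as_test flag."""
    if not invite_code:
        return False  # invite-less signups are never flagged
    if not await admin.validate_invite_code(invite_code):
        raise ValueError("Invalid or expired invite code")
    details = await admin.get_invite_code_details(invite_code)
    return bool(details and details.get("marks_as_test"))


def make_admin() -> FakeAdminService:
    return FakeAdminService({
        "SOAKTEST": {"is_active": True, "use_count": 0, "max_uses": 100, "marks_as_test": True},
        "NORMAL01": {"is_active": True, "use_count": 0, "max_uses": 10, "marks_as_test": False},
    })


if __name__ == "__main__":
    admin = make_admin()
    for code in ("SOAKTEST", "NORMAL01", None):
        print(code, asyncio.run(resolve_is_test_account(admin, code)))
```

Keeping the flag lookup after validation means an expired or exhausted test-seed invite can never flag an account.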
- [ ] **Step 3: Flag the dev invite code for testing**

Before we can test end-to-end locally, we need an invite code with `marks_as_test=TRUE` in the local dev DB. Run (once, manually):

```bash
# First, check if 5VC2MCCN exists locally (it probably doesn't; that's staging's code).
# Create a local test invite code and flag it:
psql -d golfgame <<'EOF'
-- Create a local dev test-seed invite if not exists
INSERT INTO invite_codes (code, created_by, expires_at, max_uses, is_active, marks_as_test)
SELECT 'SOAKTEST', id, NOW() + INTERVAL '10 years', 100, TRUE, TRUE
FROM users_v2 WHERE role = 'admin' LIMIT 1
ON CONFLICT (code) DO UPDATE SET marks_as_test = TRUE;

-- Verify
SELECT code, max_uses, use_count, marks_as_test FROM invite_codes WHERE code = 'SOAKTEST';
EOF
```

Expected: `marks_as_test | t` in the last row.

- [ ] **Step 4: Verify register flow sets `is_test_account`**

Restart the dev server, then run the following (with `PW` still set from Task 2, Step 7):

```bash
curl -sX POST http://localhost:8000/api/auth/register \
  -H 'Content-Type: application/json' \
  -d "{\"username\":\"soaktest_register1\",\"password\":\"$PW\",\"email\":\"soaktest_register1@example.com\",\"invite_code\":\"SOAKTEST\"}"

# Verify in DB
psql -d golfgame -c "SELECT username, is_test_account FROM users_v2 WHERE username = 'soaktest_register1';"
```

Expected: `is_test_account | t`.

- [ ] **Step 5: Verify non-test invite does NOT flag new accounts**

```bash
# Create a non-test invite
psql -d golfgame <<'EOF'
INSERT INTO invite_codes (code, created_by, expires_at, max_uses, is_active, marks_as_test)
SELECT 'NORMAL01', id, NOW() + INTERVAL '10 years', 10, TRUE, FALSE
FROM users_v2 WHERE role = 'admin' LIMIT 1
ON CONFLICT (code) DO UPDATE SET marks_as_test = FALSE;
EOF

curl -sX POST http://localhost:8000/api/auth/register \
  -H 'Content-Type: application/json' \
  -d "{\"username\":\"realuser_smoke1\",\"password\":\"$PW\",\"email\":\"realuser_smoke1@example.com\",\"invite_code\":\"NORMAL01\"}"

psql -d golfgame -c "SELECT username, is_test_account FROM users_v2 WHERE username = 'realuser_smoke1';"
```

Expected: `is_test_account | f`.

- [ ] **Step 6: Commit**

```bash
git add server/routers/auth.py server/services/auth_service.py
git commit -m "$(cat <<'EOF'
feat(server): register flow flags accounts from test-seed invites

When a user registers with an invite_code whose marks_as_test=TRUE,
their users_v2.is_test_account is set to TRUE. Normal invite codes
and invite-less signups are unaffected.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```

---

### Task 5: Stats filtering (`include_test` parameter)

Thread an `include_test: bool = False` parameter through `get_leaderboard`, `get_player_rank`, and the corresponding router handlers. The default is `False`, so real users never see soak traffic.

**Files:**
- Modify: `server/services/stats_service.py`: `get_leaderboard` (L169), `get_player_rank` (L249)
- Modify: `server/routers/stats.py`: `get_leaderboard` route (L157), `get_player_rank` route (L227), `get_my_rank` route (L348)

- [ ] **Step 1: Add `include_test` to `get_leaderboard` service method**

In `server/services/stats_service.py` at L169:

```python
async def get_leaderboard(
    self,
    metric: str = "wins",
    limit: int = 50,
    offset: int = 0,
    include_test: bool = False,
) -> List[LeaderboardEntry]:
```

Inside the method, find both SQL paths (materialized view and fallback). In the view path at L208, change the WHERE clause:

```python
if view_exists:
    # Use materialized view for performance
    rows = await conn.fetch(f"""
        SELECT
            user_id, username, games_played, games_won,
            win_rate, avg_score, knockouts, best_win_streak,
            COALESCE(rating, 1500) as rating,
            ROW_NUMBER() OVER (ORDER BY {column} {direction}) as rank
        FROM leaderboard_overall
        WHERE ($3 OR NOT is_test_account)
        ORDER BY {column} {direction}
        LIMIT $1 OFFSET $2
    """, limit, offset, include_test)
```

In the fallback path at L220, add the WHERE clause and parameter:

```python
else:
    # Fall back to direct query
    rows = await conn.fetch(f"""
        SELECT
            s.user_id, u.username, s.games_played, s.games_won,
            ROUND(s.games_won::numeric / NULLIF(s.games_played, 0) * 100, 1) as win_rate,
            ROUND(s.total_points::numeric / NULLIF(s.total_rounds, 0), 1) as avg_score,
            s.knockouts, s.best_win_streak,
            COALESCE(s.rating, 1500) as rating,
            ROW_NUMBER() OVER (ORDER BY {column} {direction}) as rank
        FROM player_stats s
        JOIN users_v2 u ON s.user_id = u.id
        WHERE s.games_played >= 5
        AND u.deleted_at IS NULL
        AND (u.is_banned = false OR u.is_banned IS NULL)
        AND ($3 OR NOT COALESCE(u.is_test_account, FALSE))
        ORDER BY {column} {direction}
        LIMIT $1 OFFSET $2
    """, limit, offset, include_test)
```

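
The `($3 OR NOT is_test_account)` predicate is a single-query toggle: when the parameter is true every row passes, and when it is false only non-test rows pass. The sketch below reproduces that behaviour with Python's built-in `sqlite3`, purely as an illustration. The table layout and sample rows are invented, and SQLite uses `?` placeholders where asyncpg uses `$1`-style ones.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Build a tiny in-memory stand-in for leaderboard_overall."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE leaderboard (username TEXT, games_won INTEGER, is_test_account INTEGER)"
    )
    conn.executemany(
        "INSERT INTO leaderboard VALUES (?, ?, ?)",
        [("alice", 7, 0), ("soaktest_1", 8, 1), ("bob", 3, 0)],
    )
    return conn


def leaderboard(conn: sqlite3.Connection, include_test: bool) -> list:
    # Same shape as the plan's filter: the bound flag short-circuits the check.
    rows = conn.execute(
        "SELECT username FROM leaderboard "
        "WHERE (? OR NOT is_test_account) "
        "ORDER BY games_won DESC",
        (include_test,),
    ).fetchall()
    return [r[0] for r in rows]


if __name__ == "__main__":
    conn = make_db()
    print(leaderboard(conn, include_test=False))
    print(leaderboard(conn, include_test=True))
```

Binding the flag as a parameter, rather than building two different SQL strings, keeps both code paths on one query text.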
- [ ] **Step 2: Apply the same pattern to `get_player_rank`**

In `server/services/stats_service.py` at L249:

```python
async def get_player_rank(
    self,
    user_id: str,
    metric: str = "wins",
    include_test: bool = False,
) -> Optional[int]:
```

Update both SQL paths to include the `include_test` filter. View path at L287:

```python
if view_exists:
    row = await conn.fetchrow(f"""
        SELECT rank FROM (
            SELECT user_id, ROW_NUMBER() OVER (ORDER BY {column} {direction}) as rank
            FROM leaderboard_overall
            WHERE ($2 OR NOT is_test_account)
        ) ranked
        WHERE user_id = $1
    """, user_id, include_test)
```

Fallback path at L294:

```python
else:
    row = await conn.fetchrow(f"""
        SELECT rank FROM (
            SELECT s.user_id, ROW_NUMBER() OVER (ORDER BY {column} {direction}) as rank
            FROM player_stats s
            JOIN users_v2 u ON s.user_id = u.id
            WHERE s.games_played >= 5
            AND u.deleted_at IS NULL
            AND (u.is_banned = false OR u.is_banned IS NULL)
            AND ($2 OR NOT COALESCE(u.is_test_account, FALSE))
        ) ranked
        WHERE user_id = $1
    """, user_id, include_test)
```

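
Because the filter sits inside the ranked subquery, test accounts are removed before `ROW_NUMBER()` runs, so real users keep contiguous ranks and a hidden test account has no rank at all. The following sketch shows that effect with `sqlite3` (window functions need SQLite 3.25 or newer). The table and rows are invented for illustration, and the ordering column is fixed to `games_won` rather than interpolated.

```python
import sqlite3
from typing import Optional


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE leaderboard (user_id TEXT, games_won INTEGER, is_test_account INTEGER)"
    )
    conn.executemany(
        "INSERT INTO leaderboard VALUES (?, ?, ?)",
        [("alice", 7, 0), ("soaktest_1", 8, 1), ("bob", 3, 0)],
    )
    return conn


def player_rank(conn, user_id: str, include_test: bool) -> Optional[int]:
    # Filter first, rank second: hidden rows never consume a rank slot.
    row = conn.execute(
        """
        SELECT rank FROM (
            SELECT user_id,
                   ROW_NUMBER() OVER (ORDER BY games_won DESC) AS rank
            FROM leaderboard
            WHERE (? OR NOT is_test_account)
        ) ranked
        WHERE user_id = ?
        """,
        (include_test, user_id),
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    conn = make_db()
    print(player_rank(conn, "alice", include_test=False))
    print(player_rank(conn, "alice", include_test=True))
    print(player_rank(conn, "soaktest_1", include_test=False))
```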
- [ ] **Step 3: Expose `include_test` as a query parameter on the leaderboard route**

In `server/routers/stats.py` at L157:

```python
@router.get("/leaderboard", response_model=LeaderboardResponse)
async def get_leaderboard(
    metric: str = Query("wins", pattern="^(wins|win_rate|avg_score|knockouts|streak|rating)$"),
    limit: int = Query(50, ge=1, le=100),
    offset: int = Query(0, ge=0),
    include_test: bool = Query(False, description="Include soak-harness test accounts"),
    service: StatsService = Depends(get_stats_service_dep),
):
    """
    Get leaderboard by metric.

    Metrics:
    - wins: Total games won
    - win_rate: Win percentage (requires 5+ games)
    - avg_score: Average points per round (lower is better)
    - knockouts: Times going out first
    - streak: Best win streak

    Players must have 5+ games to appear on leaderboards.
    By default, soak-harness test accounts are hidden.
    """
    entries = await service.get_leaderboard(metric, limit, offset, include_test)
```

- [ ] **Step 4: Same for `get_player_rank` and `get_my_rank` routes**

At L227:

```python
@router.get("/players/{user_id}/rank", response_model=PlayerRankResponse)
async def get_player_rank(
    user_id: str,
    metric: str = Query("wins", pattern="^(wins|win_rate|avg_score|knockouts|streak|rating)$"),
    include_test: bool = Query(False),
    service: StatsService = Depends(get_stats_service_dep),
):
    """Get player's rank on a leaderboard."""
    rank = await service.get_player_rank(user_id, metric, include_test)
```

At L348:

```python
@router.get("/me/rank", response_model=PlayerRankResponse)
async def get_my_rank(
    metric: str = Query("wins", pattern="^(wins|win_rate|avg_score|knockouts|streak|rating)$"),
    include_test: bool = Query(False),
    user: User = Depends(require_user),
    service: StatsService = Depends(get_stats_service_dep),
):
    """Get current user's rank on a leaderboard."""
    rank = await service.get_player_rank(user.id, metric, include_test)
```

- [ ] **Step 5: Verify filtering works via curl**

```bash
# Give the test user registered in Task 4 some synthetic game stats
psql -d golfgame <<'EOF'
INSERT INTO player_stats (user_id, games_played, games_won, total_points, total_rounds, rounds_won)
SELECT id, 10, 8, 50, 30, 20 FROM users_v2 WHERE username = 'soaktest_register1'
ON CONFLICT (user_id) DO UPDATE SET games_played = 10, games_won = 8;

-- Refresh the matview so the test account shows up
REFRESH MATERIALIZED VIEW leaderboard_overall;
EOF

# Default (include_test=false) should NOT include soaktest_register1
curl -s "http://localhost:8000/api/stats/leaderboard?metric=wins" | jq '.entries[] | select(.username | startswith("soaktest_"))'

# include_test=true should include soaktest_register1
curl -s "http://localhost:8000/api/stats/leaderboard?metric=wins&include_test=true" | jq '.entries[] | select(.username | startswith("soaktest_"))'
```

Expected: the first command returns nothing; the second returns a JSON object for `soaktest_register1`.

- [ ] **Step 6: Commit**

```bash
git add server/services/stats_service.py server/routers/stats.py
git commit -m "$(cat <<'EOF'
feat(server): stats queries support include_test filter

Leaderboard and rank queries take an optional include_test param
(default false). Real users never see soak-harness traffic unless
they explicitly opt in via ?include_test=true.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```

---

### Task 6: Admin service + route surfaces `is_test_account`

`UserDetails` exposes the flag, `search_users` selects it, and the `list_users` admin route accepts an `include_test` query parameter.

**Files:**
- Modify: `server/services/admin_service.py`: `UserDetails` (L24–L58), `search_users` (L312–L382), `get_user` (L384–L428)
- Modify: `server/routers/admin.py`: `list_users` route (L80–L107)

- [ ] **Step 1: Add field to `UserDetails` dataclass**

In `server/services/admin_service.py` at L24, add to the dataclass:

```python
@dataclass
class UserDetails:
    """Extended user info for admin view."""
    id: str
    username: str
    email: Optional[str]
    role: str
    email_verified: bool
    is_banned: bool
    ban_reason: Optional[str]
    force_password_reset: bool
    created_at: datetime
    last_login: Optional[datetime]
    last_seen_at: Optional[datetime]
    is_active: bool
    games_played: int
    games_won: int
    is_test_account: bool = False
```

Update `to_dict` to include it:

```python
    def to_dict(self) -> dict:
        return {
            "id": self.id,
            "username": self.username,
            "email": self.email,
            "role": self.role,
            "email_verified": self.email_verified,
            "is_banned": self.is_banned,
            "ban_reason": self.ban_reason,
            "force_password_reset": self.force_password_reset,
            "created_at": self.created_at.isoformat() if self.created_at else None,
            "last_login": self.last_login.isoformat() if self.last_login else None,
            "last_seen_at": self.last_seen_at.isoformat() if self.last_seen_at else None,
            "is_active": self.is_active,
            "games_played": self.games_played,
            "games_won": self.games_won,
            "is_test_account": self.is_test_account,
        }
```

- [ ] **Step 2: Update `search_users` to SELECT and filter on `is_test_account`**

In `server/services/admin_service.py` at L312, add an `include_test` parameter to the signature (the new column is added to the SELECT next):

```python
    async def search_users(
        self,
        query: str = "",
        limit: int = 50,
        offset: int = 0,
        include_banned: bool = True,
        include_deleted: bool = False,
        include_test: bool = True,
    ) -> List[UserDetails]:
```

Modify the SQL to pull `is_test_account`:

```python
        sql = """
            SELECT u.id, u.username, u.email, u.role,
                   u.email_verified, u.is_banned, u.ban_reason,
                   u.force_password_reset, u.created_at, u.last_login,
                   u.last_seen_at, u.is_active,
                   COALESCE(u.is_test_account, FALSE) as is_test_account,
                   COALESCE(s.games_played, 0) as games_played,
                   COALESCE(s.games_won, 0) as games_won
            FROM users_v2 u
            LEFT JOIN player_stats s ON u.id = s.user_id
            WHERE 1=1
        """
```

After the existing `include_deleted` check, add:

```python
        if not include_test:
            sql += " AND (u.is_test_account = false OR u.is_test_account IS NULL)"
```

Update the `UserDetails(...)` construction in the list comprehension to include `is_test_account=row["is_test_account"]`.

- [ ] **Step 3: Update `get_user` (single-user lookup) similarly**

In `server/services/admin_service.py` at L384, add `COALESCE(u.is_test_account, FALSE) as is_test_account` to the SELECT and `is_test_account=row["is_test_account"]` to the `UserDetails(...)` construction. The `get_user` method does NOT need the filter parameter — admins looking up individual users should always see them.

- [ ] **Step 4: Add `include_test` to the admin `list_users` route**

In `server/routers/admin.py` at L80:

```python
@router.get("/users")
async def list_users(
    query: str = "",
    limit: int = 50,
    offset: int = 0,
    include_banned: bool = True,
    include_deleted: bool = False,
    include_test: bool = True,
    admin: User = Depends(require_admin_v2),
    service: AdminService = Depends(get_admin_service_dep),
):
    """
    Search and list users.

    Args:
        query: Search by username or email.
        limit: Maximum results to return.
        offset: Results to skip.
        include_banned: Include banned users.
        include_deleted: Include soft-deleted users.
        include_test: Include soak-harness test accounts (default true for admins).
    """
    users = await service.search_users(
        query=query,
        limit=limit,
        offset=offset,
        include_banned=include_banned,
        include_deleted=include_deleted,
        include_test=include_test,
    )
    return {"users": [u.to_dict() for u in users]}
```

Note: default is `True` for the admin path — admins should see everything by default. The client-side toggle will explicitly pass `false` when the admin wants to hide test accounts.

- [ ] **Step 5: Verify via curl**

```bash
# Assuming admin token in $TOKEN env var
curl -s "http://localhost:8000/api/admin/users?query=soaktest" \
  -H "Authorization: Bearer $TOKEN" | jq '.users[] | {username, is_test_account}'

curl -s "http://localhost:8000/api/admin/users?query=soaktest&include_test=false" \
  -H "Authorization: Bearer $TOKEN" | jq '.users[]'
```

Expected: the first command lists the soaktest users with `is_test_account: true`; the second prints nothing, because the test accounts are filtered out.

- [ ] **Step 6: Commit**

```bash
git add server/services/admin_service.py server/routers/admin.py
git commit -m "$(cat <<'EOF'
feat(server): admin users list surfaces is_test_account

UserDetails carries the new column, search_users selects and
optionally filters on it, and the /api/admin/users route accepts
?include_test=false to hide soak-harness accounts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```

---

### Task 7: Admin panel UI — Test badge and filter toggle

Add a visible `[Test]` badge on test accounts in the admin user list, a `[Test-seed]` indicator on invite codes that mark new accounts as test, and an "Include test accounts" checkbox next to the existing "Include banned" toggle.

**Files:**
- Modify: `client/admin.html` — add the new toggle near the existing `#include-banned` checkbox
- Modify: `client/admin.js` — `loadUsers` (L305), `getStatusBadge` (L246), the invite codes renderer (L443)

- [ ] **Step 1: Add the "Include test accounts" checkbox to admin.html**

In `client/admin.html`, locate the existing `#include-banned` checkbox in the users tab filter bar:

```bash
grep -n "include-banned" client/admin.html
```

Add a sibling checkbox right after that line. It starts checked so the UI matches the server default (`include_test=True`) and the manual check in Step 6, which expects test accounts to be visible until the box is unchecked:

```html
<label>
    <input type="checkbox" id="include-test" checked />
    Include test accounts
</label>
```

- [ ] **Step 2: Read the new checkbox in `loadUsers` and pass to getUsers**

In `client/admin.js` at L305:

```javascript
async function loadUsers() {
    try {
        const query = document.getElementById('user-search').value;
        const includeBanned = document.getElementById('include-banned').checked;
        const includeTest = document.getElementById('include-test').checked;
        const data = await getUsers(query, usersPage * PAGE_SIZE, includeBanned, includeTest);
```

Find `getUsers` at L70 and add the new parameter:

```javascript
async function getUsers(query = '', offset = 0, includeBanned = true, includeTest = true) {
    const params = new URLSearchParams({
        query,
        limit: PAGE_SIZE,
        offset,
        include_banned: includeBanned,
        include_test: includeTest,
    });
    return apiRequest(`/api/admin/users?${params}`);
}
```

Note: the existing signature builds a URLSearchParams — check the actual code at L70 and match its style; the key change is adding `include_test: includeTest` to the params.

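To illustrate how the boolean flag ends up on the wire, the sketch below builds the same query string in TypeScript. `buildUserQuery` and the `PAGE_SIZE = 50` value are hypothetical stand-ins rather than names taken from `client/admin.js`; the only behavior relied on is that `URLSearchParams` serializes every value as a string, so an unchecked box is sent as `include_test=false`.

```typescript
// Hypothetical helper mirroring the getUsers() parameter list; not the real client code.
const PAGE_SIZE = 50; // assumed page size, for the example only

function buildUserQuery(
  query: string,
  offset: number,
  includeBanned: boolean,
  includeTest: boolean,
): string {
  // TypeScript's URLSearchParams typing expects string values, so convert explicitly.
  const params = new URLSearchParams({
    query,
    limit: String(PAGE_SIZE),
    offset: String(offset),
    include_banned: String(includeBanned),
    include_test: String(includeTest),
  });
  return `/api/admin/users?${params}`;
}

// Prints the URL a search for "soaktest" would request with test accounts hidden.
console.log(buildUserQuery('soaktest', 0, true, false));
```

The server side should accept the literal string `false` for a `bool` query parameter, which is what the curl check in Task 6 Step 5 already exercises.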
- [ ] **Step 3: Add a "Test" badge to the user table row**

In `client/admin.js` at L314, modify the table row template to render a Test badge inline with the status badge:

```javascript
data.users.forEach(user => {
    const testBadge = user.is_test_account
        ? '<span class="badge badge-info" title="Soak harness test account">Test</span>'
        : '';
    tbody.innerHTML += `
        <tr>
            <td>${escapeHtml(user.username)} ${testBadge}</td>
            <td>${escapeHtml(user.email || '-')}</td>
            <td><span class="badge badge-${user.role === 'admin' ? 'info' : 'muted'}">${user.role}</span></td>
            <td>${getStatusBadge(user)}</td>
            <td>${user.games_played} (${user.games_won} wins)</td>
            <td>${formatDateShort(user.created_at)}</td>
            <td>
                <button class="btn btn-small" data-action="view-user" data-id="${user.id}">View</button>
            </td>
        </tr>
    `;
});
```

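The badge logic is small enough to isolate as a pure function, which makes it easy to check without a browser. The following is only a sketch: `AdminUser`, `renderUsernameCell`, and this `escapeHtml` are hypothetical, and the real `escapeHtml` in `client/admin.js` may differ.

```typescript
// Hypothetical types and helpers for illustration; admin.js has its own escapeHtml.
interface AdminUser {
  username: string;
  is_test_account?: boolean;
}

function escapeHtml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Returns the username cell, with the Test badge appended only for flagged accounts.
function renderUsernameCell(user: AdminUser): string {
  const testBadge = user.is_test_account
    ? ' <span class="badge badge-info" title="Soak harness test account">Test</span>'
    : '';
  return `<td>${escapeHtml(user.username)}${testBadge}</td>`;
}

console.log(renderUsernameCell({ username: 'soaktest_register1', is_test_account: true }));
console.log(renderUsernameCell({ username: 'alice' }));
```

Treating a missing `is_test_account` as "not a test account" keeps the row rendering safe against older API responses that predate the new field.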
- [ ] **Step 4: Add Test-seed indicator to invite codes list**

In `client/admin.js` around L443 (invite codes list renderer), find the row template and add a `[Test-seed]` badge when `invite.marks_as_test`:

```bash
grep -n "invite.is_active\|invite.code\|invites-tbody\|invites-table" client/admin.js | head
```

Once located, modify the row template to include:

```javascript
const testSeedBadge = invite.marks_as_test
    ? '<span class="badge badge-info" title="Creates test accounts">Test-seed</span>'
    : '';
// Insert testSeedBadge into the invite code column, e.g.
// <td>${escapeHtml(invite.code)} ${testSeedBadge}</td>
```

- [ ] **Step 5: Wire the checkbox change event to reload users**

Find where `#include-banned` has its `change` listener attached (grep for it in admin.js):

```bash
grep -n "include-banned.*addEventListener\|include-banned" client/admin.js
```

Add a parallel listener for `#include-test` that calls `loadUsers()`:

```javascript
document.getElementById('include-test').addEventListener('change', () => {
    usersPage = 0;
    loadUsers();
});
```

- [ ] **Step 6: Manual verification in browser**

1. Open http://localhost:8000/admin.html
2. Log in as admin
3. Navigate to Users tab
4. Search for "soaktest"
5. Confirm the `[Test]` badge appears next to `soaktest_register1`
6. Uncheck "Include test accounts" — the row should disappear
7. Re-check it — the row should return
8. Navigate to Invite Codes tab
9. Confirm the `[Test-seed]` badge appears next to the `SOAKTEST` code

- [ ] **Step 7: Commit**

```bash
git add client/admin.html client/admin.js
git commit -m "$(cat <<'EOF'
feat(admin): visible Test/Test-seed badges + filter toggle

Users table shows [Test] next to soak-harness accounts, invite codes
list shows [Test-seed] next to codes that flag new accounts as test,
and a new "Include test accounts" checkbox lets admins hide bot
traffic from the user list.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```

---

### Task 8: Document the one-time staging setup step

The staging invite code `5VC2MCCN` needs to be flagged as test-seed before the harness can run against staging. This is a manual one-liner; document it in a new bring-up doc.

**Files:**
- Create: `docs/soak-harness-bringup.md`

- [ ] **Step 1: Create the bring-up doc**

````bash
cat > docs/soak-harness-bringup.md <<'EOF'
# Soak Harness Bring-Up

One-time setup steps before running `tests/soak` against an environment.

## Prerequisites

- An invite code exists with 16+ available uses
- You have psql access to the target DB (or admin SQL access via some other means)

## 1. Flag the invite code as test-seed

Any account registered with a `marks_as_test=TRUE` invite code gets
`users_v2.is_test_account=TRUE`, which keeps it out of real-user stats.

### Staging

Invite code: `5VC2MCCN` (16 uses, provisioned 2026-04-10).

```sql
UPDATE invite_codes SET marks_as_test = TRUE WHERE code = '5VC2MCCN';
SELECT code, max_uses, use_count, marks_as_test FROM invite_codes WHERE code = '5VC2MCCN';
```

Expected: `marks_as_test | t`.

### Local dev

The dev DB already has a `SOAKTEST` invite created during Task 4 of
the implementation plan. If you wiped the DB since, recreate it:

```sql
INSERT INTO invite_codes (code, created_by, expires_at, max_uses, is_active, marks_as_test)
SELECT 'SOAKTEST', id, NOW() + INTERVAL '10 years', 100, TRUE, TRUE
FROM users_v2 WHERE role = 'admin' LIMIT 1
ON CONFLICT (code) DO UPDATE SET marks_as_test = TRUE;
```

## 2. Run the harness

```bash
cd tests/soak
npm install
npm run seed                                   # first run only, populates .env.stresstest
TEST_URL=http://localhost:8000 npm run smoke   # 30s end-to-end check
```

For staging:

```bash
TEST_URL=https://staging.adlee.work npm run soak -- --scenario=populate
```

See `tests/soak/README.md` for the full flag reference.
EOF
````

- [ ] **Step 2: Commit**

```bash
git add docs/soak-harness-bringup.md
git commit -m "$(cat <<'EOF'
docs: soak harness bring-up steps

Documents the one-time UPDATE invite_codes SET marks_as_test = TRUE
step required before running tests/soak against each environment,
plus the local dev SOAKTEST invite recreation SQL.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```

---

## Phase 2 — Harness scaffolding

### Task 9: Create the `tests/soak/` package skeleton

Bare minimum to get `tsx` running against an empty entry point. No behavior yet.

**Files:**
- Create: `tests/soak/package.json`
- Create: `tests/soak/tsconfig.json`
- Create: `tests/soak/.gitignore`
- Create: `tests/soak/.env.stresstest.example`
- Create: `tests/soak/README.md` (stub)
- Create: `tests/soak/runner.ts` (stub — prints "hello")

- [ ] **Step 1: Create `tests/soak/package.json`**

```json
{
  "name": "golf-soak",
  "version": "0.1.0",
  "private": true,
  "description": "Multiplayer soak & UX test harness for Golf Card Game",
  "scripts": {
    "soak": "tsx runner.ts",
    "soak:populate": "tsx runner.ts --scenario=populate",
    "soak:stress": "tsx runner.ts --scenario=stress",
    "seed": "tsx scripts/seed-accounts.ts",
    "smoke": "bash scripts/smoke.sh",
    "test": "vitest run"
  },
  "dependencies": {
    "playwright-core": "^1.40.0",
    "ws": "^8.16.0"
  },
  "devDependencies": {
    "tsx": "^4.7.0",
    "@types/ws": "^8.5.0",
    "@types/node": "^20.10.0",
    "typescript": "^5.3.0",
    "vitest": "^1.2.0"
  }
}
```

- [ ] **Step 2: Create `tests/soak/tsconfig.json`**

```json
{
  "compilerOptions": {
    "target": "ES2022",
    "module": "commonjs",
    "moduleResolution": "node",
    "strict": true,
    "esModuleInterop": true,
    "skipLibCheck": true,
    "forceConsistentCasingInFileNames": true,
    "resolveJsonModule": true,
    "declaration": false,
    "sourceMap": true,
    "outDir": "./dist",
    "rootDir": ".",
    "baseUrl": ".",
    "lib": ["ES2022", "DOM"],
    "paths": {
      "@soak/*": ["./*"],
      "@bot/*": ["../e2e/bot/*"]
    }
  },
  "include": ["**/*.ts"],
  "exclude": ["node_modules", "dist", "artifacts"]
}
```

- [ ] **Step 3: Create `tests/soak/.gitignore`**

```
node_modules/
dist/
artifacts/
.env.stresstest
*.log
```

- [ ] **Step 4: Create `tests/soak/.env.stresstest.example`**

```
# Soak harness account cache.
# This file is AUTO-GENERATED on first run; do not edit by hand.
# Format: SOAK_ACCOUNT_NN=username:password:token
#
# Example (delete before first real run):
# SOAK_ACCOUNT_00=soak_00_a7bx:<generated-password>:<jwt-token>
```

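The `SOAK_ACCOUNT_NN=username:password:token` format can be read back with a few lines of TypeScript. The sketch below is an assumption about how a loader might parse it, not the plan's actual seeding code: `CachedAccount`, `parseAccountLine`, and `parseAccountFile` are hypothetical names, and the splitting rule (username ends at the first colon, token starts after the last colon) assumes that usernames and JWTs never contain a colon while passwords might.

```typescript
// Hypothetical shape for one cached account; field names are illustrative.
interface CachedAccount {
  key: string; // derived from the variable name, e.g. "soak_00" (assumption)
  username: string;
  password: string;
  token: string;
}

function parseAccountLine(line: string): CachedAccount | null {
  const trimmed = line.trim();
  // Skip blank lines and comments such as the commented-out example entry.
  if (trimmed === '' || trimmed.startsWith('#')) return null;
  const match = /^SOAK_ACCOUNT_(\d+)=(.*)$/.exec(trimmed);
  if (!match) return null;
  const [, index, value] = match;
  const first = value.indexOf(':');
  const last = value.lastIndexOf(':');
  if (first === -1 || first === last) return null; // need at least two colons
  return {
    key: `soak_${index}`,
    username: value.slice(0, first),
    password: value.slice(first + 1, last),
    token: value.slice(last + 1),
  };
}

function parseAccountFile(text: string): CachedAccount[] {
  return text
    .split(/\r?\n/)
    .map((line) => parseAccountLine(line))
    .filter((a): a is CachedAccount => a !== null);
}

// Made-up sample data; real files hold generated passwords and JWTs.
const sampleEnv = '# cache\nSOAK_ACCOUNT_00=soak_00_a7bx:s3cret:aaa.bbb.ccc\n';
console.log(parseAccountFile(sampleEnv));
```

Splitting on the first and last colon, rather than calling `split(':')`, keeps a password that happens to contain a colon intact.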
- [ ] **Step 5: Create `tests/soak/README.md` (stub — expanded in Task 31)**

````markdown
# Golf Soak & UX Test Harness

Runs 16 authenticated browser sessions across 4 rooms to populate
staging scoreboards and stress-test multiplayer stability.

**Spec:** `docs/superpowers/specs/2026-04-10-multiplayer-soak-test-design.md`
**Bring-up:** `docs/soak-harness-bringup.md`

## Quick start

```bash
npm install
npm run seed                                    # first run only
TEST_URL=http://localhost:8000 npm run smoke
```

Full documentation arrives with Task 31.
````

- [ ] **Step 6: Create `tests/soak/runner.ts` as a placeholder**

```typescript
#!/usr/bin/env tsx
/**
 * Golf Soak Harness — entry point.
 *
 * Placeholder. Full runner lands in Task 17.
 */

async function main(): Promise<void> {
  console.log('golf-soak runner (placeholder)');
  console.log('Full implementation lands in Task 17 of the plan.');
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
```

- [ ] **Step 7: Install deps and verify runner executes**

```bash
cd tests/soak
npm install
npx tsx runner.ts
```

Expected output:

```
golf-soak runner (placeholder)
Full implementation lands in Task 17 of the plan.
```

- [ ] **Step 8: Commit**

```bash
git add tests/soak/package.json tests/soak/package-lock.json tests/soak/tsconfig.json tests/soak/.gitignore tests/soak/.env.stresstest.example tests/soak/README.md tests/soak/runner.ts
git commit -m "$(cat <<'EOF'
feat(soak): scaffold tests/soak package

Placeholder runner, tsconfig with @bot alias to tests/e2e/bot,
gitignored .env.stresstest + artifacts. Real behavior follows
in Task 10 onward.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```

---

### Task 10: Core types and `Deferred` helper

Pure TypeScript with Vitest tests. No browser, no network. Establishes the type surface the rest of the harness will target.

**Files:**
- Create: `tests/soak/core/types.ts`
- Create: `tests/soak/core/deferred.ts`
- Create: `tests/soak/tests/deferred.test.ts`

- [ ] **Step 1: Write the failing test for `Deferred`**

Create `tests/soak/tests/deferred.test.ts`:

```typescript
import { describe, it, expect } from 'vitest';
import { deferred } from '../core/deferred';

describe('deferred', () => {
  it('resolves with the given value', async () => {
    const d = deferred<string>();
    d.resolve('hello');
    await expect(d.promise).resolves.toBe('hello');
  });

  it('rejects with the given error', async () => {
    const d = deferred<string>();
    const err = new Error('boom');
    d.reject(err);
    await expect(d.promise).rejects.toBe(err);
  });

  it('ignores second resolve calls', async () => {
    const d = deferred<number>();
    d.resolve(1);
    d.resolve(2);
    await expect(d.promise).resolves.toBe(1);
  });
});
```

- [ ] **Step 2: Run the test to verify it fails**

```bash
cd tests/soak
npx vitest run tests/deferred.test.ts
```

Expected: FAIL — module `../core/deferred` does not exist.

- [ ] **Step 3: Implement `deferred`**

Create `tests/soak/core/deferred.ts`:

```typescript
/**
 * Promise deferred primitive — lets external code resolve or reject
 * a promise. Used by RoomCoordinator for host→joiners handoff.
 */

export interface Deferred<T> {
  promise: Promise<T>;
  resolve(value: T): void;
  reject(error: unknown): void;
}

export function deferred<T>(): Deferred<T> {
  let resolve!: (value: T) => void;
  let reject!: (error: unknown) => void;
  const promise = new Promise<T>((res, rej) => {
    resolve = res;
    reject = rej;
  });
  return { promise, resolve, reject };
}
```

- [ ] **Step 4: Run tests to verify they pass**

```bash
npx vitest run tests/deferred.test.ts
```

Expected: 3 passed.

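As a rough picture of the host→joiners handoff that `Deferred` is meant for, the sketch below has three waiters block on one deferred until a "host" resolves it. The helper is repeated so the snippet runs standalone; `demoHandoff`, the joiner names, and the room code `ABCD` are made up for illustration.

```typescript
interface Deferred<T> {
  promise: Promise<T>;
  resolve(value: T): void;
  reject(error: unknown): void;
}

function deferred<T>(): Deferred<T> {
  let resolve!: (value: T) => void;
  let reject!: (error: unknown) => void;
  const promise = new Promise<T>((res, rej) => {
    resolve = res;
    reject = rej;
  });
  return { promise, resolve, reject };
}

// Hypothetical demo: joiners wait for a room code the host has not produced yet.
async function demoHandoff(): Promise<string[]> {
  const roomCode = deferred<string>();

  // Each joiner awaits the same promise before the host has announced anything.
  const joiners = ['joiner-1', 'joiner-2', 'joiner-3'].map(async (name) => {
    const code = await roomCode.promise;
    return `${name} joined ${code}`;
  });

  // The host publishes the code once; every waiter wakes up with the same value.
  roomCode.resolve('ABCD');
  return Promise.all(joiners);
}

demoHandoff().then((lines) => console.log(lines.join('\n')));
```

Because a promise settles only once, a second `resolve` call is a no-op, which is the behavior the third unit test pins down.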
- [ ] **Step 5: Create `core/types.ts` with the scenario interfaces**

```typescript
/**
 * Core type definitions for the soak harness.
 *
 * Contracts here are consumed by runner.ts, SessionPool, scenarios,
 * and the dashboard. Keep this file small and stable.
 */

import type { BrowserContext, Page } from 'playwright-core';
import type { GolfBot } from '../../e2e/bot/golf-bot';

// =============================================================================
// Accounts & sessions
// =============================================================================

export interface Account {
  /** Stable key used in logs, e.g. "soak_00". */
  key: string;
  username: string;
  password: string;
  /** JWT returned from /api/auth/login, may be refreshed by SessionPool. */
  token: string;
}

export interface Session {
  account: Account;
  context: BrowserContext;
  page: Page;
  bot: GolfBot;
  /** Convenience mirror of account.key. */
  key: string;
}

// =============================================================================
// Scenarios
// =============================================================================

export interface ScenarioNeeds {
  /** Total number of authenticated sessions the scenario requires. */
  accounts: number;
  /** How many rooms to partition sessions into (default: 1). */
  rooms?: number;
  /** CPUs to add per room (default: 0). */
  cpusPerRoom?: number;
}

/** Free-form per-scenario config merged with CLI flags. */
export type ScenarioConfig = Record<string, unknown>;

export interface ScenarioError {
  room: string;
  reason: string;
  detail?: string;
  timestamp: number;
}

export interface ScenarioResult {
  gamesCompleted: number;
  errors: ScenarioError[];
  durationMs: number;
  customMetrics?: Record<string, number>;
}

export interface ScenarioContext {
  /** Merged config: CLI flags → env → scenario defaults → runner defaults. */
  config: ScenarioConfig;
  /** Pre-authenticated sessions; ordered. */
  sessions: Session[];
  coordinator: RoomCoordinatorApi;
  dashboard: DashboardReporter;
  logger: Logger;
  signal: AbortSignal;
  /** Reset the per-room watchdog. Call at each progress point. */
  heartbeat(roomId: string): void;
}

export interface Scenario {
  name: string;
  description: string;
  defaultConfig: ScenarioConfig;
  needs: ScenarioNeeds;
  run(ctx: ScenarioContext): Promise<ScenarioResult>;
}

// =============================================================================
// Room coordination
// =============================================================================

export interface RoomCoordinatorApi {
  announce(roomId: string, code: string): void;
  await(roomId: string, timeoutMs?: number): Promise<string>;
}

// =============================================================================
// Dashboard reporter
// =============================================================================

export interface RoomState {
  phase?: string;
  currentPlayer?: string;
  hole?: number;
  totalHoles?: number;
  game?: number;
  totalGames?: number;
  moves?: number;
  players?: Array<{ key: string; score: number | null; isActive: boolean }>;
  message?: string;
}

export interface DashboardReporter {
  update(roomId: string, state: Partial<RoomState>): void;
  log(level: 'info' | 'warn' | 'error', msg: string, meta?: object): void;
  incrementMetric(name: string, by?: number): void;
}

// =============================================================================
// Logger
// =============================================================================

export type LogLevel = 'debug' | 'info' | 'warn' | 'error';

export interface Logger {
  debug(msg: string, meta?: object): void;
  info(msg: string, meta?: object): void;
  warn(msg: string, meta?: object): void;
  error(msg: string, meta?: object): void;
  child(meta: object): Logger;
}
```

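The `ScenarioContext.config` comment lists the config sources as "CLI flags → env → scenario defaults → runner defaults". One way to read that is highest priority first, and under that assumption a merge could look like the sketch below. `mergeConfig`, `defined`, and the keys `games`, `holes`, and `watch` are hypothetical; the plan does not specify the merge implementation here.

```typescript
type ScenarioConfig = Record<string, unknown>;

// Drop undefined values so an unset higher-priority source cannot mask a lower one.
function defined(source: ScenarioConfig): ScenarioConfig {
  return Object.fromEntries(
    Object.entries(source).filter(([, value]) => value !== undefined),
  );
}

// Later spreads win, so sources are listed from lowest to highest priority.
function mergeConfig(
  runnerDefaults: ScenarioConfig,
  scenarioDefaults: ScenarioConfig,
  env: ScenarioConfig,
  cliFlags: ScenarioConfig,
): ScenarioConfig {
  return {
    ...defined(runnerDefaults),
    ...defined(scenarioDefaults),
    ...defined(env),
    ...defined(cliFlags),
  };
}

const merged = mergeConfig(
  { games: 1, holes: 9, watch: 'none' }, // hypothetical runner defaults
  { games: 10 },                         // hypothetical scenario defaults
  { holes: undefined },                  // env var not set
  { watch: 'tiled' },                    // hypothetical CLI flag
);
console.log(merged);
```

Filtering out `undefined` before spreading matters: a plain spread of `{ holes: undefined }` would overwrite the runner default with `undefined`.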
- [ ] **Step 6: Verify tsx still parses the runner**

```bash
cd tests/soak
npx tsx runner.ts
```

Expected: still prints the placeholder output; no TypeScript errors from the new `core/` files (they're not imported yet).

- [ ] **Step 7: Commit**

```bash
git add tests/soak/core/deferred.ts tests/soak/core/types.ts tests/soak/tests/deferred.test.ts
git commit -m "$(cat <<'EOF'
feat(soak): core types + Deferred primitive

Establishes the Scenario/Session/Logger/DashboardReporter contracts
the rest of the harness builds on. Deferred is the building block
for RoomCoordinator's host→joiners handoff.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```

---

### Task 11: RoomCoordinator with tests

Tiny abstraction over `Deferred` keyed by room ID, with a timeout on `await`.

**Files:**
- Create: `tests/soak/core/room-coordinator.ts`
- Create: `tests/soak/tests/room-coordinator.test.ts`

- [ ] **Step 1: Write failing tests**

```typescript
// tests/soak/tests/room-coordinator.test.ts
import { describe, it, expect } from 'vitest';
import { RoomCoordinator } from '../core/room-coordinator';

describe('RoomCoordinator', () => {
  it('resolves await with the announced code (announce then await)', async () => {
    const rc = new RoomCoordinator();
    rc.announce('room-1', 'ABCD');
    await expect(rc.await('room-1')).resolves.toBe('ABCD');
  });

  it('resolves await with the announced code (await then announce)', async () => {
    const rc = new RoomCoordinator();
    const p = rc.await('room-2');
    rc.announce('room-2', 'WXYZ');
    await expect(p).resolves.toBe('WXYZ');
  });

  it('rejects await after timeout if not announced', async () => {
    const rc = new RoomCoordinator();
    await expect(rc.await('room-3', 50)).rejects.toThrow(/timed out/i);
  });

  it('isolates rooms — announcing room-A does not unblock room-B', async () => {
    const rc = new RoomCoordinator();
    const pB = rc.await('room-B', 100);
    rc.announce('room-A', 'A-CODE');
    await expect(pB).rejects.toThrow(/timed out/i);
  });
});
```

- [ ] **Step 2: Run tests to verify they fail**

```bash
npx vitest run tests/room-coordinator.test.ts
```

Expected: FAIL — module not found.

- [ ] **Step 3: Implement `RoomCoordinator`**

```typescript
// tests/soak/core/room-coordinator.ts
import { deferred, Deferred } from './deferred';
import type { RoomCoordinatorApi } from './types';

export class RoomCoordinator implements RoomCoordinatorApi {
  private rooms = new Map<string, Deferred<string>>();

  announce(roomId: string, code: string): void {
    this.getOrCreate(roomId).resolve(code);
  }

  async await(roomId: string, timeoutMs: number = 30_000): Promise<string> {
    const d = this.getOrCreate(roomId);
    let timer: NodeJS.Timeout | undefined;
    const timeout = new Promise<never>((_, reject) => {
      timer = setTimeout(() => {
        reject(new Error(`RoomCoordinator: room "${roomId}" timed out after ${timeoutMs}ms`));
      }, timeoutMs);
    });
    try {
      return await Promise.race([d.promise, timeout]);
    } finally {
      if (timer) clearTimeout(timer);
    }
  }

  private getOrCreate(roomId: string): Deferred<string> {
    let d = this.rooms.get(roomId);
    if (!d) {
      d = deferred<string>();
      this.rooms.set(roomId, d);
    }
    return d;
  }
}
```

- [ ] **Step 4: Verify tests pass**

```bash
npx vitest run tests/room-coordinator.test.ts
```

Expected: 4 passed.

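The timeout in `RoomCoordinator.await` is an instance of a general race-with-timer pattern. The sketch below pulls that pattern out into a standalone helper so it can be read in isolation; `withTimeout` and `demoTimeout` are hypothetical names, not part of the plan's modules.

```typescript
// Generic version of the race-with-timer pattern; `withTimeout` is a hypothetical helper.
async function withTimeout<T>(work: Promise<T>, timeoutMs: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${timeoutMs}ms`)),
      timeoutMs,
    );
  });
  try {
    return await Promise.race([work, timeout]);
  } finally {
    // Always clear the timer so a settled wait does not keep the process alive.
    if (timer !== undefined) clearTimeout(timer);
  }
}

async function demoTimeout(): Promise<void> {
  // Settles well within the limit, so the value passes through.
  const fast = new Promise<string>((resolve) => setTimeout(() => resolve('ABCD'), 10));
  console.log(await withTimeout(fast, 1000, 'room-1'));

  // Never settles, so the timer wins the race and the call rejects.
  const never = new Promise<string>(() => {});
  try {
    await withTimeout(never, 50, 'room-2');
  } catch (err) {
    console.log((err as Error).message);
  }
}

demoTimeout();
```

Clearing the timer in `finally` covers both outcomes; without it, every successful wait would leave a pending 30-second timer behind, which adds up across many rooms and games.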
- [ ] **Step 5: Commit**

```bash
git add tests/soak/core/room-coordinator.ts tests/soak/tests/room-coordinator.test.ts
git commit -m "$(cat <<'EOF'
feat(soak): RoomCoordinator with host→joiners handoff

Lazy Deferred per roomId with a timeout on await. Lets concurrent
joiner sessions block until their host announces the room code
without polling or page scraping.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```

---

### Task 12: Structured JSONL logger

Single module, no transport, writes to `process.stdout`. Supports child loggers with bound metadata (so scenarios can emit logs with `room` / `game` context without repeating it).

**Files:**
- Create: `tests/soak/core/logger.ts`
- Create: `tests/soak/tests/logger.test.ts`

- [ ] **Step 1: Write failing tests**

```typescript
// tests/soak/tests/logger.test.ts
import { describe, it, expect, beforeEach, vi } from 'vitest';
import { createLogger } from '../core/logger';

describe('logger', () => {
  let writes: string[];
  let write: (s: string) => boolean;

  beforeEach(() => {
    writes = [];
    write = (s: string) => {
      writes.push(s);
      return true;
    };
  });

  it('emits a JSON line per call with level and msg', () => {
    const log = createLogger({ runId: 'r1', write });
    log.info('hello');
    expect(writes).toHaveLength(1);
    const parsed = JSON.parse(writes[0]);
    expect(parsed.level).toBe('info');
    expect(parsed.msg).toBe('hello');
    expect(parsed.runId).toBe('r1');
    expect(parsed.timestamp).toBeTypeOf('string');
  });

  it('merges meta into the log line', () => {
    const log = createLogger({ runId: 'r1', write });
    log.warn('slow', { turnMs: 3000 });
    const parsed = JSON.parse(writes[0]);
    expect(parsed.turnMs).toBe(3000);
    expect(parsed.level).toBe('warn');
  });

  it('child logger inherits parent meta', () => {
    const log = createLogger({ runId: 'r1', write });
    const roomLog = log.child({ room: 'room-1' });
    roomLog.info('game_start');
    const parsed = JSON.parse(writes[0]);
    expect(parsed.room).toBe('room-1');
    expect(parsed.runId).toBe('r1');
  });

  it('respects minimum level', () => {
    const log = createLogger({ runId: 'r1', write, minLevel: 'warn' });
    log.debug('nope');
    log.info('nope');
    log.warn('yes');
    log.error('yes');
    expect(writes).toHaveLength(2);
  });
});
```

- [ ] **Step 2: Run tests to verify they fail**

```bash
npx vitest run tests/logger.test.ts
```

Expected: FAIL — module not found.

- [ ] **Step 3: Implement the logger**

```typescript
// tests/soak/core/logger.ts
import type { Logger, LogLevel } from './types';

const LEVEL_ORDER: Record<LogLevel, number> = {
  debug: 0,
  info: 1,
  warn: 2,
  error: 3,
};

export interface LoggerOptions {
  runId: string;
  minLevel?: LogLevel;
  /** Defaults to process.stdout.write bound to stdout. Override for tests. */
  write?: (line: string) => boolean;
  baseMeta?: Record<string, unknown>;
}

export function createLogger(opts: LoggerOptions): Logger {
  const minLevel = opts.minLevel ?? 'info';
  const write = opts.write ?? ((s: string) => process.stdout.write(s));
  const baseMeta = opts.baseMeta ?? {};

  function emit(level: LogLevel, msg: string, meta?: object): void {
    if (LEVEL_ORDER[level] < LEVEL_ORDER[minLevel]) return;
    const line = JSON.stringify({
      timestamp: new Date().toISOString(),
      level,
      msg,
      runId: opts.runId,
      ...baseMeta,
      ...(meta ?? {}),
    }) + '\n';
    write(line);
  }

  const logger: Logger = {
    debug: (msg, meta) => emit('debug', msg, meta),
    info: (msg, meta) => emit('info', msg, meta),
    warn: (msg, meta) => emit('warn', msg, meta),
    error: (msg, meta) => emit('error', msg, meta),
    child: (meta) =>
      createLogger({
        runId: opts.runId,
        minLevel,
        write,
        baseMeta: { ...baseMeta, ...meta },
      }),
  };

  return logger;
}
```

- [ ] **Step 4: Verify tests pass**
|
||
|
||
```bash
|
||
npx vitest run tests/logger.test.ts
|
||
```
|
||
|
||
Expected: 4 passed.
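
The child-logger behaviour rests on the spread order inside `emit`: fixed fields first, then `baseMeta`, then per-call `meta`. The sketch below isolates that merge so the collision rules are easy to see. `buildLine` is a hypothetical helper written for illustration only; it is not part of `logger.ts`.

```typescript
// Hypothetical helper mirroring the object-spread order used in emit():
// fixed fields first, then the logger's baseMeta, then per-call meta.
type Meta = Record<string, unknown>;

function buildLine(runId: string, baseMeta: Meta, meta: Meta = {}): Meta {
  return { level: 'info', msg: 'game_start', runId, ...baseMeta, ...meta };
}

// A child logger bound to room-1, and one call that overrides `room`.
const inherited = buildLine('r1', { room: 'room-1' });
const overridden = buildLine('r1', { room: 'room-1' }, { room: 'room-2', game: 3 });

console.log(inherited.room);  // room-1
console.log(overridden.room); // room-2
```

Because later spreads win, a meta object carrying a key such as `level` or `msg` would overwrite the fixed fields, so it is probably safest to keep those names out of meta.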

- [ ] **Step 5: Commit**

```bash
git add tests/soak/core/logger.ts tests/soak/tests/logger.test.ts
git commit -m "$(cat <<'EOF'
feat(soak): structured JSONL logger with child contexts

Single file, no transport, writes one JSON line per call to stdout.
Child loggers inherit parent meta so scenarios can bind room/game
context once and forget about it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```

---

## Phase 3 — SessionPool and seeding

### Task 13: SessionPool with HTTP registration and localStorage warm-start

This is the biggest single module. It owns browser context lifecycle, seeds accounts on cold start, logs in on warm start, and exposes a simple `acquire()` API to scenarios.

**Files:**
- Create: `tests/soak/core/session-pool.ts`

Testing: manual via `scripts/seed-accounts.ts` in Task 14 and the first real runner invocation in Task 18. There is no Vitest test for this module; it is an integration module that needs a real browser.

- [ ] **Step 1: Create `tests/soak/core/session-pool.ts` — imports and types**

```typescript
// tests/soak/core/session-pool.ts
import * as fs from 'fs';
import * as path from 'path';
import {
  Browser,
  BrowserContext,
  chromium,
} from 'playwright-core';
import { GolfBot } from '../../e2e/bot/golf-bot';
import type { Account, Session, Logger } from './types';

export interface SeedOptions {
  /** Full base URL of the target server, e.g. https://staging.adlee.work. */
  targetUrl: string;
  /** Invite code to pass to /api/auth/register. */
  inviteCode: string;
  /** Number of accounts to create. */
  count: number;
}

export interface SessionPoolOptions {
  targetUrl: string;
  inviteCode: string;
  credFile: string; // absolute path to .env.stresstest
  logger: Logger;
  /** Optional override for the browser to attach contexts to. If absent, SessionPool launches its own. */
  browser?: Browser;
  /** Passed through to browser.newContext. Useful for viewport overrides in tests. */
  contextOptions?: Parameters<Browser['newContext']>[0];
}
```

- [ ] **Step 2: Implement cred-file read/write**

Append to `session-pool.ts`:

```typescript
function readCredFile(filePath: string): Account[] | null {
  if (!fs.existsSync(filePath)) return null;
  const content = fs.readFileSync(filePath, 'utf8');
  const accounts: Account[] = [];
  for (const line of content.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed || trimmed.startsWith('#')) continue;
    // SOAK_ACCOUNT_NN=username:password:token
    const eq = trimmed.indexOf('=');
    if (eq === -1) continue;
    const key = trimmed.slice(0, eq);
    const value = trimmed.slice(eq + 1);
    const m = key.match(/^SOAK_ACCOUNT_(\d+)$/);
    if (!m) continue;
    const [username, password, token] = value.split(':');
    if (!username || !password || !token) continue;
    const idx = parseInt(m[1], 10);
    accounts.push({
      key: `soak_${String(idx).padStart(2, '0')}`,
      username,
      password,
      token,
    });
  }
  return accounts.length > 0 ? accounts : null;
}

function writeCredFile(filePath: string, accounts: Account[]): void {
  const lines: string[] = [
    '# Soak harness account cache — auto-generated, do not hand-edit',
    '# Format: SOAK_ACCOUNT_NN=username:password:token',
  ];
  for (const acc of accounts) {
    const idx = parseInt(acc.key.replace('soak_', ''), 10);
    const key = `SOAK_ACCOUNT_${String(idx).padStart(2, '0')}`;
    lines.push(`${key}=${acc.username}:${acc.password}:${acc.token}`);
  }
  fs.writeFileSync(filePath, lines.join('\n') + '\n', { mode: 0o600 });
}
```
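
To make the file format concrete, here is a minimal sketch that parses a single cred-file line. `parseCredLine` is a hypothetical standalone helper that mirrors the per-line logic of `readCredFile`, and the sample username, password, and token are invented for illustration.

```typescript
interface ParsedAccount {
  key: string;
  username: string;
  password: string;
  token: string;
}

// Hypothetical helper: parses one `SOAK_ACCOUNT_NN=username:password:token` line.
// Returns null for comments, blank lines, and malformed entries.
function parseCredLine(line: string): ParsedAccount | null {
  const trimmed = line.trim();
  if (!trimmed || trimmed.startsWith('#')) return null;
  const eq = trimmed.indexOf('=');
  if (eq === -1) return null;
  const m = trimmed.slice(0, eq).match(/^SOAK_ACCOUNT_(\d+)$/);
  if (!m) return null;
  const [username, password, token] = trimmed.slice(eq + 1).split(':');
  if (!username || !password || !token) return null;
  const idx = parseInt(m[1], 10);
  return { key: `soak_${String(idx).padStart(2, '0')}`, username, password, token };
}

// Invented sample values, not real credentials.
const sample = parseCredLine('SOAK_ACCOUNT_03=soak_03_ab12:ExamplePass123!:tok.en.value');
const comment = parseCredLine('# Format: SOAK_ACCOUNT_NN=username:password:token');
console.log(sample?.key, comment);
```

The colon-delimited format works here because the generated passwords draw only from letters, digits, and `!`, and a JWT consists of base64url segments joined by dots. A value containing `:` would be split incorrectly, which is worth remembering if the password alphabet ever changes.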

- [ ] **Step 3: Implement the HTTP register call**

```typescript
interface RegisterResponse {
  user: { id: string; username: string };
  token: string;
  expires_at: string;
}

async function registerAccount(
  targetUrl: string,
  username: string,
  password: string,
  email: string,
  inviteCode: string,
): Promise<string> {
  const res = await fetch(`${targetUrl}/api/auth/register`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ username, password, email, invite_code: inviteCode }),
  });
  if (!res.ok) {
    const body = await res.text().catch(() => '<no body>');
    throw new Error(`register failed: ${res.status} ${body}`);
  }
  const data = (await res.json()) as RegisterResponse;
  if (!data.token) {
    throw new Error(`register returned no token: ${JSON.stringify(data)}`);
  }
  return data.token;
}

async function loginAccount(
  targetUrl: string,
  username: string,
  password: string,
): Promise<string> {
  const res = await fetch(`${targetUrl}/api/auth/login`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ username, password }),
  });
  if (!res.ok) {
    const body = await res.text().catch(() => '<no body>');
    throw new Error(`login failed: ${res.status} ${body}`);
  }
  const data = (await res.json()) as RegisterResponse;
  return data.token;
}

function randomSuffix(): string {
  return Math.random().toString(36).slice(2, 6);
}

function generatePassword(): string {
  // 16 chars: letters + digits + one symbol. Meets 8-char minimum from auth_service.
  // Split across halves so repo secret-scanners don't flag the string as base64
  const lower = 'abcdefghijkm' + 'npqrstuvwxyz'; // pragma: allowlist secret
  const upper = 'ABCDEFGHJKLM' + 'NPQRSTUVWXYZ'; // pragma: allowlist secret
  const digits = '23456789';
  const chars = lower + upper + digits;
  let out = '';
  for (let i = 0; i < 15; i++) {
    out += chars[Math.floor(Math.random() * chars.length)];
  }
  return out + '!';
}
```

- [ ] **Step 4: Implement the `SessionPool` class**

```typescript
export class SessionPool {
  private accounts: Account[] = [];
  private ownedBrowser: Browser | null = null;
  private browser: Browser | null;
  private activeSessions: Session[] = [];

  constructor(private opts: SessionPoolOptions) {
    this.browser = opts.browser ?? null;
  }

  /**
   * Seed `count` accounts via the register endpoint and write them to credFile.
   * Safe to call multiple times — skips accounts already in the file.
   */
  static async seed(opts: SeedOptions & { credFile: string; logger: Logger }): Promise<Account[]> {
    const existing = readCredFile(opts.credFile) ?? [];
    const existingKeys = new Set(existing.map((a) => a.key));
    const created: Account[] = [...existing];

    for (let i = 0; i < opts.count; i++) {
      const key = `soak_${String(i).padStart(2, '0')}`;
      if (existingKeys.has(key)) continue;

      const suffix = randomSuffix();
      const username = `${key}_${suffix}`;
      const password = generatePassword();
      const email = `${key}_${suffix}@soak.test`;

      opts.logger.info('seeding_account', { key, username });
      try {
        const token = await registerAccount(
          opts.targetUrl,
          username,
          password,
          email,
          opts.inviteCode,
        );
        created.push({ key, username, password, token });
        writeCredFile(opts.credFile, created);
      } catch (err) {
        opts.logger.error('seed_failed', {
          key,
          error: err instanceof Error ? err.message : String(err),
        });
        throw err;
      }
    }
    return created;
  }

  /**
   * Load accounts from credFile, auto-seeding if the file is missing or
   * holds fewer than desiredCount accounts.
   */
  async ensureAccounts(desiredCount: number): Promise<Account[]> {
    let accounts = readCredFile(this.opts.credFile);
    if (!accounts || accounts.length < desiredCount) {
      this.opts.logger.warn('cred_file_missing_or_short', {
        found: accounts?.length ?? 0,
        desired: desiredCount,
      });
      accounts = await SessionPool.seed({
        targetUrl: this.opts.targetUrl,
        inviteCode: this.opts.inviteCode,
        count: desiredCount,
        credFile: this.opts.credFile,
        logger: this.opts.logger,
      });
    }
    this.accounts = accounts.slice(0, desiredCount);
    return this.accounts;
  }

  /**
   * Launch the browser if not provided, create N contexts, log each in via
   * localStorage injection (falling back to POST /api/auth/login if the
   * cached token is rejected), and return the live sessions.
   */
  async acquire(count: number): Promise<Session[]> {
    await this.ensureAccounts(count);
    if (!this.browser) {
      this.ownedBrowser = await chromium.launch({ headless: true });
      this.browser = this.ownedBrowser;
    }

    const sessions: Session[] = [];
    for (let i = 0; i < count; i++) {
      const account = this.accounts[i];
      const context = await this.browser.newContext(this.opts.contextOptions);
      await this.injectAuth(context, account);
      const page = await context.newPage();
      await page.goto(this.opts.targetUrl);
      const bot = new GolfBot(page);
      sessions.push({ account, context, page, bot, key: account.key });
    }
    this.activeSessions = sessions;
    return sessions;
  }

  /**
   * Inject the cached JWT into localStorage BEFORE any page loads.
   * Uses addInitScript so the token is present on the first navigation.
   * The login fallback below runs only if addInitScript itself throws;
   * a token that the server later rejects is not detected at this point.
   */
  private async injectAuth(context: BrowserContext, account: Account): Promise<void> {
    // Try the cached token first
    try {
      await context.addInitScript(
        ({ token, username }) => {
          window.localStorage.setItem('authToken', token);
          window.localStorage.setItem(
            'authUser',
            JSON.stringify({ id: '', username, role: 'user', email_verified: true }),
          );
        },
        { token: account.token, username: account.username },
      );
    } catch (err) {
      this.opts.logger.warn('inject_auth_failed', {
        account: account.key,
        error: err instanceof Error ? err.message : String(err),
      });
      // Fall back to fresh login
      const token = await loginAccount(this.opts.targetUrl, account.username, account.password);
      account.token = token;
      writeCredFile(this.opts.credFile, this.accounts);
      await context.addInitScript(
        ({ token, username }) => {
          window.localStorage.setItem('authToken', token);
          window.localStorage.setItem(
            'authUser',
            JSON.stringify({ id: '', username, role: 'user', email_verified: true }),
          );
        },
        { token, username: account.username },
      );
    }
  }

  /** Close all active contexts. Safe to call multiple times. */
  async release(): Promise<void> {
    for (const session of this.activeSessions) {
      try {
        await session.context.close();
      } catch {
        // ignore
      }
    }
    this.activeSessions = [];
    if (this.ownedBrowser) {
      try {
        await this.ownedBrowser.close();
      } catch {
        // ignore
      }
      this.ownedBrowser = null;
      this.browser = null;
    }
  }
}
```

- [ ] **Step 5: Syntax-check by invoking tsx**

```bash
cd tests/soak
npx tsx -e "import('./core/session-pool').then(() => console.log('ok'))"
```

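Expected: `ok`. Note that tsx only transpiles and does not type-check, so this catches syntax and import-resolution errors rather than TypeScript type errors.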

- [ ] **Step 6: Commit**

```bash
git add tests/soak/core/session-pool.ts
git commit -m "$(cat <<'EOF'
feat(soak): SessionPool — seed, login, acquire contexts

Owns 16 BrowserContexts, seeds via POST /api/auth/register with the
invite code on cold start, warm-starts via localStorage injection of
the cached JWT, falls back to POST /api/auth/login if the token is
rejected. Exposes acquire(n) for scenarios.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```

---

### Task 14: `seed-accounts.ts` CLI wrapper

Tiny standalone entry point that lets you pre-seed before the first harness run. Reuses `SessionPool.seed`.

**Files:**
- Create: `tests/soak/scripts/seed-accounts.ts`

- [ ] **Step 1: Write the script**

```typescript
#!/usr/bin/env tsx
/**
 * Seed N soak-harness accounts via the register endpoint.
 *
 * Usage:
 *   TEST_URL=http://localhost:8000 \
 *   SOAK_INVITE_CODE=SOAKTEST \
 *   npm run seed -- --count=16
 */

import * as path from 'path';
import { SessionPool } from '../core/session-pool';
import { createLogger } from '../core/logger';

function parseArgs(argv: string[]): { count: number } {
  const result = { count: 16 };
  for (const arg of argv.slice(2)) {
    const m = arg.match(/^--count=(\d+)$/);
    if (m) result.count = parseInt(m[1], 10);
  }
  return result;
}

async function main(): Promise<void> {
  const { count } = parseArgs(process.argv);
  const targetUrl = process.env.TEST_URL ?? 'http://localhost:8000';
  const inviteCode = process.env.SOAK_INVITE_CODE;
  if (!inviteCode) {
    console.error('SOAK_INVITE_CODE env var is required');
    console.error('  Local dev: SOAK_INVITE_CODE=SOAKTEST');
    console.error('  Staging:   SOAK_INVITE_CODE=5VC2MCCN');
    process.exit(2);
  }

  const credFile = path.resolve(__dirname, '..', '.env.stresstest');
  const logger = createLogger({ runId: `seed-${Date.now()}` });

  logger.info('seed_start', { count, targetUrl, credFile });
  try {
    const accounts = await SessionPool.seed({
      targetUrl,
      inviteCode,
      count,
      credFile,
      logger,
    });
    logger.info('seed_complete', { created: accounts.length });
    console.error(`Seeded ${accounts.length} accounts → ${credFile}`);
  } catch (err) {
    logger.error('seed_failed', {
      error: err instanceof Error ? err.message : String(err),
    });
    process.exit(1);
  }
}

main();
```

- [ ] **Step 2: Run it against local dev to verify end-to-end**

With the dev server running and the `SOAKTEST` invite flagged:

```bash
cd tests/soak
TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run seed -- --count=4
```

Expected:
- Log lines `seeding_account` × 4
- Log line `seed_complete`
- `tests/soak/.env.stresstest` file created with 4 `SOAK_ACCOUNT_NN=...` lines

Verify:

```bash
cat tests/soak/.env.stresstest | head
```

Expected: 4 account lines.

Also verify the accounts got flagged:

```bash
psql -d golfgame -c "SELECT username, is_test_account FROM users_v2 WHERE username LIKE 'soak_%' ORDER BY username;"
```

Expected: 4 rows, all with `is_test_account | t`.

- [ ] **Step 3: Commit**

```bash
git add tests/soak/scripts/seed-accounts.ts
git commit -m "$(cat <<'EOF'
feat(soak): scripts/seed-accounts.ts CLI wrapper

Thin standalone entry for pre-seeding N accounts before the first
harness run. Wraps SessionPool.seed and writes .env.stresstest.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```

---

## Phase 4 — First scenario, config, runner (end-to-end milestone)

### Task 15: Shared multiplayer-game helper

Pulls the "run one full game in one room" logic out of the scenarios so `populate` and `stress` share it. Takes a room's sessions and a config, loops until the game ends.

**Files:**
- Create: `tests/soak/scenarios/shared/multiplayer-game.ts`

- [ ] **Step 1: Create the helper module**

```typescript
// tests/soak/scenarios/shared/multiplayer-game.ts
import type { Session, ScenarioContext } from '../../core/types';

export interface MultiplayerGameOptions {
  roomId: string;
  holes: number;
  decks: number;
  cpusPerRoom: number;
  cpuPersonality?: string;
  /** Per-turn think time in [min, max] ms. */
  thinkTimeMs: [number, number];
  /** Max wall-clock time before giving up on the game (ms). */
  maxDurationMs?: number;
}

export interface MultiplayerGameResult {
  completed: boolean;
  turns: number;
  durationMs: number;
  error?: string;
}

function randomInt(min: number, max: number): number {
  return Math.floor(Math.random() * (max - min + 1)) + min;
}

async function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

/**
 * Host + joiners play one full multiplayer game end to end.
 * The host creates the room, announces the code via the coordinator,
 * joiners wait for the code, the host adds CPUs and starts, everyone
 * loops on isMyTurn/playTurn until round_over or game_over.
 */
export async function runOneMultiplayerGame(
  ctx: ScenarioContext,
  sessions: Session[],
  opts: MultiplayerGameOptions,
): Promise<MultiplayerGameResult> {
  const start = Date.now();
  const [host, ...joiners] = sessions;
  const maxDuration = opts.maxDurationMs ?? 5 * 60_000;

  try {
    // Host creates game
    const code = await host.bot.createGame(host.account.username);
    ctx.coordinator.announce(opts.roomId, code);
    ctx.heartbeat(opts.roomId);
    ctx.dashboard.update(opts.roomId, { phase: 'lobby' });
    ctx.logger.info('room_created', { room: opts.roomId, code });

    // Joiners join concurrently
    await Promise.all(
      joiners.map(async (joiner) => {
        const awaited = await ctx.coordinator.await(opts.roomId);
        await joiner.bot.joinGame(awaited, joiner.account.username);
      }),
    );
    ctx.heartbeat(opts.roomId);

    // Host adds CPUs (if any) and starts
    for (let i = 0; i < opts.cpusPerRoom; i++) {
      await host.bot.addCPU(opts.cpuPersonality);
    }
    await host.bot.startGame({ holes: opts.holes, decks: opts.decks });
    ctx.heartbeat(opts.roomId);
    ctx.dashboard.update(opts.roomId, { phase: 'playing', totalHoles: opts.holes });

    // Concurrent turn loops — one per session
    const turnCounts = new Array(sessions.length).fill(0);
    // Filled when a loop exits before the game finished (abort or timeout),
    // so an interrupted game is not reported as completed.
    const stopReasons: string[] = [];

    async function sessionLoop(sessionIdx: number): Promise<void> {
      const session = sessions[sessionIdx];
      while (true) {
        if (ctx.signal.aborted) {
          stopReasons.push('aborted');
          return;
        }
        if (Date.now() - start > maxDuration) {
          stopReasons.push(`timed out after ${maxDuration}ms`);
          return;
        }

        const phase = await session.bot.getGamePhase();
        if (phase === 'game_over' || phase === 'round_over') return;

        if (await session.bot.isMyTurn()) {
          await session.bot.playTurn();
          turnCounts[sessionIdx]++;
          ctx.heartbeat(opts.roomId);
          ctx.dashboard.update(opts.roomId, {
            currentPlayer: session.account.username,
            moves: turnCounts.reduce((a, b) => a + b, 0),
          });
          const thinkMs = randomInt(opts.thinkTimeMs[0], opts.thinkTimeMs[1]);
          await sleep(thinkMs);
        } else {
          await sleep(200);
        }
      }
    }

    await Promise.all(sessions.map((_, i) => sessionLoop(i)));

    const totalTurns = turnCounts.reduce((a, b) => a + b, 0);
    if (stopReasons.length > 0) {
      return {
        completed: false,
        turns: totalTurns,
        durationMs: Date.now() - start,
        error: stopReasons[0],
      };
    }
    ctx.dashboard.update(opts.roomId, { phase: 'round_over' });
    return {
      completed: true,
      turns: totalTurns,
      durationMs: Date.now() - start,
    };
  } catch (err) {
    return {
      completed: false,
      turns: 0,
      durationMs: Date.now() - start,
      error: err instanceof Error ? err.message : String(err),
    };
  }
}
```

- [ ] **Step 2: Syntax-check**

```bash
cd tests/soak
npx tsx -e "import('./scenarios/shared/multiplayer-game').then(() => console.log('ok'))"
```

Expected: `ok`.

- [ ] **Step 3: Commit**

```bash
git add tests/soak/scenarios/shared/multiplayer-game.ts
git commit -m "$(cat <<'EOF'
feat(soak): shared runOneMultiplayerGame helper

Encapsulates the host-creates/joiners-join/loop-until-done flow so
populate and stress scenarios don't duplicate it. Honors abort
signal and a max-duration timeout, heartbeats on every turn.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```

---

### Task 16: Populate scenario (minimal version)

Partitions sessions into rooms, runs `gamesPerRoom` games per room in parallel, aggregates results.
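
The partitioning step can be sketched in isolation. The example below uses plain strings as stand-ins for `Session` objects (an assumption made to keep it self-contained) and the same consecutive-slice approach as the `chunk` helper in `populate.ts`.

```typescript
// Same shape as the chunk() helper in populate.ts: split an array into
// consecutive slices of `size` elements.
function chunk<T>(arr: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < arr.length; i += size) {
    out.push(arr.slice(i, i + size));
  }
  return out;
}

// Stand-ins for 16 sessions; the real objects carry a page, bot, and account.
const sessionKeys = Array.from({ length: 16 }, (_, i) => `soak_${String(i).padStart(2, '0')}`);
const roomCount = 4;
const perRoom = Math.floor(sessionKeys.length / roomCount);
const rooms = chunk(sessionKeys, perRoom);

console.log(rooms.length); // 4
console.log(rooms[1]);     // [ 'soak_04', 'soak_05', 'soak_06', 'soak_07' ]
```

Since `runOneMultiplayerGame` destructures `[host, ...joiners]`, the first entry of each slice ends up as that room's host.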

**Files:**
- Create: `tests/soak/scenarios/populate.ts`
- Create: `tests/soak/scenarios/index.ts`

- [ ] **Step 1: Create `scenarios/populate.ts`**

```typescript
// tests/soak/scenarios/populate.ts
import type {
  Scenario,
  ScenarioContext,
  ScenarioResult,
  ScenarioError,
  Session,
} from '../core/types';
import { runOneMultiplayerGame } from './shared/multiplayer-game';

const CPU_PERSONALITIES = ['Sofia', 'Marcus', 'Kenji', 'Priya'];

interface PopulateConfig {
  gamesPerRoom: number;
  holes: number;
  decks: number;
  rooms: number;
  cpusPerRoom: number;
  thinkTimeMs: [number, number];
  interGamePauseMs: number;
}

function chunk<T>(arr: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < arr.length; i += size) {
    out.push(arr.slice(i, i + size));
  }
  return out;
}

async function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

async function runRoom(
  ctx: ScenarioContext,
  cfg: PopulateConfig,
  roomIdx: number,
  sessions: Session[],
): Promise<{ completed: number; errors: ScenarioError[] }> {
  const roomId = `room-${roomIdx}`;
  const cpuPersonality = CPU_PERSONALITIES[roomIdx % CPU_PERSONALITIES.length];
  let completed = 0;
  const errors: ScenarioError[] = [];

  for (let gameNum = 0; gameNum < cfg.gamesPerRoom; gameNum++) {
    if (ctx.signal.aborted) break;
    ctx.dashboard.update(roomId, { game: gameNum + 1, totalGames: cfg.gamesPerRoom });
    ctx.logger.info('game_start', { room: roomId, game: gameNum + 1 });

    const result = await runOneMultiplayerGame(ctx, sessions, {
      roomId,
      holes: cfg.holes,
      decks: cfg.decks,
      cpusPerRoom: cfg.cpusPerRoom,
      cpuPersonality,
      thinkTimeMs: cfg.thinkTimeMs,
    });

    if (result.completed) {
      completed++;
      ctx.logger.info('game_complete', {
        room: roomId,
        game: gameNum + 1,
        turns: result.turns,
        durationMs: result.durationMs,
      });
    } else {
      errors.push({
        room: roomId,
        reason: 'game_failed',
        detail: result.error,
        timestamp: Date.now(),
      });
      ctx.logger.error('game_failed', { room: roomId, game: gameNum + 1, error: result.error });
    }

    if (gameNum < cfg.gamesPerRoom - 1) {
      await sleep(cfg.interGamePauseMs);
    }
  }

  return { completed, errors };
}

const populate: Scenario = {
  name: 'populate',
  description: 'Long multi-round games to populate scoreboards',
  needs: { accounts: 16, rooms: 4, cpusPerRoom: 1 },
  defaultConfig: {
    gamesPerRoom: 10,
    holes: 9,
    decks: 2,
    rooms: 4,
    cpusPerRoom: 1,
    thinkTimeMs: [800, 2200],
    interGamePauseMs: 3000,
  },

  async run(ctx: ScenarioContext): Promise<ScenarioResult> {
    const start = Date.now();
    const cfg = ctx.config as unknown as PopulateConfig;

    const perRoom = Math.floor(ctx.sessions.length / cfg.rooms);
    if (perRoom * cfg.rooms !== ctx.sessions.length) {
      throw new Error(
        `populate: ${ctx.sessions.length} sessions does not divide evenly into ${cfg.rooms} rooms`,
      );
    }
    const roomSessions = chunk(ctx.sessions, perRoom);

    const results = await Promise.allSettled(
      roomSessions.map((sessions, idx) => runRoom(ctx, cfg, idx, sessions)),
    );

    let gamesCompleted = 0;
    const errors: ScenarioError[] = [];
    results.forEach((r, idx) => {
      if (r.status === 'fulfilled') {
        gamesCompleted += r.value.completed;
        errors.push(...r.value.errors);
      } else {
        errors.push({
          room: `room-${idx}`,
          reason: 'room_threw',
          detail: r.reason instanceof Error ? r.reason.message : String(r.reason),
          timestamp: Date.now(),
        });
      }
    });

    return {
      gamesCompleted,
      errors,
      durationMs: Date.now() - start,
    };
  },
};

export default populate;
```

- [ ] **Step 2: Create `scenarios/index.ts` registry**

```typescript
// tests/soak/scenarios/index.ts
import type { Scenario } from '../core/types';
import populate from './populate';

const registry: Record<string, Scenario> = {
  populate,
};

export function getScenario(name: string): Scenario | undefined {
  return registry[name];
}

export function listScenarios(): Scenario[] {
  return Object.values(registry);
}
```

- [ ] **Step 3: Syntax-check**

```bash
cd tests/soak
npx tsx -e "import('./scenarios/index').then((m) => console.log(m.listScenarios().map(s => s.name)))"
```

Expected: `['populate']`.

- [ ] **Step 4: Commit**

```bash
git add tests/soak/scenarios/populate.ts tests/soak/scenarios/index.ts
git commit -m "$(cat <<'EOF'
feat(soak): populate scenario + scenario registry

Partitions sessions into N rooms, runs gamesPerRoom games per room
in parallel via Promise.allSettled so a failure in one room never
unwinds the others. Errors roll up into ScenarioResult.errors.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```

---

### Task 17: Config parsing with tests

CLI flags, env vars, scenario defaults, runner defaults — merged in that precedence order.
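
As a rough illustration of that precedence, the sketch below layers three plain objects with object spread, lowest precedence first. `mergeByPrecedence` and the sample values are hypothetical and deliberately simpler than the `mergeConfig` in this task: env values are assumed to be already parsed, and the runner-defaults layer is left out.

```typescript
type Config = Record<string, unknown>;

// Hypothetical simplification: later spreads win, so the lowest-precedence
// source goes first and CLI flags go last.
function mergeByPrecedence(cli: Config, env: Config, scenarioDefaults: Config): Config {
  return { ...scenarioDefaults, ...env, ...cli };
}

const scenarioDefaults = { holes: 9, rooms: 4, gamesPerRoom: 10 };
const fromEnv = { holes: 3, rooms: 2 }; // e.g. parsed from SOAK_HOLES / SOAK_ROOMS
const fromCli = { holes: 5 };           // e.g. parsed from --holes=5

const merged = mergeByPrecedence(fromCli, fromEnv, scenarioDefaults);
console.log(merged); // { holes: 5, rooms: 2, gamesPerRoom: 10 }
```

One caveat of plain spread is that an explicit `undefined` in a higher layer would overwrite a lower value, so a real merge should skip unset keys.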

**Files:**
- Create: `tests/soak/config.ts`
- Create: `tests/soak/tests/config.test.ts`

- [ ] **Step 1: Write failing tests**

```typescript
// tests/soak/tests/config.test.ts
import { describe, it, expect } from 'vitest';
import { parseArgs, mergeConfig } from '../config';

describe('parseArgs', () => {
  it('parses --scenario and numeric flags', () => {
    const r = parseArgs(['--scenario=populate', '--rooms=4', '--games-per-room=10']);
    expect(r.scenario).toBe('populate');
    expect(r.rooms).toBe(4);
    expect(r.gamesPerRoom).toBe(10);
  });

  it('parses watch mode', () => {
    const r = parseArgs(['--scenario=populate', '--watch=none']);
    expect(r.watch).toBe('none');
  });

  it('rejects unknown watch mode', () => {
    expect(() => parseArgs(['--scenario=populate', '--watch=bogus'])).toThrow();
  });

  it('--list sets listOnly', () => {
    const r = parseArgs(['--list']);
    expect(r.listOnly).toBe(true);
  });
});

describe('mergeConfig', () => {
  it('CLI flags override scenario defaults', () => {
    const cfg = mergeConfig(
      { gamesPerRoom: 20 }, // CLI
      {}, // env
      { gamesPerRoom: 5, holes: 9 }, // defaults
    );
    expect(cfg.gamesPerRoom).toBe(20);
  });

  it('env overrides scenario defaults but CLI overrides env', () => {
    const envOnly = mergeConfig(
      {}, // CLI
      { SOAK_HOLES: '3' }, // env
      { holes: 9 }, // defaults
    );
    expect(envOnly.holes).toBe(3); // env beats defaults

    const cfg = mergeConfig(
      { holes: 5 }, // CLI
      { SOAK_HOLES: '3' }, // env
      { holes: 9 }, // defaults
    );
    expect(cfg.holes).toBe(5); // CLI wins
  });

  it('scenario defaults fill in unset values', () => {
    const cfg = mergeConfig(
      { gamesPerRoom: 3 }, // CLI
      {}, // env
      { games: 5, holes: 9 }, // defaults
    );
    expect(cfg.games).toBe(5);
    expect(cfg.holes).toBe(9);
    expect(cfg.gamesPerRoom).toBe(3);
  });
});
```

- [ ] **Step 2: Run tests to verify they fail**

```bash
npx vitest run tests/config.test.ts
```

Expected: FAIL — module not found.

- [ ] **Step 3: Implement `config.ts`**

```typescript
// tests/soak/config.ts

export type WatchMode = 'none' | 'dashboard' | 'tiled';

export interface CliArgs {
  scenario?: string;
  accounts?: number;
  rooms?: number;
  cpusPerRoom?: number;
  gamesPerRoom?: number;
  holes?: number;
  watch?: WatchMode;
  dashboardPort?: number;
  target?: string;
  runId?: string;
  dryRun?: boolean;
  listOnly?: boolean;
}

const VALID_WATCH: WatchMode[] = ['none', 'dashboard', 'tiled'];

function parseInt10(s: string, name: string): number {
  const n = parseInt(s, 10);
  if (Number.isNaN(n)) throw new Error(`Invalid integer for ${name}: ${s}`);
  return n;
}

export function parseArgs(argv: string[]): CliArgs {
  const out: CliArgs = {};
  for (const arg of argv) {
    if (arg === '--list') {
      out.listOnly = true;
      continue;
    }
    if (arg === '--dry-run') {
      out.dryRun = true;
      continue;
    }
    const m = arg.match(/^--([a-z][a-z0-9-]*)=(.*)$/);
    if (!m) continue;
    const [, key, value] = m;
    switch (key) {
      case 'scenario':
        out.scenario = value;
        break;
      case 'accounts':
        out.accounts = parseInt10(value, '--accounts');
        break;
      case 'rooms':
        out.rooms = parseInt10(value, '--rooms');
        break;
      case 'cpus-per-room':
        out.cpusPerRoom = parseInt10(value, '--cpus-per-room');
        break;
      case 'games-per-room':
        out.gamesPerRoom = parseInt10(value, '--games-per-room');
        break;
      case 'holes':
        out.holes = parseInt10(value, '--holes');
        break;
      case 'watch':
        if (!VALID_WATCH.includes(value as WatchMode)) {
          throw new Error(`Invalid --watch value: ${value} (expected ${VALID_WATCH.join('|')})`);
        }
        out.watch = value as WatchMode;
        break;
      case 'dashboard-port':
        out.dashboardPort = parseInt10(value, '--dashboard-port');
        break;
      case 'target':
        out.target = value;
        break;
      case 'run-id':
        out.runId = value;
        break;
      default:
        // Unknown flag — ignore so scenario-specific flags can be added later
        break;
    }
  }
  return out;
}

/**
 * Merge in order: scenarioDefaults → env → cli (later wins).
 */
export function mergeConfig(
  cli: Record<string, unknown>,
  env: Record<string, string | undefined>,
  defaults: Record<string, unknown>,
): Record<string, unknown> {
  const merged: Record<string, unknown> = { ...defaults };

  // Env overlay — SOAK_UPPER_SNAKE → lowerCamel in cli space.
  const envMap: Record<string, string> = {
    SOAK_HOLES: 'holes',
    SOAK_ROOMS: 'rooms',
    SOAK_ACCOUNTS: 'accounts',
    SOAK_CPUS_PER_ROOM: 'cpusPerRoom',
    SOAK_GAMES_PER_ROOM: 'gamesPerRoom',
    SOAK_WATCH: 'watch',
    SOAK_DASHBOARD_PORT: 'dashboardPort',
  };
for (const [envKey, cfgKey] of Object.entries(envMap)) {
|
||
const v = env[envKey];
|
||
if (v !== undefined) {
|
||
// Heuristic: numeric keys
|
||
if (/^(holes|rooms|accounts|cpusPerRoom|gamesPerRoom|dashboardPort)$/.test(cfgKey)) {
|
||
merged[cfgKey] = parseInt(v, 10);
|
||
} else {
|
||
merged[cfgKey] = v;
|
||
}
|
||
}
|
||
}
|
||
|
||
// CLI overlay — wins over env and defaults.
|
||
for (const [k, v] of Object.entries(cli)) {
|
||
if (v !== undefined) merged[k] = v;
|
||
}
|
||
|
||
return merged;
|
||
}
|
||
```
|
||
|
||
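One consequence of the flag pattern in `parseArgs` is easy to miss: only the `--key=value` form is recognised, so a space-separated `--rooms 4` is skipped without an error. The sketch below isolates that pattern to show the behaviour. `splitFlag` is a hypothetical helper written for this illustration and is not part of `config.ts`.

```typescript
// Same pattern parseArgs uses: a lowercase flag name, "=", then the value.
const FLAG_RE = /^--([a-z][a-z0-9-]*)=(.*)$/;

// Hypothetical helper (illustration only): split one argv token.
function splitFlag(arg: string): { key: string; value: string } | null {
  const m = arg.match(FLAG_RE);
  return m ? { key: m[1], value: m[2] } : null;
}

// "--rooms=4" parses. The space-separated form arrives as two tokens,
// "--rooms" and "4", and neither matches, so both are ignored.
// A value may itself contain "=" because the key part cannot.
const parsed = ['--rooms=4', '--rooms', '4', '--target=http://x?a=b'].map(splitFlag);
console.log(parsed);
```

If silent skipping turns out to be surprising in practice, one option is to warn on any token that starts with `--` but fails the pattern.
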
- [ ] **Step 4: Fix the failing middle test as noted in Step 1**

Edit `tests/soak/tests/config.test.ts` and replace the second `it(...)` block inside `describe('mergeConfig')` with the corrected version provided in Step 1.

- [ ] **Step 5: Run tests to verify they pass**

```bash
npx vitest run tests/config.test.ts
```

Expected: all passing.

- [ ] **Step 6: Commit**

```bash
git add tests/soak/config.ts tests/soak/tests/config.test.ts
git commit -m "$(cat <<'EOF'
feat(soak): CLI parsing + config precedence

parseArgs pulls --scenario/--rooms/--watch/etc from argv, mergeConfig
layers scenarioDefaults → env → CLI so CLI flags always win. Unit
tested.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```

---

### Task 18: `runner.ts` entry point — first end-to-end milestone

Replaces the placeholder runner with the real thing: parse args, build dependencies, load scenario, acquire sessions, run scenario, clean up, print summary. Supports `--watch=none` only at this stage.

**Files:**
- Modify: `tests/soak/runner.ts` (replace placeholder)

- [ ] **Step 1: Rewrite `runner.ts`**

```typescript
#!/usr/bin/env tsx
/**
 * Golf Soak Harness: entry point.
 *
 * Usage:
 *   TEST_URL=http://localhost:8000 \
 *   SOAK_INVITE_CODE=SOAKTEST \
 *   npm run soak -- --scenario=populate --rooms=1 --accounts=2 \
 *     --cpus-per-room=0 --games-per-room=1 --holes=1 --watch=none
 */

import * as path from 'path';
// CliArgs is an interface, so import it as a type to keep it out of the emitted JS.
import { parseArgs, mergeConfig, type CliArgs } from './config';
import { createLogger } from './core/logger';
import { SessionPool } from './core/session-pool';
import { RoomCoordinator } from './core/room-coordinator';
import { getScenario, listScenarios } from './scenarios';
import type { DashboardReporter, ScenarioContext } from './core/types';

function noopDashboard(): DashboardReporter {
  return {
    update: () => {},
    log: () => {},
    incrementMetric: () => {},
  };
}

function printScenarioList(): void {
  console.log('Available scenarios:');
  for (const s of listScenarios()) {
    console.log(` ${s.name.padEnd(12)} ${s.description}`);
    console.log(` needs: accounts=${s.needs.accounts}, rooms=${s.needs.rooms ?? 1}, cpus=${s.needs.cpusPerRoom ?? 0}`);
  }
}

async function main(): Promise<void> {
  const cli: CliArgs = parseArgs(process.argv.slice(2));

  if (cli.listOnly) {
    printScenarioList();
    return;
  }

  if (!cli.scenario) {
    console.error('Error: --scenario=<name> is required. Use --list to see scenarios.');
    process.exit(2);
  }

  const scenario = getScenario(cli.scenario);
  if (!scenario) {
    console.error(`Error: unknown scenario "${cli.scenario}". Use --list to see scenarios.`);
    process.exit(2);
  }

  const runId = cli.runId ?? `${cli.scenario}-${new Date().toISOString().replace(/[:.]/g, '-')}`;
  const targetUrl = cli.target ?? process.env.TEST_URL ?? 'http://localhost:8000';
  const inviteCode = process.env.SOAK_INVITE_CODE ?? 'SOAKTEST';
  const watch = cli.watch ?? 'dashboard';

  const logger = createLogger({ runId });
  logger.info('run_start', {
    scenario: scenario.name,
    targetUrl,
    watch,
    cli,
  });

  // Resolve final config
  const config = mergeConfig(
    cli as Record<string, unknown>,
    process.env,
    scenario.defaultConfig,
  );
  // Ensure core knobs exist
  const accounts = Number(config.accounts ?? scenario.needs.accounts);
  const rooms = Number(config.rooms ?? scenario.needs.rooms ?? 1);
  const cpusPerRoom = Number(config.cpusPerRoom ?? scenario.needs.cpusPerRoom ?? 0);
  if (accounts % rooms !== 0) {
    console.error(`Error: --accounts=${accounts} does not divide evenly into --rooms=${rooms}`);
    process.exit(2);
  }
  config.rooms = rooms;
  config.cpusPerRoom = cpusPerRoom;

  if (cli.dryRun) {
    logger.info('dry_run', { config });
    console.log('Dry run OK. Resolved config:');
    console.log(JSON.stringify(config, null, 2));
    return;
  }

  if (watch !== 'none') {
    logger.warn('watch_mode_not_yet_implemented', { watch });
    console.warn(`Watch mode "${watch}" not yet implemented — falling back to "none".`);
  }

  // Build dependencies
  const credFile = path.resolve(__dirname, '.env.stresstest');
  const pool = new SessionPool({
    targetUrl,
    inviteCode,
    credFile,
    logger,
  });
  const coordinator = new RoomCoordinator();
  const dashboard = noopDashboard();
  const abortController = new AbortController();

  const onSignal = (sig: string) => {
    logger.warn('signal_received', { signal: sig });
    abortController.abort();
  };
  process.on('SIGINT', () => onSignal('SIGINT'));
  process.on('SIGTERM', () => onSignal('SIGTERM'));

  let exitCode = 0;
  try {
    const sessions = await pool.acquire(accounts);
    logger.info('sessions_acquired', { count: sessions.length });

    const ctx: ScenarioContext = {
      config,
      sessions,
      coordinator,
      dashboard,
      logger,
      signal: abortController.signal,
      heartbeat: () => {}, // Task 26 wires this up
    };

    const result = await scenario.run(ctx);
    logger.info('run_complete', {
      gamesCompleted: result.gamesCompleted,
      errors: result.errors.length,
      durationMs: result.durationMs,
    });
    console.log(`Games completed: ${result.gamesCompleted}`);
    console.log(`Errors: ${result.errors.length}`);
    console.log(`Duration: ${(result.durationMs / 1000).toFixed(1)}s`);
    if (result.errors.length > 0) {
      console.log('Errors:');
      for (const e of result.errors) {
        console.log(` ${e.room}: ${e.reason}${e.detail ? ' — ' + e.detail : ''}`);
      }
      exitCode = 1;
    }
  } catch (err) {
    logger.error('run_failed', {
      error: err instanceof Error ? err.message : String(err),
      stack: err instanceof Error ? err.stack : undefined,
    });
    exitCode = 1;
  } finally {
    await pool.release();
  }

  if (abortController.signal.aborted && exitCode === 0) exitCode = 2;
  process.exit(exitCode);
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
```

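The runner's exit codes are spread across several branches, so it may help to see them in one place. The sketch below restates the same decision as a pure function. `RunOutcome` and `exitCodeFor` are hypothetical names used only for illustration; the runner itself keeps the logic inline.

```typescript
// Hypothetical summary of one run (illustration only).
interface RunOutcome {
  roomErrors: number; // result.errors.length
  threw: boolean;     // pool.acquire or scenario.run rejected
  aborted: boolean;   // SIGINT/SIGTERM tripped the AbortController
}

// Mirrors the inline logic: failures win over an abort, and an abort only
// changes the code when the run would otherwise have exited 0.
function exitCodeFor(o: RunOutcome): 0 | 1 | 2 {
  if (o.threw || o.roomErrors > 0) return 1;
  if (o.aborted) return 2;
  return 0;
}

console.log(exitCodeFor({ roomErrors: 0, threw: false, aborted: true }));
```

Note that usage errors (missing or unknown `--scenario`, or accounts not divisible by rooms) also exit with 2, so a caller cannot distinguish them from an interrupted run by exit code alone.
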
- [ ] **Step 2: Run a minimal `--watch=none` smoke against local dev**

Server running, 4 soak accounts already seeded from Task 14:

```bash
cd tests/soak
TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run soak -- \
  --scenario=populate \
  --accounts=2 \
  --rooms=1 \
  --cpus-per-room=0 \
  --games-per-room=1 \
  --holes=1 \
  --watch=none
```

Expected output (abbreviated):

```
{"timestamp":"...","level":"info","msg":"run_start",...}
{"timestamp":"...","level":"info","msg":"sessions_acquired","count":2}
{"timestamp":"...","level":"info","msg":"game_start","room":"room-0","game":1}
{"timestamp":"...","level":"info","msg":"room_created","code":"XXXX"}
{"timestamp":"...","level":"info","msg":"game_complete","room":"room-0","turns":...}
{"timestamp":"...","level":"info","msg":"run_complete","gamesCompleted":1,"errors":0}
Games completed: 1
Errors: 0
Duration: X.Xs
```

Exit code 0.

This is the first **end-to-end milestone**. Stop here if debugging is needed, and fix issues before moving on.

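When the smoke run is scripted rather than eyeballed, the JSON log lines can be checked mechanically. The sketch below scans mixed output for the `run_complete` record. It assumes the one-JSON-object-per-line format and the `msg`, `gamesCompleted`, and `errors` field names shown in the expected output above; `findRunComplete` is a hypothetical helper, not part of the harness.

```typescript
interface RunComplete {
  gamesCompleted: number;
  errors: number;
}

// Hypothetical checker (illustration only): find the run_complete record
// in runner output that mixes JSON log lines with plain-text summary lines.
function findRunComplete(output: string): RunComplete | null {
  for (const line of output.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed.startsWith('{')) continue; // skip "Games completed: 1" etc.
    try {
      const rec = JSON.parse(trimmed) as Record<string, unknown>;
      if (rec.msg === 'run_complete') {
        return {
          gamesCompleted: Number(rec.gamesCompleted),
          errors: Number(rec.errors),
        };
      }
    } catch {
      // Not valid JSON; keep scanning.
    }
  }
  return null;
}
```

A wrapper script could then fail the smoke step when the result is `null`, `errors` is non-zero, or `gamesCompleted` does not match `--games-per-room` times `--rooms`.
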
- [ ] **Step 3: Commit**

```bash
git add tests/soak/runner.ts
git commit -m "$(cat <<'EOF'
feat(soak): runner.ts end-to-end with --watch=none

First full end-to-end milestone: parses CLI, builds SessionPool +
RoomCoordinator, loads a scenario by name, runs it, reports results,
cleans up. Watch modes other than "none" log a warning and fall back
until Tasks 19-24 implement them.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```

---

## Phase 5 — Dashboard status grid

### Task 19: Dashboard HTTP + WS server

Vanilla node `http` + `ws`. Serves one static HTML page, accepts WS connections, broadcasts room-state updates.

**Files:**
- Create: `tests/soak/dashboard/server.ts`

- [ ] **Step 1: Implement `dashboard/server.ts`**

```typescript
// tests/soak/dashboard/server.ts
import * as http from 'http';
import * as fs from 'fs';
import * as path from 'path';
import { WebSocketServer, WebSocket } from 'ws';
import type { DashboardReporter, Logger, RoomState } from '../core/types';

export type DashboardIncoming =
  | { type: 'start_stream'; sessionKey: string }
  | { type: 'stop_stream'; sessionKey: string };

export type DashboardOutgoing =
  | { type: 'room_state'; roomId: string; state: Partial<RoomState> }
  | { type: 'log'; level: string; msg: string; meta?: object; timestamp: number }
  | { type: 'metric'; name: string; value: number }
  | { type: 'frame'; sessionKey: string; jpegBase64: string };

export interface DashboardHandlers {
  onStartStream?(sessionKey: string): void;
  onStopStream?(sessionKey: string): void;
  onDisconnect?(): void;
}

export class DashboardServer {
  private httpServer!: http.Server;
  private wsServer!: WebSocketServer;
  private clients = new Set<WebSocket>();
  private metrics: Record<string, number> = {};
  private roomStates: Record<string, Partial<RoomState>> = {};

  constructor(
    private port: number,
    private logger: Logger,
    private handlers: DashboardHandlers = {},
  ) {}

  async start(): Promise<void> {
    const htmlPath = path.resolve(__dirname, 'index.html');
    const cssPath = path.resolve(__dirname, 'dashboard.css');
    const jsPath = path.resolve(__dirname, 'dashboard.js');

    this.httpServer = http.createServer((req, res) => {
      const url = req.url ?? '/';
      if (url === '/' || url === '/index.html') {
        res.writeHead(200, { 'Content-Type': 'text/html; charset=utf-8' });
        fs.createReadStream(htmlPath).pipe(res);
      } else if (url === '/dashboard.css') {
        res.writeHead(200, { 'Content-Type': 'text/css' });
        fs.createReadStream(cssPath).pipe(res);
      } else if (url === '/dashboard.js') {
        res.writeHead(200, { 'Content-Type': 'application/javascript' });
        fs.createReadStream(jsPath).pipe(res);
      } else {
        res.writeHead(404);
        res.end('not found');
      }
    });

    this.wsServer = new WebSocketServer({ server: this.httpServer });
    this.wsServer.on('connection', (ws) => {
      this.clients.add(ws);
      this.logger.info('dashboard_client_connected', { count: this.clients.size });

      // Replay current state to the new client
      for (const [roomId, state] of Object.entries(this.roomStates)) {
        ws.send(JSON.stringify({ type: 'room_state', roomId, state } as DashboardOutgoing));
      }
      for (const [name, value] of Object.entries(this.metrics)) {
        ws.send(JSON.stringify({ type: 'metric', name, value } as DashboardOutgoing));
      }

      ws.on('message', (data) => {
        try {
          const parsed = JSON.parse(data.toString()) as DashboardIncoming;
          if (parsed.type === 'start_stream' && this.handlers.onStartStream) {
            this.handlers.onStartStream(parsed.sessionKey);
          } else if (parsed.type === 'stop_stream' && this.handlers.onStopStream) {
            this.handlers.onStopStream(parsed.sessionKey);
          }
        } catch (err) {
          this.logger.warn('dashboard_ws_parse_error', {
            error: err instanceof Error ? err.message : String(err),
          });
        }
      });

      ws.on('close', () => {
        this.clients.delete(ws);
        this.logger.info('dashboard_client_disconnected', { count: this.clients.size });
        if (this.clients.size === 0 && this.handlers.onDisconnect) {
          this.handlers.onDisconnect();
        }
      });
    });

    await new Promise<void>((resolve) => {
      this.httpServer.listen(this.port, () => resolve());
    });
    this.logger.info('dashboard_listening', { url: `http://localhost:${this.port}` });
  }

  async stop(): Promise<void> {
    for (const ws of this.clients) {
      try {
        ws.close();
      } catch {
        // ignore
      }
    }
    this.clients.clear();
    await new Promise<void>((resolve) => {
      this.wsServer.close(() => resolve());
    });
    await new Promise<void>((resolve) => {
      this.httpServer.close(() => resolve());
    });
  }

  broadcast(msg: DashboardOutgoing): void {
    const payload = JSON.stringify(msg);
    for (const ws of this.clients) {
      if (ws.readyState === WebSocket.OPEN) {
        ws.send(payload);
      }
    }
  }

  /** Create a DashboardReporter wired to this server. */
  reporter(): DashboardReporter {
    return {
      update: (roomId, state) => {
        this.roomStates[roomId] = { ...this.roomStates[roomId], ...state };
        this.broadcast({ type: 'room_state', roomId, state });
      },
      log: (level, msg, meta) => {
        this.broadcast({ type: 'log', level, msg, meta, timestamp: Date.now() });
      },
      incrementMetric: (name, by = 1) => {
        this.metrics[name] = (this.metrics[name] ?? 0) + by;
        this.broadcast({ type: 'metric', name, value: this.metrics[name] });
      },
    };
  }
}
```

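The message handler above casts the parsed JSON to `DashboardIncoming` without checking its shape, so a payload such as `{"type":"start_stream"}` would pass `undefined` to `onStartStream`. A runtime type guard is one way to tighten that. The sketch below is an optional hardening, not something the plan requires; `isDashboardIncoming` is a hypothetical name, and the union type is repeated here only so the block stands alone.

```typescript
type DashboardIncoming =
  | { type: 'start_stream'; sessionKey: string }
  | { type: 'stop_stream'; sessionKey: string };

// Hypothetical guard (illustration only): accept a value only when it has
// a known message type and a string sessionKey.
function isDashboardIncoming(value: unknown): value is DashboardIncoming {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    (v.type === 'start_stream' || v.type === 'stop_stream') &&
    typeof v.sessionKey === 'string'
  );
}

const candidate: unknown = JSON.parse('{"type":"start_stream","sessionKey":"soak_00"}');
if (isDashboardIncoming(candidate)) {
  console.log(`would start streaming ${candidate.sessionKey}`);
}
```

In the server, the guard would sit between `JSON.parse` and the handler dispatch, with rejected payloads logged at `warn` level alongside the existing parse-error case.
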
- [ ] **Step 2: Syntax-check**

```bash
cd tests/soak
npx tsx -e "import('./dashboard/server').then(() => console.log('ok'))"
```

Expected: `ok`.

- [ ] **Step 3: Commit**

```bash
git add tests/soak/dashboard/server.ts
git commit -m "$(cat <<'EOF'
feat(soak): DashboardServer — vanilla http + ws

Serves one static HTML page, accepts WS connections, broadcasts
room_state/log/metric messages to all clients. Exposes a
reporter() method that returns a DashboardReporter scenarios can
call without knowing about sockets.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```

---

### Task 20: Dashboard HTML/CSS/JS status grid

Single static HTML page + stylesheet + client script. Renders the 2×2 room grid, subscribes to WS, updates tiles on each message.

**Files:**
- Create: `tests/soak/dashboard/index.html`
- Create: `tests/soak/dashboard/dashboard.css`
- Create: `tests/soak/dashboard/dashboard.js`

- [ ] **Step 1: Create `dashboard/index.html`**

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>Golf Soak Dashboard</title>
  <link rel="stylesheet" href="/dashboard.css">
</head>
<body>
  <header class="dash-header">
    <h1>⛳ Golf Soak Dashboard</h1>
    <div class="meta">
      <span id="run-id">run —</span>
      <span id="elapsed">00:00:00</span>
    </div>
  </header>

  <div class="meta-bar">
    <div class="stat"><span class="label">Games</span><span id="metric-games">0</span></div>
    <div class="stat"><span class="label">Moves</span><span id="metric-moves">0</span></div>
    <div class="stat"><span class="label">Errors</span><span id="metric-errors">0</span></div>
    <div class="stat"><span class="label">WS</span><span id="ws-status">connecting</span></div>
  </div>

  <div class="rooms" id="rooms">
    <!-- Room tiles injected by dashboard.js -->
  </div>

  <section class="log">
    <div class="log-header">Activity Log</div>
    <ul id="log-list"></ul>
  </section>

  <!-- Modal for focused live video (Task 23) -->
  <div id="video-modal" class="video-modal hidden">
    <div class="video-modal-content">
      <div class="video-modal-header">
        <span id="video-modal-title">Watching —</span>
        <button id="video-modal-close">Close</button>
      </div>
      <img id="video-frame" alt="Live screencast" />
    </div>
  </div>

  <script src="/dashboard.js"></script>
</body>
</html>
```

- [ ] **Step 2: Create `dashboard/dashboard.css`**

```css
:root {
  --bg: #0a0e16;
  --panel: #0e1420;
  --border: #1a2230;
  --text: #c8d4e4;
  --accent: #7fbaff;
  --good: #6fd08f;
  --warn: #ffb84d;
  --err: #ff5c6c;
  --muted: #556577;
}

* { box-sizing: border-box; }

body {
  margin: 0;
  font-family: -apple-system, system-ui, 'SF Mono', Consolas, monospace;
  background: var(--bg);
  color: var(--text);
}

.dash-header {
  display: flex;
  justify-content: space-between;
  align-items: center;
  padding: 12px 20px;
  background: linear-gradient(135deg, #0f1823, #0a1018);
  border-bottom: 1px solid var(--border);
}
.dash-header h1 { margin: 0; font-size: 16px; color: var(--accent); }
.dash-header .meta { font-size: 11px; color: var(--muted); }
.dash-header .meta span + span { margin-left: 12px; }

.meta-bar {
  display: flex;
  gap: 24px;
  padding: 10px 20px;
  background: #0c131d;
  border-bottom: 1px solid var(--border);
  font-size: 12px;
}
.meta-bar .stat .label { color: var(--muted); margin-right: 6px; }
.meta-bar .stat span:last-child { color: #fff; font-weight: 600; }

.rooms {
  display: grid;
  grid-template-columns: 1fr 1fr;
  gap: 1px;
  background: var(--border);
}
.room {
  background: var(--panel);
  padding: 14px 18px;
  min-height: 180px;
}
.room-title {
  display: flex;
  justify-content: space-between;
  align-items: center;
  margin-bottom: 10px;
}
.room-title .name { font-size: 13px; color: var(--accent); font-weight: 600; }
.room-title .phase {
  font-size: 10px;
  padding: 2px 8px;
  border-radius: 10px;
  background: #1a3a2a;
  color: var(--good);
}
.room-title .phase.lobby { background: #3a2a1a; color: var(--warn); }
.room-title .phase.err { background: #3a1a1a; color: var(--err); }

.players {
  display: grid;
  grid-template-columns: repeat(2, 1fr);
  gap: 4px;
  font-size: 11px;
  margin-bottom: 8px;
}
.player {
  display: flex;
  justify-content: space-between;
  padding: 4px 8px;
  background: #0a0f18;
  border-radius: 3px;
  cursor: pointer;
  border: 1px solid transparent;
}
.player:hover { border-color: var(--accent); }
.player.active {
  background: #1a2a40;
  border-left: 2px solid var(--accent);
}
.player .score { color: var(--muted); }

.progress-bar {
  height: 4px;
  background: var(--border);
  border-radius: 2px;
  overflow: hidden;
  margin-top: 6px;
}
.progress-fill {
  height: 100%;
  background: linear-gradient(90deg, var(--accent), var(--good));
  transition: width 0.3s;
}
.room-meta {
  font-size: 10px;
  color: var(--muted);
  display: flex;
  gap: 12px;
  margin-top: 6px;
}

.log {
  border-top: 1px solid var(--border);
  background: #080c13;
  max-height: 160px;
  overflow-y: auto;
}
.log .log-header {
  padding: 6px 20px;
  font-size: 10px;
  text-transform: uppercase;
  color: var(--muted);
  border-bottom: 1px solid var(--border);
}
.log ul { list-style: none; margin: 0; padding: 4px 20px; font-size: 10px; }
.log li { line-height: 1.5; font-family: monospace; color: var(--muted); }
.log li.warn { color: var(--warn); }
.log li.error { color: var(--err); }

.video-modal {
  position: fixed;
  inset: 0;
  background: rgba(0, 0, 0, 0.85);
  display: flex;
  align-items: center;
  justify-content: center;
  z-index: 100;
}
.video-modal.hidden { display: none; }
.video-modal-content {
  background: var(--panel);
  border: 1px solid var(--border);
  border-radius: 6px;
  padding: 16px;
  max-width: 90vw;
  max-height: 90vh;
}
.video-modal-header {
  display: flex;
  justify-content: space-between;
  align-items: center;
  margin-bottom: 12px;
  color: var(--accent);
  font-size: 13px;
}
.video-modal-header button {
  background: var(--border);
  color: var(--text);
  border: none;
  padding: 4px 12px;
  border-radius: 3px;
  cursor: pointer;
}
#video-frame {
  display: block;
  max-width: 100%;
  max-height: 70vh;
  border: 1px solid var(--border);
}
```

- [ ] **Step 3: Create `dashboard/dashboard.js`**

```javascript
// tests/soak/dashboard/dashboard.js
(() => {
  const ws = new WebSocket(`ws://${location.host}`);
  const roomsEl = document.getElementById('rooms');
  const logEl = document.getElementById('log-list');
  const wsStatusEl = document.getElementById('ws-status');
  const metricGames = document.getElementById('metric-games');
  const metricMoves = document.getElementById('metric-moves');
  const metricErrors = document.getElementById('metric-errors');
  const elapsedEl = document.getElementById('elapsed');

  const roomTiles = new Map();
  const startTime = Date.now();
  let currentWatchedKey = null;

  // Video modal
  const videoModal = document.getElementById('video-modal');
  const videoFrame = document.getElementById('video-frame');
  const videoTitle = document.getElementById('video-modal-title');
  const videoClose = document.getElementById('video-modal-close');

  function fmtElapsed(ms) {
    const s = Math.floor(ms / 1000);
    const h = Math.floor(s / 3600);
    const m = Math.floor((s % 3600) / 60);
    const sec = s % 60;
    return `${String(h).padStart(2, '0')}:${String(m).padStart(2, '0')}:${String(sec).padStart(2, '0')}`;
  }
  setInterval(() => {
    elapsedEl.textContent = fmtElapsed(Date.now() - startTime);
  }, 1000);

  function ensureRoomTile(roomId) {
    if (roomTiles.has(roomId)) return roomTiles.get(roomId);
    const tile = document.createElement('div');
    tile.className = 'room';
    tile.innerHTML = `
      <div class="room-title">
        <div class="name">${roomId}</div>
        <div class="phase lobby">waiting</div>
      </div>
      <div class="players"></div>
      <div class="progress-bar"><div class="progress-fill" style="width:0%"></div></div>
      <div class="room-meta">
        <span class="moves">0 moves</span>
        <span class="game">game —</span>
      </div>
    `;
    roomsEl.appendChild(tile);
    roomTiles.set(roomId, tile);
    return tile;
  }

  function renderRoomState(roomId, state) {
    const tile = ensureRoomTile(roomId);
    if (state.phase !== undefined) {
      const phaseEl = tile.querySelector('.phase');
      phaseEl.textContent = state.phase;
      phaseEl.classList.toggle('lobby', state.phase === 'lobby' || state.phase === 'waiting');
      phaseEl.classList.toggle('err', state.phase === 'error');
    }
    if (state.players !== undefined) {
      const playersEl = tile.querySelector('.players');
      playersEl.innerHTML = state.players
        .map(
          (p) => `
            <div class="player ${p.isActive ? 'active' : ''}" data-session="${p.key}">
              <span>${p.isActive ? '▶ ' : ''}${p.key}</span>
              <span class="score">${p.score ?? '—'}</span>
            </div>
          `,
        )
        .join('');
    }
    if (state.hole !== undefined && state.totalHoles !== undefined) {
      const fill = tile.querySelector('.progress-fill');
      const pct = state.totalHoles > 0 ? Math.round((state.hole / state.totalHoles) * 100) : 0;
      fill.style.width = `${pct}%`;
    }
    if (state.moves !== undefined) {
      tile.querySelector('.moves').textContent = `${state.moves} moves`;
    }
    if (state.game !== undefined && state.totalGames !== undefined) {
      tile.querySelector('.game').textContent = `game ${state.game}/${state.totalGames}`;
    }
  }

  function appendLog(level, msg, meta) {
    const li = document.createElement('li');
    li.className = level;
    const ts = new Date().toLocaleTimeString();
    li.textContent = `[${ts}] ${msg} ${meta ? JSON.stringify(meta) : ''}`;
    logEl.insertBefore(li, logEl.firstChild);
    // Cap log length
    while (logEl.children.length > 100) {
      logEl.removeChild(logEl.lastChild);
    }
  }

  function applyMetric(name, value) {
    if (name === 'games_completed') metricGames.textContent = value;
    else if (name === 'moves_total') metricMoves.textContent = value;
    else if (name === 'errors') metricErrors.textContent = value;
  }

  ws.addEventListener('open', () => {
    wsStatusEl.textContent = 'healthy';
    wsStatusEl.style.color = 'var(--good)';
  });
  ws.addEventListener('close', () => {
    wsStatusEl.textContent = 'disconnected';
    wsStatusEl.style.color = 'var(--err)';
  });
  ws.addEventListener('message', (event) => {
    let msg;
    try {
      msg = JSON.parse(event.data);
    } catch {
      return;
    }
    if (msg.type === 'room_state') {
      renderRoomState(msg.roomId, msg.state);
    } else if (msg.type === 'log') {
      appendLog(msg.level, msg.msg, msg.meta);
    } else if (msg.type === 'metric') {
      applyMetric(msg.name, msg.value);
    } else if (msg.type === 'frame') {
      if (msg.sessionKey === currentWatchedKey) {
        videoFrame.src = `data:image/jpeg;base64,${msg.jpegBase64}`;
      }
    }
  });

  // Click-to-watch (wired in Task 23)
  roomsEl.addEventListener('click', (e) => {
    const playerEl = e.target.closest('.player');
    if (!playerEl) return;
    const key = playerEl.dataset.session;
    if (!key) return;
    currentWatchedKey = key;
    videoTitle.textContent = `Watching ${key}`;
    videoModal.classList.remove('hidden');
    ws.send(JSON.stringify({ type: 'start_stream', sessionKey: key }));
  });

  function closeVideo() {
    if (currentWatchedKey) {
      ws.send(JSON.stringify({ type: 'stop_stream', sessionKey: currentWatchedKey }));
    }
    currentWatchedKey = null;
    videoModal.classList.add('hidden');
    videoFrame.src = '';
  }
  videoClose.addEventListener('click', closeVideo);
  document.addEventListener('keydown', (e) => {
    if (e.key === 'Escape') closeVideo();
  });
})();
```

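`ensureRoomTile` and `renderRoomState` interpolate `roomId` and `p.key` straight into `innerHTML`. That is probably fine while those values come from the harness itself, but if a room ID or session key ever contained markup it would be parsed as HTML. A small escaping helper is one way to guard against that. The sketch below is an optional hardening under that assumption; `escapeHtml` is a hypothetical helper and is not referenced by the dashboard script above.

```typescript
const HTML_ESCAPES: Record<string, string> = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

// Hypothetical helper (illustration only): escape the five characters that
// matter inside HTML text and double- or single-quoted attribute values.
function escapeHtml(text: string): string {
  return text.replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch] ?? ch);
}

// Usage inside a template: `<div class="name">${escapeHtml(roomId)}</div>`
console.log(escapeHtml('room-0 <script>'));
```

The function body uses no type annotations beyond the signature, so it can be pasted into `dashboard.js` after dropping the `: string` and `Record<...>` annotations.
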
- [ ] **Step 4: Commit**

```bash
git add tests/soak/dashboard/index.html tests/soak/dashboard/dashboard.css tests/soak/dashboard/dashboard.js
git commit -m "$(cat <<'EOF'
feat(soak): dashboard status grid UI

Static HTML page served by DashboardServer. Renders the 2×2 room
grid with progress bars and player tiles, subscribes to WS events,
updates tiles live. Click-to-watch modal is wired but receives
frames once the CDP screencaster ships in Task 22.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```

---

### Task 21: Wire `WATCH=dashboard` in runner

Start the dashboard server when `--watch=dashboard` is set, auto-open the URL in the user's browser, and use its `reporter()` as `ctx.dashboard`.

**Files:**
- Modify: `tests/soak/runner.ts`

- [ ] **Step 1: Import and instantiate DashboardServer in `runner.ts`**

At the top of `runner.ts`, add:

```typescript
import { DashboardServer } from './dashboard/server';
import { spawn } from 'child_process';
```

Replace the block that creates `dashboard` with:

```typescript
  // Build dashboard if requested
  let dashboardServer: DashboardServer | null = null;
  let dashboard: DashboardReporter = noopDashboard();
  if (watch === 'dashboard') {
    const port = Number(config.dashboardPort ?? 7777);
    dashboardServer = new DashboardServer(port, logger, {
      onStartStream: (_key) => {
        logger.info('stream_start_requested', { sessionKey: _key });
        // Wired in Task 22
      },
      onStopStream: (_key) => {
        logger.info('stream_stop_requested', { sessionKey: _key });
      },
    });
    await dashboardServer.start();
    dashboard = dashboardServer.reporter();
    const url = `http://localhost:${port}`;
    console.log(`Dashboard: ${url}`);
    // Best-effort auto-open
    try {
      const isWin = process.platform === 'win32';
      // `start` is a cmd.exe builtin, not an executable, so run it through cmd.
      const opener = process.platform === 'darwin' ? 'open' : isWin ? 'cmd' : 'xdg-open';
      const openerArgs = isWin ? ['/c', 'start', '', url] : [url];
      const child = spawn(opener, openerArgs, { stdio: 'ignore', detached: true });
      // A missing opener surfaces as an async 'error' event, not a throw;
      // without a listener it would crash the run.
      child.on('error', () => {});
      child.unref();
    } catch {
      // If auto-open fails, the URL is already printed
    }
  } else if (watch === 'tiled') {
    logger.warn('tiled_not_yet_implemented');
    console.warn('Watch mode "tiled" not yet implemented (Task 24). Falling back to none.');
  }
```

And in the `finally` block, shut down the server:

```typescript
  } finally {
    await pool.release();
    if (dashboardServer) {
      await dashboardServer.stop();
    }
  }
```

Also remove the earlier `if (watch !== 'none')` warning block; the dispatch above replaces it.

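The platform dispatch for auto-open is easier to reason about, and to unit test, as a pure function that returns the command and arguments for `spawn`. The sketch below is one possible shape, assuming the usual per-platform openers (`open` on macOS, `xdg-open` on Linux desktops, `start` via `cmd` on Windows). `openerFor` and `OpenCommand` are hypothetical names, and the plan does not require factoring the logic out this way.

```typescript
interface OpenCommand {
  command: string;
  args: string[];
}

// Hypothetical helper (illustration only): pick the URL opener for a
// platform string such as process.platform.
function openerFor(platform: string, url: string): OpenCommand {
  switch (platform) {
    case 'darwin':
      return { command: 'open', args: [url] };
    case 'win32':
      // `start` is a cmd.exe builtin. The empty string fills the
      // window-title slot so the URL is not mistaken for a title.
      return { command: 'cmd', args: ['/c', 'start', '', url] };
    default:
      // Assumes a freedesktop-style environment; headless hosts may lack it.
      return { command: 'xdg-open', args: [url] };
  }
}

console.log(openerFor('linux', 'http://localhost:7777'));
```

The caller would pass `process.platform` and hand the result to `spawn(command, args, ...)`, still attaching an `'error'` listener for hosts where the opener is not installed.
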
- [ ] **Step 2: Run smoke against dev with dashboard**

```bash
TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run soak -- \
  --scenario=populate \
  --accounts=2 --rooms=1 --cpus-per-room=0 --games-per-room=1 --holes=1 \
  --watch=dashboard
```

Expected:
- `Dashboard: http://localhost:7777` printed
- Browser auto-opens (or you open it manually)
- Page shows the dashboard with `WS: healthy`
- During the game, the `room-0` tile shows `phase: playing`, increments `moves`, updates progress
- After game completes, the runner exits 0 and the dashboard stops

- [ ] **Step 3: Commit**

```bash
git add tests/soak/runner.ts
git commit -m "$(cat <<'EOF'
feat(soak): wire --watch=dashboard in runner

Starts DashboardServer on 7777 (configurable), uses its reporter as
ctx.dashboard, auto-opens the URL. Cleans up on exit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```

---

## Phase 6 — Live video click-to-watch

### Task 22: CDP screencast module

Attach a CDP session to a given page, start a JPEG screencast that delivers every Nth frame, forward each frame to a callback, and detach on stop.

**Files:**
- Create: `tests/soak/core/screencaster.ts`

- [ ] **Step 1: Implement `core/screencaster.ts`**

```typescript
// tests/soak/core/screencaster.ts
import type { Page, CDPSession } from 'playwright-core';
import type { Logger } from './types';

export interface ScreencastOptions {
  format?: 'jpeg' | 'png';
  quality?: number;
  maxWidth?: number;
  maxHeight?: number;
  everyNthFrame?: number;
}

export type FrameCallback = (jpegBase64: string) => void;

export class Screencaster {
  private sessions = new Map<string, CDPSession>();

  constructor(private logger: Logger) {}

  /**
   * Attach a CDP session to the given page and start forwarding frames.
   * If already streaming, this is a no-op.
   */
  async start(
    sessionKey: string,
    page: Page,
    onFrame: FrameCallback,
    opts: ScreencastOptions = {},
  ): Promise<void> {
    if (this.sessions.has(sessionKey)) {
      this.logger.warn('screencast_already_running', { sessionKey });
      return;
    }
    const client = await page.context().newCDPSession(page);
    this.sessions.set(sessionKey, client);

    // Each frame must be acknowledged, or Chromium stops sending new ones
    client.on('Page.screencastFrame', async (evt: { data: string; sessionId: number }) => {
      try {
        onFrame(evt.data);
        await client.send('Page.screencastFrameAck', { sessionId: evt.sessionId });
      } catch (err) {
        this.logger.warn('screencast_frame_error', {
          sessionKey,
          error: err instanceof Error ? err.message : String(err),
        });
      }
    });

    try {
      await client.send('Page.startScreencast', {
        format: opts.format ?? 'jpeg',
        quality: opts.quality ?? 60,
        maxWidth: opts.maxWidth ?? 640,
        maxHeight: opts.maxHeight ?? 360,
        everyNthFrame: opts.everyNthFrame ?? 2,
      });
    } catch (err) {
      // Don't leave a dead entry behind, or later start() calls would no-op
      this.sessions.delete(sessionKey);
      await client.detach().catch(() => {});
      throw err;
    }
    this.logger.info('screencast_started', { sessionKey });
  }

  async stop(sessionKey: string): Promise<void> {
    const client = this.sessions.get(sessionKey);
    if (!client) return;
    try {
      await client.send('Page.stopScreencast');
      await client.detach();
    } catch (err) {
      this.logger.warn('screencast_stop_error', {
        sessionKey,
        error: err instanceof Error ? err.message : String(err),
      });
    }
    this.sessions.delete(sessionKey);
    this.logger.info('screencast_stopped', { sessionKey });
  }

  async stopAll(): Promise<void> {
    const keys = Array.from(this.sessions.keys());
    await Promise.all(keys.map((k) => this.stop(k)));
  }
}
```

- [ ] **Step 2: Syntax-check**

```bash
cd tests/soak
npx tsx -e "import('./core/screencaster').then(() => console.log('ok'))"
```

Expected: `ok`.

- [ ] **Step 3: Commit**

```bash
git add tests/soak/core/screencaster.ts
git commit -m "$(cat <<'EOF'
feat(soak): Screencaster — CDP Page.startScreencast wrapper

Attach/detach CDP sessions per Playwright Page, start/stop JPEG
screencasts with configurable quality and frame rate, forward each
frame to a callback. Used by the dashboard for click-to-watch
live video.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```

---

### Task 23: Wire screencaster to dashboard click-to-watch

The runner creates a `Screencaster` and passes `onStartStream`/`onStopStream` callbacks to `DashboardServer`. Each callback looks up the requested session and starts or stops streaming. Every frame is broadcast to the dashboard.

**Files:**
- Modify: `tests/soak/runner.ts`

- [ ] **Step 1: Import Screencaster and hold a sessions map**

In `runner.ts`, add at the top:

```typescript
import { Screencaster } from './core/screencaster';
```

After `const sessions = await pool.acquire(accounts);`, build a lookup map:

```typescript
  const sessionsByKey = new Map<string, typeof sessions[number]>();
  for (const s of sessions) sessionsByKey.set(s.key, s);
```

Create the screencaster before the dashboard (or right after sessions are acquired):

```typescript
  const screencaster = new Screencaster(logger);
```

- [ ] **Step 2: Replace the `onStartStream`/`onStopStream` no-ops with real wiring**

The real handlers close over `screencaster` and `sessionsByKey`, and both exist only once sessions have been acquired. Move the `DashboardServer` construction to AFTER `const sessions = await pool.acquire(accounts)`, then replace the no-op handlers:

```typescript
  if (watch === 'dashboard') {
    const port = Number(config.dashboardPort ?? 7777);
    dashboardServer = new DashboardServer(port, logger, {
      onStartStream: (key) => {
        const session = sessionsByKey.get(key);
        if (!session) {
          logger.warn('stream_start_unknown_session', { sessionKey: key });
          return;
        }
        screencaster
          .start(key, session.page, (jpegBase64) => {
            dashboardServer!.broadcast({ type: 'frame', sessionKey: key, jpegBase64 });
          })
          .catch((err) =>
            logger.error('screencast_start_failed', {
              sessionKey: key,
              error: err instanceof Error ? err.message : String(err),
            }),
          );
      },
      onStopStream: (key) => {
        screencaster.stop(key).catch(() => {});
      },
      onDisconnect: () => {
        screencaster.stopAll().catch(() => {});
      },
    });
    await dashboardServer.start();
    dashboard = dashboardServer.reporter();
    const url = `http://localhost:${port}`;
    console.log(`Dashboard: ${url}`);
    // ... auto-open
  }
```

Make sure the `ctx.dashboard` assignment happens AFTER the dashboard setup. It already does, because `const ctx = { ... dashboard, ... }` comes later.

In the `finally` block, add (before `pool.release()`, so the CDP sessions detach while their pages are still open):

```typescript
    await screencaster.stopAll();
```

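On the dashboard page, each `frame` message carries a base64 payload that can be displayed by assigning a data URL to an `<img>` element. The following is a minimal sketch of that conversion, assuming the message shape passed to `broadcast` above. `FrameMessage` and `frameToDataUrl` are hypothetical names for illustration, not part of the existing dashboard code.

```typescript
// Hypothetical message shape, mirroring the object passed to broadcast() above.
interface FrameMessage {
  type: 'frame';
  sessionKey: string;
  jpegBase64: string;
}

// Build a data URL suitable for <img src=...>.
// Assumes the screencast uses the default 'jpeg' format.
function frameToDataUrl(msg: FrameMessage): string {
  return `data:image/jpeg;base64,${msg.jpegBase64}`;
}

// Example with a truncated placeholder payload (not a real image).
const exampleFrame: FrameMessage = {
  type: 'frame',
  sessionKey: 'soak_00',
  jpegBase64: '/9j/4AAQ',
};
console.log(frameToDataUrl(exampleFrame));
```

If the screencast is ever started with `format: 'png'`, the MIME type in the data URL would need to change to match.
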
- [ ] **Step 3: Manual test end-to-end**

Run a longer populate game so there's time to click:

```bash
TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run soak -- \
  --scenario=populate \
  --accounts=4 --rooms=1 --cpus-per-room=0 --games-per-room=2 --holes=3 \
  --watch=dashboard
```

Expected:
1. Dashboard opens, shows 1 room with 4 players
2. Click on any player tile (`soak_00`, `soak_01`, ...)
3. Modal opens, shows live JPEG frames of that player's view of the game
4. Close modal (Esc or Close button): frames stop, screencast detaches
5. Run completes cleanly

- [ ] **Step 4: Commit**

```bash
git add tests/soak/runner.ts
git commit -m "$(cat <<'EOF'
feat(soak): click-to-watch live video via CDP screencast

Runner creates a Screencaster and wires its start/stop into
DashboardServer.onStartStream/onStopStream. Clicking a player tile
in the dashboard starts a CDP screencast on that session's page,
forwards JPEG frames as WS "frame" messages, closes on modal
dismiss or WS disconnect.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```

---

## Phase 7 — Tiled mode

### Task 24: `--watch=tiled` native windows

Launch a second, headed browser for the host contexts (4 in a full run) and position their windows in a 2×2 grid by calling `window.moveTo` and `window.resizeTo` through `page.evaluate`.

**Files:**
- Modify: `tests/soak/core/session-pool.ts` — add optional headed-host support
- Modify: `tests/soak/runner.ts` — enable tiled mode

- [ ] **Step 1: Extend `SessionPool` to support headed host contexts**

Add a new option and method to `SessionPool`. In `core/session-pool.ts`:

```typescript
export interface SessionPoolOptions {
  targetUrl: string;
  inviteCode: string;
  credFile: string;
  logger: Logger;
  browser?: Browser;
  contextOptions?: Parameters<Browser['newContext']>[0];
  /** If set, the first `headedHostCount` sessions use a separate headed browser. */
  headedHostCount?: number;
}
```

Inside the class, add a `headedBrowser` field and extend `acquire`:

```typescript
  private headedBrowser: Browser | null = null;

  // ... in acquire(), before the loop:

    if ((this.opts.headedHostCount ?? 0) > 0 && !this.headedBrowser) {
      this.headedBrowser = await chromium.launch({
        headless: false,
        slowMo: 50,
      });
    }

    for (let i = 0; i < count; i++) {
      const account = this.accounts[i];
      const useHeaded = i < (this.opts.headedHostCount ?? 0);
      const targetBrowser = useHeaded ? this.headedBrowser! : this.browser!;
      const context = await targetBrowser.newContext({
        ...this.opts.contextOptions,
        ...(useHeaded ? { viewport: { width: 960, height: 540 } } : {}),
      });
      await this.injectAuth(context, account);
      const page = await context.newPage();
      await page.goto(this.opts.targetUrl);

      // Position headed windows in a 2×2 grid
      if (useHeaded) {
        const col = i % 2;
        const row = Math.floor(i / 2);
        const x = col * 960;
        const y = row * 560;
        await page.evaluate(
          ([x, y, w, h]) => {
            window.moveTo(x, y);
            window.resizeTo(w, h);
          },
          [x, y, 960, 540] as [number, number, number, number],
        );
      }

      const bot = new GolfBot(page);
      sessions.push({ account, context, page, bot, key: account.key });
    }
```

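The grid arithmetic in the loop above can be pulled into a small pure function, which makes it easy to check without launching a browser. This is a minimal sketch: `tileBounds` and `WindowBounds` are hypothetical names, and the 20 px gap is inferred from the `row * 560` offset used with a 540 px window height. Chromium may ignore `window.moveTo`/`window.resizeTo` for windows that were not opened by script, so the placement should be treated as best-effort. The CDP `Browser.setWindowBounds` command is one possible alternative if the windows do not move.

```typescript
// Hypothetical helper: where the Nth headed window should go in the tile grid.
interface WindowBounds {
  left: number;
  top: number;
  width: number;
  height: number;
}

function tileBounds(
  index: number,
  columns = 2,
  width = 960,
  height = 540,
  rowGap = 20, // assumed: matches the 560px row stride in the plan
): WindowBounds {
  const col = index % columns;
  const row = Math.floor(index / columns);
  return {
    left: col * width,
    top: row * (height + rowGap),
    width,
    height,
  };
}

// Bounds for the four host windows of a full run.
for (let i = 0; i < 4; i++) {
  console.log(i, tileBounds(i));
}
```
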
Update `release` to close the headed browser too:

```typescript
  async release(): Promise<void> {
    for (const session of this.activeSessions) {
      try { await session.context.close(); } catch { /* ignore */ }
    }
    this.activeSessions = [];
    if (this.ownedBrowser) {
      try { await this.ownedBrowser.close(); } catch { /* ignore */ }
      this.ownedBrowser = null;
      this.browser = null;
    }
    if (this.headedBrowser) {
      try { await this.headedBrowser.close(); } catch { /* ignore */ }
      this.headedBrowser = null;
    }
  }
```

- [ ] **Step 2: Wire `watch === 'tiled'` in the runner**

In `runner.ts`, replace the existing `tiled_not_yet_implemented` warning with:

```typescript
  const headedHostCount = watch === 'tiled' ? rooms : 0;

  const pool = new SessionPool({
    targetUrl,
    inviteCode,
    credFile,
    logger,
    headedHostCount,
  });
```

(Move that `pool` creation up so it's aware of `watch`.)

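One thing worth checking when testing this step: with `headedHostCount = rooms`, the pool makes sessions `0..rooms-1` headed. If rooms are formed by splitting the session list into contiguous, equal chunks and the host is the first session of each chunk (an assumption; the room-assignment code is not shown in this task), then the hosts sit at indices `0, perRoom, 2*perRoom, ...` instead. The sketch below computes those indices under that assumption. `hostIndices` is a hypothetical helper, not an existing function.

```typescript
// Hypothetical helper: index of the first session in each room, assuming
// sessions are split into equal, contiguous chunks (one chunk per room).
function hostIndices(totalSessions: number, rooms: number): number[] {
  if (rooms <= 0) return [];
  const perRoom = Math.floor(totalSessions / rooms);
  if (perRoom === 0) return []; // fewer sessions than rooms: no valid layout
  return Array.from({ length: rooms }, (_, room) => room * perRoom);
}

// Full run: 16 sessions across 4 rooms -> hosts at indices 0, 4, 8, 12.
console.log(hostIndices(16, 4));
// Step 3 smoke run: 4 sessions across 2 rooms -> hosts at indices 0, 2.
console.log(hostIndices(4, 2));
```

If that assumption holds, the `useHeaded` test in `acquire()` would compare against this index set instead of `i < headedHostCount`.
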
- [ ] **Step 3: Test tiled mode**

```bash
TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run soak -- \
  --scenario=populate \
  --accounts=4 --rooms=2 --cpus-per-room=0 --games-per-room=1 --holes=1 \
  --watch=tiled
```

Expected: 2 native Chromium windows appear (one per host), sized ~960×540 and positioned at the upper-left of the screen. They play the game visibly. On exit, windows close.

- [ ] **Step 4: Commit**

```bash
git add tests/soak/core/session-pool.ts tests/soak/runner.ts
git commit -m "$(cat <<'EOF'
feat(soak): --watch=tiled launches N headed host windows

SessionPool accepts headedHostCount; when > 0 it launches a second
Chromium in headed mode, creates those contexts there, and positions
each host window in a 2×2 grid via window.moveTo/resizeTo.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```

---

## Phase 8 — Stress scenario

### Task 25: Chaos injector + stress scenario

Short 1-hole games in tight loops. While each game runs, a background loop wakes every 500 ms and gives each session a 5% chance of a chaos event (rapid clicks, a brief offline toggle, or a tab blur/focus).

**Files:**
- Create: `tests/soak/scenarios/stress.ts`
- Create: `tests/soak/scenarios/shared/chaos.ts`
- Modify: `tests/soak/scenarios/index.ts` — register `stress`

- [ ] **Step 1: Create `scenarios/shared/chaos.ts`**

```typescript
// tests/soak/scenarios/shared/chaos.ts
import type { Session, Logger } from '../../core/types';

export type ChaosEvent =
  | 'rapid_clicks'
  | 'tab_blur'
  | 'brief_offline';

const ALL_EVENTS: ChaosEvent[] = ['rapid_clicks', 'tab_blur', 'brief_offline'];

function pickEvent(): ChaosEvent {
  return ALL_EVENTS[Math.floor(Math.random() * ALL_EVENTS.length)];
}

export async function maybeInjectChaos(
  session: Session,
  probability: number,
  logger: Logger,
  roomId: string,
): Promise<ChaosEvent | null> {
  if (Math.random() >= probability) return null;

  const event = pickEvent();
  logger.info('chaos_injected', { room: roomId, session: session.key, event });
  try {
    switch (event) {
      case 'rapid_clicks': {
        // Fire 5 rapid clicks at the player's own cards
        for (let i = 0; i < 5; i++) {
          await session.page.locator(`#player-cards .card:nth-child(${(i % 6) + 1})`)
            .click({ timeout: 300 })
            .catch(() => {});
        }
        break;
      }
      case 'tab_blur': {
        // Briefly dispatch blur then focus
        await session.page.evaluate(() => {
          window.dispatchEvent(new Event('blur'));
          setTimeout(() => window.dispatchEvent(new Event('focus')), 200);
        });
        break;
      }
      case 'brief_offline': {
        await session.context.setOffline(true);
        await new Promise((r) => setTimeout(r, 300));
        await session.context.setOffline(false);
        break;
      }
    }
  } catch (err) {
    logger.warn('chaos_error', {
      event,
      error: err instanceof Error ? err.message : String(err),
    });
  }
  return event;
}
```

- [ ] **Step 2: Create `scenarios/stress.ts`**

```typescript
// tests/soak/scenarios/stress.ts
import type {
  Scenario,
  ScenarioContext,
  ScenarioResult,
  ScenarioError,
  Session,
} from '../core/types';
import { runOneMultiplayerGame } from './shared/multiplayer-game';
import { maybeInjectChaos } from './shared/chaos';

interface StressConfig {
  gamesPerRoom: number;
  holes: number;
  decks: number;
  rooms: number;
  cpusPerRoom: number;
  thinkTimeMs: [number, number];
  interGamePauseMs: number;
  chaosChance: number;
}

function chunk<T>(arr: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < arr.length; i += size) out.push(arr.slice(i, i + size));
  return out;
}

async function sleep(ms: number): Promise<void> {
  return new Promise((r) => setTimeout(r, ms));
}

async function runStressRoom(
  ctx: ScenarioContext,
  cfg: StressConfig,
  roomIdx: number,
  sessions: Session[],
): Promise<{ completed: number; errors: ScenarioError[]; chaosFired: number }> {
  const roomId = `room-${roomIdx}`;
  let completed = 0;
  let chaosFired = 0;
  const errors: ScenarioError[] = [];

  for (let gameNum = 0; gameNum < cfg.gamesPerRoom; gameNum++) {
    if (ctx.signal.aborted) break;

    ctx.dashboard.update(roomId, { game: gameNum + 1, totalGames: cfg.gamesPerRoom });

    // Start a background chaos loop for this game
    let chaosActive = true;
    const chaosLoop = (async () => {
      while (chaosActive && !ctx.signal.aborted) {
        await sleep(500);
        for (const session of sessions) {
          const e = await maybeInjectChaos(session, cfg.chaosChance, ctx.logger, roomId);
          if (e) chaosFired++;
        }
      }
    })();

    // Stop the chaos loop even if the game throws, so it cannot outlive the room
    const result = await runOneMultiplayerGame(ctx, sessions, {
      roomId,
      holes: cfg.holes,
      decks: cfg.decks,
      cpusPerRoom: cfg.cpusPerRoom,
      thinkTimeMs: cfg.thinkTimeMs,
    }).finally(async () => {
      chaosActive = false;
      await chaosLoop;
    });

    if (result.completed) {
      completed++;
      ctx.logger.info('game_complete', { room: roomId, game: gameNum + 1, turns: result.turns });
    } else {
      errors.push({
        room: roomId,
        reason: 'game_failed',
        detail: result.error,
        timestamp: Date.now(),
      });
      ctx.logger.error('game_failed', { room: roomId, error: result.error });
    }

    await sleep(cfg.interGamePauseMs);
  }

  return { completed, errors, chaosFired };
}

const stress: Scenario = {
  name: 'stress',
  description: 'Rapid short games for stability & race condition hunting',
  needs: { accounts: 16, rooms: 4, cpusPerRoom: 2 },
  defaultConfig: {
    gamesPerRoom: 50,
    holes: 1,
    decks: 1,
    rooms: 4,
    cpusPerRoom: 2,
    thinkTimeMs: [50, 150],
    interGamePauseMs: 200,
    chaosChance: 0.05,
  },

  async run(ctx: ScenarioContext): Promise<ScenarioResult> {
    const start = Date.now();
    const cfg = ctx.config as unknown as StressConfig;
    // Clamp to 1: a chunk size of 0 would make chunk() loop forever
    const perRoom = Math.max(1, Math.floor(ctx.sessions.length / cfg.rooms));
    const roomSessions = chunk(ctx.sessions, perRoom);

    const results = await Promise.allSettled(
      roomSessions.map((s, idx) => runStressRoom(ctx, cfg, idx, s)),
    );

    let gamesCompleted = 0;
    let chaosFired = 0;
    const errors: ScenarioError[] = [];
    results.forEach((r, idx) => {
      if (r.status === 'fulfilled') {
        gamesCompleted += r.value.completed;
        chaosFired += r.value.chaosFired;
        errors.push(...r.value.errors);
      } else {
        errors.push({
          room: `room-${idx}`,
          reason: 'room_threw',
          detail: r.reason instanceof Error ? r.reason.message : String(r.reason),
          timestamp: Date.now(),
        });
      }
    });

    return {
      gamesCompleted,
      errors,
      durationMs: Date.now() - start,
      customMetrics: { chaos_fired: chaosFired },
    };
  },
};

export default stress;
```

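To sanity-check the `chaos_fired` metric after a run, it helps to have a rough expectation. Because the chaos loop rolls once per session per 500 ms tick, the expected count is roughly `ticks × sessions × probability`. This is a minimal sketch with hypothetical names (`ChaosBudget`, `expectedChaosEvents`). It gives an upper-bound estimate, since each tick also waits for any injected event to finish, so real ticks are somewhat longer than 500 ms.

```typescript
// Hypothetical estimate of how many chaos events one game should produce.
interface ChaosBudget {
  gameDurationMs: number; // how long one game runs (assumed, measure it from logs)
  tickMs: number;         // chaos loop sleep, 500 in the stress scenario
  sessions: number;       // sessions in the room
  probability: number;    // cfg.chaosChance
}

function expectedChaosEvents(b: ChaosBudget): number {
  const ticks = Math.floor(b.gameDurationMs / b.tickMs);
  return ticks * b.sessions * b.probability;
}

// A 10 s game with 4 sessions at 5% gives about 4 events.
const estimate = expectedChaosEvents({
  gameDurationMs: 10_000,
  tickMs: 500,
  sessions: 4,
  probability: 0.05,
});
console.log(estimate.toFixed(1));
```

For very short games the estimate can drop below one event, so a smoke run with only a few games may legitimately log no `chaos_injected` lines.
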
- [ ] **Step 3: Register stress in the registry**

Edit `tests/soak/scenarios/index.ts`:

```typescript
import type { Scenario } from '../core/types';
import populate from './populate';
import stress from './stress';

const registry: Record<string, Scenario> = {
  populate,
  stress,
};

export function getScenario(name: string): Scenario | undefined {
  return registry[name];
}

export function listScenarios(): Scenario[] {
  return Object.values(registry);
}
```

- [ ] **Step 4: Smoke test stress scenario**

```bash
TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run soak -- \
  --scenario=stress \
  --accounts=4 --rooms=1 --cpus-per-room=1 --games-per-room=3 --holes=1 \
  --watch=none
```

Expected: 3 quick games complete, chaos events in logs (look for `chaos_injected`), exit 0.

- [ ] **Step 5: Commit**

```bash
git add tests/soak/scenarios/stress.ts tests/soak/scenarios/shared/chaos.ts tests/soak/scenarios/index.ts
git commit -m "$(cat <<'EOF'
feat(soak): stress scenario with chaos injection

Rapid 1-hole games with a parallel chaos loop that gives each session
a 5% chance per 500 ms tick of firing a rapid_clicks, tab_blur, or
brief_offline event. Chaos counts roll up into
ScenarioResult.customMetrics.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```

---

## Phase 9 — Failure handling

### Task 26: Watchdog + heartbeat wiring

Per-room timeout that fires if no heartbeat arrives within N ms. Runner wires it into `ctx.heartbeat`. Vitest-tested.

**Files:**
- Create: `tests/soak/core/watchdog.ts`
- Create: `tests/soak/tests/watchdog.test.ts`
- Modify: `tests/soak/runner.ts` — wire `heartbeat` to per-room watchdogs

- [ ] **Step 1: Write failing tests**

```typescript
// tests/soak/tests/watchdog.test.ts
import { describe, it, expect, vi, beforeEach, afterEach } from 'vitest';
import { Watchdog } from '../core/watchdog';

describe('Watchdog', () => {
  beforeEach(() => vi.useFakeTimers());
  afterEach(() => vi.useRealTimers());

  it('fires after timeout if no heartbeat', () => {
    const onTimeout = vi.fn();
    const w = new Watchdog(1000, onTimeout);
    w.start();
    vi.advanceTimersByTime(1001);
    expect(onTimeout).toHaveBeenCalledOnce();
  });

  it('heartbeat resets the timer', () => {
    const onTimeout = vi.fn();
    const w = new Watchdog(1000, onTimeout);
    w.start();
    vi.advanceTimersByTime(800);
    w.heartbeat();
    vi.advanceTimersByTime(800);
    expect(onTimeout).not.toHaveBeenCalled();
    vi.advanceTimersByTime(300);
    expect(onTimeout).toHaveBeenCalledOnce();
  });

  it('stop cancels pending timeout', () => {
    const onTimeout = vi.fn();
    const w = new Watchdog(1000, onTimeout);
    w.start();
    w.stop();
    vi.advanceTimersByTime(2000);
    expect(onTimeout).not.toHaveBeenCalled();
  });

  // A heartbeat that arrives after the watchdog has fired must not re-arm it
  it('does not fire again after it has fired', () => {
    const onTimeout = vi.fn();
    const w = new Watchdog(1000, onTimeout);
    w.start();
    vi.advanceTimersByTime(1001);
    w.heartbeat();
    vi.advanceTimersByTime(1001);
    expect(onTimeout).toHaveBeenCalledOnce();
  });
});
```

- [ ] **Step 2: Run to verify failure**

```bash
npx vitest run tests/watchdog.test.ts
```

Expected: FAIL, because `../core/watchdog` does not exist yet.

- [ ] **Step 3: Implement `core/watchdog.ts`**

```typescript
// tests/soak/core/watchdog.ts
export class Watchdog {
  private timer: NodeJS.Timeout | null = null;
  private fired = false;

  constructor(
    private timeoutMs: number,
    private onTimeout: () => void,
  ) {}

  start(): void {
    this.stop();
    this.fired = false;
    this.timer = setTimeout(() => {
      if (this.fired) return;
      this.fired = true;
      this.onTimeout();
    }, this.timeoutMs);
  }

  heartbeat(): void {
    if (this.fired) return;
    this.start();
  }

  stop(): void {
    if (this.timer) {
      clearTimeout(this.timer);
      this.timer = null;
    }
  }
}
```

- [ ] **Step 4: Verify tests pass**

```bash
npx vitest run tests/watchdog.test.ts
```

Expected: all passing.

- [ ] **Step 5: Wire watchdogs into the runner**

In `runner.ts`, add before building `ctx`:

```typescript
  const watchdogs = new Map<string, Watchdog>();
  const roomAborters = new Map<string, AbortController>();
  for (let i = 0; i < rooms; i++) {
    const roomId = `room-${i}`;
    const aborter = new AbortController();
    roomAborters.set(roomId, aborter);
    const w = new Watchdog(60_000, () => {
      logger.error('watchdog_fired', { room: roomId });
      aborter.abort();
      dashboard.update(roomId, { phase: 'error' });
    });
    w.start();
    watchdogs.set(roomId, w);
  }
```

Import at the top:

```typescript
import { Watchdog } from './core/watchdog';
```

Set `ctx.heartbeat` to:

```typescript
    heartbeat: (roomId: string) => {
      const w = watchdogs.get(roomId);
      if (w) w.heartbeat();
    },
```

In the `finally` block, stop all watchdogs:

```typescript
    for (const w of watchdogs.values()) w.stop();
```

Note: for now `roomAborters` is not plumbed into scenario cancellation, so scenarios see the global `ctx.signal` only. This is intentional: per-room abort requires scenario-side awareness and is deferred until a scenario genuinely misbehaves. Until then, a firing watchdog logs `watchdog_fired` and marks that room's dashboard tile as `error`, but it does not stop the room or the run.

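When per-room abort is eventually plumbed through, one way to do it is to hand each room a signal that aborts when either the global signal or that room's own aborter fires. The following is a minimal sketch of such a combinator. `anySignal` is a hypothetical helper, not something the runner currently contains. Newer Node versions also ship a built-in `AbortSignal.any`; the manual version below avoids depending on it.

```typescript
// Hypothetical helper: a signal that aborts as soon as any input signal aborts.
function anySignal(signals: AbortSignal[]): AbortSignal {
  const controller = new AbortController();
  for (const s of signals) {
    if (s.aborted) {
      // Already aborted: propagate immediately and skip the listeners.
      controller.abort();
      break;
    }
    s.addEventListener('abort', () => controller.abort(), { once: true });
  }
  return controller.signal;
}

// Example: the room signal follows both the global and the per-room aborter.
const globalAbort = new AbortController();
const roomAbort = new AbortController();
const roomSignal = anySignal([globalAbort.signal, roomAbort.signal]);

console.log(roomSignal.aborted); // still running
roomAbort.abort();               // e.g. the room's watchdog fired
console.log(roomSignal.aborted); // this room is now aborted
console.log(globalAbort.signal.aborted); // other rooms are unaffected
```

A scenario would then check `roomSignal.aborted` in its game loop where it currently checks `ctx.signal.aborted`.
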
- [ ] **Step 6: Commit**

```bash
git add tests/soak/core/watchdog.ts tests/soak/tests/watchdog.test.ts tests/soak/runner.ts
git commit -m "$(cat <<'EOF'
feat(soak): per-room watchdog with heartbeat

Watchdog class with Vitest tests, wired into ctx.heartbeat in the
runner. One watchdog per room, 60s timeout; firing logs an error
and marks the room's dashboard tile as errored.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```

---

### Task 27: Artifact capture on failure

When the runner catches an error, snapshot every session's page: screenshot, HTML, captured console errors, and game state JSON.

**Files:**
- Create: `tests/soak/core/artifacts.ts`
- Modify: `tests/soak/runner.ts` — call `captureAll` in the catch block

- [ ] **Step 1: Implement `core/artifacts.ts`**

```typescript
// tests/soak/core/artifacts.ts
import * as fs from 'fs';
import * as path from 'path';
import type { Session, Logger } from './types';

export interface ArtifactsOptions {
  runId: string;
  /** Absolute path to the artifacts root, e.g., /path/to/tests/soak/artifacts */
  rootDir: string;
  logger: Logger;
}

export class Artifacts {
  readonly runDir: string;

  constructor(private opts: ArtifactsOptions) {
    this.runDir = path.join(opts.rootDir, opts.runId);
    fs.mkdirSync(this.runDir, { recursive: true });
  }

  /** Capture everything for a single session. */
  async captureSession(session: Session, roomId: string): Promise<void> {
    const dir = path.join(this.runDir, roomId);
    fs.mkdirSync(dir, { recursive: true });
    const prefix = session.key;

    try {
      const png = await session.page.screenshot({ fullPage: true });
      fs.writeFileSync(path.join(dir, `${prefix}.png`), png);
    } catch (err) {
      this.opts.logger.warn('artifact_screenshot_failed', {
        session: session.key,
        error: err instanceof Error ? err.message : String(err),
      });
    }

    try {
      const html = await session.page.content();
      fs.writeFileSync(path.join(dir, `${prefix}.html`), html);
    } catch (err) {
      this.opts.logger.warn('artifact_html_failed', {
        session: session.key,
        error: err instanceof Error ? err.message : String(err),
      });
    }

    try {
      const state = await session.bot.getGameState();
      fs.writeFileSync(
        path.join(dir, `${prefix}.state.json`),
        JSON.stringify(state, null, 2),
      );
    } catch (err) {
      this.opts.logger.warn('artifact_state_failed', {
        session: session.key,
        error: err instanceof Error ? err.message : String(err),
      });
    }

    try {
      const errors = session.bot.getConsoleErrors?.() ?? [];
      fs.writeFileSync(path.join(dir, `${prefix}.console.txt`), errors.join('\n'));
    } catch {
      // ignore: not all bots expose this
    }
  }

  async captureAll(sessions: Session[]): Promise<void> {
    // Best-effort: this method does not know which room a session belongs to,
    // so everything is written under room-unknown/. Callers that know the
    // room should call captureSession() directly with the real room id.
    await Promise.all(
      sessions.map((s) => this.captureSession(s, 'room-unknown')),
    );
  }

  writeSummary(summary: object): void {
    fs.writeFileSync(
      path.join(this.runDir, 'summary.json'),
      JSON.stringify(summary, null, 2),
    );
  }
}

/** Prune run directories older than `maxAgeMs`. */
export function pruneOldRuns(rootDir: string, maxAgeMs: number, logger: Logger): void {
  if (!fs.existsSync(rootDir)) return;
  const now = Date.now();
  for (const entry of fs.readdirSync(rootDir)) {
    const full = path.join(rootDir, entry);
    try {
      const stat = fs.statSync(full);
      if (stat.isDirectory() && now - stat.mtimeMs > maxAgeMs) {
        fs.rmSync(full, { recursive: true, force: true });
        logger.info('artifact_pruned', { runId: entry });
      }
    } catch {
      // ignore
    }
  }
}
```

- [ ] **Step 2: Call artifact capture from the runner's error path**

In `runner.ts`, import:

```typescript
import { Artifacts, pruneOldRuns } from './core/artifacts';
```

After `const runId = ...`, instantiate and prune (this uses `path`, so add `import * as path from 'path';` if the runner does not already import it):

```typescript
  const artifactsRoot = path.resolve(__dirname, 'artifacts');
  const artifacts = new Artifacts({ runId, rootDir: artifactsRoot, logger });
  pruneOldRuns(artifactsRoot, 7 * 24 * 3600 * 1000, logger);
```

In the `catch (err)` block, after logging, capture:

```typescript
  } catch (err) {
    logger.error('run_failed', {
      error: err instanceof Error ? err.message : String(err),
      stack: err instanceof Error ? err.stack : undefined,
    });
    try {
      const liveSessions = pool['activeSessions'] as Session[] | undefined;
      if (liveSessions && liveSessions.length > 0) {
        await artifacts.captureAll(liveSessions);
      }
    } catch (captureErr) {
      logger.warn('artifact_capture_failed', {
        error: captureErr instanceof Error ? captureErr.message : String(captureErr),
      });
    }
    exitCode = 1;
  }
```

(Note: the `pool['activeSessions']` access bypasses visibility to avoid adding a public getter for one call site. Acceptable for an error path in a test harness.)

After a successful run, write the summary:

```typescript
    artifacts.writeSummary({
      runId,
      scenario: scenario.name,
      targetUrl,
      gamesCompleted: result.gamesCompleted,
      errors: result.errors,
      durationMs: result.durationMs,
      customMetrics: result.customMetrics,
    });
```

Import `Session` type:

```typescript
import type { Session } from './core/types';
```

- [ ] **Step 3: Verify by forcing a failure**

Kill the server mid-run and confirm artifacts are written:

```bash
# In one terminal
TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run soak -- \
  --scenario=populate --accounts=2 --rooms=1 --cpus-per-room=0 \
  --games-per-room=5 --holes=3 --watch=none

# In another: wait ~3 seconds then Ctrl-C the dev server
# The soak run should catch errors and write artifacts

ls tests/soak/artifacts/
ls tests/soak/artifacts/<run-id>/
```

Expected: a run directory exists. If the scenario returned normally it contains `summary.json`; if the run threw, it contains per-session screenshots and HTML under `room-unknown/`.

- [ ] **Step 4: Commit**
|
||
|
||
```bash
|
||
git add tests/soak/core/artifacts.ts tests/soak/runner.ts
|
||
git commit -m "$(cat <<'EOF'
|
||
feat(soak): artifact capture on failure + run summary
|
||
|
||
Screenshots, HTML, game state, and console errors are captured into
|
||
tests/soak/artifacts/<run-id>/ when a scenario throws. Runs older
|
||
than 7 days are pruned on startup. Successful runs get a
|
||
summary.json next to the artifacts dir.
|
||
|
||
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
||
EOF
|
||
)"
|
||
```
|
||
|
||
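The 7-day pruning mentioned in the commit message can be pictured as a small pure function. This is only a sketch, not the code in `core/artifacts.ts`: the `RunDir` shape, the `selectExpiredRuns` name, and the use of directory modification time as the run's age are all assumptions made for illustration.

```typescript
// Hypothetical shape: one entry per run directory under tests/soak/artifacts/.
interface RunDir {
  name: string;    // run id, used as the directory name
  mtimeMs: number; // last-modified time in epoch milliseconds (assumed age source)
}

const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;

// Returns the names of run directories strictly older than maxAgeMs.
function selectExpiredRuns(
  runs: RunDir[],
  nowMs: number,
  maxAgeMs: number = SEVEN_DAYS_MS,
): string[] {
  return runs
    .filter((run) => nowMs - run.mtimeMs > maxAgeMs)
    .map((run) => run.name);
}
```

Keeping the selection pure (no filesystem access) would make the age rule unit-testable with Vitest; the caller would list the directory, pass the entries in, and delete whatever names come back.
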
---

### Task 28: Graceful shutdown (already partially in place) + exit codes

SIGINT/SIGTERM already flip the abort controller. Formalize the timeout-and-force-exit path and the three exit codes: `0` (success), `1` (errors), `2` (interrupted). A forced exit (second signal or cleanup timeout) uses `130`.

**Files:**
- Modify: `tests/soak/runner.ts`

- [ ] **Step 1: Add a graceful shutdown timeout**

In `runner.ts`, replace the existing signal handlers with:

```typescript
let forceExitTimer: NodeJS.Timeout | null = null;
const onSignal = (sig: string) => {
  if (abortController.signal.aborted) {
    // Second signal: force exit
    logger.warn('force_exit', { signal: sig });
    process.exit(130);
  }
  logger.warn('signal_received', { signal: sig });
  abortController.abort();
  // Hard-kill after 10s if cleanup hangs
  forceExitTimer = setTimeout(() => {
    logger.error('graceful_shutdown_timeout');
    process.exit(130);
  }, 10_000);
};
process.on('SIGINT', () => onSignal('SIGINT'));
process.on('SIGTERM', () => onSignal('SIGTERM'));
```

In the `finally` block, clear the force-exit timer:

```typescript
if (forceExitTimer) clearTimeout(forceExitTimer);
```

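The mapping from run outcome to the three exit codes can be sketched as a pure function. The `RunOutcome` shape and the `resolveExitCode` name are hypothetical, and the rule that errors take priority over an interrupt is an assumption; the plan only states the meaning of each code.

```typescript
// Hypothetical summary of how a run ended; not a type from the plan.
interface RunOutcome {
  signalReceived: boolean; // SIGINT/SIGTERM was handled gracefully
  errorCount: number;      // errors reported by the scenario or runner
}

// 0 = success, 1 = errors, 2 = interrupted.
// Assumption: errors win over an interrupt, so a failing run that is
// also Ctrl-C'd still reports 1.
function resolveExitCode(outcome: RunOutcome): 0 | 1 | 2 {
  if (outcome.errorCount > 0) return 1;
  if (outcome.signalReceived) return 2;
  return 0;
}
```

The forced-exit path (`130`) is deliberately left out of this helper because it calls `process.exit` directly from the signal handler and never reaches the normal exit-code decision.
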
- [ ] **Step 2: Manual test — Ctrl-C a long run**

```bash
TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run soak -- \
  --scenario=populate --accounts=2 --rooms=1 --cpus-per-room=0 \
  --games-per-room=10 --holes=3 --watch=none

# After ~5 seconds: Ctrl-C
```

Expected: the runner logs `signal_received`, finishes the current turn, prints the summary, and exits with code 2 (check `echo $?`).

- [ ] **Step 3: Commit**

```bash
git add tests/soak/runner.ts
git commit -m "$(cat <<'EOF'
feat(soak): graceful shutdown with 10s hard-kill fallback

SIGINT/SIGTERM flips the abort signal; scenarios finish the current
turn then exit. If cleanup hangs >10s the runner force-exits. Second
Ctrl-C is an immediate hard kill. Exit codes: 0 success, 1 errors,
2 interrupted.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```

---

### Task 29: Periodic health probes

Every 30s, fetch `/api/health` on the target server. Three consecutive failures are treated as a fatal error and abort the run.

**Files:**
- Modify: `tests/soak/runner.ts`

- [ ] **Step 1: Add a health probe interval**

In `runner.ts`, after building the abort controller and before running the scenario:

```typescript
let healthFailures = 0;
const healthTimer = setInterval(async () => {
  try {
    // Assumption: a 5s timeout so a hung connection counts as a failed probe
    // instead of never resolving. Adjust or drop if the spec says otherwise.
    const res = await fetch(`${targetUrl}/api/health`, {
      signal: AbortSignal.timeout(5_000),
    });
    if (!res.ok) throw new Error(`status ${res.status}`);
    healthFailures = 0; // any success resets the streak
  } catch (err) {
    healthFailures++;
    logger.warn('health_probe_failed', {
      consecutive: healthFailures,
      error: err instanceof Error ? err.message : String(err),
    });
    if (healthFailures >= 3) {
      logger.error('health_fatal', { consecutive: healthFailures });
      abortController.abort();
    }
  }
}, 30_000);
```

In the `finally` block:

```typescript
clearInterval(healthTimer);
```

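The "three consecutive failures" rule can also be isolated from the timer so it is testable without a server. The sketch below is an illustration under assumptions: `createHealthTracker` and its `record` method are hypothetical names, and firing the fatal callback only once is a design choice the plan does not specify.

```typescript
// Hypothetical helper: tracks consecutive probe failures and fires
// onFatal once when the streak reaches the threshold.
function createHealthTracker(threshold: number, onFatal: () => void) {
  let consecutive = 0;
  let fatalFired = false;
  return {
    // Record one probe result; returns the current failure streak.
    record(ok: boolean): number {
      consecutive = ok ? 0 : consecutive + 1;
      if (consecutive >= threshold && !fatalFired) {
        fatalFired = true;
        onFatal();
      }
      return consecutive;
    },
  };
}
```

In the interval callback this would be used as `tracker.record(res.ok)` on success paths and `tracker.record(false)` in the `catch`, with `onFatal` calling `abortController.abort()`.
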
- [ ] **Step 2: Commit**

```bash
git add tests/soak/runner.ts
git commit -m "$(cat <<'EOF'
feat(soak): periodic health probes against target server

Every 30s GET /api/health. Three consecutive failures abort the
run with a fatal error, so staging outages don't get misattributed
to harness bugs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```

---

## Phase 10 — Polish and bring-up

### Task 30: Smoke test script

`tests/soak/scripts/smoke.sh` is the canary run; it takes ~30s against local dev.

**Files:**
- Create: `tests/soak/scripts/smoke.sh`

- [ ] **Step 1: Create the script**

```bash
#!/usr/bin/env bash
# Soak harness smoke test — end-to-end canary against local dev.
# Expected runtime: ~30 seconds.
set -euo pipefail

# Run from tests/soak/ regardless of the caller's working directory.
cd "$(dirname "$0")/.."

: "${TEST_URL:=http://localhost:8000}"
: "${SOAK_INVITE_CODE:=SOAKTEST}"

echo "Smoke target: $TEST_URL"
echo "Invite code: $SOAK_INVITE_CODE"

# 1. Health probe
curl -fsS "$TEST_URL/api/health" > /dev/null || {
  echo "FAIL: target server unreachable at $TEST_URL" >&2
  exit 1
}

# 2. Ensure minimum accounts
if [ ! -f .env.stresstest ]; then
  echo "Seeding accounts..."
  npm run seed -- --count=4
fi

# 3. Run minimum viable scenario
TEST_URL="$TEST_URL" SOAK_INVITE_CODE="$SOAK_INVITE_CODE" \
  npm run soak -- \
  --scenario=populate \
  --accounts=2 \
  --rooms=1 \
  --cpus-per-room=0 \
  --games-per-room=1 \
  --holes=1 \
  --watch=none

echo "Smoke PASSED"
```

- [ ] **Step 2: Make it executable and run it**

```bash
chmod +x tests/soak/scripts/smoke.sh
cd tests/soak && bash scripts/smoke.sh
```

Expected: `Smoke PASSED` within ~30s.

- [ ] **Step 3: Commit**

```bash
git add tests/soak/scripts/smoke.sh
git commit -m "$(cat <<'EOF'
feat(soak): smoke test script — 30s end-to-end canary

Confirms the harness works against local dev with the absolute
minimum config. Run after any change.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```

---

### Task 31: README + CHECKLIST

Replace the README stub with a full quickstart and flag reference. Add the manual validation checklist.

**Files:**
- Modify: `tests/soak/README.md`
- Create: `tests/soak/CHECKLIST.md`

- [ ] **Step 1: Rewrite `tests/soak/README.md`**

````markdown
# Golf Soak & UX Test Harness

Standalone Playwright-based runner that drives multi-user authenticated
game sessions for scoreboard population and stability testing.

**Spec:** `../../docs/superpowers/specs/2026-04-10-multiplayer-soak-test-design.md`
**Bring-up:** `../../docs/soak-harness-bringup.md`

## Quick start

```bash
cd tests/soak
npm install

# First run only: seed 16 accounts
TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST npm run seed

# 30-second end-to-end smoke test
bash scripts/smoke.sh

# Populate scoreboard (4 rooms × 4 accounts × 10 long games)
TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST \
  npm run soak:populate

# Stress test (4 rooms × 50 rapid games with chaos)
TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST \
  npm run soak:stress
```

## CLI flags

```
--scenario=populate|stress      required
--accounts=<n>                  total sessions (default: scenario.needs.accounts)
--rooms=<n>                     default from scenario.needs
--cpus-per-room=<n>             default from scenario.needs
--games-per-room=<n>            default from scenario.defaultConfig
--holes=<n>                     default from scenario.defaultConfig
--watch=none|dashboard|tiled    default: dashboard
--dashboard-port=<n>            default: 7777
--target=<url>                  default: TEST_URL env
--run-id=<string>               default: ISO timestamp
--list                          print scenarios and exit
--dry-run                       validate config, don't run
```

Constraint: `accounts` must be evenly divisible by `rooms` (for example, 16 accounts across 4 rooms gives 4 per room).

## Environment variables

```
TEST_URL               target base URL (e.g. https://staging.adlee.work)
SOAK_INVITE_CODE       invite code flagged marks_as_test (staging: 5VC2MCCN)
SOAK_HOLES             override --holes
SOAK_ROOMS             override --rooms
SOAK_ACCOUNTS          override --accounts
SOAK_CPUS_PER_ROOM     override --cpus-per-room
SOAK_GAMES_PER_ROOM    override --games-per-room
SOAK_WATCH             override --watch
SOAK_DASHBOARD_PORT    override --dashboard-port
```

## Watch modes

- **`none`** — pure headless, JSON logs to stdout. Use for CI and overnight runs.
- **`dashboard`** (default) — HTTP+WS server on localhost:7777 serving a live status grid. Click any player tile to watch their live session via CDP screencast.
- **`tiled`** — 4 native Chromium windows for the host of each room, positioned in a 2×2 grid. Joiners stay headless.

## Scenarios

| Name | Description |
|---|---|
| `populate` | Long 9-hole games with varied CPU personalities and realistic pacing, for populating scoreboards |
| `stress` | Rapid 1-hole games with chaos injection (rapid clicks, offline toggles, tab blur) for hunting race conditions |

Add new scenarios by creating `scenarios/<name>.ts` and registering it in `scenarios/index.ts`.

## Architecture

See the design spec for the full module breakdown. Key modules:

- `runner.ts` — CLI entry, wires everything together
- `core/session-pool.ts` — owns browser contexts, seeds/logs in 16 accounts
- `core/room-coordinator.ts` — host→joiners room-code handoff
- `core/watchdog.ts` — per-room timeout detection
- `core/screencaster.ts` — CDP Page.startScreencast for live video
- `dashboard/server.ts` — HTTP + WS server
- `scenarios/` — pluggable scenarios

Reuses `../../tests/e2e/bot/golf-bot.ts` unchanged.

## Running tests (unit)

```bash
npm test
```

Tests cover `Deferred`, `RoomCoordinator`, `Watchdog`, and `config`.
Integration-level modules are verified by the smoke test.
````

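The README's rule that `accounts` must be evenly divisible by `rooms` is the kind of check `--dry-run` could enforce. A minimal sketch follows, assuming a hypothetical `accountsPerRoom` helper; the real `config` module's function names and error messages are not shown in this plan.

```typescript
// Hypothetical validation helper; not the plan's actual config API.
// Returns the number of accounts assigned to each room, or throws
// when the split would be uneven.
function accountsPerRoom(accounts: number, rooms: number): number {
  if (!Number.isInteger(accounts) || !Number.isInteger(rooms)) {
    throw new Error('accounts and rooms must be integers');
  }
  if (accounts <= 0 || rooms <= 0) {
    throw new Error('accounts and rooms must be positive');
  }
  if (accounts % rooms !== 0) {
    throw new Error(
      `accounts (${accounts}) must be evenly divisible by rooms (${rooms})`,
    );
  }
  return accounts / rooms;
}
```

With the defaults described above (16 accounts, 4 rooms) this yields 4 accounts per room, while a combination such as 10 accounts and 4 rooms is rejected before any browser is launched.
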
- [ ] **Step 2: Create `tests/soak/CHECKLIST.md`**

```markdown
# Soak Harness Manual Validation Checklist

Run after any significant change or before calling the implementation complete.

## Bring-up

- [ ] Local dev server is running (`python server/main.py`)
- [ ] `SOAKTEST` invite code exists locally with `marks_as_test=TRUE`
- [ ] `npm install` in `tests/soak/` succeeded
- [ ] `npm run seed -- --count=16` creates/updates 16 accounts
- [ ] `.env.stresstest` has 16 `SOAK_ACCOUNT_NN=...` lines
- [ ] All seeded users show `is_test_account=TRUE` in the DB

## Smoke

- [ ] `bash scripts/smoke.sh` exits 0 within 60s

## Scenarios

- [ ] `--scenario=populate --rooms=1 --games-per-room=1` completes cleanly
- [ ] `--scenario=populate --rooms=4 --games-per-room=1` runs 4 rooms in parallel with no cross-contamination
- [ ] `--scenario=stress --games-per-room=3` logs `chaos_injected` events

## Watch modes

- [ ] `--watch=none` produces JSONL on stdout, nothing else
- [ ] `--watch=dashboard` opens http://localhost:7777, grid renders, tiles update live, WS status shows `healthy`
- [ ] Clicking any player tile opens the video modal and streams live JPEG frames (~10 fps)
- [ ] Closing the modal stops the screencast (check logs for `screencast_stopped`)
- [ ] `--watch=tiled` opens 4 native Chromium windows for the 4 hosts

## Failure modes

- [ ] Ctrl-C during a run → graceful shutdown, summary printed, exit code 2
- [ ] Double Ctrl-C → hard exit (130)
- [ ] Killing the dev server mid-run → health probes fail 3× → fatal abort, artifacts captured, exit 1
- [ ] Artifacts directory contains a subdirectory per failed run with screenshots and state.json
- [ ] Artifacts older than 7 days are pruned on next startup

## Server-side filtering

- [ ] `GET /api/stats/leaderboard` (default) hides soak_* accounts
- [ ] `GET /api/stats/leaderboard?include_test=true` shows soak_* accounts
- [ ] Admin panel user list shows `[Test]` badge on soak_* accounts
- [ ] Unchecking the admin panel "Include test accounts" checkbox hides them
- [ ] Admin panel invite codes tab shows `[Test-seed]` next to SOAKTEST

## Staging bring-up (final step)

- [ ] `UPDATE invite_codes SET marks_as_test = TRUE WHERE code = '5VC2MCCN';` run on staging
- [ ] `SOAK_INVITE_CODE=5VC2MCCN TEST_URL=https://staging.adlee.work npm run seed -- --count=16` seeds staging accounts
- [ ] Staging run with `--scenario=populate --watch=none` completes
- [ ] Staging leaderboard with `include_test=true` shows the soak accounts
- [ ] Staging leaderboard default (no param) does NOT show the soak accounts
```

- [ ] **Step 3: Commit**

```bash
git add tests/soak/README.md tests/soak/CHECKLIST.md
git commit -m "$(cat <<'EOF'
docs(soak): full README + manual validation checklist

Quickstart, flag reference, env var reference, scenario table, and
the bring-up/validation checklist that gates calling the harness
implementation complete.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```

---

### Task 32: Staging bring-up (manual, no code)

This is a documentation-only task — the actual run happens on your workstation. Listed here so the implementation plan is complete end to end.

- [ ] **Step 1: Flag `5VC2MCCN` as test-seed on staging**

From your workstation (requires DB access to staging):

```bash
ssh root@129.212.150.189 \
  'docker exec -i golfgame-postgres psql -U postgres -d golfgame' <<'EOF'
UPDATE invite_codes SET marks_as_test = TRUE WHERE code = '5VC2MCCN';
SELECT code, max_uses, use_count, marks_as_test FROM invite_codes WHERE code = '5VC2MCCN';
EOF
```

Expected: `marks_as_test | t`.

(The exact docker container name may differ — adjust based on `docker ps` on the staging host.)

- [ ] **Step 2: Seed the 16 staging accounts**

```bash
cd tests/soak
rm -f .env.stresstest
TEST_URL=https://staging.adlee.work \
  SOAK_INVITE_CODE=5VC2MCCN \
  npm run seed -- --count=16
```

Expected: `.env.stresstest` populated with 16 entries.

- [ ] **Step 3: Run populate against staging**

```bash
TEST_URL=https://staging.adlee.work \
  SOAK_INVITE_CODE=5VC2MCCN \
  npm run soak -- \
  --scenario=populate \
  --rooms=4 \
  --games-per-room=3 \
  --holes=3 \
  --watch=dashboard
```

Expected: dashboard opens, 4 rooms play 3 games each, staging scoreboard accumulates data. Exit 0 at the end.

- [ ] **Step 4: Verify scoreboard filtering on staging**

```bash
# Should NOT contain soak_* usernames
curl -s "https://staging.adlee.work/api/stats/leaderboard?metric=wins" | jq '.entries[] | select(.username | startswith("soak_"))'

# Should contain soak_* usernames
curl -s "https://staging.adlee.work/api/stats/leaderboard?metric=wins&include_test=true" | jq '.entries[] | select(.username | startswith("soak_"))'
```

Expected: the first command returns nothing, the second returns entries.

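The same filtering check can be expressed in TypeScript if it ever needs to run from the harness rather than by hand. This is a sketch under assumptions: the response shape (`entries[].username`) is inferred only from the `jq` filter above, and `soakUsernames` is a hypothetical helper name.

```typescript
// Assumed minimal response shape, inferred from the jq filter
// `.entries[] | select(.username | startswith("soak_"))`.
interface LeaderboardEntry {
  username: string;
}
interface LeaderboardResponse {
  entries: LeaderboardEntry[];
}

// Returns the usernames that belong to soak test accounts.
function soakUsernames(body: LeaderboardResponse): string[] {
  return body.entries
    .map((entry) => entry.username)
    .filter((name) => name.startsWith('soak_'));
}
```

Applied to the default leaderboard response the result should be an empty array, and applied to the `include_test=true` response it should be non-empty.
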
- [ ] **Step 5: Mark implementation complete**

Check off all items in `tests/soak/CHECKLIST.md` that correspond to this plan. Commit the filled-in checklist if you want a record:

```bash
git add tests/soak/CHECKLIST.md
git commit -m "docs(soak): checklist passed on initial staging run"
```

---

## Phase 11 — Version bump

### Task 33: Bump to v3.3.4 and add footer to admin.html

Update all HTML footers from `v3.1.6` to `v3.3.4`, add a footer to `admin.html` (which currently has none), and bump `pyproject.toml`.

**Files:**
- Modify: `client/index.html` — both footer occurrences (L58, L291)
- Modify: `client/admin.html` — add footer
- Modify: `pyproject.toml` — version field

- [ ] **Step 1: Update `client/index.html` footers**

```bash
grep -n "v3\.1\.6" client/index.html
```

For each match, replace `v3.1.6` with `v3.3.4`. There should be exactly two matches.

- [ ] **Step 2: Add footer to `client/admin.html`**

Find the closing `</body>` in `client/admin.html` and add a footer just before it:

```html
<footer class="app-footer" style="text-align: center; padding: 16px; color: var(--muted, #666); font-size: 12px;">v3.3.4 © Aaron D. Lee</footer>
</body>
```

(The inline style is a fallback — admin.css may already have an `.app-footer` class; if so, drop the inline styles.)

```bash
grep -n "app-footer" client/admin.css 2>/dev/null
```

If the class exists, use just `<footer class="app-footer">v3.3.4 © Aaron D. Lee</footer>`.

- [ ] **Step 3: Bump `pyproject.toml`**

```bash
# GNU sed syntax; on macOS/BSD sed use `sed -i ''` instead of `sed -i`.
sed -i 's/^version = "3\.1\.6"$/version = "3.3.4"/' pyproject.toml
grep version pyproject.toml
```

Expected: `version = "3.3.4"`.

- [ ] **Step 4: Verify in the browser**

Restart the dev server, then open http://localhost:8000 and http://localhost:8000/admin.html. Confirm both show `v3.3.4` in the footer.

- [ ] **Step 5: Commit**

```bash
git add client/index.html client/admin.html pyproject.toml
git commit -m "$(cat <<'EOF'
chore: bump version to v3.3.4

Updates client/index.html footer (×2) and pyproject.toml from
v3.1.6 → v3.3.4, and adds a matching footer to client/admin.html
which previously had none.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```

---

## Summary

33 tasks across 11 phases:

| Phase | Tasks | Milestone |
|---|---|---|
| 1 — Server changes | 1–8 | Stats filter works, test accounts are separable |
| 2 — Harness scaffolding | 9–12 | Core pure-logic modules with Vitest tests pass |
| 3 — SessionPool + seeding | 13–14 | `.env.stresstest` seeded via real HTTP |
| 4 — First run | 15–18 | **`--watch=none` smoke test passes end-to-end** |
| 5 — Dashboard | 19–21 | Live status grid in browser |
| 6 — Live video | 22–23 | Click-to-watch CDP screencast |
| 7 — Tiled mode | 24 | Native host windows |
| 8 — Stress scenario | 25 | Chaos injection runs clean |
| 9 — Failure handling | 26–29 | Watchdog + artifacts + graceful shutdown + health probes |
| 10 — Polish | 30–31 | Smoke script + README + CHECKLIST |
| 11 — Version bump | 33 | v3.3.4 everywhere |

(Task 32 is the manual staging bring-up — no code.)

Dependencies between tasks:

- Tasks 1–8 are independent of the harness (ship them first if you want immediate value for admins)
- Tasks 9–18 are strictly sequential (each builds on the previous)
- Tasks 19–21, 22–23, 24, 25 are independent of each other — can be done in any order after Task 18
- Tasks 26–29 can be done after Task 18 but are most valuable after Task 25
- Tasks 30–31 come last before staging
- Task 33 is independent and can be done any time after Task 8