8 Commits

Author SHA1 Message Date
adlee-was-taken
76a9de27c2 chore(staging): wire LEADERBOARD_INCLUDE_TEST_DEFAULT into compose
All checks were successful
Build & Deploy Staging / build-and-deploy (release) Successful in 31s
The v3.3.5 router reads config.LEADERBOARD_INCLUDE_TEST_DEFAULT but the
staging compose file was never passing the env through to the container.
The change had previously been applied by hand on the staging host but was
never committed to the repo; committing it here so CI deploys pick it up.

Value on staging is sourced from .env (already set to true). Production
leaves it unset, so the default of false applies.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 00:55:00 -04:00
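
For reviewers, a minimal sketch of how the override is expected to resolve end to end. `parse_env_bool` is a hypothetical stand-in for the repo's `get_env_bool` (its real implementation and accepted spellings are not shown in this diff), and `effective_include_test` mirrors the fallback expression added to the leaderboard route.

```python
from typing import Mapping, Optional

KEY = "LEADERBOARD_INCLUDE_TEST_DEFAULT"


def parse_env_bool(env: Mapping[str, str], key: str, default: bool) -> bool:
    """Hypothetical stand-in for get_env_bool; the accepted spellings are an assumption."""
    raw = env.get(key)
    if raw is None or raw.strip() == "":
        return default
    return raw.strip().lower() in ("1", "true", "yes", "on")


def effective_include_test(query_value: Optional[bool], server_default: bool) -> bool:
    """An explicit ?include_test wins; otherwise fall back to the server default."""
    return query_value if query_value is not None else server_default


# Staging: .env sets the variable to true and compose passes it through.
staging_default = parse_env_bool({KEY: "true"}, KEY, False)
# Production: the variable is unset, so compose's ${VAR:-false} yields "false".
prod_default = parse_env_bool({KEY: "false"}, KEY, False)
```

A client can still opt out on staging by passing `?include_test=false`, because `None` (parameter omitted) and `False` (explicitly off) are treated differently.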
adlee-was-taken
f37f279098 chore: bump version to 3.3.5
Some checks failed
Build & Deploy Staging / build-and-deploy (release) Failing after 5s
Covers: game-lifecycle DB fixes (stranded active games, populated
metadata, winner_id, stats idempotency) and staging leaderboard
include-test override.

Also aligns FastAPI app version in server/main.py (was stuck at 3.2.0)
with the actual release.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 00:50:30 -04:00
adlee-was-taken
c02b0054c2 fix(server): winner_id on completed games + stats idempotency latch
Two issues in the GAME_OVER broadcast path:

1. log_game_end called update_game_completed with the default winner_id=None,
   so games_v2.winner_id was NULL on all 17 completed staging rows. The
   denormalized column existed but carried no information. Compute the winner
   (lowest total; None on tie) in broadcast_game_state and pass it through to
   log_game_end.

2. _process_stats_safe had no idempotency guard. log_game_end was already
   self-guarding via game_log_id=None after first fire, but nothing
   stopped repeated GAME_OVER broadcasts from re-firing stats and
   double-counting games_played/games_won. Add Room.stats_processed latch;
   reset it in handle_start_game so a re-used room still records.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 00:47:53 -04:00
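
The two rules in this commit can be sketched in isolation. The `Player` and `RoomLatch` classes below are simplified, hypothetical stand-ins for the real game and Room objects; only the winner rule (lowest total, `None` on tie) and the latch/reset behaviour are taken from the commit.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Player:
    id: str
    total_score: int


def pick_winner(players: Sequence[Player]) -> Optional[str]:
    """Lowest total wins; a tie for lowest (or no players) yields None."""
    if not players:
        return None
    lowest = min(p.total_score for p in players)
    leaders = [p for p in players if p.total_score == lowest]
    return leaders[0].id if len(leaders) == 1 else None


@dataclass
class RoomLatch:
    stats_processed: bool = False
    stats_runs: int = 0  # counts how often stats processing was scheduled

    def on_game_over_broadcast(self) -> None:
        # Set the latch before scheduling work so repeat broadcasts are no-ops.
        if not self.stats_processed:
            self.stats_processed = True
            self.stats_runs += 1

    def on_start_game(self) -> None:
        # A reused room must record its next game, so the latch resets here.
        self.stats_processed = False
```

Setting the flag before scheduling the work matters because the stats task is fire-and-forget: a second GAME_OVER broadcast can arrive before the first task has finished.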
adlee-was-taken
8030a3c171 fix(server): populate games_v2 metadata on game start
update_game_started (started_at, num_players, num_rounds, player_ids)
was defined in event_store but had zero callers. 289/289 staging games
had those fields NULL — queries that joined on them returned garbage,
and the denormalized player_ids GIN index was dead weight.

log_game_start now calls create_game THEN update_game_started in one
async task. If create fails, update is skipped (row doesn't exist).
handlers.py passes num_rounds and player_ids through at call time.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 00:41:58 -04:00
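
A rough sketch of the create-then-update ordering described above. `FakeStore` is a hypothetical in-memory stand-in for the event store, and this helper returns a success flag purely for illustration (the real `log_game_start_async` returns the game id in every case).

```python
import asyncio


class FakeStore:
    """Hypothetical in-memory stand-in for the event store."""

    def __init__(self, fail_create: bool = False) -> None:
        self.fail_create = fail_create
        self.rows: dict = {}

    async def create_game(self, game_id: str) -> None:
        if self.fail_create:
            raise RuntimeError("db down")
        self.rows[game_id] = {"num_players": None}

    async def update_game_started(self, game_id: str, num_players: int) -> None:
        self.rows[game_id]["num_players"] = num_players


async def log_game_start(store: FakeStore, game_id: str, num_players: int) -> bool:
    try:
        await store.create_game(game_id)
    except Exception:
        return False  # the row never landed, so there is nothing to update
    try:
        await store.update_game_started(game_id, num_players)
    except Exception:
        return False  # row exists, but its metadata stays NULL
    return True


ok_store = FakeStore()
ok = asyncio.run(log_game_start(ok_store, "g1", 3))

bad_store = FakeStore(fail_create=True)
bad = asyncio.run(log_game_start(bad_store, "g2", 3))
```

Running both steps inside one task keeps them ordered; two separately scheduled tasks could in principle run the update before the row exists.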
adlee-was-taken
d5f8eef6b3 fix(server): mark games abandoned on room teardown + staging leaderboard
When handle_player_leave emptied a room or handle_end_game was invoked,
the room was removed from memory without touching games_v2. Periodic
cleanup only scans in-memory rooms, so those rows were stranded as
status='active' forever — staging had 42 orphans accumulated over 5h.

- event_store.update_game_abandoned: guarded UPDATE (status='active' only)
- GameLogger.log_game_abandoned{,_async}: fire-and-forget wrapper
- handle_end_game + handle_player_leave: flip status before remove_room
- LEADERBOARD_INCLUDE_TEST_DEFAULT: env override so staging can show
  soak-harness accounts by default; prod keeps them hidden

Verified on staging: 42 orphans swept on restart, soak accounts now
visible on /api/stats/leaderboard (rank 1-4).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 00:37:49 -04:00
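
The guard on the UPDATE is what keeps a completed game from being flipped back. A self-contained illustration using the standard-library `sqlite3` module (an assumption made for runnability; the real code runs against Postgres with `$1` placeholders and also sets `completed_at`), with a reduced two-column table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE games_v2 (id TEXT PRIMARY KEY, status TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO games_v2 (id, status) VALUES (?, ?)",
    [("g-active", "active"), ("g-done", "completed")],
)


def mark_abandoned(game_id: str) -> int:
    """Flip a row to 'abandoned' only if it is still 'active'; return rows changed."""
    cur = conn.execute(
        "UPDATE games_v2 SET status = 'abandoned' WHERE id = ? AND status = 'active'",
        (game_id,),
    )
    conn.commit()
    return cur.rowcount


changed_active = mark_abandoned("g-active")  # an orphaned active row
changed_done = mark_abandoned("g-done")      # an already-completed row
statuses = dict(conn.execute("SELECT id, status FROM games_v2"))
```

Because the guard lives in the WHERE clause, calling it twice for the same game, or after the game has completed, changes nothing.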
adlee-was-taken
70498b1c33 fix(soak): multi-hole round transitions, token refresh, dashboard wiring
- Session loop now handles round_over by clicking #ss-next-btn (the
  scoresheet modal button) instead of exiting early. Waits for next
  round or game_over before continuing.
- SessionPool detects expired tokens on acquire and re-logins
  automatically, writing fresh credentials to .env.stresstest.
- Added 2s post-game delay before goto('/') so the server can process
  game completion before WebSocket disconnect.
- Wired dashboard metrics (games_completed, moves_total, errors),
  activity log entries, and player tiles for both populate and stress
  scenarios.
- Bumped screencast resolution to 960x540 and set headless viewport
  to 960x800 for better click-to-watch framing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 20:37:24 -04:00
adlee-was-taken
ccc2f3b559 fix(server): game completion pipeline — stats recording + dict iteration safety
Three bugs prevented game stats from recording:

1. broadcast_game_state had game_over processing (log_game_end + stats)
   inside the per-player loop — if all players disconnected before the
   loop ran, stats never processed. Moved to run once before the loop.

2. room.broadcast and broadcast_game_state iterated players.items()
   without snapshotting, causing RuntimeError when concurrent player
   disconnects mutated the dict. Fixed with list().

3. stats_service.process_game_from_state passed avg_round_score to a
   CASE expression without a type hint, causing asyncpg to fail with
   "could not determine data type of parameter $6". Added ::integer
   casts.

Also wrapped per-player send_json calls in try/except so a single
disconnected player doesn't abort the broadcast to remaining players.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-17 20:37:08 -04:00
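
Bug 2 can be reproduced without any networking. In the sketch below a disconnect is simulated synchronously by removing a player while the loop is running; in the server the mutation comes from another task while the broadcast is awaiting a send. The function and player names are illustrative only.

```python
def broadcast(players: dict, snapshot: bool) -> list:
    """Visit every player; a simulated disconnect mutates the dict mid-loop."""
    delivered = []
    items = list(players.items()) if snapshot else players.items()
    for pid, _ws in items:
        delivered.append(pid)
        if pid == "p1":
            players.pop("p3", None)  # simulated concurrent disconnect
    return delivered


try:
    broadcast({"p1": "ws1", "p2": "ws2", "p3": "ws3"}, snapshot=False)
    unsafe_raised = False
except RuntimeError:
    # "dictionary changed size during iteration"
    unsafe_raised = True

safe_order = broadcast({"p1": "ws1", "p2": "ws2", "p3": "ws3"}, snapshot=True)
```

The snapshot still contains the player who has just left, which is why the same commit wraps each `send_json` call in try/except: the loop survives, and a send to a dead socket is skipped instead of aborting the broadcast.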
adlee-was-taken
d5194f43ba docs(soak): full README + validation checklist
Replaces the Task 31 stub README with complete documentation:
quickstart, first-time setup (invite flagging, seeding, smoke),
usage examples for all three watch modes, CLI flag reference, env
var table, scenario descriptions, error handling summary, test
account filtering explanation, and architecture overview.

Adds CHECKLIST.md with post-deploy verification, bring-up,
scenario, watch mode, failure handling, and staging gate items.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 23:05:22 -04:00
21 changed files with 993 additions and 110 deletions


@@ -400,7 +400,7 @@
<!-- Toast Container -->
<div id="toast-container"></div>
-<footer class="app-footer" style="text-align: center; padding: 16px; color: #888; font-size: 12px;">v3.3.4 &copy; Aaron D. Lee</footer>
+<footer class="app-footer" style="text-align: center; padding: 16px; color: #888; font-size: 12px;">v3.3.5 &copy; Aaron D. Lee</footer>
<script src="admin.js"></script>
</body>


@@ -55,7 +55,7 @@
<p id="lobby-error" class="error"></p>
-<footer class="app-footer">v3.3.4 &copy; Aaron D. Lee</footer>
+<footer class="app-footer">v3.3.5 &copy; Aaron D. Lee</footer>
</div>
<!-- Matchmaking Screen -->
@@ -288,7 +288,7 @@
<p id="waiting-message" class="info">Waiting for host to start the game...</p>
</div>
-<footer class="app-footer">v3.3.4 &copy; Aaron D. Lee</footer>
+<footer class="app-footer">v3.3.5 &copy; Aaron D. Lee</footer>
</div>
<!-- Game Screen -->


@@ -35,6 +35,7 @@ services:
- BOOTSTRAP_ADMIN_USERNAME=${BOOTSTRAP_ADMIN_USERNAME:-}
- BOOTSTRAP_ADMIN_PASSWORD=${BOOTSTRAP_ADMIN_PASSWORD:-}
- MATCHMAKING_ENABLED=true
- LEADERBOARD_INCLUDE_TEST_DEFAULT=${LEADERBOARD_INCLUDE_TEST_DEFAULT:-false}
depends_on:
postgres:
condition: service_healthy


@@ -1,6 +1,6 @@
[project]
name = "golfgame"
-version = "3.3.4"
+version = "3.3.5"
description = "6-Card Golf card game with AI opponents"
readme = "README.md"
requires-python = ">=3.11"


@@ -171,6 +171,12 @@ class ServerConfig:
# Rate limiting
RATE_LIMIT_ENABLED: bool = True
# Leaderboard: include soak-harness test accounts by default when the
# client doesn't pass ?include_test. Production keeps this False so real
# users never see synthetic traffic; staging can flip it True so bring-up
# traffic actually shows on the board.
LEADERBOARD_INCLUDE_TEST_DEFAULT: bool = False
# Error tracking (Sentry)
SENTRY_DSN: str = ""
@@ -216,6 +222,7 @@ class ServerConfig:
MATCHMAKING_MAX_PLAYERS=get_env_int("MATCHMAKING_MAX_PLAYERS", 4),
ADMIN_EMAILS=admin_emails,
RATE_LIMIT_ENABLED=get_env_bool("RATE_LIMIT_ENABLED", True),
LEADERBOARD_INCLUDE_TEST_DEFAULT=get_env_bool("LEADERBOARD_INCLUDE_TEST_DEFAULT", False),
SENTRY_DSN=get_env("SENTRY_DSN", ""),
card_values=CardValues(
ACE=get_env_int("CARD_ACE", 1),


@@ -209,12 +209,16 @@ async def handle_start_game(data: dict, ctx: ConnectionContext, *, broadcast_gam
async with ctx.current_room.game_lock:
ctx.current_room.game.start_game(num_decks, num_rounds, options)
# Reset the per-game idempotency latch so this game's stats can fire.
ctx.current_room.stats_processed = False
game_logger = get_logger()
if game_logger:
ctx.current_room.game_log_id = game_logger.log_game_start(
room_code=ctx.current_room.code,
num_players=len(ctx.current_room.players),
num_rounds=num_rounds,
player_ids=[p.id for p in ctx.current_room.game.players],
options=options,
)
@@ -508,6 +512,14 @@ async def handle_end_game(data: dict, ctx: ConnectionContext, *, room_manager, c
pass
ctx.current_room.cpu_turn_task = None
# Mark the DB row abandoned before we lose the room (and its game_log_id)
# from memory — otherwise games_v2 would be stranded as 'active' forever.
if ctx.current_room.game_log_id:
game_logger = get_logger()
if game_logger:
game_logger.log_game_abandoned(ctx.current_room.game_log_id)
ctx.current_room.game_log_id = None
await ctx.current_room.broadcast({
"type": "game_ended",
"reason": "Host ended the game",


@@ -432,7 +432,7 @@ async def _close_all_websockets():
app = FastAPI(
title="Golf Card Game",
debug=config.DEBUG,
-version="3.2.0",
+version="3.3.5",
lifespan=lifespan,
)
@@ -751,16 +751,42 @@ async def broadcast_game_state(room: Room):
spectator_state = room.game.get_state(None) # No player perspective
await _spectator_manager.send_game_state(room.code, spectator_state)
for pid, player in room.players.items():
# Process game completion BEFORE the per-player loop so it runs exactly
# once and isn't gated on any player still being connected.
if room.game.phase == GamePhase.GAME_OVER:
# Determine winner (lowest total; None on tie) so games_v2.winner_id
# is actually populated and stats/rating agree with each other.
winner_id: Optional[str] = None
if room.game.players:
lowest = min(p.total_score for p in room.game.players)
leaders = [p for p in room.game.players if p.total_score == lowest]
if len(leaders) == 1:
winner_id = leaders[0].id
game_logger = get_logger()
if game_logger and room.game_log_id:
game_logger.log_game_end(room.game_log_id, winner_id=winner_id)
room.game_log_id = None
# Idempotency: latch on room so repeat GAME_OVER broadcasts don't
# double-count. Set before scheduling the task — the task itself is
# fire-and-forget and might outlive this function.
if _stats_service and room.game.players and not room.stats_processed:
room.stats_processed = True
asyncio.create_task(_process_stats_safe(room))
for pid, player in list(room.players.items()):
# Skip CPU players
if player.is_cpu or not player.websocket:
continue
game_state = room.game.get_state(pid)
try:
await player.websocket.send_json({
"type": "game_state",
"game_state": game_state,
})
except Exception:
continue
# Check for round over
if room.game.phase == GamePhase.ROUND_OVER:
@@ -768,9 +794,9 @@ async def broadcast_game_state(room: Room):
{"id": p.id, "name": p.name, "score": p.score, "total": p.total_score, "rounds_won": p.rounds_won}
for p in room.game.players
]
# Build rankings
by_points = sorted(scores, key=lambda x: x["total"])
by_holes_won = sorted(scores, key=lambda x: -x["rounds_won"])
try:
await player.websocket.send_json({
"type": "round_over",
"scores": scores,
@@ -782,25 +808,17 @@ async def broadcast_game_state(room: Room):
"by_holes_won": by_holes_won,
},
})
except Exception:
pass
# Check for game over
elif room.game.phase == GamePhase.GAME_OVER:
# Log game end
game_logger = get_logger()
if game_logger and room.game_log_id:
game_logger.log_game_end(room.game_log_id)
room.game_log_id = None # Clear to avoid duplicate logging
# Process stats asynchronously (fire-and-forget) to avoid delaying game over notifications
if _stats_service and room.game.players:
asyncio.create_task(_process_stats_safe(room))
scores = [
{"name": p.name, "total": p.total_score, "rounds_won": p.rounds_won}
for p in room.game.players
]
by_points = sorted(scores, key=lambda x: x["total"])
by_holes_won = sorted(scores, key=lambda x: -x["rounds_won"])
try:
await player.websocket.send_json({
"type": "game_over",
"final_scores": by_points,
@@ -809,6 +827,8 @@ async def broadcast_game_state(room: Room):
"by_holes_won": by_holes_won,
},
})
except Exception:
pass
# Notify current player it's their turn (only if human)
elif room.game.phase in (GamePhase.PLAYING, GamePhase.FINAL_TURN):
@@ -909,6 +929,14 @@ async def handle_player_leave(room: Room, player_id: str):
# Check both is_empty() AND human_player_count() — CPU players keep rooms
# technically non-empty, but a room with only CPUs is an abandoned room.
if room.is_empty() or room.human_player_count() == 0:
# Mark games_v2 abandoned while we still hold the game_log_id. After
# remove_room() the row would be stranded as 'active' — periodic
# cleanup only scans in-memory rooms.
if room.game_log_id:
game_logger = get_logger()
if game_logger:
game_logger.log_game_abandoned(room.game_log_id)
room.game_log_id = None
# Remove all remaining CPU players to release their profiles
for cpu in list(room.get_cpu_players()):
room.remove_player(cpu.id)


@@ -73,6 +73,10 @@ class Room:
game_lock: asyncio.Lock = field(default_factory=asyncio.Lock)
cpu_turn_task: Optional[asyncio.Task] = None
last_activity: float = field(default_factory=time.time)
# Latched True after _process_stats_safe fires for this game; prevents
# double-counting if broadcast_game_state is invoked multiple times
# with phase=GAME_OVER (double-click on next-round, reconnect flush).
stats_processed: bool = False
def touch(self) -> None:
"""Update last_activity timestamp to mark room as active."""
@@ -232,7 +236,7 @@ class Room:
message: JSON-serializable message dict.
exclude: Optional player ID to skip.
"""
-for player_id, player in self.players.items():
+for player_id, player in list(self.players.items()):
if player_id != exclude and player.websocket and not player.is_cpu:
try:
await player.websocket.send_json(message)


@@ -12,6 +12,7 @@ from typing import Optional
from fastapi import APIRouter, Depends, HTTPException, Header, Query
from pydantic import BaseModel
from config import config
from models.user import User
from services.stats_service import StatsService
@@ -159,7 +160,7 @@ async def get_leaderboard(
metric: str = Query("wins", pattern="^(wins|win_rate|avg_score|knockouts|streak|rating)$"),
limit: int = Query(50, ge=1, le=100),
offset: int = Query(0, ge=0),
-include_test: bool = Query(False, description="Include soak-harness test accounts"),
+include_test: Optional[bool] = Query(None, description="Include soak-harness test accounts. Defaults to LEADERBOARD_INCLUDE_TEST_DEFAULT env (False in prod)."),
service: StatsService = Depends(get_stats_service_dep),
):
"""
@@ -173,9 +174,14 @@ async def get_leaderboard(
- streak: Best win streak
Players must have 5+ games to appear on leaderboards.
By default, soak-harness test accounts are hidden.
Soak-harness test accounts are hidden unless include_test is passed,
or LEADERBOARD_INCLUDE_TEST_DEFAULT is set True on the server (staging).
"""
entries = await service.get_leaderboard(metric, limit, offset, include_test)
effective_include_test = (
include_test if include_test is not None
else config.LEADERBOARD_INCLUDE_TEST_DEFAULT
)
entries = await service.get_leaderboard(metric, limit, offset, effective_include_test)
return {
"metric": metric,


@@ -63,21 +63,19 @@ class GameLogger:
self,
room_code: str,
num_players: int,
num_rounds: int,
player_ids: list[str],
options: "GameOptions",
game_id: Optional[str] = None,
) -> str:
"""
Log game start, return game_id.
Log game start. Writes the row via create_game and then populates
started_at/num_players/num_rounds/player_ids via update_game_started
so downstream queries don't see a half-initialized games_v2 row.
Creates a game record in games_v2 table.
Args:
room_code: Room code for the game.
num_players: Number of players.
options: Game options/house rules.
Returns:
Generated game UUID.
If create_game fails the update is skipped — the row doesn't exist.
"""
if game_id is None:
game_id = str(uuid.uuid4())
try:
@@ -87,9 +85,20 @@ class GameLogger:
host_id="system",
options=self._options_to_dict(options),
)
except Exception as e:
log.error(f"Failed to log game start (create): {e}")
return game_id
try:
await self.event_store.update_game_started(
game_id,
num_players,
num_rounds,
player_ids,
)
log.debug(f"Logged game start: {game_id} room={room_code}")
except Exception as e:
log.error(f"Failed to log game start: {e}")
log.error(f"Failed to log game start (update): {e}")
return game_id
@@ -97,6 +106,8 @@ class GameLogger:
self,
room_code: str,
num_players: int,
num_rounds: int,
player_ids: list[str],
options: "GameOptions",
) -> str:
"""
@@ -108,47 +119,46 @@ class GameLogger:
game_id = str(uuid.uuid4())
try:
loop = asyncio.get_running_loop()
# Already in async context - fire task, return ID immediately
asyncio.create_task(self._log_game_start_with_id(game_id, room_code, num_players, options))
asyncio.get_running_loop()
asyncio.create_task(
self.log_game_start_async(
room_code=room_code,
num_players=num_players,
num_rounds=num_rounds,
player_ids=player_ids,
options=options,
game_id=game_id,
)
)
return game_id
except RuntimeError:
# Not in async context - run synchronously
return asyncio.run(self.log_game_start_async(room_code, num_players, options))
return asyncio.run(
self.log_game_start_async(
room_code=room_code,
num_players=num_players,
num_rounds=num_rounds,
player_ids=player_ids,
options=options,
game_id=game_id,
)
)
async def _log_game_start_with_id(
async def log_game_end_async(
self,
game_id: str,
room_code: str,
num_players: int,
options: "GameOptions",
winner_id: Optional[str] = None,
) -> None:
"""Helper to log game start with pre-generated ID."""
try:
await self.event_store.create_game(
game_id=game_id,
room_code=room_code,
host_id="system",
options=self._options_to_dict(options),
)
log.debug(f"Logged game start: {game_id} room={room_code}")
except Exception as e:
log.error(f"Failed to log game start: {e}")
async def log_game_end_async(self, game_id: str) -> None:
"""
Mark game as ended.
Args:
game_id: Game UUID.
Mark game as ended. winner_id is the player who finished with the
lowest total — None when tied or when the caller doesn't have it.
"""
try:
await self.event_store.update_game_completed(game_id)
log.debug(f"Logged game end: {game_id}")
await self.event_store.update_game_completed(game_id, winner_id)
log.debug(f"Logged game end: {game_id} winner={winner_id}")
except Exception as e:
log.error(f"Failed to log game end: {e}")
def log_game_end(self, game_id: str) -> None:
def log_game_end(self, game_id: str, winner_id: Optional[str] = None) -> None:
"""
Sync wrapper for log_game_end_async.
@@ -158,12 +168,30 @@ class GameLogger:
return
try:
loop = asyncio.get_running_loop()
asyncio.create_task(self.log_game_end_async(game_id))
asyncio.get_running_loop()
asyncio.create_task(self.log_game_end_async(game_id, winner_id))
except RuntimeError:
# Not in async context - skip (simulations don't need this)
pass
async def log_game_abandoned_async(self, game_id: str) -> None:
"""Mark game as abandoned (room emptied before GAME_OVER)."""
try:
await self.event_store.update_game_abandoned(game_id)
log.debug(f"Logged game abandoned: {game_id}")
except Exception as e:
log.error(f"Failed to log game abandoned: {e}")
def log_game_abandoned(self, game_id: str) -> None:
"""Sync wrapper: fires async task in async context, no-op otherwise."""
if not game_id:
return
try:
asyncio.get_running_loop()
asyncio.create_task(self.log_game_abandoned_async(game_id))
except RuntimeError:
pass
# -------------------------------------------------------------------------
# Move Logging
# -------------------------------------------------------------------------
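
The sync wrappers in this file share one pattern: schedule the coroutine if an event loop is running, otherwise do nothing. A reduced sketch of that pattern, with hypothetical names (`write`, `write_async`) and a list standing in for the database:

```python
import asyncio

calls: list = []


async def write_async(game_id: str) -> None:
    calls.append(game_id)  # stands in for the awaited DB write


def write(game_id: str) -> bool:
    """Schedule the write if a loop is running; otherwise a no-op."""
    if not game_id:
        return False
    try:
        asyncio.get_running_loop()  # raises RuntimeError outside a running loop
    except RuntimeError:
        return False
    asyncio.create_task(write_async(game_id))
    return True


async def main() -> bool:
    scheduled = write("g1")
    await asyncio.sleep(0)  # yield once so the scheduled task gets to run
    return scheduled


scheduled_in_loop = asyncio.run(main())
scheduled_outside = write("g2")  # no running loop here
```

The tests in this PR rely on the same yield (`await asyncio.sleep(0)`) to let the task run before asserting. One caveat worth keeping in mind: the asyncio documentation recommends holding a reference to tasks created this way, since the event loop only keeps weak references to them.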


@@ -781,7 +781,7 @@ class StatsService:
# We don't have per-round data in legacy mode, so some stats are limited
# Use total_score / num_rounds as an approximation for avg round score
-avg_round_score = total_score / num_rounds if num_rounds > 0 else None
+avg_round_score = total_score // num_rounds if num_rounds > 0 else total_score
# Update stats
await conn.execute("""
@@ -792,13 +792,13 @@ class StatsService:
rounds_won = rounds_won + $4,
total_points = total_points + $5,
best_score = CASE
WHEN best_score IS NULL THEN $6
WHEN $6 IS NOT NULL AND $6 < best_score THEN $6
WHEN best_score IS NULL THEN $6::integer
WHEN $6::integer IS NOT NULL AND $6::integer < best_score THEN $6::integer
ELSE best_score
END,
worst_score = CASE
WHEN worst_score IS NULL THEN $7
WHEN $7 IS NOT NULL AND $7 > worst_score THEN $7
WHEN worst_score IS NULL THEN $7::integer
WHEN $7::integer IS NOT NULL AND $7::integer > worst_score THEN $7::integer
ELSE worst_score
END,
current_win_streak = CASE WHEN $2 = 1 THEN current_win_streak + 1 ELSE 0 END,


@@ -432,6 +432,23 @@ class EventStore:
winner_id,
)
async def update_game_abandoned(self, game_id: str) -> None:
"""
Mark a game as abandoned — used when a room empties before the game
reaches GAME_OVER, so games_v2 never leaks as stranded 'active'.
Only flips rows that are still active so a legitimately completed game
never gets reverted.
"""
async with self.pool.acquire() as conn:
await conn.execute(
"""
UPDATE games_v2
SET status = 'abandoned', completed_at = NOW()
WHERE id = $1 AND status = 'active'
""",
game_id,
)
async def get_active_games(self) -> list[dict]:
"""
Get all active games for recovery on server restart.


@@ -0,0 +1,327 @@
# SPDX-License-Identifier: GPL-3.0-or-later
"""
Tests for game lifecycle logging — ensuring games_v2 rows never leak as
stranded 'active' when a room is removed from memory without the game
transitioning to GAME_OVER.
Context: on staging we observed 42 games stuck in status='active' because
handle_player_leave and handle_end_game removed the room from memory
without updating games_v2. The periodic cleanup only scans in-memory rooms,
so those rows were orphaned forever.
These tests pin down the fix:
1. GameLogger.log_game_abandoned_async calls event_store.update_game_abandoned
2. handle_end_game marks the game abandoned when the host ends the game
"""
import asyncio
import pytest
from unittest.mock import AsyncMock, MagicMock, patch
from game import GameOptions
from room import Room, RoomManager
from services.game_logger import GameLogger
from handlers import ConnectionContext, handle_end_game, handle_start_game
from test_handlers import MockWebSocket, make_ctx
# ---------------------------------------------------------------------------
# GameLogger.log_game_abandoned — unit
# ---------------------------------------------------------------------------
class TestLogGameAbandoned:
@pytest.mark.asyncio
async def test_calls_update_game_abandoned(self):
"""log_game_abandoned_async delegates to event_store.update_game_abandoned."""
event_store = MagicMock()
event_store.update_game_abandoned = AsyncMock()
logger = GameLogger(event_store)
await logger.log_game_abandoned_async("game-uuid-123")
event_store.update_game_abandoned.assert_awaited_once_with("game-uuid-123")
@pytest.mark.asyncio
async def test_sync_wrapper_fires_task(self):
"""Sync log_game_abandoned fires an async task in async context."""
event_store = MagicMock()
event_store.update_game_abandoned = AsyncMock()
logger = GameLogger(event_store)
logger.log_game_abandoned("game-uuid-456")
# Let the fire-and-forget task run
await asyncio.sleep(0)
await asyncio.sleep(0)
event_store.update_game_abandoned.assert_awaited_once_with("game-uuid-456")
def test_sync_wrapper_noop_on_empty_id(self):
"""Empty game_id is a no-op (nothing to abandon)."""
event_store = MagicMock()
event_store.update_game_abandoned = AsyncMock()
logger = GameLogger(event_store)
logger.log_game_abandoned("")
logger.log_game_abandoned(None)
event_store.update_game_abandoned.assert_not_called()
@pytest.mark.asyncio
async def test_swallows_db_exceptions(self):
"""DB errors are logged, not re-raised (fire-and-forget guarantee)."""
event_store = MagicMock()
event_store.update_game_abandoned = AsyncMock(side_effect=Exception("db down"))
logger = GameLogger(event_store)
# Must not raise
await logger.log_game_abandoned_async("game-uuid-789")
# ---------------------------------------------------------------------------
# handle_end_game integration
# ---------------------------------------------------------------------------
class TestHandleEndGameMarksAbandoned:
@pytest.mark.asyncio
async def test_marks_game_abandoned_before_room_removal(self):
"""When host ends the game, games_v2 must be marked abandoned."""
rm = RoomManager()
room = rm.create_room()
host_ws = MockWebSocket()
room.add_player("host", "Host", host_ws)
room.get_player("host").is_host = True
room.game_log_id = "game-uuid-end"
mock_logger = MagicMock()
mock_logger.log_game_abandoned = MagicMock()
ctx = make_ctx(websocket=host_ws, player_id="host", room=room)
with patch("handlers.get_logger", return_value=mock_logger):
await handle_end_game(
{},
ctx,
room_manager=rm,
cleanup_room_profiles=lambda _code: None,
)
mock_logger.log_game_abandoned.assert_called_once_with("game-uuid-end")
assert room.code not in rm.rooms # room still removed
@pytest.mark.asyncio
async def test_no_log_when_game_log_id_missing(self):
"""If the game never logged a start, there's nothing to mark abandoned."""
rm = RoomManager()
room = rm.create_room()
host_ws = MockWebSocket()
room.add_player("host", "Host", host_ws)
room.get_player("host").is_host = True
# room.game_log_id stays None
mock_logger = MagicMock()
mock_logger.log_game_abandoned = MagicMock()
ctx = make_ctx(websocket=host_ws, player_id="host", room=room)
with patch("handlers.get_logger", return_value=mock_logger):
await handle_end_game(
{},
ctx,
room_manager=rm,
cleanup_room_profiles=lambda _code: None,
)
mock_logger.log_game_abandoned.assert_not_called()
@pytest.mark.asyncio
async def test_non_host_cannot_trigger_abandonment(self):
"""Only the host ends games — non-host requests are rejected unchanged."""
rm = RoomManager()
room = rm.create_room()
room.add_player("host", "Host", MockWebSocket())
room.get_player("host").is_host = True
joiner_ws = MockWebSocket()
room.add_player("joiner", "Joiner", joiner_ws)
room.game_log_id = "game-uuid-untouchable"
mock_logger = MagicMock()
mock_logger.log_game_abandoned = MagicMock()
ctx = make_ctx(websocket=joiner_ws, player_id="joiner", room=room)
with patch("handlers.get_logger", return_value=mock_logger):
await handle_end_game(
{},
ctx,
room_manager=rm,
cleanup_room_profiles=lambda _code: None,
)
mock_logger.log_game_abandoned.assert_not_called()
assert room.code in rm.rooms
class TestLogGameStartPopulatesMetadata:
"""
create_game only writes id/room_code/host_id/options. update_game_started
(which fills in started_at, num_players, num_rounds, player_ids) existed
but had zero callers — 100% of staging's 289 games had those fields NULL.
log_game_start must call both so the row is complete after start_game.
"""
@pytest.mark.asyncio
async def test_start_calls_create_and_update(self):
event_store = MagicMock()
event_store.create_game = AsyncMock()
event_store.update_game_started = AsyncMock()
logger = GameLogger(event_store)
options = GameOptions(initial_flips=0)
await logger.log_game_start_async(
room_code="ABCD",
num_players=3,
num_rounds=9,
player_ids=["p1", "p2", "p3"],
options=options,
)
event_store.create_game.assert_awaited_once()
event_store.update_game_started.assert_awaited_once()
call = event_store.update_game_started.await_args
assert call.kwargs.get("num_players", call.args[1] if len(call.args) > 1 else None) == 3
assert call.kwargs.get("num_rounds", call.args[2] if len(call.args) > 2 else None) == 9
assert call.kwargs.get("player_ids", call.args[3] if len(call.args) > 3 else None) == ["p1", "p2", "p3"]
@pytest.mark.asyncio
async def test_update_uses_same_game_id_as_create(self):
event_store = MagicMock()
event_store.create_game = AsyncMock()
event_store.update_game_started = AsyncMock()
logger = GameLogger(event_store)
await logger.log_game_start_async(
room_code="XYZW",
num_players=2,
num_rounds=1,
player_ids=["a", "b"],
options=GameOptions(initial_flips=0),
)
create_args = event_store.create_game.await_args
update_args = event_store.update_game_started.await_args
created_id = create_args.kwargs.get("game_id", create_args.args[0] if create_args.args else None)
updated_id = update_args.args[0] if update_args.args else update_args.kwargs.get("game_id")
assert created_id == updated_id
assert created_id # non-empty
@pytest.mark.asyncio
async def test_create_failure_skips_update(self):
"""If the row never landed, don't try to update a non-existent id."""
event_store = MagicMock()
event_store.create_game = AsyncMock(side_effect=Exception("db down"))
event_store.update_game_started = AsyncMock()
logger = GameLogger(event_store)
await logger.log_game_start_async(
room_code="ABCD",
num_players=2,
num_rounds=1,
player_ids=["a", "b"],
options=GameOptions(initial_flips=0),
)
event_store.update_game_started.assert_not_awaited()
class TestLogGameEndWinnerId:
"""
update_game_completed accepts winner_id but the existing sync wrapper
called it with the default None → every completed games_v2 row had
winner_id NULL. Thread the winner through so the denormalized column
is actually useful.
"""
@pytest.mark.asyncio
async def test_winner_id_passed_through(self):
event_store = MagicMock()
event_store.update_game_completed = AsyncMock()
logger = GameLogger(event_store)
        await logger.log_game_end_async("game-uuid", winner_id="player-7")
        event_store.update_game_completed.assert_awaited_once_with("game-uuid", "player-7")

    @pytest.mark.asyncio
    async def test_winner_id_optional(self):
        """A tie or abandonment-style end without a clear winner still works."""
        event_store = MagicMock()
        event_store.update_game_completed = AsyncMock()
        logger = GameLogger(event_store)
        await logger.log_game_end_async("game-uuid")
        event_store.update_game_completed.assert_awaited_once_with("game-uuid", None)

    @pytest.mark.asyncio
    async def test_sync_wrapper_forwards_winner(self):
        event_store = MagicMock()
        event_store.update_game_completed = AsyncMock()
        logger = GameLogger(event_store)
        logger.log_game_end("game-uuid", winner_id="player-9")
        # Yield to the event loop twice so the scheduled task can run.
        await asyncio.sleep(0)
        await asyncio.sleep(0)
        event_store.update_game_completed.assert_awaited_once_with("game-uuid", "player-9")


class TestStatsIdempotency:
    """
    broadcast_game_state can fire multiple times with phase=GAME_OVER
    (double-click next-round, reconnect flush, etc.). log_game_end is
    already idempotent because it nulls game_log_id immediately after.
    _process_stats_safe had no such guard → every extra broadcast would
    double-count games_played/games_won on the same game.
    Solution: Room.stats_processed flag. Set True before firing the task.
    """

    def test_room_has_stats_processed_flag_defaulting_false(self):
        room = Room(code="TEST")
        assert room.stats_processed is False

    def test_stats_processed_survives_touch(self):
        """touch() updates last_activity but must not clobber stats_processed."""
        room = Room(code="TEST")
        room.stats_processed = True
        room.touch()
        assert room.stats_processed is True

    @pytest.mark.asyncio
    async def test_start_game_resets_stats_processed(self):
        """When a room is reused for a second game, the latch must reset;
        otherwise the new game's stats would be silently dropped."""
        from handlers import handle_start_game
        rm = RoomManager()
        room = rm.create_room()
        host_ws = MockWebSocket()
        room.add_player("host", "Host", host_ws)
        room.get_player("host").is_host = True
        room.add_player("p2", "P2", MockWebSocket())
        # Previous game already had stats processed
        room.stats_processed = True
        ctx = make_ctx(websocket=host_ws, player_id="host", room=room)
        with patch("handlers.get_logger", return_value=None):
            await handle_start_game(
                {"decks": 1, "rounds": 1},
                ctx,
                broadcast_game_state=AsyncMock(),
                check_and_run_cpu_turn=lambda _r: None,
            )
        assert room.stats_processed is False

tests/soak/CHECKLIST.md (new file)

@@ -0,0 +1,64 @@
# Soak Harness Validation Checklist
Run after significant changes or before calling the harness implementation complete.
## Post-deploy schema verification
Run after the server-side changes deploy to each environment.
- [ ] Server restarted (`docker compose up -d` or CI/CD deploy)
- [ ] Server logs show `User store schema initialized` after restart
- [ ] `\d users_v2` shows `is_test_account` column with default `false`
- [ ] `\d invite_codes` shows `marks_as_test` column with default `false`
- [ ] `\d leaderboard_overall` shows `is_test_account` column
- [ ] `\di idx_users_test_account` shows the partial index
- [ ] Leaderboard query still works: `curl .../api/stats/leaderboard` returns entries
- [ ] `?include_test=true` parameter is accepted (no 422/500)
## Bring-up
- [ ] Invite code flagged with `marks_as_test=TRUE` on target environment
- [ ] `bun run seed` creates/updates accounts in `.env.stresstest`
- [ ] All seeded users show `is_test_account=TRUE` in the DB
## Smoke test
- [ ] `bash scripts/smoke.sh` exits 0 within 60s
## Scenarios
- [ ] `--scenario=populate --rooms=1 --games-per-room=1` completes cleanly
- [ ] `--scenario=populate --rooms=2 --games-per-room=2` runs multiple rooms and multiple games
- [ ] `--scenario=stress --games-per-room=3` logs `chaos_injected` events and completes
## Watch modes
- [ ] `--watch=none` produces JSONL on stdout, nothing else
- [ ] `--watch=dashboard` opens http://localhost:7777, grid renders, WS shows `healthy`
- [ ] Clicking a player tile opens the video modal with live JPEG frames
- [ ] Closing the modal (Esc or Close) stops the screencast (check logs for `screencast_stopped`)
- [ ] `--watch=tiled` opens native Chromium windows sized to show the full game table
## Failure handling
- [ ] Ctrl-C during a run → graceful shutdown, summary printed, exit code 2
- [ ] Double Ctrl-C → immediate hard exit (130)
- [ ] Health probes detect server down (3 consecutive failures → fatal abort)
- [ ] Artifacts directory contains screenshots + state JSON on failure
- [ ] Artifacts older than 7 days are pruned on next startup
## Server-side filtering
- [ ] `GET /api/stats/leaderboard` (default) hides soak accounts
- [ ] `GET /api/stats/leaderboard?include_test=true` shows soak accounts
- [ ] Admin panel user list shows `[Test]` badge on soak accounts
- [ ] Admin panel invite codes tab shows `[Test-seed]` badge
- [ ] "Include test accounts" checkbox toggles visibility in admin
## Staging bring-up
- [ ] `5VC2MCCN` flagged with `marks_as_test=TRUE` on staging DB
- [ ] 16 accounts seeded via `SOAK_INVITE_CODE=5VC2MCCN bun run seed`
- [ ] Populate run against staging completes with `--watch=dashboard`
- [ ] Staging leaderboard default does NOT show soak accounts
- [ ] Staging leaderboard with `?include_test=true` does show them


@@ -1,21 +1,296 @@
# Golf Soak & UX Test Harness
Runs 16 authenticated browser sessions across 4 rooms to populate
staging scoreboards and stress-test multiplayer stability.
Standalone Playwright-based runner that drives multiple authenticated
browser sessions playing real multiplayer games. Used for:
**Spec:** `docs/superpowers/specs/2026-04-10-multiplayer-soak-test-design.md`
**Bring-up:** `docs/soak-harness-bringup.md`
- **Scoreboard population** — fill staging leaderboards with realistic data
- **Stability stress testing** — hunt race conditions, WebSocket leaks, cleanup bugs
- **Live monitoring** — watch bot sessions play in real time via CDP screencast
## Quick start
## Prerequisites
- [Bun](https://bun.sh/) (or Node.js + npm)
- Chromium browser binary (installed via `bunx playwright install chromium`)
- A running Golf Card Game server (local dev or staging)
- An invite code flagged as `marks_as_test=TRUE` (see [Bring-up](#first-time-setup))
## First-time setup
### 1. Install dependencies
```bash
cd tests/soak
bun install
bun run seed # first run only
TEST_URL=http://localhost:8000 bun run smoke
bunx playwright install chromium
```
(The scripts also work with `npm run`, `pnpm run`, etc. — bun is what's installed
on this dev machine.)
### 2. Flag the invite code as test-seed
Full documentation arrives with Task 31.
Any account registered with a test-seed invite gets `is_test_account=TRUE`,
which keeps it out of real-user stats and leaderboards.
**Local dev:**
```bash
PGPASSWORD=devpassword psql -h localhost -U golf -d golf <<'SQL'
INSERT INTO invite_codes (code, created_by, expires_at, max_uses, is_active, marks_as_test)
SELECT 'SOAKTEST', id, NOW() + INTERVAL '10 years', 100, TRUE, TRUE
FROM users_v2 LIMIT 1
ON CONFLICT (code) DO UPDATE SET marks_as_test = TRUE;
SQL
```
**Staging:**
```bash
ssh root@129.212.150.189 \
'docker compose -f /opt/golfgame/docker-compose.staging.yml exec -T postgres psql -U postgres -d golfgame' <<'SQL'
UPDATE invite_codes SET marks_as_test = TRUE WHERE code = '5VC2MCCN';
SQL
```
### 3. Seed test accounts
```bash
# Local dev
TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST bun run seed
# Staging
TEST_URL=https://staging.adlee.work SOAK_INVITE_CODE=5VC2MCCN bun run seed
```
This registers 16 accounts via the invite code and caches their credentials
in `.env.stresstest`. Only needs to run once — subsequent runs reuse the
cached credentials (re-logging in if tokens expire).
### 4. Verify with a smoke test
```bash
# Local dev
TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST bash scripts/smoke.sh
```
Expected: one game plays to completion in ~60 seconds, exits 0.
## Usage
### Populate scoreboards (recommended first run)
```bash
TEST_URL=https://staging.adlee.work SOAK_INVITE_CODE=5VC2MCCN bun run soak -- \
--scenario=populate \
--watch=dashboard
```
This runs 4 rooms x 10 games x 9 holes with varied CPU personalities.
The dashboard opens automatically at `http://localhost:7777`.
### Quick smoke against staging
```bash
TEST_URL=https://staging.adlee.work SOAK_INVITE_CODE=5VC2MCCN bun run soak -- \
--scenario=populate \
--accounts=2 --rooms=1 --cpus-per-room=0 \
--games-per-room=1 --holes=1 \
--watch=dashboard
```
### Stress test with chaos injection
```bash
TEST_URL=https://staging.adlee.work SOAK_INVITE_CODE=5VC2MCCN bun run soak -- \
--scenario=stress \
--accounts=4 --rooms=1 --games-per-room=5 \
--watch=dashboard
```
Rapid 1-hole games with random chaos events (rapid clicks, tab blur,
brief network outage) injected during gameplay.
### Headless mode (CI / overnight)
```bash
TEST_URL=https://staging.adlee.work SOAK_INVITE_CODE=5VC2MCCN bun run soak -- \
--scenario=populate --watch=none
```
Outputs structured JSONL to stdout. Pipe to `jq` for filtering:
```bash
bun run soak -- --scenario=populate --watch=none 2>&1 | jq 'select(.msg == "game_complete")'
```
### Tiled mode (native browser windows)
```bash
bun run soak -- --scenario=populate --rooms=2 --watch=tiled
```
Opens visible Chromium windows for each room's host session. Useful for
hands-on debugging with DevTools.
## CLI flags
```
--scenario=populate|stress required — which scenario to run
--accounts=<n> total sessions (default: from scenario)
--rooms=<n> parallel rooms (default: from scenario)
--cpus-per-room=<n> CPU opponents per room (default: from scenario)
--games-per-room=<n> games per room (default: from scenario)
--holes=<n> holes per game (default: from scenario)
--watch=none|dashboard|tiled visualization mode (default: dashboard)
--dashboard-port=<n> dashboard server port (default: 7777)
--target=<url> override TEST_URL env var
--run-id=<string> custom run identifier (default: timestamp)
--list print available scenarios and exit
--dry-run validate config without running
```
`--accounts` must be evenly divisible by `--rooms`, so every room gets the same number of sessions.
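
The divisibility rule can be checked before any browser is launched. The following is a minimal sketch; `sessionsPerRoom` is a hypothetical helper name and is not taken from `runner.ts`.

```typescript
// Hypothetical helper (illustrative only, not the harness's actual code).
// Returns how many sessions each room gets, or throws if the split is uneven.
function sessionsPerRoom(accounts: number, rooms: number): number {
  if (!Number.isInteger(accounts) || !Number.isInteger(rooms) || accounts <= 0 || rooms <= 0) {
    throw new Error(`accounts and rooms must be positive integers (got ${accounts}, ${rooms})`);
  }
  if (accounts % rooms !== 0) {
    throw new Error(`--accounts=${accounts} is not evenly divisible by --rooms=${rooms}`);
  }
  return accounts / rooms;
}
```

With the `populate` defaults (16 accounts, 4 rooms) this yields 4 sessions per room, while a split such as 5 accounts over 2 rooms is rejected up front.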
## Environment variables
| Variable | Description | Default |
|---|---|---|
| `TEST_URL` | Target server base URL | `http://localhost:8000` |
| `SOAK_INVITE_CODE` | Invite code for account seeding | `SOAKTEST` |
| `SOAK_HOLES` | Override `--holes` | — |
| `SOAK_ROOMS` | Override `--rooms` | — |
| `SOAK_ACCOUNTS` | Override `--accounts` | — |
| `SOAK_CPUS_PER_ROOM` | Override `--cpus-per-room` | — |
| `SOAK_GAMES_PER_ROOM` | Override `--games-per-room` | — |
| `SOAK_WATCH` | Override `--watch` | — |
| `SOAK_DASHBOARD_PORT` | Override `--dashboard-port` | — |
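Config precedence: CLI flags > env vars > scenario defaults. Each `SOAK_*` variable therefore replaces the scenario default for its setting, and an explicit CLI flag wins over both.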
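
To illustrate the precedence order, here is a minimal sketch of resolving one numeric setting. The function and type names are assumptions made for this example; the real config loader may be structured differently.

```typescript
// Illustrative sketch of "CLI flags > env vars > scenario defaults".
// All names here are hypothetical, not the harness's actual API.
type SettingSource = 'cli' | 'env' | 'scenario';

interface ResolvedSetting {
  value: number;
  source: SettingSource;
}

function parseIntStrict(raw: string, label: string): number {
  const parsed = Number(raw);
  if (raw.trim() === '' || !Number.isInteger(parsed)) {
    throw new Error(`${label} must be an integer, got "${raw}"`);
  }
  return parsed;
}

function resolveSetting(
  cliValue: string | undefined,
  envValue: string | undefined,
  scenarioDefault: number,
): ResolvedSetting {
  // A flag passed on the command line always wins.
  if (cliValue !== undefined) {
    return { value: parseIntStrict(cliValue, 'CLI flag'), source: 'cli' };
  }
  // Otherwise fall back to the SOAK_* environment variable, if set.
  if (envValue !== undefined) {
    return { value: parseIntStrict(envValue, 'env var'), source: 'env' };
  }
  // Finally use whatever the scenario defines.
  return { value: scenarioDefault, source: 'scenario' };
}
```

For example, `resolveSetting('2', '9', 4)` picks the CLI value 2, and `resolveSetting(undefined, '9', 4)` picks the env value 9.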
## Watch modes
### `dashboard` (default)
Opens `http://localhost:7777` with a live status grid:
- 2x2 room tiles showing phase, current player, move count, progress bar
- Activity log at the bottom
- **Click any player tile** to watch their live session via CDP screencast
- Press Esc or click Close to stop the video feed
- WS connection status indicator
The dashboard runs **locally on your machine** — the runner's headless
browsers connect to the target server remotely while the dashboard UI
is served from your workstation.
### `tiled`
Opens native Chromium windows for each room's host session, positioned
in a grid. Joiners stay headless. Useful for interactive debugging with
DevTools. The viewport is sized at 960x900 to show the full game table.
### `none`
Pure headless, structured JSONL to stdout. Use for CI, overnight runs,
or piping to `jq`.
## Scenarios
### `populate`
Long multi-round games to populate scoreboards with realistic data.
| Setting | Default |
|---|---|
| Accounts | 16 |
| Rooms | 4 |
| CPUs per room | 1 |
| Games per room | 10 |
| Holes | 9 |
| Decks | 2 |
| Think time | 800-2200ms |
### `stress`
Rapid short games with chaos injection for stability testing.
| Setting | Default |
|---|---|
| Accounts | 16 |
| Rooms | 4 |
| CPUs per room | 2 |
| Games per room | 50 |
| Holes | 1 |
| Decks | 1 |
| Think time | 50-150ms |
| Chaos chance | 5% per turn |
Chaos events: `rapid_clicks`, `tab_blur`, `brief_offline`
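
A per-turn chaos roll could look roughly like the sketch below. It assumes the three events are chosen uniformly, which this README does not state, and `maybePickChaos` is a hypothetical name rather than the function in `chaos.ts`. The random source is injectable so the behaviour can be tested deterministically.

```typescript
// Illustrative sketch only; uniform selection among events is an assumption.
type ChaosEvent = 'rapid_clicks' | 'tab_blur' | 'brief_offline';

const CHAOS_EVENTS: readonly ChaosEvent[] = ['rapid_clicks', 'tab_blur', 'brief_offline'];

// Returns an event to inject this turn, or null when the roll misses.
// `chance` is a probability in [0, 1], e.g. 0.05 for "5% per turn".
function maybePickChaos(
  chance: number,
  rng: () => number = Math.random,
): ChaosEvent | null {
  if (rng() >= chance) return null;
  // Clamp the index in case a custom rng returns exactly 1.
  const idx = Math.min(Math.floor(rng() * CHAOS_EVENTS.length), CHAOS_EVENTS.length - 1);
  return CHAOS_EVENTS[idx];
}
```

Passing a fixed `rng` in tests makes it possible to check both the "no chaos" path and each individual event.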
### Adding new scenarios
Create `scenarios/<name>.ts` exporting a `Scenario` object, then register
it in `scenarios/index.ts`. See existing scenarios for the pattern.
## Error handling
- **Per-room isolation**: a failure in one room never unwinds other rooms
(`Promise.allSettled`)
- **Watchdog**: 60s per-room timeout — fires if no heartbeat arrives
- **Health probes**: `GET /health` every 30s, 3 consecutive failures = fatal abort
- **Graceful shutdown**: Ctrl-C finishes current turn, then cleans up (10s timeout).
Double Ctrl-C = immediate force exit
- **Artifacts**: on failure, screenshots + HTML + game state JSON saved to
`artifacts/<run-id>/`. Old artifacts auto-pruned after 7 days
- **Exit codes**: `0` = success, `1` = errors, `2` = interrupted
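
The health-probe and watchdog rules above can be sketched as two small state trackers. This is a simplified illustration with hypothetical class names, not the contents of `watchdog.ts`; the clock is injectable so the timeout logic can be exercised without waiting.

```typescript
// Illustrative sketch; class names and defaults mirror the README text only.

// Counts consecutive failed health probes; a success resets the streak.
class HealthTracker {
  private consecutiveFailures = 0;
  private readonly maxFailures: number;

  constructor(maxFailures = 3) {
    this.maxFailures = maxFailures;
  }

  // Records one probe result and returns true once the run should abort.
  record(ok: boolean): boolean {
    this.consecutiveFailures = ok ? 0 : this.consecutiveFailures + 1;
    return this.consecutiveFailures >= this.maxFailures;
  }
}

// Per-room timeout detector: expires if no heartbeat arrives within timeoutMs.
class RoomWatchdog {
  private lastBeat: number;
  private readonly timeoutMs: number;
  private readonly now: () => number;

  constructor(timeoutMs = 60_000, now: () => number = () => Date.now()) {
    this.timeoutMs = timeoutMs;
    this.now = now;
    this.lastBeat = now();
  }

  heartbeat(): void {
    this.lastBeat = this.now();
  }

  isExpired(): boolean {
    return this.now() - this.lastBeat > this.timeoutMs;
  }
}
```

A runner would call `record()` after each `GET /health` probe and `heartbeat()` whenever a room makes progress, then poll `isExpired()` on a timer.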
## Test account filtering
Soak accounts are flagged `is_test_account=TRUE` in the database. They are:
- **Hidden by default** from public leaderboards and stats (`?include_test=false`)
- **Visible to admins** by default in the admin panel
- **Togglable** via the "Include test accounts" checkbox in the admin panel
- **Badged** with `[Test]` in the admin user list and `[Test-seed]` on the invite code
## Unit tests
```bash
bun run test
```
27 tests covering Deferred, RoomCoordinator, Watchdog, Logger, and Config.
Integration-level modules (SessionPool, scenarios, dashboard) are verified
by the smoke test and live runs.
## Architecture
```
runner.ts CLI entry — parses flags, wires everything, runs scenario
core/
session-pool.ts Owns browser contexts, seeds/logs in accounts
room-coordinator Deferred-based host→joiners room code handoff
watchdog.ts Per-room timeout detector
screencaster.ts CDP Page.startScreencast for live video
logger.ts Structured JSONL logger with child contexts
artifacts.ts Screenshot/HTML/state capture on failure
types.ts Scenario/Session/Logger contracts
scenarios/
populate.ts Long multi-round games
stress.ts Rapid games with chaos injection
shared/
multiplayer-game.ts Shared "play one game" loop
chaos.ts Chaos event injector
dashboard/
server.ts HTTP + WS server
index.html Status grid UI
dashboard.js WS client + click-to-watch
scripts/
seed-accounts.ts Account seeding CLI
smoke.sh End-to-end canary (~60s)
```
Reuses `tests/e2e/bot/golf-bot.ts` unchanged for all game interactions.
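
The "Deferred-based host→joiners room code handoff" listed for the room coordinator follows a common pattern: the host resolves a promise that the joiners are already awaiting. Below is a minimal sketch under that assumption; `Deferred` and `RoomCodeHandoff` are illustrative and may differ from the real `RoomCoordinator`.

```typescript
// Illustrative sketch of a Deferred-based handoff; not the harness's actual code.
class Deferred<T> {
  readonly promise: Promise<T>;
  resolve!: (value: T) => void;
  reject!: (reason: unknown) => void;

  constructor() {
    // The executor runs synchronously, so both callbacks are set on return.
    this.promise = new Promise<T>((res, rej) => {
      this.resolve = res;
      this.reject = rej;
    });
  }
}

class RoomCodeHandoff {
  private readonly pending = new Map<string, Deferred<string>>();

  // Lazily creates one Deferred per key so host and joiners can arrive in any order.
  private slot(key: string): Deferred<string> {
    let d = this.pending.get(key);
    if (!d) {
      d = new Deferred<string>();
      this.pending.set(key, d);
    }
    return d;
  }

  // Host side: publish the room code once the room exists.
  announce(key: string, roomCode: string): void {
    this.slot(key).resolve(roomCode);
  }

  // Joiner side: wait until the host has announced the code for this key.
  waitFor(key: string): Promise<string> {
    return this.slot(key).promise;
  }
}
```

Using a fresh key for every game start (for example a room ID combined with a game number) keeps a joiner from receiving a stale code left over from a previous game.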
## Related docs
- [Design spec](../../docs/superpowers/specs/2026-04-10-multiplayer-soak-test-design.md)
- [Bring-up steps](../../docs/soak-harness-bringup.md)
- [Implementation plan](../../docs/superpowers/plans/2026-04-10-multiplayer-soak-test.md)


@@ -59,8 +59,8 @@ export class Screencaster {
await client.send('Page.startScreencast', {
format: opts.format ?? 'jpeg',
quality: opts.quality ?? 60,
maxWidth: opts.maxWidth ?? 640,
maxHeight: opts.maxHeight ?? 360,
maxWidth: opts.maxWidth ?? 960,
maxHeight: opts.maxHeight ?? 540,
everyNthFrame: opts.everyNthFrame ?? 2,
});
this.logger.info('screencast_started', { sessionKey });


@@ -266,12 +266,49 @@ export class SessionPool {
const context = await targetBrowser.newContext({
...this.opts.contextOptions,
baseURL: this.opts.targetUrl,
...(useHeaded ? { viewport: { width: 960, height: 900 } } : {}),
viewport: useHeaded
? { width: 960, height: 900 }
: { width: 960, height: 800 },
});
await this.injectAuth(context, account);
const page = await context.newPage();
await page.goto(this.opts.targetUrl);
// Verify the token is valid — if expired, re-login and reload
const controlsVisible = await page
.waitForSelector('#lobby-game-controls:not(.hidden)', {
state: 'attached',
timeout: 5000,
})
.then(() => true)
.catch(() => false);
if (!controlsVisible) {
this.opts.logger.warn('token_expired_relogin', { account: account.key });
const freshToken = await loginAccount(
this.opts.targetUrl,
account.username,
account.password,
);
account.token = freshToken;
writeCredFile(this.opts.credFile, this.accounts);
await context.addInitScript(
({ token, username }) => {
window.localStorage.setItem('authToken', token);
window.localStorage.setItem(
'authUser',
JSON.stringify({ id: '', username, role: 'user', email_verified: true }),
);
},
{ token: freshToken, username: account.username },
);
await page.goto(this.opts.targetUrl);
await page.waitForSelector('#lobby-game-controls:not(.hidden)', {
state: 'attached',
timeout: 10000,
});
}
// Best-effort tile placement. window.moveTo is often a no-op on
// modern Chromium (especially under Wayland), so we don't rely on
// it — the viewport sized above is what the user actually sees.


@@ -167,7 +167,9 @@ body {
}
#video-frame {
display: block;
width: 960px;
max-width: 100%;
max-height: 70vh;
max-height: 80vh;
object-fit: contain;
border: 1px solid var(--border);
}


@@ -50,9 +50,15 @@ async function runRoom(
let completed = 0;
const errors: ScenarioError[] = [];
// Send player list for dashboard tiles
ctx.dashboard.update(roomId, {
players: sessions.map((s) => ({ key: s.key, score: null, isActive: false })),
});
for (let gameNum = 0; gameNum < cfg.gamesPerRoom; gameNum++) {
if (ctx.signal.aborted) break;
ctx.dashboard.update(roomId, { game: gameNum + 1, totalGames: cfg.gamesPerRoom });
ctx.dashboard.log('info', `${roomId}: starting game ${gameNum + 1}/${cfg.gamesPerRoom}`);
ctx.logger.info('game_start', { room: roomId, game: gameNum + 1 });
const result = await runOneMultiplayerGame(ctx, sessions, {
@@ -66,6 +72,9 @@ async function runRoom(
if (result.completed) {
completed++;
ctx.dashboard.incrementMetric('games_completed');
ctx.dashboard.incrementMetric('moves_total', result.turns);
ctx.dashboard.log('info', `${roomId}: game ${gameNum + 1} complete — ${result.turns} turns, ${(result.durationMs / 1000).toFixed(1)}s`);
ctx.logger.info('game_complete', {
room: roomId,
game: gameNum + 1,
@@ -73,6 +82,8 @@ async function runRoom(
durationMs: result.durationMs,
});
} else {
ctx.dashboard.incrementMetric('errors');
ctx.dashboard.log('error', `${roomId}: game ${gameNum + 1} failed — ${result.error}`);
errors.push({
room: roomId,
reason: 'game_failed',


@@ -53,7 +53,28 @@ export async function runOneMultiplayerGame(
// After the first game ends each session is parked on the
// game_over screen, which hides the lobby's Create Room button.
// goto('/') bounces them back; localStorage-cached auth persists.
await Promise.all(sessions.map((s) => s.bot.goto('/')));
// We must wait for auth hydration to unhide #lobby-game-controls.
await Promise.all(
sessions.map(async (s) => {
await s.bot.goto('/');
try {
await s.page.waitForSelector('#create-room-btn', {
state: 'visible',
timeout: 15000,
});
} catch {
// Auth may have been lost. Log what the lobby looks like, then fail this game.
const html = await s.page.content().catch(() => '');
ctx.logger.warn('lobby_not_ready', {
session: s.key,
hasControls: html.includes('lobby-game-controls'),
hasHidden: html.includes('lobby-game-controls" class="hidden"') ||
html.includes("lobby-game-controls' class='hidden'"),
});
throw new Error(`lobby not ready for ${s.key} after goto('/')`);
}
}),
);
// Use a unique coordinator key per game-start so Deferreds don't
// carry stale room codes from previous games. The coordinator's
@@ -90,12 +111,36 @@ export async function runOneMultiplayerGame(
async function sessionLoop(sessionIdx: number): Promise<void> {
const session = sessions[sessionIdx];
const isHost = sessionIdx === 0;
while (true) {
if (ctx.signal.aborted) return;
if (Date.now() - start > maxDuration) return;
const phase = await session.bot.getGamePhase();
if (phase === 'game_over' || phase === 'round_over') return;
if (phase === 'game_over') return;
if (phase === 'round_over') {
if (isHost) {
await sleep(1500);
// The scoresheet modal uses #ss-next-btn; the side panel uses #next-round-btn.
// Try both — the visible one gets clicked.
const ssBtn = session.page.locator('#ss-next-btn');
const sideBtn = session.page.locator('#next-round-btn');
const clicked = await ssBtn.click({ timeout: 3000 }).then(() => 'ss').catch(() => null)
|| await sideBtn.click({ timeout: 3000 }).then(() => 'side').catch(() => null);
ctx.logger.info('round_advance', { room: opts.roomId, session: session.key, clicked });
} else {
await sleep(2000);
}
// Wait for the next round to actually start (or game_over on last round)
for (let i = 0; i < 40; i++) {
const p = await session.bot.getGamePhase();
if (p === 'game_over' || p === 'playing' || p === 'initial_flip') break;
await sleep(500);
}
ctx.heartbeat(opts.roomId);
continue;
}
if (await session.bot.isMyTurn()) {
await session.bot.playTurn();
@@ -104,6 +149,11 @@ export async function runOneMultiplayerGame(
ctx.dashboard.update(opts.roomId, {
currentPlayer: session.account.username,
moves: turnCounts.reduce((a, b) => a + b, 0),
players: sessions.map((s, j) => ({
key: s.key,
score: null,
isActive: j === sessionIdx,
})),
});
const thinkMs = randomInt(opts.thinkTimeMs[0], opts.thinkTimeMs[1]);
await sleep(thinkMs);
@@ -115,8 +165,12 @@ export async function runOneMultiplayerGame(
await Promise.all(sessions.map((_, i) => sessionLoop(i)));
// Let the server finish processing game completion (stats, DB update)
// before we navigate away and kill the WebSocket connections.
await sleep(2000);
const totalTurns = turnCounts.reduce((a, b) => a + b, 0);
ctx.dashboard.update(opts.roomId, { phase: 'round_over' });
ctx.dashboard.update(opts.roomId, { phase: 'game_over' });
return {
completed: true,
turns: totalTurns,


@@ -50,10 +50,15 @@ async function runStressRoom(
let chaosFired = 0;
const errors: ScenarioError[] = [];
ctx.dashboard.update(roomId, {
players: sessions.map((s) => ({ key: s.key, score: null, isActive: false })),
});
for (let gameNum = 0; gameNum < cfg.gamesPerRoom; gameNum++) {
if (ctx.signal.aborted) break;
ctx.dashboard.update(roomId, { game: gameNum + 1, totalGames: cfg.gamesPerRoom });
ctx.dashboard.log('info', `${roomId}: starting game ${gameNum + 1}/${cfg.gamesPerRoom}`);
// Background chaos loop — runs concurrently with the game turn loop.
// Delay the first tick by 3 seconds so room creation + joiners + game
@@ -90,12 +95,17 @@ async function runStressRoom(
if (result.completed) {
completed++;
ctx.dashboard.incrementMetric('games_completed');
ctx.dashboard.incrementMetric('moves_total', result.turns);
ctx.dashboard.log('info', `${roomId}: game ${gameNum + 1} complete — ${result.turns} turns`);
ctx.logger.info('game_complete', {
room: roomId,
game: gameNum + 1,
turns: result.turns,
});
} else {
ctx.dashboard.incrementMetric('errors');
ctx.dashboard.log('error', `${roomId}: game ${gameNum + 1} failed — ${result.error}`);
errors.push({
room: roomId,
reason: 'game_failed',