chore(staging): wire LEADERBOARD_INCLUDE_TEST_DEFAULT into compose

The v3.3.5 router reads config.LEADERBOARD_INCLUDE_TEST_DEFAULT, but the staging compose file never passed the env var through to the container. The change had been applied manually on the staging host but never made it back into the repo. Committing it here so CI deploys pick it up.

Value on staging is sourced from .env (already set to true). Production leaves it unset, so the default of false applies.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
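
For reviewers, a minimal sketch of how the flag is expected to resolve end to end. It assumes the env var is parsed as a boolean with a fallback, and that the router treats a missing ?include_test as "use the server default". The get_env_bool below is a hypothetical stand-in (the real helper in server/config.py is not shown in this diff, and the set of accepted truthy strings is an assumption), and effective_include_test is an illustrative name for the expression used in server/routers/stats.py.

```python
import os
from typing import Optional


def get_env_bool(name: str, default: bool) -> bool:
    """Hypothetical stand-in for the server's env parser.

    Unset or empty values fall back to `default`; the accepted truthy
    spellings below are an assumption, not the project's actual list.
    """
    raw = os.environ.get(name)
    if raw is None or raw.strip() == "":
        return default
    return raw.strip().lower() in ("1", "true", "yes", "on")


def effective_include_test(include_test: Optional[bool], server_default: bool) -> bool:
    """Mirror the router logic: an explicit query param wins, otherwise
    the server-wide default applies."""
    return include_test if include_test is not None else server_default
```

With the compose entry `${LEADERBOARD_INCLUDE_TEST_DEFAULT:-false}`, an unset variable reaches the container as the string "false", so the fallback applies in two places: once in compose and once in the parser. An explicit ?include_test=false from a client still hides test accounts on staging.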
chore: bump version to 3.3.5
2026-04-18 00:55:00 -04:00 · 2026-04-18 00:50:30 -04:00 · 2026-04-18 00:47:53 -04:00 · 2026-04-18 00:41:58 -04:00 · 2026-04-18 00:37:49 -04:00 · 2026-04-17 20:37:24 -04:00
21 changed files with 993 additions and 110 deletions
--- a/client/admin.html
+++ b/client/admin.html
@@ -400,7 +400,7 @@
    <!-- Toast Container -->
    <div id="toast-container"></div>

-    <footer class="app-footer" style="text-align: center; padding: 16px; color: #888; font-size: 12px;">v3.3.4 &copy; Aaron D. Lee</footer>
+    <footer class="app-footer" style="text-align: center; padding: 16px; color: #888; font-size: 12px;">v3.3.5 &copy; Aaron D. Lee</footer>

    <script src="admin.js"></script>
 </body>
--- a/client/index.html
+++ b/client/index.html
@@ -55,7 +55,7 @@

            <p id="lobby-error" class="error"></p>

-            <footer class="app-footer">v3.3.4 &copy; Aaron D. Lee</footer>
+            <footer class="app-footer">v3.3.5 &copy; Aaron D. Lee</footer>
        </div>

        <!-- Matchmaking Screen -->
@@ -288,7 +288,7 @@
                <p id="waiting-message" class="info">Waiting for host to start the game...</p>
            </div>

-            <footer class="app-footer">v3.3.4 &copy; Aaron D. Lee</footer>
+            <footer class="app-footer">v3.3.5 &copy; Aaron D. Lee</footer>
        </div>

        <!-- Game Screen -->
--- a/docker-compose.staging.yml
+++ b/docker-compose.staging.yml
@@ -35,6 +35,7 @@ services:
      - BOOTSTRAP_ADMIN_USERNAME=${BOOTSTRAP_ADMIN_USERNAME:-}
      - BOOTSTRAP_ADMIN_PASSWORD=${BOOTSTRAP_ADMIN_PASSWORD:-}
      - MATCHMAKING_ENABLED=true
+      - LEADERBOARD_INCLUDE_TEST_DEFAULT=${LEADERBOARD_INCLUDE_TEST_DEFAULT:-false}
    depends_on:
      postgres:
        condition: service_healthy
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "golfgame"
-version = "3.3.4"
+version = "3.3.5"
 description = "6-Card Golf card game with AI opponents"
 readme = "README.md"
 requires-python = ">=3.11"
--- a/server/config.py
+++ b/server/config.py
@@ -171,6 +171,12 @@ class ServerConfig:
    # Rate limiting
    RATE_LIMIT_ENABLED: bool = True

+    # Leaderboard: include soak-harness test accounts by default when the
+    # client doesn't pass ?include_test. Production keeps this False so real
+    # users never see synthetic traffic; staging can flip it True so bring-up
+    # traffic actually shows on the board.
+    LEADERBOARD_INCLUDE_TEST_DEFAULT: bool = False
+
    # Error tracking (Sentry)
    SENTRY_DSN: str = ""

@@ -216,6 +222,7 @@ class ServerConfig:
            MATCHMAKING_MAX_PLAYERS=get_env_int("MATCHMAKING_MAX_PLAYERS", 4),
            ADMIN_EMAILS=admin_emails,
            RATE_LIMIT_ENABLED=get_env_bool("RATE_LIMIT_ENABLED", True),
+            LEADERBOARD_INCLUDE_TEST_DEFAULT=get_env_bool("LEADERBOARD_INCLUDE_TEST_DEFAULT", False),
            SENTRY_DSN=get_env("SENTRY_DSN", ""),
            card_values=CardValues(
                ACE=get_env_int("CARD_ACE", 1),
--- a/server/handlers.py
+++ b/server/handlers.py
@@ -209,12 +209,16 @@ async def handle_start_game(data: dict, ctx: ConnectionContext, *, broadcast_gam

    async with ctx.current_room.game_lock:
        ctx.current_room.game.start_game(num_decks, num_rounds, options)
+        # Reset the per-game idempotency latch so this game's stats can fire.
+        ctx.current_room.stats_processed = False

        game_logger = get_logger()
        if game_logger:
            ctx.current_room.game_log_id = game_logger.log_game_start(
                room_code=ctx.current_room.code,
                num_players=len(ctx.current_room.players),
+                num_rounds=num_rounds,
+                player_ids=[p.id for p in ctx.current_room.game.players],
                options=options,
            )

@@ -508,6 +512,14 @@ async def handle_end_game(data: dict, ctx: ConnectionContext, *, room_manager, c
            pass
        ctx.current_room.cpu_turn_task = None

+    # Mark the DB row abandoned before we lose the room (and its game_log_id)
+    # from memory — otherwise games_v2 would be stranded as 'active' forever.
+    if ctx.current_room.game_log_id:
+        game_logger = get_logger()
+        if game_logger:
+            game_logger.log_game_abandoned(ctx.current_room.game_log_id)
+        ctx.current_room.game_log_id = None
+
    await ctx.current_room.broadcast({
        "type": "game_ended",
        "reason": "Host ended the game",
--- a/server/main.py
+++ b/server/main.py
@@ -432,7 +432,7 @@ async def _close_all_websockets():
 app = FastAPI(
    title="Golf Card Game",
    debug=config.DEBUG,
-    version="3.2.0",
+    version="3.3.5",
    lifespan=lifespan,
 )

@@ -751,16 +751,42 @@ async def broadcast_game_state(room: Room):
        spectator_state = room.game.get_state(None)  # No player perspective
        await _spectator_manager.send_game_state(room.code, spectator_state)

-    for pid, player in room.players.items():
+    # Process game completion BEFORE the per-player loop so it runs exactly
+    # once and isn't gated on any player still being connected.
+    if room.game.phase == GamePhase.GAME_OVER:
+        # Determine winner (lowest total; None on tie) so games_v2.winner_id
+        # is actually populated and stats/rating agree with each other.
+        winner_id: Optional[str] = None
+        if room.game.players:
+            lowest = min(p.total_score for p in room.game.players)
+            leaders = [p for p in room.game.players if p.total_score == lowest]
+            if len(leaders) == 1:
+                winner_id = leaders[0].id
+
+        game_logger = get_logger()
+        if game_logger and room.game_log_id:
+            game_logger.log_game_end(room.game_log_id, winner_id=winner_id)
+            room.game_log_id = None
+        # Idempotency: latch on room so repeat GAME_OVER broadcasts don't
+        # double-count. Set before scheduling the task — the task itself is
+        # fire-and-forget and might outlive this function.
+        if _stats_service and room.game.players and not room.stats_processed:
+            room.stats_processed = True
+            asyncio.create_task(_process_stats_safe(room))
+
+    for pid, player in list(room.players.items()):
        # Skip CPU players
        if player.is_cpu or not player.websocket:
            continue

        game_state = room.game.get_state(pid)
+        try:
            await player.websocket.send_json({
                "type": "game_state",
                "game_state": game_state,
            })
+        except Exception:
+            continue

        # Check for round over
        if room.game.phase == GamePhase.ROUND_OVER:
@@ -768,9 +794,9 @@ async def broadcast_game_state(room: Room):
                {"id": p.id, "name": p.name, "score": p.score, "total": p.total_score, "rounds_won": p.rounds_won}
                for p in room.game.players
            ]
-            # Build rankings
            by_points = sorted(scores, key=lambda x: x["total"])
            by_holes_won = sorted(scores, key=lambda x: -x["rounds_won"])
+            try:
                await player.websocket.send_json({
                    "type": "round_over",
                    "scores": scores,
@@ -782,25 +808,17 @@ async def broadcast_game_state(room: Room):
                        "by_holes_won": by_holes_won,
                    },
                })
+            except Exception:
+                pass

-        # Check for game over
        elif room.game.phase == GamePhase.GAME_OVER:
-            # Log game end
-            game_logger = get_logger()
-            if game_logger and room.game_log_id:
-                game_logger.log_game_end(room.game_log_id)
-                room.game_log_id = None  # Clear to avoid duplicate logging
-
-            # Process stats asynchronously (fire-and-forget) to avoid delaying game over notifications
-            if _stats_service and room.game.players:
-                asyncio.create_task(_process_stats_safe(room))
-
            scores = [
                {"name": p.name, "total": p.total_score, "rounds_won": p.rounds_won}
                for p in room.game.players
            ]
            by_points = sorted(scores, key=lambda x: x["total"])
            by_holes_won = sorted(scores, key=lambda x: -x["rounds_won"])
+            try:
                await player.websocket.send_json({
                    "type": "game_over",
                    "final_scores": by_points,
@@ -809,6 +827,8 @@ async def broadcast_game_state(room: Room):
                        "by_holes_won": by_holes_won,
                    },
                })
+            except Exception:
+                pass

        # Notify current player it's their turn (only if human)
        elif room.game.phase in (GamePhase.PLAYING, GamePhase.FINAL_TURN):
@@ -909,6 +929,14 @@ async def handle_player_leave(room: Room, player_id: str):
    # Check both is_empty() AND human_player_count() — CPU players keep rooms
    # technically non-empty, but a room with only CPUs is an abandoned room.
    if room.is_empty() or room.human_player_count() == 0:
+        # Mark games_v2 abandoned while we still hold the game_log_id. After
+        # remove_room() the row would be stranded as 'active' — periodic
+        # cleanup only scans in-memory rooms.
+        if room.game_log_id:
+            game_logger = get_logger()
+            if game_logger:
+                game_logger.log_game_abandoned(room.game_log_id)
+            room.game_log_id = None
        # Remove all remaining CPU players to release their profiles
        for cpu in list(room.get_cpu_players()):
            room.remove_player(cpu.id)
--- a/server/room.py
+++ b/server/room.py
@@ -73,6 +73,10 @@ class Room:
    game_lock: asyncio.Lock = field(default_factory=asyncio.Lock)
    cpu_turn_task: Optional[asyncio.Task] = None
    last_activity: float = field(default_factory=time.time)
+    # Latched True after _process_stats_safe fires for this game; prevents
+    # double-counting if broadcast_game_state is invoked multiple times
+    # with phase=GAME_OVER (double-click on next-round, reconnect flush).
+    stats_processed: bool = False

    def touch(self) -> None:
        """Update last_activity timestamp to mark room as active."""
@@ -232,7 +236,7 @@ class Room:
            message: JSON-serializable message dict.
            exclude: Optional player ID to skip.
        """
-        for player_id, player in self.players.items():
+        for player_id, player in list(self.players.items()):
            if player_id != exclude and player.websocket and not player.is_cpu:
                try:
                    await player.websocket.send_json(message)
--- a/server/routers/stats.py
+++ b/server/routers/stats.py
@@ -12,6 +12,7 @@ from typing import Optional
 from fastapi import APIRouter, Depends, HTTPException, Header, Query
 from pydantic import BaseModel

+from config import config
 from models.user import User
 from services.stats_service import StatsService

@@ -159,7 +160,7 @@ async def get_leaderboard(
    metric: str = Query("wins", pattern="^(wins|win_rate|avg_score|knockouts|streak|rating)$"),
    limit: int = Query(50, ge=1, le=100),
    offset: int = Query(0, ge=0),
-    include_test: bool = Query(False, description="Include soak-harness test accounts"),
+    include_test: Optional[bool] = Query(None, description="Include soak-harness test accounts. Defaults to LEADERBOARD_INCLUDE_TEST_DEFAULT env (False in prod)."),
    service: StatsService = Depends(get_stats_service_dep),
 ):
    """
@@ -173,9 +174,14 @@ async def get_leaderboard(
    - streak: Best win streak

    Players must have 5+ games to appear on leaderboards.
-    By default, soak-harness test accounts are hidden.
+    Soak-harness test accounts are hidden unless include_test is passed,
+    or LEADERBOARD_INCLUDE_TEST_DEFAULT is set True on the server (staging).
    """
-    entries = await service.get_leaderboard(metric, limit, offset, include_test)
+    effective_include_test = (
+        include_test if include_test is not None
+        else config.LEADERBOARD_INCLUDE_TEST_DEFAULT
+    )
+    entries = await service.get_leaderboard(metric, limit, offset, effective_include_test)

    return {
        "metric": metric,
--- a/server/services/game_logger.py
+++ b/server/services/game_logger.py
@@ -63,21 +63,19 @@ class GameLogger:
        self,
        room_code: str,
        num_players: int,
+        num_rounds: int,
+        player_ids: list[str],
        options: "GameOptions",
+        game_id: Optional[str] = None,
    ) -> str:
        """
-        Log game start, return game_id.
+        Log game start. Writes the row via create_game and then populates
+        started_at/num_players/num_rounds/player_ids via update_game_started
+        so downstream queries don't see a half-initialized games_v2 row.

-        Creates a game record in games_v2 table.
-
-        Args:
-            room_code: Room code for the game.
-            num_players: Number of players.
-            options: Game options/house rules.
-
-        Returns:
-            Generated game UUID.
+        If create_game fails the update is skipped — the row doesn't exist.
        """
+        if game_id is None:
            game_id = str(uuid.uuid4())

        try:
@@ -87,9 +85,20 @@ class GameLogger:
                host_id="system",
                options=self._options_to_dict(options),
            )
+        except Exception as e:
+            log.error(f"Failed to log game start (create): {e}")
+            return game_id
+
+        try:
+            await self.event_store.update_game_started(
+                game_id,
+                num_players,
+                num_rounds,
+                player_ids,
+            )
            log.debug(f"Logged game start: {game_id} room={room_code}")
        except Exception as e:
-            log.error(f"Failed to log game start: {e}")
+            log.error(f"Failed to log game start (update): {e}")

        return game_id

@@ -97,6 +106,8 @@ class GameLogger:
        self,
        room_code: str,
        num_players: int,
+        num_rounds: int,
+        player_ids: list[str],
        options: "GameOptions",
    ) -> str:
        """
@@ -108,47 +119,46 @@ class GameLogger:
        game_id = str(uuid.uuid4())

        try:
-            loop = asyncio.get_running_loop()
-            # Already in async context - fire task, return ID immediately
-            asyncio.create_task(self._log_game_start_with_id(game_id, room_code, num_players, options))
+            asyncio.get_running_loop()
+            asyncio.create_task(
+                self.log_game_start_async(
+                    room_code=room_code,
+                    num_players=num_players,
+                    num_rounds=num_rounds,
+                    player_ids=player_ids,
+                    options=options,
+                    game_id=game_id,
+                )
+            )
            return game_id
        except RuntimeError:
-            # Not in async context - run synchronously
-            return asyncio.run(self.log_game_start_async(room_code, num_players, options))
+            return asyncio.run(
+                self.log_game_start_async(
+                    room_code=room_code,
+                    num_players=num_players,
+                    num_rounds=num_rounds,
+                    player_ids=player_ids,
+                    options=options,
+                    game_id=game_id,
+                )
+            )

-    async def _log_game_start_with_id(
+    async def log_game_end_async(
        self,
        game_id: str,
-        room_code: str,
-        num_players: int,
-        options: "GameOptions",
+        winner_id: Optional[str] = None,
    ) -> None:
-        """Helper to log game start with pre-generated ID."""
-        try:
-            await self.event_store.create_game(
-                game_id=game_id,
-                room_code=room_code,
-                host_id="system",
-                options=self._options_to_dict(options),
-            )
-            log.debug(f"Logged game start: {game_id} room={room_code}")
-        except Exception as e:
-            log.error(f"Failed to log game start: {e}")
-
-    async def log_game_end_async(self, game_id: str) -> None:
        """
-        Mark game as ended.
-
-        Args:
-            game_id: Game UUID.
+        Mark game as ended. winner_id is the player who finished with the
+        lowest total — None when tied or when the caller doesn't have it.
        """
        try:
-            await self.event_store.update_game_completed(game_id)
-            log.debug(f"Logged game end: {game_id}")
+            await self.event_store.update_game_completed(game_id, winner_id)
+            log.debug(f"Logged game end: {game_id} winner={winner_id}")
        except Exception as e:
            log.error(f"Failed to log game end: {e}")

-    def log_game_end(self, game_id: str) -> None:
+    def log_game_end(self, game_id: str, winner_id: Optional[str] = None) -> None:
        """
        Sync wrapper for log_game_end_async.

@@ -158,12 +168,30 @@ class GameLogger:
            return

        try:
-            loop = asyncio.get_running_loop()
-            asyncio.create_task(self.log_game_end_async(game_id))
+            asyncio.get_running_loop()
+            asyncio.create_task(self.log_game_end_async(game_id, winner_id))
        except RuntimeError:
            # Not in async context - skip (simulations don't need this)
            pass

+    async def log_game_abandoned_async(self, game_id: str) -> None:
+        """Mark game as abandoned (room emptied before GAME_OVER)."""
+        try:
+            await self.event_store.update_game_abandoned(game_id)
+            log.debug(f"Logged game abandoned: {game_id}")
+        except Exception as e:
+            log.error(f"Failed to log game abandoned: {e}")
+
+    def log_game_abandoned(self, game_id: str) -> None:
+        """Sync wrapper: fires async task in async context, no-op otherwise."""
+        if not game_id:
+            return
+        try:
+            asyncio.get_running_loop()
+            asyncio.create_task(self.log_game_abandoned_async(game_id))
+        except RuntimeError:
+            pass
+
    # -------------------------------------------------------------------------
    # Move Logging
    # -------------------------------------------------------------------------
--- a/server/services/stats_service.py
+++ b/server/services/stats_service.py
@@ -781,7 +781,7 @@ class StatsService:

                    # We don't have per-round data in legacy mode, so some stats are limited
                    # Use total_score / num_rounds as an approximation for avg round score
-                    avg_round_score = total_score / num_rounds if num_rounds > 0 else None
+                    avg_round_score = total_score // num_rounds if num_rounds > 0 else total_score

                    # Update stats
                    await conn.execute("""
@@ -792,13 +792,13 @@ class StatsService:
                            rounds_won = rounds_won + $4,
                            total_points = total_points + $5,
                            best_score = CASE
-                                WHEN best_score IS NULL THEN $6
-                                WHEN $6 IS NOT NULL AND $6 < best_score THEN $6
+                                WHEN best_score IS NULL THEN $6::integer
+                                WHEN $6::integer IS NOT NULL AND $6::integer < best_score THEN $6::integer
                                ELSE best_score
                            END,
                            worst_score = CASE
-                                WHEN worst_score IS NULL THEN $7
-                                WHEN $7 IS NOT NULL AND $7 > worst_score THEN $7
+                                WHEN worst_score IS NULL THEN $7::integer
+                                WHEN $7::integer IS NOT NULL AND $7::integer > worst_score THEN $7::integer
                                ELSE worst_score
                            END,
                            current_win_streak = CASE WHEN $2 = 1 THEN current_win_streak + 1 ELSE 0 END,
--- a/server/stores/event_store.py
+++ b/server/stores/event_store.py
@@ -432,6 +432,23 @@ class EventStore:
                winner_id,
            )

+    async def update_game_abandoned(self, game_id: str) -> None:
+        """
+        Mark a game as abandoned — used when a room empties before the game
+        reaches GAME_OVER, so games_v2 never leaks as stranded 'active'.
+        Only flips rows that are still active so a legitimately completed game
+        never gets reverted.
+        """
+        async with self.pool.acquire() as conn:
+            await conn.execute(
+                """
+                UPDATE games_v2
+                SET status = 'abandoned', completed_at = NOW()
+                WHERE id = $1 AND status = 'active'
+                """,
+                game_id,
+            )
+
    async def get_active_games(self) -> list[dict]:
        """
        Get all active games for recovery on server restart.
--- a/server/test_game_lifecycle_logging.py
+++ b/server/test_game_lifecycle_logging.py
@@ -0,0 +1,327 @@
+# SPDX-License-Identifier: GPL-3.0-or-later
+"""
+Tests for game lifecycle logging — ensuring games_v2 rows never leak as
+stranded 'active' when a room is removed from memory without the game
+transitioning to GAME_OVER.
+
+Context: on staging we observed 42 games stuck in status='active' because
+handle_player_leave and handle_end_game removed the room from memory
+without updating games_v2. The periodic cleanup only scans in-memory rooms,
+so those rows were orphaned forever.
+
+These tests pin down the fix:
+  1. GameLogger.log_game_abandoned_async calls event_store.update_game_abandoned
+  2. handle_end_game marks the game abandoned when the host ends the game
+"""
+
+import asyncio
+import pytest
+from unittest.mock import AsyncMock, MagicMock, patch
+
+from game import GameOptions
+from room import Room, RoomManager
+from services.game_logger import GameLogger
+from handlers import ConnectionContext, handle_end_game, handle_start_game
+from test_handlers import MockWebSocket, make_ctx
+
+
+# ---------------------------------------------------------------------------
+# GameLogger.log_game_abandoned — unit
+# ---------------------------------------------------------------------------
+
+class TestLogGameAbandoned:
+
+    @pytest.mark.asyncio
+    async def test_calls_update_game_abandoned(self):
+        """log_game_abandoned_async delegates to event_store.update_game_abandoned."""
+        event_store = MagicMock()
+        event_store.update_game_abandoned = AsyncMock()
+        logger = GameLogger(event_store)
+
+        await logger.log_game_abandoned_async("game-uuid-123")
+
+        event_store.update_game_abandoned.assert_awaited_once_with("game-uuid-123")
+
+    @pytest.mark.asyncio
+    async def test_sync_wrapper_fires_task(self):
+        """Sync log_game_abandoned fires an async task in async context."""
+        event_store = MagicMock()
+        event_store.update_game_abandoned = AsyncMock()
+        logger = GameLogger(event_store)
+
+        logger.log_game_abandoned("game-uuid-456")
+        # Let the fire-and-forget task run
+        await asyncio.sleep(0)
+        await asyncio.sleep(0)
+
+        event_store.update_game_abandoned.assert_awaited_once_with("game-uuid-456")
+
+    def test_sync_wrapper_noop_on_empty_id(self):
+        """Empty game_id is a no-op (nothing to abandon)."""
+        event_store = MagicMock()
+        event_store.update_game_abandoned = AsyncMock()
+        logger = GameLogger(event_store)
+
+        logger.log_game_abandoned("")
+        logger.log_game_abandoned(None)
+
+        event_store.update_game_abandoned.assert_not_called()
+
+    @pytest.mark.asyncio
+    async def test_swallows_db_exceptions(self):
+        """DB errors are logged, not re-raised (fire-and-forget guarantee)."""
+        event_store = MagicMock()
+        event_store.update_game_abandoned = AsyncMock(side_effect=Exception("db down"))
+        logger = GameLogger(event_store)
+
+        # Must not raise
+        await logger.log_game_abandoned_async("game-uuid-789")
+
+
+# ---------------------------------------------------------------------------
+# handle_end_game integration
+# ---------------------------------------------------------------------------
+
+class TestHandleEndGameMarksAbandoned:
+
+    @pytest.mark.asyncio
+    async def test_marks_game_abandoned_before_room_removal(self):
+        """When host ends the game, games_v2 must be marked abandoned."""
+        rm = RoomManager()
+        room = rm.create_room()
+        host_ws = MockWebSocket()
+        room.add_player("host", "Host", host_ws)
+        room.get_player("host").is_host = True
+        room.game_log_id = "game-uuid-end"
+
+        mock_logger = MagicMock()
+        mock_logger.log_game_abandoned = MagicMock()
+
+        ctx = make_ctx(websocket=host_ws, player_id="host", room=room)
+
+        with patch("handlers.get_logger", return_value=mock_logger):
+            await handle_end_game(
+                {},
+                ctx,
+                room_manager=rm,
+                cleanup_room_profiles=lambda _code: None,
+            )
+
+        mock_logger.log_game_abandoned.assert_called_once_with("game-uuid-end")
+        assert room.code not in rm.rooms  # room still removed
+
+    @pytest.mark.asyncio
+    async def test_no_log_when_game_log_id_missing(self):
+        """If the game never logged a start, there's nothing to mark abandoned."""
+        rm = RoomManager()
+        room = rm.create_room()
+        host_ws = MockWebSocket()
+        room.add_player("host", "Host", host_ws)
+        room.get_player("host").is_host = True
+        # room.game_log_id stays None
+
+        mock_logger = MagicMock()
+        mock_logger.log_game_abandoned = MagicMock()
+
+        ctx = make_ctx(websocket=host_ws, player_id="host", room=room)
+
+        with patch("handlers.get_logger", return_value=mock_logger):
+            await handle_end_game(
+                {},
+                ctx,
+                room_manager=rm,
+                cleanup_room_profiles=lambda _code: None,
+            )
+
+        mock_logger.log_game_abandoned.assert_not_called()
+
+    @pytest.mark.asyncio
+    async def test_non_host_cannot_trigger_abandonment(self):
+        """Only the host ends games — non-host requests are rejected unchanged."""
+        rm = RoomManager()
+        room = rm.create_room()
+        room.add_player("host", "Host", MockWebSocket())
+        room.get_player("host").is_host = True
+        joiner_ws = MockWebSocket()
+        room.add_player("joiner", "Joiner", joiner_ws)
+        room.game_log_id = "game-uuid-untouchable"
+
+        mock_logger = MagicMock()
+        mock_logger.log_game_abandoned = MagicMock()
+
+        ctx = make_ctx(websocket=joiner_ws, player_id="joiner", room=room)
+
+        with patch("handlers.get_logger", return_value=mock_logger):
+            await handle_end_game(
+                {},
+                ctx,
+                room_manager=rm,
+                cleanup_room_profiles=lambda _code: None,
+            )
+
+        mock_logger.log_game_abandoned.assert_not_called()
+        assert room.code in rm.rooms
+
+
+class TestLogGameStartPopulatesMetadata:
+    """
+    create_game only writes id/room_code/host_id/options. update_game_started
+    (which fills in started_at, num_players, num_rounds, player_ids) existed
+    but had zero callers — 100% of staging's 289 games had those fields NULL.
+    log_game_start must call both so the row is complete after start_game.
+    """
+
+    @pytest.mark.asyncio
+    async def test_start_calls_create_and_update(self):
+        event_store = MagicMock()
+        event_store.create_game = AsyncMock()
+        event_store.update_game_started = AsyncMock()
+        logger = GameLogger(event_store)
+        options = GameOptions(initial_flips=0)
+
+        await logger.log_game_start_async(
+            room_code="ABCD",
+            num_players=3,
+            num_rounds=9,
+            player_ids=["p1", "p2", "p3"],
+            options=options,
+        )
+
+        event_store.create_game.assert_awaited_once()
+        event_store.update_game_started.assert_awaited_once()
+        call = event_store.update_game_started.await_args
+        assert call.kwargs.get("num_players", call.args[1] if len(call.args) > 1 else None) == 3
+        assert call.kwargs.get("num_rounds", call.args[2] if len(call.args) > 2 else None) == 9
+        assert call.kwargs.get("player_ids", call.args[3] if len(call.args) > 3 else None) == ["p1", "p2", "p3"]
+
+    @pytest.mark.asyncio
+    async def test_update_uses_same_game_id_as_create(self):
+        event_store = MagicMock()
+        event_store.create_game = AsyncMock()
+        event_store.update_game_started = AsyncMock()
+        logger = GameLogger(event_store)
+
+        await logger.log_game_start_async(
+            room_code="XYZW",
+            num_players=2,
+            num_rounds=1,
+            player_ids=["a", "b"],
+            options=GameOptions(initial_flips=0),
+        )
+
+        create_args = event_store.create_game.await_args
+        update_args = event_store.update_game_started.await_args
+        created_id = create_args.kwargs.get("game_id", create_args.args[0] if create_args.args else None)
+        updated_id = update_args.args[0] if update_args.args else update_args.kwargs.get("game_id")
+        assert created_id == updated_id
+        assert created_id  # non-empty
+
+    @pytest.mark.asyncio
+    async def test_create_failure_skips_update(self):
+        """If the row never landed, don't try to update a non-existent id."""
+        event_store = MagicMock()
+        event_store.create_game = AsyncMock(side_effect=Exception("db down"))
+        event_store.update_game_started = AsyncMock()
+        logger = GameLogger(event_store)
+
+        await logger.log_game_start_async(
+            room_code="ABCD",
+            num_players=2,
+            num_rounds=1,
+            player_ids=["a", "b"],
+            options=GameOptions(initial_flips=0),
+        )
+
+        event_store.update_game_started.assert_not_awaited()
+
+
+class TestLogGameEndWinnerId:
+    """
+    update_game_completed accepts winner_id but the existing sync wrapper
+    called it with the default None → every completed games_v2 row had
+    winner_id NULL. Thread the winner through so the denormalized column
+    is actually useful.
+    """
+
+    @pytest.mark.asyncio
+    async def test_winner_id_passed_through(self):
+        event_store = MagicMock()
+        event_store.update_game_completed = AsyncMock()
+        logger = GameLogger(event_store)
+
+        await logger.log_game_end_async("game-uuid", winner_id="player-7")
+
+        event_store.update_game_completed.assert_awaited_once_with("game-uuid", "player-7")
+
+    @pytest.mark.asyncio
+    async def test_winner_id_optional(self):
+        """A tie or abandonment-style end without a clear winner still works."""
+        event_store = MagicMock()
+        event_store.update_game_completed = AsyncMock()
+        logger = GameLogger(event_store)
+
+        await logger.log_game_end_async("game-uuid")
+
+        event_store.update_game_completed.assert_awaited_once_with("game-uuid", None)
+
+    @pytest.mark.asyncio
+    async def test_sync_wrapper_forwards_winner(self):
+        event_store = MagicMock()
+        event_store.update_game_completed = AsyncMock()
+        logger = GameLogger(event_store)
+
+        logger.log_game_end("game-uuid", winner_id="player-9")
+        await asyncio.sleep(0)
+        await asyncio.sleep(0)
+
+        event_store.update_game_completed.assert_awaited_once_with("game-uuid", "player-9")
+
+
+class TestStatsIdempotency:
+    """
+    broadcast_game_state can fire multiple times with phase=GAME_OVER
+    (double-click next-round, reconnect flush, etc.). log_game_end is
+    already idempotent because it nulls game_log_id immediately after.
+    _process_stats_safe had no such guard → every extra broadcast would
+    double-count games_played/games_won on the same game.
+
+    Solution: Room.stats_processed flag. Set True before firing the task.
+    """
+
+    def test_room_has_stats_processed_flag_defaulting_false(self):
+        room = Room(code="TEST")
+        assert room.stats_processed is False
+
+    def test_stats_processed_survives_touch(self):
+        """touch() updates last_activity but must not clobber stats_processed."""
+        room = Room(code="TEST")
+        room.stats_processed = True
+        room.touch()
+        assert room.stats_processed is True
+
+    @pytest.mark.asyncio
+    async def test_start_game_resets_stats_processed(self):
+        """When a room is reused for a second game, the latch must reset —
+        otherwise the new game's stats would be silently dropped."""
+        from handlers import handle_start_game
+
+        rm = RoomManager()
+        room = rm.create_room()
+        host_ws = MockWebSocket()
+        room.add_player("host", "Host", host_ws)
+        room.get_player("host").is_host = True
+        room.add_player("p2", "P2", MockWebSocket())
+        # Previous game already had stats processed
+        room.stats_processed = True
+
+        ctx = make_ctx(websocket=host_ws, player_id="host", room=room)
+
+        with patch("handlers.get_logger", return_value=None):
+            await handle_start_game(
+                {"decks": 1, "rounds": 1},
+                ctx,
+                broadcast_game_state=AsyncMock(),
+                check_and_run_cpu_turn=lambda _r: None,
+            )
+
+        assert room.stats_processed is False
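Review note: the latch that `TestStatsIdempotency` describes can be sketched roughly as below. This is a minimal illustration, not the server's code: `Room` here is a stripped-down stand-in, and `StatsRecorder`, `on_game_over`, and `start_game` are hypothetical names chosen for the example. The only behavior taken from the diff is that `stats_processed` defaults to `False`, is set to `True` before the stats task is fired, and is reset when a new game starts.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class Room:
    """Hypothetical stand-in for the server's Room; only the latch matters."""
    code: str
    stats_processed: bool = False


class StatsRecorder:
    """Hypothetical stats sink that counts how many times it actually ran."""

    def __init__(self) -> None:
        self.calls = 0

    async def process(self, room: Room) -> None:
        await asyncio.sleep(0)  # simulate an async DB write
        self.calls += 1


def on_game_over(room: Room, recorder: StatsRecorder) -> Optional[asyncio.Task]:
    """Schedule stats processing at most once per game."""
    if room.stats_processed:
        return None  # a repeated GAME_OVER broadcast: nothing to do
    # Flip the latch BEFORE scheduling, so a second broadcast that arrives
    # while the task is still pending cannot schedule another one.
    room.stats_processed = True
    return asyncio.create_task(recorder.process(room))


def start_game(room: Room) -> None:
    """Reset the latch so a reused room records its next game."""
    room.stats_processed = False


async def demo() -> int:
    room = Room(code="TEST")
    recorder = StatsRecorder()
    # Three GAME_OVER broadcasts for the same game: only one task is created.
    tasks = [on_game_over(room, recorder) for _ in range(3)]
    await asyncio.gather(*(t for t in tasks if t is not None))
    # The room is reused for a second game, which must be counted as well.
    start_game(room)
    second = on_game_over(room, recorder)
    if second is not None:
        await second
    return recorder.calls
```

Running `asyncio.run(demo())` returns 2: one recording per game, regardless of how many times the game-over path fires.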
--- a/tests/soak/CHECKLIST.md
+++ b/tests/soak/CHECKLIST.md
@@ -0,0 +1,64 @@
+# Soak Harness Validation Checklist
+
+Run after significant changes or before calling the harness implementation complete.
+
+## Post-deploy schema verification
+
+Run after the server-side changes deploy to each environment.
+
+- [ ] Server restarted (`docker compose up -d` or CI/CD deploy)
+- [ ] Server logs show `User store schema initialized` after restart
+- [ ] `\d users_v2` shows `is_test_account` column with default `false`
+- [ ] `\d invite_codes` shows `marks_as_test` column with default `false`
+- [ ] `\d leaderboard_overall` shows `is_test_account` column
+- [ ] `\di idx_users_test_account` shows the partial index
+- [ ] Leaderboard query still works: `curl .../api/stats/leaderboard` returns entries
+- [ ] `?include_test=true` parameter is accepted (no 422/500)
+
+## Bring-up
+
+- [ ] Invite code flagged with `marks_as_test=TRUE` on target environment
+- [ ] `bun run seed` creates/updates accounts in `.env.stresstest`
+- [ ] All seeded users show `is_test_account=TRUE` in the DB
+
+## Smoke test
+
+- [ ] `bash scripts/smoke.sh` exits 0 within 60s
+
+## Scenarios
+
+- [ ] `--scenario=populate --rooms=1 --games-per-room=1` completes cleanly
+- [ ] `--scenario=populate --rooms=2 --games-per-room=2` runs multiple rooms and multiple games
+- [ ] `--scenario=stress --games-per-room=3` logs `chaos_injected` events and completes
+
+## Watch modes
+
+- [ ] `--watch=none` produces JSONL on stdout, nothing else
+- [ ] `--watch=dashboard` opens http://localhost:7777, grid renders, WS shows `healthy`
+- [ ] Clicking a player tile opens the video modal with live JPEG frames
+- [ ] Closing the modal (Esc or Close) stops the screencast (check logs for `screencast_stopped`)
+- [ ] `--watch=tiled` opens native Chromium windows sized to show the full game table
+
+## Failure handling
+
+- [ ] Ctrl-C during a run → graceful shutdown, summary printed, exit code 2
+- [ ] Double Ctrl-C → immediate hard exit (130)
+- [ ] Health probes detect server down (3 consecutive failures → fatal abort)
+- [ ] Artifacts directory contains screenshots + state JSON on failure
+- [ ] Artifacts older than 7 days are pruned on next startup
+
+## Server-side filtering
+
+- [ ] `GET /api/stats/leaderboard` (default) hides soak accounts
+- [ ] `GET /api/stats/leaderboard?include_test=true` shows soak accounts
+- [ ] Admin panel user list shows `[Test]` badge on soak accounts
+- [ ] Admin panel invite codes tab shows `[Test-seed]` badge
+- [ ] "Include test accounts" checkbox toggles visibility in admin
+
+## Staging bring-up
+
+- [ ] `5VC2MCCN` flagged with `marks_as_test=TRUE` on staging DB
+- [ ] 16 accounts seeded via `SOAK_INVITE_CODE=5VC2MCCN bun run seed`
+- [ ] Populate run against staging completes with `--watch=dashboard`
+- [ ] Staging leaderboard default does NOT show soak accounts
+- [ ] Staging leaderboard with `?include_test=true` does show them
--- a/tests/soak/README.md
+++ b/tests/soak/README.md
@@ -1,21 +1,296 @@
 # Golf Soak & UX Test Harness

-Runs 16 authenticated browser sessions across 4 rooms to populate
-staging scoreboards and stress-test multiplayer stability.
+Standalone Playwright-based runner that drives multiple authenticated
+browser sessions playing real multiplayer games. Used for:

-**Spec:** `docs/superpowers/specs/2026-04-10-multiplayer-soak-test-design.md`
-**Bring-up:** `docs/soak-harness-bringup.md`
+- **Scoreboard population** — fill staging leaderboards with realistic data
+- **Stability stress testing** — hunt race conditions, WebSocket leaks, cleanup bugs
+- **Live monitoring** — watch bot sessions play in real time via CDP screencast

-## Quick start
+## Prerequisites
+
+- [Bun](https://bun.sh/) (or Node.js + npm)
+- Chromium browser binary (installed via `bunx playwright install chromium`)
+- A running Golf Card Game server (local dev or staging)
+- An invite code flagged as `marks_as_test=TRUE` (see [First-time setup](#first-time-setup))
+
+## First-time setup
+
+### 1. Install dependencies

 ```bash
 cd tests/soak
 bun install
-bun run seed                                  # first run only
-TEST_URL=http://localhost:8000 bun run smoke
+bunx playwright install chromium
 ```

-(The scripts also work with `npm run`, `pnpm run`, etc. — bun is what's installed
-on this dev machine.)
+### 2. Flag the invite code as test-seed

-Full documentation arrives with Task 31.
+Any account registered with a test-seed invite gets `is_test_account=TRUE`,
+which keeps it out of real-user stats and leaderboards.
+
+**Local dev:**
+
+```bash
+PGPASSWORD=devpassword psql -h localhost -U golf -d golf <<'SQL'
+INSERT INTO invite_codes (code, created_by, expires_at, max_uses, is_active, marks_as_test)
+SELECT 'SOAKTEST', id, NOW() + INTERVAL '10 years', 100, TRUE, TRUE
+FROM users_v2 LIMIT 1
+ON CONFLICT (code) DO UPDATE SET marks_as_test = TRUE;
+SQL
+```
+
+**Staging:**
+
+```bash
+ssh root@129.212.150.189 \
+  'docker compose -f /opt/golfgame/docker-compose.staging.yml exec -T postgres psql -U postgres -d golfgame' <<'SQL'
+UPDATE invite_codes SET marks_as_test = TRUE WHERE code = '5VC2MCCN';
+SQL
+```
+
+### 3. Seed test accounts
+
+```bash
+# Local dev
+TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST bun run seed
+
+# Staging
+TEST_URL=https://staging.adlee.work SOAK_INVITE_CODE=5VC2MCCN bun run seed
+```
+
+This registers 16 accounts via the invite code and caches their credentials
+in `.env.stresstest`. Only needs to run once — subsequent runs reuse the
+cached credentials (re-logging in if tokens expire).
+
+### 4. Verify with a smoke test
+
+```bash
+# Local dev
+TEST_URL=http://localhost:8000 SOAK_INVITE_CODE=SOAKTEST bash scripts/smoke.sh
+```
+
+Expected: one game plays to completion in ~60 seconds, exits 0.
+
+## Usage
+
+### Populate scoreboards (recommended first run)
+
+```bash
+TEST_URL=https://staging.adlee.work SOAK_INVITE_CODE=5VC2MCCN bun run soak -- \
+  --scenario=populate \
+  --watch=dashboard
+```
+
+This runs 4 rooms x 10 games x 9 holes with varied CPU personalities.
+The dashboard opens automatically at `http://localhost:7777`.
+
+### Quick smoke against staging
+
+```bash
+TEST_URL=https://staging.adlee.work SOAK_INVITE_CODE=5VC2MCCN bun run soak -- \
+  --scenario=populate \
+  --accounts=2 --rooms=1 --cpus-per-room=0 \
+  --games-per-room=1 --holes=1 \
+  --watch=dashboard
+```
+
+### Stress test with chaos injection
+
+```bash
+TEST_URL=https://staging.adlee.work SOAK_INVITE_CODE=5VC2MCCN bun run soak -- \
+  --scenario=stress \
+  --accounts=4 --rooms=1 --games-per-room=5 \
+  --watch=dashboard
+```
+
+Rapid 1-hole games with random chaos events (rapid clicks, tab blur,
+brief network outage) injected during gameplay.
+
+### Headless mode (CI / overnight)
+
+```bash
+TEST_URL=https://staging.adlee.work SOAK_INVITE_CODE=5VC2MCCN bun run soak -- \
+  --scenario=populate --watch=none
+```
+
+Outputs structured JSONL to stdout. Pipe to `jq` for filtering:
+
+```bash
+bun run soak -- --scenario=populate --watch=none 2>&1 | jq 'select(.msg == "game_complete")'
+```
+
+### Tiled mode (native browser windows)
+
+```bash
+bun run soak -- --scenario=populate --rooms=2 --watch=tiled
+```
+
+Opens visible Chromium windows for each room's host session. Useful for
+hands-on debugging with DevTools.
+
+## CLI flags
+
+```
+--scenario=populate|stress    required — which scenario to run
+--accounts=<n>                total sessions (default: from scenario)
+--rooms=<n>                   parallel rooms (default: from scenario)
+--cpus-per-room=<n>           CPU opponents per room (default: from scenario)
+--games-per-room=<n>          games per room (default: from scenario)
+--holes=<n>                   holes per game (default: from scenario)
+--watch=none|dashboard|tiled  visualization mode (default: dashboard)
+--dashboard-port=<n>          dashboard server port (default: 7777)
+--target=<url>                override TEST_URL env var
+--run-id=<string>             custom run identifier (default: timestamp)
+--list                        print available scenarios and exit
+--dry-run                     validate config without running
+```
+
+`--accounts` must be evenly divisible by `--rooms`.
+
+## Environment variables
+
+| Variable | Description | Default |
+|---|---|---|
+| `TEST_URL` | Target server base URL | `http://localhost:8000` |
+| `SOAK_INVITE_CODE` | Invite code for account seeding | `SOAKTEST` |
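+| `SOAK_HOLES` | Fallback for `--holes` when the flag is not passed | — |
+| `SOAK_ROOMS` | Fallback for `--rooms` when the flag is not passed | — |
+| `SOAK_ACCOUNTS` | Fallback for `--accounts` when the flag is not passed | — |
+| `SOAK_CPUS_PER_ROOM` | Fallback for `--cpus-per-room` when the flag is not passed | — |
+| `SOAK_GAMES_PER_ROOM` | Fallback for `--games-per-room` when the flag is not passed | — |
+| `SOAK_WATCH` | Fallback for `--watch` when the flag is not passed | — |
+| `SOAK_DASHBOARD_PORT` | Fallback for `--dashboard-port` when the flag is not passed | — |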
+
+Config precedence: CLI flags > env vars > scenario defaults.
+
+## Watch modes
+
+### `dashboard` (default)
+
+Opens `http://localhost:7777` with a live status grid:
+
+- 2x2 room tiles showing phase, current player, move count, progress bar
+- Activity log at the bottom
+- **Click any player tile** to watch their live session via CDP screencast
+- Press Esc or click Close to stop the video feed
+- WS connection status indicator
+
+The dashboard runs **locally on your machine** — the runner's headless
+browsers connect to the target server remotely while the dashboard UI
+is served from your workstation.
+
+### `tiled`
+
+Opens native Chromium windows for each room's host session, positioned
+in a grid. Joiners stay headless. Useful for interactive debugging with
+DevTools. The viewport is sized at 960x900 to show the full game table.
+
+### `none`
+
+Pure headless, structured JSONL to stdout. Use for CI, overnight runs,
+or piping to `jq`.
+
+## Scenarios
+
+### `populate`
+
+Long multi-round games to populate scoreboards with realistic data.
+
+| Setting | Default |
+|---|---|
+| Accounts | 16 |
+| Rooms | 4 |
+| CPUs per room | 1 |
+| Games per room | 10 |
+| Holes | 9 |
+| Decks | 2 |
+| Think time | 800-2200ms |
+
+### `stress`
+
+Rapid short games with chaos injection for stability testing.
+
+| Setting | Default |
+|---|---|
+| Accounts | 16 |
+| Rooms | 4 |
+| CPUs per room | 2 |
+| Games per room | 50 |
+| Holes | 1 |
+| Decks | 1 |
+| Think time | 50-150ms |
+| Chaos chance | 5% per turn |
+
+Chaos events: `rapid_clicks`, `tab_blur`, `brief_offline`
+
+### Adding new scenarios
+
+Create `scenarios/<name>.ts` exporting a `Scenario` object, then register
+it in `scenarios/index.ts`. See existing scenarios for the pattern.
+
+## Error handling
+
+- **Per-room isolation**: a failure in one room never unwinds other rooms
+  (`Promise.allSettled`)
+- **Watchdog**: 60s per-room timeout — fires if no heartbeat arrives
+- **Health probes**: `GET /health` every 30s, 3 consecutive failures = fatal abort
+- **Graceful shutdown**: Ctrl-C finishes current turn, then cleans up (10s timeout).
+  Double Ctrl-C = immediate force exit
+- **Artifacts**: on failure, screenshots + HTML + game state JSON saved to
+  `artifacts/<run-id>/`. Old artifacts auto-pruned after 7 days
+- **Exit codes**: `0` = success, `1` = errors, `2` = interrupted
+
+## Test account filtering
+
+Soak accounts are flagged `is_test_account=TRUE` in the database. They are:
+
+- **Hidden by default** from public leaderboards and stats (`?include_test=false`)
+- **Visible to admins** by default in the admin panel
+- **Togglable** via the "Include test accounts" checkbox in the admin panel
+- **Badged** with `[Test]` in the admin user list and `[Test-seed]` on the invite code
+
+## Unit tests
+
+```bash
+bun run test
+```
+
+27 tests covering Deferred, RoomCoordinator, Watchdog, Logger, and Config.
+Integration-level modules (SessionPool, scenarios, dashboard) are verified
+by the smoke test and live runs.
+
+## Architecture
+
+```
+runner.ts          CLI entry — parses flags, wires everything, runs scenario
+core/
+  session-pool.ts  Owns browser contexts, seeds/logs in accounts
+  room-coordinator Deferred-based host→joiners room code handoff
+  watchdog.ts      Per-room timeout detector
+  screencaster.ts  CDP Page.startScreencast for live video
+  logger.ts        Structured JSONL logger with child contexts
+  artifacts.ts     Screenshot/HTML/state capture on failure
+  types.ts         Scenario/Session/Logger contracts
+scenarios/
+  populate.ts      Long multi-round games
+  stress.ts        Rapid games with chaos injection
+  shared/
+    multiplayer-game.ts  Shared "play one game" loop
+    chaos.ts             Chaos event injector
+dashboard/
+  server.ts        HTTP + WS server
+  index.html       Status grid UI
+  dashboard.js     WS client + click-to-watch
+scripts/
+  seed-accounts.ts Account seeding CLI
+  smoke.sh         End-to-end canary (~60s)
+```
+
+Reuses `tests/e2e/bot/golf-bot.ts` unchanged for all game interactions.
+
+## Related docs
+
+- [Design spec](../../docs/superpowers/specs/2026-04-10-multiplayer-soak-test-design.md)
+- [Bring-up steps](../../docs/soak-harness-bringup.md)
+- [Implementation plan](../../docs/superpowers/plans/2026-04-10-multiplayer-soak-test.md)
--- a/tests/soak/core/screencaster.ts
+++ b/tests/soak/core/screencaster.ts
@@ -59,8 +59,8 @@ export class Screencaster {
    await client.send('Page.startScreencast', {
      format: opts.format ?? 'jpeg',
      quality: opts.quality ?? 60,
-      maxWidth: opts.maxWidth ?? 640,
-      maxHeight: opts.maxHeight ?? 360,
+      maxWidth: opts.maxWidth ?? 960,
+      maxHeight: opts.maxHeight ?? 540,
      everyNthFrame: opts.everyNthFrame ?? 2,
    });
    this.logger.info('screencast_started', { sessionKey });
--- a/tests/soak/core/session-pool.ts
+++ b/tests/soak/core/session-pool.ts
@@ -266,12 +266,49 @@ export class SessionPool {
      const context = await targetBrowser.newContext({
        ...this.opts.contextOptions,
        baseURL: this.opts.targetUrl,
-        ...(useHeaded ? { viewport: { width: 960, height: 900 } } : {}),
+        viewport: useHeaded
+          ? { width: 960, height: 900 }
+          : { width: 960, height: 800 },
      });
      await this.injectAuth(context, account);
      const page = await context.newPage();
      await page.goto(this.opts.targetUrl);

+      // Verify the token is valid — if expired, re-login and reload
+      const controlsVisible = await page
+        .waitForSelector('#lobby-game-controls:not(.hidden)', {
+          state: 'attached',
+          timeout: 5000,
+        })
+        .then(() => true)
+        .catch(() => false);
+
+      if (!controlsVisible) {
+        this.opts.logger.warn('token_expired_relogin', { account: account.key });
+        const freshToken = await loginAccount(
+          this.opts.targetUrl,
+          account.username,
+          account.password,
+        );
+        account.token = freshToken;
+        writeCredFile(this.opts.credFile, this.accounts);
+        await context.addInitScript(
+          ({ token, username }) => {
+            window.localStorage.setItem('authToken', token);
+            window.localStorage.setItem(
+              'authUser',
+              JSON.stringify({ id: '', username, role: 'user', email_verified: true }),
+            );
+          },
+          { token: freshToken, username: account.username },
+        );
+        await page.goto(this.opts.targetUrl);
+        await page.waitForSelector('#lobby-game-controls:not(.hidden)', {
+          state: 'attached',
+          timeout: 10000,
+        });
+      }
+
      // Best-effort tile placement. window.moveTo is often a no-op on
      // modern Chromium (especially under Wayland), so we don't rely on
      // it — the viewport sized above is what the user actually sees.
--- a/tests/soak/dashboard/dashboard.css
+++ b/tests/soak/dashboard/dashboard.css
@@ -167,7 +167,9 @@ body {
 }
 #video-frame {
  display: block;
+  width: 960px;
  max-width: 100%;
-  max-height: 70vh;
+  max-height: 80vh;
+  object-fit: contain;
  border: 1px solid var(--border);
 }
--- a/tests/soak/scenarios/populate.ts
+++ b/tests/soak/scenarios/populate.ts
@@ -50,9 +50,15 @@ async function runRoom(
  let completed = 0;
  const errors: ScenarioError[] = [];

+  // Send player list for dashboard tiles
+  ctx.dashboard.update(roomId, {
+    players: sessions.map((s) => ({ key: s.key, score: null, isActive: false })),
+  });
+
  for (let gameNum = 0; gameNum < cfg.gamesPerRoom; gameNum++) {
    if (ctx.signal.aborted) break;
    ctx.dashboard.update(roomId, { game: gameNum + 1, totalGames: cfg.gamesPerRoom });
+    ctx.dashboard.log('info', `${roomId}: starting game ${gameNum + 1}/${cfg.gamesPerRoom}`);
    ctx.logger.info('game_start', { room: roomId, game: gameNum + 1 });

    const result = await runOneMultiplayerGame(ctx, sessions, {
@@ -66,6 +72,9 @@ async function runRoom(

    if (result.completed) {
      completed++;
+      ctx.dashboard.incrementMetric('games_completed');
+      ctx.dashboard.incrementMetric('moves_total', result.turns);
+      ctx.dashboard.log('info', `${roomId}: game ${gameNum + 1} complete — ${result.turns} turns, ${(result.durationMs / 1000).toFixed(1)}s`);
      ctx.logger.info('game_complete', {
        room: roomId,
        game: gameNum + 1,
@@ -73,6 +82,8 @@ async function runRoom(
        durationMs: result.durationMs,
      });
    } else {
+      ctx.dashboard.incrementMetric('errors');
+      ctx.dashboard.log('error', `${roomId}: game ${gameNum + 1} failed — ${result.error}`);
      errors.push({
        room: roomId,
        reason: 'game_failed',
--- a/tests/soak/scenarios/shared/multiplayer-game.ts
+++ b/tests/soak/scenarios/shared/multiplayer-game.ts
@@ -53,7 +53,28 @@ export async function runOneMultiplayerGame(
    // After the first game ends each session is parked on the
    // game_over screen, which hides the lobby's Create Room button.
    // goto('/') bounces them back; localStorage-cached auth persists.
-    await Promise.all(sessions.map((s) => s.bot.goto('/')));
+    // We must wait for auth hydration to unhide #lobby-game-controls.
+    await Promise.all(
+      sessions.map(async (s) => {
+        await s.bot.goto('/');
+        try {
+          await s.page.waitForSelector('#create-room-btn', {
+            state: 'visible',
+            timeout: 15000,
+          });
+        } catch {
+          // Auth may have been lost. Log diagnostics and fail this game;
+          // no re-login is attempted here.
+          const html = await s.page.content().catch(() => '');
+          ctx.logger.warn('lobby_not_ready', {
+            session: s.key,
+            hasControls: html.includes('lobby-game-controls'),
+            hasHidden: html.includes('lobby-game-controls" class="hidden"') ||
+              html.includes("lobby-game-controls' class='hidden'"),
+          });
+          throw new Error(`lobby not ready for ${s.key} after goto('/')`);
+        }
+      }),
+    );

    // Use a unique coordinator key per game-start so Deferreds don't
    // carry stale room codes from previous games. The coordinator's
@@ -90,12 +111,36 @@ export async function runOneMultiplayerGame(

    async function sessionLoop(sessionIdx: number): Promise<void> {
      const session = sessions[sessionIdx];
+      const isHost = sessionIdx === 0;
      while (true) {
        if (ctx.signal.aborted) return;
        if (Date.now() - start > maxDuration) return;

        const phase = await session.bot.getGamePhase();
-        if (phase === 'game_over' || phase === 'round_over') return;
+        if (phase === 'game_over') return;
+
+        if (phase === 'round_over') {
+          if (isHost) {
+            await sleep(1500);
+            // The scoresheet modal uses #ss-next-btn; the side panel uses #next-round-btn.
+            // Try both — the visible one gets clicked.
+            const ssBtn = session.page.locator('#ss-next-btn');
+            const sideBtn = session.page.locator('#next-round-btn');
+            const clicked = await ssBtn.click({ timeout: 3000 }).then(() => 'ss').catch(() => null)
+              || await sideBtn.click({ timeout: 3000 }).then(() => 'side').catch(() => null);
+            ctx.logger.info('round_advance', { room: opts.roomId, session: session.key, clicked });
+          } else {
+            await sleep(2000);
+          }
+          // Wait for the next round to actually start (or game_over on last round)
+          for (let i = 0; i < 40; i++) {
+            const p = await session.bot.getGamePhase();
+            if (p === 'game_over' || p === 'playing' || p === 'initial_flip') break;
+            await sleep(500);
+          }
+          ctx.heartbeat(opts.roomId);
+          continue;
+        }

        if (await session.bot.isMyTurn()) {
          await session.bot.playTurn();
@@ -104,6 +149,11 @@ export async function runOneMultiplayerGame(
          ctx.dashboard.update(opts.roomId, {
            currentPlayer: session.account.username,
            moves: turnCounts.reduce((a, b) => a + b, 0),
+            players: sessions.map((s, j) => ({
+              key: s.key,
+              score: null,
+              isActive: j === sessionIdx,
+            })),
          });
          const thinkMs = randomInt(opts.thinkTimeMs[0], opts.thinkTimeMs[1]);
          await sleep(thinkMs);
@@ -115,8 +165,12 @@ export async function runOneMultiplayerGame(

    await Promise.all(sessions.map((_, i) => sessionLoop(i)));

+    // Let the server finish processing game completion (stats, DB update)
+    // before we navigate away and kill the WebSocket connections.
+    await sleep(2000);
+
    const totalTurns = turnCounts.reduce((a, b) => a + b, 0);
-    ctx.dashboard.update(opts.roomId, { phase: 'round_over' });
+    ctx.dashboard.update(opts.roomId, { phase: 'game_over' });
    return {
      completed: true,
      turns: totalTurns,
--- a/tests/soak/scenarios/stress.ts
+++ b/tests/soak/scenarios/stress.ts
@@ -50,10 +50,15 @@ async function runStressRoom(
  let chaosFired = 0;
  const errors: ScenarioError[] = [];

+  ctx.dashboard.update(roomId, {
+    players: sessions.map((s) => ({ key: s.key, score: null, isActive: false })),
+  });
+
  for (let gameNum = 0; gameNum < cfg.gamesPerRoom; gameNum++) {
    if (ctx.signal.aborted) break;

    ctx.dashboard.update(roomId, { game: gameNum + 1, totalGames: cfg.gamesPerRoom });
+    ctx.dashboard.log('info', `${roomId}: starting game ${gameNum + 1}/${cfg.gamesPerRoom}`);

    // Background chaos loop — runs concurrently with the game turn loop.
    // Delay the first tick by 3 seconds so room creation + joiners + game
@@ -90,12 +95,17 @@ async function runStressRoom(

    if (result.completed) {
      completed++;
+      ctx.dashboard.incrementMetric('games_completed');
+      ctx.dashboard.incrementMetric('moves_total', result.turns);
+      ctx.dashboard.log('info', `${roomId}: game ${gameNum + 1} complete — ${result.turns} turns`);
      ctx.logger.info('game_complete', {
        room: roomId,
        game: gameNum + 1,
        turns: result.turns,
      });
    } else {
+      ctx.dashboard.incrementMetric('errors');
+      ctx.dashboard.log('error', `${roomId}: game ${gameNum + 1} failed — ${result.error}`);
      errors.push({
        room: roomId,
        reason: 'game_failed',
Author	SHA1	Message	Date
adlee-was-taken	76a9de27c2	chore(staging): wire LEADERBOARD_INCLUDE_TEST_DEFAULT into compose The v3.3.5 router reads config.LEADERBOARD_INCLUDE_TEST_DEFAULT but the staging compose file was never passing the env through to the container. This change was applied manually on the staging host before but never made it back into the repo — fixing that so CI deploys pick it up. Value on staging is sourced from .env (already set to true). Production leaves it unset, so the default of false applies. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 00:55:00 -04:00
adlee-was-taken	f37f279098	chore: bump version to 3.3.5 Covers: game-lifecycle DB fixes (stranded active games, populated metadata, winner_id, stats idempotency) and staging leaderboard include-test override. Also aligns FastAPI app version in server/main.py (was stuck at 3.2.0) with the actual release. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 00:50:30 -04:00
adlee-was-taken	c02b0054c2	fix(server): winner_id on completed games + stats idempotency latch Two issues in the GAME_OVER broadcast path: 1. log_game_end called update_game_completed with winner_id=None default, so games_v2.winner_id was NULL on all 17 completed staging rows. The denormalized column existed but carried no information. Compute winner (lowest total; None on tie) in broadcast_game_state and thread through. 2. _process_stats_safe had no idempotency guard. log_game_end was already self-guarding via game_log_id=None after first fire, but nothing stopped repeated GAME_OVER broadcasts from re-firing stats and double-counting games_played/games_won. Add Room.stats_processed latch; reset it in handle_start_game so a re-used room still records. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 00:47:53 -04:00
adlee-was-taken	8030a3c171	fix(server): populate games_v2 metadata on game start update_game_started (started_at, num_players, num_rounds, player_ids) was defined in event_store but had zero callers. 289/289 staging games had those fields NULL — queries that joined on them returned garbage, and the denormalized player_ids GIN index was dead weight. log_game_start now calls create_game THEN update_game_started in one async task. If create fails, update is skipped (row doesn't exist). handlers.py passes num_rounds and player_ids through at call time. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 00:41:58 -04:00
adlee-was-taken	d5f8eef6b3	fix(server): mark games abandoned on room teardown + staging leaderboard When handle_player_leave emptied a room or handle_end_game was invoked, the room was removed from memory without touching games_v2. Periodic cleanup only scans in-memory rooms, so those rows were stranded as status='active' forever — staging had 42 orphans accumulated over 5h. - event_store.update_game_abandoned: guarded UPDATE (status='active' only) - GameLogger.log_game_abandoned{,_async}: fire-and-forget wrapper - handle_end_game + handle_player_leave: flip status before remove_room - LEADERBOARD_INCLUDE_TEST_DEFAULT: env override so staging can show soak-harness accounts by default; prod keeps them hidden Verified on staging: 42 orphans swept on restart, soak accounts now visible on /api/stats/leaderboard (rank 1-4). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 00:37:49 -04:00
adlee-was-taken	70498b1c33	fix(soak): multi-hole round transitions, token refresh, dashboard wiring - Session loop now handles round_over by clicking #ss-next-btn (the scoresheet modal button) instead of exiting early. Waits for next round or game_over before continuing. - SessionPool detects expired tokens on acquire and re-logins automatically, writing fresh credentials to .env.stresstest. - Added 2s post-game delay before goto('/') so the server can process game completion before WebSocket disconnect. - Wired dashboard metrics (games_completed, moves_total, errors), activity log entries, and player tiles for both populate and stress scenarios. - Bumped screencast resolution to 960x540 and set headless viewport to 960x800 for better click-to-watch framing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 20:37:24 -04:00
adlee-was-taken	ccc2f3b559	fix(server): game completion pipeline — stats recording + dict iteration safety Three bugs prevented game stats from recording: 1. broadcast_game_state had game_over processing (log_game_end + stats) inside the per-player loop — if all players disconnected before the loop ran, stats never processed. Moved to run once before the loop. 2. room.broadcast and broadcast_game_state iterated players.items() without snapshotting, causing RuntimeError when concurrent player disconnects mutated the dict. Fixed with list(). 3. stats_service.process_game_from_state passed avg_round_score to a CASE expression without a type hint, causing asyncpg to fail with "could not determine data type of parameter $6". Added ::integer casts. Also wrapped per-player send_json calls in try/except so a single disconnected player doesn't abort the broadcast to remaining players. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-17 20:37:08 -04:00
adlee-was-taken	d5194f43ba	docs(soak): full README + validation checklist Replaces the Task 31 stub README with complete documentation: quickstart, first-time setup (invite flagging, seeding, smoke), usage examples for all three watch modes, CLI flag reference, env var table, scenario descriptions, error handling summary, test account filtering explanation, and architecture overview. Adds CHECKLIST.md with post-deploy verification, bring-up, scenario, watch mode, failure handling, and staging gate items. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 23:05:22 -04:00
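Review note: commit c02b0054c2 describes the winner rule as "lowest total; None on tie". A minimal sketch of that rule follows, assuming per-player totals are available as a mapping from player id to total score. The function name `compute_winner_id` and the input shape are assumptions made for illustration; the real computation lives in `broadcast_game_state`, which is not shown in this diff.

```python
from typing import Mapping, Optional


def compute_winner_id(totals: Mapping[str, int]) -> Optional[str]:
    """Return the id of the player with the lowest total, or None on a tie.

    `totals` maps player id -> total score (hypothetical input shape).
    An empty mapping also yields None, since there is no clear winner.
    """
    if not totals:
        return None
    best = min(totals.values())
    leaders = [pid for pid, total in totals.items() if total == best]
    # Exactly one player at the minimum is a clear winner; otherwise a tie.
    return leaders[0] if len(leaders) == 1 else None
```

The result can be passed straight through as `winner_id`, which matches the `test_winner_id_optional` case where a tie or abandonment-style end forwards `None`.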