Huge v2 uplift, now deployable with real user management and tooling!

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Author: Aaron D. Lee
Date: 2026-01-27 11:32:15 -05:00
Parent: c912a56c2d
Commit: bea85e6b28
61 changed files with 25153 additions and 362 deletions


@@ -0,0 +1,327 @@
# Golf Card Game - V2 Master Plan
## Overview
Transform the current single-server Golf game into a production-ready, hostable platform with:
- **Event-sourced architecture** for full game replay and audit trails
- **User accounts** with authentication, password reset, and profile management
- **Admin tools** for moderation and system management
- **Leaderboards** with player statistics
- **Scalable hosting** options (self-hosted or cloud)
- **Export/playback** for sharing memorable games
---
## Document Structure (VDD)
This plan is split into independent vertical slices. Each document is self-contained and can be worked on by a separate agent.
| Document | Scope | Dependencies |
|----------|-------|--------------|
| `V2_01_EVENT_SOURCING.md` | Event classes, store, state rebuilding | None (foundation) |
| `V2_02_PERSISTENCE.md` | Redis cache, PostgreSQL, game recovery | 01 |
| `V2_03_USER_ACCOUNTS.md` | Registration, login, password reset, email | 02 |
| `V2_04_ADMIN_TOOLS.md` | Admin dashboard, moderation, system stats | 03 |
| `V2_05_STATS_LEADERBOARDS.md` | Stats aggregation, leaderboard API/UI | 03 |
| `V2_06_REPLAY_EXPORT.md` | Game replay, export, share links | 01, 02 |
| `V2_07_PRODUCTION.md` | Docker, deployment, monitoring, security | All |
---
## Current State (V1)
```
Client (Vanilla JS) <──WebSocket──> FastAPI Server <──> SQLite
                                    In-memory rooms
                                   (lost on restart)
```
**What works well:**
- Game logic is solid and well-tested
- CPU AI with 8 distinct personalities
- Flexible house rules system (15+ options)
- Real-time multiplayer via WebSockets
- Basic auth system with invite codes
**Limitations:**
- Single server, no horizontal scaling
- Game state lost on server restart
- Move logging exists but duplicates state
- No persistent player stats or leaderboards
- Limited admin capabilities
- No password reset flow
- No email integration
---
## V2 Target Architecture
```
┌─────────────────────────────────────────────────────────────────────┐
│                               Clients                               │
│                     (Browser / Future: Mobile)                      │
└───────────────────────────────┬─────────────────────────────────────┘
                                │ WebSocket + REST API
┌─────────────────────────────────────────────────────────────────────┐
│                         FastAPI Application                         │
│  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐   │
│  │ Command  │ │  Event   │ │  State   │ │  Query   │ │   Auth   │   │
│  │ Handler  │─►│  Store   │─►│ Builder │ │ Service  │ │ Service  │  │
│  └──────────┘ └──────────┘ └──────────┘ └──────────┘ └──────────┘   │
│  ┌──────────┐ ┌──────────┐ ┌──────────┐                             │
│  │  Admin   │ │  Stats   │ │  Email   │                             │
│  │ Service  │ │  Worker  │ │ Service  │                             │
│  └──────────┘ └──────────┘ └──────────┘                             │
└───────┬───────────────┬───────────────┬───────────────┬─────────────┘
        │               │               │               │
        ▼               ▼               ▼               ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│    Redis     │ │  PostgreSQL  │ │  PostgreSQL  │ │    Email     │
│ (Live State) │ │   (Events)   │ │ (Users/Stats)│ │   Provider   │
│  (Pub/Sub)   │ │              │ │              │ │   (Resend)   │
└──────────────┘ └──────────────┘ └──────────────┘ └──────────────┘
```
---
## Tech Stack
| Layer | Technology | Reasoning |
|-------|------------|-----------|
| **Web framework** | FastAPI (keep) | Already using, async, fast |
| **WebSockets** | Starlette (keep) | Built into FastAPI |
| **Live state cache** | Redis | Fast, pub/sub, TTL, battle-tested |
| **Event store** | PostgreSQL | JSONB, robust, great tooling |
| **User database** | PostgreSQL | Same instance, keep it simple |
| **Background jobs** | `arq` | Async, Redis-backed, lightweight |
| **Email** | Resend | Simple API, good free tier, reliable |
| **Containerization** | Docker | Consistent deployment |
| **Orchestration** | Docker Compose | Start simple, K8s if needed |
### New Dependencies
```txt
# requirements.txt additions
redis>=5.0.0
asyncpg>=0.29.0 # Async PostgreSQL
sqlalchemy>=2.0.0 # ORM for complex queries
alembic>=1.13.0 # Database migrations
arq>=0.26.0 # Background task queue
pydantic-settings>=2.0 # Config management
resend>=0.8.0 # Email service
python-jose[cryptography] # JWT tokens
passlib[bcrypt] # Password hashing
```
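Settings for these services would live in `server/config.py` via `pydantic-settings`. A sketch of what that might look like — the field names and defaults here are illustrative, not a final schema:

```python
# server/config.py -- sketch only; field names/defaults are illustrative
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env")

    database_url: str = "postgresql://localhost/golfgame"
    redis_url: str = "redis://localhost:6379/0"
    resend_api_key: str = ""
    jwt_secret: str = "change-me"
    events_enabled: bool = True  # dual-write feature flag (see V2-01)

settings = Settings()  # reads env vars first, then .env
```

Values come from environment variables (or a `.env` file), which keeps secrets out of the repo and matches the Docker Compose deployment story.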
---
## Phases & Milestones
### Phase 1: Event Infrastructure (Foundation)
**Goal:** Emit events alongside current code, validate replay works
| Milestone | Description | Document |
|-----------|-------------|----------|
| Event classes defined | All gameplay events as dataclasses | 01 |
| Event store working | PostgreSQL persistence | 01 |
| Dual-write enabled | Events emitted without breaking current code | 01 |
| Replay validation | Test proves events recreate identical state | 01 |
| Rate limiting on auth | Brute force protection | 07 |
### Phase 2: Persistence & Recovery
**Goal:** Games survive server restarts
| Milestone | Description | Document |
|-----------|-------------|----------|
| Redis state cache | Live game state in Redis | 02 |
| Pub/sub ready | Multi-server WebSocket fan-out | 02 |
| Game recovery | Rebuild games from events on startup | 02 |
| Graceful shutdown | Save state before stopping | 02 |
### Phase 3a: User Accounts
**Goal:** Full user lifecycle management
| Milestone | Description | Document |
|-----------|-------------|----------|
| Email service integrated | Resend configured and tested | 03 |
| Registration with verification | Email confirmation flow | 03 |
| Password reset flow | Forgot password via email token | 03 |
| Session management | View/revoke sessions | 03 |
| Account settings | Profile, preferences, deletion | 03 |
### Phase 3b: Admin Tools
**Goal:** Moderation and system management
| Milestone | Description | Document |
|-----------|-------------|----------|
| Admin dashboard | User list, search, metrics | 04 |
| User management | Ban, unban, force password reset | 04 |
| Game moderation | View any game, end stuck games | 04 |
| System monitoring | Active games, users online, events/hour | 04 |
| Audit logging | Track admin actions | 04 |
### Phase 4: Stats & Leaderboards
**Goal:** Persistent player statistics
| Milestone | Description | Document |
|-----------|-------------|----------|
| Stats schema | PostgreSQL tables for aggregated stats | 05 |
| Stats worker | Background job processing events | 05 |
| Leaderboard API | REST endpoints | 05 |
| Leaderboard UI | Client display | 05 |
| Achievement system | Badges and milestones (stretch) | 05 |
### Phase 5: Replay & Export
**Goal:** Share and replay games
| Milestone | Description | Document |
|-----------|-------------|----------|
| Export API | Download game as JSON | 06 |
| Import/load | Upload and replay | 06 |
| Replay UI | Playback controls, scrubbing | 06 |
| Share links | Public `/replay/{id}` URLs | 06 |
### Phase 6: Production
**Goal:** Deployable, monitored, secure
| Milestone | Description | Document |
|-----------|-------------|----------|
| Dockerized | All services containerized | 07 |
| Health checks | `/health` endpoint with dependency checks | 07 |
| Metrics | Prometheus metrics | 07 |
| Error tracking | Sentry integration | 07 |
| Deployment guide | Step-by-step for VPS/cloud | 07 |
---
## File Structure (Target)
```
golfgame/
├── client/ # Frontend (enhance incrementally)
│ ├── index.html
│ ├── app.js
│ ├── components/ # New: modular UI components
│ │ ├── leaderboard.js
│ │ ├── replay-controls.js
│ │ └── admin-dashboard.js
│ └── ...
├── server/
│ ├── main.py # FastAPI app entry point
│ ├── config.py # Settings from env vars
│ ├── dependencies.py # FastAPI dependency injection
│ ├── models/
│ │ ├── events.py # Event dataclasses
│ │ ├── user.py # User model
│ │ └── game_state.py # State rebuilt from events
│ ├── stores/
│ │ ├── event_store.py # PostgreSQL event persistence
│ │ ├── state_cache.py # Redis live state
│ │ └── user_store.py # User persistence
│ ├── services/
│ │ ├── game_service.py # Command handling, event emission
│ │ ├── auth_service.py # Authentication, sessions
│ │ ├── email_service.py # Email sending
│ │ ├── admin_service.py # Admin operations
│ │ ├── stats_service.py # Leaderboard queries
│ │ └── replay_service.py # Export, import, playback
│ ├── routers/
│ │ ├── auth.py # Auth endpoints
│ │ ├── admin.py # Admin endpoints
│ │ ├── games.py # Game/replay endpoints
│ │ └── stats.py # Leaderboard endpoints
│ ├── workers/
│ │ └── stats_worker.py # Background stats aggregation
│ ├── middleware/
│ │ ├── rate_limit.py # Rate limiting
│ │ └── auth.py # Auth middleware
│ ├── ai/ # Keep existing AI code
│ │ └── ...
│ └── tests/
│ ├── test_events.py
│ ├── test_replay.py
│ ├── test_auth.py
│ └── ...
├── migrations/ # Alembic migrations
│ ├── versions/
│ └── env.py
├── docker/
│ ├── Dockerfile
│ ├── docker-compose.yml
│ └── docker-compose.prod.yml
├── docs/
│ └── v2/ # These planning documents
│ ├── V2_00_MASTER_PLAN.md
│ ├── V2_01_EVENT_SOURCING.md
│ └── ...
└── scripts/
├── migrate.py # Run migrations
├── create_admin.py # Bootstrap admin user
└── export_game.py # CLI game export
```
---
## Decision Log
| Decision | Choice | Rationale |
|----------|--------|-----------|
| Event store DB | PostgreSQL | JSONB support, same DB as users, simpler ops |
| Email provider | Resend | Simple API, good free tier (3k/mo), reliable |
| Background jobs | arq | Async-native, Redis-backed, lightweight |
| Session storage | Redis | Fast, TTL support, already using for state |
| Password hashing | bcrypt | Industry standard, built-in work factor |
| JWT vs sessions | Both | JWT for the REST API, server-side sessions for WebSocket connections |
---
## Open Questions
1. **Guest play vs required accounts?**
- Decision: Allow guest play, prompt to register to save stats
- Guest games count for global stats but not personal leaderboards
2. **Game history retention?**
- Decision: Keep events forever (they're small, ~500 bytes each)
- Implement archival to cold storage after 1 year if needed
3. **Replay visibility?**
- Decision: Private by default, shareable via link
- Future: Public games opt-in
4. **CPU games count for leaderboards?**
- Decision: Yes, but separate "vs humans only" leaderboard later
5. **Multi-region?**
- Decision: Not for V2, single region is fine for card game latency
- Revisit if user base grows significantly
---
## How to Use These Documents
Each `V2_XX_*.md` document is designed to be:
1. **Self-contained** - Has all context needed to implement that slice
2. **Agent-ready** - Can be given to a Claude agent as the primary context
3. **Testable** - Includes acceptance criteria and test requirements
4. **Incremental** - Can be implemented and shipped independently (respecting dependencies)
**Workflow:**
1. Pick a document based on current phase
2. Start a new Claude session with that document as context
3. Implement the slice
4. Run tests specified in the document
5. PR and merge
6. Move to next slice
---
## Next Steps
1. Review all V2 documents
2. Set up PostgreSQL locally for development
3. Start with `V2_01_EVENT_SOURCING.md`
4. Implement rate limiting from `V2_07_PRODUCTION.md` early (security)


@@ -0,0 +1,867 @@
# V2-01: Event Sourcing Infrastructure
## Overview
This document covers the foundational event sourcing system. All game actions will be stored as immutable events, enabling replay, audit trails, and stats aggregation.
**Dependencies:** None (this is the foundation)
**Dependents:** All other V2 documents
---
## Goals
1. Define event classes for all game actions
2. Create PostgreSQL event store
3. Implement dual-write (events + current mutations)
4. Build state rebuilder from events
5. Validate that event replay produces identical state
---
## Current State
The game currently uses direct mutation:
```python
# Current approach in game.py
def draw_card(self, player_id: str, source: str) -> Optional[Card]:
    card = self.deck.pop() if source == "deck" else self.discard.pop()
    self.drawn_card = card
    self.phase = GamePhase.PLAY
    return card
```
Move logging exists in `game_logger.py` but stores denormalized state snapshots, not replayable events.
---
## Event Design
### Base Event Class
```python
# server/models/events.py
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional, Any
from enum import Enum
import uuid


class EventType(str, Enum):
    # Lifecycle
    GAME_CREATED = "game_created"
    PLAYER_JOINED = "player_joined"
    PLAYER_LEFT = "player_left"
    GAME_STARTED = "game_started"
    ROUND_STARTED = "round_started"
    ROUND_ENDED = "round_ended"
    GAME_ENDED = "game_ended"
    # Gameplay
    INITIAL_FLIP = "initial_flip"
    CARD_DRAWN = "card_drawn"
    CARD_SWAPPED = "card_swapped"
    CARD_DISCARDED = "card_discarded"
    CARD_FLIPPED = "card_flipped"
    FLIP_SKIPPED = "flip_skipped"
    FLIP_AS_ACTION = "flip_as_action"
    KNOCK_EARLY = "knock_early"


@dataclass
class GameEvent:
    """Base class for all game events."""
    event_type: EventType
    game_id: str
    sequence_num: int
    timestamp: datetime = field(default_factory=datetime.utcnow)
    player_id: Optional[str] = None
    data: dict = field(default_factory=dict)

    def to_dict(self) -> dict:
        return {
            "event_type": self.event_type.value,
            "game_id": self.game_id,
            "sequence_num": self.sequence_num,
            "timestamp": self.timestamp.isoformat(),
            "player_id": self.player_id,
            "data": self.data,
        }

    @classmethod
    def from_dict(cls, d: dict) -> "GameEvent":
        return cls(
            event_type=EventType(d["event_type"]),
            game_id=d["game_id"],
            sequence_num=d["sequence_num"],
            timestamp=datetime.fromisoformat(d["timestamp"]),
            player_id=d.get("player_id"),
            data=d.get("data", {}),
        )
```
### Lifecycle Events
```python
# Lifecycle event data structures
@dataclass
class GameCreatedData:
    room_code: str
    host_id: str
    options: dict  # GameOptions as dict


@dataclass
class PlayerJoinedData:
    player_name: str
    is_cpu: bool
    cpu_profile: Optional[str] = None


@dataclass
class GameStartedData:
    deck_seed: int  # For deterministic replay
    player_order: list[str]  # Player IDs in turn order
    num_decks: int
    num_rounds: int
    dealt_cards: dict[str, list[dict]]  # player_id -> cards dealt


@dataclass
class RoundStartedData:
    round_num: int
    deck_seed: int
    dealt_cards: dict[str, list[dict]]


@dataclass
class RoundEndedData:
    scores: dict[str, int]  # player_id -> score
    winner_id: Optional[str]
    final_hands: dict[str, list[dict]]  # For verification


@dataclass
class GameEndedData:
    final_scores: dict[str, int]  # player_id -> total score
    winner_id: str
    rounds_won: dict[str, int]
```
### Gameplay Events
```python
# Gameplay event data structures
@dataclass
class InitialFlipData:
    positions: list[int]
    cards: list[dict]  # The cards revealed


@dataclass
class CardDrawnData:
    source: str  # "deck" or "discard"
    card: dict  # Card drawn


@dataclass
class CardSwappedData:
    position: int
    new_card: dict  # Card placed (was drawn)
    old_card: dict  # Card removed (goes to discard)


@dataclass
class CardDiscardedData:
    card: dict  # Card discarded


@dataclass
class CardFlippedData:
    position: int
    card: dict  # Card revealed


@dataclass
class FlipAsActionData:
    position: int
    card: dict  # Card revealed


@dataclass
class KnockEarlyData:
    positions: list[int]  # Positions flipped
    cards: list[dict]  # Cards revealed
```
---
## Event Store Schema
```sql
-- migrations/versions/001_create_events.sql

-- Events table (append-only log)
CREATE TABLE events (
    id BIGSERIAL PRIMARY KEY,
    game_id UUID NOT NULL,
    sequence_num INT NOT NULL,
    event_type VARCHAR(50) NOT NULL,
    player_id VARCHAR(50),
    event_data JSONB NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    -- Ensure events are ordered and unique per game
    UNIQUE (game_id, sequence_num)
);

-- Games metadata (for queries, not source of truth)
CREATE TABLE games_v2 (
    id UUID PRIMARY KEY,
    room_code VARCHAR(10) NOT NULL,
    status VARCHAR(20) DEFAULT 'active',  -- active, completed, abandoned
    created_at TIMESTAMPTZ DEFAULT NOW(),
    started_at TIMESTAMPTZ,
    completed_at TIMESTAMPTZ,
    num_players INT,
    num_rounds INT,
    options JSONB,
    winner_id VARCHAR(50),
    host_id VARCHAR(50),
    -- Denormalized for efficient queries
    player_ids VARCHAR(50)[] DEFAULT '{}'
);

-- Indexes
CREATE INDEX idx_events_game_seq ON events (game_id, sequence_num);
CREATE INDEX idx_events_type ON events (event_type);
CREATE INDEX idx_events_player ON events (player_id) WHERE player_id IS NOT NULL;
CREATE INDEX idx_events_created ON events (created_at);
CREATE INDEX idx_games_status ON games_v2 (status);
CREATE INDEX idx_games_room ON games_v2 (room_code) WHERE status = 'active';
CREATE INDEX idx_games_players ON games_v2 USING GIN (player_ids);
CREATE INDEX idx_games_completed ON games_v2 (completed_at) WHERE status = 'completed';
```
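The optimistic-concurrency guarantee rests entirely on the `UNIQUE (game_id, sequence_num)` constraint: two writers racing to append sequence N will have exactly one succeed. The same idea demonstrated in a throwaway SQLite session (SQLite is used here only to keep the demo dependency-free; production uses PostgreSQL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE events (
        game_id TEXT NOT NULL,
        sequence_num INTEGER NOT NULL,
        event_data TEXT NOT NULL,
        UNIQUE (game_id, sequence_num)
    )
""")
conn.execute("INSERT INTO events VALUES ('g1', 1, '{}')")
try:
    # Second writer tries to claim the same sequence slot: rejected.
    conn.execute("INSERT INTO events VALUES ('g1', 1, '{}')")
    conflict = False
except sqlite3.IntegrityError:
    conflict = True  # maps to ConcurrencyError in the EventStore
```

The losing writer then reloads the latest sequence number and retries, which is what makes the append path safe without row locks.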
---
## Event Store Implementation
```python
# server/stores/event_store.py
from typing import Optional, AsyncIterator
import asyncpg
import json

from models.events import GameEvent, EventType


class ConcurrencyError(Exception):
    """Raised when the optimistic concurrency check fails."""
    pass


class EventStore:
    """PostgreSQL-backed event store."""

    def __init__(self, pool: asyncpg.Pool):
        self.pool = pool

    async def append(self, event: GameEvent) -> int:
        """
        Append an event to the store.
        Returns the event ID.
        Raises ConcurrencyError if sequence_num already exists.
        """
        async with self.pool.acquire() as conn:
            try:
                row = await conn.fetchrow("""
                    INSERT INTO events (game_id, sequence_num, event_type, player_id, event_data)
                    VALUES ($1, $2, $3, $4, $5)
                    RETURNING id
                    """,
                    event.game_id,
                    event.sequence_num,
                    event.event_type.value,
                    event.player_id,
                    json.dumps(event.data),
                )
                return row["id"]
            except asyncpg.UniqueViolationError:
                raise ConcurrencyError(
                    f"Event {event.sequence_num} already exists for game {event.game_id}"
                )

    async def append_batch(self, events: list[GameEvent]) -> list[int]:
        """Append multiple events atomically."""
        async with self.pool.acquire() as conn:
            async with conn.transaction():
                ids = []
                for event in events:
                    row = await conn.fetchrow("""
                        INSERT INTO events (game_id, sequence_num, event_type, player_id, event_data)
                        VALUES ($1, $2, $3, $4, $5)
                        RETURNING id
                        """,
                        event.game_id,
                        event.sequence_num,
                        event.event_type.value,
                        event.player_id,
                        json.dumps(event.data),
                    )
                    ids.append(row["id"])
                return ids

    async def get_events(
        self,
        game_id: str,
        from_sequence: int = 0,
        to_sequence: Optional[int] = None,
    ) -> list[GameEvent]:
        """Get events for a game, optionally within a sequence range."""
        async with self.pool.acquire() as conn:
            if to_sequence is not None:
                rows = await conn.fetch("""
                    SELECT event_type, game_id, sequence_num, player_id, event_data, created_at
                    FROM events
                    WHERE game_id = $1 AND sequence_num >= $2 AND sequence_num <= $3
                    ORDER BY sequence_num
                    """, game_id, from_sequence, to_sequence)
            else:
                rows = await conn.fetch("""
                    SELECT event_type, game_id, sequence_num, player_id, event_data, created_at
                    FROM events
                    WHERE game_id = $1 AND sequence_num >= $2
                    ORDER BY sequence_num
                    """, game_id, from_sequence)
            return [
                GameEvent(
                    event_type=EventType(row["event_type"]),
                    game_id=row["game_id"],
                    sequence_num=row["sequence_num"],
                    player_id=row["player_id"],
                    data=json.loads(row["event_data"]),
                    timestamp=row["created_at"],
                )
                for row in rows
            ]

    async def get_latest_sequence(self, game_id: str) -> int:
        """Get the latest sequence number for a game (-1 if none)."""
        async with self.pool.acquire() as conn:
            row = await conn.fetchrow("""
                SELECT COALESCE(MAX(sequence_num), -1) AS seq
                FROM events
                WHERE game_id = $1
                """, game_id)
            return row["seq"]

    async def stream_events(
        self,
        game_id: str,
        from_sequence: int = 0,
    ) -> AsyncIterator[GameEvent]:
        """Stream events for memory-efficient processing."""
        async with self.pool.acquire() as conn:
            async with conn.transaction():
                async for row in conn.cursor("""
                    SELECT event_type, game_id, sequence_num, player_id, event_data, created_at
                    FROM events
                    WHERE game_id = $1 AND sequence_num >= $2
                    ORDER BY sequence_num
                    """, game_id, from_sequence):
                    yield GameEvent(
                        event_type=EventType(row["event_type"]),
                        game_id=row["game_id"],
                        sequence_num=row["sequence_num"],
                        player_id=row["player_id"],
                        data=json.loads(row["event_data"]),
                        timestamp=row["created_at"],
                    )
```
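Since the other V2 slices will need to unit-test against an event store without standing up PostgreSQL, a test double is worth sketching. This hypothetical in-memory stand-in mirrors the store's contract (sync rather than async, for brevity — names and error types are illustrative):

```python
# server/tests/fakes.py -- hypothetical in-memory stand-in for EventStore,
# for unit tests that should not require PostgreSQL.
class InMemoryEventStore:
    def __init__(self):
        self._events: dict[str, list] = {}  # game_id -> ordered event list

    def append(self, event) -> int:
        log = self._events.setdefault(event.game_id, [])
        if any(e.sequence_num == event.sequence_num for e in log):
            # Mirrors ConcurrencyError from the real store.
            raise RuntimeError(f"duplicate sequence_num {event.sequence_num}")
        log.append(event)
        return len(log)

    def get_events(self, game_id: str, from_sequence: int = 0) -> list:
        return [e for e in self._events.get(game_id, [])
                if e.sequence_num >= from_sequence]

    def get_latest_sequence(self, game_id: str) -> int:
        log = self._events.get(game_id, [])
        return max((e.sequence_num for e in log), default=-1)
```

Returning `-1` for an unknown game matches `get_latest_sequence` in the real store, so callers can compute "next sequence = latest + 1" uniformly.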
---
## State Rebuilder
```python
# server/models/game_state.py
from dataclasses import dataclass, field
from typing import Optional
from enum import Enum

from models.events import GameEvent, EventType


class GamePhase(str, Enum):
    WAITING = "waiting"
    INITIAL_FLIP = "initial_flip"
    PLAYING = "playing"
    FINAL_TURN = "final_turn"
    ROUND_OVER = "round_over"
    GAME_OVER = "game_over"


@dataclass
class Card:
    rank: str
    suit: str
    face_up: bool = False

    def to_dict(self) -> dict:
        return {"rank": self.rank, "suit": self.suit, "face_up": self.face_up}

    @classmethod
    def from_dict(cls, d: dict) -> "Card":
        return cls(rank=d["rank"], suit=d["suit"], face_up=d.get("face_up", False))


@dataclass
class PlayerState:
    id: str
    name: str
    cards: list[Card] = field(default_factory=list)
    score: Optional[int] = None
    total_score: int = 0
    rounds_won: int = 0
    is_cpu: bool = False
    cpu_profile: Optional[str] = None


@dataclass
class RebuiltGameState:
    """Game state rebuilt from events."""
    game_id: str
    room_code: str = ""
    phase: GamePhase = GamePhase.WAITING
    players: dict[str, PlayerState] = field(default_factory=dict)
    player_order: list[str] = field(default_factory=list)
    current_player_idx: int = 0
    deck: list[Card] = field(default_factory=list)
    discard: list[Card] = field(default_factory=list)
    drawn_card: Optional[Card] = None
    current_round: int = 0
    total_rounds: int = 9
    options: dict = field(default_factory=dict)
    sequence_num: int = 0
    finisher_id: Optional[str] = None
    def apply(self, event: GameEvent) -> "RebuiltGameState":
        """
        Apply an event to produce the next state.
        Returns self for chaining.
        """
        assert event.sequence_num == self.sequence_num + 1, \
            f"Expected sequence {self.sequence_num + 1}, got {event.sequence_num}"
        handler = getattr(self, f"_apply_{event.event_type.value}", None)
        if handler:
            handler(event)
        else:
            raise ValueError(f"Unknown event type: {event.event_type}")
        self.sequence_num = event.sequence_num
        return self
    def _apply_game_created(self, event: GameEvent):
        self.room_code = event.data["room_code"]
        self.options = event.data.get("options", {})
        self.players[event.data["host_id"]] = PlayerState(
            id=event.data["host_id"],
            name="Host",  # Will be updated by player_joined
        )

    def _apply_player_joined(self, event: GameEvent):
        self.players[event.player_id] = PlayerState(
            id=event.player_id,
            name=event.data["player_name"],
            is_cpu=event.data.get("is_cpu", False),
            cpu_profile=event.data.get("cpu_profile"),
        )

    def _apply_player_left(self, event: GameEvent):
        if event.player_id in self.players:
            del self.players[event.player_id]
        if event.player_id in self.player_order:
            self.player_order.remove(event.player_id)

    def _apply_game_started(self, event: GameEvent):
        self.player_order = event.data["player_order"]
        self.total_rounds = event.data["num_rounds"]
        self.current_round = 1
        self.phase = GamePhase.INITIAL_FLIP
        # Deal cards
        for player_id, cards_data in event.data["dealt_cards"].items():
            if player_id in self.players:
                self.players[player_id].cards = [
                    Card.from_dict(c) for c in cards_data
                ]
        # Rebuilding the deck from deck_seed would go here for full determinism;
        # for now, we trust the dealt_cards data

    def _apply_round_started(self, event: GameEvent):
        self.current_round = event.data["round_num"]
        self.phase = GamePhase.INITIAL_FLIP
        self.finisher_id = None
        self.drawn_card = None
        for player_id, cards_data in event.data["dealt_cards"].items():
            if player_id in self.players:
                self.players[player_id].cards = [
                    Card.from_dict(c) for c in cards_data
                ]
                self.players[player_id].score = None

    def _apply_initial_flip(self, event: GameEvent):
        player = self.players.get(event.player_id)
        if player:
            for pos, card_data in zip(event.data["positions"], event.data["cards"]):
                if 0 <= pos < len(player.cards):
                    player.cards[pos] = Card.from_dict(card_data)
                    player.cards[pos].face_up = True
        # Check if all players have flipped
        required = self.options.get("initial_flips", 2)
        all_flipped = all(
            sum(1 for c in p.cards if c.face_up) >= required
            for p in self.players.values()
        )
        if all_flipped and required > 0:
            self.phase = GamePhase.PLAYING

    def _apply_card_drawn(self, event: GameEvent):
        card = Card.from_dict(event.data["card"])
        card.face_up = True
        self.drawn_card = card
        if event.data["source"] == "discard" and self.discard:
            self.discard.pop()

    def _apply_card_swapped(self, event: GameEvent):
        player = self.players.get(event.player_id)
        if player and self.drawn_card:
            pos = event.data["position"]
            old_card = player.cards[pos]
            new_card = Card.from_dict(event.data["new_card"])
            new_card.face_up = True
            player.cards[pos] = new_card
            old_card.face_up = True
            self.discard.append(old_card)
            self.drawn_card = None
            self._advance_turn(player)

    def _apply_card_discarded(self, event: GameEvent):
        if self.drawn_card:
            self.discard.append(self.drawn_card)
            self.drawn_card = None
        player = self.players.get(event.player_id)
        if player:
            self._advance_turn(player)

    def _apply_card_flipped(self, event: GameEvent):
        player = self.players.get(event.player_id)
        if player:
            pos = event.data["position"]
            card = Card.from_dict(event.data["card"])
            card.face_up = True
            player.cards[pos] = card
            self._advance_turn(player)

    def _apply_flip_skipped(self, event: GameEvent):
        player = self.players.get(event.player_id)
        if player:
            self._advance_turn(player)

    def _apply_flip_as_action(self, event: GameEvent):
        player = self.players.get(event.player_id)
        if player:
            pos = event.data["position"]
            card = Card.from_dict(event.data["card"])
            card.face_up = True
            player.cards[pos] = card
            self._advance_turn(player)

    def _apply_knock_early(self, event: GameEvent):
        player = self.players.get(event.player_id)
        if player:
            for pos, card_data in zip(event.data["positions"], event.data["cards"]):
                card = Card.from_dict(card_data)
                card.face_up = True
                player.cards[pos] = card
            self._check_all_face_up(player)
            self._advance_turn(player)

    def _apply_round_ended(self, event: GameEvent):
        self.phase = GamePhase.ROUND_OVER
        for player_id, score in event.data["scores"].items():
            if player_id in self.players:
                self.players[player_id].score = score
                self.players[player_id].total_score += score
        winner_id = event.data.get("winner_id")
        if winner_id and winner_id in self.players:
            self.players[winner_id].rounds_won += 1

    def _apply_game_ended(self, event: GameEvent):
        self.phase = GamePhase.GAME_OVER

    def _advance_turn(self, player: PlayerState):
        """Advance to the next player's turn."""
        self._check_all_face_up(player)
        if self.phase == GamePhase.ROUND_OVER:
            return
        self.current_player_idx = (self.current_player_idx + 1) % len(self.player_order)
        # Check if we've come back around to the finisher
        if self.finisher_id:
            current_id = self.player_order[self.current_player_idx]
            if current_id == self.finisher_id:
                self.phase = GamePhase.ROUND_OVER

    def _check_all_face_up(self, player: PlayerState):
        """Check if player has all cards face up (triggers final turn)."""
        if all(c.face_up for c in player.cards):
            if self.phase == GamePhase.PLAYING and not self.finisher_id:
                self.finisher_id = player.id
                self.phase = GamePhase.FINAL_TURN

    @property
    def current_player_id(self) -> Optional[str]:
        if self.player_order and 0 <= self.current_player_idx < len(self.player_order):
            return self.player_order[self.current_player_idx]
        return None


def rebuild_state(events: list[GameEvent]) -> RebuiltGameState:
    """Rebuild game state from a list of events."""
    if not events:
        raise ValueError("Cannot rebuild state from empty event list")
    state = RebuiltGameState(game_id=events[0].game_id)
    for event in events:
        state.apply(event)
    return state
```
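`rebuild_state` is just a left fold of `apply` over the event log. The same pattern in miniature, with a toy counter standing in for the game state (these names are illustrative, not part of the codebase):

```python
from dataclasses import dataclass
from functools import reduce

@dataclass
class Counter:
    value: int = 0
    sequence_num: int = 0

    def apply(self, event: dict) -> "Counter":
        # Same ordering discipline as RebuiltGameState.apply
        assert event["seq"] == self.sequence_num + 1, "events must apply in order"
        self.value += event["delta"]
        self.sequence_num = event["seq"]
        return self

log = [{"seq": 1, "delta": 5}, {"seq": 2, "delta": -2}, {"seq": 3, "delta": 4}]
state = reduce(lambda s, e: s.apply(e), log, Counter())
# state.value == 7 after the full log; replaying log[:2] gives 3
```

This is also why partial replay (for the scrubbing UI in V2-06) comes for free: fold a prefix of the log and you get the state at that point in time.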
---
## Dual-Write Integration
Modify the existing `game.py` to emit events alongside its current mutations:
```python
# server/game.py additions
from typing import Callable, Optional

from models.events import EventType, GameEvent


class Game:
    def __init__(self):
        # ... existing init ...
        self._event_emitter: Optional[Callable[[GameEvent], None]] = None
        self._sequence_num = 0

    def set_event_emitter(self, emitter: Callable[[GameEvent], None]):
        """Set the callback for event emission."""
        self._event_emitter = emitter

    def _emit(self, event_type: EventType, player_id: Optional[str] = None, **data):
        """Emit an event if an emitter is configured."""
        if self._event_emitter:
            self._sequence_num += 1
            event = GameEvent(
                event_type=event_type,
                game_id=self.game_id,
                sequence_num=self._sequence_num,
                player_id=player_id,
                data=data,
            )
            self._event_emitter(event)

    # Example: modify draw_card
    def draw_card(self, player_id: str, source: str) -> Optional[Card]:
        # ... existing validation ...
        if source == "deck":
            card = self.deck.pop()
        else:
            card = self.discard_pile.pop()
        self.drawn_card = card
        # NEW: emit event
        self._emit(
            EventType.CARD_DRAWN,
            player_id=player_id,
            source=source,
            card=card.to_dict(),
        )
        return card
```
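Because the dual-write hook is just a callback, it can be exercised without any database, and with no emitter attached the game behaves exactly as before. A toy sketch of that property (`MiniGame` is illustrative, not the real `Game` class; the `events_enabled` flag is an assumed config switch):

```python
# Sketch: dual-write is observable but side-effect free for game logic.
class MiniGame:
    def __init__(self, events_enabled: bool = True):
        self._emitter = None
        self._seq = 0
        self._events_enabled = events_enabled  # hypothetical config flag
        self.drawn = None

    def set_event_emitter(self, emitter):
        self._emitter = emitter

    def _emit(self, event_type: str, **data):
        if self._events_enabled and self._emitter:
            self._seq += 1
            self._emitter({"type": event_type, "seq": self._seq, "data": data})

    def draw_card(self, card: str):
        self.drawn = card                     # the existing mutation, unchanged
        self._emit("card_drawn", card=card)   # the new, optional side channel

collected = []
game = MiniGame()
game.set_event_emitter(collected.append)
game.draw_card("K♥")
# collected == [{"type": "card_drawn", "seq": 1, "data": {"card": "K♥"}}]
```

Collecting into a plain list like this is also exactly how the validation tests below capture an event log without touching PostgreSQL.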
---
## Validation Test
```python
# server/tests/test_event_replay.py
import pytest

from game import Game, GameOptions
from models.events import GameEvent
from models.game_state import GamePhase, RebuiltGameState, rebuild_state


class TestEventReplay:
    """Verify that event replay produces identical state."""

    def test_full_game_replay(self):
        """Play a complete game and verify replay matches."""
        events = []

        def collect_events(event: GameEvent):
            events.append(event)

        # Play a real game
        game = Game()
        game.set_event_emitter(collect_events)
        game.add_player("p1", "Alice")
        game.add_player("p2", "Bob")
        game.start_game(num_decks=1, num_rounds=1, options=GameOptions())

        # Play through initial flips
        game.flip_initial_cards("p1", [0, 1])
        game.flip_initial_cards("p2", [0, 1])

        # Play some turns
        while game.phase not in (GamePhase.ROUND_OVER, GamePhase.GAME_OVER):
            current = game.current_player()
            if not current:
                break
            # Simple bot: always draw from deck and discard
            game.draw_card(current.id, "deck")
            game.discard_drawn(current.id)
            if len(events) > 100:  # Safety limit
                break

        # Get final state
        final_state = game.get_state("p1")

        # Rebuild from events
        rebuilt = rebuild_state(events)

        # Verify key state matches
        assert rebuilt.phase == game.phase
        assert rebuilt.current_round == game.current_round
        assert len(rebuilt.players) == len(game.players)
        for player_id, player in rebuilt.players.items():
            original = game.get_player(player_id)
            assert player.score == original.score
            assert player.total_score == original.total_score
            assert len(player.cards) == len(original.cards)
            for i, card in enumerate(player.cards):
                orig_card = original.cards[i]
                assert card.rank == orig_card.rank
                assert card.suit == orig_card.suit
                assert card.face_up == orig_card.face_up

    def test_partial_replay(self):
        """Verify we can replay to any point in the game."""
        events = []

        def collect_events(event: GameEvent):
            events.append(event)

        game = Game()
        game.set_event_emitter(collect_events)
        # ... setup and play ...

        # Replay only the first N events
        for n in range(1, len(events) + 1):
            partial = rebuild_state(events[:n])
            assert partial.sequence_num == n

    def test_event_order_enforced(self):
        """Verify events must be applied in order."""
        events = []
        # ... collect some events ...
        state = RebuiltGameState(game_id="test")
        # Skip an event - should fail
        with pytest.raises(AssertionError):
            state.apply(events[1])  # Skipping events[0]
```
---
## Acceptance Criteria
1. **Event Classes Complete**
- [ ] All lifecycle events defined (created, joined, left, started, ended)
- [ ] All gameplay events defined (draw, swap, discard, flip, etc.)
- [ ] Events are serializable to/from JSON
- [ ] Events include all data needed for replay
2. **Event Store Working**
- [ ] PostgreSQL schema created via migration
- [ ] Can append single events
- [ ] Can append batches atomically
- [ ] Can retrieve events by game_id
- [ ] Can retrieve events by sequence range
- [ ] Concurrent writes to same sequence fail cleanly
3. **State Rebuilder Working**
- [ ] Can rebuild state from any event sequence
- [ ] Handles all event types
- [ ] Enforces event ordering
- [ ] Matches original game state exactly
4. **Dual-Write Enabled**
- [ ] Game class has event emitter hook
- [ ] All state-changing methods emit events
- [ ] Events don't affect existing game behavior
- [ ] Can be enabled/disabled via config
5. **Validation Tests Pass**
- [ ] Full game replay test
- [ ] Partial replay test
- [ ] Event order enforcement test
- [ ] At least 95% of games replay correctly
---
## Implementation Order
1. Create event dataclasses (`models/events.py`)
2. Create database migration for events table
3. Implement EventStore class
4. Implement RebuiltGameState class
5. Add event emitter to Game class
6. Add `_emit()` calls to all game methods
7. Write validation tests
8. Run tests until 100% pass
---
## Notes for Agent
- The existing `game.py` has good test coverage - don't break existing tests
- Start with lifecycle events, then gameplay events
- The deck seed is important for deterministic replay
- Consider edge cases: player disconnects, CPU players, house rules
- Events should be immutable - never modify after creation
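The deterministic-replay point deserves emphasis: if the shuffle is driven by a stored seed, the deck order is a pure function of that seed. A minimal sketch of the idea (the `build_deck` helper and the idea of a seed stored in the `GAME_STARTED` event are assumptions, not the existing schema):

```python
import random

# Hypothetical: rebuild a shuffled deck from the seed recorded at game start,
# so replay produces the exact same card order every time.
RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]

def build_deck(seed: int) -> list[tuple[str, str]]:
    deck = [(rank, suit) for suit in SUITS for rank in RANKS]
    random.Random(seed).shuffle(deck)  # seeded RNG -> deterministic order
    return deck

# Replaying with the same seed yields an identical deck.
assert build_deck(42) == build_deck(42)
```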

# V2-02: Persistence & Recovery
## Overview
This document covers the live state caching and game recovery system. Games will survive server restarts by storing live state in Redis and rebuilding from events.
**Dependencies:** V2-01 (Event Sourcing)
**Dependents:** V2-03 (User Accounts), V2-06 (Replay)
---
## Goals
1. Cache live game state in Redis
2. Implement Redis pub/sub for multi-server support
3. Enable game recovery from events on server restart
4. Implement graceful shutdown with state preservation
---
## Current State
Games are stored in-memory in `main.py`:
```python
# Current approach
rooms: dict[str, Room] = {} # Lost on restart!
```
On server restart, all active games are lost.
---
## Architecture
```
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ FastAPI #1 │ │ FastAPI #2 │ │ FastAPI #N │
│ (WebSocket) │ │ (WebSocket) │ │ (WebSocket) │
└────────┬────────┘ └────────┬────────┘ └────────┬────────┘
│ │ │
└───────────────────────┼───────────────────────┘
┌────────────▼────────────┐
│ Redis │
│ ┌─────────────────┐ │
│ │ State Cache │ │ <- Live game state
│ │ (Hash/JSON) │ │
│ └─────────────────┘ │
│ ┌─────────────────┐ │
│ │ Pub/Sub │ │ <- Cross-server events
│ │ (Channels) │ │
│ └─────────────────┘ │
│ ┌─────────────────┐ │
│ │ Room Index │ │ <- Active room codes
│ │ (Set) │ │
│ └─────────────────┘ │
└─────────────────────────┘
┌────────────▼────────────┐
│ PostgreSQL │
│ (Event Store) │ <- Source of truth
└─────────────────────────┘
```
---
## Redis Data Model
### Key Patterns
```
golf:room:{room_code} -> Hash (room metadata)
golf:game:{game_id} -> JSON (full game state)
golf:room:{room_code}:players -> Set (connected player IDs)
golf:rooms:active -> Set (active room codes)
golf:player:{player_id}:room -> String (player's current room)
```
### Room Metadata Hash
```
golf:room:ABCD
├── game_id: "uuid-..."
├── host_id: "player-uuid"
├── created_at: "2024-01-15T10:30:00Z"
├── status: "waiting" | "playing" | "finished"
└── server_id: "server-1" # Which server owns this room
```
### Game State JSON
```json
{
"game_id": "uuid-...",
"room_code": "ABCD",
"phase": "playing",
"current_round": 3,
"total_rounds": 9,
"current_player_idx": 1,
"player_order": ["p1", "p2", "p3"],
"players": {
"p1": {
"id": "p1",
"name": "Alice",
"cards": [{"rank": "K", "suit": "hearts", "face_up": true}, ...],
"score": null,
"total_score": 15,
"rounds_won": 1,
"is_cpu": false
}
},
"deck_count": 32,
"discard_top": {"rank": "7", "suit": "clubs"},
"drawn_card": null,
"options": {...},
"sequence_num": 47
}
```
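Since this blob is the unit of caching, it is worth asserting its invariants before every write. A sanity-check sketch (key names mirror the example above; `validate_state` itself is hypothetical, not part of the existing code):

```python
import json

REQUIRED_KEYS = {
    "game_id", "room_code", "phase", "current_player_idx",
    "player_order", "players", "sequence_num",
}

def validate_state(state: dict) -> dict:
    """Check cache-blob invariants before writing to Redis (sketch)."""
    missing = REQUIRED_KEYS - state.keys()
    assert not missing, f"missing keys: {missing}"
    # current_player_idx must index into player_order
    assert 0 <= state["current_player_idx"] < len(state["player_order"])
    # every player in the turn order must have an entry
    assert set(state["player_order"]) <= state["players"].keys()
    # the blob must survive a JSON round-trip unchanged
    assert json.loads(json.dumps(state)) == state
    return state
```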
---
## State Cache Implementation
```python
# server/stores/state_cache.py
import json
from typing import Optional
from datetime import datetime, timedelta
import redis.asyncio as redis
from models.game_state import RebuiltGameState
class StateCache:
"""Redis-backed live game state cache."""
# Key patterns
ROOM_KEY = "golf:room:{room_code}"
GAME_KEY = "golf:game:{game_id}"
ROOM_PLAYERS_KEY = "golf:room:{room_code}:players"
ACTIVE_ROOMS_KEY = "golf:rooms:active"
PLAYER_ROOM_KEY = "golf:player:{player_id}:room"
# TTLs
ROOM_TTL = timedelta(hours=4) # Inactive rooms expire
GAME_TTL = timedelta(hours=4)
def __init__(self, redis_client: redis.Redis):
self.redis = redis_client
# --- Room Operations ---
async def create_room(
self,
room_code: str,
game_id: str,
host_id: str,
server_id: str,
) -> None:
"""Create a new room."""
pipe = self.redis.pipeline()
# Room metadata
pipe.hset(
self.ROOM_KEY.format(room_code=room_code),
mapping={
"game_id": game_id,
"host_id": host_id,
"status": "waiting",
"server_id": server_id,
"created_at": datetime.utcnow().isoformat(),
},
)
pipe.expire(self.ROOM_KEY.format(room_code=room_code), self.ROOM_TTL)
# Add to active rooms
pipe.sadd(self.ACTIVE_ROOMS_KEY, room_code)
# Track host's room
pipe.set(
self.PLAYER_ROOM_KEY.format(player_id=host_id),
room_code,
ex=self.ROOM_TTL,
)
await pipe.execute()
async def get_room(self, room_code: str) -> Optional[dict]:
"""Get room metadata."""
data = await self.redis.hgetall(self.ROOM_KEY.format(room_code=room_code))
if not data:
return None
return {k.decode(): v.decode() for k, v in data.items()}
async def room_exists(self, room_code: str) -> bool:
"""Check if room exists."""
return await self.redis.exists(self.ROOM_KEY.format(room_code=room_code)) > 0
async def delete_room(self, room_code: str) -> None:
"""Delete a room and all associated data."""
room = await self.get_room(room_code)
if not room:
return
pipe = self.redis.pipeline()
# Get players to clean up their mappings
players = await self.redis.smembers(
self.ROOM_PLAYERS_KEY.format(room_code=room_code)
)
for player_id in players:
pipe.delete(self.PLAYER_ROOM_KEY.format(player_id=player_id.decode()))
# Delete room data
pipe.delete(self.ROOM_KEY.format(room_code=room_code))
pipe.delete(self.ROOM_PLAYERS_KEY.format(room_code=room_code))
pipe.srem(self.ACTIVE_ROOMS_KEY, room_code)
# Delete game state if exists
if "game_id" in room:
pipe.delete(self.GAME_KEY.format(game_id=room["game_id"]))
await pipe.execute()
async def get_active_rooms(self) -> set[str]:
"""Get all active room codes."""
rooms = await self.redis.smembers(self.ACTIVE_ROOMS_KEY)
return {r.decode() for r in rooms}
# --- Player Operations ---
async def add_player_to_room(self, room_code: str, player_id: str) -> None:
"""Add a player to a room."""
pipe = self.redis.pipeline()
pipe.sadd(self.ROOM_PLAYERS_KEY.format(room_code=room_code), player_id)
pipe.set(
self.PLAYER_ROOM_KEY.format(player_id=player_id),
room_code,
ex=self.ROOM_TTL,
)
# Refresh room TTL on activity
pipe.expire(self.ROOM_KEY.format(room_code=room_code), self.ROOM_TTL)
await pipe.execute()
async def remove_player_from_room(self, room_code: str, player_id: str) -> None:
"""Remove a player from a room."""
pipe = self.redis.pipeline()
pipe.srem(self.ROOM_PLAYERS_KEY.format(room_code=room_code), player_id)
pipe.delete(self.PLAYER_ROOM_KEY.format(player_id=player_id))
await pipe.execute()
async def get_room_players(self, room_code: str) -> set[str]:
"""Get player IDs in a room."""
players = await self.redis.smembers(
self.ROOM_PLAYERS_KEY.format(room_code=room_code)
)
return {p.decode() for p in players}
async def get_player_room(self, player_id: str) -> Optional[str]:
"""Get the room a player is in."""
room = await self.redis.get(self.PLAYER_ROOM_KEY.format(player_id=player_id))
return room.decode() if room else None
# --- Game State Operations ---
async def save_game_state(self, game_id: str, state: dict) -> None:
"""Save full game state."""
await self.redis.set(
self.GAME_KEY.format(game_id=game_id),
json.dumps(state),
ex=self.GAME_TTL,
)
async def get_game_state(self, game_id: str) -> Optional[dict]:
"""Get full game state."""
data = await self.redis.get(self.GAME_KEY.format(game_id=game_id))
if not data:
return None
return json.loads(data)
async def update_game_state(self, game_id: str, updates: dict) -> None:
"""Partial update to game state (get, merge, set)."""
state = await self.get_game_state(game_id)
if state:
state.update(updates)
await self.save_game_state(game_id, state)
async def delete_game_state(self, game_id: str) -> None:
"""Delete game state."""
await self.redis.delete(self.GAME_KEY.format(game_id=game_id))
# --- Room Status ---
async def set_room_status(self, room_code: str, status: str) -> None:
"""Update room status."""
await self.redis.hset(
self.ROOM_KEY.format(room_code=room_code),
"status",
status,
)
async def refresh_room_ttl(self, room_code: str) -> None:
"""Refresh room TTL on activity."""
pipe = self.redis.pipeline()
pipe.expire(self.ROOM_KEY.format(room_code=room_code), self.ROOM_TTL)
room = await self.get_room(room_code)
if room and "game_id" in room:
pipe.expire(self.GAME_KEY.format(game_id=room["game_id"]), self.GAME_TTL)
await pipe.execute()
```
---
## Pub/Sub for Multi-Server
```python
# server/stores/pubsub.py
import asyncio
import json
from typing import Awaitable, Callable, Optional
from dataclasses import dataclass
from enum import Enum
import redis.asyncio as redis
class MessageType(str, Enum):
GAME_STATE_UPDATE = "game_state_update"
PLAYER_JOINED = "player_joined"
PLAYER_LEFT = "player_left"
ROOM_CLOSED = "room_closed"
BROADCAST = "broadcast"
@dataclass
class PubSubMessage:
type: MessageType
room_code: str
data: dict
def to_json(self) -> str:
return json.dumps({
"type": self.type.value,
"room_code": self.room_code,
"data": self.data,
})
@classmethod
def from_json(cls, raw: str) -> "PubSubMessage":
d = json.loads(raw)
return cls(
type=MessageType(d["type"]),
room_code=d["room_code"],
data=d["data"],
)
class GamePubSub:
"""Redis pub/sub for cross-server game events."""
CHANNEL_PREFIX = "golf:room:"
def __init__(self, redis_client: redis.Redis):
self.redis = redis_client
self.pubsub = redis_client.pubsub()
self._handlers: dict[str, list[Callable[[PubSubMessage], Awaitable[None]]]] = {}
self._running = False
self._task: Optional[asyncio.Task] = None
def _channel(self, room_code: str) -> str:
return f"{self.CHANNEL_PREFIX}{room_code}"
async def subscribe(
self,
room_code: str,
handler: Callable[[PubSubMessage], Awaitable[None]],
) -> None:
"""Subscribe to room events."""
channel = self._channel(room_code)
if channel not in self._handlers:
self._handlers[channel] = []
await self.pubsub.subscribe(channel)
self._handlers[channel].append(handler)
async def unsubscribe(self, room_code: str) -> None:
"""Unsubscribe from room events."""
channel = self._channel(room_code)
if channel in self._handlers:
del self._handlers[channel]
await self.pubsub.unsubscribe(channel)
async def publish(self, message: PubSubMessage) -> None:
"""Publish a message to a room's channel."""
channel = self._channel(message.room_code)
await self.redis.publish(channel, message.to_json())
async def start(self) -> None:
"""Start listening for messages."""
self._running = True
self._task = asyncio.create_task(self._listen())
async def stop(self) -> None:
"""Stop listening."""
self._running = False
if self._task:
self._task.cancel()
try:
await self._task
except asyncio.CancelledError:
pass
await self.pubsub.close()
async def _listen(self) -> None:
"""Main listener loop."""
while self._running:
try:
message = await self.pubsub.get_message(
ignore_subscribe_messages=True,
timeout=1.0,
)
if message and message["type"] == "message":
channel = message["channel"].decode()
handlers = self._handlers.get(channel, [])
try:
msg = PubSubMessage.from_json(message["data"].decode())
for handler in handlers:
await handler(msg)
except Exception as e:
print(f"Error handling pubsub message: {e}")
except asyncio.CancelledError:
break
except Exception as e:
print(f"PubSub listener error: {e}")
await asyncio.sleep(1)
```
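The envelope is plain JSON, so serialization can be verified without a Redis connection. A self-contained round-trip check (re-declaring a minimal version of the classes above so the snippet runs on its own):

```python
import json
from dataclasses import dataclass
from enum import Enum

class MessageType(str, Enum):
    GAME_STATE_UPDATE = "game_state_update"

@dataclass
class PubSubMessage:
    type: MessageType
    room_code: str
    data: dict

    def to_json(self) -> str:
        return json.dumps({"type": self.type.value,
                           "room_code": self.room_code,
                           "data": self.data})

    @classmethod
    def from_json(cls, raw: str) -> "PubSubMessage":
        d = json.loads(raw)
        return cls(MessageType(d["type"]), d["room_code"], d["data"])

msg = PubSubMessage(MessageType.GAME_STATE_UPDATE, "ABCD", {"sequence_num": 47})
assert PubSubMessage.from_json(msg.to_json()) == msg  # dataclass eq compares fields
```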
---
## Game Recovery
```python
# server/services/recovery_service.py
from typing import Any, Optional
import asyncio
from stores.event_store import EventStore
from stores.state_cache import StateCache
from models.events import rebuild_state, EventType
class RecoveryService:
"""Recovers games from event store on startup."""
def __init__(self, event_store: EventStore, state_cache: StateCache):
self.event_store = event_store
self.state_cache = state_cache
    async def recover_all_games(self) -> dict[str, Any]:
        """
        Recover all active games from the event store.
        Returns a summary of recovery results.
        """
results = {
"recovered": 0,
"failed": 0,
"skipped": 0,
"games": [],
}
# Get active rooms from Redis (may be stale)
active_rooms = await self.state_cache.get_active_rooms()
for room_code in active_rooms:
room = await self.state_cache.get_room(room_code)
if not room:
results["skipped"] += 1
continue
game_id = room.get("game_id")
if not game_id:
results["skipped"] += 1
continue
try:
game = await self.recover_game(game_id)
if game:
results["recovered"] += 1
results["games"].append({
"game_id": game_id,
"room_code": room_code,
"phase": game.phase.value,
"sequence": game.sequence_num,
})
else:
results["skipped"] += 1
except Exception as e:
print(f"Failed to recover game {game_id}: {e}")
results["failed"] += 1
return results
    async def recover_game(self, game_id: str) -> Optional[Any]:
"""
Recover a single game from event store.
Returns the rebuilt game state.
"""
# Get all events for this game
events = await self.event_store.get_events(game_id)
if not events:
return None
# Check if game is actually active (not ended)
last_event = events[-1]
if last_event.event_type == EventType.GAME_ENDED:
return None # Game is finished, don't recover
# Rebuild state
state = rebuild_state(events)
# Save to cache
await self.state_cache.save_game_state(
game_id,
self._state_to_dict(state),
)
return state
async def recover_from_sequence(
self,
game_id: str,
cached_state: dict,
cached_sequence: int,
    ) -> Optional[Any]:
"""
Recover game by applying only new events to cached state.
More efficient than full rebuild.
"""
# Get events after cached sequence
new_events = await self.event_store.get_events(
game_id,
from_sequence=cached_sequence + 1,
)
if not new_events:
return None # No new events
# Rebuild state from cache + new events
state = self._dict_to_state(cached_state)
for event in new_events:
state.apply(event)
# Update cache
await self.state_cache.save_game_state(
game_id,
self._state_to_dict(state),
)
return state
def _state_to_dict(self, state) -> dict:
"""Convert RebuiltGameState to dict for caching."""
return {
"game_id": state.game_id,
"room_code": state.room_code,
"phase": state.phase.value,
"current_round": state.current_round,
"total_rounds": state.total_rounds,
"current_player_idx": state.current_player_idx,
"player_order": state.player_order,
"players": {
pid: {
"id": p.id,
"name": p.name,
"cards": [c.to_dict() for c in p.cards],
"score": p.score,
"total_score": p.total_score,
"rounds_won": p.rounds_won,
"is_cpu": p.is_cpu,
"cpu_profile": p.cpu_profile,
}
for pid, p in state.players.items()
},
"deck_count": len(state.deck),
"discard_top": state.discard[-1].to_dict() if state.discard else None,
"drawn_card": state.drawn_card.to_dict() if state.drawn_card else None,
"options": state.options,
"sequence_num": state.sequence_num,
"finisher_id": state.finisher_id,
}
def _dict_to_state(self, d: dict):
"""Convert dict back to RebuiltGameState."""
# Implementation depends on RebuiltGameState structure
pass
```
---
## Graceful Shutdown
```python
# server/main.py additions
import signal
import asyncio
from contextlib import asynccontextmanager
from stores.state_cache import StateCache
from stores.event_store import EventStore
from services.recovery_service import RecoveryService
@asynccontextmanager
async def lifespan(app: FastAPI):
"""Application lifespan handler."""
# Startup
print("Starting up...")
# Initialize connections
app.state.redis = await create_redis_pool()
app.state.pg_pool = await create_pg_pool()
app.state.state_cache = StateCache(app.state.redis)
app.state.event_store = EventStore(app.state.pg_pool)
app.state.recovery_service = RecoveryService(
app.state.event_store,
app.state.state_cache,
)
# Recover games
print("Recovering games from event store...")
results = await app.state.recovery_service.recover_all_games()
print(f"Recovery complete: {results['recovered']} recovered, "
f"{results['failed']} failed, {results['skipped']} skipped")
# Start pub/sub
app.state.pubsub = GamePubSub(app.state.redis)
await app.state.pubsub.start()
yield
# Shutdown
print("Shutting down...")
# Stop accepting new connections
await app.state.pubsub.stop()
# Flush any pending state to Redis
await flush_pending_states(app)
# Close connections
await app.state.redis.close()
await app.state.pg_pool.close()
print("Shutdown complete")
async def flush_pending_states(app: FastAPI):
"""Flush any in-memory state to Redis before shutdown."""
# If we have any rooms with unsaved state, save them now
for room_code, room in rooms.items():
if room.game and room.game.game_id:
try:
state = room.game.get_full_state()
await app.state.state_cache.save_game_state(
room.game.game_id,
state,
)
except Exception as e:
print(f"Error flushing state for room {room_code}: {e}")
app = FastAPI(lifespan=lifespan)
# Handle SIGTERM gracefully (note: uvicorn installs its own signal handlers;
# a manual handler like this is only needed when running the event loop directly)
def handle_sigterm(signum, frame):
    """Handle SIGTERM by initiating graceful shutdown."""
    raise KeyboardInterrupt()

signal.signal(signal.SIGTERM, handle_sigterm)
```
---
## Integration with Game Service
```python
# server/services/game_service.py
from stores.state_cache import StateCache
from stores.event_store import EventStore
from stores.pubsub import GamePubSub, PubSubMessage, MessageType
class GameService:
"""
Handles game commands with event sourcing.
Coordinates between event store, state cache, and pub/sub.
"""
def __init__(
self,
event_store: EventStore,
state_cache: StateCache,
pubsub: GamePubSub,
):
self.event_store = event_store
self.state_cache = state_cache
self.pubsub = pubsub
async def handle_draw(
self,
game_id: str,
player_id: str,
source: str,
) -> dict:
"""Handle draw card command."""
# 1. Get current state from cache
state = await self.state_cache.get_game_state(game_id)
if not state:
raise GameNotFoundError(game_id)
# 2. Validate command
        current_player_id = state["player_order"][state["current_player_idx"]]
        if current_player_id != player_id:
            raise NotYourTurnError()
# 3. Execute command (get card from deck/discard)
# This uses the existing game logic
game = self._load_game_from_state(state)
card = game.draw_card(player_id, source)
if not card:
raise InvalidMoveError("Cannot draw from that source")
# 4. Create event
event = GameEvent(
event_type=EventType.CARD_DRAWN,
game_id=game_id,
sequence_num=state["sequence_num"] + 1,
player_id=player_id,
data={"source": source, "card": card.to_dict()},
)
# 5. Persist event
await self.event_store.append(event)
# 6. Update cache
new_state = game.get_full_state()
new_state["sequence_num"] = event.sequence_num
await self.state_cache.save_game_state(game_id, new_state)
# 7. Publish to other servers
await self.pubsub.publish(PubSubMessage(
type=MessageType.GAME_STATE_UPDATE,
room_code=state["room_code"],
data={"game_state": new_state},
))
return new_state
```
---
## Acceptance Criteria
1. **Redis State Cache Working**
- [ ] Can create/get/delete rooms
- [ ] Can add/remove players from rooms
- [ ] Can save/get/delete game state
- [ ] TTL expiration works correctly
- [ ] Room code uniqueness enforced
2. **Pub/Sub Working**
- [ ] Can subscribe to room channels
- [ ] Can publish messages
- [ ] Messages received by all subscribers
- [ ] Handles disconnections gracefully
- [ ] Multiple servers can communicate
3. **Game Recovery Working**
- [ ] Games recovered on startup
- [ ] State matches what was saved
- [ ] Partial recovery (from sequence) works
- [ ] Ended games not recovered
- [ ] Failed recoveries logged and skipped
4. **Graceful Shutdown Working**
- [ ] SIGTERM triggers clean shutdown
- [ ] In-flight requests complete
- [ ] State flushed to Redis
- [ ] Connections closed cleanly
- [ ] No data loss on restart
5. **Integration Tests**
- [ ] Server restart doesn't lose games
- [ ] Multi-server state sync works
- [ ] State cache matches event store
- [ ] Performance acceptable (<100ms for state ops)
---
## Implementation Order
1. Set up Redis locally (docker)
2. Implement StateCache class
3. Write StateCache tests
4. Implement GamePubSub class
5. Implement RecoveryService
6. Add lifespan handler to main.py
7. Integrate with game commands
8. Test full recovery cycle
9. Test multi-server pub/sub
---
## Docker Setup for Development
```yaml
# docker-compose.dev.yml
version: '3.8'
services:
redis:
image: redis:7-alpine
ports:
- "6379:6379"
volumes:
- redis_data:/data
command: redis-server --appendonly yes
postgres:
image: postgres:16-alpine
ports:
- "5432:5432"
environment:
POSTGRES_USER: golf
POSTGRES_PASSWORD: devpassword
POSTGRES_DB: golf
volumes:
- postgres_data:/var/lib/postgresql/data
volumes:
redis_data:
postgres_data:
```
```bash
# Start services
docker-compose -f docker-compose.dev.yml up -d
# Connect to Redis CLI
docker exec -it golfgame_redis_1 redis-cli
# Connect to PostgreSQL
docker exec -it golfgame_postgres_1 psql -U golf
```
---
## Notes for Agent
- Redis operations should use pipelines for atomicity
- Consider Redis Cluster for production (but not needed initially)
- The state cache is a cache, not source of truth (events are)
- Pub/sub is best-effort; state sync should handle missed messages
- Test with multiple server instances locally
- Use connection pooling for both Redis and PostgreSQL

# V2-05: Stats & Leaderboards
## Overview
This document covers player statistics aggregation and leaderboard systems.
**Dependencies:** V2-03 (User Accounts), V2-01 (Events for aggregation)
**Dependents:** None (end feature)
---
## Goals
1. Aggregate player statistics from game events
2. Create leaderboard views (by wins, by average score, etc.)
3. Background worker for stats processing
4. Leaderboard API endpoints
5. Leaderboard UI in client
6. Achievement/badge system (stretch goal)
---
## Database Schema
```sql
-- migrations/versions/004_stats_leaderboards.sql
-- Player statistics (aggregated from events)
CREATE TABLE player_stats (
user_id UUID PRIMARY KEY REFERENCES users(id),
-- Game counts
games_played INT DEFAULT 0,
games_won INT DEFAULT 0,
games_vs_humans INT DEFAULT 0,
games_won_vs_humans INT DEFAULT 0,
-- Round stats
rounds_played INT DEFAULT 0,
rounds_won INT DEFAULT 0,
total_points INT DEFAULT 0, -- Sum of all round scores (lower is better)
-- Best/worst
best_round_score INT,
worst_round_score INT,
best_game_score INT, -- Lowest total in a game
-- Achievements
knockouts INT DEFAULT 0, -- Times going out first
perfect_rounds INT DEFAULT 0, -- Score of 0 or less
wolfpacks INT DEFAULT 0, -- Four jacks achieved
-- Streaks
current_win_streak INT DEFAULT 0,
best_win_streak INT DEFAULT 0,
-- Timestamps
first_game_at TIMESTAMPTZ,
last_game_at TIMESTAMPTZ,
updated_at TIMESTAMPTZ DEFAULT NOW()
);
-- Stats processing queue (for background worker)
CREATE TABLE stats_queue (
id BIGSERIAL PRIMARY KEY,
game_id UUID NOT NULL,
status VARCHAR(20) DEFAULT 'pending', -- pending, processing, completed, failed
created_at TIMESTAMPTZ DEFAULT NOW(),
processed_at TIMESTAMPTZ,
error_message TEXT
);
-- Leaderboard cache (refreshed periodically)
CREATE MATERIALIZED VIEW leaderboard_overall AS
SELECT
u.id as user_id,
u.username,
s.games_played,
s.games_won,
ROUND(s.games_won::numeric / NULLIF(s.games_played, 0) * 100, 1) as win_rate,
s.rounds_won,
ROUND(s.total_points::numeric / NULLIF(s.rounds_played, 0), 1) as avg_score,
s.best_round_score,
s.knockouts,
s.best_win_streak,
s.last_game_at
FROM player_stats s
JOIN users u ON s.user_id = u.id
WHERE s.games_played >= 5 -- Minimum games for ranking
AND u.deleted_at IS NULL
AND u.is_banned = false;
CREATE UNIQUE INDEX idx_leaderboard_overall_user ON leaderboard_overall(user_id);
CREATE INDEX idx_leaderboard_overall_wins ON leaderboard_overall(games_won DESC);
CREATE INDEX idx_leaderboard_overall_rate ON leaderboard_overall(win_rate DESC);
CREATE INDEX idx_leaderboard_overall_score ON leaderboard_overall(avg_score ASC);
-- Achievements/badges
CREATE TABLE achievements (
id VARCHAR(50) PRIMARY KEY,
name VARCHAR(100) NOT NULL,
description TEXT,
icon VARCHAR(50),
category VARCHAR(50), -- games, rounds, special
threshold INT, -- e.g., 10 for "Win 10 games"
sort_order INT DEFAULT 0
);
CREATE TABLE user_achievements (
user_id UUID REFERENCES users(id),
achievement_id VARCHAR(50) REFERENCES achievements(id),
earned_at TIMESTAMPTZ DEFAULT NOW(),
game_id UUID, -- Game where it was earned (optional)
PRIMARY KEY (user_id, achievement_id)
);
-- Seed achievements
INSERT INTO achievements (id, name, description, icon, category, threshold, sort_order) VALUES
('first_win', 'First Victory', 'Win your first game', '🏆', 'games', 1, 1),
('win_10', 'Rising Star', 'Win 10 games', '', 'games', 10, 2),
('win_50', 'Veteran', 'Win 50 games', '🎖️', 'games', 50, 3),
('win_100', 'Champion', 'Win 100 games', '👑', 'games', 100, 4),
('perfect_round', 'Perfect', 'Score 0 or less in a round', '💎', 'rounds', 1, 10),
('negative_round', 'Below Zero', 'Score negative in a round', '❄️', 'rounds', 1, 11),
('knockout_10', 'Closer', 'Go out first 10 times', '🚪', 'special', 10, 20),
('wolfpack', 'Wolfpack', 'Get all 4 Jacks', '🐺', 'special', 1, 21),
('streak_5', 'Hot Streak', 'Win 5 games in a row', '🔥', 'special', 5, 30),
('streak_10', 'Unstoppable', 'Win 10 games in a row', '', 'special', 10, 31);
-- Indexes
CREATE INDEX idx_stats_queue_pending ON stats_queue(status, created_at)
WHERE status = 'pending';
CREATE INDEX idx_user_achievements_user ON user_achievements(user_id);
```
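One subtlety in the streak columns: PostgreSQL evaluates all `SET` expressions in an `UPDATE` against the *old* row, so `current_win_streak` and `best_win_streak` can safely reference each other in the stats update later in this document. A hypothetical Python mirror of that bookkeeping, useful as a reference for tests:

```python
# Mirror of the streak bookkeeping: both values are computed from the OLD
# row, matching PostgreSQL's UPDATE semantics.
def update_streaks(current: int, best: int, won: bool) -> tuple[int, int]:
    new_current = current + 1 if won else 0
    new_best = max(best, new_current)
    return new_current, new_best

assert update_streaks(4, 4, True) == (5, 5)   # streak extends and sets a new best
assert update_streaks(5, 5, False) == (0, 5)  # a loss resets current, best survives
```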
---
## Stats Service
```python
# server/services/stats_service.py
from dataclasses import dataclass
from typing import Optional, List
from datetime import datetime
import asyncpg
from stores.event_store import EventStore
from models.events import EventType
@dataclass
class PlayerStats:
user_id: str
username: str
games_played: int
games_won: int
win_rate: float
rounds_played: int
rounds_won: int
avg_score: float
best_round_score: Optional[int]
knockouts: int
best_win_streak: int
achievements: List[str]
@dataclass
class LeaderboardEntry:
rank: int
user_id: str
username: str
value: float # The metric being ranked by
games_played: int
secondary_value: Optional[float] = None
class StatsService:
"""Player statistics and leaderboards."""
def __init__(self, db_pool: asyncpg.Pool, event_store: EventStore):
self.db = db_pool
self.event_store = event_store
# --- Stats Queries ---
async def get_player_stats(self, user_id: str) -> Optional[PlayerStats]:
"""Get stats for a specific player."""
async with self.db.acquire() as conn:
row = await conn.fetchrow("""
SELECT s.*, u.username,
ROUND(s.games_won::numeric / NULLIF(s.games_played, 0) * 100, 1) as win_rate,
ROUND(s.total_points::numeric / NULLIF(s.rounds_played, 0), 1) as avg_score
FROM player_stats s
JOIN users u ON s.user_id = u.id
WHERE s.user_id = $1
""", user_id)
if not row:
return None
# Get achievements
achievements = await conn.fetch("""
SELECT achievement_id FROM user_achievements
WHERE user_id = $1
""", user_id)
return PlayerStats(
user_id=row["user_id"],
username=row["username"],
games_played=row["games_played"],
games_won=row["games_won"],
win_rate=float(row["win_rate"] or 0),
rounds_played=row["rounds_played"],
rounds_won=row["rounds_won"],
avg_score=float(row["avg_score"] or 0),
best_round_score=row["best_round_score"],
knockouts=row["knockouts"],
best_win_streak=row["best_win_streak"],
achievements=[a["achievement_id"] for a in achievements],
)
async def get_leaderboard(
self,
metric: str = "wins",
limit: int = 50,
offset: int = 0,
) -> List[LeaderboardEntry]:
"""
Get leaderboard by metric.
Metrics: wins, win_rate, avg_score, knockouts, streak
"""
order_map = {
"wins": ("games_won", "DESC"),
"win_rate": ("win_rate", "DESC"),
"avg_score": ("avg_score", "ASC"), # Lower is better
"knockouts": ("knockouts", "DESC"),
"streak": ("best_win_streak", "DESC"),
}
if metric not in order_map:
metric = "wins"
column, direction = order_map[metric]
async with self.db.acquire() as conn:
            # Use the materialized view for performance; column/direction come
            # from the whitelist above, so f-string interpolation is SQL-safe here
            rows = await conn.fetch(f"""
SELECT
user_id, username, games_played, games_won,
win_rate, avg_score, knockouts, best_win_streak,
ROW_NUMBER() OVER (ORDER BY {column} {direction}) as rank
FROM leaderboard_overall
ORDER BY {column} {direction}
LIMIT $1 OFFSET $2
""", limit, offset)
return [
LeaderboardEntry(
rank=row["rank"],
user_id=row["user_id"],
username=row["username"],
value=float(row[column] or 0),
games_played=row["games_played"],
secondary_value=float(row["win_rate"] or 0) if metric != "win_rate" else None,
)
for row in rows
]
async def get_player_rank(self, user_id: str, metric: str = "wins") -> Optional[int]:
"""Get a player's rank on a leaderboard."""
order_map = {
"wins": ("games_won", "DESC"),
"win_rate": ("win_rate", "DESC"),
"avg_score": ("avg_score", "ASC"),
}
if metric not in order_map:
return None
column, direction = order_map[metric]
async with self.db.acquire() as conn:
row = await conn.fetchrow(f"""
SELECT rank FROM (
SELECT user_id, ROW_NUMBER() OVER (ORDER BY {column} {direction}) as rank
FROM leaderboard_overall
) ranked
WHERE user_id = $1
""", user_id)
return row["rank"] if row else None
async def refresh_leaderboard(self) -> None:
"""Refresh the materialized view."""
async with self.db.acquire() as conn:
await conn.execute("REFRESH MATERIALIZED VIEW CONCURRENTLY leaderboard_overall")
# --- Achievement Queries ---
async def get_achievements(self) -> List[dict]:
"""Get all available achievements."""
async with self.db.acquire() as conn:
rows = await conn.fetch("""
SELECT id, name, description, icon, category, threshold
FROM achievements
ORDER BY sort_order
""")
return [dict(row) for row in rows]
async def get_user_achievements(self, user_id: str) -> List[dict]:
"""Get achievements earned by a user."""
async with self.db.acquire() as conn:
rows = await conn.fetch("""
SELECT a.id, a.name, a.description, a.icon, ua.earned_at
FROM user_achievements ua
JOIN achievements a ON ua.achievement_id = a.id
WHERE ua.user_id = $1
ORDER BY ua.earned_at DESC
""", user_id)
return [dict(row) for row in rows]
# --- Stats Processing ---
async def process_game_end(self, game_id: str) -> None:
"""
Process a completed game and update player stats.
Called by background worker or directly after game ends.
"""
# Get game events
events = await self.event_store.get_events(game_id)
if not events:
return
# Extract game data from events
game_data = self._extract_game_data(events)
if not game_data:
return
async with self.db.acquire() as conn:
async with conn.transaction():
for player_id, player_data in game_data["players"].items():
# Skip CPU players (they don't have user accounts)
if player_data.get("is_cpu"):
continue
# Ensure stats row exists
await conn.execute("""
INSERT INTO player_stats (user_id)
VALUES ($1)
ON CONFLICT (user_id) DO NOTHING
""", player_id)
# Update stats
is_winner = player_id == game_data["winner_id"]
total_score = player_data["total_score"]
rounds_won = player_data["rounds_won"]
await conn.execute("""
UPDATE player_stats SET
games_played = games_played + 1,
games_won = games_won + $2,
rounds_played = rounds_played + $3,
rounds_won = rounds_won + $4,
total_points = total_points + $5,
knockouts = knockouts + $6,
best_round_score = LEAST(best_round_score, $7),
worst_round_score = GREATEST(worst_round_score, $8),
best_game_score = LEAST(best_game_score, $5),
current_win_streak = CASE WHEN $2 = 1 THEN current_win_streak + 1 ELSE 0 END,
best_win_streak = GREATEST(best_win_streak,
CASE WHEN $2 = 1 THEN current_win_streak + 1 ELSE best_win_streak END),
first_game_at = COALESCE(first_game_at, NOW()),
last_game_at = NOW(),
updated_at = NOW()
WHERE user_id = $1
""",
player_id,
1 if is_winner else 0,
game_data["num_rounds"],
rounds_won,
total_score,
player_data.get("knockouts", 0),
player_data.get("best_round", total_score),
player_data.get("worst_round", total_score),
)
# Check for new achievements
await self._check_achievements(conn, player_id, game_id, player_data, is_winner)
def _extract_game_data(self, events) -> Optional[dict]:
"""Extract game data from events."""
data = {
"players": {},
"num_rounds": 0,
"winner_id": None,
}
for event in events:
if event.event_type == EventType.PLAYER_JOINED:
data["players"][event.player_id] = {
"is_cpu": event.data.get("is_cpu", False),
"total_score": 0,
"rounds_won": 0,
"knockouts": 0,
"best_round": None,
"worst_round": None,
}
elif event.event_type == EventType.ROUND_ENDED:
data["num_rounds"] += 1
scores = event.data.get("scores", {})
winner_id = event.data.get("winner_id")
for player_id, score in scores.items():
if player_id in data["players"]:
p = data["players"][player_id]
p["total_score"] += score
if p["best_round"] is None or score < p["best_round"]:
p["best_round"] = score
if p["worst_round"] is None or score > p["worst_round"]:
p["worst_round"] = score
if player_id == winner_id:
p["rounds_won"] += 1
# Track who went out first (finisher)
# This would need to be tracked in events
elif event.event_type == EventType.GAME_ENDED:
data["winner_id"] = event.data.get("winner_id")
return data if data["num_rounds"] > 0 else None
async def _check_achievements(
self,
conn: asyncpg.Connection,
user_id: str,
game_id: str,
player_data: dict,
is_winner: bool,
) -> List[str]:
"""Check and award new achievements."""
new_achievements = []
# Get current stats
stats = await conn.fetchrow("""
SELECT games_won, knockouts, best_win_streak, current_win_streak
FROM player_stats
WHERE user_id = $1
""", user_id)
if not stats:
return []
# Get already earned achievements
earned = await conn.fetch("""
SELECT achievement_id FROM user_achievements WHERE user_id = $1
""", user_id)
earned_ids = {e["achievement_id"] for e in earned}
# Check win milestones
wins = stats["games_won"]
if wins >= 1 and "first_win" not in earned_ids:
new_achievements.append("first_win")
if wins >= 10 and "win_10" not in earned_ids:
new_achievements.append("win_10")
if wins >= 50 and "win_50" not in earned_ids:
new_achievements.append("win_50")
if wins >= 100 and "win_100" not in earned_ids:
new_achievements.append("win_100")
# Check streak achievements
streak = stats["current_win_streak"]
if streak >= 5 and "streak_5" not in earned_ids:
new_achievements.append("streak_5")
if streak >= 10 and "streak_10" not in earned_ids:
new_achievements.append("streak_10")
# Check knockout achievements
if stats["knockouts"] >= 10 and "knockout_10" not in earned_ids:
new_achievements.append("knockout_10")
# Check round-specific achievements
if player_data.get("best_round") is not None:
if player_data["best_round"] <= 0 and "perfect_round" not in earned_ids:
new_achievements.append("perfect_round")
if player_data["best_round"] < 0 and "negative_round" not in earned_ids:
new_achievements.append("negative_round")
# Award new achievements
for achievement_id in new_achievements:
await conn.execute("""
INSERT INTO user_achievements (user_id, achievement_id, game_id)
VALUES ($1, $2, $3)
ON CONFLICT DO NOTHING
""", user_id, achievement_id, game_id)
return new_achievements
```
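The event-folding pass in `_extract_game_data` is easy to exercise in isolation. A minimal standalone sketch using plain-dict events (the flattened event shape here is a simplification of the `GameEvent` objects used above):

```python
def extract_game_data(events):
    """Fold round results into per-player totals (simplified sketch)."""
    data = {"players": {}, "num_rounds": 0, "winner_id": None}
    for event in events:
        if event["type"] == "player_joined":
            data["players"][event["player_id"]] = {
                "total_score": 0, "rounds_won": 0,
                "best_round": None, "worst_round": None,
            }
        elif event["type"] == "round_ended":
            data["num_rounds"] += 1
            for pid, score in event["scores"].items():
                p = data["players"][pid]
                p["total_score"] += score
                if p["best_round"] is None or score < p["best_round"]:
                    p["best_round"] = score
                if p["worst_round"] is None or score > p["worst_round"]:
                    p["worst_round"] = score
                if pid == event.get("winner_id"):
                    p["rounds_won"] += 1
        elif event["type"] == "game_ended":
            data["winner_id"] = event["winner_id"]
    return data if data["num_rounds"] > 0 else None

events = [
    {"type": "player_joined", "player_id": "a"},
    {"type": "player_joined", "player_id": "b"},
    {"type": "round_ended", "scores": {"a": 4, "b": 9}, "winner_id": "a"},
    {"type": "round_ended", "scores": {"a": 12, "b": -2}, "winner_id": "b"},
    {"type": "game_ended", "winner_id": "a"},
]
result = extract_game_data(events)
```

Here player "a" ends with a total of 16 across two rounds (best 4, worst 12) and takes the game, which is exactly the shape the stats UPDATE consumes.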
---
## Background Worker
```python
# server/workers/stats_worker.py
import os

import asyncpg
from arq import cron
from arq.connections import RedisSettings

from services.stats_service import StatsService
from stores.event_store import EventStore

DATABASE_URL = os.environ["DATABASE_URL"]
async def process_stats_queue(ctx):
"""Process pending games in the stats queue."""
db: asyncpg.Pool = ctx["db_pool"]
stats_service: StatsService = ctx["stats_service"]
async with db.acquire() as conn:
# Get pending games
games = await conn.fetch("""
SELECT id, game_id FROM stats_queue
WHERE status = 'pending'
ORDER BY created_at
LIMIT 100
""")
for game in games:
try:
# Mark as processing
await conn.execute("""
UPDATE stats_queue SET status = 'processing' WHERE id = $1
""", game["id"])
# Process
await stats_service.process_game_end(game["game_id"])
# Mark complete
await conn.execute("""
UPDATE stats_queue
SET status = 'completed', processed_at = NOW()
WHERE id = $1
""", game["id"])
except Exception as e:
# Mark failed
await conn.execute("""
UPDATE stats_queue
SET status = 'failed', error_message = $2
WHERE id = $1
""", game["id"], str(e))
async def refresh_leaderboard(ctx):
"""Refresh the materialized leaderboard view."""
stats_service: StatsService = ctx["stats_service"]
await stats_service.refresh_leaderboard()
async def cleanup_old_queue_entries(ctx):
"""Clean up old processed queue entries."""
db: asyncpg.Pool = ctx["db_pool"]
async with db.acquire() as conn:
await conn.execute("""
DELETE FROM stats_queue
WHERE status IN ('completed', 'failed')
AND processed_at < NOW() - INTERVAL '7 days'
""")
class WorkerSettings:
"""arq worker settings."""
functions = [
process_stats_queue,
refresh_leaderboard,
cleanup_old_queue_entries,
]
cron_jobs = [
        # Process queue every 15 minutes
cron(process_stats_queue, minute={0, 15, 30, 45}),
# Refresh leaderboard every 5 minutes
        cron(refresh_leaderboard, minute=set(range(0, 60, 5))),
# Cleanup daily
cron(cleanup_old_queue_entries, hour=3, minute=0),
]
redis_settings = RedisSettings()
@staticmethod
async def on_startup(ctx):
"""Initialize worker context."""
ctx["db_pool"] = await asyncpg.create_pool(DATABASE_URL)
ctx["event_store"] = EventStore(ctx["db_pool"])
ctx["stats_service"] = StatsService(ctx["db_pool"], ctx["event_store"])
@staticmethod
async def on_shutdown(ctx):
"""Cleanup worker context."""
await ctx["db_pool"].close()
```
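The acceptance criteria below call for retrying failed jobs, but the worker above only marks them `failed`. One way to sketch the retry policy is a pure decision helper that a periodic sweep could apply to each queue row; the `attempts` column and the terminal `dead` status are assumptions, not part of the schema shown earlier:

```python
def next_queue_status(status: str, attempts: int, max_attempts: int = 3) -> str:
    """Decide what a sweep does with a stats_queue row (sketch).

    Assumes a hypothetical `attempts` counter on the row and a `dead`
    status for jobs that exhausted their retries.
    """
    if status == "failed" and attempts < max_attempts:
        return "pending"  # re-queue for another try
    if status == "failed":
        return "dead"     # exhausted: park for manual inspection
    return status         # pending/processing/completed are left alone
```

A cron job similar to `cleanup_old_queue_entries` could then `UPDATE` failed rows back to `pending` while incrementing `attempts`.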
---
## API Endpoints
```python
# server/routers/stats.py
from typing import Optional

from fastapi import APIRouter, Depends, HTTPException, Query

from services.stats_service import StatsService
router = APIRouter(prefix="/api/stats", tags=["stats"])
@router.get("/leaderboard")
async def get_leaderboard(
metric: str = Query("wins", regex="^(wins|win_rate|avg_score|knockouts|streak)$"),
limit: int = Query(50, ge=1, le=100),
offset: int = Query(0, ge=0),
service: StatsService = Depends(get_stats_service),
):
"""Get leaderboard by metric."""
entries = await service.get_leaderboard(metric, limit, offset)
return {
"metric": metric,
"entries": [
{
"rank": e.rank,
"user_id": e.user_id,
"username": e.username,
"value": e.value,
"games_played": e.games_played,
}
for e in entries
],
}
@router.get("/players/{user_id}")
async def get_player_stats(
user_id: str,
service: StatsService = Depends(get_stats_service),
):
"""Get stats for a specific player."""
stats = await service.get_player_stats(user_id)
if not stats:
raise HTTPException(status_code=404, detail="Player not found")
return {
"user_id": stats.user_id,
"username": stats.username,
"games_played": stats.games_played,
"games_won": stats.games_won,
"win_rate": stats.win_rate,
"rounds_played": stats.rounds_played,
"rounds_won": stats.rounds_won,
"avg_score": stats.avg_score,
"best_round_score": stats.best_round_score,
"knockouts": stats.knockouts,
"best_win_streak": stats.best_win_streak,
"achievements": stats.achievements,
}
@router.get("/players/{user_id}/rank")
async def get_player_rank(
user_id: str,
metric: str = "wins",
service: StatsService = Depends(get_stats_service),
):
"""Get player's rank on a leaderboard."""
rank = await service.get_player_rank(user_id, metric)
return {"user_id": user_id, "metric": metric, "rank": rank}
@router.get("/me")
async def get_my_stats(
user: User = Depends(get_current_user),
service: StatsService = Depends(get_stats_service),
):
"""Get current user's stats."""
stats = await service.get_player_stats(user.id)
if not stats:
return {
"games_played": 0,
"games_won": 0,
"achievements": [],
}
return stats.__dict__
@router.get("/achievements")
async def get_achievements(
service: StatsService = Depends(get_stats_service),
):
"""Get all available achievements."""
return {"achievements": await service.get_achievements()}
@router.get("/players/{user_id}/achievements")
async def get_user_achievements(
user_id: str,
service: StatsService = Depends(get_stats_service),
):
"""Get achievements earned by a player."""
return {"achievements": await service.get_user_achievements(user_id)}
```
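The `get_stats_service` dependency used above is not defined in this plan. A common pattern (assumed here, not prescribed) is to build the service once at startup, stash it on `app.state`, and resolve it per request; with FastAPI the parameter would be typed `request: Request`, and the stand-in below just shows the shape:

```python
from types import SimpleNamespace

def get_stats_service(request):
    """FastAPI-style dependency: pull the shared StatsService off app.state."""
    return request.app.state.stats_service

# Stand-in for a real Request so the wiring can be exercised without FastAPI.
fake_request = SimpleNamespace(
    app=SimpleNamespace(state=SimpleNamespace(stats_service="service-instance"))
)
resolved = get_stats_service(fake_request)
```

At startup the app would do `app.state.stats_service = StatsService(db_pool, event_store)`, mirroring the worker's `on_startup`.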
---
## Frontend Integration
```javascript
// client/components/leaderboard.js
class LeaderboardComponent {
constructor(container) {
this.container = container;
this.metric = 'wins';
this.render();
}
async fetchLeaderboard() {
const response = await fetch(`/api/stats/leaderboard?metric=${this.metric}&limit=50`);
return response.json();
}
async render() {
const data = await this.fetchLeaderboard();
this.container.innerHTML = `
<div class="leaderboard">
<div class="leaderboard-tabs">
<button class="tab ${this.metric === 'wins' ? 'active' : ''}" data-metric="wins">Wins</button>
<button class="tab ${this.metric === 'win_rate' ? 'active' : ''}" data-metric="win_rate">Win Rate</button>
<button class="tab ${this.metric === 'avg_score' ? 'active' : ''}" data-metric="avg_score">Avg Score</button>
</div>
<table class="leaderboard-table">
<thead>
<tr>
<th>#</th>
<th>Player</th>
<th>${this.getMetricLabel()}</th>
<th>Games</th>
</tr>
</thead>
<tbody>
${data.entries.map(e => `
<tr>
<td class="rank">${this.getRankBadge(e.rank)}</td>
                            <td class="username">${String(e.username).replace(/[&<>"']/g, c => ({'&':'&amp;','<':'&lt;','>':'&gt;','"':'&quot;',"'":'&#39;'}[c]))}</td>
<td class="value">${this.formatValue(e.value)}</td>
<td class="games">${e.games_played}</td>
</tr>
`).join('')}
</tbody>
</table>
</div>
`;
// Bind tab clicks
this.container.querySelectorAll('.tab').forEach(tab => {
tab.addEventListener('click', () => {
this.metric = tab.dataset.metric;
this.render();
});
});
}
getMetricLabel() {
const labels = {
wins: 'Wins',
win_rate: 'Win %',
avg_score: 'Avg Score',
};
return labels[this.metric] || this.metric;
}
formatValue(value) {
if (this.metric === 'win_rate') return `${value}%`;
if (this.metric === 'avg_score') return value.toFixed(1);
return value;
}
getRankBadge(rank) {
if (rank === 1) return '🥇';
if (rank === 2) return '🥈';
if (rank === 3) return '🥉';
return rank;
}
}
```
---
## Acceptance Criteria
1. **Stats Aggregation**
- [ ] Stats calculated from game events
- [ ] Games played/won tracked
- [ ] Rounds played/won tracked
- [ ] Best/worst scores tracked
- [ ] Win streaks tracked
- [ ] Knockouts tracked
2. **Leaderboards**
- [ ] Leaderboard by wins
- [ ] Leaderboard by win rate
- [ ] Leaderboard by average score
- [ ] Minimum games requirement
- [ ] Pagination working
- [ ] Materialized view refreshes
3. **Background Worker**
- [ ] Queue processing works
- [ ] Failed jobs retried
- [ ] Leaderboard auto-refreshes
- [ ] Old entries cleaned up
4. **Achievements**
- [ ] Achievement definitions in DB
- [ ] Achievements awarded correctly
- [ ] Achievement progress tracked
- [ ] Achievement UI displays
5. **API**
- [ ] GET /leaderboard works
- [ ] GET /players/{id} works
- [ ] GET /me works
- [ ] GET /achievements works
6. **UI**
- [ ] Leaderboard displays
- [ ] Tabs switch metrics
- [ ] Player profiles show stats
- [ ] Achievements display
---
## Implementation Order
1. Create database migrations
2. Implement stats processing logic
3. Add stats queue integration
4. Set up background worker
5. Implement leaderboard queries
6. Create API endpoints
7. Build leaderboard UI
8. Add achievements system
9. Test full flow
---
## Notes
- Materialized views are great for leaderboards but need periodic refresh
- Consider caching hot leaderboard data in Redis
- Achievement checking should be efficient (batch checks)
- Stats processing is async - don't block game completion
- Consider separate "vs humans only" stats in future
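The Redis-caching note above can be sketched as a read-through cache keyed by metric and page size. This is an illustration only: the key shape and 60-second TTL are assumptions, and an in-process dict stands in for Redis so the logic is self-contained:

```python
import json
import time

_cache: dict = {}   # key -> (expires_at, payload); stand-in for Redis SETEX/GET
TTL_SECONDS = 60    # assumption: leaderboards may be up to a minute stale

def cached_leaderboard(metric: str, limit: int, fetch) -> list:
    """Return a cached leaderboard page if fresh, else fetch and cache it."""
    key = f"lb:{metric}:{limit}"
    now = time.monotonic()
    hit = _cache.get(key)
    if hit and hit[0] > now:
        return json.loads(hit[1])          # cache hit: decode stored payload
    rows = fetch(metric, limit)            # cache miss: hit the database
    _cache[key] = (now + TTL_SECONDS, json.dumps(rows))
    return rows

calls = []
def fake_fetch(metric, limit):
    calls.append(metric)
    return [{"rank": 1, "user": "alice"}]

first = cached_leaderboard("wins", 50, fake_fetch)
second = cached_leaderboard("wins", 50, fake_fetch)  # served from cache
```

With real Redis the dict becomes `SETEX key 60 payload` and a `GET`, and the materialized-view refresh can simply invalidate the keys it touches.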

---
# V2_06: Game Replay & Export System
> **Scope**: Replay viewer, game export/import, share links, spectator mode
> **Dependencies**: V2_01 (Event Sourcing), V2_02 (Persistence), V2_03 (User Accounts)
> **Complexity**: Medium
---
## Overview
The replay system leverages our event-sourced architecture to provide:
- **Replay Viewer**: Step through any completed game move-by-move
- **Export/Import**: Download games as JSON, share with others
- **Share Links**: Generate public links to specific games
- **Spectator Mode**: Watch live games in progress
---
## 1. Database Schema
### Shared Games Table
```sql
-- Public share links for completed games
CREATE TABLE shared_games (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
game_id UUID NOT NULL REFERENCES games(id),
share_code VARCHAR(12) UNIQUE NOT NULL, -- Short shareable code
created_by UUID REFERENCES users(id),
created_at TIMESTAMPTZ DEFAULT NOW(),
expires_at TIMESTAMPTZ, -- NULL = never expires
view_count INTEGER DEFAULT 0,
is_public BOOLEAN DEFAULT true,
title VARCHAR(100), -- Optional custom title
description TEXT -- Optional description
);
CREATE INDEX idx_shared_games_code ON shared_games(share_code);
CREATE INDEX idx_shared_games_game ON shared_games(game_id);
-- Track replay views for analytics
CREATE TABLE replay_views (
id SERIAL PRIMARY KEY,
shared_game_id UUID REFERENCES shared_games(id),
viewer_id UUID REFERENCES users(id), -- NULL for anonymous
viewed_at TIMESTAMPTZ DEFAULT NOW(),
ip_hash VARCHAR(64), -- Hashed IP for rate limiting
watch_duration_seconds INTEGER
);
```
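The `ip_hash` column is described as a hashed IP for rate limiting; a minimal sketch of producing it (the salt value is an assumption and would live in server config, never in the table):

```python
import hashlib

IP_HASH_SALT = "replace-with-secret-from-config"  # assumption: loaded from env

def hash_ip(ip: str, salt: str = IP_HASH_SALT) -> str:
    """Salted SHA-256: a 64-char hex digest matching the VARCHAR(64) column."""
    return hashlib.sha256(f"{salt}:{ip}".encode()).hexdigest()
```

Salting keeps the column useful for counting repeat viewers without storing addresses that could be reversed by brute-forcing the IPv4 space.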
---
## 2. Replay Service
### Core Implementation
```python
# server/replay.py
import secrets
import uuid
from dataclasses import asdict, dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

from server.events import EventStore, GameEvent
from server.game import Game, GameOptions
@dataclass
class ReplayFrame:
"""Single frame in a replay."""
event_index: int
event: GameEvent
game_state: dict # Serialized game state after event
timestamp: float
@dataclass
class GameReplay:
"""Complete replay of a game."""
game_id: str
frames: list[ReplayFrame]
total_duration_seconds: float
player_names: list[str]
final_scores: dict[str, int]
winner: Optional[str]
options: GameOptions
class ReplayService:
def __init__(self, event_store: EventStore, db_pool):
self.event_store = event_store
self.db = db_pool
async def build_replay(self, game_id: str) -> GameReplay:
"""Build complete replay from event store."""
events = await self.event_store.get_events(game_id)
if not events:
raise ValueError(f"No events found for game {game_id}")
frames = []
game = None
start_time = None
for i, event in enumerate(events):
if start_time is None:
start_time = event.timestamp
# Apply event to get state
if event.event_type == "game_started":
game = Game.from_event(event)
else:
game.apply_event(event)
frames.append(ReplayFrame(
event_index=i,
event=event,
game_state=game.to_dict(reveal_all=True),
timestamp=(event.timestamp - start_time).total_seconds()
))
return GameReplay(
game_id=game_id,
frames=frames,
total_duration_seconds=frames[-1].timestamp if frames else 0,
player_names=[p.name for p in game.players],
final_scores={p.name: p.score for p in game.players},
winner=game.winner.name if game.winner else None,
options=game.options
)
async def create_share_link(
self,
game_id: str,
user_id: Optional[str] = None,
title: Optional[str] = None,
expires_days: Optional[int] = None
) -> str:
"""Generate shareable link for a game."""
        share_code = secrets.token_urlsafe(9)  # 9 bytes -> exactly 12 URL-safe chars
        expires_at = None
        if expires_days:
            expires_at = datetime.now(timezone.utc) + timedelta(days=expires_days)
async with self.db.acquire() as conn:
await conn.execute("""
INSERT INTO shared_games
(game_id, share_code, created_by, title, expires_at)
VALUES ($1, $2, $3, $4, $5)
""", game_id, share_code, user_id, title, expires_at)
return share_code
async def get_shared_game(self, share_code: str) -> Optional[dict]:
"""Retrieve shared game by code."""
async with self.db.acquire() as conn:
row = await conn.fetchrow("""
SELECT sg.*, g.room_code, g.completed_at
FROM shared_games sg
JOIN games g ON sg.game_id = g.id
WHERE sg.share_code = $1
AND sg.is_public = true
AND (sg.expires_at IS NULL OR sg.expires_at > NOW())
""", share_code)
if row:
# Increment view count
await conn.execute("""
UPDATE shared_games SET view_count = view_count + 1
WHERE share_code = $1
""", share_code)
return dict(row)
return None
async def export_game(self, game_id: str) -> dict:
"""Export game as portable JSON format."""
replay = await self.build_replay(game_id)
return {
"version": "1.0",
            "exported_at": datetime.now(timezone.utc).isoformat(),
"game": {
"id": replay.game_id,
"players": replay.player_names,
"winner": replay.winner,
"final_scores": replay.final_scores,
"duration_seconds": replay.total_duration_seconds,
"options": asdict(replay.options)
},
"events": [
{
"type": f.event.event_type,
"data": f.event.data,
                    "timestamp": f.event.timestamp.isoformat()  # absolute, so import_game can parse it back
}
for f in replay.frames
]
}
async def import_game(self, export_data: dict, user_id: str) -> str:
"""Import a game from exported JSON."""
if export_data.get("version") != "1.0":
raise ValueError("Unsupported export version")
# Generate new game ID for import
new_game_id = str(uuid.uuid4())
# Store events with new game ID
for event_data in export_data["events"]:
event = GameEvent(
game_id=new_game_id,
event_type=event_data["type"],
data=event_data["data"],
timestamp=datetime.fromisoformat(event_data["timestamp"])
)
await self.event_store.append(event)
# Mark as imported game
async with self.db.acquire() as conn:
await conn.execute("""
INSERT INTO games (id, imported_by, imported_at, is_imported)
VALUES ($1, $2, NOW(), true)
""", new_game_id, user_id)
return new_game_id
```
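Getting the share-code length right is fiddly: `secrets.token_urlsafe(n)` emits roughly 1.3 characters per byte, so 8 bytes yields only 11 characters while the schema (`VARCHAR(12)`) and the validation tests expect 12. A small helper pins the length exactly:

```python
import secrets

def new_share_code(length: int = 12) -> str:
    """Generate a URL-safe share code of exactly `length` characters."""
    # token_urlsafe(n) returns about 4n/3 chars; request enough bytes, then trim.
    n_bytes = -(-length * 3 // 4)  # ceil(length * 3 / 4)
    return secrets.token_urlsafe(n_bytes)[:length]

code = new_share_code()
```

For the default of 12, this requests 9 bytes, which base64url-encodes to exactly 12 characters.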
---
## 3. Spectator Mode
### Live Game Watching
```python
# server/spectator.py
from typing import Set
from fastapi import WebSocket
class SpectatorManager:
"""Manage spectators watching live games."""
def __init__(self):
# game_id -> set of spectator websockets
self.spectators: dict[str, Set[WebSocket]] = {}
async def add_spectator(self, game_id: str, ws: WebSocket):
"""Add spectator to game."""
if game_id not in self.spectators:
self.spectators[game_id] = set()
self.spectators[game_id].add(ws)
# Send current game state
game = await self.get_game_state(game_id)
await ws.send_json({
"type": "spectator_joined",
"game": game.to_dict(reveal_all=False),
"spectator_count": len(self.spectators[game_id])
})
async def remove_spectator(self, game_id: str, ws: WebSocket):
"""Remove spectator from game."""
if game_id in self.spectators:
self.spectators[game_id].discard(ws)
if not self.spectators[game_id]:
del self.spectators[game_id]
async def broadcast_to_spectators(self, game_id: str, message: dict):
"""Send update to all spectators of a game."""
if game_id not in self.spectators:
return
dead_connections = set()
for ws in self.spectators[game_id]:
try:
await ws.send_json(message)
            except Exception:
dead_connections.add(ws)
# Clean up dead connections
self.spectators[game_id] -= dead_connections
def get_spectator_count(self, game_id: str) -> int:
return len(self.spectators.get(game_id, set()))
# Integration with main game loop
async def handle_game_event(game_id: str, event: GameEvent):
"""Called after each game event to notify spectators."""
await spectator_manager.broadcast_to_spectators(game_id, {
"type": "game_update",
"event": event.to_dict(),
"timestamp": event.timestamp.isoformat()
})
```
---
## 4. API Endpoints
```python
# server/routes/replay.py
from typing import Optional

from fastapi import APIRouter, Depends, HTTPException, Query, WebSocket, WebSocketDisconnect
from fastapi.responses import JSONResponse
router = APIRouter(prefix="/api/replay", tags=["replay"])
@router.get("/game/{game_id}")
async def get_replay(game_id: str, user: Optional[User] = Depends(get_current_user)):
"""Get full replay for a game."""
# Check if user has permission (played in game or game is public)
if not await can_view_game(user, game_id):
raise HTTPException(403, "Cannot view this game")
replay = await replay_service.build_replay(game_id)
return {
"game_id": replay.game_id,
"frames": [
{
"index": f.event_index,
                "event_type": f.event.event_type,
                "data": f.event.data,
"timestamp": f.timestamp,
"state": f.game_state
}
for f in replay.frames
],
"metadata": {
"players": replay.player_names,
"winner": replay.winner,
"final_scores": replay.final_scores,
"duration": replay.total_duration_seconds
}
}
@router.post("/game/{game_id}/share")
async def create_share_link(
game_id: str,
title: Optional[str] = None,
expires_days: Optional[int] = Query(None, ge=1, le=365),
user: User = Depends(require_auth)
):
"""Create shareable link for a game."""
if not await user_played_in_game(user.id, game_id):
raise HTTPException(403, "Can only share games you played in")
share_code = await replay_service.create_share_link(
game_id, user.id, title, expires_days
)
return {
"share_code": share_code,
"share_url": f"/replay/{share_code}",
"expires_days": expires_days
}
@router.get("/shared/{share_code}")
async def get_shared_replay(share_code: str):
"""Get replay via share code (public endpoint)."""
shared = await replay_service.get_shared_game(share_code)
if not shared:
raise HTTPException(404, "Shared game not found or expired")
replay = await replay_service.build_replay(shared["game_id"])
return {
"title": shared.get("title"),
"view_count": shared["view_count"],
"replay": replay
}
@router.get("/game/{game_id}/export")
async def export_game(game_id: str, user: User = Depends(require_auth)):
"""Export game as downloadable JSON."""
if not await can_view_game(user, game_id):
raise HTTPException(403, "Cannot export this game")
export_data = await replay_service.export_game(game_id)
return JSONResponse(
content=export_data,
headers={
"Content-Disposition": f'attachment; filename="golf-game-{game_id[:8]}.json"'
}
)
@router.post("/import")
async def import_game(
export_data: dict,
user: User = Depends(require_auth)
):
"""Import a game from JSON export."""
try:
new_game_id = await replay_service.import_game(export_data, user.id)
return {"game_id": new_game_id, "message": "Game imported successfully"}
except ValueError as e:
raise HTTPException(400, str(e))
# Spectator endpoints
@router.websocket("/spectate/{room_code}")
async def spectate_game(websocket: WebSocket, room_code: str):
"""WebSocket endpoint for spectating live games."""
await websocket.accept()
game_id = await get_game_id_by_room(room_code)
if not game_id:
await websocket.close(code=4004, reason="Game not found")
return
try:
await spectator_manager.add_spectator(game_id, websocket)
while True:
# Keep connection alive, handle pings
data = await websocket.receive_text()
if data == "ping":
await websocket.send_text("pong")
except WebSocketDisconnect:
pass
finally:
await spectator_manager.remove_spectator(game_id, websocket)
```
---
## 5. Frontend: Replay Viewer
### Replay Component
```javascript
// client/replay.js
class ReplayViewer {
constructor(container) {
this.container = container;
this.frames = [];
this.currentFrame = 0;
this.isPlaying = false;
this.playbackSpeed = 1.0;
this.playInterval = null;
}
async loadReplay(gameId) {
const response = await fetch(`/api/replay/game/${gameId}`);
const data = await response.json();
this.frames = data.frames;
this.metadata = data.metadata;
this.currentFrame = 0;
this.render();
this.renderControls();
}
async loadSharedReplay(shareCode) {
const response = await fetch(`/api/replay/shared/${shareCode}`);
if (!response.ok) {
this.showError("Replay not found or expired");
return;
}
const data = await response.json();
this.frames = data.replay.frames;
this.metadata = data.replay;
this.title = data.title;
this.currentFrame = 0;
this.render();
}
render() {
if (!this.frames.length) return;
const frame = this.frames[this.currentFrame];
    const state = frame.state ?? frame.game_state;  // shared replays expose the raw frame field
// Render game board at this state
this.renderBoard(state);
// Show event description
this.renderEventInfo(frame);
// Update timeline
this.updateTimeline();
}
renderBoard(state) {
// Similar to main game rendering but read-only
const boardHtml = `
<div class="replay-board">
${state.players.map(p => this.renderPlayerHand(p)).join('')}
<div class="replay-center">
<div class="deck-area">
<div class="card deck-card">
<span class="card-back"></span>
</div>
${state.discard_top ? this.renderCard(state.discard_top) : ''}
</div>
</div>
</div>
`;
this.container.querySelector('.replay-board-container').innerHTML = boardHtml;
}
  renderEventInfo(frame) {
    // Frames from /api/replay/game/{id} are flat; shared replays nest the raw event
    const type = frame.event_type ?? frame.event?.event_type;
    const player = frame.data?.player ?? frame.event?.data?.player ?? 'A player';
    const descriptions = {
      'game_started': 'Game started',
      'card_drawn': `${player} drew a card`,
      'card_discarded': `${player} discarded`,
      'card_swapped': `${player} swapped a card`,
      'turn_ended': `${player}'s turn ended`,
      'round_ended': 'Round ended',
      'game_ended': `Game over! ${this.metadata.winner} wins!`
    };
    const desc = descriptions[type] || type;
    this.container.querySelector('.event-description').textContent = desc;
  }
}
renderControls() {
const controls = `
<div class="replay-controls">
<button class="btn-start" title="Go to start">⏮</button>
<button class="btn-prev" title="Previous">⏪</button>
<button class="btn-play" title="Play/Pause">▶</button>
<button class="btn-next" title="Next">⏩</button>
<button class="btn-end" title="Go to end">⏭</button>
<div class="timeline">
<input type="range" min="0" max="${this.frames.length - 1}"
value="0" class="timeline-slider">
<span class="frame-counter">1 / ${this.frames.length}</span>
</div>
<div class="speed-control">
<label>Speed:</label>
<select class="speed-select">
<option value="0.5">0.5x</option>
<option value="1" selected>1x</option>
<option value="2">2x</option>
<option value="4">4x</option>
</select>
</div>
</div>
`;
this.container.querySelector('.controls-container').innerHTML = controls;
this.bindControlEvents();
}
bindControlEvents() {
this.container.querySelector('.btn-start').onclick = () => this.goToFrame(0);
this.container.querySelector('.btn-end').onclick = () => this.goToFrame(this.frames.length - 1);
this.container.querySelector('.btn-prev').onclick = () => this.prevFrame();
this.container.querySelector('.btn-next').onclick = () => this.nextFrame();
this.container.querySelector('.btn-play').onclick = () => this.togglePlay();
this.container.querySelector('.timeline-slider').oninput = (e) => {
this.goToFrame(parseInt(e.target.value));
};
this.container.querySelector('.speed-select').onchange = (e) => {
this.playbackSpeed = parseFloat(e.target.value);
if (this.isPlaying) {
this.stopPlayback();
this.startPlayback();
}
};
}
goToFrame(index) {
this.currentFrame = Math.max(0, Math.min(index, this.frames.length - 1));
this.render();
}
nextFrame() {
if (this.currentFrame < this.frames.length - 1) {
this.currentFrame++;
this.render();
} else if (this.isPlaying) {
this.togglePlay(); // Stop at end
}
}
prevFrame() {
if (this.currentFrame > 0) {
this.currentFrame--;
this.render();
}
}
togglePlay() {
this.isPlaying = !this.isPlaying;
const btn = this.container.querySelector('.btn-play');
if (this.isPlaying) {
btn.textContent = '⏸';
this.startPlayback();
} else {
btn.textContent = '▶';
this.stopPlayback();
}
}
startPlayback() {
const baseInterval = 1000; // 1 second between frames
this.playInterval = setInterval(() => {
this.nextFrame();
}, baseInterval / this.playbackSpeed);
}
stopPlayback() {
if (this.playInterval) {
clearInterval(this.playInterval);
this.playInterval = null;
}
}
updateTimeline() {
const slider = this.container.querySelector('.timeline-slider');
const counter = this.container.querySelector('.frame-counter');
if (slider) slider.value = this.currentFrame;
if (counter) counter.textContent = `${this.currentFrame + 1} / ${this.frames.length}`;
}
}
```
### Replay Page HTML
```html
<!-- client/replay.html or section in index.html -->
<div id="replay-view" class="view hidden">
<header class="replay-header">
<h2 class="replay-title">Game Replay</h2>
<div class="replay-meta">
<span class="player-names"></span>
<span class="game-duration"></span>
</div>
</header>
<div class="replay-board-container">
<!-- Board renders here -->
</div>
<div class="event-description"></div>
<div class="controls-container">
<!-- Controls render here -->
</div>
<div class="replay-actions">
<button class="btn-share">Share Replay</button>
<button class="btn-export">Export JSON</button>
<button class="btn-back">Back to Menu</button>
</div>
</div>
```
### Replay Styles
```css
/* client/style.css additions */
.replay-controls {
display: flex;
align-items: center;
gap: 1rem;
padding: 1rem;
background: var(--surface-color);
border-radius: 8px;
flex-wrap: wrap;
justify-content: center;
}
.replay-controls button {
width: 40px;
height: 40px;
border-radius: 50%;
border: none;
background: var(--primary-color);
color: white;
cursor: pointer;
font-size: 1.2rem;
}
.replay-controls button:hover {
background: var(--primary-dark);
}
.timeline {
flex: 1;
min-width: 200px;
display: flex;
align-items: center;
gap: 0.5rem;
}
.timeline-slider {
flex: 1;
height: 8px;
-webkit-appearance: none;
background: var(--border-color);
border-radius: 4px;
}
.timeline-slider::-webkit-slider-thumb {
-webkit-appearance: none;
width: 16px;
height: 16px;
background: var(--primary-color);
border-radius: 50%;
cursor: pointer;
}
.frame-counter {
font-family: monospace;
min-width: 80px;
text-align: right;
}
.event-description {
text-align: center;
padding: 1rem;
font-size: 1.1rem;
color: var(--text-secondary);
min-height: 3rem;
}
.speed-control {
display: flex;
align-items: center;
gap: 0.5rem;
}
.speed-select {
padding: 0.25rem 0.5rem;
border-radius: 4px;
}
/* Spectator badge */
.spectator-count {
position: absolute;
top: 10px;
right: 10px;
background: rgba(0,0,0,0.7);
color: white;
padding: 0.5rem 1rem;
border-radius: 20px;
display: flex;
align-items: center;
gap: 0.5rem;
}
.spectator-count::before {
content: '👁';
}
```
---
## 6. Share Dialog
```javascript
// Share modal component
class ShareDialog {
constructor(gameId) {
this.gameId = gameId;
}
async show() {
const modal = document.createElement('div');
modal.className = 'modal share-modal';
modal.innerHTML = `
<div class="modal-content">
<h3>Share This Game</h3>
<div class="share-options">
<label>
<span>Title (optional):</span>
<input type="text" id="share-title" placeholder="Epic comeback win!">
</label>
<label>
<span>Expires in:</span>
<select id="share-expiry">
<option value="">Never</option>
<option value="7">7 days</option>
<option value="30">30 days</option>
<option value="90">90 days</option>
</select>
</label>
</div>
<div class="share-result hidden">
<p>Share this link:</p>
<div class="share-link-container">
<input type="text" id="share-link" readonly>
<button class="btn-copy">Copy</button>
</div>
</div>
<div class="modal-actions">
<button class="btn-generate">Generate Link</button>
<button class="btn-cancel">Cancel</button>
</div>
</div>
`;
document.body.appendChild(modal);
this.bindEvents(modal);
}
async generateLink(modal) {
const title = modal.querySelector('#share-title').value || null;
const expiry = modal.querySelector('#share-expiry').value || null;
const response = await fetch(`/api/replay/game/${this.gameId}/share`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
title,
expires_days: expiry ? parseInt(expiry) : null
})
});
const data = await response.json();
const fullUrl = `${window.location.origin}${data.share_url}`;
modal.querySelector('#share-link').value = fullUrl;
modal.querySelector('.share-result').classList.remove('hidden');
modal.querySelector('.btn-generate').classList.add('hidden');
}
  bindEvents(modal) {
    modal.querySelector('.btn-generate').onclick = () => this.generateLink(modal);
    modal.querySelector('.btn-cancel').onclick = () => modal.remove();
    modal.querySelector('.btn-copy').onclick = () => {
      navigator.clipboard.writeText(modal.querySelector('#share-link').value);
    };
  }
}
```
---
## 7. Integration Points
### Game End Integration
```python
# In main.py after game ends
async def on_game_end(game: Game):
# Store final game state
await event_store.append(GameEvent(
game_id=game.id,
event_type="game_ended",
data={
"winner": game.winner.id,
"final_scores": {p.id: p.score for p in game.players},
"duration": game.duration_seconds
}
))
# Notify spectators
await spectator_manager.broadcast_to_spectators(game.id, {
"type": "game_ended",
"winner": game.winner.name,
"final_scores": {p.name: p.score for p in game.players}
})
```
### Navigation Links
```javascript
// Add to game history/profile
function renderGameHistory(games) {
return games.map(game => `
<div class="history-item">
<span class="game-date">${formatDate(game.played_at)}</span>
<span class="game-result">${game.won ? 'Won' : 'Lost'}</span>
<span class="game-score">${game.score} pts</span>
<a href="/replay/${game.id}" class="btn-replay">Watch Replay</a>
</div>
`).join('');
}
```
---
## 8. Validation Tests
```python
# tests/test_replay.py
async def test_build_replay():
"""Verify replay correctly reconstructs game states."""
# Create game with known moves
game_id = await create_test_game()
replay = await replay_service.build_replay(game_id)
assert len(replay.frames) > 0
assert replay.game_id == game_id
assert replay.winner is not None
# Verify each frame has valid state
for frame in replay.frames:
assert frame.game_state is not None
assert 'players' in frame.game_state
async def test_share_link_creation():
"""Test creating and accessing share links."""
game_id = await create_completed_game()
user_id = "test-user"
share_code = await replay_service.create_share_link(game_id, user_id)
assert len(share_code) == 12
# Retrieve via share code
shared = await replay_service.get_shared_game(share_code)
assert shared is not None
assert shared["game_id"] == game_id
async def test_share_link_expiry():
"""Verify expired links return None."""
game_id = await create_completed_game()
# Create link that expires in -1 days (already expired)
share_code = await create_expired_share(game_id)
shared = await replay_service.get_shared_game(share_code)
assert shared is None
async def test_export_import_roundtrip():
"""Test game can be exported and reimported."""
original_game_id = await create_completed_game()
export_data = await replay_service.export_game(original_game_id)
assert export_data["version"] == "1.0"
assert len(export_data["events"]) > 0
# Import as new game
new_game_id = await replay_service.import_game(export_data, "importer-user")
# Verify imported game matches
original_replay = await replay_service.build_replay(original_game_id)
imported_replay = await replay_service.build_replay(new_game_id)
assert len(original_replay.frames) == len(imported_replay.frames)
assert original_replay.final_scores == imported_replay.final_scores
async def test_spectator_connection():
"""Test spectator can join and receive updates."""
game_id = await create_active_game()
async with websocket_client(f"/api/replay/spectate/{game_id}") as ws:
# Should receive initial state
msg = await ws.receive_json()
assert msg["type"] == "spectator_joined"
assert "game" in msg
# Simulate game event
await trigger_game_event(game_id)
# Should receive update
update = await ws.receive_json()
assert update["type"] == "game_update"
```
---
## 9. Security Considerations
1. **Access Control**: Users can only view replays of games they played in, unless shared
2. **Rate Limiting**: Limit share link creation to prevent abuse
3. **Expired Links**: Clean up expired share links via background job
4. **Import Validation**: Validate imported JSON structure to prevent injection
5. **Spectator Limits**: Cap spectators per game to prevent resource exhaustion
---
## Summary
This document provides a complete replay and export system that:
- Leverages event sourcing for perfect game reconstruction
- Supports shareable links with optional expiry
- Enables live spectating of games in progress
- Allows game export/import for portability
- Includes frontend replay viewer with playback controls

---

*New file: `docs/v2/V2_07_PRODUCTION.md` (999 lines)*
# V2_07: Production Deployment & Operations
> **Scope**: Docker, deployment, health checks, monitoring, security, rate limiting
> **Dependencies**: All other V2 documents
> **Complexity**: High (DevOps/Infrastructure)
---
## Overview
Production readiness requires:
- **Containerization**: Docker images for consistent deployment
- **Health Checks**: Liveness and readiness probes
- **Monitoring**: Metrics, logging, error tracking
- **Security**: HTTPS, headers, secrets management
- **Rate Limiting**: API protection from abuse (Phase 1 priority)
- **Graceful Operations**: Zero-downtime deploys, proper shutdown
---
## 1. Docker Configuration
### Application Dockerfile
```dockerfile
# Dockerfile
FROM python:3.11-slim as base
# Set environment
ENV PYTHONDONTWRITEBYTECODE=1 \
PYTHONUNBUFFERED=1 \
PIP_NO_CACHE_DIR=1 \
PIP_DISABLE_PIP_VERSION_CHECK=1
WORKDIR /app
# Install system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
curl \
&& rm -rf /var/lib/apt/lists/*
# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy application code
COPY server/ ./server/
COPY client/ ./client/
# Create non-root user
RUN useradd --create-home --shell /bin/bash appuser \
&& chown -R appuser:appuser /app
USER appuser
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
CMD curl -f http://localhost:8000/health || exit 1
EXPOSE 8000
CMD ["python", "-m", "uvicorn", "server.main:app", "--host", "0.0.0.0", "--port", "8000"]
```
### Production Docker Compose
```yaml
# docker-compose.prod.yml
version: '3.8'  # note: the deploy.* keys below (replicas, resources) are honored under Docker Swarm
services:
app:
build:
context: .
dockerfile: Dockerfile
environment:
- DATABASE_URL=postgresql://golf:${DB_PASSWORD}@postgres:5432/golfgame
- REDIS_URL=redis://redis:6379/0
- SECRET_KEY=${SECRET_KEY}
- RESEND_API_KEY=${RESEND_API_KEY}
- SENTRY_DSN=${SENTRY_DSN}
- ENVIRONMENT=production
depends_on:
postgres:
condition: service_healthy
redis:
condition: service_healthy
deploy:
replicas: 2
restart_policy:
condition: on-failure
max_attempts: 3
resources:
limits:
memory: 512M
reservations:
memory: 256M
networks:
- internal
- web
labels:
- "traefik.enable=true"
- "traefik.http.routers.golf.rule=Host(`golf.example.com`)"
- "traefik.http.routers.golf.tls=true"
- "traefik.http.routers.golf.tls.certresolver=letsencrypt"
worker:
build:
context: .
dockerfile: Dockerfile
    command: arq server.worker.WorkerSettings
environment:
- DATABASE_URL=postgresql://golf:${DB_PASSWORD}@postgres:5432/golfgame
- REDIS_URL=redis://redis:6379/0
depends_on:
- postgres
- redis
deploy:
replicas: 1
resources:
limits:
memory: 256M
postgres:
image: postgres:15-alpine
environment:
POSTGRES_DB: golfgame
POSTGRES_USER: golf
POSTGRES_PASSWORD: ${DB_PASSWORD}
volumes:
- postgres_data:/var/lib/postgresql/data
- ./init.sql:/docker-entrypoint-initdb.d/init.sql
healthcheck:
test: ["CMD-SHELL", "pg_isready -U golf -d golfgame"]
interval: 10s
timeout: 5s
retries: 5
networks:
- internal
redis:
image: redis:7-alpine
command: redis-server --appendonly yes --maxmemory 128mb --maxmemory-policy allkeys-lru
volumes:
- redis_data:/data
healthcheck:
test: ["CMD", "redis-cli", "ping"]
interval: 10s
timeout: 5s
retries: 5
networks:
- internal
traefik:
image: traefik:v2.10
command:
- "--api.dashboard=true"
- "--providers.docker=true"
- "--providers.docker.exposedbydefault=false"
- "--entrypoints.web.address=:80"
- "--entrypoints.websecure.address=:443"
- "--certificatesresolvers.letsencrypt.acme.httpchallenge=true"
- "--certificatesresolvers.letsencrypt.acme.email=${ACME_EMAIL}"
- "--certificatesresolvers.letsencrypt.acme.storage=/letsencrypt/acme.json"
ports:
- "80:80"
- "443:443"
volumes:
- /var/run/docker.sock:/var/run/docker.sock:ro
- letsencrypt:/letsencrypt
networks:
- web
volumes:
postgres_data:
redis_data:
letsencrypt:
networks:
internal:
web:
external: true
```
---
## 2. Health Checks & Readiness
### Health Endpoint Implementation
```python
# server/health.py
import json
from datetime import datetime

import asyncpg
import redis.asyncio as redis
from fastapi import APIRouter, Depends, Response

# App-provided dependency helpers (assumed to live in server/deps.py)
from server.deps import get_db_pool, get_redis

router = APIRouter(tags=["health"])
@router.get("/health")
async def health_check():
"""Basic liveness check - is the app running?"""
return {"status": "ok", "timestamp": datetime.utcnow().isoformat()}
@router.get("/ready")
async def readiness_check(
db: asyncpg.Pool = Depends(get_db_pool),
redis_client: redis.Redis = Depends(get_redis)
):
"""Readiness check - can the app handle requests?"""
checks = {}
overall_healthy = True
# Check database
try:
async with db.acquire() as conn:
await conn.fetchval("SELECT 1")
checks["database"] = {"status": "ok"}
except Exception as e:
checks["database"] = {"status": "error", "message": str(e)}
overall_healthy = False
# Check Redis
try:
await redis_client.ping()
checks["redis"] = {"status": "ok"}
except Exception as e:
checks["redis"] = {"status": "error", "message": str(e)}
overall_healthy = False
status_code = 200 if overall_healthy else 503
return Response(
content=json.dumps({
"status": "ok" if overall_healthy else "degraded",
"checks": checks,
"timestamp": datetime.utcnow().isoformat()
}),
status_code=status_code,
media_type="application/json"
)
@router.get("/metrics")
async def metrics(
db: asyncpg.Pool = Depends(get_db_pool),
redis_client: redis.Redis = Depends(get_redis)
):
"""Expose application metrics for monitoring."""
async with db.acquire() as conn:
active_games = await conn.fetchval(
"SELECT COUNT(*) FROM games WHERE completed_at IS NULL"
)
total_users = await conn.fetchval("SELECT COUNT(*) FROM users")
games_today = await conn.fetchval(
"SELECT COUNT(*) FROM games WHERE created_at > NOW() - INTERVAL '1 day'"
)
connected_players = await redis_client.scard("connected_players")
return {
"active_games": active_games,
"total_users": total_users,
"games_today": games_today,
"connected_players": connected_players,
"timestamp": datetime.utcnow().isoformat()
}
```
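The Dockerfile's `HEALTHCHECK` above probes `/health` (liveness). If the Compose healthcheck should gate on readiness instead, it can target `/ready` — a sketch, with illustrative timing values:

```yaml
# docker-compose.prod.yml (fragment) — probe readiness rather than bare liveness
app:
  healthcheck:
    test: ["CMD", "curl", "-f", "http://localhost:8000/ready"]
    interval: 15s
    timeout: 5s
    retries: 3
    start_period: 10s
```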
---
## 3. Rate Limiting (Phase 1 Priority)
Rate limiting is a Phase 1 priority for security. Implement early to prevent abuse.
### Rate Limiter Implementation
```python
# server/ratelimit.py
from fastapi import Request, HTTPException
from typing import Optional
import redis.asyncio as redis
import time
import hashlib
class RateLimiter:
"""Token bucket rate limiter using Redis."""
def __init__(self, redis_client: redis.Redis):
self.redis = redis_client
async def is_allowed(
self,
key: str,
limit: int,
window_seconds: int
) -> tuple[bool, dict]:
"""Check if request is allowed under rate limit.
Returns (allowed, info) where info contains:
- remaining: requests remaining in window
- reset: seconds until window resets
- limit: the limit that was applied
"""
now = int(time.time())
window_key = f"ratelimit:{key}:{now // window_seconds}"
async with self.redis.pipeline(transaction=True) as pipe:
pipe.incr(window_key)
pipe.expire(window_key, window_seconds)
results = await pipe.execute()
current_count = results[0]
remaining = max(0, limit - current_count)
reset = window_seconds - (now % window_seconds)
info = {
"remaining": remaining,
"reset": reset,
"limit": limit
}
return current_count <= limit, info
def get_client_key(self, request: Request, user_id: Optional[str] = None) -> str:
"""Generate rate limit key for client."""
if user_id:
return f"user:{user_id}"
# For anonymous users, use IP hash
client_ip = request.client.host
forwarded = request.headers.get("X-Forwarded-For")
if forwarded:
client_ip = forwarded.split(",")[0].strip()
# Hash IP for privacy
return f"ip:{hashlib.sha256(client_ip.encode()).hexdigest()[:16]}"
# Rate limit configurations per endpoint type
RATE_LIMITS = {
"api_general": (100, 60), # 100 requests per minute
"api_auth": (10, 60), # 10 auth attempts per minute
"api_create_room": (5, 60), # 5 room creations per minute
"websocket_connect": (10, 60), # 10 WS connections per minute
"email_send": (3, 300), # 3 emails per 5 minutes
}
```
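The window arithmetic in `is_allowed` is easy to sanity-check without Redis. A pure-Python mirror of the bucketing, reset, and IP-hashing logic (function names here are illustrative, not part of the module above):

```python
import hashlib

def window_key(key: str, now: int, window_seconds: int) -> str:
    """Same bucketing as is_allowed: one Redis counter per fixed window."""
    return f"ratelimit:{key}:{now // window_seconds}"

def seconds_until_reset(now: int, window_seconds: int) -> int:
    """Matches the `reset` value reported in the rate-limit headers."""
    return window_seconds - (now % window_seconds)

def anon_client_key(ip: str) -> str:
    """Same IP hashing as get_client_key uses for anonymous clients."""
    return f"ip:{hashlib.sha256(ip.encode()).hexdigest()[:16]}"
```

Two requests at seconds 60 and 119 share a bucket under a 60s window, while second 120 starts a fresh one — the usual fixed-window trade-off (bursts can straddle a boundary).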
### Rate Limit Middleware
```python
# server/middleware.py
from fastapi import Request
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.responses import JSONResponse

from server.ratelimit import RATE_LIMITS, RateLimiter
class RateLimitMiddleware(BaseHTTPMiddleware):
def __init__(self, app, rate_limiter: RateLimiter):
super().__init__(app)
self.limiter = rate_limiter
async def dispatch(self, request: Request, call_next):
# Determine rate limit tier based on path
path = request.url.path
if path.startswith("/api/auth"):
limit, window = RATE_LIMITS["api_auth"]
elif path == "/api/rooms":
limit, window = RATE_LIMITS["api_create_room"]
elif path.startswith("/api"):
limit, window = RATE_LIMITS["api_general"]
else:
# No rate limiting for static files
return await call_next(request)
# Get user ID if authenticated
user_id = getattr(request.state, "user_id", None)
client_key = self.limiter.get_client_key(request, user_id)
allowed, info = await self.limiter.is_allowed(
f"{path}:{client_key}", limit, window
)
# Add rate limit headers to response
response = await call_next(request) if allowed else JSONResponse(
status_code=429,
content={
"error": "Rate limit exceeded",
"retry_after": info["reset"]
}
)
response.headers["X-RateLimit-Limit"] = str(info["limit"])
response.headers["X-RateLimit-Remaining"] = str(info["remaining"])
response.headers["X-RateLimit-Reset"] = str(info["reset"])
if not allowed:
response.headers["Retry-After"] = str(info["reset"])
return response
```
### WebSocket Rate Limiting
```python
# In server/main.py
@app.websocket("/ws")
async def websocket_endpoint(websocket: WebSocket):
    # WebSocket exposes .client and .headers, so get_client_key works here too
    client_key = rate_limiter.get_client_key(websocket)
allowed, info = await rate_limiter.is_allowed(
f"ws_connect:{client_key}",
*RATE_LIMITS["websocket_connect"]
)
if not allowed:
await websocket.close(code=1008, reason="Rate limit exceeded")
return
# Also rate limit messages within the connection
message_limiter = ConnectionMessageLimiter(
max_messages=30,
window_seconds=10
)
await websocket.accept()
try:
while True:
data = await websocket.receive_text()
if not message_limiter.check():
await websocket.send_json({
"type": "error",
"message": "Slow down! Too many messages."
})
continue
await handle_message(websocket, data)
except WebSocketDisconnect:
pass
```
---
## 4. Security Headers & HTTPS
### Security Middleware
```python
# server/security.py
from starlette.middleware.base import BaseHTTPMiddleware
class SecurityHeadersMiddleware(BaseHTTPMiddleware):
async def dispatch(self, request, call_next):
response = await call_next(request)
# Security headers
response.headers["X-Content-Type-Options"] = "nosniff"
response.headers["X-Frame-Options"] = "DENY"
response.headers["X-XSS-Protection"] = "1; mode=block"
response.headers["Referrer-Policy"] = "strict-origin-when-cross-origin"
response.headers["Permissions-Policy"] = "geolocation=(), microphone=(), camera=()"
# Content Security Policy
csp = "; ".join([
"default-src 'self'",
"script-src 'self'",
"style-src 'self' 'unsafe-inline'", # For inline styles
"img-src 'self' data:",
"font-src 'self'",
"connect-src 'self' wss://*.example.com",
"frame-ancestors 'none'",
"base-uri 'self'",
"form-action 'self'"
])
response.headers["Content-Security-Policy"] = csp
# HSTS (only in production)
if request.url.scheme == "https":
response.headers["Strict-Transport-Security"] = (
"max-age=31536000; includeSubDomains; preload"
)
return response
```
### CORS Configuration
```python
# server/main.py
from fastapi.middleware.cors import CORSMiddleware
app.add_middleware(
CORSMiddleware,
allow_origins=[
"https://golf.example.com",
"https://www.golf.example.com",
],
allow_credentials=True,
allow_methods=["GET", "POST", "PUT", "DELETE"],
allow_headers=["*"],
)
```
---
## 5. Error Tracking with Sentry
### Sentry Integration
```python
# server/main.py
import os

import sentry_sdk
from sentry_sdk.integrations.asyncpg import AsyncPGIntegration
from sentry_sdk.integrations.fastapi import FastApiIntegration
from sentry_sdk.integrations.redis import RedisIntegration

def filter_sensitive_data(event, hint):
    """Remove sensitive data before sending to Sentry."""
    if "request" in event:
        headers = event["request"].get("headers", {})
        # Strip auth headers
        headers.pop("authorization", None)
        headers.pop("cookie", None)
    return event

if os.getenv("SENTRY_DSN"):
    sentry_sdk.init(
        dsn=os.getenv("SENTRY_DSN"),
        environment=os.getenv("ENVIRONMENT", "development"),
        traces_sample_rate=0.1,  # sample 10% of transactions for performance
        profiles_sample_rate=0.1,
        integrations=[
            FastApiIntegration(transaction_style="endpoint"),
            RedisIntegration(),
            AsyncPGIntegration(),
        ],
        # Defined above so the name is bound before init runs
        before_send=filter_sensitive_data,
    )
```
### Custom Error Handler
```python
# server/errors.py
from fastapi import Request
from fastapi.responses import JSONResponse
import logging
import sentry_sdk

logger = logging.getLogger(__name__)
async def global_exception_handler(request: Request, exc: Exception):
"""Handle all unhandled exceptions."""
# Log to Sentry
sentry_sdk.capture_exception(exc)
# Log locally
logger.error(f"Unhandled exception: {exc}", exc_info=True)
# Return generic error to client
return JSONResponse(
status_code=500,
content={
"error": "Internal server error",
"request_id": request.state.request_id
}
)
# Register the handler (in server/main.py, where `app` is defined)
app.add_exception_handler(Exception, global_exception_handler)
```
---
## 6. Structured Logging
### Logging Configuration
```python
# server/logging_config.py
import json
import logging
import os
from datetime import datetime
class JSONFormatter(logging.Formatter):
"""Format logs as JSON for aggregation."""
def format(self, record):
log_data = {
"timestamp": datetime.utcnow().isoformat(),
"level": record.levelname,
"logger": record.name,
"message": record.getMessage(),
}
# Add extra fields
if hasattr(record, "request_id"):
log_data["request_id"] = record.request_id
if hasattr(record, "user_id"):
log_data["user_id"] = record.user_id
if hasattr(record, "game_id"):
log_data["game_id"] = record.game_id
# Add exception info
if record.exc_info:
log_data["exception"] = self.formatException(record.exc_info)
return json.dumps(log_data)
def setup_logging():
"""Configure application logging."""
handler = logging.StreamHandler()
if os.getenv("ENVIRONMENT") == "production":
handler.setFormatter(JSONFormatter())
else:
handler.setFormatter(logging.Formatter(
"%(asctime)s - %(name)s - %(levelname)s - %(message)s"
))
logging.root.handlers = [handler]
logging.root.setLevel(logging.INFO)
# Reduce noise from libraries
logging.getLogger("uvicorn.access").setLevel(logging.WARNING)
logging.getLogger("websockets").setLevel(logging.WARNING)
```
### Request ID Middleware
```python
# server/middleware.py
import uuid
class RequestIDMiddleware(BaseHTTPMiddleware):
async def dispatch(self, request, call_next):
request_id = request.headers.get("X-Request-ID", str(uuid.uuid4()))
request.state.request_id = request_id
response = await call_next(request)
response.headers["X-Request-ID"] = request_id
return response
```
---
## 7. Graceful Shutdown
### Shutdown Handler
```python
# server/main.py
import signal
import asyncio
shutdown_event = asyncio.Event()
@app.on_event("startup")
async def startup():
    # uvicorn installs its own SIGTERM/SIGINT handlers that fire the shutdown
    # hook; register handlers manually only when running outside uvicorn
    loop = asyncio.get_running_loop()
    for sig in (signal.SIGTERM, signal.SIGINT):
        loop.add_signal_handler(sig, lambda: asyncio.create_task(shutdown()))
@app.on_event("shutdown")
async def shutdown():
logger.info("Shutdown initiated...")
# Stop accepting new connections
shutdown_event.set()
# Save all active games to Redis
await save_all_active_games()
# Close WebSocket connections gracefully
for ws in list(active_connections):
try:
await ws.close(code=1001, reason="Server shutting down")
        except Exception:
            pass  # connection may already be closed
    # Give in-flight requests a short grace period to finish
    await asyncio.sleep(5)
# Close database pool
await db_pool.close()
# Close Redis connections
await redis_client.close()
logger.info("Shutdown complete")
async def save_all_active_games():
"""Persist all active games before shutdown."""
for game_id, game in active_games.items():
try:
await state_cache.save_game(game)
logger.info(f"Saved game {game_id}")
except Exception as e:
logger.error(f"Failed to save game {game_id}: {e}")
```
---
## 8. Secrets Management
### Environment Configuration
```python
# server/config.py
from pydantic import PostgresDsn, RedisDsn
from pydantic_settings import BaseSettings  # Pydantic v2; in v1, import from pydantic
class Settings(BaseSettings):
# Database
database_url: PostgresDsn
# Redis
redis_url: RedisDsn
# Security
secret_key: str
jwt_algorithm: str = "HS256"
jwt_expiry_hours: int = 24
# Email
resend_api_key: str
email_from: str = "Golf Game <noreply@golf.example.com>"
# Monitoring
sentry_dsn: str = ""
environment: str = "development"
# Rate limiting
rate_limit_enabled: bool = True
class Config:
env_file = ".env"
case_sensitive = False
settings = Settings()
```
### Production Secrets (Example for Docker Swarm)
```yaml
# docker-compose.prod.yml
secrets:
db_password:
external: true
secret_key:
external: true
resend_api_key:
external: true
services:
app:
secrets:
- db_password
- secret_key
- resend_api_key
environment:
      # libpq URLs have no password-file option; the app should read
      # /run/secrets/db_password at startup and build the DSN from it
      - DATABASE_URL=postgresql://golf@postgres:5432/golfgame
```
---
## 9. Database Migrations
### Alembic Configuration
```ini
# alembic.ini
[alembic]
script_location = migrations
# Leave sqlalchemy.url unset; migrations/env.py reads DATABASE_URL from the
# environment and passes it to the engine configuration
sqlalchemy.url =
```
### Migration Script Template
```python
# migrations/versions/001_initial.py
"""Initial schema
Revision ID: 001
Create Date: 2024-01-01
"""
from alembic import op
import sqlalchemy as sa
revision = '001'
down_revision = None
def upgrade():
# Users table
op.create_table(
'users',
sa.Column('id', sa.UUID(), primary_key=True),
sa.Column('username', sa.String(50), unique=True, nullable=False),
sa.Column('email', sa.String(255), unique=True, nullable=False),
sa.Column('password_hash', sa.String(255), nullable=False),
sa.Column('created_at', sa.DateTime(timezone=True), server_default=sa.func.now()),
sa.Column('is_admin', sa.Boolean(), default=False),
)
# Games table
op.create_table(
'games',
sa.Column('id', sa.UUID(), primary_key=True),
sa.Column('room_code', sa.String(10), nullable=False),
sa.Column('created_at', sa.DateTime(timezone=True), server_default=sa.func.now()),
sa.Column('completed_at', sa.DateTime(timezone=True)),
)
# Events table
op.create_table(
'events',
sa.Column('id', sa.BigInteger(), primary_key=True, autoincrement=True),
sa.Column('game_id', sa.UUID(), sa.ForeignKey('games.id'), nullable=False),
sa.Column('event_type', sa.String(50), nullable=False),
sa.Column('data', sa.JSON(), nullable=False),
sa.Column('timestamp', sa.DateTime(timezone=True), server_default=sa.func.now()),
)
# Indexes
op.create_index('idx_events_game_id', 'events', ['game_id'])
op.create_index('idx_users_email', 'users', ['email'])
op.create_index('idx_users_username', 'users', ['username'])
def downgrade():
op.drop_table('events')
op.drop_table('games')
op.drop_table('users')
```
### Migration Commands
```bash
# Create new migration
alembic revision --autogenerate -m "Add user sessions"
# Run migrations
alembic upgrade head
# Rollback one version
alembic downgrade -1
# Show current version
alembic current
```
---
## 10. Deployment Checklist
### Pre-deployment
- [ ] All environment variables set
- [ ] Database migrations applied
- [ ] Secrets configured in secret manager
- [ ] SSL certificates provisioned
- [ ] Rate limiting configured and tested
- [ ] Error tracking (Sentry) configured
- [ ] Logging aggregation set up
- [ ] Health check endpoints verified
- [ ] Backup strategy implemented
### Deployment
- [ ] Run database migrations
- [ ] Deploy new containers with rolling update
- [ ] Verify health checks pass
- [ ] Monitor error rates in Sentry
- [ ] Check application logs
- [ ] Verify WebSocket connections work
- [ ] Test critical user flows
### Post-deployment
- [ ] Monitor performance metrics
- [ ] Check database connection pool usage
- [ ] Verify Redis memory usage
- [ ] Review error logs
- [ ] Test graceful shutdown/restart
---
## 11. Monitoring Dashboard (Grafana)
### Key Metrics to Track
```yaml
# Example Prometheus metrics
metrics:
# Application
- http_requests_total
- http_request_duration_seconds
- websocket_connections_active
- games_active
- games_completed_total
# Infrastructure
- container_cpu_usage_seconds_total
- container_memory_usage_bytes
- pg_stat_activity_count
- redis_connected_clients
- redis_used_memory_bytes
# Business
- users_registered_total
- games_played_today
- average_game_duration_seconds
```
### Alert Rules
```yaml
# alertmanager rules
groups:
- name: golf-alerts
rules:
- alert: HighErrorRate
expr: rate(http_requests_total{status=~"5.."}[5m]) > 0.1
for: 5m
labels:
severity: critical
annotations:
summary: "High error rate detected"
- alert: DatabaseConnectionExhausted
expr: pg_stat_activity_count > 90
for: 2m
labels:
severity: warning
annotations:
summary: "Database connections near limit"
- alert: HighMemoryUsage
expr: container_memory_usage_bytes / container_spec_memory_limit_bytes > 0.9
for: 5m
labels:
severity: warning
annotations:
summary: "Container memory usage above 90%"
```
---
## 12. Backup Strategy
### Database Backups
```bash
#!/bin/bash
# backup.sh - Daily database backup
BACKUP_DIR=/backups
DATE=$(date +%Y%m%d_%H%M%S)
BACKUP_FILE="${BACKUP_DIR}/golfgame_${DATE}.sql.gz"
# Backup with pg_dump (credentials via PGPASSWORD or ~/.pgpass)
pg_dump -h postgres -U golf golfgame | gzip > "$BACKUP_FILE"
# Upload to S3/B2/etc
aws s3 cp "$BACKUP_FILE" s3://golf-backups/
# Cleanup old local backups (keep 7 days)
find "$BACKUP_DIR" -name "*.sql.gz" -mtime +7 -delete
# Cleanup old S3 backups (keep 30 days) via lifecycle policy
```
### Redis Persistence
```conf
# redis.conf
appendonly yes
appendfsync everysec
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
```
---
## Summary
This document covers all production deployment concerns:
1. **Docker**: Multi-stage builds, health checks, resource limits
2. **Rate Limiting**: Fixed-window counters in Redis, per-endpoint limits (Phase 1 priority)
3. **Security**: Headers, CORS, CSP, HSTS
4. **Monitoring**: Sentry, structured logging, Prometheus metrics
5. **Operations**: Graceful shutdown, migrations, backups
6. **Deployment**: Checklist, rolling updates, health verification
Rate limiting is implemented in Phase 1 as a security priority to protect against abuse before public launch.