2 Commits

Author SHA1 Message Date
adlee-was-taken
121959ba27 docs(CLAUDE.md): staging deploy verification checklist
All checks were successful
Build & Deploy Staging / build-and-deploy (release) Successful in 29s
Encodes the lessons from the v3.3.5 → v3.3.5.1 hotfix cascade: CI green
is necessary but not sufficient. Walk the chain: clean worktree → tag
sha matches → container up recently → new code introspectable → env vars
present → DB state correct → end-to-end smoke.

Each step calls out a specific failure mode we just hit, so future-me
doesn't assume the next deploy will 'just work' when the primitives
underneath (git fetch tag-cache, compose env wiring, image reuse) can
silently skip changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 14:06:37 -04:00
adlee-was-taken
612eccf03b fix(ci): force-update tags on deploy fetch
All checks were successful
Build & Deploy Staging / build-and-deploy (release) Successful in 28s
git fetch origin won't replace a tag that already exists locally pointing
at a different commit. When v3.3.5 was force-moved on origin after a
first failed CI run, the staging runner kept the stale tag cached and
re-checked-out the old commit — the compose-env-wiring fix was never
actually applied and the container booted without LEADERBOARD_INCLUDE_TEST_DEFAULT.

--tags --force makes the behaviour safe for moved tags.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 14:02:06 -04:00
3 changed files with 60 additions and 4 deletions


@@ -29,8 +29,10 @@ jobs:
 docker pull "$IMAGE:$TAG"
 docker tag "$IMAGE:$TAG" golfgame-app:latest
-# Update code for compose/env changes
-git fetch origin
+# Update code for compose/env changes. `--tags --force` so a
+# moved tag (hotfix on top of existing version) updates locally
+# instead of silently checking out the stale cached position.
+git fetch origin --tags --force
 git checkout "$TAG"
 # Restart app


@@ -21,8 +21,11 @@ jobs:
 cd /opt/golfgame
-# Pull latest code and checkout the release tag
-git fetch origin
+# Pull latest code and checkout the release tag. `--tags --force`
+# so that a tag moved on origin (e.g. hotfix on top of an existing
+# version) actually updates locally instead of silently reusing a
+# stale cached tag position.
+git fetch origin --tags --force
 git checkout "$TAG"
 # Build the image


@@ -166,6 +166,57 @@ python server/simulate.py 100 --compare
- **Middleware** (`server/middleware/`): Security headers, request ID tracking, rate limiting
- **Handlers** (`server/handlers.py`): WebSocket message dispatch (extracted from main.py)
## Staging Deploy Verification Checklist
After any release that triggers a staging deploy (via `.gitea/workflows/deploy-staging.yml`), do NOT trust "CI went green"; walk the full chain end-to-end. A green CI run does not prove the deploy did what you intended: a plain `git fetch` won't update a tag that already exists locally, the compose yaml and `.env` can drift out of sync, and a container can keep running a previously built image with no visible signal. The v3.3.5 → v3.3.5.1 hotfix cascade cost us two releases because each of these bit in turn.
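The tag-cache behaviour can be reproduced locally. The sketch below is illustrative and not part of the repo: it builds two throwaway repositories under a temp directory (`origin`, `runner`, and the `v0.0.1` tag are made-up names), force-moves the tag on the origin side, and reports what the runner clone sees after a plain fetch and after `git fetch origin --tags --force`. It assumes `git` is on the PATH.

```bash
# Illustrative repro only; repo names and the v0.0.1 tag are hypothetical.
demo_moved_tag() {
  local work origin runner want
  work="$(mktemp -d)"
  origin="$work/origin"
  runner="$work/runner"

  # Wrapper so commits work without any global git identity configured.
  g() { git -c user.name=demo -c user.email=demo@example.invalid "$@"; }

  # Origin gets one commit and a release tag; the runner clones it.
  g -c init.defaultBranch=main init -q "$origin"
  g -C "$origin" commit -q --allow-empty -m "release"
  g -C "$origin" tag v0.0.1
  g clone -q "$origin" "$runner"

  # A hotfix lands on origin and the existing tag is force-moved onto it.
  g -C "$origin" commit -q --allow-empty -m "hotfix"
  g -C "$origin" tag -f v0.0.1 >/dev/null
  want="$(g -C "$origin" rev-parse v0.0.1)"

  # Plain fetch, as the workflows did before the fix.
  g -C "$runner" fetch -q origin
  if [ "$(g -C "$runner" rev-parse v0.0.1)" = "$want" ]; then
    echo "plain fetch: updated"
  else
    echo "plain fetch: stale"
  fi

  # Fetch with --tags --force, as the workflows do now.
  g -C "$runner" fetch -q origin --tags --force
  if [ "$(g -C "$runner" rev-parse v0.0.1)" = "$want" ]; then
    echo "forced fetch: updated"
  else
    echo "forced fetch: stale"
  fi

  rm -rf "$work"
}

demo_moved_tag
```

Running it on a scratch machine is a cheap way to confirm how the installed git version treats moved tags before relying on the workflow change.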
**Run through every step before declaring a staging deploy successful:**
1. **Worktree is clean on staging BEFORE cutting the release.**
```bash
ssh root@staging.golfcards.club 'cd /opt/golfgame && git status --short'
```
Must be empty. Modified or untracked files can abort `git checkout $TAG` mid-pipeline. If you ever scp files to staging for hot-patching, either commit those changes to main before the release, or run `git reset --hard HEAD && git clean -fd` on staging first.
2. **Staging is actually at the new tag, not a stale cached position.**
```bash
ssh root@staging.golfcards.club 'cd /opt/golfgame && git rev-parse HEAD && git log --oneline -1'
```
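Compare the sha to `git rev-parse 'v3.x.y^{commit}'` locally. The `^{commit}` suffix matters for annotated tags: a bare `git rev-parse v3.x.y` prints the tag object's sha, which never equals `HEAD`. If the two commits differ, the most likely cause is a tag that was force-moved on origin while the runner kept its stale cached copy. Workflows now run `git fetch origin --tags --force`, but verify.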
3. **Container is running the new image (not a recycled old one).**
```bash
ssh root@staging.golfcards.club 'docker ps --format "{{.Names}} {{.Status}}"; docker inspect golfgame-app-1 --format "{{.Created}}"'
```
Expect `Up X seconds/minutes` and a recent `.Created` time, not `Up 13 hours`. A compose restart picks up the *current* `:latest` image, so confirm that image was built by this release, not a prior one.
4. **New code is actually in the container.** Introspect a signature/attribute that changed in this release:
```bash
ssh root@staging.golfcards.club 'docker exec golfgame-app-1 python -c "
import sys, inspect; sys.path.insert(0, \"/app/server\")
from services.game_logger import GameLogger
print(inspect.signature(GameLogger.log_game_start_async).parameters.keys())
"'
```
5. **Container env has every var the code reads.** Any config added to `server/config.py` this release needs TWO edits to flow through: `.env` on the host AND `- FOO=${FOO:-default}` in the compose yaml's `environment:` block. Setting only `.env` silently does nothing.
```bash
ssh root@staging.golfcards.club 'docker exec golfgame-app-1 printenv | grep -iE "YOUR_NEW_VAR"'
```
6. **DB schema + invariants hold.** Sample the tables this release touches and confirm the new columns/values look right:
```bash
ssh root@staging.golfcards.club 'docker exec golfgame-postgres-1 psql -U golf -d golf -c "SELECT status, COUNT(*) FROM games_v2 GROUP BY status;"'
```
7. **End-to-end smoke.** For a feature visible through the API, curl it and verify the response shape and content match expectations:
```bash
curl -s 'https://staging.golfcards.club/api/stats/leaderboard?metric=wins' | python3 -m json.tool
```
For features that only fire on specific game events (GAME_OVER, abandonment, etc.), run a soak game or manual repro and re-check the DB — don't assume "code is deployed" = "code has executed."
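Step 5 above lends itself to a small helper. The sketch below is an assumption, not something in the repo: `missing_vars` is a hypothetical function name, and it only checks that each expected name appears as `NAME=` in a `printenv` dump. It does not check that the value is sensible, and a variable set to an empty string still counts as present.

```bash
# Hypothetical helper: print each expected variable name that is absent
# from a KEY=VALUE dump such as the output of `printenv`.
missing_vars() {
  local env_dump="$1"
  shift
  local name
  for name in "$@"; do
    # grep without -q reads all of its input, so the pipe never closes early.
    if ! printf '%s\n' "$env_dump" | grep "^${name}=" >/dev/null; then
      printf '%s\n' "$name"
    fi
  done
}

# Usage against the staging container from step 5:
#   dump="$(ssh root@staging.golfcards.club 'docker exec golfgame-app-1 printenv')"
#   missing_vars "$dump" LEADERBOARD_INCLUDE_TEST_DEFAULT
```

Empty output means every listed name reached the container; any name printed needs both the `.env` entry and the compose `environment:` line checked.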
**If any step fails, stop and diagnose before running the next release.** Cascading hotfixes amplify the problem — each force-moved tag is another chance for the runner's cache to lie to you.
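One way to enforce the stop-on-first-failure rule is a tiny fail-fast runner. This is a sketch under assumptions: `run_checks` and its `label::command` argument format are invented here for illustration, and each command is run through `bash -c`, so only pass commands you trust.

```bash
# Hypothetical fail-fast runner: each argument is "label::command".
# Commands run in order; the first failure stops the run with status 1.
run_checks() {
  local entry label cmd
  for entry in "$@"; do
    label="${entry%%::*}"   # text before the first "::"
    cmd="${entry#*::}"      # everything after the first "::"
    # Command output goes to stderr so stdout carries only the verdicts.
    if bash -c "$cmd" >&2; then
      printf 'ok    %s\n' "$label"
    else
      printf 'FAIL  %s (stopping)\n' "$label"
      return 1
    fi
  done
}

# Usage (illustrative; pick the checks relevant to the release):
#   run_checks \
#     "env var::ssh root@staging.golfcards.club 'docker exec golfgame-app-1 printenv' | grep '^LEADERBOARD_INCLUDE_TEST_DEFAULT=' >/dev/null" \
#     "smoke::curl -fsS 'https://staging.golfcards.club/api/stats/leaderboard?metric=wins'"
```

Stopping at the first `FAIL` keeps later checks from masking the root cause, which matches the intent of the rule above.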
## Common Development Tasks
### Adjusting Animation Speed