docs(plan): harden soak-harness schema migration for deploy

Makes the deployment path explicit in Task 1: traces the existing
lifespan → get_user_store → initialize_schema → conn.execute(SCHEMA_SQL)
flow, notes that the DO $$/IF NOT EXISTS pattern is the same one
every post-v1 column migration uses, and explains why rollback is
safe (additive changes only).

Adds two new verification steps to Task 1:
 - Step 7: post-deploy psql checks against staging
 - Step 8: same against production

Adds a "Post-deploy schema verification" block to CHECKLIST.md so
the schema state is verified after every server restart against
each target environment.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
adlee-was-taken
2026-04-10 23:40:28 -04:00
parent cf916d7bc3
commit e8051b256b

@@ -24,7 +24,18 @@
### Task 1: Schema migration for `is_test_account` and `marks_as_test`
Add two columns, one partial index, and rebuild the `leaderboard_overall` materialized view to include `is_test_account` (so the filter works through the view fast path). Fits the existing inline-migration pattern in `user_store.py`.
Add two columns, one partial index, and rebuild the `leaderboard_overall` materialized view to include `is_test_account` (so the filter works through the view fast path).
**Deploy path (this is load-bearing — read before editing):**
The existing codebase applies schema changes via inline `DO $$ BEGIN IF NOT EXISTS (...) THEN ALTER TABLE ... END IF; END $$;` blocks inside `SCHEMA_SQL` in `server/stores/user_store.py`. That string gets executed on **every server startup** by `UserStore.create() → initialize_schema() → conn.execute(SCHEMA_SQL)`, which is called from the FastAPI lifespan via `get_user_store(config.POSTGRES_URL)` in `server/main.py`. Same pattern added every other post-v1 column (`is_banned`, `force_password_reset`, `last_seen_at`, `rating`, and many others — see the existing DO blocks in `SCHEMA_SQL`).
What this means for deploy:
- **No separate migration tool needed.** CI/CD rebuilds the image, `docker compose up -d` restarts the container, lifespan fires, `SCHEMA_SQL` executes, the new `DO $$` blocks see the missing columns and `ALTER TABLE ADD COLUMN` them in place.
- **Idempotent by construction.** Re-running against an already-migrated DB is a no-op — the `IF NOT EXISTS` guard in each DO block skips the ALTER.
- **Fresh installs work.** `CREATE TABLE IF NOT EXISTS users_v2` uses the current column list; the ADD COLUMN DO blocks are no-ops because the column is already there from the CREATE.
- **Matview rebuild is atomic.** The `DO $$` block that DROPs+CREATEs `leaderboard_overall` runs inside a single transaction. `CREATE MATERIALIZED VIEW ... AS SELECT` populates immediately (no `WITH NO DATA`), so concurrent readers never see an empty or missing view — they see either the old version (pre-commit) or the new version (post-commit).
- **Rollback is safe.** All changes are additive. If you have to revert the code, the new columns just sit unused — old code never references them, so nothing breaks.
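As a rough illustration of the idempotency claim above, the sketch below shows what one guarded `ADD COLUMN` block might look like, plus a tiny simulation of two consecutive startups. The real SQL in `SCHEMA_SQL` is not reproduced in this plan, so the column type and default are assumptions, and `FakeConnection`, `apply_schema`, and `simulate_two_startups` are hypothetical names that exist only so the example runs without a database.

```python
import asyncio

# Hypothetical fragment in the style of SCHEMA_SQL. The actual block in
# server/stores/user_store.py may differ (column type and default are assumed).
ADD_IS_TEST_ACCOUNT_SQL = """
DO $$
BEGIN
    IF NOT EXISTS (
        SELECT 1 FROM information_schema.columns
        WHERE table_name = 'users_v2' AND column_name = 'is_test_account'
    ) THEN
        ALTER TABLE users_v2 ADD COLUMN is_test_account BOOLEAN DEFAULT FALSE;
    END IF;
END $$;
"""


class FakeConnection:
    """Stand-in for a DB connection: tracks only whether the column exists."""

    def __init__(self):
        self.columns = set()
        self.alter_count = 0

    async def execute(self, sql):
        # Mimic the IF NOT EXISTS guard: alter only when the column is absent.
        if "is_test_account" not in self.columns:
            self.columns.add("is_test_account")
            self.alter_count += 1


async def apply_schema(conn, sql=ADD_IS_TEST_ACCOUNT_SQL):
    """Run the schema SQL once, as a startup hook would."""
    await conn.execute(sql)


async def simulate_two_startups():
    conn = FakeConnection()
    await apply_schema(conn)  # first startup: the column gets added
    await apply_schema(conn)  # second startup: the guard skips the ALTER
    return conn.alter_count


if __name__ == "__main__":
    print(asyncio.run(simulate_two_startups()))  # prints 1
```

The simulation only models the guard's effect; against a real database the same property comes from the `IF NOT EXISTS` check inside the DO block.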
**Files:**
- Modify: `server/stores/user_store.py`, appending to `SCHEMA_SQL` (ALTER blocks near L79-L98 and the matview block near L298-L335)
@@ -147,11 +158,73 @@ New columns support separating soak-harness test traffic from real
user traffic in stats queries. Rebuilds leaderboard_overall matview
to include is_test_account so the fast path stays filterable.
Migration is idempotent via DO $$ / IF NOT EXISTS blocks inside
SCHEMA_SQL, which runs on every server startup — same mechanism
every existing post-v1 column migration uses.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
EOF
)"
```
- [ ] **Step 7: Post-deploy verification (staging)**
After this commit ships to staging via CI/CD (or `docker compose up -d` on the staging host), verify the migration actually applied:
```bash
ssh root@129.212.150.189 << 'REMOTE'
cd /opt/golfgame
# Find the postgres container name (it may vary across compose files)
PG_CONTAINER=$(docker compose -f docker-compose.staging.yml ps -q postgres)
docker exec -i $PG_CONTAINER psql -U postgres -d golfgame << 'SQL'
-- Confirm columns exist
\d users_v2
\d invite_codes
\d leaderboard_overall
-- Targeted checks
SELECT column_name, data_type, column_default
FROM information_schema.columns
WHERE table_name = 'users_v2' AND column_name = 'is_test_account';
SELECT column_name, data_type, column_default
FROM information_schema.columns
WHERE table_name = 'invite_codes' AND column_name = 'marks_as_test';
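-- Materialized views are not listed in information_schema.columns,
-- so check pg_attribute for the matview column instead
SELECT attname FROM pg_attribute
WHERE attrelid = 'leaderboard_overall'::regclass
  AND attname = 'is_test_account' AND NOT attisdropped;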
-- Partial index
SELECT indexname, indexdef FROM pg_indexes
WHERE indexname = 'idx_users_v2_is_test_account';
SQL
REMOTE
```
Expected (all four present):
- `users_v2.is_test_account` with default `false`
- `invite_codes.marks_as_test` with default `false`
- `leaderboard_overall` has an `is_test_account` column
- `idx_users_v2_is_test_account` exists
If any of these are missing, the server didn't actually restart (or restarted but the container has a stale image). Check `docker compose logs golfgame` for the line `User store schema initialized` — if it's not there, the migration never ran.
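If the four checks are run repeatedly, the pass/fail decision could be scripted. The following is a minimal sketch, assuming the names returned by the queries above have already been collected into a list; `EXPECTED`, `missing_objects`, and `report` are hypothetical helpers, not part of the codebase.

```python
# The four objects this migration is expected to create, as listed above.
EXPECTED = (
    "users_v2.is_test_account",
    "invite_codes.marks_as_test",
    "leaderboard_overall.is_test_account",
    "idx_users_v2_is_test_account",
)


def missing_objects(found):
    """Return the expected objects that are absent from `found`, in checklist order."""
    found = set(found)
    return [name for name in EXPECTED if name not in found]


def report(found):
    """Summarize the verification result as a single line."""
    missing = missing_objects(found)
    if not missing:
        return "OK: all four schema objects present"
    return "MISSING: " + ", ".join(missing)


if __name__ == "__main__":
    # Example: the matview was not rebuilt, so its column is absent.
    print(report([
        "users_v2.is_test_account",
        "invite_codes.marks_as_test",
        "idx_users_v2_is_test_account",
    ]))  # prints "MISSING: leaderboard_overall.is_test_account"
```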
- [ ] **Step 8: Post-deploy verification (production)**
Same check, against prod, after the prod deploy:
```bash
ssh root@165.245.152.51 << 'REMOTE'
cd /opt/golfgame
PG_CONTAINER=$(docker compose -f docker-compose.prod.yml ps -q postgres)
docker exec -i $PG_CONTAINER psql -U postgres -d golfgame -c "\d users_v2" | grep is_test_account
docker exec -i $PG_CONTAINER psql -U postgres -d golfgame -c "\d invite_codes" | grep marks_as_test
docker exec -i $PG_CONTAINER psql -U postgres -d golfgame -c "\d leaderboard_overall" | grep is_test_account
REMOTE
```
Expected: three matching rows. If prod migration fails, the rollback story is clean — revert the commit, redeploy, old code keeps working because it never referenced the new columns.
---
### Task 2: Propagate `is_test_account` through `User` model and `user_store`
@@ -5301,6 +5374,20 @@ Run after any significant change or before calling the implementation complete.
- [ ] Admin panel "Include test accounts" checkbox filters them out
- [ ] Admin panel invite codes tab shows `[Test-seed]` next to SOAKTEST
## Post-deploy schema verification
Run after the server-side changes (Tasks 1-7) ship to each environment.
- [ ] Server restarted (`docker compose up -d` or CI/CD deploy)
- [ ] Server logs show `User store schema initialized` after restart
- [ ] `\d users_v2` on target DB shows `is_test_account` column with default `false`
- [ ] `\d invite_codes` shows `marks_as_test` column with default `false`
- [ ] `\d leaderboard_overall` shows `is_test_account` column
- [ ] `\di idx_users_v2_is_test_account` shows the partial index
- [ ] `SELECT count(*) FROM leaderboard_overall` returns nonzero (view re-populated after rebuild)
- [ ] Default leaderboard query still works: `curl .../api/stats/leaderboard` returns entries
- [ ] `?include_test=true` parameter is accepted (no 422/500)
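The last two API checks could be automated along these lines. This is a sketch under assumptions: the base URL is left as a parameter because the checklist elides it, `leaderboard_url`, `status_ok`, and `fetch_status` are hypothetical helper names, and treating every 5xx response (not just 500) as a failure is a choice made here rather than something the checklist states.

```python
import urllib.error
import urllib.request
from urllib.parse import urlencode


def leaderboard_url(base_url, include_test=False):
    """Build the leaderboard URL, optionally adding ?include_test=true."""
    url = base_url.rstrip("/") + "/api/stats/leaderboard"
    if include_test:
        url += "?" + urlencode({"include_test": "true"})
    return url


def status_ok(status):
    """Accept any status that is neither a 422 rejection nor a 5xx server error."""
    return status != 422 and status < 500


def fetch_status(url, timeout=10):
    """Return the HTTP status code for a GET request, including error statuses."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # urlopen raises on 4xx/5xx; the code is still what we want to inspect.
        return exc.code
```

For example, `status_ok(fetch_status(leaderboard_url(base, include_test=True)))` would cover the `?include_test=true` item, where `base` is whatever host the target environment serves from.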
## Staging bring-up (final step)
- [ ] `UPDATE invite_codes SET marks_as_test = TRUE WHERE code = '5VC2MCCN';` run on staging