docs: add design spec for project documentation effort

Captures scope and structure for top-level README, home user guide, operator guide, and architecture docs (overview + conventions + 12 per-subsystem files). Approach 3 (hybrid): monolithic user guides, split architecture reference.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:17:07 -04:00
parent 965dc3b13d
commit 4dc2db00e0
1 changed files with 202 additions and 0 deletions
--- a/docs/superpowers/specs/2026-04-05-project-documentation-design.md
+++ b/docs/superpowers/specs/2026-04-05-project-documentation-design.md
@@ -0,0 +1,202 @@
+# Vigilar Project Documentation — Design Spec
+
+**Date:** 2026-04-05
+**Status:** Approved, ready for implementation plan
+**Scope:** Create user-facing and architectural documentation for Vigilar, plus a polished top-level `README.md`.
+
+---
+
+## 1. Goal
+
+Vigilar currently has no top-level `README.md`, no user guide, and no architectural reference. Contributors and would-be home users have to read source code to understand what the project is or how to run it. This effort closes that gap with a single coordinated documentation pass.
+
+The docs must:
+
+- Give a homeowner with a mini PC a clear linear path from bare hardware to working cameras on their phone.
+- Give a self-hoster a reference for config, CLI, secrets, backups, upgrades, and troubleshooting.
+- Give a contributor enough architectural context to navigate the codebase without reading every file.
+- Match the project's ethos: plain, no bloat, no cloud, no extra tooling.
+
+## 2. Non-goals
+
+- No doc site / MkDocs build (files are organized *as if* MkDocs-ready, but no tooling is added).
+- No rewrites of existing code or config.
+- No changes to `docs/camera-hardware-guide.md` (already exists, untouched).
+- No changes to anything under `docs/superpowers/` other than this spec.
+- No CI doc-linting, no link-checker automation beyond a one-time manual verification pass.
+- No recording-backup-to-NAS feature work. Docs describe only what exists today; future backup improvements are noted as planned, not documented as present.
+
+## 3. Audience & doc layout (Approach 3 — Hybrid)
+
+Three organizing principles, one tree:
+
+- **User-facing guides** are monolithic linear narratives (humans read top to bottom once).
+- **Architecture docs** are split reference material (contributors jump to the subsystem they're touching).
+- **README** is the front door that ties everything together.
+
+Final tree:
+
+```
+README.md                              # NEW — top-level front door
+docs/
+├── home-user-guide.md                 # NEW — monolithic, linear
+├── operator-guide.md                  # NEW — monolithic, reference
+├── architecture/
+│   ├── overview.md                    # NEW — process model, bus, data flow
+│   ├── conventions.md                 # NEW — coding conventions distilled
+│   └── subsystems/
+│       ├── camera.md                  # NEW
+│       ├── detection.md               # NEW
+│       ├── events.md                  # NEW
+│       ├── alerts.md                  # NEW
+│       ├── sensors.md                 # NEW
+│       ├── ups.md                     # NEW
+│       ├── storage.md                 # NEW
+│       ├── highlights.md              # NEW
+│       ├── presence.md                # NEW
+│       ├── pets.md                    # NEW
+│       ├── health.md                  # NEW
+│       └── web.md                     # NEW
+└── camera-hardware-guide.md           # EXISTING — untouched
+```
+
+## 4. `README.md` (top-level)
+
+Sells the project, orients newcomers, links into the doc tree.
+
+**Required sections, in order:**
+
+1. **Title + tagline** — "Vigilar — DIY offline-first home security"
+2. **One-line pitch** (~20 words)
+3. **Hero image placeholder** — `![screenshot](docs/images/grid.png)` with a comment noting it's a placeholder. Do not create the image.
+4. **Why Vigilar** — bullet list of differentiators: offline-first, HLS grid + MJPEG single, OpenCV MOG2 motion with 5-sec pre-motion ring buffer, event timeline + highlight reels + timelapses, visitor/pet/wildlife tracking, PWA with VAPID push (no Firebase), AES-256 encrypted recordings, NUT UPS monitoring, runs on cheap mini PC.
+5. **Quick paths table** — 4 rows mapping intent → doc: home user guide, operator guide, architecture overview, camera hardware guide.
+6. **60-second overview** — small ASCII diagram (cameras → mini PC → phone/browser) + 5 tech-stack bullets (Python 3.11, Flask+Bootstrap 5, SQLite WAL, MQTT, FFmpeg).
+7. **Status** — "Alpha. Works on the author's hardware. Expect rough edges and breaking changes."
+8. **Installation TL;DR** — 3-line fenced block (`git clone`, `sudo ./scripts/install.sh`, `sudo systemctl start vigilar`), followed by links to the full home-user and operator guides.
+9. **Documentation** — nested bullet list mirroring the `docs/` tree, each entry with a one-line description.
+10. **License** — GPL-3.0. If no `LICENSE` file exists, the README states the intent and notes a `LICENSE` file will follow; we do **not** create the license file as part of this effort unless asked.
+11. **Contributing** — short stub: "Issues and PRs welcome. See `CLAUDE.md` for code conventions."
+
+## 5. `docs/home-user-guide.md`
+
+Monolithic, linear, ~1500–2500 words. Target reader: homeowner with a mini PC, some Linux comfort, wants cameras on phone.
+
+**Required sections, in order:**
+
+1. **What you'll end up with** — outcome in 2 sentences plus a small ASCII diagram.
+2. **What you need** — hardware checklist: mini PC (x86_64, 4GB+ RAM, 128GB+ SSD), USB stick for OS install, one or more RTSP cameras, phone, optional NAS. Link to `camera-hardware-guide.md` for camera picks.
+3. **Step 1 — Install Debian/Ubuntu Server on the mini PC** — brief, points at upstream installer docs, tells the user to enable SSH. No hand-holding on the OS install itself.
+4. **Step 2 — Get Vigilar onto the box** — `git clone`, `sudo ./scripts/install.sh`, plus 3 bullets summarizing what `install.sh` does (read `scripts/install.sh` at write-time to ground these bullets).
+5. **Step 3 — First boot** — `sudo systemctl enable --now vigilar`, then open `http://<mini-pc-ip>:49735` in a browser on the same LAN. Mention the port is configurable under `[web]` in `vigilar.toml`.
+6. **Step 4 — Set your PIN** — UI walkthrough, 2–3 sentences, screenshot placeholder.
+7. **Step 5 — Add your first camera** — UI walkthrough: RTSP URL, credentials, test stream, save. Point at `camera-hardware-guide.md` for URL formats.
+8. **Step 6 — Phone push notifications (PWA)** — open web UI on phone, "Add to Home Screen", allow notifications. Under-the-hood note: VAPID keys already generated by `install.sh`.
+9. **Step 7 — Optional: NAS backup of config + database** — mount NAS share at `/mnt/nas/vigilar-backups`, set `VIGILAR_BACKUP_DIR`, enable a systemd timer wrapping `scripts/backup.sh`. **Explicitly state** that this backs up DB + `/etc/vigilar` (config + secrets) only, and that **recordings stay local** — point at a "planned" note for recording backup.
+10. **Troubleshooting** — camera won't connect, no push notifications, service won't start (`journalctl -u vigilar`), motion detection too sensitive, how to reset PIN.
+11. **Where to go next** — links to Operator Guide and Architecture Overview.
+
+**Grounding rule:** every shell command must correspond to a real file in `scripts/` or a real `vigilar` CLI subcommand. Verify before writing.
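
Review note: the script half of this grounding rule could be checked mechanically. The following is a minimal sketch, assuming the guides reference helper scripts as `scripts/<name>.sh`; the function name `missing_scripts` and the regular expression are illustrative and not part of the repository. It does not cover `vigilar` CLI subcommands, which still need a manual check.

```python
import re
from pathlib import Path

# Matches references such as scripts/install.sh or ./scripts/backup.sh.
SCRIPT_REF = re.compile(r"(?<![\w/])(?:\./)?(scripts/[\w.-]+\.sh)\b")


def missing_scripts(markdown_text: str, repo_root: Path) -> list[str]:
    """Return referenced script paths that do not exist under repo_root."""
    referenced = sorted(set(SCRIPT_REF.findall(markdown_text)))
    return [ref for ref in referenced if not (repo_root / ref).is_file()]
```

Called with the text of `docs/home-user-guide.md` and the repository root, an empty list would mean every referenced script exists.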
+
+## 6. `docs/operator-guide.md`
+
+Monolithic, reference-oriented, ~2500–4000 words. Target reader: self-hoster tuning, upgrading, securing.
+
+**Required sections, in order:**
+
+1. **Audience & scope** — for admins, not first-time home users. Points at home-user-guide.md for initial setup.
+2. **Layout on disk** — table of `/opt/vigilar`, `/etc/vigilar/{vigilar.toml, certs/, secrets/}`, `/var/vigilar/{data/vigilar.db, recordings/, hls/, backups/}`.
+3. **Installation** — what `scripts/install.sh` does, `systemd/vigilar.service` summary, `systemd/vigilar-mosquitto.conf` summary, system dependencies (ffmpeg, mosquitto, sqlite3, Python 3.11+).
+4. **Configuration reference (`vigilar.toml`)** — one subsection per `[section]` in the default TOML. Each key: default, what it controls, when to change. Sections to cover (from current `config/vigilar.toml`): `system`, `mqtt`, `web`, `zigbee2mqtt`, `ups`, `storage`, `remote`, `alerts.local`, `alerts.web_push`, `alerts.email`, plus any additional sections discovered by re-reading the TOML at write-time.
+5. **CLI reference (`vigilar ...`)** — enumerated at write-time by reading `vigilar/cli/`. One subsection per top-level command. Do not guess commands.
+6. **Secrets & security** — `/etc/vigilar/secrets/` layout and permissions; `vigilar config set-pin`; `vigilar config set-password`; TLS via `scripts/gen_cert.sh` → `[web] tls_cert/tls_key`; VAPID via `scripts/gen_vapid_keys.sh`; storage encryption key (`storage.key`) — **explicit warning: do not lose it, recordings are unrecoverable without it**; recommended firewall stance (LAN-only by default).
+7. **UPS / NUT integration** — `scripts/setup_nut.sh`, `[ups]` options, shutdown behavior, low-battery thresholds.
+8. **Backups** — what `scripts/backup.sh` captures (DB + `/etc/vigilar`) and what it does **not** (recordings); `VIGILAR_BACKUP_DIR` and `VIGILAR_BACKUP_RETENTION_DAYS`; suggested systemd timer snippet; restore procedure.
+9. **Upgrades** — `git pull` + `pip install -e .` + `systemctl restart vigilar`; rollback by restoring a backup tarball. If DB migrations exist, note how they're applied; if they don't, say so.
+10. **Logs & health** — `journalctl -u vigilar`, `log_level` in `[system]`, health endpoints (enumerated at write-time by reading `vigilar/web/blueprints/system.py` and `vigilar/health/`).
+11. **Remote access** — `[remote]` section, tunnel-based remote HLS, bandwidth-shaped downscaled streams, and a reminder that remote access is not a cloud service.
+12. **Troubleshooting** — service crash loops, MQTT broker won't start, camera worker thrashing, disk full / `free_space_floor_gb` triggered, HLS stalling.
+
+**Grounding rule:** every TOML key, every CLI command, every file path, every endpoint must be verified against the current code before writing. Any that can't be verified must be omitted, not guessed.
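
Review note: the section-level part of this rule (and the TOML coverage check in the verification checklist) could be approximated with a small script. This is a sketch under two assumptions that the spec does not state: the operator guide mentions each section literally as `[name]`, and table headers in `config/vigilar.toml` sit on their own lines. The name `undocumented_sections` is hypothetical. Array-of-tables headers (`[[name]]`) are deliberately not matched, and key-level coverage is out of scope here.

```python
import re

# A TOML table header on its own line, e.g. [web] or [alerts.web_push].
TABLE_HEADER = re.compile(r"^[ \t]*\[([A-Za-z0-9_.-]+)\][ \t]*(?:#.*)?$", re.MULTILINE)


def undocumented_sections(toml_text: str, guide_text: str) -> list[str]:
    """Return TOML table names that never appear as `[name]` in the guide."""
    sections = TABLE_HEADER.findall(toml_text)
    return [name for name in sections if f"[{name}]" not in guide_text]
```

A line-based regex is used instead of a TOML parser so the sketch has no version requirements; a header-like line inside a multi-line string would be a false positive.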
+
+## 7. `docs/architecture/overview.md`
+
+~1000–1500 words. Target reader: contributor new to the codebase.
+
+**Required sections, in order:**
+
+1. **Design principles** — offline-first, subsystem isolation via multiprocessing, loose coupling via local MQTT bus, SQLite WAL as single durable store, SQLAlchemy Core (not ORM), adaptive FPS (2 idle / 30 on motion) with ring buffer.
+2. **Process topology** — ASCII or Mermaid diagram showing parent supervisor + N subsystem processes + mosquitto + Flask web.
+3. **The MQTT bus** — broker location, topic naming convention `vigilar/<subsystem>/<entity>/<event>`, retention/QoS notes (verify at write-time), rationale for MQTT over an in-process queue.
+4. **Data flow: the motion → alert path** — numbered sequence from RTSP capture through motion detection, recording, event creation, highlight scoring, push notification, and UI update. Each step names the actual file/function where it happens (verify at write-time).
+5. **Storage layout** — SQLite table summary (enumerate at write-time by reading `vigilar/storage/`), recordings (`.vge`, AES-256-GCM, key path), HLS segments, backups.
+6. **Configuration & secrets** — TOML → Pydantic v2 validation, secrets as file paths (never inline), PIN & password hashing with constant-time compare.
+7. **The web tier** — Flask + Blueprints, Jinja2 + Bootstrap 5 dark, HLS grid + MJPEG single view rationale, PWA + VAPID.
+8. **What's NOT in the critical path** — remote access (optional), email alerts (optional), cloud (never).
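
Review note: the topic convention `vigilar/<subsystem>/<entity>/<event>` from the MQTT bus item can be illustrated with a pair of helpers. This is only a sketch: `build_topic`, `parse_topic`, and the example level names are hypothetical, and the real code may construct topics differently (the spec itself asks for this to be verified at write-time). The one fixed fact relied on is that `/` separates MQTT topic levels and `+` and `#` are subscription wildcards, so none of them may appear inside a level.

```python
from typing import NamedTuple

TOPIC_ROOT = "vigilar"


class Topic(NamedTuple):
    subsystem: str
    entity: str
    event: str


def build_topic(subsystem: str, entity: str, event: str) -> str:
    """Join three levels under the vigilar/ root, rejecting MQTT-special characters."""
    for level in (subsystem, entity, event):
        if not level or any(ch in level for ch in "/+#"):
            raise ValueError(f"invalid topic level: {level!r}")
    return "/".join((TOPIC_ROOT, subsystem, entity, event))


def parse_topic(topic: str) -> Topic:
    """Split a topic back into its three levels, or raise ValueError."""
    parts = topic.split("/")
    if len(parts) != 4 or parts[0] != TOPIC_ROOT or not all(parts[1:]):
        raise ValueError(f"not a vigilar/<subsystem>/<entity>/<event> topic: {topic!r}")
    return Topic(*parts[1:])
```

For example, `build_topic("camera", "front_door", "motion_start")` yields `vigilar/camera/front_door/motion_start`, and `parse_topic` reverses it.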
+
+## 8. `docs/architecture/conventions.md`
+
+~400 words. Distilled from `CLAUDE.md` but written for human contributors, not the AI. Covers: StrEnum for string constants (`vigilar/constants.py`), SQLAlchemy Core only (no mapped ORM classes), type hints on public functions, no docstrings unless logic is non-obvious, Ruff line-length 100, multiprocessing-per-subsystem rule, MQTT topic naming, Pydantic-validated TOML config, secrets-as-file-paths.
+
+## 9. `docs/architecture/subsystems/*.md` (12 files)
+
+One file per subsystem subdirectory under `vigilar/`: `camera`, `detection`, `events`, `alerts`, `sensors`, `ups`, `storage`, `highlights`, `presence`, `pets`, `health`, `web`.
+
+**Uniform template** (≈150–400 words each):
+
+```
+# <Subsystem name>
+
+## Purpose
+One paragraph — what this subsystem is responsible for.
+
+## Key files
+- `vigilar/<sub>/foo.py` — role
+- ...
+
+## MQTT topics
+**Subscribes:** `vigilar/...`
+**Publishes:** `vigilar/...`
+
+## Database tables
+`table_name` — what it holds. Or "none."
+
+## Depends on
+- sister subsystem X (via topic Y)
+
+## Consumed by
+- sister subsystem Z (via topic W)
+
+## Notes
+Gotchas or perf notes, only if any.
+```
+
+**Grounding rule (hard):** every topic name, every table name, every file role must come from reading the actual code. If a topic cannot be found, the doc must say "no MQTT publishers found at time of writing" — not invent one. This rule is the most important verification step in the plan.
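
Review note: conformance to the uniform template (not the grounding of its contents, which still requires reading the code) could be checked with a few lines. A sketch, assuming each subsystem doc uses exactly the `##` headings from the template and treating `Notes` as optional because the template says "only if any"; `template_problems` is a hypothetical name.

```python
import re

# Section headings required by the uniform template, in template order.
REQUIRED_HEADINGS = [
    "Purpose", "Key files", "MQTT topics", "Database tables",
    "Depends on", "Consumed by",
]


def template_problems(doc_text: str) -> list[str]:
    """Return human-readable problems; an empty list means the doc conforms."""
    headings = re.findall(r"^## +(.+?)[ \t]*$", doc_text, flags=re.MULTILINE)
    problems = [f"missing section: {h}" for h in REQUIRED_HEADINGS if h not in headings]
    present = [h for h in headings if h in REQUIRED_HEADINGS]
    expected = [h for h in REQUIRED_HEADINGS if h in present]
    if present != expected:
        problems.append("sections out of template order")
    return problems
```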
+
+## 10. Verification checklist (before completion)
+
+1. **Link check** — every relative link in every new file resolves to a real path.
+2. **Command check** — every shell command in the user guides exists as a real script under `scripts/` or a real `vigilar` CLI subcommand.
+3. **Grounding check** — every topic name, table name, file path, and endpoint is verified against code, or omitted. Nothing guessed.
+4. **TOML coverage check** — every `[section]` in `config/vigilar.toml` is covered in the operator guide's configuration reference.
+5. **Subsystem coverage check** — every subdirectory in `vigilar/` (matching the 12-file list) has a corresponding subsystem doc.
+6. **Read-through pass** — tone and terminology consistent across all files.
+7. **README link check** — all doc tree links in `README.md` resolve.
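
Review note: the two link checks in this list are described as a one-time manual pass, but a throwaway script can do most of the work without adding tooling to the repository. A minimal sketch, assuming only inline Markdown links and images (`[text](target)`, `![alt](target)`) are used; `broken_links` is a hypothetical name, and reference-style links are not handled.

```python
import re
from pathlib import Path

# Inline Markdown links and images: [text](target) or ![alt](target).
LINK = re.compile(r"!?\[[^\]]*\]\(([^)\s]+)[^)]*\)")


def broken_links(md_file: Path) -> list[str]:
    """Return relative link targets in md_file that do not exist on disk."""
    broken = []
    for target in LINK.findall(md_file.read_text(encoding="utf-8")):
        if re.match(r"^[a-zA-Z][a-zA-Z0-9+.-]*:", target) or target.startswith("#"):
            continue  # external URL (http:, mailto:, ...) or same-page anchor
        path = target.split("#", 1)[0]  # drop any #fragment before checking
        if not (md_file.parent / path).exists():
            broken.append(target)
    return broken
```

Because the spec says the hero image `docs/images/grid.png` is a placeholder that will not be created, this check would report it for `README.md`; that one entry would need to be ignored by hand.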
+
+## 11. Out of scope (explicit)
+
+- `LICENSE` file creation (the README declares GPL-3.0; creating the file is a separate request).
+- Screenshot/image creation (placeholders only).
+- MkDocs configuration.
+- Any code changes.
+- Any changes to `docs/camera-hardware-guide.md`.
+- Any doc-linting CI.
+- Recording-backup-to-NAS feature or docs beyond the "planned" note.
+- Migration documentation beyond noting whether migrations exist.
+
+## 12. Success criteria
+
+- A new homeowner can go from a bare mini PC to working cameras on their phone using only `README.md` + `docs/home-user-guide.md`.
+- A self-hoster can answer any "how do I configure / back up / upgrade / troubleshoot" question from `docs/operator-guide.md` alone.
+- A new contributor can identify which subsystem owns a given behavior within 5 minutes using `docs/architecture/overview.md` + the subsystem files.
+- Every claim in every doc is either verified against current code or explicitly flagged.