diff --git a/docs/superpowers/specs/2026-04-05-project-documentation-design.md b/docs/superpowers/specs/2026-04-05-project-documentation-design.md
new file mode 100644
index 0000000..7632874
--- /dev/null
+++ b/docs/superpowers/specs/2026-04-05-project-documentation-design.md
@@ -0,0 +1,202 @@
+# Vigilar Project Documentation — Design Spec
+
+**Date:** 2026-04-05
+**Status:** Approved, ready for implementation plan
+**Scope:** Create user-facing and architectural documentation for Vigilar, plus a polished top-level `README.md`.
+
+---
+
+## 1. Goal
+
+Vigilar currently has no top-level `README.md`, no user guide, and no architectural reference. Contributors and would-be home users have to read source code to understand what the project is or how to run it. This effort closes that gap with a single coordinated documentation pass.
+
+The docs must:
+
+- Give a homeowner with a mini PC a clear linear path from bare hardware to working cameras on their phone.
+- Give a self-hoster a reference for config, CLI, secrets, backups, upgrades, and troubleshooting.
+- Give a contributor enough architectural context to navigate the codebase without reading every file.
+- Match the project's ethos: plain, no bloat, no cloud, no extra tooling.
+
+## 2. Non-goals
+
+- No doc site / MkDocs build (files are organized *as if* MkDocs-ready, but no tooling is added).
+- No rewrites of existing code or config.
+- No changes to `docs/camera-hardware-guide.md` (already exists, untouched).
+- No changes to anything under `docs/superpowers/` other than this spec.
+- No CI doc-linting, no link-checker automation beyond a one-time manual verification pass.
+- No recording-backup-to-NAS feature work. Docs describe only what exists today; future backup improvements are noted as planned, not documented as present.
+
+## 3. Audience & doc layout (Approach 3 — Hybrid)
+
+Three organizing principles, one tree:
+
+- **User-facing guides** are monolithic linear narratives (humans read top to bottom once).
+- **Architecture docs** are split reference material (contributors jump to the subsystem they're touching).
+- **README** is the front door that ties everything together.
+
+Final tree:
+
+```
+README.md                          # NEW — top-level front door
+docs/
+├── home-user-guide.md             # NEW — monolithic, linear
+├── operator-guide.md              # NEW — monolithic, reference
+├── architecture/
+│   ├── overview.md                # NEW — process model, bus, data flow
+│   ├── conventions.md             # NEW — coding conventions distilled
+│   └── subsystems/
+│       ├── camera.md              # NEW
+│       ├── detection.md           # NEW
+│       ├── events.md              # NEW
+│       ├── alerts.md              # NEW
+│       ├── sensors.md             # NEW
+│       ├── ups.md                 # NEW
+│       ├── storage.md             # NEW
+│       ├── highlights.md          # NEW
+│       ├── presence.md            # NEW
+│       ├── pets.md                # NEW
+│       ├── health.md              # NEW
+│       └── web.md                 # NEW
+└── camera-hardware-guide.md       # EXISTING — untouched
+```
+
+## 4. `README.md` (top-level)
+
+Sells the project, orients newcomers, links into the doc tree.
+
+**Required sections, in order:**
+
+1. **Title + tagline** — "Vigilar — DIY offline-first home security"
+2. **One-line pitch** (~20 words)
+3. **Hero image placeholder** — `![screenshot](docs/images/grid.png)` with a comment noting it's a placeholder. Do not create the image.
+4. **Why Vigilar** — bullet list of differentiators: offline-first, HLS grid + MJPEG single, OpenCV MOG2 motion with 5-sec pre-motion ring buffer, event timeline + highlight reels + timelapses, visitor/pet/wildlife tracking, PWA with VAPID push (no Firebase), AES-256 encrypted recordings, NUT UPS monitoring, runs on cheap mini PC.
+5. **Quick paths table** — 4 rows mapping intent → doc: home user guide, operator guide, architecture overview, camera hardware guide.
+6. **60-second overview** — small ASCII diagram (cameras → mini PC → phone/browser) + 4 tech-stack bullets (Python 3.11, Flask+Bootstrap 5, SQLite WAL, MQTT, FFmpeg).
+7. **Status** — "Alpha. Works on the author's hardware. Expect rough edges and breaking changes."
+8. **Installation TL;DR** — 3-line fenced block (`git clone`, `sudo ./scripts/install.sh`, `sudo systemctl start vigilar`), followed by links to the full home-user and operator guides.
+9. **Documentation** — nested bullet list mirroring the `docs/` tree, each entry with a one-line description.
+10. **License** — GPL-3.0. If no `LICENSE` file exists, the README states the intent and notes a `LICENSE` file will follow; we do **not** create the license file as part of this effort unless asked.
+11. **Contributing** — short stub: "Issues and PRs welcome. See `CLAUDE.md` for code conventions."
+
+## 5. `docs/home-user-guide.md`
+
+Monolithic, linear, ~1500–2500 words. Target reader: homeowner with a mini PC, some Linux comfort, wants cameras on phone.
+
+**Required sections, in order:**
+
+1. **What you'll end up with** — outcome in 2 sentences plus a small ASCII diagram.
+2. **What you need** — hardware checklist: mini PC (x86_64, 4GB+ RAM, 128GB+ SSD), USB stick for OS install, one or more RTSP cameras, phone, optional NAS. Link to `camera-hardware-guide.md` for camera picks.
+3. **Step 1 — Install Debian/Ubuntu Server on the mini PC** — brief, points at upstream installer docs, tells the user to enable SSH. No hand-holding on the OS install itself.
+4. **Step 2 — Get Vigilar onto the box** — `git clone`, `sudo ./scripts/install.sh`, plus 3 bullets summarizing what `install.sh` does (read `scripts/install.sh` at write-time to ground these bullets).
+5. **Step 3 — First boot** — `sudo systemctl enable --now vigilar`, then open `http://:49735` in a browser on the same LAN. Mention the port is configurable under `[web]` in `vigilar.toml`.
+6. **Step 4 — Set your PIN** — UI walkthrough, 2–3 sentences, screenshot placeholder.
+7. **Step 5 — Add your first camera** — UI walkthrough: RTSP URL, credentials, test stream, save. Point at `camera-hardware-guide.md` for URL formats.
+8. **Step 6 — Phone push notifications (PWA)** — open web UI on phone, "Add to Home Screen", allow notifications. Under-the-hood note: VAPID keys already generated by `install.sh`.
+9. **Step 7 — Optional: NAS backup of config + database** — mount NAS share at `/mnt/nas/vigilar-backups`, set `VIGILAR_BACKUP_DIR`, enable a systemd timer wrapping `scripts/backup.sh`. **Explicitly state** that this backs up DB + `/etc/vigilar` (config + secrets) only, and that **recordings stay local** — point at a "planned" note for recording backup.
+10. **Troubleshooting** — camera won't connect, no push notifications, service won't start (`journalctl -u vigilar`), motion detection too sensitive, how to reset PIN.
+11. **Where to go next** — links to Operator Guide and Architecture Overview.
+
+**Grounding rule:** every shell command must correspond to a real file in `scripts/` or a real `vigilar` CLI subcommand. Verify before writing.
+
+## 6. `docs/operator-guide.md`
+
+Monolithic, reference-oriented, ~2500–4000 words. Target reader: self-hoster tuning, upgrading, securing.
+
+**Required sections, in order:**
+
+1. **Audience & scope** — for admins, not first-time home users. Points at home-user-guide.md for initial setup.
+2. **Layout on disk** — table of `/opt/vigilar`, `/etc/vigilar/{vigilar.toml, certs/, secrets/}`, `/var/vigilar/{data/vigilar.db, recordings/, hls/, backups/}`.
+3. **Installation** — what `scripts/install.sh` does, `systemd/vigilar.service` summary, `systemd/vigilar-mosquitto.conf` summary, system dependencies (ffmpeg, mosquitto, sqlite3, Python 3.11+).
+4. **Configuration reference (`vigilar.toml`)** — one subsection per `[section]` in the default TOML. Each key: default, what it controls, when to change. Sections to cover (from current `config/vigilar.toml`): `system`, `mqtt`, `web`, `zigbee2mqtt`, `ups`, `storage`, `remote`, `alerts.local`, `alerts.web_push`, `alerts.email`, plus any additional sections discovered by re-reading the TOML at write-time.
+5. **CLI reference (`vigilar ...`)** — enumerated at write-time by reading `vigilar/cli/`. One subsection per top-level command. Do not guess commands.
+6. **Secrets & security** — `/etc/vigilar/secrets/` layout and permissions; `vigilar config set-pin`; `vigilar config set-password`; TLS via `scripts/gen_cert.sh` → `[web] tls_cert/tls_key`; VAPID via `scripts/gen_vapid_keys.sh`; storage encryption key (`storage.key`) — **explicit warning: do not lose it, recordings are unrecoverable without it**; recommended firewall stance (LAN-only by default).
+7. **UPS / NUT integration** — `scripts/setup_nut.sh`, `[ups]` options, shutdown behavior, low-battery thresholds.
+8. **Backups** — what `scripts/backup.sh` captures (DB + `/etc/vigilar`) and what it does **not** (recordings); `VIGILAR_BACKUP_DIR` and `VIGILAR_BACKUP_RETENTION_DAYS`; suggested systemd timer snippet; restore procedure.
+9. **Upgrades** — `git pull` + `pip install -e .` + `systemctl restart vigilar`; rollback by restoring a backup tarball. If DB migrations exist, note how they're applied; if they don't, say so.
+10. **Logs & health** — `journalctl -u vigilar`, `log_level` in `[system]`, health endpoints (enumerated at write-time by reading `vigilar/web/blueprints/system.py` and `vigilar/health/`).
+11. **Remote access** — `[remote]` section, tunnel-based remote HLS, bandwidth-shaped downscaled streams, reiterated not-a-cloud.
+12. **Troubleshooting** — service crash loops, MQTT broker won't start, camera worker thrashing, disk full / `free_space_floor_gb` triggered, HLS stalling.
+
+**Grounding rule:** every TOML key, every CLI command, every file path, every endpoint must be verified against the current code before writing. Any that can't be verified must be omitted, not guessed.
+
+## 7. `docs/architecture/overview.md`
+
+~1000–1500 words. Target reader: contributor new to the codebase.
+
+**Required sections, in order:**
+
+1. **Design principles** — offline-first, subsystem isolation via multiprocessing, loose coupling via local MQTT bus, SQLite WAL as single durable store, SQLAlchemy Core (not ORM), adaptive FPS (2 idle / 30 on motion) with ring buffer.
+2. **Process topology** — ASCII or Mermaid diagram showing parent supervisor + N subsystem processes + mosquitto + Flask web.
+3. **The MQTT bus** — broker location, topic naming convention `vigilar///`, retention/QoS notes (verify at write-time), rationale for MQTT over an in-process queue.
+4. **Data flow: the motion → alert path** — numbered sequence from RTSP capture through motion detection, recording, event creation, highlight scoring, push notification, and UI update. Each step names the actual file/function where it happens (verify at write-time).
+5. **Storage layout** — SQLite table summary (enumerate at write-time by reading `vigilar/storage/`), recordings (`.vge`, AES-256-GCM, key path), HLS segments, backups.
+6. **Configuration & secrets** — TOML → Pydantic v2 validation, secrets as file paths (never inline), PIN & password hashing with constant-time compare.
+7. **The web tier** — Flask + Blueprints, Jinja2 + Bootstrap 5 dark, HLS grid + MJPEG single view rationale, PWA + VAPID.
+8. **What's NOT in the critical path** — remote access (optional), email alerts (optional), cloud (never).
+
+## 8. `docs/architecture/conventions.md`
+
+~400 words. Distilled from `CLAUDE.md` but written for human contributors, not the AI. Covers: StrEnum for string constants (`vigilar/constants.py`), SQLAlchemy Core only (no mapped ORM classes), type hints on public functions, no docstrings unless logic is non-obvious, Ruff line-length 100, multiprocessing-per-subsystem rule, MQTT topic naming, Pydantic-validated TOML config, secrets-as-file-paths.
+
+## 9. `docs/architecture/subsystems/*.md` (12 files)
+
+One file per subdirectory under `vigilar/`: `camera`, `detection`, `events`, `alerts`, `sensors`, `ups`, `storage`, `highlights`, `presence`, `pets`, `health`, `web`.
+
+**Uniform template** (≈150–400 words each):
+
+```
+# 
+
+## Purpose
+One paragraph — what this subsystem is responsible for.
+
+## Key files
+- `vigilar//foo.py` — role
+- ...
+
+## MQTT topics
+**Subscribes:** `vigilar/...`
+**Publishes:** `vigilar/...`
+
+## Database tables
+`table_name` — what it holds. Or "none."
+
+## Depends on
+- sister subsystem X (via topic Y)
+
+## Consumed by
+- sister subsystem Z (via topic W)
+
+## Notes
+Gotchas or perf notes, only if any.
+```
+
+**Grounding rule (hard):** every topic name, every table name, every file role must come from reading the actual code. If a topic cannot be found, the doc must say "no MQTT publishers found at time of writing" — not invent one. This rule is the most important verification step in the plan.
+
+## 10. Verification checklist (before completion)
+
+1. **Link check** — every relative link in every new file resolves to a real path.
+2. **Command check** — every shell command in the user guides exists as a real script under `scripts/` or a real `vigilar` CLI subcommand.
+3. **Grounding check** — every topic name, table name, file path, and endpoint is verified against code, or omitted. Nothing guessed.
+4. **TOML coverage check** — every `[section]` in `config/vigilar.toml` is covered in the operator guide's configuration reference.
+5. **Subsystem coverage check** — every subdirectory in `vigilar/` (matching the 12-file list) has a corresponding subsystem doc.
+6. **Read-through pass** — tone and terminology consistent across all files.
+7. **README link check** — all doc tree links in `README.md` resolve.
+
+## 11. Out of scope (explicit)
+
+- `LICENSE` file creation (the README declares GPL-3.0; creating the file is a separate request).
+- Screenshot/image creation (placeholders only).
+- MkDocs configuration.
+- Any code changes.
+- Any changes to `docs/camera-hardware-guide.md`.
+- Any doc-linting CI.
+- Recording-backup-to-NAS feature or docs beyond the "planned" note.
+- Migration documentation beyond noting whether migrations exist.
+
+## 12. Success criteria
+
+- A new homeowner can go from a bare mini PC to working cameras on their phone using only `README.md` + `docs/home-user-guide.md`.
+- A self-hoster can answer any "how do I configure / back up / upgrade / troubleshoot" question from `docs/operator-guide.md` alone.
+- A new contributor can identify which subsystem owns a given behavior within 5 minutes using `docs/architecture/overview.md` + the subsystem files.
+- Every claim in every doc is either verified against current code or explicitly flagged.