vigilar/docs/architecture/overview.md
adlee-was-taken 1633e8b34e docs: final verification pass fixes
Convert the "Where to go next" items in the architecture overview from
plain text to proper Markdown links. This was the only finding from the
Task 18 verification pass; everything else (links, commands, TOML
coverage, subsystem coverage, terminology) is self-consistent across
the 17 new doc files.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 10:26:02 -04:00

# Vigilar Architecture Overview
This document explains how Vigilar is put together for someone reading the
codebase for the first time. It is short on purpose — per-subsystem details
live under `subsystems/`.
## Design principles
- **Offline-first.** No external calls in the critical path. Cloud integrations,
if any, are opt-in and off the hot path.
- **Subsystem isolation.** Each subsystem runs in its own process. A crash in
one subsystem cannot take down another — the supervisor in `vigilar/main.py`
restarts crashed children with exponential backoff.
- **Loose coupling via MQTT.** Subsystems do not call each other directly.
They publish and subscribe to a local Mosquitto broker on `127.0.0.1:1883`.
- **SQLite (WAL) is the single durable store.** Access goes through
SQLAlchemy Core expressions (`vigilar/storage/schema.py`), not ORM mapped
classes. WAL mode and `synchronous=NORMAL` are set on every connection in
`vigilar/storage/db.py`.
- **Adaptive cost.** Cameras idle at 2 FPS and jump to 30 FPS on motion, with
a 5-second ring buffer so the moment leading up to the trigger is kept.
- **Configuration is typed.** `config/vigilar.toml` is loaded and validated
by Pydantic v2. Secrets are never inline; the config holds only paths to
files under `/etc/vigilar/secrets/`.
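
The pre-roll behaviour in the "Adaptive cost" principle can be illustrated with a small time-windowed buffer. This is only a sketch, not the code under `vigilar/camera/`: `PreRollBuffer` and its method names are hypothetical, and evicting frames by timestamp (rather than by a fixed frame count) is an assumption.

```python
from collections import deque
from typing import Any, Deque, List, Tuple


class PreRollBuffer:
    """Hypothetical ring buffer: keeps frames from the last `seconds` seconds."""

    def __init__(self, seconds: float = 5.0) -> None:
        self.seconds = seconds
        self._frames: Deque[Tuple[float, Any]] = deque()

    def push(self, timestamp: float, frame: Any) -> None:
        """Store a frame, then evict anything older than the window."""
        self._frames.append((timestamp, frame))
        cutoff = timestamp - self.seconds
        while self._frames and self._frames[0][0] < cutoff:
            self._frames.popleft()

    def flush(self) -> List[Any]:
        """Return buffered frames oldest-first and empty the buffer."""
        frames = [frame for _, frame in self._frames]
        self._frames.clear()
        return frames
```

Because eviction is driven by timestamps, the same buffer works at the idle rate and at the motion rate without resizing.
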
## Process topology
`vigilar start` loads config and calls `run_supervisor()` in `vigilar/main.py`.
The supervisor spawns every subsystem as a `multiprocessing.Process` (via the
`SubsystemProcess` wrapper) and monitors them in a restart loop that runs
every 2 seconds.
Cameras are managed separately by `CameraManager`, which owns one child
process per configured camera.
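
A supervisor of this shape might look roughly like the sketch below. It is not the contents of `vigilar/main.py`: `supervise` and `backoff_delay` are hypothetical names, and the 1-second base and 60-second cap are assumed values, since this document does not state the backoff parameters.

```python
import multiprocessing as mp
import time
from typing import Callable, Dict


def backoff_delay(restarts: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff: base * 2**restarts, never more than `cap` (assumed values)."""
    return min(cap, base * (2 ** restarts))


def supervise(targets: Dict[str, Callable[[], None]], poll: float = 2.0) -> None:
    """Start one process per target and restart any that exit. Runs forever.

    Targets should be module-level functions so they can be pickled when the
    "spawn" start method is in use.
    """
    procs = {name: mp.Process(target=fn, name=name) for name, fn in targets.items()}
    restarts = {name: 0 for name in targets}
    restart_at: Dict[str, float] = {}
    for proc in procs.values():
        proc.start()
    while True:
        time.sleep(poll)
        now = time.monotonic()
        for name in targets:
            if procs[name].is_alive():
                continue
            if name not in restart_at:
                # First time we see the child dead: schedule its restart.
                restart_at[name] = now + backoff_delay(restarts[name])
            elif now >= restart_at[name]:
                procs[name] = mp.Process(target=targets[name], name=name)
                procs[name].start()
                restarts[name] += 1
                del restart_at[name]
```

With these assumed values the delays run 1 s, 2 s, 4 s, and so on up to the cap, so a subsystem that crashes on startup cannot spin the CPU.
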
```
                    systemd (vigilar.service)
                                |
                                v
               vigilar start (supervisor, main.py)
                                |
    +--------+-----------+-----------+-----------+----------+
    |        |           |           |           |          |
    v        v           v           v           v          v
   web    event-      sensor-      ups-      presence-   health-
 (Flask) processor    bridge      monitor     monitor    monitor

    CameraManager --> camera worker (front_door)
                  --> camera worker (backyard)
                  --> camera worker (side_yard)
                  --> camera worker (garage)

        ^                          ^
        |           MQTT           |
        v                          v
    mosquitto (127.0.0.1:1883, loopback only)
```
Every arrow touching the broker is a local TCP connection to loopback. The
`web` process is a Flask server (`vigilar/web/app.py:create_app`) with one
Blueprint per feature area under `vigilar/web/blueprints/`.
## The MQTT bus
- Broker: Mosquitto, bound to loopback only (see `systemd/vigilar-mosquitto.conf`:
`listener 1883 127.0.0.1`, `allow_anonymous true`, `persistence false`).
- Topic convention: every topic starts with `vigilar/` and is defined in
`vigilar/constants.py` via the `Topics` class (either static strings or
builder functions taking an ID). Real examples:
  - `vigilar/camera/{camera_id}/motion/start`
  - `vigilar/camera/{camera_id}/motion/end`
  - `vigilar/camera/{camera_id}/heartbeat`
  - `vigilar/sensor/{sensor_id}/{event_type}`
  - `vigilar/ups/status`, `vigilar/ups/power_loss`, `vigilar/ups/low_battery`
  - `vigilar/system/arm_state`, `vigilar/system/alert`
- Wildcard subscriptions: `vigilar/#`, `vigilar/camera/#`, `vigilar/sensor/#`
- Payloads are JSON dicts. Publishers use `bus.publish_event(topic, **kwargs)`
from `vigilar/bus.py`; callers are responsible for any new fields they add
to a payload.
- Why MQTT rather than an in-process queue: crash isolation, introspection with
`mosquitto_sub`, and the option to move subsystems to separate hosts later
without changing the wire format.
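
The topic convention and wildcard subscriptions above can be sketched in plain Python. The `Topics` class here is a hypothetical stand-in for the one in `vigilar/constants.py` (its attribute and method names are assumptions), `encode_event` is an illustrative helper rather than `bus.publish_event` itself, and `topic_matches` is a hand-written version of standard MQTT filter matching, shown only to make the wildcard semantics concrete.

```python
import json


class Topics:
    """Hypothetical stand-in: static strings plus builder functions taking an ID."""

    UPS_STATUS = "vigilar/ups/status"
    SYSTEM_ARM_STATE = "vigilar/system/arm_state"
    ALL = "vigilar/#"

    @staticmethod
    def camera_motion_start(camera_id: str) -> str:
        return f"vigilar/camera/{camera_id}/motion/start"

    @staticmethod
    def sensor_event(sensor_id: str, event_type: str) -> str:
        return f"vigilar/sensor/{sensor_id}/{event_type}"


def encode_event(**fields: object) -> str:
    """Serialize keyword arguments as a JSON dict payload."""
    return json.dumps(fields)


def topic_matches(pattern: str, topic: str) -> bool:
    """MQTT-style match: `+` covers one level, `#` covers the remaining levels."""
    p_levels = pattern.split("/")
    t_levels = topic.split("/")
    for i, level in enumerate(p_levels):
        if level == "#":
            return True  # matches the parent level and everything below it
        if i >= len(t_levels):
            return False
        if level != "+" and level != t_levels[i]:
            return False
    return len(p_levels) == len(t_levels)
```

A subscriber to `vigilar/camera/#` would therefore see motion and heartbeat topics for every camera, but nothing under `vigilar/sensor/`.
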
## Data flow: from motion to phone notification
1. `vigilar/camera/worker.py:run_camera_worker` — opens the RTSP stream via
`cv2.VideoCapture(..., cv2.CAP_FFMPEG)` with reconnect/backoff, pushes every
frame into a ring buffer, and drives the capture loop.
2. `vigilar/camera/motion.py:MotionDetector.detect` — MOG2 background
subtraction on a downscaled frame; when a new motion edge is found,
`worker.py` publishes `vigilar/camera/{camera_id}/motion/start` with
confidence and zone count.
3. `vigilar/camera/recorder.py:AdaptiveRecorder.start_motion_recording`
stops any idle recording, launches a fresh FFmpeg subprocess at
`motion_fps` (default 30), and writes the flushed ring-buffer frames
(default 5s of pre-roll) before the live frames. On stop, if
`VIGILAR_ENCRYPTION_KEY` is set, the MP4 is encrypted in place to
`.vge` via `vigilar/storage/encryption.py:encrypt_file` (AES-256-CTR).
4. `vigilar/events/processor.py:EventProcessor._handle_event` — subscribes
to `vigilar/#`, classifies the topic into an `EventType`/`Severity`, and
writes a row to the `events` table via
`vigilar/storage/queries.py:insert_event`. Wildlife and pet sightings also
get rows in `wildlife_sightings` / `pet_sightings`.
5. `vigilar/events/rules.py:RuleEngine.evaluate` — matches the event against
configured `[[rules]]` from `vigilar.toml` (AND/OR on arm state, sensor
event, camera motion, time window), honours per-rule cooldowns, and returns
a list of actions.
6. `vigilar/alerts/sender.py:send_alert` — for `alert_all` / `push_and_record`
actions, builds a notification from the `_CONTENT_MAP` table, loads the
VAPID key from `[alerts.web_push].vapid_private_key_file`, and calls
`pywebpush.webpush` for every row in `push_subscriptions`. Successes and
failures are recorded in `alert_log`; endpoints returning `410 Gone` are
pruned.
7. Web UI — the browser holds an open SSE connection to a handler in
`vigilar/web/blueprints/events.py` (`mimetype="text/event-stream"`), which
tails new event rows and pushes them to the timeline live.
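
Step 5 is the piece of this pipeline with the most logic, so a rough sketch of rule matching with per-rule cooldowns may help. It is not the code in `vigilar/events/rules.py`: the `Rule` dataclass, its field names, the free function `evaluate`, and the `armed_away` state are all hypothetical, and conditions are modelled as plain callables rather than parsed `[[rules]]` entries.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Event = Dict[str, object]
Condition = Callable[[Event], bool]


@dataclass
class Rule:
    """Hypothetical rule: conditions combined with AND or OR, plus a cooldown."""

    name: str
    conditions: List[Condition]
    actions: List[str]
    match_all: bool = True  # True = AND, False = OR
    cooldown_s: float = 60.0
    last_fired: float = float("-inf")


def evaluate(rules: List[Rule], event: Event, now: float) -> List[str]:
    """Return the actions of every rule that matches and is not cooling down."""
    actions: List[str] = []
    for rule in rules:
        if now - rule.last_fired < rule.cooldown_s:
            continue  # still inside this rule's cooldown window
        combine = all if rule.match_all else any
        if combine(cond(event) for cond in rule.conditions):
            rule.last_fired = now
            actions.extend(rule.actions)
    return actions
```

Passing `now` in explicitly, instead of reading the clock inside `evaluate`, keeps the cooldown logic easy to test.
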
## Storage layout
- `vigilar.db` under `[system] data_dir` (default `/var/vigilar/data`), SQLite
in WAL mode. Tables defined in `vigilar/storage/schema.py`:
`cameras`, `sensors`, `sensor_states`, `events`, `recordings`,
`system_events`, `arm_state_log`, `alert_log`, `push_subscriptions`,
`pets`, `pet_sightings`, `wildlife_sightings`, `package_events`,
`pet_training_images`, `pet_rules`, `face_profiles`, `face_embeddings`,
`visits`, `timelapse_schedules`.
- Recordings: `.vge` files under `[system] recordings_dir` (default
`/var/vigilar/recordings`), AES-256-CTR with a random 16-byte IV prefixed
to each file. Key at `/etc/vigilar/secrets/storage.key`. **Losing the key
means losing the recordings** — there is no recovery path.
- HLS: rolling segments under `[system] hls_dir` (default `/var/vigilar/hls`),
written by the per-camera `HLSStreamer` in `vigilar/camera/hls.py`.
- Backups: DB + `/etc/vigilar` tarball via `scripts/backup.sh`.
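
The WAL setup described for `vigilar.db` can be shown with the standard-library `sqlite3` module. The project itself goes through SQLAlchemy Core, so this is only a sketch of the two pragmas, and `connect` is a hypothetical helper name.

```python
import sqlite3


def connect(path: str) -> sqlite3.Connection:
    """Open a SQLite database with WAL journaling and synchronous=NORMAL."""
    conn = sqlite3.connect(path)
    # journal_mode=WAL is stored in the database file once set.
    conn.execute("PRAGMA journal_mode=WAL")
    # synchronous is a per-connection setting, so it must be reapplied
    # every time a connection is opened.
    conn.execute("PRAGMA synchronous=NORMAL")
    return conn
```

The per-connection nature of `synchronous` is the reason the pragmas are applied on every connection rather than once at database creation.
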
## Configuration and secrets
- `config/vigilar.toml` is the only configuration file the app reads
(systemd points `VIGILAR_CONFIG` at `/etc/vigilar/vigilar.toml` in
production).
- Validated by Pydantic v2 at startup (`vigilar/config.py`).
- Secrets never live in the TOML; it stores only paths to files under
`/etc/vigilar/secrets/` (`storage.key`, `vapid_private.pem`).
- The arm PIN and admin password are hashed; comparisons are constant-time
(see `vigilar/alerts/pin.py`). The PIN hash is written into the TOML via
`vigilar config set-pin`, never typed by hand.
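
The two ideas above, secrets read from file paths and constant-time comparison of hashes, can be sketched with the standard library. This is not the code in `vigilar/alerts/pin.py`: the function names are hypothetical, and PBKDF2-HMAC-SHA256 with 200,000 iterations is an assumed choice, since this document does not say which hash the project uses.

```python
import hashlib
import hmac
from pathlib import Path


def load_secret(path: str) -> bytes:
    """Read a secret from the file that the config points at."""
    return Path(path).read_bytes()


def hash_pin(pin: str, salt: bytes, iterations: int = 200_000) -> bytes:
    """Derive a hash from the PIN (algorithm and iteration count are assumptions)."""
    return hashlib.pbkdf2_hmac("sha256", pin.encode("utf-8"), salt, iterations)


def verify_pin(pin: str, salt: bytes, expected: bytes) -> bool:
    """Compare in constant time so timing does not reveal how many bytes matched."""
    return hmac.compare_digest(hash_pin(pin, salt), expected)
```

`hmac.compare_digest` is the standard-library primitive for this kind of comparison; a plain `==` on bytes may return early at the first mismatching byte.
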
## The web tier
- Flask with Blueprints, one per feature area under
`vigilar/web/blueprints/`: `cameras`, `events`, `kiosk`, `pets`,
`recordings`, `sensors`, `system`, `visitors`, `wildlife`. All registered
in `vigilar/web/app.py:create_app`.
- Jinja2 templates under `vigilar/web/templates/`, Bootstrap 5 dark theme,
static assets under `vigilar/web/static/`.
- Live view: `hls.js` grid for bandwidth efficiency, MJPEG single view for
low latency.
- Live timeline updates via Server-Sent Events from
`vigilar/web/blueprints/events.py`.
- PWA with VAPID web push — no Firebase, no Google Cloud Messaging. Service
worker at `vigilar/web/static/sw.js`.
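
For readers new to Server-Sent Events, the wire format behind the live timeline is simple enough to sketch without Flask. The helper names and the `timeline` event name below are hypothetical, not taken from `vigilar/web/blueprints/events.py`; in a Flask handler, a generator like `stream` would typically be wrapped in a `Response` with `mimetype="text/event-stream"`.

```python
import json
from typing import Dict, Iterable, Iterator


def format_sse(data: Dict[str, object], event: str = "", event_id: str = "") -> str:
    """Format one SSE message: optional id/event lines, a data line, a blank line."""
    lines = []
    if event_id:
        lines.append(f"id: {event_id}")
    if event:
        lines.append(f"event: {event}")
    # json.dumps emits a single line by default, so one data field is enough.
    lines.append(f"data: {json.dumps(data)}")
    return "\n".join(lines) + "\n\n"


def stream(rows: Iterable[Dict[str, object]]) -> Iterator[str]:
    """Yield one SSE message per event row (rows are assumed to carry an `id`)."""
    for row in rows:
        yield format_sse(row, event="timeline", event_id=str(row["id"]))
```

Sending the row ID as the SSE `id` lets a reconnecting browser report the last event it saw through the `Last-Event-ID` header.
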
## What is NOT in the critical path
- Remote access (`[remote]` section) — optional, bandwidth-shaped HLS over
a WireGuard tunnel.
- Email alerts (`[alerts.email]`) and webhook alerts (`[alerts.webhook]`)
— optional, off by default.
- Any cloud service — never.
## Where to go next
- [Conventions](conventions.md) — coding rules distilled for contributors.
- [Subsystems](subsystems/) — per-subsystem references.