vigilar/docs/architecture/overview.md
adlee-was-taken 1633e8b34e docs: final verification pass fixes
Convert the "Where to go next" items in the architecture overview from
plain text to proper Markdown links. This was the only finding from the
Task 18 verification pass; everything else (links, commands, TOML
coverage, subsystem coverage, terminology) is self-consistent across
the 17 new doc files.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 10:26:02 -04:00

Vigilar Architecture Overview

This document explains how Vigilar is put together for someone reading the codebase for the first time. It is short on purpose — per-subsystem details live under subsystems/.

Design principles

  • Offline-first. No external calls in the critical path. Cloud integrations, if any, are opt-in and off the hot path.
  • Subsystem isolation. Each subsystem runs in its own process. A crash in one subsystem cannot take down another — the supervisor in vigilar/main.py restarts crashed children with exponential backoff.
  • Loose coupling via MQTT. Subsystems do not call each other directly. They publish and subscribe to a local Mosquitto broker on 127.0.0.1:1883.
  • SQLite (WAL) is the single durable store. Access goes through SQLAlchemy Core expressions (vigilar/storage/schema.py), not ORM mapped classes. WAL mode and synchronous=NORMAL are set on every connection in vigilar/storage/db.py.
  • Adaptive cost. Cameras idle at 2 FPS and jump to 30 FPS on motion, with a 5-second ring buffer so the moment leading up to the trigger is kept.
  • Configuration is typed. config/vigilar.toml is loaded and validated by Pydantic v2. Secrets are never inline — they are file paths under /etc/vigilar/secrets/.
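
The pre-roll idea behind the "Adaptive cost" principle can be pictured with a bounded deque. This is a minimal sketch, not the code under vigilar/camera/: the class name PreRollBuffer, its methods, and the choice to size the buffer as fps × seconds are all assumptions made for illustration.

```python
from collections import deque


class PreRollBuffer:
    """Hypothetical fixed-length frame buffer; the oldest frames fall off the front."""

    def __init__(self, fps: int, seconds: float) -> None:
        # Capacity is the number of frames that span `seconds` at `fps`.
        self._frames = deque(maxlen=max(1, int(fps * seconds)))

    def push(self, frame) -> None:
        # Appending to a full deque silently discards the oldest frame.
        self._frames.append(frame)

    def flush(self) -> list:
        # Hand the buffered frames to a recorder, oldest first, then reset.
        frames = list(self._frames)
        self._frames.clear()
        return frames


buf = PreRollBuffer(fps=2, seconds=5)  # idle rate: room for 10 frames
for i in range(25):                    # integers stand in for video frames
    buf.push(i)
pre_roll = buf.flush()                 # only the 10 most recent frames remain
```

Because the deque has a fixed maxlen, memory use stays constant no matter how long a camera idles, and a motion trigger can still recover the frames that preceded it.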

Process topology

vigilar start loads the configuration and calls run_supervisor() in vigilar/main.py. The supervisor spawns every subsystem as a multiprocessing.Process (via the SubsystemProcess wrapper) and checks on them every 2 seconds, restarting any that have exited. Cameras are managed separately by CameraManager, which owns one child process per configured camera.

                     systemd (vigilar.service)
                              |
                              v
                  vigilar start  (supervisor, main.py)
                              |
         +--------+-----------+-----------+-----------+----------+
         |        |           |           |           |          |
         v        v           v           v           v          v
      web      event-     sensor-     ups-      presence-     health-
     (Flask)  processor   bridge     monitor     monitor      monitor
                                                                   
      CameraManager --> camera worker (front_door)
                    --> camera worker (backyard)
                    --> camera worker (side_yard)
                    --> camera worker (garage)

                         ^           ^
                         |  MQTT     |
                         v           v
                    mosquitto (127.0.0.1:1883, loopback only)

Every arrow touching the broker is a local TCP connection to loopback. The web process is a Flask server (vigilar/web/app.py:create_app) with one Blueprint per feature area under vigilar/web/blueprints/.
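
The supervisor's restart behaviour (poll the children, restart the dead ones with exponential backoff) can be sketched as follows. This is an illustration under stated assumptions, not the contents of vigilar/main.py: the function names backoff_delay and supervise, the 1-second base delay, and the 60-second cap are hypothetical, and the SubsystemProcess wrapper is replaced by a bare multiprocessing.Process.

```python
import multiprocessing as mp
import time
from typing import Callable, Dict, Optional


def backoff_delay(restarts: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Delay before the next restart: base * 2**restarts, capped at `cap`."""
    return min(cap, base * (2 ** restarts))


def supervise(targets: Dict[str, Callable[[], None]], poll_interval: float = 2.0) -> None:
    """Start one process per target and restart any that exit (runs forever)."""
    procs = {}
    restarts = {name: 0 for name in targets}
    retry_at: Dict[str, Optional[float]] = {name: None for name in targets}

    for name, fn in targets.items():
        procs[name] = mp.Process(target=fn, name=name, daemon=True)
        procs[name].start()

    while True:
        now = time.monotonic()
        for name, fn in targets.items():
            if procs[name].is_alive():
                continue
            if retry_at[name] is None:
                # First poll after the exit: schedule the restart.
                retry_at[name] = now + backoff_delay(restarts[name])
            elif now >= retry_at[name]:
                # Backoff elapsed: start a fresh process for this subsystem.
                procs[name] = mp.Process(target=fn, name=name, daemon=True)
                procs[name].start()
                restarts[name] += 1
                retry_at[name] = None
        time.sleep(poll_interval)
```

With these assumed defaults the delays grow 1, 2, 4, 8 seconds and so on up to the cap. The sketch never resets the restart counter; a real supervisor would probably do so once a child has stayed up for a while.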

The MQTT bus

  • Broker: Mosquitto, bound to loopback only (see systemd/vigilar-mosquitto.conf: listener 1883 127.0.0.1, allow_anonymous true, persistence false).
  • Topic convention: every topic starts with vigilar/ and is defined in vigilar/constants.py via the Topics class (either static strings or builder functions taking an ID). Real examples:
    • vigilar/camera/{camera_id}/motion/start
    • vigilar/camera/{camera_id}/motion/end
    • vigilar/camera/{camera_id}/heartbeat
    • vigilar/sensor/{sensor_id}/{event_type}
    • vigilar/ups/status, vigilar/ups/power_loss, vigilar/ups/low_battery
    • vigilar/system/arm_state, vigilar/system/alert
    • Wildcard subscriptions: vigilar/#, vigilar/camera/#, vigilar/sensor/#
  • Payloads are JSON dicts. Publishers call bus.publish_event(topic, **kwargs) from vigilar/bus.py; any new payload fields are the caller's responsibility.
  • Why MQTT rather than an in-process queue: crash isolation, introspection with mosquitto_sub, and the option to move subsystems to separate hosts later without changing the wire format.
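
To make the topic convention concrete, here is a small sketch of static topics, builder functions, JSON payload encoding, and wildcard matching. It does not reproduce vigilar/constants.py or vigilar/bus.py: the attribute and method names on this Topics class, the encode_payload helper, the payload keys, and the simplified topic_matches function are assumptions for illustration only.

```python
import json


class Topics:
    """Hypothetical mirror of the convention: static strings plus builders."""

    UPS_STATUS = "vigilar/ups/status"
    SYSTEM_ALERT = "vigilar/system/alert"
    ALL = "vigilar/#"

    @staticmethod
    def camera_motion_start(camera_id: str) -> str:
        return f"vigilar/camera/{camera_id}/motion/start"

    @staticmethod
    def sensor_event(sensor_id: str, event_type: str) -> str:
        return f"vigilar/sensor/{sensor_id}/{event_type}"


def encode_payload(**fields) -> bytes:
    """Serialise keyword arguments as a JSON object, as a publisher might."""
    return json.dumps(fields, separators=(",", ":")).encode("utf-8")


def topic_matches(pattern: str, topic: str) -> bool:
    """Simplified MQTT filter match: '+' is one level, a trailing '#' is the rest."""
    p_parts, t_parts = pattern.split("/"), topic.split("/")
    for i, part in enumerate(p_parts):
        if part == "#":
            return True  # multi-level wildcard matches everything below
        if i >= len(t_parts):
            return False
        if part != "+" and part != t_parts[i]:
            return False
    return len(p_parts) == len(t_parts)


topic = Topics.camera_motion_start("front_door")
payload = encode_payload(confidence=0.9, zones=2)  # key names are assumed
```

A subscriber on vigilar/camera/# would receive this message, while one on vigilar/sensor/# would not. The same filters can be tried against a live broker with mosquitto_sub.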

Data flow: from motion to phone notification

  1. vigilar/camera/worker.py:run_camera_worker — opens the RTSP stream via cv2.VideoCapture(..., cv2.CAP_FFMPEG) with reconnect/backoff, pushes every frame into a ring buffer, and drives the capture loop.
  2. vigilar/camera/motion.py:MotionDetector.detect — MOG2 background subtraction on a downscaled frame; when a new motion edge is found, worker.py publishes vigilar/camera/{camera_id}/motion/start with confidence and zone count.
  3. vigilar/camera/recorder.py:AdaptiveRecorder.start_motion_recording — stops any idle recording, launches a fresh FFmpeg subprocess at motion_fps (default 30), and writes the flushed ring-buffer frames (default 5s of pre-roll) before the live frames. On stop, if VIGILAR_ENCRYPTION_KEY is set, the MP4 is re-encrypted in place to .vge via vigilar/storage/encryption.py:encrypt_file (AES-256-CTR).
  4. vigilar/events/processor.py:EventProcessor._handle_event — subscribes to vigilar/#, classifies the topic into an EventType/Severity, and writes a row to the events table via vigilar/storage/queries.py:insert_event. Wildlife and pet sightings also get rows in wildlife_sightings / pet_sightings.
  5. vigilar/events/rules.py:RuleEngine.evaluate — matches the event against configured [[rules]] from vigilar.toml (AND/OR on arm state, sensor event, camera motion, time window), honours per-rule cooldowns, and returns a list of actions.
  6. vigilar/alerts/sender.py:send_alert — for alert_all / push_and_record actions, builds a notification from the _CONTENT_MAP table, loads the VAPID key from [alerts.web_push].vapid_private_key_file, and calls pywebpush.webpush for every row in push_subscriptions. Successes and failures are recorded in alert_log; endpoints returning 410 Gone are pruned.
  7. Web UI — the browser holds an open SSE connection to a handler in vigilar/web/blueprints/events.py (mimetype="text/event-stream"), which tails new event rows and pushes them to the timeline live.
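
Step 5 is the easiest stage to picture in isolation. The following is a minimal sketch of per-rule cooldowns, assuming a rule is a predicate plus a list of actions. The Rule and CooldownRuleEngine classes, the event keys, and the "armed_away" value are hypothetical and are not taken from vigilar/events/rules.py.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Rule:
    """Hypothetical rule: a predicate over the event, actions, and a cooldown."""
    name: str
    matches: Callable[[dict], bool]
    actions: List[str]
    cooldown_s: float = 0.0


@dataclass
class CooldownRuleEngine:
    rules: List[Rule]
    _last_fired: Dict[str, float] = field(default_factory=dict)

    def evaluate(self, event: dict, now: Optional[float] = None) -> List[str]:
        now = time.monotonic() if now is None else now
        actions: List[str] = []
        for rule in self.rules:
            if not rule.matches(event):
                continue
            last = self._last_fired.get(rule.name)
            if last is not None and now - last < rule.cooldown_s:
                continue  # rule matched but is still cooling down
            self._last_fired[rule.name] = now
            actions.extend(rule.actions)
        return actions


engine = CooldownRuleEngine([
    Rule(
        name="armed_motion",
        # AND of arm state and camera motion, written as a plain predicate.
        matches=lambda e: e["arm_state"] == "armed_away" and e["type"] == "camera_motion",
        actions=["push_and_record"],
        cooldown_s=60.0,
    )
])
event = {"arm_state": "armed_away", "type": "camera_motion"}
first = engine.evaluate(event, now=0.0)    # fires
second = engine.evaluate(event, now=30.0)  # suppressed by the cooldown
third = engine.evaluate(event, now=61.0)   # fires again
```

Passing now explicitly keeps the sketch deterministic; in a long-running process the monotonic clock default would be used instead.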

Storage layout

  • vigilar.db under [system] data_dir (default /var/vigilar/data), SQLite in WAL mode. Tables defined in vigilar/storage/schema.py: cameras, sensors, sensor_states, events, recordings, system_events, arm_state_log, alert_log, push_subscriptions, pets, pet_sightings, wildlife_sightings, package_events, pet_training_images, pet_rules, face_profiles, face_embeddings, visits, timelapse_schedules.
  • Recordings: .vge files under [system] recordings_dir (default /var/vigilar/recordings), AES-256-CTR with a random 16-byte IV prefixed to each file. Key at /etc/vigilar/secrets/storage.key. Losing the key means losing the recordings — there is no recovery path.
  • HLS: rolling segments under [system] hls_dir (default /var/vigilar/hls), written by the per-camera HLSStreamer in vigilar/camera/hls.py.
  • Backups: DB + /etc/vigilar tarball via scripts/backup.sh.
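
The WAL and synchronous settings mentioned for vigilar.db can be demonstrated with the standard library alone. This sketch uses sqlite3 directly rather than SQLAlchemy so that it stays self-contained; the connect helper and the temporary path are assumptions, not the code in vigilar/storage/db.py.

```python
import os
import sqlite3
import tempfile


def connect(path: str) -> sqlite3.Connection:
    """Open a connection and apply the two pragmas."""
    conn = sqlite3.connect(path)
    # journal_mode=WAL is stored in the database file once set.
    conn.execute("PRAGMA journal_mode=WAL")
    # synchronous is per-connection state, so it has to be set every time.
    conn.execute("PRAGMA synchronous=NORMAL")
    return conn


db_path = os.path.join(tempfile.mkdtemp(), "vigilar.db")  # throwaway location
conn = connect(db_path)
journal_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
synchronous = conn.execute("PRAGMA synchronous").fetchone()[0]
conn.close()
```

SQLite reports the journal mode as the string "wal" and encodes synchronous=NORMAL as the integer 1. Since the synchronous setting does not persist across connections, applying it on every connection is necessary rather than redundant.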

Configuration and secrets

  • config/vigilar.toml is the only configuration file the app reads (systemd points VIGILAR_CONFIG at /etc/vigilar/vigilar.toml in production).
  • Validated by Pydantic v2 at startup (vigilar/config.py).
  • Secrets never live in the TOML; they are file paths under /etc/vigilar/secrets/ (storage.key, vapid_private.pem).
  • The arm PIN and admin password are hashed; comparisons are constant-time (see vigilar/alerts/pin.py). The PIN hash is written into the TOML via vigilar config set-pin, never typed by hand.
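
Hashed storage with constant-time comparison can be sketched with the standard library. The hashing scheme here (PBKDF2-HMAC-SHA256, a 16-byte salt, 200,000 iterations) and the salt$iterations$digest storage format are assumptions chosen for illustration; the source does not say what vigilar/alerts/pin.py uses.

```python
import hashlib
import hmac
import os
from typing import Optional


def hash_pin(pin: str, salt: Optional[bytes] = None, iterations: int = 200_000) -> str:
    """Return 'salt_hex$iterations$digest_hex' (hypothetical storage format)."""
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.pbkdf2_hmac("sha256", pin.encode("utf-8"), salt, iterations)
    return f"{salt.hex()}${iterations}${digest.hex()}"


def verify_pin(pin: str, stored: str) -> bool:
    """Recompute the digest and compare it in constant time."""
    salt_hex, iterations, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", pin.encode("utf-8"), bytes.fromhex(salt_hex), int(iterations)
    )
    # compare_digest avoids leaking how many leading bytes matched.
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))


stored = hash_pin("4921")        # what a set-pin command might write out
ok = verify_pin("4921", stored)
bad = verify_pin("0000", stored)
```

hmac.compare_digest is what makes the comparison constant-time. An ordinary == on bytes may return as soon as the first differing byte is found, which can leak timing information.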

The web tier

  • Flask with Blueprints, one per feature area under vigilar/web/blueprints/: cameras, events, kiosk, pets, recordings, sensors, system, visitors, wildlife. All registered in vigilar/web/app.py:create_app.
  • Jinja2 templates under vigilar/web/templates/, Bootstrap 5 dark theme, static assets under vigilar/web/static/.
  • Live view: hls.js grid for bandwidth efficiency, MJPEG single view for low latency.
  • Live timeline updates via Server-Sent Events from vigilar/web/blueprints/events.py.
  • PWA with VAPID web push — no Firebase, no Google Cloud Messaging. Service worker at vigilar/web/static/sw.js.
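
The live timeline relies on the Server-Sent Events wire format, which is simple enough to show in full. This is a sketch only: format_sse, stream, the "timeline" event name, and the row shape are hypothetical and not taken from vigilar/web/blueprints/events.py.

```python
import json
from typing import Iterable, Iterator, Optional


def format_sse(data: dict, event: Optional[str] = None, event_id: Optional[int] = None) -> str:
    """Encode one message as text/event-stream: field lines, then a blank line."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # json.dumps emits no newlines by default, so one data line is enough.
    lines.append(f"data: {json.dumps(data)}")
    return "\n".join(lines) + "\n\n"


def stream(rows: Iterable[dict]) -> Iterator[str]:
    """Yield one SSE frame per event row (rows would come from the database)."""
    for row in rows:
        yield format_sse(row, event="timeline", event_id=row["id"])


frames = list(stream([{"id": 1, "type": "camera_motion"}]))
```

In a Flask view, a generator like stream can be returned wrapped in a Response with mimetype="text/event-stream". On the browser side, an EventSource listener for the assumed "timeline" event would then receive each row as it is yielded.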

What is NOT in the critical path

  • Remote access ([remote] section) — optional, bandwidth-shaped HLS over a WireGuard tunnel.
  • Email alerts ([alerts.email]) and webhook alerts ([alerts.webhook]) — optional, off by default.
  • Any cloud service — never.

Where to go next