vigilar/docs/architecture/overview.md
adlee-was-taken d38b0c4e25 docs: add architecture overview
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 09:36:37 -04:00

Vigilar Architecture Overview

This document explains how Vigilar is put together for someone reading the codebase for the first time. It is short on purpose — per-subsystem details live under subsystems/.

Design principles

  • Offline-first. No external calls in the critical path. Cloud integrations, if any, are opt-in and off the hot path.
  • Subsystem isolation. Each subsystem runs in its own process, so a crash in one cannot take down another. The supervisor in vigilar/main.py restarts crashed children with exponential backoff.
  • Loose coupling via MQTT. Subsystems do not call each other directly. They publish and subscribe to a local Mosquitto broker on 127.0.0.1:1883.
  • SQLite (WAL) is the single durable store. Access goes through SQLAlchemy Core expressions (vigilar/storage/schema.py), not ORM mapped classes. WAL mode and synchronous=NORMAL are set on every connection in vigilar/storage/db.py.
  • Adaptive cost. Cameras idle at 2 FPS and jump to 30 FPS on motion, with a 5-second ring buffer so the moment leading up to the trigger is kept.
  • Configuration is typed. config/vigilar.toml is loaded and validated by Pydantic v2. Secrets are never inline — they are file paths under /etc/vigilar/secrets/.
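
The adaptive-cost principle can be illustrated with a bounded ring buffer. This is a minimal sketch, not the project's code: PreRollBuffer and target_fps are hypothetical names, and sizing the buffer as capture rate times seconds is an assumption. Only the 2 FPS, 30 FPS and 5-second figures come from the text above.

```python
from collections import deque

IDLE_FPS = 2          # idle capture rate, from the text above
MOTION_FPS = 30       # capture rate while motion is active
PRE_ROLL_SECONDS = 5  # how much history to keep before a trigger


class PreRollBuffer:
    """Hypothetical ring buffer holding the most recent `seconds` of frames."""

    def __init__(self, fps: int, seconds: int) -> None:
        # A deque with maxlen drops its oldest item when a new one is appended
        # to a full buffer, which is exactly ring-buffer behaviour.
        self._frames = deque(maxlen=fps * seconds)

    def push(self, frame) -> None:
        self._frames.append(frame)

    def flush(self) -> list:
        """Return the buffered frames oldest-first and empty the buffer."""
        frames = list(self._frames)
        self._frames.clear()
        return frames


def target_fps(motion_active: bool) -> int:
    """Pick the capture rate for the current state."""
    return MOTION_FPS if motion_active else IDLE_FPS


# Usage: at idle the buffer holds 2 * 5 = 10 frames, so pushing 25 frames
# (integers stand in for images here) keeps only the last 10.
buf = PreRollBuffer(fps=IDLE_FPS, seconds=PRE_ROLL_SECONDS)
for frame_number in range(25):
    buf.push(frame_number)
pre_roll = buf.flush()
```

On a motion trigger, a recorder would write the flushed frames first and then continue with live frames at target_fps(True).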

Process topology

vigilar start loads config and calls run_supervisor() in vigilar/main.py. The supervisor spawns every subsystem as a multiprocessing.Process (via the SubsystemProcess wrapper), then checks them every 2 seconds and restarts any that have exited. Cameras are managed separately by CameraManager, which owns one child process per configured camera.

                     systemd (vigilar.service)
                              |
                              v
                  vigilar start  (supervisor, main.py)
                              |
         +--------+-----------+-----------+-----------+----------+
         |        |           |           |           |          |
         v        v           v           v           v          v
      web      event-     sensor-     ups-      presence-     health-
     (Flask)  processor   bridge     monitor     monitor      monitor
                                                                   
      CameraManager --> camera worker (front_door)
                    --> camera worker (backyard)
                    --> camera worker (side_yard)
                    --> camera worker (garage)

                         ^           ^
                         |  MQTT     |
                         v           v
                    mosquitto (127.0.0.1:1883, loopback only)

Every arrow touching the broker is a local TCP connection to loopback. The web process is a Flask server (vigilar/web/app.py:create_app) with one Blueprint per feature area under vigilar/web/blueprints/.
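
The supervisor's restart behaviour can be sketched as follows. This is an illustration under stated assumptions rather than the contents of vigilar/main.py: backoff_delay and supervise are hypothetical names, the 1-second base and 60-second cap are invented for the example, and the SubsystemProcess wrapper is replaced by a plain multiprocessing.Process.

```python
import multiprocessing as mp
import time
from typing import Callable, Dict, Optional

POLL_INTERVAL_S = 2.0  # the supervisor's check interval, per the text above


def backoff_delay(restart_count: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Delay before restart number `restart_count` (0-based): base * 2**n, capped."""
    return min(cap, base * (2 ** restart_count))


def supervise(
    targets: Dict[str, Callable[[], None]],
    poll_interval: float = POLL_INTERVAL_S,
    max_cycles: Optional[int] = None,
) -> None:
    """Start one process per target and restart any that exit."""
    procs = {name: mp.Process(target=fn, name=name) for name, fn in targets.items()}
    restarts = {name: 0 for name in targets}
    retry_at: Dict[str, Optional[float]] = {name: None for name in targets}
    for proc in procs.values():
        proc.start()

    cycle = 0
    try:
        while max_cycles is None or cycle < max_cycles:
            time.sleep(poll_interval)
            now = time.monotonic()
            for name in targets:
                proc = procs[name]
                if proc.is_alive():
                    continue
                if retry_at[name] is None:
                    # First poll that sees the child dead: schedule the restart.
                    retry_at[name] = now + backoff_delay(restarts[name])
                if now >= retry_at[name]:
                    proc.join()  # reap the dead child before replacing it
                    procs[name] = mp.Process(target=targets[name], name=name)
                    procs[name].start()
                    restarts[name] += 1
                    retry_at[name] = None
            cycle += 1
    finally:
        # Stop all children when the supervisor itself exits.
        for proc in procs.values():
            proc.terminate()
            proc.join()
```

The sketch treats any exit as a crash and never resets the restart counter. A production supervisor would likely reset it after a child has stayed up for some time.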

The MQTT bus

  • Broker: Mosquitto, bound to loopback only (see systemd/vigilar-mosquitto.conf: listener 1883 127.0.0.1, allow_anonymous true, persistence false).
  • Topic convention: every topic starts with vigilar/ and is defined in vigilar/constants.py via the Topics class (either static strings or builder functions taking an ID). Real examples:
    • vigilar/camera/{camera_id}/motion/start
    • vigilar/camera/{camera_id}/motion/end
    • vigilar/camera/{camera_id}/heartbeat
    • vigilar/sensor/{sensor_id}/{event_type}
    • vigilar/ups/status, vigilar/ups/power_loss, vigilar/ups/low_battery
    • vigilar/system/arm_state, vigilar/system/alert
    • Wildcard subscriptions: vigilar/#, vigilar/camera/#, vigilar/sensor/#
  • Payloads are JSON dicts. Publishers use bus.publish_event(topic, **kwargs) from vigilar/bus.py; the keyword arguments become the payload, so adding a new field is the caller's responsibility.
  • Why MQTT rather than an in-process queue: crash isolation, introspection with mosquitto_sub, and the option to move subsystems to separate hosts later without changing the wire format.
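
The topic convention, JSON payloads and wildcard subscriptions above can be sketched with the standard library alone. The class below is a hypothetical stand-in for the Topics class in vigilar/constants.py; its attribute and method names are assumptions, as are encode_event and topic_matches. Only the topic strings come from the list above. topic_matches follows the standard MQTT filter rules for # and + but ignores the special handling of topics starting with $.

```python
import json


class Topics:
    """Hypothetical mirror of the topic definitions: static strings and builders."""

    UPS_STATUS = "vigilar/ups/status"
    SYSTEM_ALERT = "vigilar/system/alert"

    @staticmethod
    def camera_motion_start(camera_id: str) -> str:
        return f"vigilar/camera/{camera_id}/motion/start"

    @staticmethod
    def sensor_event(sensor_id: str, event_type: str) -> str:
        return f"vigilar/sensor/{sensor_id}/{event_type}"


def encode_event(**fields) -> bytes:
    """Serialize keyword arguments as a JSON dict, ready to publish."""
    return json.dumps(fields, sort_keys=True).encode("utf-8")


def topic_matches(pattern: str, topic: str) -> bool:
    """Return True if `topic` matches the MQTT subscription filter `pattern`."""
    pattern_levels = pattern.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(pattern_levels):
        if level == "#":
            return True  # matches the parent level and everything below it
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(pattern_levels) == len(topic_levels)


# Usage: build a topic and payload, then check which subscribers would see it.
topic = Topics.camera_motion_start("front_door")
payload = encode_event(confidence=0.87, zones=2)
seen_by_event_processor = topic_matches("vigilar/#", topic)
seen_by_sensor_listener = topic_matches("vigilar/sensor/#", topic)
```

Keeping topic construction in one place means a renamed topic changes in a single file, and subscribers can be checked against the same builders in tests.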

Data flow: from motion to phone notification

  1. vigilar/camera/worker.py:run_camera_worker — opens the RTSP stream via cv2.VideoCapture(..., cv2.CAP_FFMPEG) with reconnect/backoff, pushes every frame into a ring buffer, and drives the capture loop.
  2. vigilar/camera/motion.py:MotionDetector.detect — MOG2 background subtraction on a downscaled frame; when a new motion edge is found, worker.py publishes vigilar/camera/{camera_id}/motion/start with confidence and zone count.
  3. vigilar/camera/recorder.py:AdaptiveRecorder.start_motion_recording — stops any idle recording, launches a fresh FFmpeg subprocess at motion_fps (default 30), and writes the flushed ring-buffer frames (default 5s of pre-roll) before the live frames. On stop, if VIGILAR_ENCRYPTION_KEY is set, the finished MP4 is encrypted to a .vge file via vigilar/storage/encryption.py:encrypt_file (AES-256-CTR).
  4. vigilar/events/processor.py:EventProcessor._handle_event — subscribes to vigilar/#, classifies the topic into an EventType/Severity, and writes a row to the events table via vigilar/storage/queries.py:insert_event. Wildlife and pet sightings also get rows in wildlife_sightings / pet_sightings.
  5. vigilar/events/rules.py:RuleEngine.evaluate — matches the event against configured [[rules]] from vigilar.toml (AND/OR on arm state, sensor event, camera motion, time window), honours per-rule cooldowns, and returns a list of actions.
  6. vigilar/alerts/sender.py:send_alert — for alert_all / push_and_record actions, builds a notification from the _CONTENT_MAP table, loads the VAPID key from [alerts.web_push].vapid_private_key_file, and calls pywebpush.webpush for every row in push_subscriptions. Successes and failures are recorded in alert_log; endpoints returning 410 Gone are pruned.
  7. Web UI — the browser holds an open SSE connection to a handler in vigilar/web/blueprints/events.py (mimetype="text/event-stream"), which tails new event rows and pushes them to the timeline live.
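
Step 5 carries the most logic, so here is a minimal sketch of AND/OR matching with per-rule cooldowns. It is not the RuleEngine from vigilar/events/rules.py: the Rule dataclass, its field names, the flat event dict and the "armed_away" value are all assumptions made for illustration. Only the alert_all action name comes from the text above.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical rule: conditions are key/value pairs checked against an event."""

    name: str
    conditions: dict
    action: str
    mode: str = "AND"                  # "AND": all conditions; "OR": any condition
    cooldown_s: float = 0.0
    last_fired: float = float("-inf")  # never fired yet


def evaluate(rules: list, event: dict, now: float) -> list:
    """Return the actions of every rule that matches and is not cooling down."""
    actions = []
    for rule in rules:
        hits = [event.get(key) == value for key, value in rule.conditions.items()]
        matched = all(hits) if rule.mode == "AND" else any(hits)
        if not matched:
            continue
        if now - rule.last_fired < rule.cooldown_s:
            continue  # matched, but still inside the cooldown window
        rule.last_fired = now
        actions.append(rule.action)
    return actions


# Usage: one rule that fires on camera motion while armed, at most once a minute.
rules = [
    Rule(
        name="motion-while-armed",
        conditions={"arm_state": "armed_away", "camera_motion": True},
        action="alert_all",
        cooldown_s=60.0,
    )
]
event = {"arm_state": "armed_away", "camera_motion": True}
first = evaluate(rules, event, now=0.0)
during_cooldown = evaluate(rules, event, now=30.0)
after_cooldown = evaluate(rules, event, now=61.0)
```

Passing the clock in as an argument keeps the cooldown logic deterministic and easy to test. A time-window condition could be added as one more check before the cooldown test.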

Storage layout

  • vigilar.db under [system] data_dir (default /var/vigilar/data), SQLite in WAL mode. Tables defined in vigilar/storage/schema.py: cameras, sensors, sensor_states, events, recordings, system_events, arm_state_log, alert_log, push_subscriptions, pets, pet_sightings, wildlife_sightings, package_events, pet_training_images, pet_rules, face_profiles, face_embeddings, visits, timelapse_schedules.
  • Recordings: .vge files under [system] recordings_dir (default /var/vigilar/recordings), AES-256-CTR with a random 16-byte IV prefixed to each file. Key at /etc/vigilar/secrets/storage.key. Losing the key means losing the recordings — there is no recovery path.
  • HLS: rolling segments under [system] hls_dir (default /var/vigilar/hls), written by the per-camera HLSStreamer in vigilar/camera/hls.py.
  • Backups: DB + /etc/vigilar tarball via scripts/backup.sh.
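
The per-connection pragmas mentioned for vigilar.db can be sketched with the standard sqlite3 module. The project itself goes through SQLAlchemy Core, so this is an illustration of the settings, not the code in vigilar/storage/db.py; the connect helper and the temporary path are assumptions.

```python
import os
import sqlite3
import tempfile


def connect(db_path: str) -> sqlite3.Connection:
    """Open a connection with WAL journaling and synchronous=NORMAL."""
    conn = sqlite3.connect(db_path)
    # journal_mode=WAL is stored in the database file once set, but
    # synchronous is a per-connection setting, so it must be applied
    # every time a connection is opened.
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("PRAGMA synchronous=NORMAL")
    return conn


# Usage: WAL needs a file-backed database, so use a temporary directory.
tmp_dir = tempfile.mkdtemp()
conn = connect(os.path.join(tmp_dir, "vigilar.db"))
journal_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
synchronous = conn.execute("PRAGMA synchronous").fetchone()[0]
conn.close()
```

SQLite reports synchronous as an integer, where 1 means NORMAL.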

Configuration and secrets

  • config/vigilar.toml is the only configuration file the app reads (systemd points VIGILAR_CONFIG at /etc/vigilar/vigilar.toml in production).
  • Validated by Pydantic v2 at startup (vigilar/config.py).
  • Secrets never live in the TOML; they are file paths under /etc/vigilar/secrets/ (storage.key, vapid_private.pem).
  • The arm PIN and admin password are hashed; comparisons are constant-time (see vigilar/alerts/pin.py). The PIN hash is written into the TOML via vigilar config set-pin, never typed by hand.
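
Hashed storage with constant-time comparison can be sketched as below. The actual scheme in vigilar/alerts/pin.py is not shown in this document, so the choice of PBKDF2-SHA256, the iteration count, the stored string format and the function names are all assumptions.

```python
import hashlib
import hmac
import os
from typing import Optional

ITERATIONS = 200_000  # assumed work factor, not taken from the project


def hash_pin(pin: str, salt: Optional[bytes] = None, iterations: int = ITERATIONS) -> str:
    """Return a self-describing hash string: scheme$iterations$salt$digest."""
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.pbkdf2_hmac("sha256", pin.encode("utf-8"), salt, iterations)
    return f"pbkdf2_sha256${iterations}${salt.hex()}${digest.hex()}"


def verify_pin(pin: str, stored: str) -> bool:
    """Recompute the hash for `pin` and compare it in constant time."""
    _scheme, iterations, salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", pin.encode("utf-8"), bytes.fromhex(salt_hex), int(iterations)
    )
    # compare_digest takes the same time wherever the first mismatch occurs,
    # so response timing does not reveal how much of the digest matched.
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))


# Usage: only the hash string would be written to the config file.
stored_hash = hash_pin("4821")
accepted = verify_pin("4821", stored_hash)
rejected = verify_pin("0000", stored_hash)
```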

The web tier

  • Flask with Blueprints, one per feature area under vigilar/web/blueprints/: cameras, events, kiosk, pets, recordings, sensors, system, visitors, wildlife. All registered in vigilar/web/app.py:create_app.
  • Jinja2 templates under vigilar/web/templates/, Bootstrap 5 dark theme, static assets under vigilar/web/static/.
  • Live view: hls.js grid for bandwidth efficiency, MJPEG single view for low latency.
  • Live timeline updates via Server-Sent Events from vigilar/web/blueprints/events.py.
  • PWA with VAPID web push — no Firebase, no Google Cloud Messaging. Service worker at vigilar/web/static/sw.js.
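
The Server-Sent Events framing used for the live timeline can be sketched without Flask. format_sse and stream are hypothetical helpers, and the row fields are invented for the example; the real handler lives in vigilar/web/blueprints/events.py and is not reproduced here.

```python
import json
from typing import Iterable, Iterator, Optional


def format_sse(data: dict, event: Optional[str] = None, event_id: Optional[int] = None) -> str:
    """Encode one message in the text/event-stream wire format."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # json.dumps emits no raw newlines by default, so one data line suffices.
    lines.append(f"data: {json.dumps(data)}")
    # A blank line terminates the message.
    return "\n".join(lines) + "\n\n"


def stream(rows: Iterable[dict]) -> Iterator[str]:
    """Yield one SSE message per event row."""
    for row in rows:
        yield format_sse(row, event_id=row["id"])


# Usage: two made-up event rows turned into wire-format messages.
messages = list(stream([
    {"id": 7, "type": "motion"},
    {"id": 8, "type": "door_open"},
]))
```

In a Flask view, a generator like stream would be returned as Response(stream(rows), mimetype="text/event-stream"). The id field lets a reconnecting browser send Last-Event-ID so the server can resume from the right row.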

What is NOT in the critical path

  • Remote access ([remote] section) — optional, bandwidth-shaped HLS over a WireGuard tunnel.
  • Email alerts ([alerts.email]) and webhook alerts ([alerts.webhook]) — optional, off by default.
  • Any cloud service — never.

Where to go next

  • Conventions: conventions.md
  • Per-subsystem details: subsystems/