Closes #1.

The Flask event-timeline was dead: `broadcast_sse_event` existed in
`vigilar/web/blueprints/events.py` but had zero call sites. Clients
subscribed to `/events/stream`, received the initial "connected"
message, and then only keepalives; a page refresh was required to see
new events. (Web Push via VAPID was independent and already worked.)

The root cause was a process-boundary gap: the events subsystem runs in
its own OS process and emits to MQTT, while the Flask app runs in a
separate process with no MQTT client of its own.

This change adds a thin bridge:

- EventProcessor._handle_event now publishes a classified summary
  (id, ts, type, severity, source_id, payload) to a new topic
  `Topics.EVENTS_PUBLISHED = "vigilar/events/published"` right after
  `insert_event()`. Classification logic stays in one place.
- A new module `vigilar/web/sse_bridge.py` provides `forward_event`
  (MQTT handler) and `start_sse_bridge(cfg)` (creates a MessageBus,
  subscribes forward_event to EVENTS_PUBLISHED, connects, returns the
  bus).
- `vigilar/main.py:_run_web` starts the bridge after `create_app(cfg)`
  and disconnects it on shutdown. Bridge failure is logged but does not
  kill the web process; the UI still works without live updates.
- `create_app` is deliberately NOT changed. Keeping the bridge out of
  the app factory means no existing test triggers a real MQTT
  connection, and the bridge stays a production-only concern wired by
  the supervisor.

Tests (all added with TDD, RED verified before GREEN):

- tests/unit/test_events.py::TestEventsPublishedBroadcast: asserts
  `_handle_event` publishes the classified payload for a motion event
  and does NOT publish for unclassified topics (heartbeats).
- tests/unit/test_sse_bridge.py: asserts `forward_event` reaches SSE
  subscribers, and `start_sse_bridge` wires the handler to
  `Topics.EVENTS_PUBLISHED` on a connected bus (fake bus, no real MQTT
  in tests).

Also refreshes the docs that previously flagged the dead SSE as a known
limitation (operator guide, web architecture doc).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

# Vigilar Operator Guide

## Audience and scope

This guide is for administrators installing and operating Vigilar on a
server they already manage. It is a reference for the on-disk layout,
configuration keys, CLI, systemd integration, secrets, UPS integration,
backups, upgrades, health and pruning, remote access, and the current
set of known limitations.

If you are setting up a home system from a bare mini PC for the first
time, start with the [Home User Guide](home-user-guide.md) and return
here when you need reference-level detail.

## Layout on disk

`scripts/install.sh` lays the system out as follows. The paths are
fixed in the installer; the configuration keys that reference them are
shown in parentheses.

| Path | Owner | Mode | Purpose |
|---|---|---|---|
| `/opt/vigilar/` | `vigilar:vigilar` | 0755 | Install root and home of the service user |
| `/opt/vigilar/venv/` | `vigilar:vigilar` | 0755 | Python virtual environment with the `vigilar` entry point |
| `/etc/vigilar/` | `root:root` | 0755 | Configuration root |
| `/etc/vigilar/vigilar.toml` | `root:vigilar` | 0644 | Main config file (`VIGILAR_CONFIG`) |
| `/etc/vigilar/secrets/` | `root:root` | 0700 | Storage key, VAPID private key |
| `/etc/vigilar/secrets/storage.key` | `root:root` | 0600 | 32-byte AES-256 key for recording encryption |
| `/etc/vigilar/secrets/vapid_private.pem` | `root:root` | 0600 | VAPID signing key for Web Push |
| `/etc/vigilar/secrets/vapid_public.txt` | `root:vigilar` | 0644 | VAPID public key (base64url) |
| `/etc/vigilar/certs/` | `root:vigilar` | 0750 | TLS material |
| `/etc/vigilar/certs/cert.pem` | `root:vigilar` | 0644 | TLS certificate (`[web] tls_cert`) |
| `/etc/vigilar/certs/key.pem` | `root:vigilar` | 0640 | TLS private key (`[web] tls_key`) |
| `/var/vigilar/` | `vigilar:vigilar` | 0750 | Runtime data root |
| `/var/vigilar/data/` | `vigilar:vigilar` | 0750 | SQLite database and supporting files (`[system] data_dir`) |
| `/var/vigilar/data/vigilar.db` | `vigilar:vigilar` | 0640 | Main SQLite database (WAL mode) |
| `/var/vigilar/recordings/` | `vigilar:vigilar` | 0750 | `.vge` encrypted recordings and thumbnails (`[system] recordings_dir`) |
| `/var/vigilar/hls/` | `vigilar:vigilar` | 0750 | HLS segments and playlists (`[system] hls_dir`) |
| `/etc/systemd/system/vigilar.service` | `root:root` | 0644 | Systemd unit |
| `/etc/mosquitto/conf.d/vigilar.conf` | `root:root` | 0644 | Localhost-only MQTT config |

The `VIGILAR_CONFIG` environment variable is read by the CLI and the
web blueprints to locate `vigilar.toml`; the systemd unit sets it to
`/etc/vigilar/vigilar.toml`.

## Installation

`scripts/install.sh` is idempotent and supports Debian/Ubuntu (apt) and
Arch Linux (pacman). It performs eight phases:

1. **System dependencies.** On apt: `ffmpeg mosquitto python3
   python3-venv python3-pip nut-client`. On pacman: `ffmpeg mosquitto
   python python-virtualenv nut`.
2. **System user.** Creates the `vigilar` system user and group with
   `/opt/vigilar` as the home directory and `/usr/sbin/nologin` as the
   shell.
3. **Directories and permissions.** Creates
   `/var/vigilar/{data,recordings,hls}` owned by `vigilar:vigilar` at
   0750, plus `/etc/vigilar/{secrets,certs}` with the modes shown above.
4. **Python venv.** Creates `/opt/vigilar/venv` as the `vigilar` user
   and installs the project in place with `pip install "${PROJECT_DIR}"`.
5. **Storage encryption key.** Writes 32 random bytes from
   `/dev/urandom` to `/etc/vigilar/secrets/storage.key` if it does not
   already exist. This file is never rewritten by the installer.
6. **Sample config.** Copies the repository's `config/vigilar.toml` to
   `/etc/vigilar/vigilar.toml` if a config does not already exist.
7. **Systemd unit.** Installs and enables `vigilar.service`.
8. **Mosquitto.** Installs `systemd/vigilar-mosquitto.conf` to
   `/etc/mosquitto/conf.d/vigilar.conf` and restarts `mosquitto.service`.

The installer prints recommended follow-up steps: edit the TOML, then
run `gen_cert.sh`, `gen_vapid_keys.sh`, and `setup_nut.sh`, then start
the service.

### Systemd unit

`vigilar.service` runs `/opt/vigilar/venv/bin/vigilar start --config
/etc/vigilar/vigilar.toml` as `vigilar:vigilar` with
`VIGILAR_CONFIG=/etc/vigilar/vigilar.toml`. It requires
`mosquitto.service`, wants `nut-monitor.service`, and uses
`Restart=on-failure`, `RestartSec=10`, and `WatchdogSec=120`. The unit
applies `ProtectSystem=strict`, `ProtectHome`, `PrivateTmp`,
`PrivateDevices`, `NoNewPrivileges`, and a `@system-service` syscall
filter. `ReadWritePaths` is limited to
`/var/vigilar/{data,recordings,hls}`; `/etc/vigilar` is mounted
read-only. Output goes to the journal with `SyslogIdentifier=vigilar`.

### Mosquitto configuration

`vigilar-mosquitto.conf` binds a single listener on `127.0.0.1:1883`,
allows anonymous connections (localhost only), disables persistence
(all state lives in SQLite), and logs errors, warnings, notices, and
connection events to syslog. Vigilar never authenticates to the broker
and never exposes it beyond loopback.

## Configuration reference

`config/vigilar.toml` is parsed by `tomllib`, then validated by the
Pydantic models in `vigilar/config.py`. The models are the source of
truth: any unknown key is rejected, and each section has a default so
omitted sections behave sensibly.

### `[system]`

- `name` (default `"Vigilar Home Security"`): display name used in
  logs and the web UI.
- `timezone` (default `"UTC"`; sample ships as
  `"America/New_York"`): used for daily digests, highlight scheduling,
  and timestamped file paths.
- `data_dir` (default `/var/vigilar/data`): SQLite database and
  derived state.
- `recordings_dir` (default `/var/vigilar/recordings`): encrypted
  `.vge` files.
- `hls_dir` (default `/var/vigilar/hls`): HLS segment output.
- `log_level` (default `"INFO"`): one of DEBUG, INFO, WARNING, ERROR.
- `arm_pin_hash` (default `""`): commented out in the sample; set via
  `vigilar config set-pin`.

### `[mqtt]`

- `host` (default `127.0.0.1`) and `port` (default `1883`): broker
  address. Leave on loopback unless you deliberately run a shared
  broker.
- `username`, `password` (default `""`): unused by the shipped
  mosquitto config, present for operators who run their own broker.

### `[web]`

- `host` (default `0.0.0.0`) and `port` (default `49735`): Flask
  listener. Change `host` to `127.0.0.1` if you front with a reverse
  proxy.
- `tls_cert`, `tls_key` (default `""`): PEM paths. `gen_cert.sh`
  fills these in.
- `username` (default `"admin"`): web UI login name.
- `password_hash` (default `""`): scrypt hash set via `vigilar config
  set-password`.
- `session_timeout` (default `3600` seconds).

### `[zigbee2mqtt]`

- `mqtt_topic_prefix` (default `"zigbee2mqtt"`): used when subscribing
  to sensor topics from an external Zigbee2MQTT bridge.

### `[ups]`

See also the UPS/NUT section below.

- `enabled` (default `true`).
- `nut_host` (default `127.0.0.1`), `nut_port` (default `3493`),
  `ups_name` (default `"ups"`): matches the `[ups]` block generated by
  `setup_nut.sh`.
- `poll_interval_s` (default `30`).
- `low_battery_threshold_pct` (default `20`, range 5–95).
- `critical_runtime_threshold_s` (default `300`).
- `shutdown_delay_s` (default `60`).

### `[storage]`

- `encrypt_recordings` (default `true`): toggles AES-256-CTR
  encryption of new `.vge` files. Changing this does not re-encrypt
  existing recordings.
- `key_file` (default `/etc/vigilar/secrets/storage.key`): 32-byte
  raw key.
- `max_disk_usage_gb` (default `200`) and `free_space_floor_gb`
  (default `10`): **legacy keys**. They are defined on the Pydantic
  model and exposed in the settings UI, but no pruning or recording
  code currently reads them. The real disk ceiling is the
  percentage-based pruner in `[health]`. Do not rely on these two
  fields to cap disk usage today.

### `[remote]`

- `enabled` (default `false`): turns on the remote-access bridge.
- `upload_bandwidth_mbps` (default `22.0`): informational ceiling.
- `remote_hls_resolution` (default `[426, 240]`), `remote_hls_fps`
  (default `10`), `remote_hls_bitrate_kbps` (default `500`): quality
  profile for HLS served over the tunnel.
- `max_remote_viewers` (default `4`; `0` = unlimited).
- `tunnel_ip` (default `"10.99.0.2"`): WireGuard address of the home
  server, for display only.

### `[alerts.local]`

- `enabled` (default `true`).
- `syslog` (default `true`): the supervisor installs a `SysLogHandler`
  on the `vigilar.alerts` logger when this is true.
- `desktop_notify` (default `false`): `notify-send` fallback for
  operator-console deployments.

### `[alerts.web_push]`

- `enabled` (default `true`).
- `vapid_private_key_file` (default
  `/etc/vigilar/secrets/vapid_private.pem`).
- `vapid_claim_email` (default `"mailto:admin@vigilar.local"`): used
  as the VAPID `sub` claim.

### `[alerts.email]`

- `enabled` (default `false`).
- `smtp_host`, `smtp_port` (default `587`), `from_addr`, `to_addr`,
  `use_tls` (default `true`).

### `[alerts.webhook]`

- `enabled` (default `false`).
- `url`, `secret`: HMAC secret signs outbound webhook bodies.

### `[[cameras]]` (array of tables)

One block per camera. Keys:

- `id`, `display_name`, `rtsp_url`: required.
- `enabled` (default `true`).
- `record_continuous` (default `false`), `record_on_motion` (default
  `true`).
- `motion_sensitivity` (default `0.7`, range 0.0–1.0) and
  `motion_min_area_px` (default `500`).
- `motion_zones`, `zones`: polygon and named-zone overrides.
- `pre_motion_buffer_s` (default `5`) and `post_motion_buffer_s`
  (default `30`).
- `idle_fps` (default `2`, range 1–30) and `motion_fps` (default
  `30`, range 1–60): the adaptive FPS pair.
- `retention_days` (default `30`).
- `resolution_capture` (default `[1920, 1080]`) and
  `resolution_motion` (default `[640, 360]`): capture size and the
  downscale used for MOG2 motion detection.
- `location` (default `INTERIOR`): `CameraLocation` enum, used for
  alert profiles.

Camera IDs must be unique; the Pydantic root validator rejects
duplicates.

### `[[sensors]]` and `[sensors.gpio]`

Each `[[sensors]]` block has `id`, `display_name`, `type` (e.g.
`CONTACT`, `MOTION`, `TEMPERATURE`), `protocol` (`ZIGBEE`, `ZWAVE`,
`GPIO`), `device_address`, `location`, and `enabled` (default
`true`). `[sensors.gpio] bounce_time_ms` (default `50`) applies to all
GPIO sensors. Sensor IDs must also be unique.

### `[[rules]]`

Each rule has `id`, `description`, `conditions` (list of `{type,
value, sensor_id, event}` maps), `logic` (`AND` or `OR`, default
`AND`), `actions` (list of action names like `alert_all` or
`record_all_cameras`), and `cooldown_s` (default `60`).

### `[detection]` and `[vehicles]`

- `[detection] person_detection` (default `false`), `model_path`,
  `model_config_path`, `confidence_threshold` (default `0.5`),
  `cameras` (empty list means all cameras).
- `[[vehicles.known]]` entries define recognised vehicles with
  `name`, `color_profile`, `size_class`, `calibration_file`.

### `[presence]`

- `enabled` (default `false`).
- `ping_interval_s` (default `30`) and `departure_delay_m` (default
  `10`).
- `method`: `icmp` or `arping`.
- `[[presence.members]]` entries with `name`, `ip`, and `role`
  (`adult` or `child`).
- `actions`: mapping of states (`EMPTY`, `ADULTS_HOME`, `KIDS_HOME`,
  `ALL_HOME`) to arm states.

### `[health]`

This is where pruning actually lives.

- `enabled` (default `true`).
- `disk_warn_pct` (default `85`): warning threshold on the partition
  hosting `data_dir`.
- `disk_critical_pct` (default `95`): critical threshold. When crossed
  and `auto_prune` is true, the health monitor runs the pruner.
- `auto_prune` (default `true`).
- `auto_prune_target_pct` (default `80`): pruner deletes the oldest
  non-starred recordings until disk usage drops below this percentage.
- `daily_digest` (default `true`) and `daily_digest_time` (default
  `"08:00"`).

### `[pets]`, `[visitors]`, `[highlights]`, `[kiosk]`

Subsystem-specific toggles. See the subsystem references under
`docs/architecture/` for per-key behaviour. Notable defaults: `[pets]
enabled = false`, `[visitors] enabled = false`, `[highlights] enabled
= true`, `[kiosk] ambient_enabled = true`.

### `[location]` and `[security]`

- `[location] latitude`, `longitude` (default `0.0`): used for sunrise
  and sunset lookups.
- `[security] pin_hash` and `recovery_passphrase_hash`: populated by
  `vigilar config set-pin` (the same hash is also stored under
  `[system] arm_pin_hash` on the `system` model; both fields exist
  because the web UI uses `[security]` while the CLI helper prints a
  `[system]` line; pick one location and stick with it).

## CLI reference

The entry point is `/opt/vigilar/venv/bin/vigilar`. All commands
accept `--version`. In production, run subcommands as the service user
so file ownership and venv paths line up:

```
sudo -u vigilar /opt/vigilar/venv/bin/vigilar <subcommand>
```

The CLI exposes exactly two top-level commands: `start` and `config`.

### `vigilar start`

Starts all services under the supervisor.

```
sudo -u vigilar /opt/vigilar/venv/bin/vigilar start \
    --config /etc/vigilar/vigilar.toml
```

Options: `--config/-c PATH` (defaults to `$VIGILAR_CONFIG` then
`config/vigilar.toml`); `--log-level {DEBUG,INFO,WARNING,ERROR}`
(overrides `[system] log_level`). On invocation it loads and validates
the config, configures a console log formatter, prints a startup
summary (camera count, sensor count, UPS state), then hands off to
`vigilar.main.run_supervisor`.

### `vigilar config validate`

```
sudo -u vigilar /opt/vigilar/venv/bin/vigilar config validate \
    -c /etc/vigilar/vigilar.toml
```

Parses and validates the TOML against the Pydantic models and prints
a summary. Exits non-zero if validation fails. Run this after every
edit before restarting the service.

### `vigilar config show`

```
sudo -u vigilar /opt/vigilar/venv/bin/vigilar config show \
    -c /etc/vigilar/vigilar.toml
```

Dumps the parsed config as JSON with `web.password_hash`,
`system.arm_pin_hash`, and `alerts.webhook.secret` redacted. Useful
for confirming which defaults Pydantic applied for keys you did not
set.

### `vigilar config set-password`

```
sudo -u vigilar /opt/vigilar/venv/bin/vigilar config set-password
```

Prompts for a web UI password (hidden, confirmed), derives a scrypt
hash (`n=16384, r=8, p=1`, random 16-byte salt, 32-byte output), and
prints a `password_hash = "salt_hex:key_hex"` line to paste into
`[web]`. It does not write the file.

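The derivation described above can be reproduced with the standard library. This is a sketch built from the parameters listed in this section, not the CLI's source: the function names and the UTF-8 encoding of the password are assumptions.

```python
import hashlib
import hmac
import secrets


def make_password_hash(password: str) -> str:
    """Return 'salt_hex:key_hex' using the scrypt parameters from this guide."""
    salt = secrets.token_bytes(16)  # random 16-byte salt
    key = hashlib.scrypt(
        password.encode("utf-8"),   # assumption: password is UTF-8 encoded
        salt=salt, n=16384, r=8, p=1, dklen=32,
    )
    return f"{salt.hex()}:{key.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    salt_hex, key_hex = stored.split(":")
    key = hashlib.scrypt(
        password.encode("utf-8"),
        salt=bytes.fromhex(salt_hex), n=16384, r=8, p=1, dklen=32,
    )
    return hmac.compare_digest(key, bytes.fromhex(key_hex))
```

Because the salt is random, two runs with the same password produce different `password_hash` lines; both verify correctly.
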
### `vigilar config set-pin`

```
sudo -u vigilar /opt/vigilar/venv/bin/vigilar config set-pin
```

Prompts for an arm/disarm PIN, generates a random 32-byte HMAC key,
computes `HMAC-SHA256(key, pin)`, and prints an `arm_pin_hash =
"secret_hex:mac_hex"` line to paste into `[system]`. Again, no file
write.

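A minimal sketch of the PIN hashing described above, using only the standard library. The function names and the UTF-8 encoding of the PIN are assumptions; the key length, the HMAC-SHA256 construction, and the `secret_hex:mac_hex` layout come from this section.

```python
import hashlib
import hmac
import secrets


def make_pin_hash(pin: str) -> str:
    """Return 'secret_hex:mac_hex' for a freshly generated 32-byte HMAC key."""
    key = secrets.token_bytes(32)
    mac = hmac.new(key, pin.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{key.hex()}:{mac}"


def verify_pin(pin: str, stored: str) -> bool:
    """Recompute the MAC with the stored key and compare in constant time."""
    secret_hex, mac_hex = stored.split(":")
    expected = hmac.new(
        bytes.fromhex(secret_hex), pin.encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, mac_hex)
```

Note that the key is stored next to the MAC, so the value in the config file is sufficient to test PIN guesses offline; the file permissions on `vigilar.toml` are what protect it.
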
## Secrets and security

- `/etc/vigilar/secrets/` is `root:root` mode `0700`. The `vigilar`
  user cannot list it. Individual files the service needs (for
  example `vapid_public.txt`) are readable by group `vigilar`.
- The storage encryption key is `/etc/vigilar/secrets/storage.key`:
  32 raw bytes. **If this file is lost, every existing `.vge`
  recording becomes unrecoverable.** Back it up separately (and
  offline) from your tar archive whenever you take the system into
  production.
- Recordings use **AES-256-CTR** (see `vigilar/storage/encryption.py`).
  CTR provides confidentiality but no authentication: `.vge` files
  are confidential but not tamper-evident. An attacker with write
  access to the recordings directory can flip bits in a ciphertext
  without detection. If tamper-evidence matters, keep the recordings
  volume on integrity-verified storage (dm-integrity, ZFS with
  checksums) or mirror to write-once media.
- The web UI password is a scrypt hash set by `vigilar config
  set-password` and stored at `[web] password_hash`. The arm PIN is
  an HMAC stored at `[system] arm_pin_hash` (and/or `[security]
  pin_hash`).
- TLS: `gen_cert.sh` uses `mkcert` if present, otherwise an `openssl`
  ECDSA P-256 self-signed certificate valid for 3650 days with SANs
  for `vigilar.local`, `localhost`, `127.0.0.1`, and the detected LAN
  IP. It patches `[web] tls_cert`/`tls_key` into the config.
- VAPID: `gen_vapid_keys.sh` writes
  `/etc/vigilar/secrets/vapid_private.pem` (mode 0600) and
  `/etc/vigilar/secrets/vapid_public.txt` (the browser-side key).
- Firewall stance: the mosquitto broker and NUT daemon bind only to
  `127.0.0.1`. The only port Vigilar exposes on the LAN is the web UI
  port (default `49735`). Open that port only on the interface that
  serves your LAN, and keep WAN exposure behind the WireGuard tunnel
  described under `[remote]`.

## UPS and NUT integration

`scripts/setup_nut.sh` installs NUT, attempts to detect a USB UPS
(using `nut-scanner` first, then a short list of vendor IDs as a
fallback), and writes a standalone configuration:

- `/etc/nut/nut.conf` with `MODE=standalone`.
- `/etc/nut/ups.conf` with `[ups] driver=usbhid-ups port=auto` (the
  block name `ups` matches the default `[ups] ups_name`).
- `/etc/nut/upsd.conf` with `LISTEN 127.0.0.1 3493` (loopback only).
- `/etc/nut/upsd.users` with a `vigilar` local monitoring user.
- `/etc/nut/upsmon.conf` pointing at `ups@localhost`.

It then enables `nut-driver`, `nut-server`, and `nut-monitor` (or
`upsd`/`upsmon` on distros that ship the old unit names). Test with
`upsc ups@localhost`. The Vigilar UPS subsystem polls this daemon
using the keys under `[ups]`.

## Backups

`scripts/backup.sh` produces
`${VIGILAR_BACKUP_DIR:-/var/vigilar/backups}/vigilar-backup-YYYYMMDD-HHMMSS.tar.gz`
and includes:

- A consistent SQLite snapshot produced with `sqlite3 … .backup` (or
  a direct file copy if `sqlite3` is not available), plus any
  `-wal`/`-shm` files.
- The entire `/etc/vigilar/` tree (config, secrets, certs).

It does **not** include `/var/vigilar/recordings` or
`/var/vigilar/hls`. Video is assumed to be either expendable or
handled by a separate storage tier.

Environment variables:

- `VIGILAR_BACKUP_DIR`: destination directory (default
  `/var/vigilar/backups`).
- `VIGILAR_BACKUP_RETENTION_DAYS`: age in days after which old
  archives are pruned; set to `0` to keep forever (default `30`).

The archive is owned by `root:root` with mode `0600` because it
contains secrets.

### Scheduling

You can run it from cron, as the script comment suggests
(`0 3 * * * /opt/vigilar/scripts/backup.sh`), or via a dedicated
systemd timer. A minimal pair of units, kept in your local systemd
directory (not in the repo):

```
# /etc/systemd/system/vigilar-backup.service
[Unit]
Description=Vigilar nightly backup
After=vigilar.service

[Service]
Type=oneshot
Environment=VIGILAR_BACKUP_DIR=/srv/backups/vigilar
ExecStart=/opt/vigilar/scripts/backup.sh
```

```
# /etc/systemd/system/vigilar-backup.timer
[Unit]
Description=Run Vigilar backup nightly

[Timer]
OnCalendar=*-*-* 03:00:00
Persistent=true

[Install]
WantedBy=timers.target
```

Enable with `sudo systemctl enable --now vigilar-backup.timer`.

### Restore

1. `sudo systemctl stop vigilar.service`.
2. Extract the archive to a staging directory.
3. Copy `etc/vigilar/` back into `/etc/vigilar/`, preserving
   permissions. Double-check `/etc/vigilar/secrets/storage.key` is
   `root:root 0600`.
4. Copy the database snapshot to `/var/vigilar/data/vigilar.db` and
   remove any stale `vigilar.db-wal`/`vigilar.db-shm` files.
5. `sudo chown -R vigilar:vigilar /var/vigilar/data`.
6. `sudo -u vigilar /opt/vigilar/venv/bin/vigilar config validate
   -c /etc/vigilar/vigilar.toml`.
7. `sudo systemctl start vigilar.service` and watch the journal.

## Upgrades

1. `sudo systemctl stop vigilar.service`.
2. `cd /path/to/vigilar && git pull`.
3. `sudo -u vigilar /opt/vigilar/venv/bin/pip install --upgrade .`
4. Diff the shipped `config/vigilar.toml` against
   `/etc/vigilar/vigilar.toml` and merge any new keys by hand; Pydantic
   will reject unknown keys but is tolerant of missing keys that have
   defaults.
5. `sudo -u vigilar /opt/vigilar/venv/bin/vigilar config validate
   -c /etc/vigilar/vigilar.toml`.
6. `sudo systemctl start vigilar.service`.

**Schema migrations:** there is no migration framework. Vigilar does
not ship Alembic; `vigilar/storage/schema.py` defines the tables
(`cameras`, `sensors`, `sensor_states`, `events`, `recordings`,
`system_events`, `arm_state_log`, `alert_log`, `push_subscriptions`,
`pets`, `pet_sightings`, `wildlife_sightings`, `package_events`,
`pet_training_images`, `pet_rules`, `face_profiles`, `face_embeddings`,
`visits`, `timelapse_schedules`) and new columns are added by code
path at startup or not at all. Take a backup before every upgrade so
you can roll back if a column assumption changes.

## Logs and health

All subsystem output goes to the journal under the `vigilar` syslog
identifier:

```
sudo journalctl -u vigilar.service -f
sudo journalctl -u vigilar.service --since "1 hour ago"
sudo journalctl -u vigilar.service -p warning
```

The alerts subsystem additionally mirrors messages to syslog via the
`vigilar.alerts` logger when `[alerts.local] syslog = true`, which is
the default; the supervisor installs the handler at startup.

Set `[system] log_level = "DEBUG"` (or pass `--log-level DEBUG` to
`vigilar start`) to trace MQTT traffic, motion scoring, and FFmpeg
invocations. Expect a significant volume increase; revert to `INFO`
once you have the evidence you need.

The only HTTP endpoint currently exposing health is
`GET /system/status` on the web UI, which returns a JSON blob with
arm state, camera counts, and sensor counts. The richer health data
(disk percentage, MQTT reachability) is published to the
`vigilar/system/health` MQTT topic by `HealthMonitor` every ten
seconds and is not yet surfaced as a REST endpoint.

## Pruning and disk management

`vigilar/health/monitor.py` runs a disk check every five minutes
against `[system] data_dir` using `shutil.disk_usage`. When usage
crosses `[health] disk_critical_pct` and `[health] auto_prune` is
true, it calls `vigilar.health.pruner.auto_prune`:

- Selects up to 20 unstarred recordings at a time, ordered oldest
  first.
- Deletes the file on disk, any thumbnail, and the row from the
  `recordings` table.
- Loops until disk usage drops below `[health] auto_prune_target_pct`
  or no more candidates exist.

Starred recordings (`recordings.starred = 1`) are never auto-pruned.
Per-camera `retention_days` is enforced separately by the camera
subsystem. There is no hard byte ceiling; the pruner is entirely
percentage-driven. The `[storage] max_disk_usage_gb` and
`[storage] free_space_floor_gb` keys described above are not
consulted by the pruner.

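The pruning loop described above can be sketched as follows. This is an illustration, not the code in `vigilar/health/pruner.py`: the column names `file_path`, `thumbnail_path`, and `started_at` are assumptions (only `starred` is named in this guide), and disk usage is injected as a callable so the loop can be exercised without filling a disk.

```python
import sqlite3
from pathlib import Path
from typing import Callable

BATCH_SIZE = 20  # "up to 20 unstarred recordings at a time"


def auto_prune(
    db: sqlite3.Connection,
    usage_pct: Callable[[], float],
    target_pct: float,
) -> int:
    """Delete oldest unstarred recordings in batches until usage < target_pct."""
    deleted = 0
    while usage_pct() >= target_pct:
        rows = db.execute(
            # Column names other than `starred` are hypothetical.
            "SELECT id, file_path, thumbnail_path FROM recordings "
            "WHERE starred = 0 ORDER BY started_at ASC LIMIT ?",
            (BATCH_SIZE,),
        ).fetchall()
        if not rows:
            break  # only starred recordings remain; nothing left to prune
        for rec_id, file_path, thumbnail_path in rows:
            for path in (file_path, thumbnail_path):
                if path:
                    Path(path).unlink(missing_ok=True)
            db.execute("DELETE FROM recordings WHERE id = ?", (rec_id,))
            deleted += 1
        db.commit()
    return deleted
```

In production the `usage_pct` callable would wrap `shutil.disk_usage` on the data directory. The early `break` is the situation the troubleshooting section describes under "Disk full": the partition stays above target because every remaining recording is starred.
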
## Remote access

`[remote]` controls the lower-bitrate HLS profile that Vigilar serves
through a WireGuard tunnel. The tunnel itself is not set up by this
project; you are expected to bring your own WireGuard server and
peer configuration. Once the tunnel is up:

- `enabled = true` turns on the remote bridge.
- `tunnel_ip` is the home server's address inside the tunnel (default
  `10.99.0.2`), shown in the UI for reference.
- `upload_bandwidth_mbps` caps the advertised upstream.
- `remote_hls_resolution`, `remote_hls_fps`, `remote_hls_bitrate_kbps`
  define the transcode profile used when a client connects through
  the tunnel instead of the LAN.
- `max_remote_viewers` bounds concurrent remote sessions; set to `0`
  for unlimited.

Do not expose port `49735` directly on the WAN; require the tunnel.

## Known limitations

- **Recording integrity is not authenticated.** AES-256-CTR gives you
  confidentiality, not tamper-evidence. If an attacker reaches the
  recordings directory they can modify ciphertext unnoticed. See the
  security section.
- **Camera supervision is asymmetric.** Most subsystems run under
  `SubsystemProcess` in `vigilar/main.py`, which polls every two
  seconds and applies an exponential backoff up to `max_restarts=10`.
  Cameras do not: `CameraManager` in `vigilar/camera/manager.py`
  owns its own per-camera child processes outside that supervisor.
  A repeatedly crashing camera may thrash differently from, say, a
  crashing UPS poller. Watch the journal for per-camera restart
  messages independently from the top-level supervisor log.
- **Legacy storage keys.** `[storage] max_disk_usage_gb` and
  `[storage] free_space_floor_gb` are editable but do nothing. Use
  `[health]` for real disk policy.
- **No schema migrations.** There is no Alembic (or equivalent) in
  the tree. Rollbacks rely on your backup discipline.
- **Duplicate PIN fields.** `vigilar config set-pin` prints a hash
  intended for `[system] arm_pin_hash`, while the web arm/disarm flow
  reads from `[security] pin_hash`. Both models exist. If you set one
  and the other side does not behave as expected, mirror the value
  manually.

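To show what "polls every two seconds and applies an exponential backoff up to `max_restarts=10`" means in practice, here is a minimal sketch of such a loop. It is not the `SubsystemProcess` implementation: the base delay of one second and the 60-second cap are assumed values, and the process checks are injected as callables so the loop can run without real child processes.

```python
from typing import Callable

MAX_RESTARTS = 10       # from this guide
POLL_INTERVAL_S = 2.0   # from this guide


def restart_delay(restart_count: int, base_s: float = 1.0, cap_s: float = 60.0) -> float:
    """Delay before restart number `restart_count` (1-based); base and cap are assumptions."""
    return min(cap_s, base_s * 2 ** (restart_count - 1))


def supervise(
    is_alive: Callable[[], bool],
    restart: Callable[[], None],
    sleep: Callable[[float], None],
    max_polls: int,
) -> int:
    """Poll a subsystem, restarting it with backoff; return the restart count."""
    restarts = 0
    for _ in range(max_polls):
        if not is_alive():
            if restarts >= MAX_RESTARTS:
                return restarts  # exceeded max restarts, giving up
            restarts += 1
            sleep(restart_delay(restarts))
            restart()
        sleep(POLL_INTERVAL_S)
    return restarts
```

Per-camera processes are outside this kind of loop, which is why a failing camera can respawn on a different rhythm from the subsystems the supervisor manages.
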
## Troubleshooting

**Supervisor crash loops.** `journalctl -u vigilar.service` will show
a subsystem crashing and the supervisor attempting to restart it. If
the same subsystem exceeds ten restarts, the supervisor gives up on
that subsystem and logs `exceeded max restarts, giving up`. Fix the
root cause (bad config, missing secret, missing model file for
detection) and restart the unit.

**Mosquitto will not start.** Confirm that
`/etc/mosquitto/conf.d/vigilar.conf` is present and that no other
listener is bound to `127.0.0.1:1883`. Run `sudo systemctl status
mosquitto.service` and `sudo journalctl -u mosquitto.service`. The
Vigilar unit `Requires=mosquitto.service`, so Vigilar will refuse to
start until mosquitto is healthy.

**Camera thrashing.** Because cameras are not under the main
supervisor's backoff, a camera whose RTSP URL is wrong or whose
remote end is rebooting can respawn quickly. Look for repeated
`camera <id>` messages in the journal. Disable the camera in the
config (`enabled = false`) while you fix the upstream, then
re-enable.

**Disk full.** Check `[health] disk_critical_pct` and confirm
`auto_prune` is on. If the partition is already past the target
percentage and nothing is being deleted, there are no unstarred
recordings left to prune: unstar something or lower retention. The
legacy `[storage]` keys will not help here; see the pruning section.

**HLS stalls.** The HLS directory lives at `[system] hls_dir`
(default `/var/vigilar/hls`) and is listed under `ReadWritePaths` in
the systemd unit. Stalls usually mean FFmpeg has died on a camera;
check the journal for FFmpeg stderr and verify the RTSP URL is still
reachable from the server with `ffprobe`.

**Config validation fails.** Run `sudo -u vigilar
/opt/vigilar/venv/bin/vigilar config validate -c
/etc/vigilar/vigilar.toml`. Pydantic error messages include the
section, key, and reason. The two common traps are duplicate camera
or sensor IDs (root validator rejects them) and a TOML table that
should be an array of tables (`[cameras]` instead of `[[cameras]]`).

**Forgotten arm PIN.** Run `vigilar config set-pin` to mint a new
hash and paste it in; restart the service. If you also forgot the
recovery passphrase set up through the UI, the web
`/system/api/reset-pin` endpoint cannot help you; fall back to the
CLI.

**Forgotten web password.** Run `vigilar config set-password` and
paste the new hash into `[web] password_hash`, then restart. No
database state needs to change.