From 6325e868730b6ef131a94cc9bf5f20cfd7b1e3e9 Mon Sep 17 00:00:00 2001
From: "Aaron D. Lee"
Date: Wed, 1 Apr 2026 23:31:47 -0400
Subject: [PATCH] Comprehensive documentation for v0.2.0 release
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

README.md (700 lines):
- Three-tier deployment model with ASCII diagram
- Federation blueprint in web UI routes
- deploy/ directory in architecture tree
- Documentation index linking all guides

CLAUDE.md (256 lines):
- Updated architecture tree with all new docs and deploy files

New guides:
- docs/federation.md (317 lines) — gossip protocol mechanics, peer setup,
  trust filtering, offline bundles, relay deployment, jurisdiction
- docs/evidence-guide.md (283 lines) — evidence packages, cold archives,
  selective disclosure, chain anchoring, legal discovery workflow
- docs/source-dropbox.md (220 lines) — token management, client-side
  hashing, extract-then-strip pipeline, receipt mechanics, opsec
- docs/index.md — documentation hub linking all guides

Training materials:
- docs/training/reporter-quickstart.md (105 lines) — printable one-page
  card: boot USB, attest photo, encode message, check-in, emergency
- docs/training/emergency-card.md (79 lines) — wallet-sized laminated
  card: three destruction methods, 10-step order, key contacts
- docs/training/admin-reference.md (219 lines) — deployment tiers, CLI
  tables, backup checklist, hardening checklist, troubleshooting

Also includes existing architecture docs from the original repos.

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 CLAUDE.md                                | 203 ++++++--
 README.md                                |  98 +++-
 docs/architecture/chain-format.md        | 252 ++++++++++
 docs/architecture/export-bundle.md       | 319 +++++++++++++
 docs/architecture/federation-protocol.md | 565 +++++++++++++++++++++++
 docs/architecture/federation.md          | 254 ++++++++++
 docs/evidence-guide.md                   | 283 ++++++++++++
 docs/federation.md                       | 317 +++++++++++++
 docs/index.md                            |  34 ++
 docs/source-dropbox.md                   | 220 +++++++++
 docs/training/admin-operations-guide.md  | 513 ++++++++++++++++++++
 docs/training/admin-reference.md         | 219 +++++++++
 docs/training/emergency-card.md          |  79 ++++
 docs/training/reporter-field-guide.md    | 263 +++++++++++
 docs/training/reporter-quickstart.md     | 105 +++++
 15 files changed, 3670 insertions(+), 54 deletions(-)
 create mode 100644 docs/architecture/chain-format.md
 create mode 100644 docs/architecture/export-bundle.md
 create mode 100644 docs/architecture/federation-protocol.md
 create mode 100644 docs/architecture/federation.md
 create mode 100644 docs/evidence-guide.md
 create mode 100644 docs/federation.md
 create mode 100644 docs/index.md
 create mode 100644 docs/source-dropbox.md
 create mode 100644 docs/training/admin-operations-guide.md
 create mode 100644 docs/training/admin-reference.md
 create mode 100644 docs/training/emergency-card.md
 create mode 100644 docs/training/reporter-field-guide.md
 create mode 100644 docs/training/reporter-quickstart.md

diff --git a/CLAUDE.md b/CLAUDE.md
index 7c539f4..52a7377 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -1,4 +1,4 @@
-# SooSeF — Claude Code Project Guide
+# SooSeF -- Claude Code Project Guide

 SooSeF (Soo Security Fieldkit) is an offline-first security toolkit for journalists, NGOs, and at-risk organizations. Monorepo consolidating Stegasoo and Verisoo as subpackages.
@@ -8,7 +8,7 @@ Version 0.2.0 · Python >=3.11 · MIT License ## Quick commands ```bash -# Development install (single command — stegasoo and verisoo are inlined subpackages) +# Development install (single command -- stegasoo and verisoo are inlined subpackages) pip install -e ".[dev]" pytest # Run tests @@ -26,23 +26,23 @@ mypy src/ # Type check ``` src/soosef/ Core library - __init__.py Package init, __version__ + __init__.py Package init, __version__ (0.2.0) _availability.py Runtime checks for optional subpackages (has_stegasoo, has_verisoo) api.py Optional unified FastAPI app (uvicorn soosef.api:app) - audit.py Audit logging + audit.py Append-only JSON-lines audit log (~/.soosef/audit.jsonl) cli.py Click CLI entry point (soosef command) - paths.py All ~/.soosef/* path constants (single source of truth) - config.py Unified config loader - exceptions.py SoosefError base exception + paths.py All ~/.soosef/* path constants (single source of truth, lazy resolution) + config.py Unified config loader (SoosefConfig dataclass + JSON) + exceptions.py SoosefError, ChainError, ChainIntegrityError, ChainAppendError, KeystoreError metadata.py Extract-then-strip EXIF pipeline with field classification evidence.py Self-contained evidence package export (ZIP with verify.py) archive.py Cold archive export for long-term preservation (OAIS-aligned) - stegasoo/ Steganography engine (inlined from stegasoo) + stegasoo/ Steganography engine (inlined from stegasoo v4.3.0) encode.py / decode.py Core encode/decode API generate.py Cover image generation - crypto.py AES-256-GCM encryption - channel.py Channel key derivation + crypto.py AES-256-GCM encryption, channel fingerprints + channel.py Channel key derivation + management steganography.py LSB steganography core dct_steganography.py DCT-domain JPEG steganography spread_steganography.py MDCT spread-spectrum audio steganography @@ -54,71 +54,110 @@ src/soosef/ Core library steganalysis.py Detection resistance analysis validation.py Input 
validation models.py Data models - constants.py Magic bytes, version constants + constants.py Magic bytes, version constants, AUDIO_ENABLED, VIDEO_ENABLED cli.py Stegasoo-specific CLI commands api.py / api_auth.py Stegasoo REST API + auth carrier_tracker.py Carrier image reuse tracking (warns on reuse) + platform_presets.py Social-media-aware encoding presets image_utils.py / audio_utils.py / video_utils.py keygen.py / qr_utils.py / recovery.py / debug.py / utils.py - platform_presets.py Social-media-aware encoding presets - verisoo/ Provenance attestation engine (inlined from verisoo) - attestation.py Core attestation creation + verisoo/ Provenance attestation engine (inlined from verisoo v0.1.0) + attestation.py Core attestation creation + EXIF extraction verification.py Attestation verification crypto.py Ed25519 signing - hashing.py Perceptual + cryptographic hashing + hashing.py Perceptual + cryptographic hashing (ImageHashes) embed.py Attestation embedding in images - merkle.py Merkle tree for batch attestation + merkle.py Merkle tree + consistency/inclusion proofs binlog.py Binary attestation log lmdb_store.py LMDB-backed trust store - storage.py Attestation storage abstraction - federation.py Federated attestation exchange - models.py Attestation, Verification dataclasses - exceptions.py Verisoo-specific exceptions + storage.py Attestation storage abstraction (LocalStorage) + federation.py GossipNode, HttpTransport, PeerInfo, SyncStatus + peer_store.py SQLite-backed peer persistence for federation + models.py Attestation, AttestationRecord, ImageHashes, Identity + exceptions.py VerisooError, AttestationError, VerificationError, FederationError cli.py Verisoo-specific CLI commands - api.py Verisoo REST API + api.py Verisoo REST API + federation endpoints federation/ Federated attestation chain system - chain.py Hash chain construction + key rotation/recovery/delivery-ack records - entropy.py Entropy source for chain seeds - models.py ChainEntry, ChainState 
dataclasses - serialization.py CBOR chain serialization + export + chain.py ChainStore -- append-only hash chain with key rotation/recovery/delivery-ack + entropy.py Entropy source for chain seeds (sys_uptime, fs_snapshot, proc_entropy, boot_id) + models.py AttestationChainRecord, ChainState, EntropyWitnesses (frozen dataclasses) + serialization.py CBOR canonical encoding + compute_record_hash + serialize/deserialize anchors.py RFC 3161 timestamps + manual chain anchors - exchange.py Cross-org attestation bundle export/import + exchange.py Cross-org attestation bundle export/import (JSON bundles) keystore/ Unified key management - manager.py Owns all key material (channel keys + Ed25519 identity) - models.py KeyBundle, IdentityBundle dataclasses - export.py Encrypted key bundle export/import + manager.py KeystoreManager -- owns all key material (channel + identity + trust store + backup) + models.py IdentityInfo, KeystoreStatus, RotationResult dataclasses + export.py Encrypted key bundle export/import (SOOBNDL format) fieldkit/ Field security features - killswitch.py Emergency data destruction - deadman.py Dead man's switch - tamper.py File integrity monitoring + killswitch.py Emergency data destruction (PurgeScope.KEYS_ONLY | ALL, deep forensic scrub) + deadman.py Dead man's switch (webhook warning + auto-purge) + tamper.py File integrity monitoring (baseline snapshots) usb_monitor.py USB device whitelist (Linux/pyudev) - geofence.py GPS boundary enforcement + geofence.py GPS boundary enforcement (gpsd integration) frontends/web/ Unified Flask web UI - app.py App factory (create_app()) - auth.py SQLite3 multi-user auth + app.py App factory (create_app()), ~36k -- mounts all blueprints + auth.py SQLite3 multi-user auth with lockout + rate limiting temp_storage.py File-based temp storage with expiry subprocess_stego.py Crash-safe subprocess isolation for stegasoo stego_worker.py Background stego processing - stego_routes.py Stego route helpers - ssl_utils.py 
Self-signed HTTPS cert generation + stego_routes.py Stego route helpers (~87k) + ssl_utils.py Self-signed HTTPS cert generation (cover_name support) blueprints/ stego.py /encode, /decode, /generate - attest.py /attest, /verify + attest.py /attest, /verify (~25k -- handles images + arbitrary files) fieldkit.py /fieldkit/* (killswitch, deadman, status) - keys.py /keys/* (unified key management) + keys.py /keys/* (unified key management, trust store) admin.py /admin/* (user management) - dropbox.py /dropbox/* (token-gated anonymous source upload) - templates/dropbox/ - admin.html Drop box admin panel + dropbox.py /dropbox/* (token-gated anonymous source upload, ~13k) + federation.py /federation/* (peer status dashboard, peer add/remove) + templates/ + dropbox/admin.html Drop box admin panel + federation/status.html Federation peer dashboard frontends/cli/ CLI package init (main entry point is src/soosef/cli.py) + +deploy/ Deployment artifacts + docker/ Dockerfile (multi-stage: builder, relay, server) + docker-compose.yml + kubernetes/ namespace.yaml, server-deployment.yaml, relay-deployment.yaml + live-usb/ build.sh + config/ for Debian Live USB (Tier 1) + config-presets/ low-threat.json, medium-threat.json, high-threat.json, critical-threat.json + +docs/ Documentation + deployment.md Three-tier deployment guide (~1500 lines) + federation.md Gossip protocol, peer setup, offline bundles, CLI commands + evidence-guide.md Evidence packages, cold archives, selective disclosure + source-dropbox.md Source drop box setup: tokens, EXIF pipeline, receipts + architecture/ + federation.md System architecture overview (threat model, layers, key domains) + chain-format.md Chain record spec (CBOR, entropy witnesses, serialization) + export-bundle.md Export bundle spec (SOOSEFX1 binary format, envelope encryption) + federation-protocol.md Federation server protocol (CT-inspired, gossip, storage tiers) + training/ + reporter-quickstart.md One-page reporter quick-start for Tier 1 USB 
(print + laminate) + reporter-field-guide.md Comprehensive reporter guide: attest, stego, killswitch, evidence + emergency-card.md Laminated wallet card: emergency destruction reference + admin-reference.md Admin CLI cheat sheet, hardening checklist, troubleshooting + admin-operations-guide.md Full admin operations: users, dropbox, federation, incidents ``` +## Three-tier deployment model + +``` +Tier 1: Field Device Tier 2: Org Server Tier 3: Federation Relay +(Bootable USB + laptop) (Docker on mini PC / VPS) (Docker on VPS) +Amnesic, LUKS-encrypted Persistent, full features Attestation sync only +Reporter in the field Newsroom / NGO office Friendly jurisdiction +``` + +- Tier 1 <-> Tier 2: sneakernet (USB drive) or LAN +- Tier 2 <-> Tier 3: federation API (port 8000) over internet +- Tier 2 <-> Tier 2: via Tier 3 relay, or directly via sneakernet + ## Dependency model Stegasoo and Verisoo are inlined subpackages, not separate pip packages: @@ -133,11 +172,12 @@ Stegasoo and Verisoo are inlined subpackages, not separate pip packages: - **Two key domains, never merged**: Stegasoo AES-256-GCM (derived from factors) and Verisoo Ed25519 (signing identity) are separate security concerns - **Extract-then-strip model**: Stego strips all EXIF (carrier is vessel); attestation - extracts evidentiary EXIF then strips dangerous fields (device serial, etc.) -- **subprocess_stego.py copies verbatim** from stegasoo — it's a crash-safety boundary -- **All state under ~/.soosef/** — one directory to back up, one to destroy + extracts evidentiary EXIF (GPS, timestamp) then strips dangerous fields (device serial) +- **subprocess_stego.py copies verbatim** from stegasoo -- it's a crash-safety boundary +- **All state under ~/.soosef/** -- one directory to back up, one to destroy. + `SOOSEF_DATA_DIR` env var relocates everything (cover mode, USB mode) - **Offline-first**: All static assets vendored, no CDN. 
pip wheels bundled for airgap install -- **Flask blueprints**: stego, attest, fieldkit, keys, admin, dropbox — clean route separation +- **Flask blueprints**: stego, attest, fieldkit, keys, admin, dropbox, federation - **Flask-WTF**: CSRF protection on all form endpoints; drop box is CSRF-exempt (sources don't have sessions) - **Client-side SHA-256**: Drop box upload page uses SubtleCrypto for pre-upload hashing @@ -145,11 +185,72 @@ Stegasoo and Verisoo are inlined subpackages, not separate pip packages: - **FastAPI option**: `soosef.api` provides a REST API alternative to the Flask web UI - **Pluggable backends**: Stego backends (LSB, DCT) registered via `backends/registry.py` - **ImageHashes generalized**: phash/dhash now optional, enabling non-image attestation -- **Lazy path resolution**: Modules use `import soosef.paths as _paths` for --data-dir support -- **Two-way federation**: Delivery acknowledgment records enable handshake proof -- **Chain record types**: CONTENT_TYPE_KEY_ROTATION, CONTENT_TYPE_KEY_RECOVERY, - CONTENT_TYPE_DELIVERY_ACK (in federation/chain.py) +- **Lazy path resolution**: All paths in paths.py resolve lazily via `__getattr__` from + `BASE_DIR` so that runtime overrides (--data-dir, SOOSEF_DATA_DIR) propagate correctly +- **Two-way federation**: Delivery acknowledgment records (`soosef/delivery-ack-v1`) + enable handshake proof +- **Chain record types** (in federation/chain.py): + - `CONTENT_TYPE_KEY_ROTATION = "soosef/key-rotation-v1"` -- signed by OLD key + - `CONTENT_TYPE_KEY_RECOVERY = "soosef/key-recovery-v1"` -- signed by NEW key + - `CONTENT_TYPE_DELIVERY_ACK = "soosef/delivery-ack-v1"` -- signed by receiver +- **Gossip federation** (verisoo/federation.py): GossipNode with async peer sync, + consistency proofs, HttpTransport over aiohttp. 
PeerStore for SQLite-backed persistence +- **Threat level presets**: deploy/config-presets/ with low/medium/high/critical configs +- **Selective disclosure**: Chain records can be exported with non-selected records + redacted to hashes only (for legal discovery) +- **Evidence packages**: Self-contained ZIPs with verify.py that needs only Python + cryptography +- **Cold archives**: OAIS-aligned full-state export with ALGORITHMS.txt +- **Transport-aware stego**: --transport whatsapp|signal|telegram auto-selects DCT/JPEG + and pre-resizes carrier for platform survival + +## Data directory layout (`~/.soosef/`) + +``` +~/.soosef/ + config.json Unified configuration (SoosefConfig dataclass) + audit.jsonl Append-only audit trail (JSON-lines) + carrier_history.json Carrier reuse tracking database + identity/ Ed25519 keypair (private.pem, public.pem, identity.meta.json) + archived/ Timestamped old keypairs from rotations + stegasoo/ Channel key (channel.key) + archived/ Timestamped old channel keys from rotations + attestations/ Verisoo attestation store + log.bin Binary attestation log + index/ LMDB index + peers.json Legacy peer file + federation/ Federation state + peers.db SQLite peer + sync history + chain/ Hash chain (chain.bin, state.cbor) + anchors/ External timestamp anchors (JSON files) + auth/ Web UI auth databases + soosef.db User accounts + dropbox.db Drop box tokens + receipts + certs/ Self-signed TLS certificates (cert.pem, key.pem) + fieldkit/ Fieldkit state + deadman.json Dead man's switch timer + tamper/baseline.json File integrity baselines + usb/whitelist.json USB device whitelist + geofence.json GPS boundary config + temp/ Ephemeral file storage + dropbox/ Source drop box submissions + instance/ Flask instance (sessions, .secret_key) + trusted_keys/ Collaborator Ed25519 public keys (trust store) + / Per-key directory (public.pem + meta.json) + last_backup.json Backup timestamp tracking +``` ## Code conventions -Black (100-char), Ruff, mypy, imperative 
commit messages. +Black (100-char), Ruff (E, F, I, N, W, UP), mypy (strict, ignore_missing_imports), +imperative commit messages. + +## Testing + +```bash +pytest # All tests with coverage +pytest tests/test_chain.py # Chain-specific +``` + +Test files: `test_chain.py`, `test_chain_security.py`, `test_deadman_enforcement.py`, +`test_key_rotation.py`, `test_killswitch.py`, `test_serialization.py`, +`test_stegasoo_audio.py`, `test_stegasoo.py`, `test_verisoo_hashing.py` diff --git a/README.md b/README.md index 8a0ebde..2b2f334 100644 --- a/README.md +++ b/README.md @@ -191,23 +191,84 @@ pip install "soosef[dev]" # All + pytest, black, ruff, mypy | `all` | All of the above | | `dev` | All + pytest, pytest-cov, black, ruff, mypy | -### Airgapped / Raspberry Pi install +### Airgapped install Bundle wheels on a networked machine, then install offline: ```bash # On networked machine -pip download "soosef[rpi]" -d ./wheels +pip download "soosef[web,cli]" -d ./wheels # Transfer ./wheels to target via USB # On airgapped machine -pip install --no-index --find-links=./wheels "soosef[rpi]" +pip install --no-index --find-links=./wheels "soosef[web,cli]" soosef init soosef serve --host 0.0.0.0 ``` --- +## Deployment + +SooSeF uses a three-tier deployment model designed for field journalism, organizational +evidence management, and cross-organization federation. + +``` +Tier 1: Field Device Tier 2: Org Server Tier 3: Federation Relay +(Bootable USB + laptop) (Docker on mini PC / VPS) (Docker on VPS) + +Reporter in the field Newsroom / NGO office Friendly jurisdiction +Amnesic, LUKS-encrypted Persistent storage Attestation sync only +Pull USB = zero trace Web UI + federation API Zero knowledge of keys + \ | / + \_____ sneakernet ____+____ gossip API ____/ +``` + +**Tier 1 -- Field Device.** A bootable Debian Live USB stick. Boots into a minimal desktop +with Firefox pointed at the local SooSeF web UI. LUKS-encrypted persistent partition. 
Pull +the USB and the host machine retains nothing. + +**Tier 2 -- Org Server.** A Docker deployment on a mini PC or trusted VPS. Runs the full +web UI (port 5000) and federation API (port 8000). Manages keys, attestations, and +federation sync. + +**Tier 3 -- Federation Relay.** A lightweight Docker container in a jurisdiction with +strong press protections. Relays attestation records between organizations. Stores only +hashes and signatures -- never keys, plaintext, or original media. + +### Quick deploy + +```bash +# Tier 1: Build USB image +cd deploy/live-usb && sudo ./build.sh + +# Tier 2: Docker org server +cd deploy/docker && docker compose up server -d + +# Tier 3: Docker federation relay +cd deploy/docker && docker compose up relay -d +``` + +### Threat level configuration presets + +SooSeF ships four configuration presets at `deploy/config-presets/`: + +| Preset | Session | Killswitch | Dead Man | Cover Name | +|---|---|---|---|---| +| `low-threat.json` | 30 min | Off | Off | None | +| `medium-threat.json` | 15 min | On | 48h / 4h grace | "Office Document Manager" | +| `high-threat.json` | 5 min | On | 12h / 1h grace | "Local Inventory Tracker" | +| `critical-threat.json` | 3 min | On | 6h / 1h grace | "System Statistics" | + +```bash +cp deploy/config-presets/high-threat.json ~/.soosef/config.json +``` + +See [docs/deployment.md](docs/deployment.md) for the full deployment guide including +security hardening, Kubernetes manifests, systemd services, and operational security notes. + +--- + ## CLI Reference All commands accept `--data-dir PATH` to override the default `~/.soosef` directory, @@ -372,6 +433,7 @@ through Flask blueprints. 
Served by **Waitress** (production WSGI server) by def | keys | `/keys/*` | Key management, rotation, export/import | | admin | `/admin/*` | User management (multi-user auth via SQLite) | | dropbox | `/dropbox/admin`, `/dropbox/upload/` | Source drop box: token creation (admin), anonymous upload (source), receipt verification | +| federation | `/federation/*` | Federation peer dashboard, peer add/remove | | health | `/health` | Capability reporting endpoint (see API section) | @@ -465,6 +527,13 @@ frontends/web/ keys.py /keys/* admin.py /admin/* dropbox.py /dropbox/* (source drop box) + federation.py /federation/* (peer dashboard) + +deploy/ Deployment artifacts + docker/ Dockerfile (multi-stage: builder, relay, server) + compose + kubernetes/ Namespace, server, and relay deployments + live-usb/ Debian Live USB build scripts (Tier 1) + config-presets/ Threat level presets (low/medium/high/critical) ``` ### Data directory (`~/.soosef/`) @@ -603,6 +672,29 @@ Python 3.11, 3.12, 3.13, and 3.14. 
--- +## Documentation + +| Document | Audience | Description | +|---|---|---| +| [docs/deployment.md](docs/deployment.md) | Field deployers, IT staff | Three-tier deployment guide, security hardening, troubleshooting | +| [docs/federation.md](docs/federation.md) | System administrators | Gossip protocol, peer setup, offline bundles, federation API | +| [docs/evidence-guide.md](docs/evidence-guide.md) | Investigators, legal teams | Evidence packages, cold archives, selective disclosure, anchoring | +| [docs/source-dropbox.md](docs/source-dropbox.md) | Administrators | Source drop box setup, EXIF pipeline, receipt codes | +| [docs/training/reporter-quickstart.md](docs/training/reporter-quickstart.md) | Field reporters | One-page quick-start card for Tier 1 USB users | +| [docs/training/emergency-card.md](docs/training/emergency-card.md) | All users | Laminated wallet card: emergency destruction, dead man's switch | +| [docs/training/admin-reference.md](docs/training/admin-reference.md) | Administrators | CLI cheat sheet, hardening checklist, troubleshooting | + +Architecture documents (design-level, for contributors): + +| Document | Description | +|---|---| +| [docs/architecture/federation.md](docs/architecture/federation.md) | System architecture overview, threat model, layer design | +| [docs/architecture/chain-format.md](docs/architecture/chain-format.md) | Chain record spec (CBOR, entropy witnesses) | +| [docs/architecture/export-bundle.md](docs/architecture/export-bundle.md) | Export bundle spec (binary format, envelope encryption) | +| [docs/architecture/federation-protocol.md](docs/architecture/federation-protocol.md) | Federation server protocol (CT-inspired, gossip) | + +--- + ## License MIT License. See [LICENSE](LICENSE) for details. 
diff --git a/docs/architecture/chain-format.md b/docs/architecture/chain-format.md new file mode 100644 index 0000000..f939de9 --- /dev/null +++ b/docs/architecture/chain-format.md @@ -0,0 +1,252 @@ +# Chain Format Specification + +**Status**: Design +**Version**: 1 (record format version) +**Last updated**: 2026-04-01 + +## 1. Overview + +The attestation chain is an append-only sequence of signed records stored locally on the +offline device. Each record includes a hash of the previous record, forming a tamper-evident +chain analogous to git commits or blockchain blocks. + +The chain wraps existing Verisoo attestation records. A Verisoo record's serialized bytes +become the input to `content_hash`, preserving the original attestation while adding +ordering, entropy witnesses, and chain integrity guarantees. + +## 2. AttestationChainRecord + +### Field Definitions + +| Field | CBOR Key | Type | Size | Description | +|---|---|---|---|---| +| `version` | 0 | unsigned int | 1 byte | Record format version. Currently `1`. | +| `record_id` | 1 | byte string | 16 bytes | UUID v7 (RFC 9562). Time-ordered unique identifier. | +| `chain_index` | 2 | unsigned int | 8 bytes max | Monotonically increasing, 0-based. Genesis record is index 0. | +| `prev_hash` | 3 | byte string | 32 bytes | SHA-256 of `canonical_bytes(previous_record)`. Genesis: `0x00 * 32`. | +| `content_hash` | 4 | byte string | 32 bytes | SHA-256 of the wrapped content (e.g., Verisoo record bytes). | +| `content_type` | 5 | text string | variable | MIME-like type identifier. `"verisoo/attestation-v1"` for Verisoo records. | +| `metadata` | 6 | CBOR map | variable | Extensible key-value map. See §2.1. | +| `claimed_ts` | 7 | integer | 8 bytes max | Unix timestamp in microseconds (µs). Signed integer to handle pre-epoch dates. | +| `entropy_witnesses` | 8 | CBOR map | variable | System entropy snapshot. See §3. | +| `signer_pubkey` | 9 | byte string | 32 bytes | Ed25519 raw public key bytes. 
| +| `signature` | 10 | byte string | 64 bytes | Ed25519 signature over `canonical_bytes(record)` excluding the signature field. | + +### 2.1 Metadata Map + +The `metadata` field is an open CBOR map with text string keys. Defined keys: + +| Key | Type | Description | +|---|---|---| +| `"backfilled"` | bool | `true` if this record was created by the backfill migration | +| `"caption"` | text | Human-readable description of the attested content | +| `"location"` | text | Location name associated with the attestation | +| `"original_ts"` | integer | Original Verisoo timestamp (µs) if different from `claimed_ts` | +| `"tags"` | array of text | User-defined classification tags | + +Applications may add custom keys. Unknown keys must be preserved during serialization. + +## 3. Entropy Witnesses + +Entropy witnesses are system-state snapshots collected at record creation time. They serve +as soft evidence that the claimed timestamp is plausible. Fabricating convincing witnesses +for a backdated record requires simulating the full system state at the claimed time. + +| Field | CBOR Key | Type | Source | Fallback (non-Linux) | +|---|---|---|---|---| +| `sys_uptime` | 0 | float64 | `time.monotonic()` | Same (cross-platform) | +| `fs_snapshot` | 1 | byte string (16 bytes) | SHA-256 of `os.stat()` on chain DB, truncated to 16 bytes | SHA-256 of chain dir stat | +| `proc_entropy` | 2 | unsigned int | `/proc/sys/kernel/random/entropy_avail` | `len(os.urandom(32))` (always 32, marker for non-Linux) | +| `boot_id` | 3 | text string | `/proc/sys/kernel/random/boot_id` | `uuid.uuid4()` cached per process lifetime | + +### Witness Properties + +- **sys_uptime**: Monotonically increasing within a boot. Cannot decrease. A record with + `sys_uptime < previous_record.sys_uptime` and `claimed_ts > previous_record.claimed_ts` + is suspicious (reboot or clock manipulation). +- **fs_snapshot**: Changes with every write to the chain DB. Hash includes mtime, ctime, + size, and inode number. 
+- **proc_entropy**: Varies naturally. On Linux, reflects kernel entropy pool state. +- **boot_id**: Changes on every reboot. Identical `boot_id` across records implies same + boot session — combined with `sys_uptime`, this constrains the timeline. + +## 4. Serialization + +### 4.1 Canonical Bytes + +`canonical_bytes(record)` produces the deterministic byte representation used for hashing +and signing. It is a CBOR map containing all fields **except** `signature`, encoded using +CBOR canonical encoding (RFC 8949 §4.2): + +- Map keys sorted by integer value (0, 1, 2, ..., 9) +- Integers use minimal-length encoding +- No indefinite-length items +- No duplicate keys + +``` +canonical_bytes(record) = cbor2.dumps({ + 0: record.version, + 1: record.record_id, + 2: record.chain_index, + 3: record.prev_hash, + 4: record.content_hash, + 5: record.content_type, + 6: record.metadata, + 7: record.claimed_ts, + 8: { + 0: record.entropy_witnesses.sys_uptime, + 1: record.entropy_witnesses.fs_snapshot, + 2: record.entropy_witnesses.proc_entropy, + 3: record.entropy_witnesses.boot_id, + }, + 9: record.signer_pubkey, +}, canonical=True) +``` + +### 4.2 Record Hash + +``` +compute_record_hash(record) = SHA-256(canonical_bytes(record)) +``` + +This hash is used as `prev_hash` in the next record and as Merkle tree leaves in export +bundles. + +### 4.3 Signature + +``` +record.signature = Ed25519_Sign(private_key, canonical_bytes(record)) +``` + +Verification: +``` +Ed25519_Verify(record.signer_pubkey, record.signature, canonical_bytes(record)) +``` + +### 4.4 Full Serialization + +`serialize_record(record)` produces the full CBOR encoding including the signature field +(CBOR key 10). This is used for storage and transmission. + +``` +serialize_record(record) = cbor2.dumps({ + 0: record.version, + 1: record.record_id, + ... + 9: record.signer_pubkey, + 10: record.signature, +}, canonical=True) +``` + +## 5. 
Chain Rules + +### 5.1 Genesis Record + +The first record in a chain (index 0) has: +- `chain_index = 0` +- `prev_hash = b'\x00' * 32` (32 zero bytes) + +The **chain ID** is defined as `SHA-256(canonical_bytes(genesis_record))`. This permanently +identifies the chain. + +### 5.2 Append Rule + +For record N (where N > 0): +``` +record_N.chain_index == record_{N-1}.chain_index + 1 +record_N.prev_hash == compute_record_hash(record_{N-1}) +record_N.claimed_ts >= record_{N-1}.claimed_ts (SHOULD, not MUST — clock skew possible) +``` + +### 5.3 Verification + +Full chain verification checks, for each record from index 0 to head: +1. `Ed25519_Verify(record.signer_pubkey, record.signature, canonical_bytes(record))` — signature valid +2. `record.chain_index == expected_index` — no gaps or duplicates +3. `record.prev_hash == compute_record_hash(previous_record)` — chain link intact +4. All `signer_pubkey` values are identical within a chain (single-signer chain) + +Violation of rule 4 indicates a chain was signed by multiple identities, which may be +legitimate (key rotation) or malicious (chain hijacking). Key rotation is out of scope for +v1; implementations should flag this as a warning. + +## 6. Storage Format + +### 6.1 chain.bin (Append-Only Log) + +Records are stored sequentially as length-prefixed CBOR: + +``` +┌─────────────────────────────┐ +│ uint32 BE: record_0 length │ +│ bytes: serialize(record_0) │ +├─────────────────────────────┤ +│ uint32 BE: record_1 length │ +│ bytes: serialize(record_1) │ +├─────────────────────────────┤ +│ ... 
│ +└─────────────────────────────┘ +``` + +- Length prefix is 4 bytes, big-endian unsigned 32-bit integer +- Maximum record size: 4 GiB (practical limit much smaller) +- File is append-only; records are never modified or deleted +- File locking via `fcntl.flock(LOCK_EX)` for single-writer safety + +### 6.2 state.cbor (Chain State Checkpoint) + +A single CBOR map, atomically rewritten after each append: + +```cbor +{ + "chain_id": bytes[32], # SHA-256(canonical_bytes(genesis)) + "head_index": uint, # Index of the most recent record + "head_hash": bytes[32], # Hash of the most recent record + "record_count": uint, # Total records in chain + "created_at": int, # Unix µs when chain was created + "last_append_at": int # Unix µs of last append +} +``` + +This file is a performance optimization — the canonical state is always derivable from +`chain.bin`. On corruption, `state.cbor` is rebuilt by scanning the log. + +### 6.3 File Locations + +``` +~/.soosef/chain/ + chain.bin Append-only record log + state.cbor Chain state checkpoint +``` + +Paths are defined in `src/soosef/paths.py`. + +## 7. Migration from Verisoo-Only Attestations + +Existing Verisoo attestations in `~/.soosef/attestations/` are not modified. The chain +is a parallel structure. Migration is performed by the `soosef chain backfill` command: + +1. Iterate all records in Verisoo's `LocalStorage` (ordered by timestamp) +2. For each record, compute `content_hash = SHA-256(record.to_bytes())` +3. Create a chain record with: + - `content_type = "verisoo/attestation-v1"` + - `claimed_ts` set to the original Verisoo timestamp + - `metadata = {"backfilled": true, "original_ts": }` + - Entropy witnesses collected at migration time (not original time) +4. Append to chain + +Backfilled records are distinguishable via the `backfilled` metadata flag. Their entropy +witnesses reflect migration time, not original attestation time — this is honest and +intentional. + +## 8. 
Content Types + +The `content_type` field identifies what was hashed into `content_hash`. Defined types: + +| Content Type | Description | +|---|---| +| `verisoo/attestation-v1` | Verisoo `AttestationRecord` serialized bytes | +| `soosef/raw-file-v1` | Raw file bytes (for non-image attestations, future) | +| `soosef/metadata-only-v1` | No file content; metadata-only attestation (future) | + +New content types may be added without changing the record format version. diff --git a/docs/architecture/export-bundle.md b/docs/architecture/export-bundle.md new file mode 100644 index 0000000..910bbb7 --- /dev/null +++ b/docs/architecture/export-bundle.md @@ -0,0 +1,319 @@ +# Export Bundle Specification + +**Status**: Design +**Version**: 1 (bundle format version) +**Last updated**: 2026-04-01 + +## 1. Overview + +An export bundle packages a contiguous range of chain records into a portable, encrypted +file suitable for transfer across an air gap. The bundle format is designed so that: + +- **Auditors** can verify chain integrity without decrypting content +- **Recipients** with the correct key can decrypt and read attestation records +- **Anyone** can detect tampering via Merkle root and signature verification +- **Steganographic embedding** is optional — bundles can be hidden in JPEG images via DCT + +The format follows the pattern established by `keystore/export.py` (SOOBNDL): magic bytes, +version, structured binary payload. + +## 2. Binary Layout + +``` +Offset Size Field +────── ───────── ────────────────────────────────────── +0 8 magic: b"SOOSEFX1" +8 1 version: uint8 (1) +9 4 summary_len: uint32 BE +13 var chain_summary: CBOR (see §3) +var 4 recipients_len: uint32 BE +var var recipients: CBOR array (see §4) +var 12 nonce: AES-256-GCM nonce +var var ciphertext: AES-256-GCM(zstd(CBOR(records))) +last 16 16 tag: AES-256-GCM authentication tag +``` + +All multi-byte integers are big-endian. 
The total bundle size is: +`9 + 4 + summary_len + 4 + recipients_len + 12 + ciphertext_len + 16` + +### Parsing Without Decryption + +To audit a bundle without decryption, read: +1. Magic (8 bytes) — verify `b"SOOSEFX1"` +2. Version (1 byte) — verify `1` +3. Summary length (4 bytes BE) — read the next N bytes as CBOR +4. Chain summary — verify signature, inspect metadata + +The encrypted payload and recipient list can be skipped for audit purposes. + +## 3. Chain Summary + +The chain summary sits **outside** the encryption envelope. It provides verifiable metadata +about the bundle contents without revealing the actual attestation data. + +CBOR map with integer keys: + +| CBOR Key | Field | Type | Description | +|---|---|---|---| +| 0 | `bundle_id` | byte string (16) | UUID v7, unique bundle identifier | +| 1 | `chain_id` | byte string (32) | SHA-256(genesis record) — identifies source chain | +| 2 | `range_start` | unsigned int | First record index (inclusive) | +| 3 | `range_end` | unsigned int | Last record index (inclusive) | +| 4 | `record_count` | unsigned int | Number of records in bundle | +| 5 | `first_hash` | byte string (32) | `compute_record_hash(first_record)` | +| 6 | `last_hash` | byte string (32) | `compute_record_hash(last_record)` | +| 7 | `merkle_root` | byte string (32) | Root of Merkle tree over record hashes (see §5) | +| 8 | `created_ts` | integer | Bundle creation timestamp (Unix µs) | +| 9 | `signer_pubkey` | byte string (32) | Ed25519 public key of bundle creator | +| 10 | `bundle_sig` | byte string (64) | Ed25519 signature (see §3.1) | + +### 3.1 Signature Computation + +The signature covers all summary fields except `bundle_sig` itself: + +``` +summary_bytes = cbor2.dumps({ + 0: bundle_id, + 1: chain_id, + 2: range_start, + 3: range_end, + 4: record_count, + 5: first_hash, + 6: last_hash, + 7: merkle_root, + 8: created_ts, + 9: signer_pubkey, +}, canonical=True) + +bundle_sig = Ed25519_Sign(private_key, summary_bytes) +``` + +### 3.2 
Verification Without Decryption + +An auditor verifies a bundle by: +1. Parse chain summary +2. `Ed25519_Verify(signer_pubkey, bundle_sig, summary_bytes)` — authentic summary +3. `record_count == range_end - range_start + 1` — count matches range +4. If previous bundles from the same `chain_id` exist, verify `first_hash` matches + the expected continuation + +The auditor now knows: "A chain with ID X contains records [start, end], the creator +signed this claim, and the Merkle root commits to specific record contents." All without +decrypting. + +## 4. Envelope Encryption + +### 4.1 Key Derivation + +Ed25519 signing keys are converted to X25519 Diffie-Hellman keys for encryption: + +``` +x25519_private = Ed25519_to_X25519_Private(ed25519_private_key) +x25519_public = Ed25519_to_X25519_Public(ed25519_public_key_bytes) +``` + +This uses the birational map between Ed25519 and X25519 curves, supported natively by +the `cryptography` library. + +### 4.2 DEK Generation + +A random 32-byte data encryption key (DEK) is generated per bundle: + +``` +dek = os.urandom(32) # AES-256 key +``` + +### 4.3 DEK Wrapping (Per Recipient) + +For each recipient, the DEK is wrapped using X25519 ECDH + HKDF + AES-256-GCM: + +``` +1. shared_secret = X25519_ECDH(sender_x25519_private, recipient_x25519_public) +2. derived_key = HKDF-SHA256( + ikm=shared_secret, + salt=bundle_id, # binds to this specific bundle + info=b"soosef-dek-wrap-v1", + length=32 + ) +3. wrapped_dek = AES-256-GCM_Encrypt( + key=derived_key, + nonce=os.urandom(12), + plaintext=dek, + aad=bundle_id # additional authenticated data + ) +``` + +### 4.4 Recipients Array + +CBOR array of recipient entries: + +```cbor +[ + { + 0: recipient_pubkey, # byte string (32) — Ed25519 public key + 1: wrap_nonce, # byte string (12) — AES-GCM nonce for DEK wrap + 2: wrapped_dek, # byte string (48) — encrypted DEK (32) + GCM tag (16) + }, + ... +] +``` + +### 4.5 Payload Encryption + +``` +1. 
records_cbor = cbor2.dumps([serialize_record(r) for r in records], canonical=True) +2. compressed = zstd.compress(records_cbor, level=3) +3. nonce = os.urandom(12) +4. ciphertext, tag = AES-256-GCM_Encrypt( + key=dek, + nonce=nonce, + plaintext=compressed, + aad=summary_bytes # binds ciphertext to this summary + ) +``` + +The `summary_bytes` (same bytes that are signed) are used as additional authenticated +data (AAD). This cryptographically binds the encrypted payload to the chain summary — +modifying the summary invalidates the decryption. + +### 4.6 Decryption + +A recipient decrypts a bundle: + +``` +1. Parse chain summary, verify bundle_sig +2. Find own pubkey in recipients array +3. shared_secret = X25519_ECDH(recipient_x25519_private, sender_x25519_public) + (sender_x25519_public derived from summary.signer_pubkey) +4. derived_key = HKDF-SHA256(shared_secret, salt=bundle_id, info=b"soosef-dek-wrap-v1") +5. dek = AES-256-GCM_Decrypt(derived_key, wrap_nonce, wrapped_dek, aad=bundle_id) +6. compressed = AES-256-GCM_Decrypt(dek, nonce, ciphertext, aad=summary_bytes) +7. records_cbor = zstd.decompress(compressed) +8. records = [deserialize_record(r) for r in cbor2.loads(records_cbor)] +9. Verify each record's signature and chain linkage +``` + +## 5. Merkle Tree + +The Merkle tree provides compact proofs that specific records are included in a bundle. + +### 5.1 Construction + +Leaves are the record hashes in chain order: + +``` +leaf[i] = compute_record_hash(records[i]) +``` + +Internal nodes: + +``` +node = SHA-256(left_child || right_child) +``` + +If the number of leaves is not a power of 2, the last leaf is promoted to the next level +(standard binary Merkle tree padding). + +### 5.2 Inclusion Proof + +An inclusion proof for record at index `i` is a list of `(sibling_hash, direction)` pairs +from the leaf to the root. 
Verification: + +``` +current = leaf[i] +for (sibling, direction) in proof: + if direction == "L": + current = SHA-256(sibling || current) + else: + current = SHA-256(current || sibling) +assert current == merkle_root +``` + +### 5.3 Usage + +- **Export bundles**: `merkle_root` in chain summary commits to exact record contents +- **Federation servers**: Build a separate Merkle tree over bundle hashes (see federation-protocol.md) + +These are two different trees: +1. **Record tree** (this section) — leaves are record hashes within a bundle +2. **Bundle tree** (federation) — leaves are bundle hashes across the federation log + +## 6. Steganographic Embedding + +Bundles can optionally be embedded in JPEG images using stegasoo's DCT steganography: + +``` +1. bundle_bytes = create_export_bundle(chain, start, end, private_key, recipients) +2. stego_image = stegasoo.encode( + carrier=carrier_image, + reference=reference_image, + file_data=bundle_bytes, + passphrase=passphrase, + embed_mode="dct", + channel_key=channel_key # optional + ) +``` + +Extraction: +``` +1. result = stegasoo.decode( + carrier=stego_image, + reference=reference_image, + passphrase=passphrase, + channel_key=channel_key + ) +2. bundle_bytes = result.file_data +3. assert bundle_bytes[:8] == b"SOOSEFX1" +``` + +### 6.1 Capacity Considerations + +DCT steganography has limited capacity relative to the carrier image size. Approximate +capacities: + +| Carrier Size | Approximate DCT Capacity | Records (est.) | +|---|---|---| +| 1 MP (1024x1024) | ~10 KB | ~20-40 records | +| 4 MP (2048x2048) | ~40 KB | ~80-160 records | +| 12 MP (4000x3000) | ~100 KB | ~200-400 records | + +Record size varies (~200-500 bytes each after CBOR serialization, before compression). +Zstd compression typically achieves 2-4x ratio on CBOR attestation data. Use +`check_capacity()` before embedding. + +### 6.2 Multiple Images + +For large export ranges, split across multiple bundles embedded in multiple carrier images. 
+Each bundle is self-contained with its own chain summary. The receiving side imports them +in any order — the chain indices and hashes enable reassembly. + +## 7. Recipient Management + +### 7.1 Adding Recipients + +Recipients are identified by their Ed25519 public keys. To encrypt a bundle for a +recipient, the creator needs only their public key (no shared secret setup required). + +### 7.2 Recipient Discovery + +Recipients' Ed25519 public keys can be obtained via: +- Direct exchange (QR code, USB transfer, verbal fingerprint verification) +- Federation server identity registry (when available) +- Verisoo's existing `peers.json` file + +### 7.3 Self-Encryption + +The bundle creator should always include their own public key in the recipients list. +This allows them to decrypt their own exports (e.g., when restoring from backup). + +## 8. Error Handling + +| Error | Cause | Response | +|---|---|---| +| Bad magic | Not a SOOSEFX1 bundle | Reject with `ExportError("not a SooSeF export bundle")` | +| Bad version | Unsupported format version | Reject with `ExportError("unsupported bundle version")` | +| Signature invalid | Tampered summary or wrong signer | Reject with `ExportError("bundle signature verification failed")` | +| No matching recipient | Decryptor's key not in recipients list | Reject with `ExportError("not an authorized recipient")` | +| GCM auth failure | Tampered ciphertext or wrong key | Reject with `ExportError("decryption failed — bundle may be corrupted")` | +| Decompression failure | Corrupted compressed data | Reject with `ExportError("decompression failed")` | +| Chain integrity failure | Records don't link correctly | Reject with `ChainIntegrityError(...)` after decryption | diff --git a/docs/architecture/federation-protocol.md b/docs/architecture/federation-protocol.md new file mode 100644 index 0000000..a1c5de5 --- /dev/null +++ b/docs/architecture/federation-protocol.md @@ -0,0 +1,565 @@ +# Federation Protocol Specification + +**Status**: 
Design +**Version**: 1 (protocol version) +**Last updated**: 2026-04-01 + +## 1. Overview + +The federation is a network of append-only log servers inspired by Certificate Transparency +(RFC 6962). Each server acts as a "blind notary": it stores encrypted attestation bundles, +maintains a Merkle tree over them, and issues signed receipts proving when bundles were +received. Servers gossip with peers to ensure consistency and replicate data. + +Federation servers never decrypt attestation content. They operate at the "federation +member" permission tier: they can verify chain summaries and signatures, but not read +the underlying attestation data. + +## 2. Terminology + +| Term | Definition | +|---|---| +| **Bundle** | An encrypted export bundle (SOOSEFX1 format) containing chain records | +| **STH** | Signed Tree Head: a server's signed commitment to its current Merkle tree state | +| **Receipt** | A server-signed proof that a bundle was included in its log at a specific time | +| **Inclusion proof** | Merkle path from a leaf (bundle hash) to the tree root | +| **Consistency proof** | Proof that an older tree is a prefix of a newer tree (no entries removed) | +| **Gossip** | Peer-to-peer exchange of STHs and entries to maintain consistency | + +## 3. Server Merkle Tree + +### 3.1 Structure + +The server maintains a single append-only Merkle tree. Each leaf is the domain-separated +SHA-256 hash of a received bundle's raw bytes: + +``` +leaf[i] = SHA-256(0x00 || bundle_bytes[i]) +``` + +Internal nodes follow standard Merkle tree construction: +``` +node = SHA-256(0x01 || left || right) # internal node +leaf = SHA-256(0x00 || data) # leaf node (domain separation) +``` + +Domain separation prefixes (`0x00` for leaves, `0x01` for internal nodes) prevent +second-preimage attacks, following CT convention (RFC 6962 §2.1). 
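A minimal sketch of the hashing described in §3.1, using the `0x00`-prefixed leaf form that §3.3 and §4.1 also use. The function names are hypothetical, and the split rule (largest power of two strictly less than the leaf count) is an assumption taken from RFC 6962 rather than something this spec states:

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    """Leaf hash with the 0x00 domain-separation prefix."""
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    """Internal node hash with the 0x01 domain-separation prefix."""
    return hashlib.sha256(b"\x01" + left + right).digest()


def merkle_root(bundles: list[bytes]) -> bytes:
    """Root over raw bundle bytes (hypothetical helper, RFC 6962 layout assumed)."""
    n = len(bundles)
    if n == 0:
        # RFC 6962 defines the empty-tree hash as SHA-256 of the empty string.
        return hashlib.sha256(b"").digest()
    if n == 1:
        return leaf_hash(bundles[0])
    # Split at the largest power of two strictly less than n.
    k = 1
    while k * 2 < n:
        k *= 2
    return node_hash(merkle_root(bundles[:k]), merkle_root(bundles[k:]))
```

Because the prefixes differ, a leaf whose data happens to equal two concatenated child hashes still hashes to a different value than the internal node over those children, which is the property the domain separation is meant to provide.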
+ +### 3.2 Signed Tree Head (STH) + +After each append (or periodically in batch mode), the server computes and signs a new +tree head: + +```cbor +{ + 0: tree_size, # uint — number of leaves + 1: root_hash, # bytes[32] — Merkle tree root + 2: timestamp, # int — Unix µs, server's clock + 3: server_id, # text — server identifier (domain or pubkey fingerprint) + 4: server_pubkey, # bytes[32] — Ed25519 public key + 5: signature, # bytes[64] — Ed25519(cbor(fields 0-4)) +} +``` + +The STH is the server's signed commitment: "My tree has N entries with this root at this +time." Clients and peers can verify the signature and use consistency proofs to ensure +the tree only grows (never shrinks or forks). + +### 3.3 Inclusion Proof + +Proves a specific bundle is at index `i` in a tree of size `n`: + +``` +proof = [(sibling_hash, direction), ...] +``` + +Verification: +``` +current = SHA-256(0x00 || bundle_bytes) +for (sibling, direction) in proof: + if direction == "L": + current = SHA-256(0x01 || sibling || current) + else: + current = SHA-256(0x01 || current || sibling) +assert current == sth.root_hash +``` + +### 3.4 Consistency Proof + +Proves that tree of size `m` is a prefix of tree of size `n` (where `m < n`). This +guarantees the server hasn't removed or reordered entries. + +The proof is a list of intermediate hashes that, combined with the old root, reconstruct +the new root. Verification follows RFC 6962 §2.1.2. + +## 4. API Endpoints + +All endpoints use CBOR for request/response bodies. Content-Type: `application/cbor`. + +### 4.1 Submit Bundle + +``` +POST /v1/submit +``` + +**Request body**: Raw bundle bytes (application/octet-stream) + +**Processing**: +1. Verify magic bytes `b"SOOSEFX1"` and version +2. Parse chain summary +3. Verify `bundle_sig` against `signer_pubkey` +4. Compute `bundle_hash = SHA-256(0x00 || bundle_bytes)` +5. Check for duplicate (`bundle_hash` already in tree) — if duplicate, return existing receipt +6. 
Append `bundle_hash` to Merkle tree +7. Store bundle bytes (encrypted blob, as-is) +8. Generate and sign receipt + +**Response** (CBOR): +```cbor +{ + 0: bundle_id, # bytes[16] — from chain summary + 1: bundle_hash, # bytes[32] — leaf hash + 2: tree_size, # uint — tree size after inclusion + 3: tree_index, # uint — leaf index in tree + 4: timestamp, # int — Unix µs, server's reception time + 5: inclusion_proof, # array of bytes[32] — Merkle path + 6: sth, # map — current STH (see §3.2) + 7: server_id, # text — server identifier + 8: server_pubkey, # bytes[32] — Ed25519 public key + 9: receipt_sig, # bytes[64] — Ed25519(cbor(fields 0-8)) +} +``` + +**Auth**: Federation member token required. + +**Errors**: +- `400` — Invalid bundle format, bad signature +- `401` — Missing or invalid auth token +- `507` — Server storage full + +### 4.2 Get Signed Tree Head + +``` +GET /v1/sth +``` + +**Response** (CBOR): STH map (see §3.2) + +**Auth**: Public (no auth required). + +### 4.3 Get Consistency Proof + +``` +GET /v1/consistency-proof?old={m}&new={n} +``` + +**Parameters**: +- `old` — previous tree size (must be > 0) +- `new` — current tree size (must be >= old) + +**Response** (CBOR): +```cbor +{ + 0: old_size, # uint + 1: new_size, # uint + 2: proof, # array of bytes[32] +} +``` + +**Auth**: Public. + +### 4.4 Get Inclusion Proof + +``` +GET /v1/inclusion-proof?hash={hex}&tree_size={n} +``` + +**Parameters**: +- `hash` — hex-encoded bundle hash (leaf hash) +- `tree_size` — tree size for the proof (use current STH tree_size) + +**Response** (CBOR): +```cbor +{ + 0: tree_index, # uint — leaf index + 1: tree_size, # uint + 2: proof, # array of bytes[32] +} +``` + +**Auth**: Public. 
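The §4.4 response carries only sibling hashes plus `tree_index` and `tree_size`, with no left/right markers. One way a client could check such a proof is to recover the direction of each step from the index and size, as in the verification procedure described for Certificate Transparency in RFC 9162. The sketch below assumes the server uses the RFC 6962 tree layout and orders the proof from leaf to root; neither is stated by this spec, and `verify_inclusion` is a hypothetical name:

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    """Leaf hash with the 0x00 domain-separation prefix."""
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    """Internal node hash with the 0x01 domain-separation prefix."""
    return hashlib.sha256(b"\x01" + left + right).digest()


def verify_inclusion(leaf: bytes, tree_index: int, tree_size: int,
                     proof: list[bytes], root_hash: bytes) -> bool:
    """Check a direction-less inclusion proof (assumed RFC 6962 layout)."""
    if tree_index < 0 or tree_index >= tree_size:
        return False
    fn, sn = tree_index, tree_size - 1
    current = leaf
    for sibling in proof:
        if sn == 0:
            return False  # proof is longer than the path to the root
        if fn & 1 or fn == sn:
            # Current node is a right child: sibling goes on the left.
            current = node_hash(sibling, current)
            if not fn & 1:
                # Skip levels where this node was promoted without a sibling.
                while fn and not fn & 1:
                    fn >>= 1
                    sn >>= 1
        else:
            # Current node is a left child: sibling goes on the right.
            current = node_hash(current, sibling)
        fn >>= 1
        sn >>= 1
    return sn == 0 and current == root_hash
```

A client would pass the `bundle_hash` from its receipt as `leaf` and compare against `root_hash` from an STH whose signature it has already verified.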
+ +### 4.5 Get Entries + +``` +GET /v1/entries?start={s}&end={e} +``` + +**Parameters**: +- `start` — first tree index (inclusive) +- `end` — last tree index (inclusive) +- Maximum range: 1000 entries per request + +**Response** (CBOR): +```cbor +{ + 0: entries, # array of entry maps (see §4.5.1) +} +``` + +#### 4.5.1 Entry Map + +```cbor +{ + 0: tree_index, # uint + 1: bundle_hash, # bytes[32] + 2: chain_summary, # CBOR map (from bundle, unencrypted) + 3: encrypted_blob, # bytes — full SOOSEFX1 bundle + 4: receipt_ts, # int — Unix µs when received +} +``` + +**Auth**: Federation member token required. + +### 4.6 Audit Summary + +``` +GET /v1/audit/summary?bundle_id={hex} +``` + +Returns the chain summary for a specific bundle without the encrypted payload. + +**Response** (CBOR): +```cbor +{ + 0: bundle_id, # bytes[16] + 1: chain_summary, # CBOR map (from bundle) + 2: tree_index, # uint + 3: receipt_ts, # int + 4: inclusion_proof, # array of bytes[32] (against current STH) +} +``` + +**Auth**: Public. + +## 5. Permission Tiers + +### 5.1 Public Auditor + +**Access**: Unauthenticated. + +**Endpoints**: `/v1/sth`, `/v1/consistency-proof`, `/v1/inclusion-proof`, `/v1/audit/summary` + +**Can verify**: +- The log exists and has a specific size at a specific time +- A specific bundle is included in the log at a specific position +- The log has not been forked (consistency proofs between STHs) +- Chain summary metadata (record count, hash range) for any bundle + +**Cannot see**: Encrypted content, chain IDs, signer identities, raw bundles. + +### 5.2 Federation Member + +**Access**: Bearer token issued by server operator. Tokens are Ed25519-signed +credentials binding a public key to a set of permissions. 
+ +```cbor +{ + 0: token_id, # bytes[16] — UUID v7 + 1: member_pubkey, # bytes[32] — member's Ed25519 public key + 2: permissions, # array of text — ["submit", "entries", "gossip"] + 3: issued_at, # int — Unix µs + 4: expires_at, # int — Unix µs (0 = no expiry) + 5: issuer_pubkey, # bytes[32] — server's Ed25519 public key + 6: signature, # bytes[64] — Ed25519(cbor(fields 0-5)) +} +``` + +**Endpoints**: All public endpoints + `/v1/submit`, `/v1/entries`, gossip endpoints. + +**Can see**: Everything a public auditor sees + chain IDs, signer public keys, full +encrypted bundles (but not decrypted content). + +### 5.3 Authorized Recipient + +Not enforced server-side. Recipients hold Ed25519 private keys whose corresponding +public keys appear in the bundle's recipients array. They can decrypt bundle content +locally after retrieving the encrypted blob via the entries endpoint. + +The server has no knowledge of who can or cannot decrypt a given bundle. + +## 6. Gossip Protocol + +### 6.1 Overview + +Federation servers maintain a list of known peers. Periodically (default: every 5 minutes), +each server initiates gossip with its peers to: + +1. Exchange STHs — detect if any peer has entries the local server doesn't +2. Verify consistency — ensure no peer is presenting a forked log +3. Sync entries — pull missing entries from peers that have them + +### 6.2 Gossip Flow + +``` +Server A Server B + │ │ + │── POST /v1/gossip/sth ──────────────>│ (A sends its STH) + │ │ + │<── response: B's STH ───────────────│ (B responds with its STH) + │ │ + │ (A compares tree sizes) │ + │ if B.tree_size > A.tree_size: │ + │ │ + │── GET /v1/consistency-proof ────────>│ (verify B's tree extends A's) + │<── proof ────────────────────────────│ + │ │ + │ (verify consistency proof) │ + │ │ + │── GET /v1/entries?start=...&end=... 
>│ (pull missing entries) + │<── entries ──────────────────────────│ + │ │ + │ (append entries to local tree) │ + │ (recompute STH) │ + │ │ +``` + +### 6.3 Gossip Endpoints + +``` +POST /v1/gossip/sth +``` + +**Request body** (CBOR): Sender's current STH. + +**Response** (CBOR): Receiver's current STH. + +**Auth**: Federation member token with `"gossip"` permission. + +### 6.4 Fork Detection + +If server A receives an STH from server B where: +- `B.tree_size <= A.tree_size` but `B.root_hash != A.root_hash` at the same size + +Then B is presenting a different history. This is a **fork** — a critical security event. +The server should: + +1. Log the fork with both STHs as evidence +2. Alert the operator +3. Continue serving its own tree (do not merge the forked tree) +4. Refuse to gossip further with the forked peer until operator resolution + +### 6.5 Convergence + +Under normal operation (no forks), servers converge to identical trees. The convergence +time depends on gossip interval and network topology. With a 5-minute interval and full +mesh topology among N servers, convergence after a new entry takes at most 5 minutes. + +## 7. Receipts + +### 7.1 Purpose + +A receipt is the federation's proof that a bundle was received and included in the log +at a specific time. It is the critical artifact that closes the timestamp gap: the +offline device's claimed timestamp + the federation receipt = practical proof of timing. 
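As a rough illustration of how the two timestamps combine, the sketch below pairs a record's claimed time with a receipt time and reports the gap between them. The class and field names are hypothetical; only the units (Unix µs) come from this spec, and treating a claimed time later than the receipt as something to flag is an assumption about how an implementation might react:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimingEvidence:
    """Hypothetical pairing of a device timestamp with a federation receipt."""

    claimed_ts: int  # Unix µs, from the offline device's clock (untrusted)
    receipt_ts: int  # Unix µs, from the server-signed receipt

    @property
    def gap_us(self) -> int:
        """Width of the window in which the record could have been created."""
        return self.receipt_ts - self.claimed_ts

    @property
    def needs_review(self) -> bool:
        """True when the device claims a time after the server received it."""
        return self.claimed_ts > self.receipt_ts


evidence = TimingEvidence(claimed_ts=1_775_000_000_000_000,
                          receipt_ts=1_775_000_600_000_000)
print(evidence.gap_us // 1_000_000, "seconds between claim and receipt")
```

The receipt only bounds the record's existence from above; the smaller the gap, the less room there is for a fabricated claimed time.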
+ +### 7.2 Receipt Format + +```cbor +{ + 0: bundle_id, # bytes[16] — from chain summary + 1: bundle_hash, # bytes[32] — leaf hash in server's tree + 2: tree_size, # uint — tree size at inclusion + 3: tree_index, # uint — leaf position + 4: timestamp, # int — Unix µs, server's clock + 5: inclusion_proof, # array of bytes[32] — Merkle path + 6: sth, # map — STH at time of inclusion + 7: server_id, # text — server identifier + 8: server_pubkey, # bytes[32] — Ed25519 public key + 9: receipt_sig, # bytes[64] — Ed25519(cbor(fields 0-8)) +} +``` + +### 7.3 Receipt Verification + +To verify a receipt: + +1. `Ed25519_Verify(server_pubkey, receipt_sig, cbor(fields 0-8))` — receipt is authentic +2. Verify `inclusion_proof` against `sth.root_hash` with `bundle_hash` at `tree_index` +3. Verify `sth.signature` — the STH itself is authentic +4. `sth.tree_size >= tree_size` — STH covers the inclusion +5. `sth.timestamp >= timestamp` — STH is at or after receipt time + +### 7.4 Receipt Lifecycle + +``` +1. Loader submits bundle to federation server +2. Server issues receipt in submit response +3. Loader stores receipt locally (receipts/ directory) +4. Loader exports receipts to USB (CBOR file) +5. Offline device imports receipts +6. Receipt is stored alongside chain records as proof of federation timestamp +``` + +### 7.5 Multi-Server Receipts + +A bundle submitted to N servers produces N independent receipts. Each receipt is from a +different server with a different timestamp and Merkle tree position. Multiple receipts +strengthen the timestamp claim — an adversary would need to compromise all N servers to +suppress evidence. + +## 8. 
Storage Tiers + +Federation servers manage bundle storage across three tiers based on age: + +### 8.1 Hot Tier (0-30 days) + +- **Format**: Individual files, one per bundle +- **Location**: `data/hot/{tree_index}.bundle` +- **Access**: Direct file read, O(1) +- **Purpose**: Fast access for recent entries, active gossip sync + +### 8.2 Warm Tier (30-365 days) + +- **Format**: Zstd-compressed segments, 1000 bundles per segment +- **Location**: `data/warm/segment-{start}-{end}.zst` +- **Access**: Decompress segment, extract entry +- **Compression**: Zstd level 3 (fast compression, moderate ratio) +- **Purpose**: Reduced storage for medium-term retention + +### 8.3 Cold Tier (>1 year) + +- **Format**: Zstd-compressed segments, maximum compression +- **Location**: `data/cold/segment-{start}-{end}.zst` +- **Access**: Decompress segment, extract entry +- **Compression**: Zstd level 19 (slow compression, best ratio) +- **Purpose**: Archival storage, rarely accessed + +### 8.4 Tier Promotion + +A background compaction process runs periodically (default: every 24 hours): + +1. Identify hot entries older than 30 days +2. Group into segments of 1000 +3. Compress and write to warm tier +4. Delete hot files +5. Repeat for warm → cold at 365 days + +### 8.5 Merkle Tree Preservation + +The Merkle tree is independent of storage tiers. Leaf hashes and the tree structure +are maintained in a separate data structure (compact tree format, stored in SQLite or +flat file). Moving bundles between storage tiers does not affect the tree. + +Inclusion proofs and consistency proofs remain valid across tier promotions — they +reference the tree, not the storage location. 
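The age thresholds and path patterns in §8.1 to §8.4 can be sketched as a pair of small helpers. This is an illustration under stated assumptions, not the server's code: the function names are hypothetical, the boundaries are treated as inclusive on the younger side, and segments are assumed to be aligned to multiples of 1000 on `tree_index` with an inclusive `end`:

```python
from datetime import timedelta

HOT_RETENTION = timedelta(days=30)    # matches hot_retention_days default
WARM_RETENTION = timedelta(days=365)  # matches warm_retention_days default
SEGMENT_SIZE = 1000                   # bundles per warm/cold segment


def storage_tier(age: timedelta) -> str:
    """Map a bundle's age to a tier name (boundary handling is assumed)."""
    if age <= HOT_RETENTION:
        return "hot"
    if age <= WARM_RETENTION:
        return "warm"
    return "cold"


def storage_key(tree_index: int, tier: str) -> str:
    """Build a path following the location patterns in §8.1 to §8.3."""
    if tier == "hot":
        return f"data/hot/{tree_index}.bundle"
    # Assumed: segments cover aligned, inclusive index ranges.
    start = (tree_index // SEGMENT_SIZE) * SEGMENT_SIZE
    end = start + SEGMENT_SIZE - 1
    return f"data/{tier}/segment-{start}-{end}.zst"
```

Neither helper touches the Merkle tree, which reflects §8.5: promotion changes where the bytes live, not the leaf hashes or proofs.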
+ +### 8.6 Metadata Database + +SQLite database tracking all bundles: + +```sql +CREATE TABLE bundles ( + tree_index INTEGER PRIMARY KEY, + bundle_id BLOB NOT NULL, -- UUID v7 + bundle_hash BLOB NOT NULL, -- leaf hash + chain_id BLOB NOT NULL, -- source chain ID + signer_pubkey BLOB NOT NULL, -- Ed25519 public key + record_count INTEGER NOT NULL, -- records in bundle + range_start INTEGER NOT NULL, -- first chain index + range_end INTEGER NOT NULL, -- last chain index + receipt_ts INTEGER NOT NULL, -- Unix µs reception time + storage_tier TEXT NOT NULL DEFAULT 'hot', -- 'hot', 'warm', 'cold' + storage_key TEXT NOT NULL, -- file path or segment reference + created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now')) +); + +CREATE INDEX idx_bundles_bundle_id ON bundles(bundle_id); +CREATE INDEX idx_bundles_chain_id ON bundles(chain_id); +CREATE INDEX idx_bundles_bundle_hash ON bundles(bundle_hash); +CREATE INDEX idx_bundles_receipt_ts ON bundles(receipt_ts); +``` + +## 9. Server Configuration + +```json +{ + "server_id": "my-server.example.org", + "host": "0.0.0.0", + "port": 8443, + "data_dir": "/var/lib/soosef-federation", + "identity_key_path": "/etc/soosef-federation/identity/private.pem", + "peers": [ + { + "url": "https://peer1.example.org:8443", + "pubkey_hex": "abc123...", + "name": "Peer One" + } + ], + "gossip_interval_seconds": 300, + "hot_retention_days": 30, + "warm_retention_days": 365, + "compaction_interval_hours": 24, + "max_bundle_size_bytes": 10485760, + "max_entries_per_request": 1000, + "member_tokens": [ + { + "name": "loader-1", + "pubkey_hex": "def456...", + "permissions": ["submit", "entries"] + } + ] +} +``` + +## 10. 
Error Codes + +| HTTP Status | CBOR Error Code | Description | +|---|---|---| +| 400 | `"invalid_bundle"` | Bundle format invalid or signature verification failed | +| 400 | `"invalid_range"` | Requested entry range is invalid | +| 401 | `"unauthorized"` | Missing or invalid auth token | +| 403 | `"forbidden"` | Token lacks required permission | +| 404 | `"not_found"` | Bundle or entry not found | +| 409 | `"duplicate"` | Bundle already in log (returns existing receipt) | +| 413 | `"bundle_too_large"` | Bundle exceeds `max_bundle_size_bytes` | +| 507 | `"storage_full"` | Server cannot accept new entries | + +Error response format: +```cbor +{ + 0: error_code, # text + 1: message, # text — human-readable description + 2: details, # map — optional additional context +} +``` + +## 11. Security Considerations + +### 11.1 Server Compromise + +A compromised server can: +- Read bundle metadata (chain IDs, signer pubkeys, timestamps) — **expected at member tier** +- Withhold entries from gossip — **detectable**: other servers will see inconsistent tree sizes +- Present a forked tree — **detectable**: consistency proofs will fail +- Issue false receipts — **detectable**: receipt's inclusion proof won't verify against other servers' STHs + +A compromised server **cannot**: +- Read attestation content (encrypted with recipient keys) +- Forge attestation signatures (requires Ed25519 private key) +- Modify bundle contents (GCM authentication would fail) +- Reorder or remove entries from other servers' trees + +### 11.2 Transport Security + +All server-to-server and client-to-server communication should use TLS 1.3. The +federation protocol provides its own authentication (Ed25519 signatures on STHs and +receipts), but TLS prevents network-level attacks. + +### 11.3 Clock Reliability + +Federation server clocks should be synchronized via NTP. Receipt timestamps are only as +reliable as the server's clock. 
Deploying servers across multiple time zones and operators +provides cross-checks — wildly divergent receipt timestamps for the same bundle indicate +clock problems or compromise. diff --git a/docs/architecture/federation.md b/docs/architecture/federation.md new file mode 100644 index 0000000..00c1ab1 --- /dev/null +++ b/docs/architecture/federation.md @@ -0,0 +1,254 @@ +# Federated Attestation System — Architecture Overview + +**Status**: Design +**Version**: 0.1.0-draft +**Last updated**: 2026-04-01 + +## 1. Problem Statement + +SooSeF operates offline-first: devices create Ed25519-signed attestations without network +access. This creates two fundamental challenges: + +1. **Timestamp credibility** — An offline device's clock is untrusted. An adversary with + physical access could backdate or postdate attestations. +2. **Distribution** — Attestations trapped on a single device are vulnerable to seizure, + destruction, or loss. They must be replicated to survive. + +The system must solve both problems while preserving the offline-first constraint and +protecting content confidentiality even from the distribution infrastructure. + +## 2. 
Threat Model + +### Adversaries + +| Adversary | Capabilities | Goal | +|---|---|---| +| State actor | Physical device access, network surveillance, legal compulsion of server operators | Suppress or discredit attestations | +| Insider threat | Access to one federation server | Fork the log, selectively omit entries, or read content | +| Device thief | Physical access to offline device | Fabricate or backdate attestations | + +### Security Properties + +| Property | Guarantee | Mechanism | +|---|---|---| +| **Integrity** | Attestations cannot be modified after creation | Ed25519 signatures + hash chain | +| **Ordering** | Attestations cannot be reordered or inserted | Hash chain (each record includes hash of previous) | +| **Existence proof** | Attestation existed at or before time T | Federation receipt (server-signed timestamp) | +| **Confidentiality** | Content is hidden from infrastructure | Envelope encryption; servers store encrypted blobs | +| **Fork detection** | Log tampering is detectable | Merkle tree + consistency proofs (CT model) | +| **Availability** | Attestations survive device loss | Replication across federation servers via gossip | + +### Non-Goals + +- **Proving exact creation time** — Impossible without a trusted time source. We prove + ordering (hash chain) and existence-before (federation receipt). The gap between + claimed time and receipt time is the trust window. +- **Anonymity of attestors** — Federation members can see signer public keys. For anonymity, + use a dedicated identity per context. +- **Preventing denial of service** — Federation servers are assumed cooperative. Byzantine + fault tolerance is out of scope for v1. + +## 3. 
System Architecture + +``` + OFFLINE DEVICE AIR GAP INTERNET + ────────────── ─────── ──────── + + ┌──────────┐ ┌──────────────┐ ┌──────────┐ USB/SD ┌──────────┐ ┌────────────┐ + │ Verisoo │────>│ Hash Chain │────>│ Export │───────────────>│ Loader │────>│ Federation │ + │ Attest │ │ (Layer 1) │ │ Bundle │ │ (App) │ │ Server │ + └──────────┘ └──────────────┘ │ (Layer 2)│ └────┬─────┘ └─────┬──────┘ + └──────────┘ │ │ + │ │ ┌─────────┐ │ + Optional: │<───│ Receipt │<──┘ + DCT embed │ └─────────┘ + in JPEG │ + │ USB carry-back + ┌────v─────┐ │ + │ Stego │ ┌────v─────┐ + │ Image │ │ Offline │ + └──────────┘ │ Device │ + │ (receipt │ + │ stored) │ + └──────────┘ +``` + +### Layer 1: Hash-Chained Attestation Records (Local) + +Each attestation is wrapped in a chain record that includes: +- A hash of the previous record (tamper-evident ordering) +- Entropy witnesses (system uptime, kernel state) that make timestamp fabrication expensive +- An Ed25519 signature over the entire record + +The chain lives on the offline device at `~/.soosef/chain/`. It wraps existing Verisoo +attestation records — the Verisoo record's bytes become the `content_hash` input. + +**See**: [chain-format.md](chain-format.md) + +### Layer 2: Encrypted Export Bundles + +A range of chain records is packaged into a portable bundle: +1. Records serialized as CBOR, compressed with zstd +2. Encrypted with AES-256-GCM using a random data encryption key (DEK) +3. DEK wrapped per-recipient via X25519 ECDH (derived from Ed25519 identities) +4. An unencrypted `chain_summary` (record count, hash range, Merkle root, signature) allows + auditing without decryption + +Bundles can optionally be embedded in JPEG images via stegasoo's DCT steganography, +making them indistinguishable from normal photos on a USB stick. 
+ +**See**: [export-bundle.md](export-bundle.md) + +### Layer 3: Federated Append-Only Log + +Federation servers are "blind notaries" inspired by Certificate Transparency (RFC 6962): +- They receive encrypted bundles, verify the chain summary signature, and append to a + Merkle tree +- They issue signed receipts with a federation timestamp (proof of existence) +- They gossip Signed Tree Heads (STH) with peers to ensure consistency +- They never decrypt content — they operate at the "federation member" permission tier + +**See**: [federation-protocol.md](federation-protocol.md) + +### The Loader (Air-Gap Bridge) + +A separate application that runs on an internet-connected machine: +1. Receives a bundle (from USB) or extracts one from a steganographic image +2. Validates the bundle signature and chain summary +3. Pushes to configured federation servers +4. Collects signed receipts +5. Receipts are carried back to the offline device on the next USB round-trip + +The loader never needs signing keys — bundles are already signed. It is a transport mechanism. + +## 4. Key Domains + +SooSeF maintains strict separation between three cryptographic domains: + +| Domain | Algorithm | Purpose | Key Location | +|---|---|---|---| +| **Signing** | Ed25519 | Attestation signatures, chain records, bundle summaries | `~/.soosef/identity/` | +| **Encryption** | X25519 + AES-256-GCM | Bundle payload encryption (envelope) | Derived from Ed25519 via birational map | +| **Steganography** | AES-256-GCM (from factors) | Stegasoo channel encryption | `~/.soosef/stegasoo/channel.key` | + +The signing and encryption domains share a key lineage (Ed25519 → X25519 derivation) but +serve different purposes. The steganography domain remains fully independent — it protects +the stego carrier, not the attestation content. + +### Private Key Storage Policy + +The Ed25519 private key is stored **unencrypted** on disk (protected by 0o600 file +permissions). 
This is a deliberate design decision: + +- The killswitch (secure deletion) is the primary defense for at-risk users, not key + encryption. A password-protected key would require prompting on every attestation and + chain operation, which is unworkable in field conditions. +- The `password` parameter on `generate_identity()` exists for interoperability but is + not used by default. Chain operations (`_wrap_in_chain`, `backfill`) assume unencrypted keys. +- If the device is seized, the killswitch destroys key material. If the killswitch is not + triggered, the adversary has physical access and can defeat key encryption via cold boot, + memory forensics, or compelled disclosure. + +## 5. Permission Model + +Three tiers control who can see what: + +| Tier | Sees | Can Verify | Typical Actor | +|---|---|---|---| +| **Public auditor** | Chain summaries, Merkle proofs, receipt timestamps | Existence, ordering, no forks | Anyone with server URL | +| **Federation member** | + chain IDs, signer public keys | + who attested, chain continuity | Peer servers, authorized monitors | +| **Authorized recipient** | + decrypted attestation content | Everything | Designated individuals with DEK access | + +Federation servers themselves operate at the **federation member** tier. They can verify +chain integrity and detect forks, but they cannot read attestation content. + +## 6. Timestamp Credibility + +Since the device is offline, claimed timestamps are inherently untrusted. The system +provides layered evidence: + +1. **Hash chain ordering** — Records are provably ordered. Even if timestamps are wrong, + the sequence is authentic. +2. **Entropy witnesses** — Each record includes system uptime, kernel entropy pool state, + and boot ID. Fabricating a convincing set of witnesses for a backdated record requires + simulating the full system state at the claimed time. +3. **Federation receipt** — When the bundle reaches a federation server, the server signs + a receipt with its own clock. 
This proves the chain existed at or before the receipt time. +4. **Cross-device corroboration** — If two independent devices attest overlapping events, + their independent chains corroborate each other's timeline. + +The trust window is: `receipt_timestamp - claimed_timestamp`. A smaller window means +more credible timestamps. Frequent USB sync trips shrink the window. + +## 7. Data Lifecycle + +``` +1. CREATE Offline device: attest file → sign → append to chain +2. EXPORT Offline device: select range → compress → encrypt → bundle (→ optional stego embed) +3. TRANSFER Physical media: USB/SD card carries bundle across air gap +4. LOAD Internet machine: validate bundle → push to federation +5. RECEIPT Federation server: verify → append to Merkle tree → sign receipt +6. RETURN Physical media: receipt carried back to offline device +7. REPLICATE Federation: servers gossip entries and STHs to each other +8. AUDIT Anyone: verify Merkle proofs, check consistency across servers +``` + +## 8. Failure Modes + +| Failure | Impact | Mitigation | +|---|---|---| +| Device lost/seized | Local chain lost | Bundles already sent to federation survive; regular exports reduce data loss window | +| USB intercepted | Adversary gets encrypted bundle (or stego image) | Encryption protects content; stego hides existence | +| Federation server compromised | Adversary reads metadata (chain IDs, pubkeys, timestamps) | Content remains encrypted; other servers detect log fork via consistency proofs | +| All federation servers down | No new receipts | Bundles queue on loader; chain integrity unaffected; retry when servers recover | +| Clock manipulation on device | Timestamps unreliable | Entropy witnesses increase fabrication cost; federation receipt provides external anchor | + +## 9. 
Dependencies + +| Package | Version | Purpose | +|---|---|---| +| `cbor2` | >=5.6.0 | Canonical CBOR serialization (RFC 8949) | +| `uuid-utils` | >=0.9.0 | UUID v7 generation (time-ordered) | +| `zstandard` | >=0.22.0 | Zstd compression for bundles and storage tiers | +| `cryptography` | >=41.0.0 | Ed25519, X25519, AES-256-GCM, HKDF, SHA-256 (already a dependency) | + +## 10. File Layout + +``` +src/soosef/federation/ + __init__.py + models.py Chain record and state dataclasses + serialization.py CBOR canonical encoding + entropy.py System entropy collection + chain.py ChainStore — local append-only chain + merkle.py Merkle tree implementation + export.py Export bundle creation/parsing/decryption + x25519.py Ed25519→X25519 derivation, envelope encryption + stego_bundle.py Steganographic bundle embedding + protocol.py Shared federation types + loader/ + __init__.py + loader.py Air-gap bridge application + config.py Loader configuration + receipt.py Receipt handling + client.py Federation server HTTP client + server/ + __init__.py + app.py Federation server Flask app + tree.py CT-style Merkle tree + storage.py Tiered bundle storage + gossip.py Peer synchronization + permissions.py Access control + config.py Server configuration + +~/.soosef/ + chain/ Local hash chain + chain.bin Append-only record log + state.cbor Chain state checkpoint + exports/ Generated export bundles + loader/ Loader state + config.json Server list and settings + receipts/ Federation receipts + federation/ Federation server data (when running as server) + servers.json Known peer servers +``` diff --git a/docs/evidence-guide.md b/docs/evidence-guide.md new file mode 100644 index 0000000..5f8e9ec --- /dev/null +++ b/docs/evidence-guide.md @@ -0,0 +1,283 @@ +# Evidence Guide + +**Audience**: Journalists, investigators, and legal teams who need to create, export, and +verify evidence packages from SooSeF. + +**Prerequisites**: A running SooSeF instance with at least one attested image or file. 
+Familiarity with basic CLI commands. + +--- + +## Overview + +SooSeF provides three mechanisms for preserving and sharing evidence outside a running +instance: evidence packages (for handing specific files to third parties), cold archives +(full-state preservation for 10+ year horizons), and selective disclosure (proving specific +records without revealing the rest of the chain). All three include standalone verification +scripts that require no SooSeF installation. + +--- + +## Evidence Packages + +An evidence package is a self-contained ZIP that bundles everything needed for independent +verification of specific attested images or files. + +### What is inside an evidence package + +| File | Purpose | +|---|---| +| `images/` | Original image files | +| `manifest.json` | Attestation records, chain data, and image hashes | +| `public_key.pem` | Signer's Ed25519 public key | +| `verify.py` | Standalone verification script | +| `README.txt` | Human-readable instructions | + +### Creating an evidence package + +```bash +# Package specific images with their attestation records +$ soosef evidence export photo1.jpg photo2.jpg --output evidence_package.zip + +# Filter by investigation tag +$ soosef evidence export photo1.jpg --investigation "case-2026-001" \ + --output evidence_case001.zip +``` + +### When to create evidence packages + +- Before handing evidence to a legal team or court +- When sharing with a partner organization that does not run SooSeF +- Before crossing a hostile checkpoint (create the package, send it to a trusted party, + then activate the killswitch if needed) +- When an investigation is complete and files must be archived independently + +### Verifying an evidence package + +The recipient does not need SooSeF. 
They need only Python 3.11+ and the `cryptography` +pip package: + +```bash +$ python3 -m venv verify-env +$ source verify-env/bin/activate +$ pip install cryptography +$ cd evidence_package/ +$ python verify.py +``` + +The verification script checks: + +1. Image SHA-256 hashes match the attestation records in `manifest.json` +2. Chain hash linkage is unbroken (each record's `prev_hash` matches the previous record) +3. Ed25519 signatures are valid (if `public_key.pem` is included) + +--- + +## Cold Archives + +A cold archive is a full snapshot of the entire SooSeF evidence store, designed for +long-term preservation aligned with OAIS (ISO 14721). It is self-describing and includes +everything needed to verify the evidence decades later, even if SooSeF no longer exists. + +### What is inside a cold archive + +| File | Purpose | +|---|---| +| `chain/chain.bin` | Raw append-only hash chain binary | +| `chain/state.cbor` | Chain state checkpoint | +| `chain/anchors/` | External timestamp anchor files (RFC 3161 tokens, manual anchors) | +| `attestations/log.bin` | Full verisoo attestation log | +| `attestations/index/` | LMDB index files | +| `keys/public.pem` | Signer's Ed25519 public key | +| `keys/bundle.enc` | Encrypted key bundle (optional, password-protected) | +| `keys/trusted/` | Trusted collaborator public keys | +| `manifest.json` | Archive metadata and SHA-256 integrity hashes | +| `verify.py` | Standalone verification script | +| `ALGORITHMS.txt` | Cryptographic algorithm documentation | +| `README.txt` | Human-readable description | + +### Creating a cold archive + +```bash +# Full archive without encrypted key bundle +$ soosef archive export --output archive_20260401.zip + +# Include encrypted key bundle (will prompt for passphrase) +$ soosef archive export --include-keys --output archive_20260401.zip +``` + +> **Warning:** If you include the encrypted key bundle, store the passphrase separately +> from the archive media. 
Write it on paper and keep it in a different physical location. + +### When to create cold archives + +- At regular intervals (weekly or monthly) as part of your backup strategy +- Before key rotation (locks the existing chain state in the archive) +- Before traveling with the device +- Before anticipated risk events (raids, border crossings, legal proceedings) +- When archiving a completed investigation + +### Restoring from a cold archive + +On a fresh SooSeF instance: + +```bash +$ soosef init +$ soosef archive import archive_20260401.zip +``` + +### Long-term archival best practices + +1. Store archives on at least two separate physical media (USB drives, optical discs) +2. Keep one copy offsite (safe deposit box, trusted third party in a different jurisdiction) +3. Include the encrypted key bundle in the archive with a strong passphrase +4. Periodically verify archive integrity: unzip and run `python verify.py` +5. The `ALGORITHMS.txt` file documents every algorithm and parameter used, so a verifier + can be written from scratch even if SooSeF no longer exists + +### The ALGORITHMS.txt file + +This file documents every cryptographic algorithm, parameter, and format used: + +- **Signing**: Ed25519 (RFC 8032) -- 32-byte public keys, 64-byte signatures +- **Hashing**: SHA-256 for content and chain linkage; pHash and dHash for perceptual image matching +- **Encryption (key bundle)**: AES-256-GCM with Argon2id key derivation (time_cost=4, memory_cost=256MB, parallelism=4) +- **Chain format**: Append-only binary log with uint32 BE length prefixes and CBOR (RFC 8949) records +- **Attestation log**: Verisoo binary log format + +--- + +## Selective Disclosure + +Selective disclosure produces a verifiable proof for specific chain records while keeping +others redacted. Designed for legal discovery, court orders, and FOIA responses. + +### How it works + +Selected records are included in full (content hash, content type, signature, metadata, +signer public key). 
Non-selected records appear only as their record hash and chain index. +The complete hash chain is included so a third party can verify that the disclosed records +are part of an unbroken chain without seeing the contents of other records. + +### Creating a selective disclosure + +```bash +# Disclose records at chain indices 5, 12, and 47 +$ soosef chain disclose --indices 5,12,47 --output disclosure.json +``` + +### Disclosure output format + +```json +{ + "proof_version": "1", + "chain_state": { + "chain_id": "a1b2c3...", + "head_index": 100, + "record_count": 101 + }, + "selected_records": [ + { + "chain_index": 5, + "content_hash": "...", + "content_type": "verisoo/attestation-v1", + "prev_hash": "...", + "record_hash": "...", + "signer_pubkey": "...", + "signature": "...", + "claimed_ts": 1711900000000000, + "metadata": {} + } + ], + "redacted_count": 98, + "hash_chain": [ + {"chain_index": 0, "record_hash": "...", "prev_hash": "..."}, + {"chain_index": 1, "record_hash": "...", "prev_hash": "..."} + ] +} +``` + +### Verifying a selective disclosure + +A third party can verify: + +1. **Chain linkage**: each entry in `hash_chain` has a `prev_hash` that matches the + `record_hash` of the previous entry +2. **Selected record integrity**: each selected record's `record_hash` matches its position + in the hash chain +3. **Signature validity**: each selected record's Ed25519 signature is valid for its + canonical byte representation + +### When to use selective disclosure vs. evidence packages + +| Need | Use | +|---|---| +| Hand specific images to a lawyer | Evidence package | +| Respond to a court order for specific records | Selective disclosure | +| Full-state backup for long-term preservation | Cold archive | +| Share attestations with a partner organization | Federation bundle | + +--- + +## Chain Anchoring for Evidence + +External timestamp anchoring strengthens evidence by proving that the chain existed before +a given time. 
A single anchor for the chain head implicitly timestamps every record that +preceded it, because the chain is append-only with hash linkage. + +### RFC 3161 automated anchoring + +If the device has internet access (even temporarily): + +```bash +$ soosef chain anchor --tsa https://freetsa.org/tsr +``` + +This sends the chain head digest to a Timestamping Authority, receives a signed timestamp +token, and saves both as a JSON file under `~/.soosef/chain/anchors/`. The TSA token is a +cryptographically signed proof from a third party that the hash existed at the stated time. +This is legally stronger than a self-asserted timestamp. + +### Manual anchoring (airgapped) + +Without `--tsa`: + +```bash +$ soosef chain anchor +``` + +This prints a compact text block. Publish it to any external witness: + +- Tweet or public social media post (timestamped by the platform) +- Email to a trusted third party (timestamped by the mail server) +- Newspaper classified advertisement +- Bitcoin OP_RETURN transaction +- Notarized document + +### Anchoring strategy for legal proceedings + +1. Anchor the chain before disclosing evidence to any party +2. Anchor at regular intervals (daily or weekly) to establish a timeline +3. Anchor before and after major events in an investigation +4. Anchor before key rotation +5. Save anchor files alongside key backups on separate physical media + +--- + +## Legal Discovery Workflow + +For responding to a court order, subpoena, or legal discovery request: + +1. **Selective disclosure** (`soosef chain disclose`) when the request specifies particular + records and you must not reveal the full chain +2. **Evidence package** when the request requires original images with verification + capability +3. **Cold archive** when full preservation is required (e.g., an entire investigation) + +All three formats include standalone verification scripts so the receiving party does not +need SooSeF installed. 
The verification scripts require only Python 3.11+ and the +`cryptography` pip package. + +> **Note:** Consult with legal counsel before producing evidence from SooSeF. The selective +> disclosure mechanism is designed to support legal privilege and proportionality, but its +> application depends on your jurisdiction and the specific legal context. diff --git a/docs/federation.md b/docs/federation.md new file mode 100644 index 0000000..f45895c --- /dev/null +++ b/docs/federation.md @@ -0,0 +1,317 @@ +# Federation Guide + +**Audience**: System administrators and technical leads setting up cross-organization +attestation sync between SooSeF instances. + +**Prerequisites**: A running SooSeF instance (Tier 2 org server or Tier 3 relay), familiarity +with the CLI, and trusted public keys from partner organizations. + +--- + +## Overview + +SooSeF federation synchronizes attestation records between organizations using a gossip +protocol. Nodes periodically exchange Merkle roots, detect divergence, and fetch missing +records. The system is eventually consistent with no central coordinator, no leader +election, and no consensus protocol -- just append-only logs that converge. + +Federation operates at two levels: + +1. **Offline bundles** -- JSON export/import via sneakernet (USB drive). Works on all tiers + including fully airgapped Tier 1 field devices. +2. **Live gossip** -- HTTP-based periodic sync between Tier 2 org servers and Tier 3 + federation relays. Requires the `federation` extra (`pip install soosef[federation]`). + +> **Warning:** Federation shares attestation records (image hashes, Ed25519 signatures, +> timestamps, and signer public keys). It never shares encryption keys, plaintext messages, +> original images, or steganographic payloads. 
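
The exchange summarized in this overview (compare roots, detect divergence, fetch what is missing) can be sketched with two in-memory logs. This is a toy model under stated assumptions: `Node` and `pull` are hypothetical names, the "root" is a plain running hash rather than a real Merkle root, and the prefix comparison stands in for a Merkle consistency proof. The Gossip Protocol section below describes the actual endpoints.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    log: list[bytes] = field(default_factory=list)

    def root(self) -> str:
        # Stand-in for a Merkle root: a running hash over all record hashes
        h = hashlib.sha256()
        for record in self.log:
            h.update(hashlib.sha256(record).digest())
        return h.hexdigest()

    def status(self) -> dict:
        return {"root": self.root(), "size": len(self.log)}

    def records(self, start: int, count: int) -> list[bytes]:
        return self.log[start:start + count]


def pull(local: Node, remote: Node, batch: int = 50) -> int:
    """One direction of a sync round; returns the number of records fetched."""
    mine, theirs = local.status(), remote.status()
    if theirs["root"] == mine["root"] or theirs["size"] <= mine["size"]:
        return 0  # already converged, or the remote has nothing new
    # Stand-in for a consistency proof: the remote log must extend ours
    if remote.records(0, mine["size"]) != local.log:
        raise ValueError(f"possible fork between {local.name} and {remote.name}")
    fetched = 0
    while len(local.log) < theirs["size"]:
        chunk = remote.records(len(local.log), batch)
        if not chunk:
            break
        local.log.extend(chunk)
        fetched += len(chunk)
    return fetched


org_a = Node("org-a", [b"r1", b"r2"])
org_b = Node("org-b", [b"r1", b"r2", b"r3", b"r4"])
fetched = pull(org_a, org_b)  # org-a catches up with org-b
```

Running `pull` in both directions on every round gives the bidirectional behavior described for live gossip; a raised `ValueError` corresponds to the fork case that an operator would investigate by hand.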
+ +--- + +## Architecture + +``` +Tier 1: Field Device Tier 2: Org Server A Tier 3: Relay (Iceland) +(Bootable USB) (Docker / mini PC) (VPS, zero key knowledge) + | | + USB sneakernet ------> Port 5000 (Web UI) | + Port 8000 (Federation API) <-----> Port 8000 + | | + Tier 2: Org Server B <------------> + (Docker / mini PC) +``` + +Federation traffic flows: + +- **Tier 1 to Tier 2**: USB sneakernet (offline bundles only) +- **Tier 2 to Tier 3**: gossip API over HTTPS (port 8000) +- **Tier 2 to Tier 2**: through a Tier 3 relay, or directly via sneakernet +- **Tier 3 to Tier 3**: gossip between relays in different jurisdictions + +--- + +## Gossip Protocol + +### How sync works + +1. Node A sends its Merkle root and log size to Node B via `GET /federation/status` +2. If roots differ and B has more records, A requests a consistency proof via + `GET /federation/consistency-proof?old_size=N` +3. If the proof verifies (B's log is a valid extension of A's), A fetches the missing + records via `GET /federation/records?start=N&count=50` +4. A appends the new records to its local log +5. B performs the same process in reverse (bidirectional sync) + +Records are capped at 100 per request to protect memory on resource-constrained devices. + +### Peer health tracking + +Each peer tracks: + +- `last_seen` -- timestamp of last successful contact +- `last_root` -- most recent Merkle root received from the peer +- `last_size` -- most recent log size +- `healthy` -- marked `false` after 3 consecutive failures +- `consecutive_failures` -- reset to 0 on success + +Unhealthy peers are skipped during gossip rounds but remain registered. They are retried +on the next full gossip round. Peer state persists in SQLite at +`~/.soosef/attestations/federation/peers.db`. + +### Gossip interval + +The default gossip interval is 60 seconds, configurable via the `VERISOO_GOSSIP_INTERVAL` +environment variable. 
In Docker Compose, set it in the environment section: + +```yaml +environment: + - VERISOO_GOSSIP_INTERVAL=60 +``` + +Lower intervals mean faster convergence but more network traffic. + +--- + +## Setting Up Federation + +### Step 1: Exchange trust keys + +Before two organizations can federate, they must trust each other's Ed25519 identity keys. +Always verify fingerprints out-of-band (in person or over a known-secure voice channel). + +On Organization A: + +```bash +$ cp ~/.soosef/identity/public.pem /media/usb/org-a-pubkey.pem +``` + +On Organization B: + +```bash +$ soosef keys trust --import /media/usb/org-a-pubkey.pem +``` + +Repeat in both directions so each organization trusts the other. + +> **Warning:** Do not skip fingerprint verification. If an adversary substitutes their +> own public key, they can forge attestation records that your instance will accept as +> trusted. + +### Step 2: Register peers (live gossip) + +Through the web UI at `/federation`, or via the peer store directly: + +```bash +# On Org A's server, register Org B's federation endpoint +$ soosef federation peer add \ + --url https://orgb.example.org:8000 \ + --fingerprint a1b2c3d4e5f6... +``` + +Or through the web UI: + +1. Navigate to `/federation` +2. Enter the peer's federation API URL and Ed25519 fingerprint +3. Click "Add Peer" + +### Step 3: Start the gossip loop + +The gossip loop starts automatically when the server starts. On Docker deployments, the +federation API runs on port 8000. Ensure this port is accessible between peers (firewall, +security groups, etc.). 
+ +For manual one-time sync: + +```bash +$ soosef federation sync --peer https://orgb.example.org:8000 +``` + +### Step 4: Monitor sync status + +The web UI at `/federation` shows: + +- Local node status (Merkle root, log size, record count) +- Registered peers with health indicators +- Recent sync history (records received, errors) + +--- + +## Offline Federation (Sneakernet) + +For Tier 1 field devices and airgapped environments, use offline bundles. + +### Exporting a bundle + +```bash +$ soosef chain export --output /media/usb/bundle.zip +``` + +To export only records from a specific investigation: + +```bash +$ soosef chain export --investigation "case-2026-001" --output /media/usb/bundle.zip +``` + +To export a specific index range: + +```bash +$ soosef chain export --start 100 --end 200 --output /media/usb/partial.zip +``` + +### Importing a bundle + +On the receiving instance: + +```bash +$ soosef chain import /media/usb/bundle.zip +``` + +During import: + +- Records signed by untrusted fingerprints are rejected +- Duplicate records (matching SHA-256) are skipped +- Imported records are tagged with `federated_from` metadata +- A delivery acknowledgment record (`soosef/delivery-ack-v1`) is automatically appended + to the local chain + +### Delivery acknowledgments + +When a bundle is imported, SooSeF signs a `soosef/delivery-ack-v1` chain record that +contains: + +- The SHA-256 of the imported bundle file +- The sender's fingerprint +- The count of records received + +This acknowledgment can be exported back to the sending organization as proof that the +bundle was delivered and ingested. It creates a two-way federation handshake. + +```bash +# On receiving org: export the acknowledgment back +$ soosef chain export --start --end \ + --output /media/usb/delivery-ack.zip +``` + +--- + +## Federation API Endpoints + +The federation API is served by FastAPI/uvicorn on port 8000. 
+ +| Method | Endpoint | Description | +|---|---|---| +| `GET` | `/federation/status` | Current Merkle root and log size | +| `GET` | `/federation/records?start=N&count=M` | Fetch attestation records (max 100) | +| `GET` | `/federation/consistency-proof?old_size=N` | Merkle consistency proof | +| `POST` | `/federation/records` | Push records to this node | +| `GET` | `/health` | Health check | + +### Trust filtering on push + +When records are pushed via `POST /federation/records`, the receiving node checks each +record's `attestor_fingerprint` against its trust store. Records from unknown attestors +are rejected. If no trust store is configured (empty trusted keys), all records are +accepted (trust-on-first-use). + +--- + +## Federation Relay (Tier 3) + +The federation relay is a minimal Docker container that runs only the federation API. + +### What the relay stores + +- Attestation records: image SHA-256 hashes, perceptual hashes, Ed25519 signatures +- Chain linkage data: prev_hash, chain_index, claimed_ts +- Signer public keys + +### What the relay never sees + +- AES-256-GCM channel keys or Ed25519 private keys +- Original images or media files +- Steganographic payloads or plaintext messages +- User credentials or session data +- Web UI content + +### Deploying a relay + +```bash +$ cd deploy/docker +$ docker compose up relay -d +``` + +The relay listens on port 8001 (mapped to internal 8000). See `docs/deployment.md` +Section 3 for full deployment details. + +### Jurisdiction considerations + +Deploy relays in jurisdictions with strong press freedom protections: + +- **Iceland** -- strong source protection laws, no mandatory data retention for this type of data +- **Switzerland** -- strict privacy laws, resistance to foreign legal requests +- **Netherlands** -- strong press freedom, EU GDPR protections + +Consult with a press freedom lawyer for your specific situation. 
+ +--- + +## Troubleshooting + +**Peer marked unhealthy** + +After 3 consecutive sync failures, a peer is marked unhealthy and skipped. Check: + +1. Is the peer's federation API reachable? `curl https://peer.example.org:8000/health` +2. Is TLS configured correctly? The peer's API must be accessible over HTTPS in production. +3. Are firewall rules open for port 8000? +4. The peer will be retried on subsequent gossip rounds. Once a sync succeeds, the peer + is marked healthy again. + +**Records rejected on import** + +Records are rejected if the signer's fingerprint is not in the local trust store. Import +the sender's public key first: + +```bash +$ soosef keys trust --import /path/to/sender-pubkey.pem +``` + +**Consistency proof failure** + +A consistency proof failure means the peer's log is not a valid extension of the local log. +This indicates a potential fork -- the peer may have a different chain history. Investigate +before proceeding: + +1. Compare chain heads: `soosef chain status` on both instances +2. If a fork is confirmed, one instance's records must be exported and re-imported into a + fresh chain + +**Gossip not starting** + +The gossip loop requires the `federation` extra: + +```bash +$ pip install "soosef[federation]" +``` + +This installs `aiohttp` for async HTTP communication. diff --git a/docs/index.md b/docs/index.md new file mode 100644 index 0000000..90b7968 --- /dev/null +++ b/docs/index.md @@ -0,0 +1,34 @@ +# SooSeF Documentation + +## For Reporters and Field Users + +| Document | Description | +|---|---| +| [Reporter Quick-Start](training/reporter-quickstart.md) | One-page card for Tier 1 USB users. Print, laminate, keep with the USB. | +| [Reporter Field Guide](training/reporter-field-guide.md) | Comprehensive guide: attesting photos, steganography, killswitch, backups, evidence packages. | +| [Emergency Card](training/emergency-card.md) | Wallet-sized reference for emergency data destruction. Print and laminate. 
| + +## For Administrators + +| Document | Description | +|---|---| +| [Admin Quick Reference](training/admin-reference.md) | CLI cheat sheet, hardening checklist, troubleshooting table. | +| [Admin Operations Guide](training/admin-operations-guide.md) | Full procedures: user management, drop box, federation, key rotation, incident response. | +| [Deployment Guide](deployment.md) | Three-tier deployment: bootable USB, Docker org server, Kubernetes federation relay. Threat level presets, security hardening, systemd setup. | + +## Feature Guides + +| Document | Description | +|---|---| +| [Federation Guide](federation.md) | Gossip protocol setup, offline bundles, peer management, relay deployment. | +| [Evidence Guide](evidence-guide.md) | Evidence packages, cold archives, selective disclosure, chain anchoring, legal discovery workflow. | +| [Source Drop Box](source-dropbox.md) | Anonymous file intake: tokens, EXIF pipeline, receipt codes, operational security. | + +## Architecture (Developer Reference) + +| Document | Description | +|---|---| +| [Federation Architecture](architecture/federation.md) | System design: threat model, layers (chain, bundles, federation), key domains, permission tiers. | +| [Chain Format Spec](architecture/chain-format.md) | CBOR record format, entropy witnesses, serialization, storage format, content types. | +| [Export Bundle Spec](architecture/export-bundle.md) | SOOSEFX1 binary format, envelope encryption (X25519 + AES-256-GCM), Merkle trees. | +| [Federation Protocol Spec](architecture/federation-protocol.md) | CT-inspired server protocol: API endpoints, gossip, storage tiers, receipts, security model. | diff --git a/docs/source-dropbox.md b/docs/source-dropbox.md new file mode 100644 index 0000000..65fc033 --- /dev/null +++ b/docs/source-dropbox.md @@ -0,0 +1,220 @@ +# Source Drop Box Setup Guide + +**Audience**: Administrators setting up SooSeF's anonymous source intake feature. 
+ +**Prerequisites**: A running SooSeF instance with web UI enabled (`soosef[web]` extra), +an admin account, and HTTPS configured (self-signed is acceptable). + +--- + +## Overview + +The source drop box is a SecureDrop-style anonymous file intake built into the SooSeF web +UI. Admins create time-limited upload tokens, sources open the token URL in a browser and +submit files without creating an account. Files are processed through the extract-then-strip +EXIF pipeline and automatically attested on receipt. Sources receive HMAC-derived receipt +codes that prove delivery. + +> **Warning:** The drop box protects source identity through design -- no accounts, no +> branding, no IP logging. However, the security of the system depends on how the upload URL +> is shared. Never send drop box URLs over unencrypted email or SMS. + +--- + +## How It Works + +``` +Admin Source SooSeF Server + | | | + |-- Create token ------------->| | + | (label, expiry, max_files) | | + | | | + |-- Share URL (secure channel) | | + | | | + | |-- Open URL in browser --------->| + | | (no login required) | + | | | + | |-- Select files | + | | Browser computes SHA-256 | + | | (SubtleCrypto, client-side) | + | | | + | |-- Upload files ---------------->| + | | |-- Extract EXIF + | | |-- Strip metadata + | | |-- Attest originals + | | |-- Save stripped copy + | | | + | |<-- Receipt codes ---------------| + | | (HMAC of file hash + token) | +``` + +--- + +## Setting Up the Drop Box + +### Step 1: Ensure HTTPS is enabled + +The drop box should always be served over HTTPS. Sources must be able to trust that their +connection is not being intercepted. + +```bash +$ soosef serve --host 0.0.0.0 +``` + +SooSeF auto-generates a self-signed certificate on first HTTPS start. For production use, +place a reverse proxy with a proper TLS certificate in front of SooSeF. + +### Step 2: Create an upload token + +Navigate to `/dropbox/admin` in the web UI (admin login required), or use the admin panel. 
+ +Each token has: + +| Field | Default | Description | +|---|---|---| +| **Label** | "Unnamed source" | Human-readable name for the source (stored server-side only, never shown to the source) | +| **Expiry** | 24 hours | How long the upload link remains valid | +| **Max files** | 10 | Maximum number of uploads allowed on this link | + +After creating the token, the admin receives a URL of the form: + +``` +https://:/dropbox/upload/ +``` + +The token is a 32-byte cryptographically random URL-safe string. + +### Step 3: Share the URL with the source + +Share the upload URL over an already-secure channel: + +- **Best**: in person, on paper +- **Good**: encrypted messaging (Signal, Wire) +- **Acceptable**: verbal dictation over a secure voice call +- **Never**: unencrypted email, SMS, or any channel that could be intercepted + +### Step 4: Source uploads files + +The source opens the URL in their browser. The upload page is minimal -- no SooSeF branding, +no identifying marks, generic styling. The page works over Tor Browser with JavaScript +enabled (no external resources, no CDN, no fonts, no analytics). + +When files are selected: + +1. The browser computes SHA-256 fingerprints client-side using SubtleCrypto +2. The source sees the fingerprints and is prompted to save them before uploading +3. On upload, the server processes each file through the extract-then-strip pipeline +4. The source receives receipt codes for each file + +### Step 5: Monitor submissions + +The admin panel at `/dropbox/admin` shows: + +- Active tokens with their usage counts +- Token expiry times +- Ability to revoke tokens immediately + +--- + +## The Extract-Then-Strip Pipeline + +Every file uploaded through the drop box is processed through SooSeF's EXIF pipeline: + +1. **Extract**: all EXIF metadata is read from the original image bytes +2. 
**Classify**: fields are split into evidentiary (GPS coordinates, capture timestamp -- + valuable for provenance) and dangerous (device serial number, firmware version -- could + identify the source's device) +3. **Attest**: the original bytes are attested (Ed25519 signed) with evidentiary metadata + included in the attestation record. The attestation hash matches what the source actually + submitted. +4. **Strip**: all metadata is removed from the stored copy. The stripped copy is saved to + disk. No device fingerprint persists on the server's storage. + +This resolves the tension between protecting the source (strip device-identifying metadata) +and preserving evidence (retain GPS and timestamp for provenance). + +--- + +## Receipt Codes + +Each uploaded file generates an HMAC-derived receipt code: + +``` +receipt_code = HMAC-SHA256(token, file_sha256)[:16] +``` + +The receipt code proves: + +- The server received the specific file (tied to the file's SHA-256) +- The file was received under the specific token (tied to the token value) + +Sources can verify their receipt by posting it to `/dropbox/verify-receipt`. This returns +the filename, SHA-256, and reception timestamp if the receipt is valid. + +> **Note:** Receipt codes are deterministic. The source can compute the expected receipt +> themselves if they know the token value and the file's SHA-256 hash, providing +> independent verification. + +--- + +## Client-Side SHA-256 + +The upload page computes SHA-256 fingerprints in the browser before upload using the +SubtleCrypto Web API. This gives the source a verifiable record of exactly what they +submitted -- the hash is computed on their device, not the server. + +The source should save these fingerprints before uploading. If the server later claims to +have received different content, the source can prove what they actually submitted by +comparing their locally computed hash with the server's receipt. 
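The receipt formula and the client-side fingerprint described above can be checked independently of the server. The following is a minimal sketch, not SooSeF's implementation: it assumes the token string (UTF-8) is the HMAC key, the hex SHA-256 digest is the message, and the receipt is the first 16 hex characters of the result. The guide does not specify these encodings, and the function names are hypothetical.

```python
import hashlib
import hmac


def file_sha256(data: bytes) -> str:
    """Hex SHA-256 of the file bytes, comparable to the fingerprint shown in the browser."""
    return hashlib.sha256(data).hexdigest()


def expected_receipt(token: str, sha256_hex: str) -> str:
    # Assumption: token is the HMAC key, the hex digest is the message,
    # and the receipt is the first 16 hex characters of HMAC-SHA256.
    mac = hmac.new(token.encode("utf-8"), sha256_hex.encode("ascii"), hashlib.sha256)
    return mac.hexdigest()[:16]


def receipt_matches(token: str, data: bytes, receipt: str) -> bool:
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected_receipt(token, file_sha256(data)), receipt)
```

A source who kept the token value and the original file could call `receipt_matches(token, file_bytes, receipt)` offline. If the real encoding differs from the assumptions above, the comparison will fail even for a valid receipt, so treat a mismatch as a prompt to check the encoding first.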
+ +--- + +## Storage + +| What | Where | +|---|---| +| Uploaded files (stripped) | `~/.soosef/temp/dropbox/` (mode 0700) | +| Token metadata | `~/.soosef/auth/dropbox.db` (SQLite) | +| Receipt codes | `~/.soosef/auth/dropbox.db` (SQLite) | +| Attestation records | `~/.soosef/attestations/` (standard attestation log) | + +Expired tokens are cleaned up automatically on every admin page load. + +--- + +## Operational Security + +### Source safety + +- **No SooSeF branding** on the upload page. Generic "Secure File Upload" title. +- **No authentication required** -- sources never create accounts or reveal identity. +- **No IP logging** -- SooSeF does not log source IP addresses. Ensure your reverse proxy + (if any) also does not log access requests to `/dropbox/upload/` paths. +- **Self-contained page** -- inline CSS and JavaScript only. No external resources, CDN + calls, web fonts, or analytics. Works with Tor Browser. +- **CSRF exempt** -- the upload endpoint does not require CSRF tokens because sources do + not have sessions. + +### Token management + +- **Short expiry** -- set token expiry as short as practical. 24 hours is the default; for + high-risk sources, consider 1-4 hours. +- **Low file limits** -- set `max_files` to the expected number of submissions. + Once reached, the link stops accepting uploads. +- **Revoke immediately** -- if a token is compromised or no longer needed, revoke it from + the admin panel. This deletes the token and all associated receipt records from SQLite. +- **Audit trail** -- token creation events are logged to `~/.soosef/audit.jsonl` with the + action `dropbox.token_created`. + +### Running as a Tor hidden service + +For maximum source protection, run SooSeF as a Tor hidden service: + +1. Install Tor on the server +2. Configure a hidden service in `torrc` pointing to `127.0.0.1:5000` +3. Share the `.onion` URL instead of a LAN address +4. 
The source's real IP is never visible to the server + +> **Warning:** Even with Tor, timing analysis and traffic correlation attacks are possible +> at the network level. The drop box protects source identity at the application layer; +> network-layer protection requires operational discipline beyond what software can provide. diff --git a/docs/training/admin-operations-guide.md b/docs/training/admin-operations-guide.md new file mode 100644 index 0000000..8064a7b --- /dev/null +++ b/docs/training/admin-operations-guide.md @@ -0,0 +1,513 @@ +# SooSeF Admin Operations Guide + +**Audience**: IT administrators, system operators, and technically competent journalists +responsible for deploying, configuring, and maintaining SooSeF instances for their +organization. + +**Prerequisites**: Familiarity with Linux command line, Docker basics, and SSH. For Tier 1 +USB builds, familiarity with Debian `live-build`. + +--- + +## Overview + +This guide covers the operational tasks an admin performs after initial deployment. For +installation and deployment, see [deployment.md](../deployment.md). For architecture +details, see [docs/architecture/](../architecture/). + +Your responsibilities as a SooSeF admin: + +1. Deploy and maintain SooSeF instances (Tier 1 USB, Tier 2 server, Tier 3 relay) +2. Manage user accounts and access +3. Configure threat level presets for your environment +4. Manage the source drop box +5. Set up and maintain federation between organizations +6. Monitor system health and perform backups +7. Respond to security incidents + +--- + +## 1. User Management + +### Creating User Accounts + +On first start, the web UI prompts for the first admin account. Additional users are +created through the **Admin** panel at `/admin/`. 
+ +Each user has: +- A username and password (stored as Argon2id hashes in SQLite) +- An admin flag (admin users can manage other accounts and the drop box) + +### Password Resets + +From the admin panel, issue a temporary password for a locked-out user. The user should +change it on next login. All password resets are recorded in the audit log +(`~/.soosef/audit.jsonl`). + +### Account Lockout + +After `login_lockout_attempts` (default: 5) failed logins, the account is locked for +`login_lockout_minutes` (default: 15). Lockout state is in-memory and clears on server +restart. + +For persistent lockout (e.g., a compromised account), delete the user from the admin panel. + +### Audit Trail + +All admin actions are logged to `~/.soosef/audit.jsonl` in JSON-lines format: + +```json +{"timestamp": "2026-04-01T12:00:00+00:00", "actor": "admin", "action": "user.create", "target": "user:reporter1", "outcome": "success", "source": "web"} +``` + +Actions logged include: `user.create`, `user.delete`, `user.password_reset`, +`key.channel.generate`, `key.identity.generate`, `killswitch.fire`, `dropbox.token_created` + +> **Warning**: The audit log is destroyed by the killswitch. This is intentional -- +> in a field compromise, data destruction takes precedence over audit preservation. + +--- + +## 2. Threat Level Configuration + +SooSeF ships four presets at `deploy/config-presets/`. Select based on your operational +environment. + +### Applying a Preset + +```bash +$ cp deploy/config-presets/high-threat.json ~/.soosef/config.json +``` + +Restart the server to apply. 
+ +### Preset Summary + +| Preset | Session Timeout | Killswitch | Dead Man's Switch | USB Monitor | Cover Name | +|---|---|---|---|---|---| +| **Low** (press freedom) | 30 min | Off | Off | Off | None | +| **Medium** (restricted press) | 15 min | On | 48h / 4h grace | On | "Office Document Manager" | +| **High** (conflict zone) | 5 min | On | 12h / 1h grace | On | "Local Inventory Tracker" | +| **Critical** (targeted surveillance) | 3 min | On | 6h / 1h grace | On | "System Statistics" | + +### Custom Configuration + +Edit `~/.soosef/config.json` directly. All fields have defaults. Key fields for security: + +| Field | What It Controls | +|---|---| +| `host` | Bind address. `127.0.0.1` = local only; `0.0.0.0` = LAN access | +| `session_timeout_minutes` | How long before idle sessions expire | +| `killswitch_enabled` | Whether the software killswitch is available | +| `deadman_enabled` | Whether the dead man's switch is active | +| `deadman_interval_hours` | Hours between required check-ins | +| `deadman_grace_hours` | Grace period after missed check-in before auto-purge | +| `deadman_warning_webhook` | URL to POST a JSON warning during grace period | +| `cover_name` | CN for the self-signed TLS certificate (cover/duress mode) | +| `backup_reminder_days` | Days before `soosef status` warns about overdue backups | + +> **Warning**: Setting `auth_enabled: false` disables all login requirements. Never +> do this on a network-accessible instance. + +--- + +## 3. Source Drop Box Operations + +The drop box provides SecureDrop-style anonymous file intake. + +### Creating Upload Tokens + +1. Go to `/dropbox/admin` in the web UI (admin account required) +2. Set a **label** (internal only -- the source never sees this) +3. Set **expiry** in hours (default: 24) +4. Set **max files** (default: 10) +5. Click **Create Token** + +You receive a URL like `https://:/dropbox/upload/`. 
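The token fields above (label, expiry, max files) can be pictured with a small model. This is a sketch under stated assumptions, not SooSeF's code: the `UploadToken` class and its field names are hypothetical, and the only detail taken from the documentation is that the token is a 32-byte cryptographically random URL-safe string with a 24-hour default expiry and a 10-file default limit.

```python
import secrets
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class UploadToken:
    # Hypothetical record; the real schema in dropbox.db may differ.
    label: str = "Unnamed source"
    expiry_hours: int = 24
    max_files: int = 10
    used: int = 0
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    # 32 random bytes encoded as URL-safe base64 without padding (43 characters).
    value: str = field(default_factory=lambda: secrets.token_urlsafe(32))

    @property
    def expires_at(self) -> datetime:
        return self.created_at + timedelta(hours=self.expiry_hours)

    def accepts_upload(self, now: Optional[datetime] = None) -> bool:
        """True while the link is unexpired and below its file limit."""
        now = now or datetime.now(timezone.utc)
        return now < self.expires_at and self.used < self.max_files
```

For a high-risk source, something like `UploadToken(label="source-a", expiry_hours=2, max_files=3)` reflects the short-expiry, low-limit advice: the link stops working at whichever limit is reached first.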
+ +### Sharing URLs With Sources + +Share the URL over an already-secure channel only: + +- **Best**: Hand-written on paper, in person +- **Good**: Signal, Wire, or other end-to-end encrypted messenger +- **Acceptable**: Encrypted email (PGP/GPG) +- **Never**: Unencrypted email, SMS, or any channel you do not control + +### What Happens When a Source Uploads + +1. The source opens the URL in any browser (no account needed, no SooSeF branding) +2. Their browser computes SHA-256 hashes client-side before upload (SubtleCrypto) +3. Files are uploaded and processed in this order: + - EXIF metadata is extracted (evidentiary fields: GPS, timestamp) + - The original bytes are attested (signed) before anything is stripped + - All metadata is stripped from the stored copy (protects source device info) +4. The source receives a receipt code (HMAC of file hash + token) +5. Files are stored in `~/.soosef/temp/dropbox/` with mode 0700 + +### Revoking Tokens + +From `/dropbox/admin`, click **Revoke** on any active token. The token is immediately +deleted from the database. Any source with the URL can no longer upload. + +### Receipt Verification + +Sources can verify their submission was received at `/dropbox/verify-receipt` by entering +their receipt code. This returns the filename, SHA-256, and reception timestamp. + +### Operational Security + +- The upload page has no SooSeF branding -- it is a minimal HTML form +- No external resources are loaded (no CDN, fonts, analytics) -- Tor Browser compatible +- SooSeF does not log source IP addresses +- If using a reverse proxy (nginx, Caddy), disable access logging for `/dropbox/upload/` +- Tokens auto-expire and are cleaned up on every admin page load +- For maximum source protection, run SooSeF as a Tor hidden service + +### Storage Management + +Uploaded files accumulate in `~/.soosef/temp/dropbox/`. Periodically review and process +submissions, then remove them from the temp directory. 
The files are not automatically +cleaned up (they persist until you act on them or the killswitch fires). + +--- + +## 4. Key Management + +### Two Key Domains + +SooSeF manages two independent key types: + +| Key | Algorithm | Location | Purpose | +|---|---|---|---| +| **Identity key** | Ed25519 | `~/.soosef/identity/` | Sign attestations, chain records | +| **Channel key** | AES-256-GCM (Argon2id-derived) | `~/.soosef/stegasoo/channel.key` | Steganographic encoding | + +These are never merged. Rotating one does not affect the other. + +### Key Rotation + +**Identity rotation** archives the old keypair and generates a new one. If the chain is +enabled, a `soosef/key-rotation-v1` record is signed by the OLD key, creating a +verifiable trust chain. + +```bash +$ soosef keys rotate-identity +``` + +After rotating, immediately: +1. Take a fresh backup (`soosef keys export`) +2. Notify all collaborators of the new fingerprint +3. Update trusted-key lists at partner organizations + +**Channel rotation** archives the old key and generates a new one: + +```bash +$ soosef keys rotate-channel +``` + +After rotating, share the new channel key with all stego correspondents. + +### Trust Store + +Import collaborator public keys so you can verify their attestations and accept their +federation bundles: + +```bash +$ soosef keys trust --import /media/usb/partner-pubkey.pem +``` + +Always verify fingerprints out-of-band (in person or over a known-secure voice channel). + +List trusted keys: + +```bash +$ soosef keys show +``` + +Remove a trusted key: + +```bash +$ soosef keys untrust +``` + +### Backup Schedule + +SooSeF warns when backups are overdue (configurable via `backup_reminder_days`). + +```bash +# Create encrypted backup +$ soosef keys export -o /media/usb/backup.enc + +# Check backup status +$ soosef status +``` + +Store backups on separate physical media, in a different location from the device. + +--- + +## 5. 
Federation Setup + +Federation allows multiple SooSeF instances to exchange attestation records. + +### Adding Federation Peers + +**Through the web UI:** Go to `/federation/`, click **Add Peer**, enter the peer's URL +and Ed25519 fingerprint. + +**Through the CLI or peer_store:** + +```bash +# Peers are managed via the web UI or programmatically through PeerStore +``` + +### Trust Key Exchange + +Before two organizations can federate, exchange public keys: + +1. Export your public key: `cp ~/.soosef/identity/public.pem /media/usb/our-pubkey.pem` +2. Give it to the partner organization (physical handoff or secure channel) +3. Import their key: `soosef keys trust --import /media/usb/their-pubkey.pem` +4. Verify fingerprints out-of-band + +### Exporting Attestation Bundles + +```bash +# Export all records +$ soosef chain export --output /media/usb/bundle.zip + +# Export a specific range +$ soosef chain export --start 100 --end 200 --output /media/usb/bundle.zip + +# Export filtered by investigation +# (investigation tag is set during attestation) +``` + +### Importing Attestation Bundles + +On the receiving instance, imported records are: +- Verified against the trust store (untrusted signers are rejected) +- Deduplicated by SHA-256 (existing records are skipped) +- Tagged with `federated_from` metadata +- Acknowledged via a delivery-ack chain record (two-way handshake) + +### Gossip Sync (Tier 2 <-> Tier 3) + +If the Tier 2 server and Tier 3 relay have network connectivity, gossip sync runs +automatically at the configured interval (default: 60 seconds, set via +`VERISOO_GOSSIP_INTERVAL` environment variable). + +Gossip flow: +1. Nodes exchange Merkle roots +2. If roots differ, request consistency proof +3. Fetch missing records +4. Append to local log + +Monitor sync status at `/federation/` in the web UI. + +### Airgapped Federation + +All federation is designed for sneakernet operation: + +1. Export bundle to USB on sending instance +2. 
Physically carry USB to receiving instance +3. Import bundle +4. Optionally export delivery acknowledgment back on USB + +No network connectivity is required at any point. + +--- + +## 6. Chain and Anchoring + +### Chain Verification + +Verify the full chain periodically: + +```bash +$ soosef chain verify +``` + +This checks all hash linkage and Ed25519 signatures. It also verifies key rotation +records and tracks authorized signers. + +### Timestamp Anchoring + +Anchor the chain head to prove it existed before a given time: + +```bash +# Automated (requires network) +$ soosef chain anchor --tsa https://freetsa.org/tsr + +# Manual (prints hash for external submission) +$ soosef chain anchor +``` + +A single anchor implicitly timestamps every prior record (the chain is append-only). + +**When to anchor:** +- Before sharing evidence with third parties +- At regular intervals (daily or weekly) +- Before key rotation +- Before and after major investigations + +### Selective Disclosure + +For legal discovery or court orders, produce a proof showing specific records while +keeping others redacted: + +```bash +$ soosef chain disclose -i 42,43,44 -o disclosure.json +``` + +The output includes full records for selected indices and hash-only entries for everything +else. A third party can verify the selected records are part of an unbroken chain. + +--- + +## 7. Evidence Preservation + +### Evidence Packages + +For handing evidence to lawyers, courts, or organizations without SooSeF: + +Self-contained ZIP containing original images, attestation records, chain data, your +public key, a standalone `verify.py`, and a README. The recipient verifies with: + +```bash +$ pip install cryptography +$ python verify.py +``` + +### Cold Archives + +For long-term preservation (10+ year horizon), OAIS-aligned: + +Full state export including chain binary, attestation log, LMDB index, anchors, public +key, trusted keys, optional encrypted key bundle, `ALGORITHMS.txt`, and `verify.py`. 
+ +**When to create cold archives:** +- Weekly or monthly as part of backup strategy +- Before key rotation or travel +- When archiving a completed investigation + +Store on at least two separate physical media in different locations. + +--- + +## 8. Monitoring and Health + +### Health Endpoint + +```bash +# Web UI +$ curl -k https://127.0.0.1:5000/health + +# Federation API +$ curl http://localhost:8000/health +``` + +Returns capabilities (stego-lsb, stego-dct, attest, fieldkit, chain). + +### System Status + +```bash +$ soosef status --json +``` + +Checks: identity key, channel key, chain integrity, dead man's switch state, backup +status, geofence, trusted keys. + +### Docker Monitoring + +```bash +# Service status +$ docker compose ps + +# Logs +$ docker compose logs -f server + +# Resource usage +$ docker stats +``` + +The Docker images include `HEALTHCHECK` directives that poll `/health` every 30 seconds. + +--- + +## 9. Incident Response + +### Device Seizure (Imminent) + +1. Trigger killswitch: `soosef fieldkit purge --confirm CONFIRM-PURGE` +2. For Tier 1 USB: pull the USB stick and destroy it physically if possible +3. Verify with a separate device that federation copies are intact + +### Device Seizure (After the Fact) + +1. Assume all local data is compromised +2. Verify Tier 2/3 copies of attestation data +3. Generate new keys on a fresh instance +4. Record a key recovery event in the chain (if the old chain is still accessible) +5. Notify all collaborators to update their trust stores + +### Federation Peer Compromise + +1. The compromised peer has attestation metadata (hashes, signatures, timestamps) but not + decrypted content +2. Remove the peer from your peer list (`/federation/` > Remove Peer) +3. Assess what metadata exposure means for your organization +4. Consider whether attestation patterns reveal sensitive information + +### Dead Man's Switch Triggered Accidentally + +Data is gone. 
Restore from the most recent backup: + +```bash +$ soosef init +$ soosef keys import -b /media/usb/backup.enc +``` + +Federation copies of attestation data are unaffected. Local attestations created since +the last federation sync or backup are lost. + +--- + +## 10. Maintenance Tasks + +### Regular Schedule + +| Task | Frequency | Command | +|---|---|---| +| Check system status | Daily | `soosef status` | +| Check in (if deadman armed) | Per interval | `soosef fieldkit checkin` | +| Backup keys | Per `backup_reminder_days` | `soosef keys export` | +| Verify chain integrity | Weekly | `soosef chain verify` | +| Anchor chain | Weekly | `soosef chain anchor` | +| Review drop box submissions | As needed | `/dropbox/admin` | +| Clean temp files | Monthly | Remove processed files from `~/.soosef/temp/` | +| Create cold archive | Monthly | Export via CLI or web | +| Update SooSeF | As releases are available | `pip install --upgrade soosef` | + +### Docker Volume Backup + +```bash +$ docker compose stop server +$ docker run --rm -v server-data:/data -v /backup:/backup \ + busybox tar czf /backup/soosef-$(date +%Y%m%d).tar.gz -C /data . +$ docker compose start server +``` + +### Log Rotation + +`audit.jsonl` grows indefinitely. On long-running Tier 2 servers, archive old entries +periodically. The audit log is append-only; truncate by copying the tail: + +```bash +$ tail -n 10000 ~/.soosef/audit.jsonl > ~/.soosef/audit.jsonl.tmp +$ mv ~/.soosef/audit.jsonl.tmp ~/.soosef/audit.jsonl +``` + +> **Warning**: Truncating the audit log removes historical records. Archive the full +> file before truncating if you need the history for compliance or legal purposes. 
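The archive-then-truncate advice in the warning above can be combined into one step. The following is a minimal sketch rather than a SooSeF command: `archive_and_truncate` and its parameters are hypothetical, and it assumes nothing is appending to the log while it runs (for example, the server is stopped), since entries written between the archive and the swap would be lost.

```python
import gzip
import os
import shutil
from collections import deque
from pathlib import Path


def archive_and_truncate(log_path: Path, archive_path: Path, keep: int = 10000) -> int:
    """Gzip the full log to archive_path, then keep only the last `keep` lines.

    Returns the number of lines retained in the live log.
    """
    # 1. Preserve the complete history first.
    with open(log_path, "rb") as src, gzip.open(archive_path, "wb") as dst:
        shutil.copyfileobj(src, dst)

    # 2. Collect the tail without holding the whole file in memory.
    with open(log_path, "r", encoding="utf-8") as fh:
        tail = deque(fh, maxlen=keep)

    # 3. Write the tail to a temp file and swap it in atomically.
    tmp_path = log_path.with_name(log_path.name + ".tmp")
    with open(tmp_path, "w", encoding="utf-8") as fh:
        fh.writelines(tail)
    os.replace(tmp_path, log_path)
    return len(tail)
```

Store the resulting archive on separate media from the device, in line with the backup guidance elsewhere in this guide, because an archive left under `~/.soosef/` may be destroyed by the killswitch along with the live log.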
diff --git a/docs/training/admin-reference.md b/docs/training/admin-reference.md new file mode 100644 index 0000000..d59a04d --- /dev/null +++ b/docs/training/admin-reference.md @@ -0,0 +1,219 @@ +# Administrator Quick Reference + +**Audience**: IT staff and technical leads responsible for deploying and maintaining +SooSeF instances. + +--- + +## Deployment Tiers + +| Tier | Form Factor | Use Case | Key Feature | +|---|---|---|---| +| 1 -- Field Device | Bootable Debian Live USB | Reporter in the field | Amnesic, LUKS encrypted, pull USB = zero trace | +| 2 -- Org Server | Docker on mini PC or VPS | Newsroom / NGO office | Persistent, web UI + federation API | +| 3 -- Federation Relay | Docker on VPS | Friendly jurisdiction | Attestation sync only, zero knowledge of keys | + +--- + +## Quick Deploy + +### Tier 1: Build USB image + +```bash +$ sudo apt install live-build +$ cd deploy/live-usb +$ sudo ./build.sh +$ sudo dd if=live-image-amd64.hybrid.iso of=/dev/sdX bs=4M status=progress +``` + +### Tier 2: Docker org server + +```bash +$ cd deploy/docker +$ docker compose up server -d +``` + +Exposes port 5000 (web UI) and port 8000 (federation API). + +### Tier 3: Docker federation relay + +```bash +$ cd deploy/docker +$ docker compose up relay -d +``` + +Exposes port 8001 (federation API only). + +### Kubernetes + +```bash +$ docker build -t soosef-server --target server -f deploy/docker/Dockerfile . +$ docker build -t soosef-relay --target relay -f deploy/docker/Dockerfile . +$ kubectl apply -f deploy/kubernetes/namespace.yaml +$ kubectl apply -f deploy/kubernetes/server-deployment.yaml +$ kubectl apply -f deploy/kubernetes/relay-deployment.yaml +``` + +Single-replica only. SooSeF uses SQLite -- do not scale horizontally. 
+ +--- + +## Threat Level Presets + +Copy the appropriate preset to configure SooSeF for the operational environment: + +```bash +$ cp deploy/config-presets/-threat.json ~/.soosef/config.json +``` + +| Level | Session | Killswitch | Dead Man | Cover Name | +|---|---|---|---|---| +| `low-threat` | 30 min | Off | Off | None | +| `medium-threat` | 15 min | On | 48h / 4h grace | "Office Document Manager" | +| `high-threat` | 5 min | On | 12h / 1h grace | "Local Inventory Tracker" | +| `critical-threat` | 3 min | On | 6h / 1h grace | "System Statistics" | + +--- + +## Essential CLI Commands + +### System + +| Command | Description | +|---|---| +| `soosef init` | Create directory structure, generate keys, write default config | +| `soosef serve --host 0.0.0.0` | Start web UI (LAN-accessible) | +| `soosef status` | Pre-flight check: keys, chain, deadman, backup, geofence | +| `soosef status --json` | Machine-readable status output | + +### Keys + +| Command | Description | +|---|---| +| `soosef keys show` | Display current key info and fingerprints | +| `soosef keys export -o backup.enc` | Export encrypted key bundle | +| `soosef keys import -b backup.enc` | Import key bundle from backup | +| `soosef keys rotate-identity` | Rotate Ed25519 identity (records in chain) | +| `soosef keys rotate-channel` | Rotate AES-256-GCM channel key | +| `soosef keys trust --import pubkey.pem` | Trust a collaborator's public key | + +### Fieldkit + +| Command | Description | +|---|---| +| `soosef fieldkit status` | Show fieldkit state (deadman, geofence, USB, tamper) | +| `soosef fieldkit checkin` | Reset dead man's switch timer | +| `soosef fieldkit check-deadman` | Check if deadman timer expired (for cron) | +| `soosef fieldkit purge --confirm CONFIRM-PURGE` | Activate killswitch | +| `soosef fieldkit geofence set --lat X --lon Y --radius M` | Set GPS boundary | +| `soosef fieldkit usb snapshot` | Record USB whitelist baseline | +| `soosef fieldkit tamper baseline` | Record file 
integrity baseline | + +### Chain and Evidence + +| Command | Description | +|---|---| +| `soosef chain status` | Show chain head, length, integrity | +| `soosef chain verify` | Verify full chain (hashes + signatures) | +| `soosef chain log --count 20` | Show recent chain entries | +| `soosef chain export -o bundle.zip` | Export attestation bundle | +| `soosef chain disclose -i 5,12,47 -o disclosure.json` | Selective disclosure | +| `soosef chain anchor` | Manual anchor (prints hash for external witness) | +| `soosef chain anchor --tsa https://freetsa.org/tsr` | RFC 3161 automated anchor | + +--- + +## User Management + +The web UI admin panel at `/admin` provides: + +- Create user accounts +- Delete user accounts +- Reset passwords (temporary password issued) +- View active sessions + +User credentials are stored in SQLite at `~/.soosef/auth/soosef.db`. + +--- + +## Backup Checklist + +| What | How often | Command | +|---|---|---| +| Key bundle | After every rotation, weekly minimum | `soosef keys export -o backup.enc` | +| Cold archive | Weekly or before travel | `soosef archive export --include-keys -o archive.zip` | +| Docker volume | Before updates | `docker compose stop server && docker run --rm -v server-data:/data -v /backup:/backup busybox tar czf /backup/soosef-$(date +%Y%m%d).tar.gz -C /data .` | + +Store backups on separate physical media. Keep one copy offsite. + +--- + +## Federation Setup + +1. Exchange public keys between organizations (verify fingerprints out-of-band) +2. Import collaborator keys: `soosef keys trust --import /path/to/pubkey.pem` +3. Register peers via web UI at `/federation` or via CLI +4. Gossip starts automatically; monitor at `/federation` + +For airgapped federation: `soosef chain export` to USB, carry to partner, import there. + +--- + +## Source Drop Box + +1. Navigate to `/dropbox/admin` (admin login required) +2. Create a token with label, expiry, and file limit +3. 
Share the generated URL with the source over a secure channel +4. Monitor submissions at `/dropbox/admin` +5. Revoke tokens when no longer needed + +--- + +## Hardening Checklist + +- [ ] Disable or encrypt swap (`swapoff -a` or dm-crypt random-key swap) +- [ ] Disable core dumps (`echo "* hard core 0" >> /etc/security/limits.conf`) +- [ ] Configure firewall (UFW: allow 5000, 8000; deny all other incoming) +- [ ] Disable unnecessary services (bluetooth, avahi-daemon) +- [ ] Apply a threat level preset appropriate for the environment +- [ ] Set `cover_name` in config if operating under cover +- [ ] Set `SOOSEF_DATA_DIR` to an inconspicuous path if needed +- [ ] Enable HTTPS (default) or place behind a reverse proxy with TLS +- [ ] Create systemd service for bare metal (see `docs/deployment.md` Section 7) +- [ ] Set up regular backups (key bundle + cold archive) +- [ ] Arm dead man's switch if appropriate for threat level +- [ ] Take initial USB whitelist snapshot +- [ ] Record tamper baseline + +--- + +## Troubleshooting Quick Fixes + +| Symptom | Check | +|---|---| +| Web UI unreachable from LAN | `host` must be `0.0.0.0`, not `127.0.0.1`. Check firewall. | +| Docker container exits | `docker compose logs server` -- check for port conflict or volume permissions | +| Dead man fires unexpectedly | Service crashed and exceeded interval+grace. Ensure `Restart=on-failure`. | +| Permission errors on `~/.soosef/` | Run SooSeF as the same user who ran `soosef init` | +| Drop box tokens expire immediately | System clock wrong. Run `date -u` and fix if needed. | +| Chain anchor TSA fails | Requires network. Use manual anchor on airgapped devices. | +| Account locked out | Wait for lockout to expire, or restart the server. | +| SSL cert shows wrong name | Delete `~/.soosef/certs/cert.pem`, set `cover_name`, restart. 
| + +--- + +## Health Checks + +```bash +# Web UI +$ curl -k https://127.0.0.1:5000/health + +# Federation API (Tier 2) +$ curl http://localhost:8000/health + +# Relay (Tier 3) +$ curl http://localhost:8001/health + +# Full system status +$ soosef status --json +``` diff --git a/docs/training/emergency-card.md b/docs/training/emergency-card.md new file mode 100644 index 0000000..9bdd014 --- /dev/null +++ b/docs/training/emergency-card.md @@ -0,0 +1,79 @@ +# Emergency Reference Card + +**Audience**: All SooSeF users. Print, laminate, and carry in your wallet. + +--- + +## EMERGENCY DATA DESTRUCTION + +### Option 1: Pull the USB (Tier 1 -- fastest) + +Remove the USB stick from the laptop. The laptop retains zero data. + +### Option 2: Software killswitch + +In the browser: **Fieldkit** > **Emergency Purge** > type `CONFIRM-PURGE` > click **Purge** + +From a terminal: + +``` +soosef fieldkit purge --confirm CONFIRM-PURGE +``` + +### Option 3: Hardware button (Raspberry Pi only) + +Hold the physical button for 5 seconds. + +--- + +## DESTRUCTION ORDER + +The killswitch destroys data in this order (most critical first): + +1. Ed25519 identity keys +2. AES-256 channel key +3. Session secrets +4. User database +5. Attestation log and chain +6. Temp files and audit log +7. Configuration +8. System logs +9. All forensic traces (bytecache, pip cache, shell history) +10. Self-uninstall + +On USB: the LUKS encryption header is destroyed instead (faster, more reliable on flash). + +--- + +## DEAD MAN'S SWITCH + +If enabled, you must check in before the deadline or all data will be destroyed. + +**Check in**: Browser > **Fieldkit** > **Check In** + +Or: `soosef fieldkit checkin` + +If you cannot check in, contact your editor. They may be able to disarm it remotely. 
+ +--- + +## KEY CONTACTS + +| Role | Name | Contact | +|---|---|---| +| Admin | _________________ | _________________ | +| Editor | _________________ | _________________ | +| Legal | _________________ | _________________ | +| Technical support | _________________ | _________________ | + +Fill in before deploying. Keep this card current. + +--- + +## REMEMBER + +- Pull the USB = zero trace on the laptop +- Keys are destroyed first = remaining data is useless without them +- The killswitch cannot be undone +- Back up your keys regularly -- if the USB is lost, the keys are gone +- Never share your passphrase, PIN, or LUKS password over unencrypted channels diff --git a/docs/training/reporter-field-guide.md b/docs/training/reporter-field-guide.md new file mode 100644 index 0000000..1c9398e --- /dev/null +++ b/docs/training/reporter-field-guide.md @@ -0,0 +1,263 @@ +# SooSeF Reporter Field Guide + +**Audience**: Reporters, field researchers, and documentarians using SooSeF to protect +and verify their work. No technical background required. + +**Prerequisites**: A working SooSeF instance (Tier 1 USB or web UI access to a Tier 2 +server). Your IT admin should have set this up for you. + +--- + +## What SooSeF Does For You + +SooSeF helps you do three things: + +1. **Prove your photos and files are authentic** -- every photo you attest gets a + cryptographic signature that proves you took it, when, and that it has not been + tampered with since. +2. **Hide messages in images** -- send encrypted messages that look like ordinary photos. +3. **Destroy everything if compromised** -- if your device is about to be seized, SooSeF + can erase all evidence of itself and your data in seconds. + +--- + +## Daily Workflow + +### Attesting Photos + +After taking photos in the field, attest them as soon as possible. Attestation creates a +permanent, tamper-evident record. + +**Through the web UI:** + +1. Open Firefox (on Tier 1 USB, it opens automatically) +2. 
Go to the **Attest** page +3. Upload one or more photos +4. Add a caption describing what the photo shows (optional but recommended) +5. Add a location if relevant (optional) +6. Click **Attest** + +SooSeF will: +- Extract GPS coordinates and timestamp from the photo's EXIF data (for the provenance record) +- Strip device-identifying information (serial numbers, firmware version) from the stored copy +- Sign the photo with your Ed25519 identity key +- Add the attestation to the hash chain + +**Through the CLI (if available):** + +```bash +$ soosef attest IMAGE photo.jpg --caption "Market protest, central square" +``` + +> **Warning**: Attest the original, unedited photo. If you crop, filter, or resize +> first, the attestation will not match the original file. Attest first, edit later. + +### Batch Attestation + +If you have a folder of photos from a field visit: + +```bash +$ soosef attest batch ./field-photos/ --caption "Site visit 2026-04-01" +``` + +### Checking Your Status + +Run `soosef status` or visit the web UI home page to see: +- Whether your identity key is set up +- How many attestations you have +- Whether your dead man's switch needs a check-in +- Whether your backup is overdue + +--- + +## Sending Hidden Messages (Steganography) + +Steganography hides an encrypted message inside an ordinary-looking image. To anyone who +does not have the decryption credentials, the image looks completely normal. + +### What You Need + +Both you and the recipient must have: +1. **A reference photo** -- the same image file, shared beforehand +2. **A passphrase** -- at least 4 words, agreed in person +3. **A PIN** -- 6 to 9 digits, agreed in person + +All three are required to encode or decode. Share them in person, never over email or SMS. + +### Encoding a Message + +**Web UI:** Go to **Encode**, upload your carrier image and reference photo, enter your +message, passphrase, and PIN. 
+ +**CLI:** + +```bash +$ soosef stego encode vacation.jpg -r shared_photo.jpg -m "Meeting moved to Thursday" +# Passphrase: (enter your passphrase, hidden) +# PIN: (enter your PIN, hidden) +``` + +The output is a normal-looking image file that contains your hidden message. + +### Transport-Aware Encoding + +If you are sending the image through a messaging app, tell SooSeF which platform. The +app will recompress images, so SooSeF needs to use an encoding that survives recompression: + +```bash +$ soosef stego encode photo.jpg -r shared.jpg -m "Safe house confirmed" --transport whatsapp +$ soosef stego encode photo.jpg -r shared.jpg -m "Safe house confirmed" --transport signal +$ soosef stego encode photo.jpg -r shared.jpg -m "Safe house confirmed" --transport telegram +``` + +> **Warning**: Never reuse a carrier image. SooSeF will warn you if you +> do. Comparing two versions of the same image trivially reveals steganographic changes. + +### Decoding a Message + +```bash +$ soosef stego decode received_image.jpg -r shared_photo.jpg +# Passphrase: (same passphrase) +# PIN: (same PIN) +``` + +--- + +## Check-In (Dead Man's Switch) + +If your admin has enabled the dead man's switch, you must check in regularly. If you miss +your check-in window, SooSeF assumes something has gone wrong and will eventually destroy +all data to protect you. + +**Check in through the web UI:** Visit the **Fieldkit** page and click **Check In**. + +**Check in through the CLI:** + +```bash +$ soosef fieldkit checkin +``` + +> **Warning**: If you will be unable to check in (traveling without the device, planned +> downtime), ask your admin to disarm the dead man's switch first. Forgetting to check in +> will trigger the killswitch and destroy all your data permanently. 
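
For readers who want to picture the timeline, the check-in window and the grace period that follows it can be sketched as a small state check. This is an illustrative sketch only, not SooSeF's actual implementation: the function name `switch_state`, the 24-hour interval, and the 2-hour grace period are all assumptions, and your admin sets the real values.

```python
from datetime import datetime, timedelta

# Hypothetical values for illustration; the real interval and grace period
# are configured by your admin and may differ.
CHECKIN_INTERVAL = timedelta(hours=24)
GRACE_PERIOD = timedelta(hours=2)


def switch_state(last_checkin: datetime, now: datetime) -> str:
    """Classify where `now` falls on the check-in timeline."""
    deadline = last_checkin + CHECKIN_INTERVAL
    if now <= deadline:
        return "ok"  # checked in on time, nothing happens
    if now <= deadline + GRACE_PERIOD:
        return "grace"  # overdue: warning stage, data still intact
    return "triggered"  # grace period over: the killswitch fires


last = datetime(2026, 4, 1, 9, 0)
for hours in (12, 25, 27):
    print(hours, switch_state(last, last + timedelta(hours=hours)))
```

With these assumed values, a check-in that is 12 hours old is still `ok`, one that is 25 hours old is in `grace`, and one that is 27 hours old counts as `triggered`. Checking in simply resets `last_checkin` to the current time.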
+ +--- + +## Emergency: Triggering the Killswitch + +If your device is about to be seized or compromised: + +**CLI:** + +```bash +$ soosef fieldkit purge --confirm CONFIRM-PURGE +``` + +**Web UI:** Visit the **Fieldkit** page and use the emergency purge button. + +**Tier 1 USB:** Pull the USB stick from the laptop. The laptop retains nothing (it +was running from the USB). The USB stick is LUKS-encrypted and requires a passphrase to +access. + +**Raspberry Pi with hardware button:** Hold the killswitch button for 5 seconds (default). + +### What gets destroyed + +1. Your signing keys (without these, encrypted data is unrecoverable) +2. Your channel key +3. User accounts and session data +4. All attestation records and chain data +5. Temporary files and audit logs +6. Configuration +7. System log entries mentioning SooSeF +8. Python bytecache and pip metadata (to hide that SooSeF was installed) +9. The SooSeF package itself + +> **Warning**: This is irreversible. Make sure you have recent backups stored +> separately before relying on the killswitch. See "Backups" below. + +--- + +## Backups + +Back up your keys regularly. SooSeF will remind you if your backup is overdue. + +### Creating a Backup + +```bash +$ soosef keys export -o /media/usb/soosef-backup.enc +``` + +You will be prompted for a passphrase. This creates an encrypted bundle containing your +identity key and channel key. Store the USB drive **in a different physical location** +from your SooSeF device. + +### Restoring From Backup + +On a fresh SooSeF instance: + +```bash +$ soosef init +$ soosef keys import -b /media/usb/soosef-backup.enc +``` + +--- + +## Evidence Packages + +When you need to hand evidence to a lawyer, a court, or a partner organization that does +not use SooSeF: + +1. Go to the web UI or use the CLI to create an evidence package +2. Select the photos to include +3. 
SooSeF creates a ZIP file containing: + - Your original photos + - Attestation records with signatures + - The chain segment proving order and integrity + - Your public key + - A standalone verification script + - A README with instructions + +The recipient can verify the evidence using only Python -- they do not need SooSeF. + +--- + +## What To Do If... + +**Your device is seized**: If you did not trigger the killswitch in time, assume all +local data is compromised. Any attestation bundles you previously sent to your +organization (Tier 2) or federated to relays (Tier 3) are safe and contain your full +chain. Contact your organization's IT team. + +**You lose your USB stick**: The USB is LUKS-encrypted, so the finder cannot read it +without your passphrase. Restore from your backup on a new USB stick. All attestations +that were already synced to Tier 2/3 are safe. + +**You forget your stego passphrase or PIN**: There is no recovery. The message is +encrypted with keys derived from the passphrase, PIN, and reference photo. If you lose +any of the three, the message cannot be recovered. + +**You need to share evidence with a court**: Use selective disclosure +(`soosef chain disclose`) to produce a proof that includes only the specific records +requested. The court can verify these records are part of an authentic, unbroken chain +without seeing your other work. + +**Your check-in is overdue**: Check in immediately. During the grace period, a warning +webhook fires (if configured) but data is not yet destroyed. After the grace period, +the killswitch fires automatically. + +--- + +## Security Reminders + +- **Attest photos immediately** after taking them. The sooner you attest, the smaller + the window during which someone could have tampered with the file. +- **Never send stego credentials over digital channels.** Share the reference photo, + passphrase, and PIN in person. +- **Never reuse carrier images** for steganography. 
+- **Check in on schedule** if the dead man's switch is armed. +- **Back up regularly** and store backups in a separate physical location. +- **Lock the browser** or close it when you walk away. Session timeouts help, but do not + rely on them. +- **Do not discuss SooSeF by name** in environments where your communications may be + monitored. If `cover_name` is configured, the tool presents itself under that name. diff --git a/docs/training/reporter-quickstart.md b/docs/training/reporter-quickstart.md new file mode 100644 index 0000000..0bfab9b --- /dev/null +++ b/docs/training/reporter-quickstart.md @@ -0,0 +1,105 @@ +# Reporter Quick-Start Card + +**Audience**: Field reporters using a SooSeF Tier 1 bootable USB device. +No technical background assumed. + +**Print this page on a single sheet, laminate it, and keep it with the USB stick.** + +--- + +## Getting Started + +1. **Plug the USB** into any laptop +2. **Boot from USB** (press F12 during startup, select the USB drive) +3. **Enter your passphrase** when the blue screen appears (this unlocks your data) +4. **Wait for the browser** to open automatically + +You are now running SooSeF. The laptop's own hard drive is never touched. + +--- + +## Taking and Attesting a Photo + +1. Transfer your photo to the laptop (USB cable, SD card, etc.) +2. In the browser, click **Attest** +3. Select your photo and click **Sign** +4. The photo is now cryptographically signed with your identity + +This proves you took this photo, where, and when. It cannot be forged later. + +--- + +## Hiding a Message in a Photo + +1. Click **Encode** in the browser +2. Select a **carrier image** (the photo that will carry the hidden message) +3. Select a **reference photo** (a photo both you and the recipient have) +4. Type your **message** +5. Enter your **passphrase** and **PIN** (the recipient needs the same ones) +6. 
Click **Encode** + +To send via WhatsApp, Signal, or Telegram, select the platform from the **Transport** +dropdown before encoding. This ensures the message survives the platform's image +compression. + +--- + +## Checking In (Dead Man's Switch) + +If your admin has enabled the dead man's switch, you must check in regularly. + +1. Click **Fieldkit** in the browser +2. Click **Check In** + +Or from a terminal: + +``` +soosef fieldkit checkin +``` + +If you miss your check-in window, the system will destroy all data after the grace period. + +> **If you are unable to check in, contact your editor immediately.** + +--- + +## Emergency: Destroying All Data + +If you believe the device will be seized: + +1. **Pull the USB stick** -- the laptop retains nothing +2. If you cannot pull the USB: click **Fieldkit** then **Emergency Purge** and + confirm with `CONFIRM-PURGE` + +Everything is gone. Keys, photos, attestations, messages -- all destroyed. + +--- + +## Shutting Down + +1. **Close the browser** +2. **Pull the USB stick** + +The laptop returns to its normal state. No trace of SooSeF remains. + +--- + +## Troubleshooting + +| Problem | Solution | +|---|---| +| Laptop does not boot from USB | Press F12 (or F2, Del) during startup to enter boot menu. Select the USB drive. Disable Secure Boot in BIOS if needed. | +| "Certificate warning" in browser | Normal for self-signed certificates. Click "Advanced" then "Accept the risk" or "Proceed." | +| Cannot connect to web UI | Wait 30 seconds after boot. Try refreshing the browser. The URL is `https://127.0.0.1:5000`. | +| Forgot passphrase or PIN | You cannot recover encrypted data without the correct passphrase and PIN. Contact your admin. | +| USB stick lost or broken | Get a new USB from your admin. If you had a backup, they can restore your keys onto the new stick. | + +--- + +## Key Rules + +1. **Never leave the USB in an unattended laptop** +2. **Check in on time** if the dead man's switch is enabled +3. 
**Back up your USB** -- your admin can help with this +4. **Verify fingerprints** before trusting a collaborator's key +5. **Use transport-aware encoding** when sending stego images through messaging apps