fieldwitness/docs/architecture/federation.md
Aaron D. Lee 490f9d4a1d Rebrand SooSeF to FieldWitness
Complete project rebrand for better positioning in the press freedom
and digital security space. FieldWitness communicates both field
deployment and evidence testimony — appropriate for the target audience
of journalists, NGOs, and human rights organizations.

Rename mapping:
- soosef → fieldwitness (package, CLI, all imports)
- soosef.stegasoo → fieldwitness.stego
- soosef.verisoo → fieldwitness.attest
- ~/.soosef/ → ~/.fwmetadata/ (innocuous data dir name)
- SOOSEF_DATA_DIR → FIELDWITNESS_DATA_DIR
- SoosefConfig → FieldWitnessConfig
- SoosefError → FieldWitnessError

Also includes:
- License switch from MIT to GPL-3.0
- C2PA bridge module (Phase 0-2 MVP): cert.py, export.py, vendor_assertions.py
- README repositioned to lead with provenance/federation, stego backgrounded
- Threat model skeleton at docs/security/threat-model.md
- Planning docs: docs/planning/c2pa-integration.md, docs/planning/gtm-feasibility.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 15:05:13 -04:00

# Federated Attestation System — Architecture Overview
**Status**: Design

**Version**: 0.1.0-draft

**Last updated**: 2026-04-01
## 1. Problem Statement
FieldWitness operates offline-first: devices create Ed25519-signed attestations without network
access. This creates two fundamental challenges:
1. **Timestamp credibility**: An offline device's clock is untrusted. An adversary with
   physical access could backdate or postdate attestations.
2. **Distribution**: Attestations trapped on a single device are vulnerable to seizure,
   destruction, or loss. They must be replicated to survive.

The system must solve both problems while preserving the offline-first constraint and
protecting content confidentiality, even from the distribution infrastructure.
## 2. Threat Model
### Adversaries
| Adversary | Capabilities | Goal |
|---|---|---|
| State actor | Physical device access, network surveillance, legal compulsion of server operators | Suppress or discredit attestations |
| Insider threat | Access to one federation server | Fork the log, selectively omit entries, or read content |
| Device thief | Physical access to offline device | Fabricate or backdate attestations |
### Security Properties
| Property | Guarantee | Mechanism |
|---|---|---|
| **Integrity** | Attestations cannot be modified after creation | Ed25519 signatures + hash chain |
| **Ordering** | Attestations cannot be reordered or inserted | Hash chain (each record includes hash of previous) |
| **Existence proof** | Attestation existed at or before time T | Federation receipt (server-signed timestamp) |
| **Confidentiality** | Content is hidden from infrastructure | Envelope encryption; servers store encrypted blobs |
| **Fork detection** | Log tampering is detectable | Merkle tree + consistency proofs (CT model) |
| **Availability** | Attestations survive device loss | Replication across federation servers via gossip |
### Non-Goals
- **Proving exact creation time** — Impossible without a trusted time source. We prove
ordering (hash chain) and existence-before (federation receipt). The gap between
claimed time and receipt time is the trust window.
- **Anonymity of attestors** — Federation members can see signer public keys. For anonymity,
use a dedicated identity per context.
- **Preventing denial of service** — Federation servers are assumed cooperative. Byzantine
fault tolerance is out of scope for v1.
## 3. System Architecture
```
OFFLINE DEVICE                                        AIR GAP     INTERNET
──────────────                                        ───────     ────────
┌──────────┐     ┌──────────────┐     ┌──────────┐     USB/SD     ┌──────────┐     ┌────────────┐
│  Attest  │────>│  Hash Chain  │────>│  Export  │───────────────>│  Loader  │────>│ Federation │
│          │     │  (Layer 1)   │     │  Bundle  │                │  (App)   │     │   Server   │
└──────────┘     └──────────────┘     │ (Layer 2)│                └────┬─────┘     └─────┬──────┘
                                      └──────────┘                     │                 │
                                           │                           │   ┌─────────┐   │
                                       Optional:                       │<──│ Receipt │<──┘
                                       DCT embed                       │   └─────────┘
                                        in JPEG                        │
                                           │                    USB carry-back
                                      ┌────v─────┐                     │
                                      │  Stego   │                ┌────v─────┐
                                      │  Image   │                │ Offline  │
                                      └──────────┘                │ Device   │
                                                                  │ (receipt │
                                                                  │ stored)  │
                                                                  └──────────┘
```
### Layer 1: Hash-Chained Attestation Records (Local)
Each attestation is wrapped in a chain record that includes:

- A hash of the previous record (tamper-evident ordering)
- Entropy witnesses (system uptime, kernel state) that make timestamp fabrication expensive
- An Ed25519 signature over the entire record

The chain lives on the offline device at `~/.fwmetadata/chain/`. It wraps existing Attest
attestation records: the Attest record's bytes become the `content_hash` input.

**See**: [chain-format.md](chain-format.md)
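
The chaining idea can be illustrated with a minimal sketch. It assumes SHA-256 and a deterministic JSON encoding as stand-ins for the real record format (which is defined in chain-format.md), and it omits the Ed25519 signature. The names `record_hash`, `make_record`, and `verify_chain`, the `GENESIS_PREV` placeholder, and the `uptime_s` witness field are hypothetical.

```python
import hashlib
import json
from typing import List, Optional

GENESIS_PREV = "00" * 32  # assumed placeholder for the first record's prev_hash


def record_hash(record: dict) -> str:
    """Hash a record over a deterministic JSON encoding (stand-in for canonical CBOR)."""
    encoded = json.dumps(record, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(encoded).hexdigest()


def make_record(prev: Optional[dict], attest_bytes: bytes, uptime_s: float) -> dict:
    """Wrap one attestation in a chain record linked to its predecessor."""
    return {
        "index": 0 if prev is None else prev["index"] + 1,
        "prev_hash": GENESIS_PREV if prev is None else record_hash(prev),
        "content_hash": hashlib.sha256(attest_bytes).hexdigest(),
        "entropy": {"uptime_s": uptime_s},  # hypothetical entropy witness
    }


def verify_chain(records: List[dict]) -> bool:
    """Check that every record points at the hash of the record before it."""
    expected_prev = GENESIS_PREV
    for position, record in enumerate(records):
        if record["index"] != position or record["prev_hash"] != expected_prev:
            return False
        expected_prev = record_hash(record)
    return True
```

Editing any record changes its hash, so the next record's `prev_hash` no longer matches and `verify_chain` fails. In this sketch nothing protects the final record; in the design described above that role falls to the signature over each record.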
### Layer 2: Encrypted Export Bundles
A range of chain records is packaged into a portable bundle:

1. Records are serialized as CBOR and compressed with zstd
2. The compressed payload is encrypted with AES-256-GCM under a random data encryption key (DEK)
3. The DEK is wrapped per recipient via X25519 ECDH, using X25519 keys derived from the
   recipients' Ed25519 identities
4. An unencrypted `chain_summary` (record count, hash range, Merkle root, signature) allows
   auditing without decryption

Bundles can optionally be embedded in JPEG images via stego's DCT steganography, so that
on a USB stick they look like ordinary photos.

**See**: [export-bundle.md](export-bundle.md)
### Layer 3: Federated Append-Only Log
Federation servers are "blind notaries" inspired by Certificate Transparency (RFC 6962):

- They receive encrypted bundles, verify the chain summary signature, and append to a
  Merkle tree
- They issue signed receipts with a federation timestamp (proof of existence)
- They gossip Signed Tree Heads (STHs) with peers to ensure consistency
- They never decrypt content; they operate at the "federation member" permission tier

**See**: [federation-protocol.md](federation-protocol.md)
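
For reference, the Merkle Tree Hash that Certificate Transparency logs compute (RFC 6962, section 2.1) can be written in a few lines. This is a sketch of the RFC's definition, not of the project's `merkle.py` or `tree.py`. It assumes SHA-256 and treats each log entry as opaque bytes, and it leaves out inclusion and consistency proofs.

```python
import hashlib
from typing import List


def _leaf_hash(entry: bytes) -> bytes:
    # RFC 6962 prefixes leaves with 0x00 so a leaf can never be mistaken for a node.
    return hashlib.sha256(b"\x00" + entry).digest()


def _node_hash(left: bytes, right: bytes) -> bytes:
    # Interior nodes are prefixed with 0x01.
    return hashlib.sha256(b"\x01" + left + right).digest()


def merkle_root(entries: List[bytes]) -> bytes:
    """Merkle Tree Hash over an ordered list of entries (RFC 6962, section 2.1)."""
    n = len(entries)
    if n == 0:
        return hashlib.sha256(b"").digest()
    if n == 1:
        return _leaf_hash(entries[0])
    # Split at the largest power of two strictly smaller than n.
    k = 1
    while k * 2 < n:
        k *= 2
    return _node_hash(merkle_root(entries[:k]), merkle_root(entries[k:]))
```

Two servers that hold the same entries in the same order compute the same root, so differing roots for the same tree size are evidence of a fork.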
### The Loader (Air-Gap Bridge)
A separate application that runs on an internet-connected machine:

1. Receives a bundle (from USB) or extracts one from a steganographic image
2. Validates the bundle signature and chain summary
3. Pushes to configured federation servers
4. Collects signed receipts
5. Receipts are carried back to the offline device on the next USB round-trip

The loader never needs signing keys because bundles are already signed. It is a transport mechanism.
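
The push-and-collect steps might look roughly like the sketch below. Everything in it is hypothetical: `push_bundle` and its `submit` callback are illustrative names, and the real transport lives in the loader's HTTP client. The point is only that the loader handles opaque bytes and keeps track of servers it could not reach.

```python
from typing import Callable, Dict, List, Optional, Tuple


def push_bundle(
    bundle: bytes,
    servers: List[str],
    submit: Callable[[str, bytes], Optional[bytes]],
) -> Tuple[Dict[str, bytes], List[str]]:
    """Submit one bundle to every server; return (receipts, pending servers)."""
    receipts: Dict[str, bytes] = {}
    pending: List[str] = []
    for url in servers:
        try:
            receipt = submit(url, bundle)
        except OSError:
            # Network failures (ConnectionError, timeouts) leave the server pending.
            receipt = None
        if receipt is None:
            pending.append(url)
        else:
            receipts[url] = receipt
    return receipts, pending
```

Servers left in `pending` can be retried on a later run, while the collected receipts wait for the next USB round-trip.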
## 4. Key Domains
FieldWitness uses three cryptographic domains, each with its own purpose:
| Domain | Algorithm | Purpose | Key Location |
|---|---|---|---|
| **Signing** | Ed25519 | Attestation signatures, chain records, bundle summaries | `~/.fwmetadata/identity/` |
| **Encryption** | X25519 + AES-256-GCM | Bundle payload encryption (envelope) | Derived from Ed25519 via birational map |
| **Steganography** | AES-256-GCM (from factors) | Stego channel encryption | `~/.fwmetadata/stego/channel.key` |

The signing and encryption domains share a key lineage (Ed25519 → X25519 derivation) but
serve different purposes. The steganography domain remains fully independent: it protects
the stego carrier, not the attestation content.
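
The public-key half of the Ed25519 → X25519 derivation is the standard birational map between the Edwards and Montgomery forms of Curve25519, `u = (1 + y) / (1 - y) mod p`. The sketch below does it in plain integer arithmetic. It is illustrative and not the project's `x25519.py`; the function name is hypothetical. The private half is derived differently (from the SHA-512 hash of the Ed25519 seed, then clamped) and is not shown.

```python
P = 2**255 - 19  # field prime shared by Ed25519 and X25519


def ed25519_pub_to_x25519(ed_pub: bytes) -> bytes:
    """Map a 32-byte Ed25519 public key to the corresponding X25519 public key."""
    if len(ed_pub) != 32:
        raise ValueError("Ed25519 public keys are 32 bytes")
    # The encoding is little-endian y with the sign of x in the top bit; drop the sign.
    y = int.from_bytes(ed_pub, "little") & ((1 << 255) - 1)
    if y >= P or y == 1:
        raise ValueError("non-canonical or degenerate y coordinate")
    # Edwards y to Montgomery u: u = (1 + y) / (1 - y) mod p,
    # with the division done as multiplication by a modular inverse.
    u = (1 + y) * pow((1 - y) % P, P - 2, P) % P
    return u.to_bytes(32, "little")
```

A quick sanity check is that the Ed25519 base point (y = 4/5) maps to the X25519 base point (u = 9):

```python
base = bytes.fromhex("58" + "66" * 31)  # standard encoding of the Ed25519 base point
assert ed25519_pub_to_x25519(base) == (9).to_bytes(32, "little")
```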
### Private Key Storage Policy
The Ed25519 private key is stored **unencrypted** on disk (protected by 0o600 file
permissions). This is a deliberate design decision:
- The killswitch (secure deletion) is the primary defense for at-risk users, not key
encryption. A password-protected key would require prompting on every attestation and
chain operation, which is unworkable in field conditions.
- The `password` parameter on `generate_identity()` exists for interoperability but is
not used by default. Chain operations (`_wrap_in_chain`, `backfill`) assume unencrypted keys.
- If the device is seized, the killswitch destroys key material. If the killswitch is not
triggered, the adversary has physical access and can defeat key encryption via cold boot,
memory forensics, or compelled disclosure.
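
One way to get 0o600 protection without a window in which the file is readable by others is to set the mode at creation time. This is a sketch under assumptions: `write_private_key` is a hypothetical helper, not the project's `generate_identity()`, and it targets POSIX systems.

```python
import os


def write_private_key(path: str, key_bytes: bytes) -> None:
    """Create the key file with owner-only permissions from the start."""
    # O_EXCL refuses to overwrite an existing key. The 0o600 mode applies at
    # creation, so the file is never briefly readable by other users.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as fh:
        fh.write(key_bytes)
```

Writing the file first and calling `chmod` afterwards would leave a short interval in which the default umask decides who can read the key.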
## 5. Permission Model
Three tiers control who can see what:
| Tier | Sees | Can Verify | Typical Actor |
|---|---|---|---|
| **Public auditor** | Chain summaries, Merkle proofs, receipt timestamps | Existence, ordering, no forks | Anyone with server URL |
| **Federation member** | + chain IDs, signer public keys | + who attested, chain continuity | Peer servers, authorized monitors |
| **Authorized recipient** | + decrypted attestation content | Everything | Designated individuals with DEK access |

Federation servers themselves operate at the **federation member** tier. They can verify
chain integrity and detect forks, but they cannot read attestation content.
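
The cumulative nature of the tiers can be sketched as a per-field minimum tier. The field names and the `visible_fields` helper are illustrative assumptions, not the project's `permissions.py`. In the actual design the top tier is enforced by encryption (only DEK holders can decrypt) rather than by server-side filtering.

```python
from enum import IntEnum


class Tier(IntEnum):
    PUBLIC_AUDITOR = 1
    FEDERATION_MEMBER = 2
    AUTHORIZED_RECIPIENT = 3


# Minimum tier required to see each field (field names are illustrative).
FIELD_TIER = {
    "chain_summary": Tier.PUBLIC_AUDITOR,
    "merkle_proof": Tier.PUBLIC_AUDITOR,
    "receipt_timestamp": Tier.PUBLIC_AUDITOR,
    "chain_id": Tier.FEDERATION_MEMBER,
    "signer_pubkey": Tier.FEDERATION_MEMBER,
    "content": Tier.AUTHORIZED_RECIPIENT,
}


def visible_fields(entry: dict, tier: Tier) -> dict:
    """Return only the fields a caller at the given tier may see."""
    return {
        name: value
        for name, value in entry.items()
        if name in FIELD_TIER and FIELD_TIER[name] <= tier
    }
```

Unknown fields are dropped rather than passed through, so adding a field to an entry does not expose it until it is assigned a tier.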
## 6. Timestamp Credibility
Since the device is offline, claimed timestamps are inherently untrusted. The system
provides layered evidence:
1. **Hash chain ordering**: Records are provably ordered. Even if timestamps are wrong,
   the sequence is authentic.
2. **Entropy witnesses**: Each record includes system uptime, kernel entropy pool state,
   and boot ID. Fabricating a convincing set of witnesses for a backdated record requires
   simulating the full system state at the claimed time.
3. **Federation receipt**: When the bundle reaches a federation server, the server signs
   a receipt with its own clock. This proves the chain existed at or before the receipt time.
4. **Cross-device corroboration**: If two independent devices attest overlapping events,
   their independent chains corroborate each other's timeline.

The trust window is `receipt_timestamp - claimed_timestamp`. A smaller window means
more credible timestamps. Frequent USB sync trips shrink the window.
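
As a worked example of the trust window, the sketch below subtracts the two timestamps. The `trust_window` helper is hypothetical, and rejecting a claimed time later than the receipt is an assumption made here for illustration (such a record would be postdated relative to the server's clock).

```python
from datetime import datetime, timedelta, timezone


def trust_window(claimed: datetime, receipt: datetime) -> timedelta:
    """Return receipt_timestamp - claimed_timestamp for timezone-aware inputs."""
    window = receipt - claimed
    if window < timedelta(0):
        # Assumption: a claim later than the receipt is treated as invalid.
        raise ValueError("claimed timestamp is later than the federation receipt")
    return window


claimed = datetime(2026, 4, 1, 9, 0, tzinfo=timezone.utc)
receipt = datetime(2026, 4, 3, 9, 0, tzinfo=timezone.utc)
window = trust_window(claimed, receipt)  # two days
```

Here the attestation could have been created at any point in the two days before the receipt, and nothing in the receipt alone narrows that further.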
## 7. Data Lifecycle
```
1. CREATE      Offline device: attest file → sign → append to chain
2. EXPORT      Offline device: select range → compress → encrypt → bundle (→ optional stego embed)
3. TRANSFER    Physical media: USB/SD card carries bundle across air gap
4. LOAD        Internet machine: validate bundle → push to federation
5. RECEIPT     Federation server: verify → append to Merkle tree → sign receipt
6. RETURN      Physical media: receipt carried back to offline device
7. REPLICATE   Federation: servers gossip entries and STHs to each other
8. AUDIT       Anyone: verify Merkle proofs, check consistency across servers
```
## 8. Failure Modes
| Failure | Impact | Mitigation |
|---|---|---|
| Device lost/seized | Local chain lost | Bundles already sent to federation survive; regular exports reduce data loss window |
| USB intercepted | Adversary gets encrypted bundle (or stego image) | Encryption protects content; stego hides existence |
| Federation server compromised | Adversary reads metadata (chain IDs, pubkeys, timestamps) | Content remains encrypted; other servers detect log fork via consistency proofs |
| All federation servers down | No new receipts | Bundles queue on loader; chain integrity unaffected; retry when servers recover |
| Clock manipulation on device | Timestamps unreliable | Entropy witnesses increase fabrication cost; federation receipt provides external anchor |
## 9. Dependencies
| Package | Version | Purpose |
|---|---|---|
| `cbor2` | >=5.6.0 | Canonical CBOR serialization (RFC 8949) |
| `uuid-utils` | >=0.9.0 | UUID v7 generation (time-ordered) |
| `zstandard` | >=0.22.0 | Zstd compression for bundles and storage tiers |
| `cryptography` | >=41.0.0 | Ed25519, X25519, AES-256-GCM, HKDF, SHA-256 (already a dependency) |
## 10. File Layout
```
src/fieldwitness/federation/
    __init__.py
    models.py              Chain record and state dataclasses
    serialization.py       CBOR canonical encoding
    entropy.py             System entropy collection
    chain.py               ChainStore: local append-only chain
    merkle.py              Merkle tree implementation
    export.py              Export bundle creation/parsing/decryption
    x25519.py              Ed25519→X25519 derivation, envelope encryption
    stego_bundle.py        Steganographic bundle embedding
    protocol.py            Shared federation types
    loader/
        __init__.py
        loader.py          Air-gap bridge application
        config.py          Loader configuration
        receipt.py         Receipt handling
        client.py          Federation server HTTP client
    server/
        __init__.py
        app.py             Federation server Flask app
        tree.py            CT-style Merkle tree
        storage.py         Tiered bundle storage
        gossip.py          Peer synchronization
        permissions.py     Access control
        config.py          Server configuration

~/.fwmetadata/
    chain/                 Local hash chain
        chain.bin          Append-only record log
        state.cbor         Chain state checkpoint
    exports/               Generated export bundles
    loader/                Loader state
        config.json        Server list and settings
        receipts/          Federation receipts
    federation/            Federation server data (when running as server)
        servers.json       Known peer servers
```