fieldwitness/docs/architecture/federation.md
Aaron D. Lee 490f9d4a1d Rebrand SooSeF to FieldWitness
2026-04-02 15:05:13 -04:00


Federated Attestation System — Architecture Overview

Status: Design
Version: 0.1.0-draft
Last updated: 2026-04-01

1. Problem Statement

FieldWitness operates offline-first: devices create Ed25519-signed attestations without network access. This creates two fundamental challenges:

  1. Timestamp credibility — An offline device's clock is untrusted. An adversary with physical access could backdate or postdate attestations.
  2. Distribution — Attestations trapped on a single device are vulnerable to seizure, destruction, or loss. They must be replicated to survive.

The system must solve both problems while preserving the offline-first constraint and protecting content confidentiality even from the distribution infrastructure.

2. Threat Model

Adversaries

| Adversary | Capabilities | Goal |
|---|---|---|
| State actor | Physical device access, network surveillance, legal compulsion of server operators | Suppress or discredit attestations |
| Insider threat | Access to one federation server | Fork the log, selectively omit entries, or read content |
| Device thief | Physical access to offline device | Fabricate or backdate attestations |

Security Properties

| Property | Guarantee | Mechanism |
|---|---|---|
| Integrity | Attestations cannot be modified after creation | Ed25519 signatures + hash chain |
| Ordering | Attestations cannot be reordered or inserted | Hash chain (each record includes hash of previous) |
| Existence proof | Attestation existed at or before time T | Federation receipt (server-signed timestamp) |
| Confidentiality | Content is hidden from infrastructure | Envelope encryption; servers store encrypted blobs |
| Fork detection | Log tampering is detectable | Merkle tree + consistency proofs (CT model) |
| Availability | Attestations survive device loss | Replication across federation servers via gossip |

Non-Goals

  • Proving exact creation time — Impossible without a trusted time source. We prove ordering (hash chain) and existence-before (federation receipt). The gap between claimed time and receipt time is the trust window.
  • Anonymity of attestors — Federation members can see signer public keys. For anonymity, use a dedicated identity per context.
  • Preventing denial of service — Federation servers are assumed cooperative. Byzantine fault tolerance is out of scope for v1.

3. System Architecture

                    OFFLINE DEVICE                        AIR GAP                INTERNET
                    ──────────────                        ───────                ────────
                    
 ┌──────────┐     ┌──────────────┐     ┌──────────┐     USB/SD     ┌──────────┐     ┌────────────┐
 │  Attest  │────>│  Hash Chain  │────>│  Export  │───────────────>│  Loader  │────>│ Federation │
 │  Record  │     │  (Layer 1)   │     │  Bundle  │                │  (App)   │     │  Server    │
 └──────────┘     └──────────────┘     │ (Layer 2)│                └────┬─────┘     └─────┬──────┘
                                       └──────────┘                     │                  │
                                            │                           │    ┌─────────┐   │
                                       Optional:                        │<───│ Receipt │<──┘
                                       DCT embed                        │    └─────────┘
                                       in JPEG                          │
                                            │                      USB carry-back
                                       ┌────v─────┐                     │
                                       │ Stego    │                ┌────v─────┐
                                       │ Image    │                │ Offline  │
                                       └──────────┘                │ Device   │
                                                                   │ (receipt │
                                                                   │  stored) │
                                                                   └──────────┘

Layer 1: Hash-Chained Attestation Records (Local)

Each attestation is wrapped in a chain record that includes:

  • A hash of the previous record (tamper-evident ordering)
  • Entropy witnesses (system uptime, kernel state) that make timestamp fabrication expensive
  • An Ed25519 signature over the entire record

The chain lives on the offline device at ~/.fwmetadata/chain/. It wraps existing Attest attestation records — the Attest record's bytes become the content_hash input.

See: chain-format.md
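
The linking rule can be illustrated with a minimal sketch. It is not the format defined in chain-format.md: the `ChainRecord` fields, the all-zero `GENESIS` value, and the JSON encoding (standing in for canonical CBOR) are assumptions made for illustration, and entropy witnesses and the Ed25519 signature are left out to keep the example dependency-free.

```python
import hashlib
import json
from dataclasses import dataclass

# Assumed placeholder "previous hash" for the first record in a chain.
GENESIS = b"\x00" * 32


@dataclass(frozen=True)
class ChainRecord:
    index: int
    prev_hash: bytes     # digest of the previous record (GENESIS for index 0)
    content_hash: bytes  # SHA-256 of the wrapped attestation bytes
    claimed_ts: float    # device clock reading, untrusted

    def encode(self) -> bytes:
        # Sorted-key JSON stands in for canonical CBOR in this sketch.
        return json.dumps(
            {
                "index": self.index,
                "prev_hash": self.prev_hash.hex(),
                "content_hash": self.content_hash.hex(),
                "claimed_ts": self.claimed_ts,
            },
            sort_keys=True,
            separators=(",", ":"),
        ).encode()

    def digest(self) -> bytes:
        return hashlib.sha256(self.encode()).digest()


def append(chain: list[ChainRecord], attestation: bytes, claimed_ts: float) -> ChainRecord:
    """Wrap attestation bytes in a record linked to the current chain head."""
    prev = chain[-1].digest() if chain else GENESIS
    record = ChainRecord(len(chain), prev, hashlib.sha256(attestation).digest(), claimed_ts)
    chain.append(record)
    return record


def verify(chain: list[ChainRecord]) -> bool:
    """Check that every record points at the digest of its predecessor."""
    expected_prev = GENESIS
    for i, record in enumerate(chain):
        if record.index != i or record.prev_hash != expected_prev:
            return False
        expected_prev = record.digest()
    return True
```

Linking alone cannot detect a modified final record, because nothing follows it. That gap is what the per-record signature closes.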

Layer 2: Encrypted Export Bundles

A range of chain records is packaged into a portable bundle:

  1. Records serialized as CBOR, compressed with zstd
  2. Encrypted with AES-256-GCM using a random data encryption key (DEK)
  3. DEK wrapped per-recipient via X25519 ECDH (derived from Ed25519 identities)
  4. An unencrypted chain_summary (record count, hash range, Merkle root, signature) allows auditing without decryption

Bundles can optionally be embedded in JPEG images via stego's DCT steganography, making them indistinguishable from normal photos on a USB stick.

See: export-bundle.md
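
Steps 2 and 3 follow the usual envelope-encryption pattern, sketched below with the `cryptography` package. The bundle layout, the HKDF `info` label, and the use of a fresh ephemeral X25519 key per recipient are assumptions for illustration, not the format in export-bundle.md. CBOR serialization, zstd compression, and the Ed25519-to-X25519 derivation are omitted, so recipient keys are generated directly as X25519 keys.

```python
import os

from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric.x25519 import (
    X25519PrivateKey,
    X25519PublicKey,
)
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.hazmat.primitives.kdf.hkdf import HKDF
from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat


def _kek(shared_secret: bytes) -> bytes:
    """Derive a 256-bit key-encryption key from an ECDH shared secret."""
    return HKDF(
        algorithm=hashes.SHA256(),
        length=32,
        salt=None,
        info=b"example-dek-wrap",  # assumed label, not from the spec
    ).derive(shared_secret)


def seal(payload: bytes, recipients: list[X25519PublicKey]) -> dict:
    """Encrypt payload once under a random DEK, then wrap the DEK per recipient."""
    dek = AESGCM.generate_key(bit_length=256)
    nonce = os.urandom(12)
    ciphertext = AESGCM(dek).encrypt(nonce, payload, None)
    slots = []
    for recipient in recipients:
        ephemeral = X25519PrivateKey.generate()
        kek = _kek(ephemeral.exchange(recipient))
        wrap_nonce = os.urandom(12)
        slots.append(
            {
                "ephemeral_pub": ephemeral.public_key().public_bytes(
                    Encoding.Raw, PublicFormat.Raw
                ),
                "nonce": wrap_nonce,
                "wrapped_dek": AESGCM(kek).encrypt(wrap_nonce, dek, None),
            }
        )
    return {"nonce": nonce, "ciphertext": ciphertext, "recipients": slots}


def open_bundle(bundle: dict, private_key: X25519PrivateKey) -> bytes:
    """Try each recipient slot until one unwraps the DEK with this key."""
    for slot in bundle["recipients"]:
        ephemeral_pub = X25519PublicKey.from_public_bytes(slot["ephemeral_pub"])
        kek = _kek(private_key.exchange(ephemeral_pub))
        try:
            dek = AESGCM(kek).decrypt(slot["nonce"], slot["wrapped_dek"], None)
        except InvalidTag:
            continue  # slot was wrapped for a different recipient
        return AESGCM(dek).decrypt(bundle["nonce"], bundle["ciphertext"], None)
    raise ValueError("no recipient slot matches this key")
```

Because the payload is encrypted only once, adding a recipient costs one extra wrapped DEK rather than a second copy of the ciphertext.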

Layer 3: Federated Append-Only Log

Federation servers are "blind notaries" inspired by Certificate Transparency (RFC 6962):

  • They receive encrypted bundles, verify the chain summary signature, and append to a Merkle tree
  • They issue signed receipts with a federation timestamp (proof of existence)
  • They gossip Signed Tree Heads (STH) with peers to ensure consistency
  • They never decrypt content — they operate at the "federation member" permission tier

See: federation-protocol.md
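
For orientation, the RFC 6962 Merkle tree hash and inclusion (audit) proofs can be sketched as follows. This is a recursive, unoptimized illustration of the hashing rules in the RFC (0x00 prefix for leaves, 0x01 for interior nodes). The function names are hypothetical, and the sketch does not describe how `merkle.py` or `tree.py` are implemented.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def _split(n: int) -> int:
    """Largest power of two strictly less than n (requires n >= 2)."""
    k = 1
    while k * 2 < n:
        k *= 2
    return k


def merkle_root(entries: list[bytes]) -> bytes:
    if not entries:
        return hashlib.sha256(b"").digest()
    if len(entries) == 1:
        return leaf_hash(entries[0])
    k = _split(len(entries))
    return node_hash(merkle_root(entries[:k]), merkle_root(entries[k:]))


def inclusion_proof(index: int, entries: list[bytes]) -> list[bytes]:
    """Sibling hashes for entries[index], ordered from leaf level to root."""
    if len(entries) <= 1:
        return []
    k = _split(len(entries))
    if index < k:
        return inclusion_proof(index, entries[:k]) + [merkle_root(entries[k:])]
    return inclusion_proof(index - k, entries[k:]) + [merkle_root(entries[:k])]


def root_from_proof(entry: bytes, index: int, tree_size: int, proof: list[bytes]) -> bytes:
    """Recompute the root implied by a proof; compare it with a signed tree head."""

    def walk(idx: int, n: int, path: list[bytes]) -> bytes:
        if n == 1:
            if path:
                raise ValueError("proof is too long")
            return leaf_hash(entry)
        if not path:
            raise ValueError("proof is too short")
        k = _split(n)
        sibling = path[-1]  # the last element belongs to the topmost split
        if idx < k:
            return node_hash(walk(idx, k, path[:-1]), sibling)
        return node_hash(sibling, walk(idx - k, n - k, path[:-1]))

    return walk(index, tree_size, proof)
```

An auditor who holds only an entry, its index, the tree size, and the proof can recompute the root and compare it with the root in a Signed Tree Head, without seeing any other entry.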

The Loader (Air-Gap Bridge)

A separate application that runs on an internet-connected machine:

  1. Receives a bundle (from USB) or extracts one from a steganographic image
  2. Validates the bundle signature and chain summary
  3. Pushes to configured federation servers
  4. Collects signed receipts
  5. Receipts are carried back to the offline device on the next USB round-trip

The loader never needs signing keys — bundles are already signed. It is a transport mechanism.
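
A rough sketch of that transport role, with a retry queue for unreachable servers, might look like this. The `Loader` class shape, the `PushFn` signature, and the use of `ConnectionError` to signal an unreachable server are all assumptions for illustration; the actual loader and HTTP client interfaces are not specified here.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical push interface: takes bundle bytes, returns receipt bytes,
# and raises ConnectionError when the server cannot be reached.
PushFn = Callable[[bytes], bytes]


@dataclass
class Loader:
    servers: dict[str, PushFn]
    validate: Callable[[bytes], bool]  # signature and chain-summary check
    pending: list[tuple[str, bytes]] = field(default_factory=list)
    receipts: list[tuple[str, bytes]] = field(default_factory=list)

    def submit(self, bundle: bytes) -> None:
        """Validate a bundle, queue it for every server, and attempt delivery."""
        if not self.validate(bundle):
            raise ValueError("bundle failed validation")
        for name in self.servers:
            self.pending.append((name, bundle))
        self.flush()

    def flush(self) -> None:
        """Push queued bundles; keep the ones whose server is unreachable."""
        still_pending = []
        for name, bundle in self.pending:
            try:
                self.receipts.append((name, self.servers[name](bundle)))
            except ConnectionError:
                still_pending.append((name, bundle))
        self.pending = still_pending
```

Nothing in this sketch touches a signing key. The loader only forwards bytes that were signed on the offline device and stores the receipts for the carry-back trip.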

4. Key Domains

FieldWitness defines three cryptographic domains:

| Domain | Algorithm | Purpose | Key Location |
|---|---|---|---|
| Signing | Ed25519 | Attestation signatures, chain records, bundle summaries | ~/.fwmetadata/identity/ |
| Encryption | X25519 + AES-256-GCM | Bundle payload encryption (envelope) | Derived from Ed25519 via birational map |
| Steganography | AES-256-GCM (from factors) | Stego channel encryption | ~/.fwmetadata/stego/channel.key |

The signing and encryption domains share a key lineage (Ed25519 → X25519 derivation) but serve different purposes. The steganography domain remains fully independent — it protects the stego carrier, not the attestation content.

Private Key Storage Policy

The Ed25519 private key is stored unencrypted on disk (protected by 0o600 file permissions). This is a deliberate design decision:

  • The killswitch (secure deletion) is the primary defense for at-risk users, not key encryption. A password-protected key would require prompting on every attestation and chain operation, which is unworkable in field conditions.
  • The password parameter on generate_identity() exists for interoperability but is not used by default. Chain operations (_wrap_in_chain, backfill) assume unencrypted keys.
  • If the device is seized, the killswitch destroys key material. If the killswitch is not triggered, the adversary has physical access and can defeat key encryption via cold boot, memory forensics, or compelled disclosure.
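
As an illustration of the 0o600 policy on a POSIX system, a key file can be created with owner-only permissions from the start, so there is no window in which it is readable by other users. The helper names and the `private.key` filename used in the usage note are hypothetical; this is a sketch, not the behaviour of `generate_identity()`.

```python
import os
import stat
from pathlib import Path


def write_private_key(path: Path, key_bytes: bytes) -> None:
    """Create the key file with mode 0o600; refuse to overwrite an existing key."""
    path.parent.mkdir(parents=True, exist_ok=True, mode=0o700)
    # O_EXCL makes creation fail if the file exists. The mode is applied
    # at creation time, so the file is never briefly world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(key_bytes)


def permissions_ok(path: Path) -> bool:
    """True if neither group nor other has any access to the file."""
    mode = stat.S_IMODE(path.stat().st_mode)
    return mode & 0o077 == 0
```

A caller might run `permissions_ok(Path.home() / ".fwmetadata" / "identity" / "private.key")` at startup and refuse to sign if the check fails.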

5. Permission Model

Three tiers control who can see what:

| Tier | Sees | Can Verify | Typical Actor |
|---|---|---|---|
| Public auditor | Chain summaries, Merkle proofs, receipt timestamps | Existence, ordering, no forks | Anyone with server URL |
| Federation member | + chain IDs, signer public keys | + who attested, chain continuity | Peer servers, authorized monitors |
| Authorized recipient | + decrypted attestation content | Everything | Designated individuals with DEK access |

Federation servers themselves operate at the federation member tier. They can verify chain integrity and detect forks, but they cannot read attestation content.
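
The cumulative nature of the tiers (each tier sees everything below it, plus its own additions) can be sketched with an ordered enum. The field names are hypothetical stand-ins for the categories in the table. In the design above, the top tier is enforced by encryption (DEK access) rather than by server-side filtering, so `redact` models only what a server could expose.

```python
from enum import IntEnum


class Tier(IntEnum):
    PUBLIC_AUDITOR = 1
    FEDERATION_MEMBER = 2
    AUTHORIZED_RECIPIENT = 3


# Fields that first become visible at each tier (names are illustrative).
FIELDS_BY_TIER = {
    Tier.PUBLIC_AUDITOR: {"chain_summary", "merkle_proof", "receipt_timestamp"},
    Tier.FEDERATION_MEMBER: {"chain_id", "signer_pubkey"},
    Tier.AUTHORIZED_RECIPIENT: {"content"},
}


def visible_fields(tier: Tier) -> set[str]:
    """Union of the fields granted at this tier and every tier below it."""
    visible: set[str] = set()
    for level, fields in FIELDS_BY_TIER.items():
        if level <= tier:
            visible |= fields
    return visible


def redact(entry: dict, tier: Tier) -> dict:
    """Drop every field the given tier is not allowed to see."""
    allowed = visible_fields(tier)
    return {key: value for key, value in entry.items() if key in allowed}
```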

6. Timestamp Credibility

Since the device is offline, claimed timestamps are inherently untrusted. The system provides layered evidence:

  1. Hash chain ordering — Records are provably ordered. Even if timestamps are wrong, the sequence is authentic.
  2. Entropy witnesses — Each record includes system uptime, kernel entropy pool state, and boot ID. Fabricating a convincing set of witnesses for a backdated record requires simulating the full system state at the claimed time.
  3. Federation receipt — When the bundle reaches a federation server, the server signs a receipt with its own clock. This proves the chain existed at or before the receipt time.
  4. Cross-device corroboration — If two independent devices attest overlapping events, their independent chains corroborate each other's timeline.
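
To make item 2 concrete, the sketch below collects a few witnesses on Linux and applies one plausibility check between consecutive records. The `/proc` paths are standard Linux interfaces, but the witness field names, the fallback to `None` on other platforms, and the tolerance value are assumptions. The sketch is not the behaviour of `entropy.py`.

```python
import time
from pathlib import Path
from typing import Optional


def _read(path: str) -> Optional[str]:
    """Read a small text file, returning None where it does not exist."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def collect_witnesses() -> dict:
    """Snapshot system state alongside the (untrusted) wall-clock time."""
    uptime_raw = _read("/proc/uptime")
    entropy_raw = _read("/proc/sys/kernel/random/entropy_avail")
    return {
        "claimed_ts": time.time(),
        "uptime_s": float(uptime_raw.split()[0]) if uptime_raw else None,
        "boot_id": _read("/proc/sys/kernel/random/boot_id"),
        "entropy_avail": int(entropy_raw) if entropy_raw else None,
    }


def plausible_successor(prev: dict, cur: dict, tolerance_s: float = 5.0) -> bool:
    """Within one boot, wall-clock and uptime must advance by the same amount."""
    if prev["boot_id"] is None or prev["boot_id"] != cur["boot_id"]:
        return True  # different or unknown boot: uptimes are not comparable
    if prev["uptime_s"] is None or cur["uptime_s"] is None:
        return True
    d_uptime = cur["uptime_s"] - prev["uptime_s"]
    d_clock = cur["claimed_ts"] - prev["claimed_ts"]
    return d_uptime >= 0 and abs(d_clock - d_uptime) <= tolerance_s
```

A record whose claimed time jumps backwards while uptime keeps increasing within the same boot fails this check, so a forger has to fabricate the witnesses as well as the timestamp.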

The trust window is: receipt_timestamp - claimed_timestamp. A smaller window means more credible timestamps. Frequent USB sync trips shrink the window.
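
The trust-window arithmetic is simple enough to state directly. In this sketch the function names and the `max_skew` allowance are illustrative assumptions. A negative window means the device claims a time later than the server's receipt, which is treated as inconsistent beyond a small clock-skew allowance.

```python
from datetime import datetime, timedelta, timezone


def trust_window(claimed: datetime, receipt: datetime) -> timedelta:
    """receipt_timestamp - claimed_timestamp, for timezone-aware datetimes."""
    if claimed.tzinfo is None or receipt.tzinfo is None:
        raise ValueError("timestamps must be timezone-aware")
    return receipt - claimed


def is_consistent(window: timedelta, max_skew: timedelta = timedelta(minutes=5)) -> bool:
    """A claimed time after the receipt time (beyond skew) cannot be right."""
    return window >= -max_skew


claimed = datetime(2026, 4, 1, 10, 0, tzinfo=timezone.utc)
receipt = datetime(2026, 4, 3, 10, 0, tzinfo=timezone.utc)
window = trust_window(claimed, receipt)  # two days between claim and receipt
```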

7. Data Lifecycle

1. CREATE    Offline device: attest file → sign → append to chain
2. EXPORT    Offline device: select range → compress → encrypt → bundle (→ optional stego embed)
3. TRANSFER  Physical media: USB/SD card carries bundle across air gap
4. LOAD      Internet machine: validate bundle → push to federation
5. RECEIPT   Federation server: verify → append to Merkle tree → sign receipt
6. RETURN    Physical media: receipt carried back to offline device
7. REPLICATE Federation: servers gossip entries and STHs to each other
8. AUDIT     Anyone: verify Merkle proofs, check consistency across servers

8. Failure Modes

| Failure | Impact | Mitigation |
|---|---|---|
| Device lost/seized | Local chain lost | Bundles already sent to federation survive; regular exports reduce data loss window |
| USB intercepted | Adversary gets encrypted bundle (or stego image) | Encryption protects content; stego hides existence |
| Federation server compromised | Adversary reads metadata (chain IDs, pubkeys, timestamps) | Content remains encrypted; other servers detect log fork via consistency proofs |
| All federation servers down | No new receipts | Bundles queue on loader; chain integrity unaffected; retry when servers recover |
| Clock manipulation on device | Timestamps unreliable | Entropy witnesses increase fabrication cost; federation receipt provides external anchor |
Clock manipulation on device Timestamps unreliable Entropy witnesses increase fabrication cost; federation receipt provides external anchor

9. Dependencies

| Package | Version | Purpose |
|---|---|---|
| cbor2 | >=5.6.0 | Canonical CBOR serialization (RFC 8949) |
| uuid-utils | >=0.9.0 | UUID v7 generation (time-ordered) |
| zstandard | >=0.22.0 | Zstd compression for bundles and storage tiers |
| cryptography | >=41.0.0 | Ed25519, X25519, AES-256-GCM, HKDF, SHA-256 (already a dependency) |

10. File Layout

src/fieldwitness/federation/
  __init__.py
  models.py              Chain record and state dataclasses
  serialization.py       CBOR canonical encoding
  entropy.py             System entropy collection
  chain.py               ChainStore — local append-only chain
  merkle.py              Merkle tree implementation
  export.py              Export bundle creation/parsing/decryption
  x25519.py              Ed25519→X25519 derivation, envelope encryption
  stego_bundle.py        Steganographic bundle embedding
  protocol.py            Shared federation types
  loader/
    __init__.py
    loader.py            Air-gap bridge application
    config.py            Loader configuration
    receipt.py           Receipt handling
    client.py            Federation server HTTP client
  server/
    __init__.py
    app.py               Federation server Flask app
    tree.py              CT-style Merkle tree
    storage.py           Tiered bundle storage
    gossip.py            Peer synchronization
    permissions.py       Access control
    config.py            Server configuration

~/.fwmetadata/
  chain/                 Local hash chain
    chain.bin              Append-only record log
    state.cbor             Chain state checkpoint
  exports/               Generated export bundles
  loader/                Loader state
    config.json            Server list and settings
    receipts/              Federation receipts
  federation/            Federation server data (when running as server)
    servers.json           Known peer servers