Federation Protocol Specification
Status: Design
Version: 1 (protocol version)
Last updated: 2026-04-01
1. Overview
The federation is a network of append-only log servers inspired by Certificate Transparency (RFC 6962). Each server acts as a "blind notary" — it stores encrypted attestation bundles, maintains a Merkle tree over them, and issues signed receipts proving when bundles were received. Servers gossip with peers to ensure consistency and replicate data.
Federation servers never decrypt attestation content. They operate at the "federation member" permission tier: they can verify chain summaries and signatures, but not read the underlying attestation data.
2. Terminology
| Term | Definition |
|---|---|
| Bundle | An encrypted export bundle (FIELDWITNESSX1 format) containing chain records |
| STH | Signed Tree Head — a server's signed commitment to its current Merkle tree state |
| Receipt | A server-signed proof that a bundle was included in its log at a specific time |
| Inclusion proof | Merkle path from a leaf (bundle hash) to the tree root |
| Consistency proof | Proof that an older tree is a prefix of a newer tree (no entries removed) |
| Gossip | Peer-to-peer exchange of STHs and entries to maintain consistency |
3. Server Merkle Tree
3.1 Structure
The server maintains a single append-only Merkle tree. Each leaf is the SHA-256 hash of a received bundle's raw bytes:
leaf[i] = SHA-256(0x00 || bundle_bytes[i])
Internal nodes follow standard Merkle tree construction:
node = SHA-256(0x01 || left || right) # internal node
leaf = SHA-256(0x00 || data) # leaf node (domain separation)
Domain separation prefixes (0x00 for leaves, 0x01 for internal nodes) prevent
second-preimage attacks, following CT convention (RFC 6962 §2.1).
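As a concrete sketch of these hashing rules (Python, `hashlib` only; the function names are illustrative, not part of the server's API):

```python
import hashlib

def leaf_hash(data: bytes) -> bytes:
    """Leaf hash with the 0x00 domain-separation prefix."""
    return hashlib.sha256(b"\x00" + data).digest()

def node_hash(left: bytes, right: bytes) -> bytes:
    """Internal-node hash with the 0x01 domain-separation prefix."""
    return hashlib.sha256(b"\x01" + left + right).digest()

def tree_root(leaves: list[bytes]) -> bytes:
    """Root over raw leaf data, splitting at the largest power of two
    strictly less than the leaf count (the MTH rule of RFC 6962 section 2.1)."""
    if len(leaves) == 1:
        return leaf_hash(leaves[0])
    k = 1
    while k * 2 < len(leaves):
        k *= 2
    return node_hash(tree_root(leaves[:k]), tree_root(leaves[k:]))
```

The split rule matters: it is what makes consistency proofs between old and new tree sizes possible.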
3.2 Signed Tree Head (STH)
After each append (or periodically in batch mode), the server computes and signs a new tree head:
{
0: tree_size, # uint — number of leaves
1: root_hash, # bytes[32] — Merkle tree root
2: timestamp, # int — Unix µs, server's clock
3: server_id, # text — server identifier (domain or pubkey fingerprint)
4: server_pubkey, # bytes[32] — Ed25519 public key
5: signature, # bytes[64] — Ed25519(cbor(fields 0-4))
}
The STH is the server's signed commitment: "My tree has N entries with this root at this time." Clients and peers can verify the signature and use consistency proofs to ensure the tree only grows (never shrinks or forks).
3.3 Inclusion Proof
Proves a specific bundle is at index i in a tree of size n:
proof = [(sibling_hash, direction), ...]
Verification:
current = SHA-256(0x00 || bundle_bytes)
for (sibling, direction) in proof:
if direction == "L":
current = SHA-256(0x01 || sibling || current)
else:
current = SHA-256(0x01 || current || sibling)
assert current == sth.root_hash
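A runnable version of this check, together with a matching proof generator (a sketch of the technique, not the server implementation):

```python
import hashlib

def _h(prefix: bytes, *parts: bytes) -> bytes:
    return hashlib.sha256(prefix + b"".join(parts)).digest()

def subtree_root(leaves: list[bytes]) -> bytes:
    """RFC 6962-style root over raw leaf data."""
    if len(leaves) == 1:
        return _h(b"\x00", leaves[0])
    k = 1
    while k * 2 < len(leaves):
        k *= 2
    return _h(b"\x01", subtree_root(leaves[:k]), subtree_root(leaves[k:]))

def inclusion_proof(leaves: list[bytes], index: int) -> list[tuple[bytes, str]]:
    """Merkle path for leaf `index`, leaf-to-root order.  The direction
    is the side the *sibling* sits on ("L" or "R")."""
    if len(leaves) == 1:
        return []
    k = 1
    while k * 2 < len(leaves):
        k *= 2
    if index < k:
        return inclusion_proof(leaves[:k], index) + [(subtree_root(leaves[k:]), "R")]
    return inclusion_proof(leaves[k:], index - k) + [(subtree_root(leaves[:k]), "L")]

def verify_inclusion(bundle_bytes: bytes, proof, root_hash: bytes) -> bool:
    """The verification loop from the spec, as executable Python."""
    current = _h(b"\x00", bundle_bytes)
    for sibling, direction in proof:
        if direction == "L":
            current = _h(b"\x01", sibling, current)
        else:
            current = _h(b"\x01", current, sibling)
    return current == root_hash
```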
3.4 Consistency Proof
Proves that tree of size m is a prefix of tree of size n (where m < n). This
guarantees the server hasn't removed or reordered entries.
The proof is a list of intermediate hashes that, combined with the old root, reconstruct the new root. Verification follows RFC 6962 §2.1.2.
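The general verification algorithm is the one in RFC 6962 §2.1.2. For the common special case where the old size m is a power of two, the old root is itself a node of the new tree (the root of the leftmost subtree), and the proof reduces to the chain of right-hand siblings on the path up to the new root. A minimal sketch under that assumption:

```python
import hashlib

def _node(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()

def verify_consistency_pow2(old_root: bytes, new_root: bytes,
                            proof: list[bytes]) -> bool:
    """Consistency check for the special case where the old tree size is a
    power of two.  The old root is then the leftmost subtree of the new
    tree, so on the way up it is always the left child; the proof lists the
    right-hand sibling hashes, bottom-up.  The general case (arbitrary m)
    is specified in RFC 6962 section 2.1.2."""
    current = old_root
    for sibling in proof:
        current = _node(current, sibling)  # old subtree is always on the left
    return current == new_root
```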
4. API Endpoints
All endpoints use CBOR for request/response bodies (Content-Type: application/cbor), except bundle submission, which accepts raw bundle bytes (§4.1).
4.1 Submit Bundle
POST /v1/submit
Request body: Raw bundle bytes (application/octet-stream)
Processing:
- Verify magic bytes `b"FIELDWITNESSX1"` and version
- Parse chain summary
- Verify `bundle_sig` against `signer_pubkey`
- Compute `bundle_hash = SHA-256(0x00 || bundle_bytes)`
- Check for duplicate (`bundle_hash` already in tree) — if duplicate, return existing receipt
- Append `bundle_hash` to Merkle tree
- Store bundle bytes (encrypted blob, as-is)
- Generate and sign receipt
Response (CBOR):
{
0: bundle_id, # bytes[16] — from chain summary
1: bundle_hash, # bytes[32] — leaf hash
2: tree_size, # uint — tree size after inclusion
3: tree_index, # uint — leaf index in tree
4: timestamp, # int — Unix µs, server's reception time
5: inclusion_proof, # array of bytes[32] — Merkle path
6: sth, # map — current STH (see §3.2)
7: server_id, # text — server identifier
8: server_pubkey, # bytes[32] — Ed25519 public key
9: receipt_sig, # bytes[64] — Ed25519(cbor(fields 0-8))
}
Auth: Federation member token required.
Errors:
- `400` — Invalid bundle format, bad signature
- `401` — Missing or invalid auth token
- `507` — Server storage full
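The dedup-and-append core of this flow (the hash, duplicate check, append, and store steps, minus parsing, signature verification, and receipt signing) can be sketched in memory:

```python
import hashlib

class Log:
    """In-memory append-only log: leaf hashes plus stored bundle bytes.
    A real server persists both and signs a receipt; this sketch shows
    only the hash / duplicate-check / append behavior."""

    def __init__(self) -> None:
        self.leaves: list[bytes] = []       # leaf hashes, in order
        self.index: dict[bytes, int] = {}   # leaf hash -> tree index
        self.blobs: dict[int, bytes] = {}   # tree index -> raw bundle

    def submit(self, bundle_bytes: bytes) -> tuple[int, int, bool]:
        """Returns (tree_index, tree_size, is_new)."""
        bundle_hash = hashlib.sha256(b"\x00" + bundle_bytes).digest()
        if bundle_hash in self.index:       # duplicate: return existing entry
            return self.index[bundle_hash], len(self.leaves), False
        idx = len(self.leaves)
        self.leaves.append(bundle_hash)     # append to the Merkle tree leaves
        self.index[bundle_hash] = idx
        self.blobs[idx] = bundle_bytes      # store encrypted blob as-is
        return idx, len(self.leaves), True
```

Resubmitting an identical bundle returns the existing position rather than growing the tree, which is what lets the server answer duplicates with the existing receipt.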
4.2 Get Signed Tree Head
GET /v1/sth
Response (CBOR): STH map (see §3.2)
Auth: Public (no auth required).
4.3 Get Consistency Proof
GET /v1/consistency-proof?old={m}&new={n}
Parameters:
- `old` — previous tree size (must be > 0)
- `new` — current tree size (must be >= old)
Response (CBOR):
{
0: old_size, # uint
1: new_size, # uint
2: proof, # array of bytes[32]
}
Auth: Public.
4.4 Get Inclusion Proof
GET /v1/inclusion-proof?hash={hex}&tree_size={n}
Parameters:
- `hash` — hex-encoded bundle hash (leaf hash)
- `tree_size` — tree size for the proof (use current STH `tree_size`)
Response (CBOR):
{
0: tree_index, # uint — leaf index
1: tree_size, # uint
2: proof, # array of bytes[32]
}
Auth: Public.
4.5 Get Entries
GET /v1/entries?start={s}&end={e}
Parameters:
- `start` — first tree index (inclusive)
- `end` — last tree index (inclusive)
- Maximum range: 1000 entries per request
Response (CBOR):
{
0: entries, # array of entry maps (see §4.5.1)
}
4.5.1 Entry Map
{
0: tree_index, # uint
1: bundle_hash, # bytes[32]
2: chain_summary, # CBOR map (from bundle, unencrypted)
3: encrypted_blob, # bytes — full FIELDWITNESSX1 bundle
4: receipt_ts, # int — Unix µs when received
}
Auth: Federation member token required.
4.6 Audit Summary
GET /v1/audit/summary?bundle_id={hex}
Returns the chain summary for a specific bundle without the encrypted payload.
Response (CBOR):
{
0: bundle_id, # bytes[16]
1: chain_summary, # CBOR map (from bundle)
2: tree_index, # uint
3: receipt_ts, # int
4: inclusion_proof, # array of bytes[32] (against current STH)
}
Auth: Public.
5. Permission Tiers
5.1 Public Auditor
Access: Unauthenticated.
Endpoints: /v1/sth, /v1/consistency-proof, /v1/inclusion-proof, /v1/audit/summary
Can verify:
- The log exists and has a specific size at a specific time
- A specific bundle is included in the log at a specific position
- The log has not been forked (consistency proofs between STHs)
- Chain summary metadata (record count, hash range) for any bundle
Cannot see: Encrypted content, chain IDs, signer identities, raw bundles.
5.2 Federation Member
Access: Bearer token issued by server operator. Tokens are Ed25519-signed credentials binding a public key to a set of permissions.
{
0: token_id, # bytes[16] — UUID v7
1: member_pubkey, # bytes[32] — member's Ed25519 public key
2: permissions, # array of text — ["submit", "entries", "gossip"]
3: issued_at, # int — Unix µs
4: expires_at, # int — Unix µs (0 = no expiry)
5: issuer_pubkey, # bytes[32] — server's Ed25519 public key
6: signature, # bytes[64] — Ed25519(cbor(fields 0-5))
}
Endpoints: All public endpoints + /v1/submit, /v1/entries, gossip endpoints.
Can see: Everything a public auditor sees + chain IDs, signer public keys, full encrypted bundles (but not decrypted content).
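The permission and expiry checks on a member token might look like the following sketch (integer-keyed map as in the spec; verification of the token signature in field 6 is omitted and must happen first in a real server):

```python
def token_allows(token: dict, permission: str, now_us: int) -> bool:
    """Check a member token's permission list and validity window.
    Field numbers follow the token map in the spec; the Ed25519
    signature check (field 6) is deliberately elided in this sketch."""
    PERMISSIONS, ISSUED_AT, EXPIRES_AT = 2, 3, 4
    if now_us < token[ISSUED_AT]:
        return False                        # not yet valid
    expires = token[EXPIRES_AT]
    if expires != 0 and now_us > expires:   # 0 means no expiry
        return False
    return permission in token[PERMISSIONS]
```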
5.3 Authorized Recipient
Not enforced server-side. Recipients hold Ed25519 private keys whose corresponding public keys appear in the bundle's recipients array. They can decrypt bundle content locally after retrieving the encrypted blob via the entries endpoint.
The server has no knowledge of who can or cannot decrypt a given bundle.
6. Gossip Protocol
6.1 Overview
Federation servers maintain a list of known peers. Periodically (default: every 5 minutes), each server initiates gossip with its peers to:
- Exchange STHs — detect if any peer has entries the local server doesn't
- Verify consistency — ensure no peer is presenting a forked log
- Sync entries — pull missing entries from peers that have them
6.2 Gossip Flow
Server A Server B
│ │
│── POST /v1/gossip/sth ──────────────>│ (A sends its STH)
│ │
│<── response: B's STH ───────────────│ (B responds with its STH)
│ │
│ (A compares tree sizes) │
│ if B.tree_size > A.tree_size: │
│ │
│── GET /v1/consistency-proof ────────>│ (verify B's tree extends A's)
│<── proof ────────────────────────────│
│ │
│ (verify consistency proof) │
│ │
│── GET /v1/entries?start=...&end=... >│ (pull missing entries)
│<── entries ──────────────────────────│
│ │
│ (append entries to local tree) │
│ (recompute STH) │
│ │
6.3 Gossip Endpoints
POST /v1/gossip/sth
Request body (CBOR): Sender's current STH.
Response (CBOR): Receiver's current STH.
Auth: Federation member token with "gossip" permission.
6.4 Fork Detection
If server A receives an STH from server B where:
- `B.tree_size <= A.tree_size` but `B.root_hash != A.root_hash` at the same tree size
Then B is presenting a different history. This is a fork — a critical security event. The server should:
- Log the fork with both STHs as evidence
- Alert the operator
- Continue serving its own tree (do not merge the forked tree)
- Refuse to gossip further with the forked peer until operator resolution
6.5 Convergence
Under normal operation (no forks), servers converge to identical trees. The convergence time depends on gossip interval and network topology. With a 5-minute interval and full mesh topology among N servers, convergence after a new entry takes at most 5 minutes.
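A toy simulation of the pull side of the gossip flow, using in-memory entry lists (the consistency-proof check a real server performs before appending is elided):

```python
import hashlib

def leaf_hash(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()

def root(leaves: list[bytes]) -> bytes:
    """RFC 6962-style root over already-hashed leaves."""
    if len(leaves) == 1:
        return leaves[0]
    k = 1
    while k * 2 < len(leaves):
        k *= 2
    return hashlib.sha256(b"\x01" + root(leaves[:k]) + root(leaves[k:])).digest()

def gossip_once(local: list[bytes], peer: list[bytes]) -> None:
    """One A->B round: if the peer's tree is larger, pull the missing
    entries and append.  Because the tree is append-only, pulling the
    suffix is enough for the roots to converge."""
    if len(peer) > len(local):
        local.extend(peer[len(local):])
```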
7. Receipts
7.1 Purpose
A receipt is the federation's proof that a bundle was received and included in the log at a specific time. It is the critical artifact that closes the timestamp gap: the offline device's claimed timestamp + the federation receipt = practical proof of timing.
7.2 Receipt Format
{
0: bundle_id, # bytes[16] — from chain summary
1: bundle_hash, # bytes[32] — leaf hash in server's tree
2: tree_size, # uint — tree size at inclusion
3: tree_index, # uint — leaf position
4: timestamp, # int — Unix µs, server's clock
5: inclusion_proof, # array of bytes[32] — Merkle path
6: sth, # map — STH at time of inclusion
7: server_id, # text — server identifier
8: server_pubkey, # bytes[32] — Ed25519 public key
9: receipt_sig, # bytes[64] — Ed25519(cbor(fields 0-8))
}
7.3 Receipt Verification
To verify a receipt:
1. `Ed25519_Verify(server_pubkey, receipt_sig, cbor(fields 0-8))` — receipt is authentic
2. Verify `inclusion_proof` against `sth.root_hash` with `bundle_hash` at `tree_index`
3. Verify `sth.signature` — the STH itself is authentic
4. `sth.tree_size >= tree_size` — STH covers the inclusion
5. `sth.timestamp >= timestamp` — STH is at or after receipt time
7.4 Receipt Lifecycle
1. Loader submits bundle to federation server
2. Server issues receipt in submit response
3. Loader stores receipt locally (receipts/ directory)
4. Loader exports receipts to USB (CBOR file)
5. Offline device imports receipts
6. Receipt is stored alongside chain records as proof of federation timestamp
7.5 Multi-Server Receipts
A bundle submitted to N servers produces N independent receipts. Each receipt is from a different server with a different timestamp and Merkle tree position. Multiple receipts strengthen the timestamp claim — an adversary would need to compromise all N servers to suppress evidence.
8. Storage Tiers
Federation servers manage bundle storage across three tiers based on age:
8.1 Hot Tier (0-30 days)
- Format: Individual files, one per bundle
- Location: `data/hot/{tree_index}.bundle`
- Access: Direct file read, O(1)
- Purpose: Fast access for recent entries, active gossip sync
8.2 Warm Tier (30-365 days)
- Format: Zstd-compressed segments, 1000 bundles per segment
- Location: `data/warm/segment-{start}-{end}.zst`
- Access: Decompress segment, extract entry
- Compression: Zstd level 3 (fast compression, moderate ratio)
- Purpose: Reduced storage for medium-term retention
8.3 Cold Tier (>1 year)
- Format: Zstd-compressed segments, maximum compression
- Location: `data/cold/segment-{start}-{end}.zst`
- Access: Decompress segment, extract entry
- Compression: Zstd level 19 (slow compression, best ratio)
- Purpose: Archival storage, rarely accessed
8.4 Tier Promotion
A background compaction process runs periodically (default: every 24 hours):
- Identify hot entries older than 30 days
- Group into segments of 1000
- Compress and write to warm tier
- Delete hot files
- Repeat for warm → cold at 365 days
8.5 Merkle Tree Preservation
The Merkle tree is independent of storage tiers. Leaf hashes and the tree structure are maintained in a separate data structure (compact tree format, stored in SQLite or flat file). Moving bundles between storage tiers does not affect the tree.
Inclusion proofs and consistency proofs remain valid across tier promotions — they reference the tree, not the storage location.
8.6 Metadata Database
SQLite database tracking all bundles:
CREATE TABLE bundles (
tree_index INTEGER PRIMARY KEY,
bundle_id BLOB NOT NULL, -- UUID v7
bundle_hash BLOB NOT NULL, -- leaf hash
chain_id BLOB NOT NULL, -- source chain ID
signer_pubkey BLOB NOT NULL, -- Ed25519 public key
record_count INTEGER NOT NULL, -- records in bundle
range_start INTEGER NOT NULL, -- first chain index
range_end INTEGER NOT NULL, -- last chain index
receipt_ts INTEGER NOT NULL, -- Unix µs reception time
storage_tier TEXT NOT NULL DEFAULT 'hot', -- 'hot', 'warm', 'cold'
storage_key TEXT NOT NULL, -- file path or segment reference
created_at TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now'))
);
CREATE INDEX idx_bundles_bundle_id ON bundles(bundle_id);
CREATE INDEX idx_bundles_chain_id ON bundles(chain_id);
CREATE INDEX idx_bundles_bundle_hash ON bundles(bundle_hash);
CREATE INDEX idx_bundles_receipt_ts ON bundles(receipt_ts);
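The schema can be exercised with Python's built-in sqlite3 module. The sketch below creates the table, then marks hot rows older than a cutoff as warm; the segment key and helper name are illustrative, and a real compaction job would also write the compressed segment and delete the hot files:

```python
import sqlite3

SCHEMA = """
CREATE TABLE bundles (
    tree_index    INTEGER PRIMARY KEY,
    bundle_id     BLOB NOT NULL,
    bundle_hash   BLOB NOT NULL,
    chain_id      BLOB NOT NULL,
    signer_pubkey BLOB NOT NULL,
    record_count  INTEGER NOT NULL,
    range_start   INTEGER NOT NULL,
    range_end     INTEGER NOT NULL,
    receipt_ts    INTEGER NOT NULL,
    storage_tier  TEXT NOT NULL DEFAULT 'hot',
    storage_key   TEXT NOT NULL,
    created_at    TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now'))
);
CREATE INDEX idx_bundles_chain_id ON bundles(chain_id);
"""

def demote_to_warm(db: sqlite3.Connection, cutoff_us: int, segment_key: str) -> int:
    """Retag hot bundles received before `cutoff_us` as warm, pointing
    storage_key at the (hypothetical) compressed segment.  Returns the
    number of rows moved."""
    cur = db.execute(
        "UPDATE bundles SET storage_tier = 'warm', storage_key = ? "
        "WHERE storage_tier = 'hot' AND receipt_ts < ?",
        (segment_key, cutoff_us),
    )
    return cur.rowcount
```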
9. Server Configuration
{
"server_id": "my-server.example.org",
"host": "0.0.0.0",
"port": 8443,
"data_dir": "/var/lib/fieldwitness-federation",
"identity_key_path": "/etc/fieldwitness-federation/identity/private.pem",
"peers": [
{
"url": "https://peer1.example.org:8443",
"pubkey_hex": "abc123...",
"name": "Peer One"
}
],
"gossip_interval_seconds": 300,
"hot_retention_days": 30,
"warm_retention_days": 365,
"compaction_interval_hours": 24,
"max_bundle_size_bytes": 10485760,
"max_entries_per_request": 1000,
"member_tokens": [
{
"name": "loader-1",
"pubkey_hex": "def456...",
"permissions": ["submit", "entries"]
}
]
}
10. Error Codes
| HTTP Status | CBOR Error Code | Description |
|---|---|---|
| 400 | "invalid_bundle" | Bundle format invalid or signature verification failed |
| 400 | "invalid_range" | Requested entry range is invalid |
| 401 | "unauthorized" | Missing or invalid auth token |
| 403 | "forbidden" | Token lacks required permission |
| 404 | "not_found" | Bundle or entry not found |
| 409 | "duplicate" | Bundle already in log (returns existing receipt) |
| 413 | "bundle_too_large" | Bundle exceeds max_bundle_size_bytes |
| 507 | "storage_full" | Server cannot accept new entries |
Error response format:
{
0: error_code, # text
1: message, # text — human-readable description
2: details, # map — optional additional context
}
11. Security Considerations
11.1 Server Compromise
A compromised server can:
- Read bundle metadata (chain IDs, signer pubkeys, timestamps) — expected at member tier
- Withhold entries from gossip — detectable: other servers will see inconsistent tree sizes
- Present a forked tree — detectable: consistency proofs will fail
- Issue false receipts — detectable: receipt's inclusion proof won't verify against other servers' STHs
A compromised server cannot:
- Read attestation content (encrypted with recipient keys)
- Forge attestation signatures (requires Ed25519 private key)
- Modify bundle contents (GCM authentication would fail)
- Reorder or remove entries from other servers' trees
11.2 Transport Security
All server-to-server and client-to-server communication should use TLS 1.3. The federation protocol provides its own authentication (Ed25519 signatures on STHs and receipts), but TLS prevents network-level attacks.
11.3 Clock Reliability
Federation server clocks should be synchronized via NTP. Receipt timestamps are only as reliable as the server's clock. Deploying servers across multiple time zones and operators provides cross-checks — wildly divergent receipt timestamps for the same bundle indicate clock problems or compromise.