Comprehensive documentation for v0.2.0 release
README.md (700 lines):
- Three-tier deployment model with ASCII diagram
- Federation blueprint in web UI routes
- deploy/ directory in architecture tree
- Documentation index linking all guides

CLAUDE.md (256 lines):
- Updated architecture tree with all new docs and deploy files

New guides:
- docs/federation.md (317 lines) — gossip protocol mechanics, peer setup, trust filtering, offline bundles, relay deployment, jurisdiction
- docs/evidence-guide.md (283 lines) — evidence packages, cold archives, selective disclosure, chain anchoring, legal discovery workflow
- docs/source-dropbox.md (220 lines) — token management, client-side hashing, extract-then-strip pipeline, receipt mechanics, opsec
- docs/index.md — documentation hub linking all guides

Training materials:
- docs/training/reporter-quickstart.md (105 lines) — printable one-page card: boot USB, attest photo, encode message, check-in, emergency
- docs/training/emergency-card.md (79 lines) — wallet-sized laminated card: three destruction methods, 10-step order, key contacts
- docs/training/admin-reference.md (219 lines) — deployment tiers, CLI tables, backup checklist, hardening checklist, troubleshooting

Also includes existing architecture docs from the original repos.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
252
docs/architecture/chain-format.md
Normal file
@@ -0,0 +1,252 @@
|
||||
# Chain Format Specification
|
||||
|
||||
**Status**: Design
|
||||
**Version**: 1 (record format version)
|
||||
**Last updated**: 2026-04-01
|
||||
|
||||
## 1. Overview
|
||||
|
||||
The attestation chain is an append-only sequence of signed records stored locally on the
|
||||
offline device. Each record includes a hash of the previous record, forming a tamper-evident
|
||||
chain analogous to git commits or blockchain blocks.
|
||||
|
||||
The chain wraps existing Verisoo attestation records. A Verisoo record's serialized bytes
|
||||
become the input to `content_hash`, preserving the original attestation while adding
|
||||
ordering, entropy witnesses, and chain integrity guarantees.
|
||||
|
||||
## 2. AttestationChainRecord
|
||||
|
||||
### Field Definitions
|
||||
|
||||
| Field | CBOR Key | Type | Size | Description |
|---|---|---|---|---|
| `version` | 0 | unsigned int | 1 byte | Record format version. Currently `1`. |
| `record_id` | 1 | byte string | 16 bytes | UUID v7 (RFC 9562). Time-ordered unique identifier. |
| `chain_index` | 2 | unsigned int | 8 bytes max | Monotonically increasing, 0-based. Genesis record is index 0. |
| `prev_hash` | 3 | byte string | 32 bytes | SHA-256 of `canonical_bytes(previous_record)`. Genesis: `0x00 * 32`. |
| `content_hash` | 4 | byte string | 32 bytes | SHA-256 of the wrapped content (e.g., Verisoo record bytes). |
| `content_type` | 5 | text string | variable | MIME-like type identifier. `"verisoo/attestation-v1"` for Verisoo records. |
| `metadata` | 6 | CBOR map | variable | Extensible key-value map. See §2.1. |
| `claimed_ts` | 7 | integer | 8 bytes max | Unix timestamp in microseconds (µs). Signed integer to handle pre-epoch dates. |
| `entropy_witnesses` | 8 | CBOR map | variable | System entropy snapshot. See §3. |
| `signer_pubkey` | 9 | byte string | 32 bytes | Ed25519 raw public key bytes. |
| `signature` | 10 | byte string | 64 bytes | Ed25519 signature over `canonical_bytes(record)` excluding the signature field. |
|
||||
|
||||
### 2.1 Metadata Map
|
||||
|
||||
The `metadata` field is an open CBOR map with text string keys. Defined keys:
|
||||
|
||||
| Key | Type | Description |
|---|---|---|
| `"backfilled"` | bool | `true` if this record was created by the backfill migration |
| `"caption"` | text | Human-readable description of the attested content |
| `"location"` | text | Location name associated with the attestation |
| `"original_ts"` | integer | Original Verisoo timestamp (µs) if different from `claimed_ts` |
| `"tags"` | array of text | User-defined classification tags |
|
||||
|
||||
Applications may add custom keys. Unknown keys must be preserved during serialization.
|
||||
|
||||
## 3. Entropy Witnesses
|
||||
|
||||
Entropy witnesses are system-state snapshots collected at record creation time. They serve
|
||||
as soft evidence that the claimed timestamp is plausible. Fabricating convincing witnesses
|
||||
for a backdated record requires simulating the full system state at the claimed time.
|
||||
|
||||
| Field | CBOR Key | Type | Source | Fallback (non-Linux) |
|---|---|---|---|---|
| `sys_uptime` | 0 | float64 | `time.monotonic()` | Same (cross-platform) |
| `fs_snapshot` | 1 | byte string (16 bytes) | SHA-256 of `os.stat()` on chain DB, truncated to 16 bytes | SHA-256 of chain dir stat |
| `proc_entropy` | 2 | unsigned int | `/proc/sys/kernel/random/entropy_avail` | `len(os.urandom(32))` (always 32, marker for non-Linux) |
| `boot_id` | 3 | text string | `/proc/sys/kernel/random/boot_id` | `uuid.uuid4()` cached per process lifetime |
|
||||
|
||||
### Witness Properties
|
||||
|
||||
- **sys_uptime**: Monotonically increasing within a boot; it cannot decrease while
  `boot_id` is unchanged. A record with `sys_uptime < previous_record.sys_uptime` and
  `claimed_ts > previous_record.claimed_ts` is suspicious: if `boot_id` also changed,
  a reboot explains it; if `boot_id` is the same, it points to clock or record manipulation.
- **fs_snapshot**: Changes with every write to the chain DB. Hash includes mtime, ctime,
  size, and inode number.
- **proc_entropy**: Varies naturally. On Linux, reflects kernel entropy pool state. On
  recent kernels (5.18 and later) the reported value is effectively constant at 256, so
  it carries little signal there.
- **boot_id**: Changes on every reboot. Identical `boot_id` across records implies same
  boot session; combined with `sys_uptime`, this constrains the timeline.
|
||||
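The collection step described above can be sketched with the standard library alone. This is an illustration under assumptions, not the project's implementation: the function name `collect_entropy_witnesses` and the exact byte layout fed into the `fs_snapshot` hash are hypothetical, since the spec only says which `os.stat()` fields are included.

```python
import hashlib
import os
import time
import uuid

# Fallback boot_id for non-Linux hosts, cached for the process lifetime.
_FALLBACK_BOOT_ID = str(uuid.uuid4())


def _read_text(path):
    """Return the stripped contents of a small text file, or None if unreadable."""
    try:
        with open(path, "r", encoding="ascii") as fh:
            return fh.read().strip()
    except (OSError, ValueError):
        return None


def collect_entropy_witnesses(chain_db_path):
    """Build the witness map keyed by the CBOR keys 0-3 from the table above."""
    st = os.stat(chain_db_path)
    # Assumption: the stat fields are joined as decimal text before hashing.
    stat_repr = f"{st.st_mtime_ns}:{st.st_ctime_ns}:{st.st_size}:{st.st_ino}"
    fs_snapshot = hashlib.sha256(stat_repr.encode("ascii")).digest()[:16]

    raw = _read_text("/proc/sys/kernel/random/entropy_avail")
    # Non-Linux marker value: len(os.urandom(32)) is always 32.
    proc_entropy = int(raw) if raw and raw.isdigit() else len(os.urandom(32))

    boot_id = _read_text("/proc/sys/kernel/random/boot_id") or _FALLBACK_BOOT_ID

    return {
        0: time.monotonic(),  # sys_uptime
        1: fs_snapshot,       # fs_snapshot (16 bytes)
        2: proc_entropy,      # proc_entropy
        3: boot_id,           # boot_id
    }
```

The returned dictionary uses the integer CBOR keys directly, so it could be passed as the `entropy_witnesses` value when a record is encoded.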
|
||||
## 4. Serialization
|
||||
|
||||
### 4.1 Canonical Bytes
|
||||
|
||||
`canonical_bytes(record)` produces the deterministic byte representation used for hashing
|
||||
and signing. It is a CBOR map containing all fields **except** `signature`, encoded using
|
||||
CBOR canonical encoding (RFC 8949 §4.2):
|
||||
|
||||
- Map keys sorted by integer value (0, 1, 2, ..., 9)
|
||||
- Integers use minimal-length encoding
|
||||
- No indefinite-length items
|
||||
- No duplicate keys
|
||||
|
||||
```
canonical_bytes(record) = cbor2.dumps({
    0: record.version,
    1: record.record_id,
    2: record.chain_index,
    3: record.prev_hash,
    4: record.content_hash,
    5: record.content_type,
    6: record.metadata,
    7: record.claimed_ts,
    8: {
        0: record.entropy_witnesses.sys_uptime,
        1: record.entropy_witnesses.fs_snapshot,
        2: record.entropy_witnesses.proc_entropy,
        3: record.entropy_witnesses.boot_id,
    },
    9: record.signer_pubkey,
}, canonical=True)
```
|
||||
|
||||
### 4.2 Record Hash
|
||||
|
||||
```
|
||||
compute_record_hash(record) = SHA-256(canonical_bytes(record))
|
||||
```
|
||||
|
||||
This hash is used as `prev_hash` in the next record and as Merkle tree leaves in export
|
||||
bundles.
|
||||
|
||||
### 4.3 Signature
|
||||
|
||||
```
|
||||
record.signature = Ed25519_Sign(private_key, canonical_bytes(record))
|
||||
```
|
||||
|
||||
Verification:
|
||||
```
|
||||
Ed25519_Verify(record.signer_pubkey, record.signature, canonical_bytes(record))
|
||||
```
|
||||
|
||||
### 4.4 Full Serialization
|
||||
|
||||
`serialize_record(record)` produces the full CBOR encoding including the signature field
|
||||
(CBOR key 10). This is used for storage and transmission.
|
||||
|
||||
```
serialize_record(record) = cbor2.dumps({
    0: record.version,
    1: record.record_id,
    ...
    9: record.signer_pubkey,
    10: record.signature,
}, canonical=True)
```
|
||||
|
||||
## 5. Chain Rules
|
||||
|
||||
### 5.1 Genesis Record
|
||||
|
||||
The first record in a chain (index 0) has:
|
||||
- `chain_index = 0`
|
||||
- `prev_hash = b'\x00' * 32` (32 zero bytes)
|
||||
|
||||
The **chain ID** is defined as `SHA-256(canonical_bytes(genesis_record))`. This permanently
|
||||
identifies the chain.
|
||||
|
||||
### 5.2 Append Rule
|
||||
|
||||
For record N (where N > 0):
|
||||
```
record_N.chain_index == record_{N-1}.chain_index + 1
record_N.prev_hash   == compute_record_hash(record_{N-1})
record_N.claimed_ts  >= record_{N-1}.claimed_ts   (SHOULD, not MUST — clock skew possible)
```
|
||||
|
||||
### 5.3 Verification
|
||||
|
||||
Full chain verification checks, for each record from index 0 to head:
1. `Ed25519_Verify(record.signer_pubkey, record.signature, canonical_bytes(record))` — signature valid
2. `record.chain_index == expected_index` — no gaps or duplicates
3. `record.prev_hash == compute_record_hash(previous_record)` — chain link intact (for the genesis record, `prev_hash` must be 32 zero bytes)
4. All `signer_pubkey` values are identical within a chain (single-signer chain)

Violation of rule 4 indicates a chain was signed by multiple identities, which may be
legitimate (key rotation) or malicious (chain hijacking). Key rotation is out of scope for
v1; implementations should flag this as a warning.
|
||||
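The four checks can be sketched as a single pass over the records. This is a simplified illustration, not the project's verifier: the `Rec` dataclass, `make_record`, and the toy `canonical` byte layout are hypothetical stand-ins for the CBOR encoding in §4.1, and signature verification is injected as a callable because Ed25519 is not in the Python standard library.

```python
import hashlib
from dataclasses import dataclass

ZERO_HASH = b"\x00" * 32


@dataclass
class Rec:
    chain_index: int
    prev_hash: bytes
    signer_pubkey: bytes
    canonical: bytes       # stand-in for canonical_bytes(record)
    signature: bytes = b""


def record_hash(rec):
    """compute_record_hash: SHA-256 over the canonical bytes."""
    return hashlib.sha256(rec.canonical).digest()


def make_record(index, prev, pubkey, payload):
    """Build a linked toy record; the byte layout here is NOT the real CBOR form."""
    prev_hash = ZERO_HASH if prev is None else record_hash(prev)
    canonical = index.to_bytes(8, "big") + prev_hash + pubkey + payload
    return Rec(index, prev_hash, pubkey, canonical)


def verify_chain(records, verify_sig):
    """Return a list of problems; an empty list means all four checks passed."""
    problems = []
    prev = None
    for expected_index, rec in enumerate(records):
        if not verify_sig(rec.signer_pubkey, rec.signature, rec.canonical):
            problems.append(f"record {expected_index}: bad signature")
        if rec.chain_index != expected_index:
            problems.append(f"record {expected_index}: unexpected chain_index")
        expected_prev = ZERO_HASH if prev is None else record_hash(prev)
        if rec.prev_hash != expected_prev:
            problems.append(f"record {expected_index}: prev_hash mismatch")
        if prev is not None and rec.signer_pubkey != prev.signer_pubkey:
            problems.append(f"record {expected_index}: signer changed (warning)")
        prev = rec
    return problems
```

Collecting problems instead of raising on the first one lets a caller treat a signer change as a warning while still rejecting broken links.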
|
||||
## 6. Storage Format
|
||||
|
||||
### 6.1 chain.bin (Append-Only Log)
|
||||
|
||||
Records are stored sequentially as length-prefixed CBOR:
|
||||
|
||||
```
┌─────────────────────────────┐
│ uint32 BE: record_0 length  │
│ bytes: serialize(record_0)  │
├─────────────────────────────┤
│ uint32 BE: record_1 length  │
│ bytes: serialize(record_1)  │
├─────────────────────────────┤
│ ...                         │
└─────────────────────────────┘
```

- Length prefix is 4 bytes, big-endian unsigned 32-bit integer
- Maximum record size: 2^32 − 1 bytes, just under 4 GiB (practical limit much smaller)
- File is append-only; records are never modified or deleted
- File locking via `fcntl.flock(LOCK_EX)` for single-writer safety
|
||||
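A minimal sketch of the length-prefixed framing follows. The helper names `append_record` and `iter_records` are hypothetical, and the sketch works on any seekable binary file object; the `fcntl.flock` locking mentioned above is left out so the example stays portable.

```python
import io
import struct

LEN_PREFIX = struct.Struct(">I")  # 4-byte big-endian unsigned length


def append_record(log, record_bytes):
    """Append one serialized record with its length prefix to the end of the log."""
    if len(record_bytes) > 0xFFFFFFFF:
        raise ValueError("record too large for a uint32 length prefix")
    log.seek(0, io.SEEK_END)
    log.write(LEN_PREFIX.pack(len(record_bytes)))
    log.write(record_bytes)


def iter_records(log):
    """Yield each record body in order, raising on a truncated tail."""
    log.seek(0)
    while True:
        prefix = log.read(LEN_PREFIX.size)
        if not prefix:
            return  # clean end of file
        if len(prefix) < LEN_PREFIX.size:
            raise ValueError("truncated length prefix")
        (length,) = LEN_PREFIX.unpack(prefix)
        body = log.read(length)
        if len(body) < length:
            raise ValueError("truncated record body")
        yield body
```

Scanning with `iter_records` is also what a rebuild of `state.cbor` would rely on, since every record boundary is recoverable from the prefixes alone.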
|
||||
### 6.2 state.cbor (Chain State Checkpoint)
|
||||
|
||||
A single CBOR map, atomically rewritten after each append:
|
||||
|
||||
```cbor
{
    "chain_id": bytes[32],       # SHA-256(canonical_bytes(genesis))
    "head_index": uint,          # Index of the most recent record
    "head_hash": bytes[32],      # Hash of the most recent record
    "record_count": uint,        # Total records in chain
    "created_at": int,           # Unix µs when chain was created
    "last_append_at": int        # Unix µs of last append
}
```
|
||||
|
||||
This file is a performance optimization — the canonical state is always derivable from
|
||||
`chain.bin`. On corruption, `state.cbor` is rebuilt by scanning the log.
|
||||
|
||||
### 6.3 File Locations
|
||||
|
||||
```
|
||||
~/.soosef/chain/
|
||||
chain.bin Append-only record log
|
||||
state.cbor Chain state checkpoint
|
||||
```
|
||||
|
||||
Paths are defined in `src/soosef/paths.py`.
|
||||
|
||||
## 7. Migration from Verisoo-Only Attestations
|
||||
|
||||
Existing Verisoo attestations in `~/.soosef/attestations/` are not modified. The chain
|
||||
is a parallel structure. Migration is performed by the `soosef chain backfill` command:
|
||||
|
||||
1. Iterate all records in Verisoo's `LocalStorage` (ordered by timestamp)
2. For each record, compute `content_hash = SHA-256(record.to_bytes())`
3. Create a chain record with:
   - `content_type = "verisoo/attestation-v1"`
   - `claimed_ts` set to the original Verisoo timestamp
   - `metadata = {"backfilled": true, "original_ts": <verisoo_timestamp>}`
   - Entropy witnesses collected at migration time (not original time)
4. Append to chain
|
||||
|
||||
Backfilled records are distinguishable via the `backfilled` metadata flag. Their entropy
|
||||
witnesses reflect migration time, not original attestation time — this is honest and
|
||||
intentional.
|
||||
|
||||
## 8. Content Types
|
||||
|
||||
The `content_type` field identifies what was hashed into `content_hash`. Defined types:
|
||||
|
||||
| Content Type | Description |
|---|---|
| `verisoo/attestation-v1` | Verisoo `AttestationRecord` serialized bytes |
| `soosef/raw-file-v1` | Raw file bytes (for non-image attestations, future) |
| `soosef/metadata-only-v1` | No file content; metadata-only attestation (future) |
|
||||
|
||||
New content types may be added without changing the record format version.
|
||||
319
docs/architecture/export-bundle.md
Normal file
@@ -0,0 +1,319 @@
|
||||
# Export Bundle Specification
|
||||
|
||||
**Status**: Design
|
||||
**Version**: 1 (bundle format version)
|
||||
**Last updated**: 2026-04-01
|
||||
|
||||
## 1. Overview
|
||||
|
||||
An export bundle packages a contiguous range of chain records into a portable, encrypted
|
||||
file suitable for transfer across an air gap. The bundle format is designed so that:
|
||||
|
||||
- **Auditors** can verify chain integrity without decrypting content
|
||||
- **Recipients** with the correct key can decrypt and read attestation records
|
||||
- **Anyone** can detect tampering via Merkle root and signature verification
|
||||
- **Steganographic embedding** is optional — bundles can be hidden in JPEG images via DCT
|
||||
|
||||
The format follows the pattern established by `keystore/export.py` (SOOBNDL): magic bytes,
|
||||
version, structured binary payload.
|
||||
|
||||
## 2. Binary Layout
|
||||
|
||||
```
Offset   Size       Field
──────   ─────────  ──────────────────────────────────────
0        8          magic: b"SOOSEFX1"
8        1          version: uint8 (1)
9        4          summary_len: uint32 BE
13       var        chain_summary: CBOR (see §3)
var      4          recipients_len: uint32 BE
var      var        recipients: CBOR array (see §4)
var      12         nonce: AES-256-GCM nonce
var      var        ciphertext: AES-256-GCM(zstd(CBOR(records)))
last 16  16         tag: AES-256-GCM authentication tag
```
|
||||
|
||||
All multi-byte integers are big-endian. The total bundle size is:
|
||||
`9 + 4 + summary_len + 4 + recipients_len + 12 + ciphertext_len + 16`
|
||||
|
||||
### Parsing Without Decryption
|
||||
|
||||
To audit a bundle without decryption, read:
|
||||
1. Magic (8 bytes) — verify `b"SOOSEFX1"`
|
||||
2. Version (1 byte) — verify `1`
|
||||
3. Summary length (4 bytes BE) — read the next N bytes as CBOR
|
||||
4. Chain summary — verify signature, inspect metadata
|
||||
|
||||
The encrypted payload and recipient list can be skipped for audit purposes.
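The audit read path can be sketched with fixed offsets from the layout in §2. The function name `split_bundle_header` is hypothetical, and decoding the returned summary bytes (CBOR) and checking `bundle_sig` are deliberately left to the caller.

```python
import struct

MAGIC = b"SOOSEFX1"
HEADER_LEN = 13  # 8 magic + 1 version + 4 summary_len


def split_bundle_header(bundle):
    """Return (version, summary_bytes, offset_after_summary) without decrypting."""
    if len(bundle) < HEADER_LEN:
        raise ValueError("truncated bundle header")
    if bundle[:8] != MAGIC:
        raise ValueError("not a SooSeF export bundle")
    version = bundle[8]
    if version != 1:
        raise ValueError("unsupported bundle version")
    (summary_len,) = struct.unpack_from(">I", bundle, 9)
    summary_end = HEADER_LEN + summary_len
    if summary_end > len(bundle):
        raise ValueError("truncated chain summary")
    return version, bundle[HEADER_LEN:summary_end], summary_end
```

The returned offset points at `recipients_len`, so an auditor can stop there while a recipient continues parsing from the same position.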
|
||||
|
||||
## 3. Chain Summary
|
||||
|
||||
The chain summary sits **outside** the encryption envelope. It provides verifiable metadata
|
||||
about the bundle contents without revealing the actual attestation data.
|
||||
|
||||
CBOR map with integer keys:
|
||||
|
||||
| CBOR Key | Field | Type | Description |
|---|---|---|---|
| 0 | `bundle_id` | byte string (16) | UUID v7, unique bundle identifier |
| 1 | `chain_id` | byte string (32) | SHA-256(genesis record) — identifies source chain |
| 2 | `range_start` | unsigned int | First record index (inclusive) |
| 3 | `range_end` | unsigned int | Last record index (inclusive) |
| 4 | `record_count` | unsigned int | Number of records in bundle |
| 5 | `first_hash` | byte string (32) | `compute_record_hash(first_record)` |
| 6 | `last_hash` | byte string (32) | `compute_record_hash(last_record)` |
| 7 | `merkle_root` | byte string (32) | Root of Merkle tree over record hashes (see §5) |
| 8 | `created_ts` | integer | Bundle creation timestamp (Unix µs) |
| 9 | `signer_pubkey` | byte string (32) | Ed25519 public key of bundle creator |
| 10 | `bundle_sig` | byte string (64) | Ed25519 signature (see §3.1) |
|
||||
|
||||
### 3.1 Signature Computation
|
||||
|
||||
The signature covers all summary fields except `bundle_sig` itself:
|
||||
|
||||
```
summary_bytes = cbor2.dumps({
    0: bundle_id,
    1: chain_id,
    2: range_start,
    3: range_end,
    4: record_count,
    5: first_hash,
    6: last_hash,
    7: merkle_root,
    8: created_ts,
    9: signer_pubkey,
}, canonical=True)

bundle_sig = Ed25519_Sign(private_key, summary_bytes)
```
|
||||
|
||||
### 3.2 Verification Without Decryption
|
||||
|
||||
An auditor verifies a bundle by:
|
||||
1. Parse chain summary
|
||||
2. `Ed25519_Verify(signer_pubkey, bundle_sig, summary_bytes)` — authentic summary
|
||||
3. `record_count == range_end - range_start + 1` — count matches range
|
||||
4. If previous bundles from the same `chain_id` exist, verify `first_hash` matches
|
||||
the expected continuation
|
||||
|
||||
The auditor now knows: "A chain with ID X contains records [start, end], the creator
|
||||
signed this claim, and the Merkle root commits to specific record contents." All without
|
||||
decrypting.
|
||||
|
||||
## 4. Envelope Encryption
|
||||
|
||||
### 4.1 Key Derivation
|
||||
|
||||
Ed25519 signing keys are converted to X25519 Diffie-Hellman keys for encryption:
|
||||
|
||||
```
|
||||
x25519_private = Ed25519_to_X25519_Private(ed25519_private_key)
|
||||
x25519_public = Ed25519_to_X25519_Public(ed25519_public_key_bytes)
|
||||
```
|
||||
|
||||
This uses the birational map between Ed25519 and X25519 curves, supported natively by
|
||||
the `cryptography` library.
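The public-key half of the map is short enough to write out. The sketch below is plain Python using the formula `u = (1 + y) / (1 - y) mod p` from RFC 7748; the function name `ed25519_pub_to_x25519_pub` is hypothetical, and the code does not check that the input encodes a valid curve point, which a production implementation should.

```python
P = 2**255 - 19  # field prime shared by Ed25519 and X25519


def ed25519_pub_to_x25519_pub(ed_pub):
    """Map a 32-byte Ed25519 public key to its 32-byte X25519 public key."""
    if len(ed_pub) != 32:
        raise ValueError("Ed25519 public keys are 32 bytes")
    # The encoding is little-endian y with the sign of x in the top bit.
    y = int.from_bytes(ed_pub, "little") & ((1 << 255) - 1)
    if y >= P:
        raise ValueError("non-canonical y coordinate")
    if y == 1:
        raise ValueError("point has no Montgomery u coordinate")
    u = (1 + y) * pow((1 - y) % P, -1, P) % P  # modular inverse needs Python 3.8+
    return u.to_bytes(32, "little")
```

As a sanity check, the Ed25519 base point has `y = 4/5` and maps to the X25519 base point `u = 9`.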
|
||||
|
||||
### 4.2 DEK Generation
|
||||
|
||||
A random 32-byte data encryption key (DEK) is generated per bundle:
|
||||
|
||||
```
|
||||
dek = os.urandom(32) # AES-256 key
|
||||
```
|
||||
|
||||
### 4.3 DEK Wrapping (Per Recipient)
|
||||
|
||||
For each recipient, the DEK is wrapped using X25519 ECDH + HKDF + AES-256-GCM:
|
||||
|
||||
```
1. shared_secret = X25519_ECDH(sender_x25519_private, recipient_x25519_public)
2. derived_key = HKDF-SHA256(
       ikm=shared_secret,
       salt=bundle_id,              # binds to this specific bundle
       info=b"soosef-dek-wrap-v1",
       length=32
   )
3. wrapped_dek = AES-256-GCM_Encrypt(
       key=derived_key,
       nonce=os.urandom(12),
       plaintext=dek,
       aad=bundle_id                # additional authenticated data
   )
```
|
||||
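Step 2 can be illustrated with the standard library, since HKDF (RFC 5869) is built from HMAC. This is a sketch only: the helper names `hkdf_sha256` and `derive_wrap_key` are hypothetical, the ECDH step that produces `shared_secret` and the AES-256-GCM wrap are not shown, and a real implementation would more likely use a vetted HKDF from a crypto library.

```python
import hashlib
import hmac


def hkdf_sha256(ikm, salt, info, length=32):
    """HKDF extract-then-expand with SHA-256, following RFC 5869."""
    hash_len = hashlib.sha256().digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output too long")
    if not salt:
        salt = b"\x00" * hash_len  # RFC 5869 default salt
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()  # extract
    okm, block, counter = b"", b"", 1
    while len(okm) < length:  # expand
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_wrap_key(shared_secret, bundle_id):
    """Derive the 32-byte DEK-wrapping key bound to one bundle."""
    return hkdf_sha256(shared_secret, salt=bundle_id,
                       info=b"soosef-dek-wrap-v1", length=32)
```

Because `bundle_id` is the salt, the same sender and recipient pair derives a different wrapping key for every bundle.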
|
||||
### 4.4 Recipients Array
|
||||
|
||||
CBOR array of recipient entries:
|
||||
|
||||
```cbor
[
    {
        0: recipient_pubkey,   # byte string (32) — Ed25519 public key
        1: wrap_nonce,         # byte string (12) — AES-GCM nonce for DEK wrap
        2: wrapped_dek,        # byte string (48) — encrypted DEK (32) + GCM tag (16)
    },
    ...
]
```
|
||||
|
||||
### 4.5 Payload Encryption
|
||||
|
||||
```
1. records_cbor = cbor2.dumps([serialize_record(r) for r in records], canonical=True)
2. compressed = zstd.compress(records_cbor, level=3)
3. nonce = os.urandom(12)
4. ciphertext, tag = AES-256-GCM_Encrypt(
       key=dek,
       nonce=nonce,
       plaintext=compressed,
       aad=summary_bytes    # binds ciphertext to this summary
   )
```
|
||||
|
||||
The `summary_bytes` (same bytes that are signed) are used as additional authenticated
|
||||
data (AAD). This cryptographically binds the encrypted payload to the chain summary —
|
||||
modifying the summary invalidates the decryption.
|
||||
|
||||
### 4.6 Decryption
|
||||
|
||||
A recipient decrypts a bundle:
|
||||
|
||||
```
1. Parse chain summary, verify bundle_sig
2. Find own pubkey in recipients array
3. shared_secret = X25519_ECDH(recipient_x25519_private, sender_x25519_public)
   (sender_x25519_public derived from summary.signer_pubkey)
4. derived_key = HKDF-SHA256(shared_secret, salt=bundle_id, info=b"soosef-dek-wrap-v1", length=32)
5. dek = AES-256-GCM_Decrypt(derived_key, wrap_nonce, wrapped_dek, aad=bundle_id)
6. compressed = AES-256-GCM_Decrypt(dek, nonce, ciphertext, aad=summary_bytes)
7. records_cbor = zstd.decompress(compressed)
8. records = [deserialize_record(r) for r in cbor2.loads(records_cbor)]
9. Verify each record's signature and chain linkage
```
|
||||
|
||||
## 5. Merkle Tree
|
||||
|
||||
The Merkle tree provides compact proofs that specific records are included in a bundle.
|
||||
|
||||
### 5.1 Construction
|
||||
|
||||
Leaves are the record hashes in chain order:
|
||||
|
||||
```
|
||||
leaf[i] = compute_record_hash(records[i])
|
||||
```
|
||||
|
||||
Internal nodes:
|
||||
|
||||
```
|
||||
node = SHA-256(left_child || right_child)
|
||||
```
|
||||
|
||||
At any level with an odd number of nodes, the last node is promoted unchanged to the
next level. This occurs whenever the number of leaves is not a power of 2; no padding
leaves are added.
|
||||
|
||||
### 5.2 Inclusion Proof
|
||||
|
||||
An inclusion proof for record at index `i` is a list of `(sibling_hash, direction)` pairs
|
||||
from the leaf to the root. Verification:
|
||||
|
||||
```
current = leaf[i]
for (sibling, direction) in proof:
    if direction == "L":
        current = SHA-256(sibling || current)
    else:
        current = SHA-256(current || sibling)
assert current == merkle_root
```
|
||||
|
||||
### 5.3 Usage
|
||||
|
||||
- **Export bundles**: `merkle_root` in chain summary commits to exact record contents
|
||||
- **Federation servers**: Build a separate Merkle tree over bundle hashes (see federation-protocol.md)
|
||||
|
||||
These are two different trees:
|
||||
1. **Record tree** (this section) — leaves are record hashes within a bundle
|
||||
2. **Bundle tree** (federation) — leaves are bundle hashes across the federation log
|
||||
|
||||
## 6. Steganographic Embedding
|
||||
|
||||
Bundles can optionally be embedded in JPEG images using stegasoo's DCT steganography:
|
||||
|
||||
```
1. bundle_bytes = create_export_bundle(chain, start, end, private_key, recipients)
2. stego_image = stegasoo.encode(
       carrier=carrier_image,
       reference=reference_image,
       file_data=bundle_bytes,
       passphrase=passphrase,
       embed_mode="dct",
       channel_key=channel_key  # optional
   )
```

Extraction:
```
1. result = stegasoo.decode(
       carrier=stego_image,
       reference=reference_image,
       passphrase=passphrase,
       channel_key=channel_key
   )
2. bundle_bytes = result.file_data
3. assert bundle_bytes[:8] == b"SOOSEFX1"
```
|
||||
|
||||
### 6.1 Capacity Considerations
|
||||
|
||||
DCT steganography has limited capacity relative to the carrier image size. Approximate
|
||||
capacities:
|
||||
|
||||
| Carrier Size | Approximate DCT Capacity | Records (est.) |
|---|---|---|
| 1 MP (1024x1024) | ~10 KB | ~20-40 records |
| 4 MP (2048x2048) | ~40 KB | ~80-160 records |
| 12 MP (4000x3000) | ~100 KB | ~200-400 records |
|
||||
|
||||
Record size varies (~200-500 bytes each after CBOR serialization, before compression).
|
||||
Zstd compression typically achieves 2-4x ratio on CBOR attestation data. Use
|
||||
`check_capacity()` before embedding.
|
||||
|
||||
### 6.2 Multiple Images
|
||||
|
||||
For large export ranges, split across multiple bundles embedded in multiple carrier images.
|
||||
Each bundle is self-contained with its own chain summary. The receiving side imports them
|
||||
in any order — the chain indices and hashes enable reassembly.
|
||||
|
||||
## 7. Recipient Management
|
||||
|
||||
### 7.1 Adding Recipients
|
||||
|
||||
Recipients are identified by their Ed25519 public keys. To encrypt a bundle for a
|
||||
recipient, the creator needs only their public key (no shared secret setup required).
|
||||
|
||||
### 7.2 Recipient Discovery
|
||||
|
||||
Recipients' Ed25519 public keys can be obtained via:
|
||||
- Direct exchange (QR code, USB transfer, verbal fingerprint verification)
|
||||
- Federation server identity registry (when available)
|
||||
- Verisoo's existing `peers.json` file
|
||||
|
||||
### 7.3 Self-Encryption
|
||||
|
||||
The bundle creator should always include their own public key in the recipients list.
|
||||
This allows them to decrypt their own exports (e.g., when restoring from backup).
|
||||
|
||||
## 8. Error Handling
|
||||
|
||||
| Error | Cause | Response |
|---|---|---|
| Bad magic | Not a SOOSEFX1 bundle | Reject with `ExportError("not a SooSeF export bundle")` |
| Bad version | Unsupported format version | Reject with `ExportError("unsupported bundle version")` |
| Signature invalid | Tampered summary or wrong signer | Reject with `ExportError("bundle signature verification failed")` |
| No matching recipient | Decryptor's key not in recipients list | Reject with `ExportError("not an authorized recipient")` |
| GCM auth failure | Tampered ciphertext or wrong key | Reject with `ExportError("decryption failed — bundle may be corrupted")` |
| Decompression failure | Corrupted compressed data | Reject with `ExportError("decompression failed")` |
| Chain integrity failure | Records don't link correctly | Reject with `ChainIntegrityError(...)` after decryption |
|
||||
565
docs/architecture/federation-protocol.md
Normal file
@@ -0,0 +1,565 @@
|
||||
# Federation Protocol Specification
|
||||
|
||||
**Status**: Design
|
||||
**Version**: 1 (protocol version)
|
||||
**Last updated**: 2026-04-01
|
||||
|
||||
## 1. Overview
|
||||
|
||||
The federation is a network of append-only log servers inspired by Certificate Transparency
|
||||
(RFC 6962). Each server acts as a "blind notary" — it stores encrypted attestation bundles,
|
||||
maintains a Merkle tree over them, and issues signed receipts proving when bundles were
|
||||
received. Servers gossip with peers to ensure consistency and replicate data.
|
||||
|
||||
Federation servers never decrypt attestation content. They operate at the "federation
|
||||
member" permission tier: they can verify chain summaries and signatures, but not read
|
||||
the underlying attestation data.
|
||||
|
||||
## 2. Terminology
|
||||
|
||||
| Term | Definition |
|---|---|
| **Bundle** | An encrypted export bundle (SOOSEFX1 format) containing chain records |
| **STH** | Signed Tree Head — a server's signed commitment to its current Merkle tree state |
| **Receipt** | A server-signed proof that a bundle was included in its log at a specific time |
| **Inclusion proof** | Merkle path from a leaf (bundle hash) to the tree root |
| **Consistency proof** | Proof that an older tree is a prefix of a newer tree (no entries removed) |
| **Gossip** | Peer-to-peer exchange of STHs and entries to maintain consistency |
|
||||
|
||||
## 3. Server Merkle Tree
|
||||
|
||||
### 3.1 Structure
|
||||
|
||||
The server maintains a single append-only Merkle tree. Each leaf is the domain-separated
SHA-256 hash of a received bundle's raw bytes, matching the `bundle_hash` computed in §4.1:

```
leaf[i] = SHA-256(0x00 || bundle_bytes[i])
```

Internal nodes follow standard Merkle tree construction:
```
node = SHA-256(0x01 || left || right)   # internal node
leaf = SHA-256(0x00 || data)            # leaf node (domain separation)
```
|
||||
|
||||
Domain separation prefixes (`0x00` for leaves, `0x01` for internal nodes) prevent
|
||||
second-preimage attacks, following CT convention (RFC 6962 §2.1).
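A sketch of the root computation with these prefixes is shown below. It assumes the tree shape defined in RFC 6962 §2.1 (split at the largest power of two strictly smaller than the leaf count, empty tree hashing to `SHA-256("")`), which this spec cites but does not restate; the function names are hypothetical.

```python
import hashlib


def leaf_hash(bundle_bytes):
    """Leaf: SHA-256(0x00 || bundle_bytes)."""
    return hashlib.sha256(b"\x00" + bundle_bytes).digest()


def node_hash(left, right):
    """Internal node: SHA-256(0x01 || left || right)."""
    return hashlib.sha256(b"\x01" + left + right).digest()


def tree_root(bundles):
    """Merkle tree hash over a list of raw bundles, RFC 6962 style."""
    n = len(bundles)
    if n == 0:
        return hashlib.sha256(b"").digest()
    if n == 1:
        return leaf_hash(bundles[0])
    k = 1 << ((n - 1).bit_length() - 1)  # largest power of two strictly less than n
    return node_hash(tree_root(bundles[:k]), tree_root(bundles[k:]))
```

The recursion recomputes every hash on each call, which is fine for illustration; a server would keep the tree incrementally instead.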
|
||||
|
||||
### 3.2 Signed Tree Head (STH)
|
||||
|
||||
After each append (or periodically in batch mode), the server computes and signs a new
|
||||
tree head:
|
||||
|
||||
```cbor
{
    0: tree_size,       # uint — number of leaves
    1: root_hash,       # bytes[32] — Merkle tree root
    2: timestamp,       # int — Unix µs, server's clock
    3: server_id,       # text — server identifier (domain or pubkey fingerprint)
    4: server_pubkey,   # bytes[32] — Ed25519 public key
    5: signature,       # bytes[64] — Ed25519(cbor(fields 0-4))
}
```
|
||||
|
||||
The STH is the server's signed commitment: "My tree has N entries with this root at this
|
||||
time." Clients and peers can verify the signature and use consistency proofs to ensure
|
||||
the tree only grows (never shrinks or forks).
|
||||
|
||||
### 3.3 Inclusion Proof
|
||||
|
||||
Proves a specific bundle is at index `i` in a tree of size `n`:
|
||||
|
||||
```
|
||||
proof = [(sibling_hash, direction), ...]
|
||||
```
|
||||
|
||||
Verification:
|
||||
```
current = SHA-256(0x00 || bundle_bytes)
for (sibling, direction) in proof:
    if direction == "L":
        current = SHA-256(0x01 || sibling || current)
    else:
        current = SHA-256(0x01 || current || sibling)
assert current == sth.root_hash
```
|
||||
|
||||
### 3.4 Consistency Proof
|
||||
|
||||
Proves that tree of size `m` is a prefix of tree of size `n` (where `m < n`). This
|
||||
guarantees the server hasn't removed or reordered entries.
|
||||
|
||||
The proof is a list of intermediate hashes that, combined with the old root, reconstruct
|
||||
the new root. Verification follows RFC 6962 §2.1.2.
|
||||
|
||||
## 4. API Endpoints
|
||||
|
||||
Unless noted otherwise, endpoints use CBOR for request and response bodies (`Content-Type: application/cbor`). The submit endpoint (§4.1) takes raw bundle bytes as its request body.
|
||||
|
||||
### 4.1 Submit Bundle
|
||||
|
||||
```
|
||||
POST /v1/submit
|
||||
```
|
||||
|
||||
**Request body**: Raw bundle bytes (application/octet-stream)
|
||||
|
||||
**Processing**:
|
||||
1. Verify magic bytes `b"SOOSEFX1"` and version
|
||||
2. Parse chain summary
|
||||
3. Verify `bundle_sig` against `signer_pubkey`
|
||||
4. Compute `bundle_hash = SHA-256(0x00 || bundle_bytes)`
|
||||
5. Check for duplicate (`bundle_hash` already in tree) — if duplicate, return existing receipt
|
||||
6. Append `bundle_hash` to Merkle tree
|
||||
7. Store bundle bytes (encrypted blob, as-is)
|
||||
8. Generate and sign receipt
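The append and duplicate-handling steps can be sketched with an in-memory log. This is a toy under assumptions: the class `InMemoryLog` is hypothetical, steps 2, 3, and 8 (summary parsing, `bundle_sig` verification, receipt signing) are omitted, and the return value is a `(tree_index, newly_added)` pair rather than a receipt.

```python
import hashlib

MAGIC = b"SOOSEFX1"


class InMemoryLog:
    """Toy append-only log keyed by the domain-separated bundle hash."""

    def __init__(self):
        self.leaves = []    # bundle hashes in append order (Merkle leaves)
        self.index_of = {}  # bundle_hash -> tree index, used for duplicate detection
        self.blobs = []     # bundle bytes stored exactly as received

    def submit(self, bundle_bytes):
        # Step 1: magic and version check (an HTTP server would answer 400 here).
        if len(bundle_bytes) < 9 or bundle_bytes[:8] != MAGIC or bundle_bytes[8] != 1:
            raise ValueError("invalid bundle format")
        # Step 4: leaf hash with the 0x00 domain-separation prefix.
        bundle_hash = hashlib.sha256(b"\x00" + bundle_bytes).digest()
        # Step 5: a duplicate maps back to the existing entry.
        if bundle_hash in self.index_of:
            return self.index_of[bundle_hash], False
        # Steps 6-7: append the leaf and store the encrypted blob as-is.
        index = len(self.leaves)
        self.leaves.append(bundle_hash)
        self.index_of[bundle_hash] = index
        self.blobs.append(bundle_bytes)
        return index, True
```

Making resubmission idempotent means a client that lost its receipt can simply submit the same bundle again.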
|
||||
|
||||
**Response** (CBOR):
|
||||
```cbor
{
    0: bundle_id,         # bytes[16] — from chain summary
    1: bundle_hash,       # bytes[32] — leaf hash
    2: tree_size,         # uint — tree size after inclusion
    3: tree_index,        # uint — leaf index in tree
    4: timestamp,         # int — Unix µs, server's reception time
    5: inclusion_proof,   # array of bytes[32] — Merkle path
    6: sth,               # map — current STH (see §3.2)
    7: server_id,         # text — server identifier
    8: server_pubkey,     # bytes[32] — Ed25519 public key
    9: receipt_sig,       # bytes[64] — Ed25519(cbor(fields 0-8))
}
```
|
||||
|
||||
**Auth**: Federation member token required.
|
||||
|
||||
**Errors**:
|
||||
- `400` — Invalid bundle format, bad signature
|
||||
- `401` — Missing or invalid auth token
|
||||
- `507` — Server storage full
|
||||
|
||||
### 4.2 Get Signed Tree Head
|
||||
|
||||
```
|
||||
GET /v1/sth
|
||||
```
|
||||
|
||||
**Response** (CBOR): STH map (see §3.2)
|
||||
|
||||
**Auth**: Public (no auth required).
|
||||
|
||||
### 4.3 Get Consistency Proof
|
||||
|
||||
```
|
||||
GET /v1/consistency-proof?old={m}&new={n}
|
||||
```
|
||||
|
||||
**Parameters**:
|
||||
- `old` — previous tree size (must be > 0)
|
||||
- `new` — current tree size (must be >= old)
|
||||
|
||||
**Response** (CBOR):
|
||||
```cbor
{
    0: old_size,   # uint
    1: new_size,   # uint
    2: proof,      # array of bytes[32]
}
```
|
||||
|
||||
**Auth**: Public.
|
||||
|
||||
### 4.4 Get Inclusion Proof
|
||||
|
||||
```
|
||||
GET /v1/inclusion-proof?hash={hex}&tree_size={n}
|
||||
```
|
||||
|
||||
**Parameters**:
|
||||
- `hash` — hex-encoded bundle hash (leaf hash)
|
||||
- `tree_size` — tree size for the proof (use current STH tree_size)
|
||||
|
||||
**Response** (CBOR):
|
||||
```cbor
{
    0: tree_index,   # uint — leaf index
    1: tree_size,    # uint
    2: proof,        # array of bytes[32]
}
```
|
||||
|
||||
**Auth**: Public.
|
||||
|
||||
### 4.5 Get Entries
|
||||
|
||||
```
|
||||
GET /v1/entries?start={s}&end={e}
|
||||
```
|
||||
|
||||
**Parameters**:
|
||||
- `start` — first tree index (inclusive)
|
||||
- `end` — last tree index (inclusive)
|
||||
- Maximum range: 1000 entries per request
|
||||
|
||||
**Response** (CBOR):
|
||||
```cbor
|
||||
{
|
||||
0: entries, # array of entry maps (see §4.5.1)
|
||||
}
|
||||
```
|
||||
|
||||
#### 4.5.1 Entry Map
|
||||
|
||||
```cbor
|
||||
{
|
||||
0: tree_index, # uint
|
||||
1: bundle_hash, # bytes[32]
|
||||
2: chain_summary, # CBOR map (from bundle, unencrypted)
|
||||
3: encrypted_blob, # bytes — full SOOSEFX1 bundle
|
||||
4: receipt_ts, # int — Unix µs when received
|
||||
}
|
||||
```
|
||||
|
||||
**Auth**: Federation member token required.
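
Because a single request covers at most 1000 entries and both bounds are inclusive, a client that needs a larger range has to page through it. A minimal sketch, where `entry_ranges` is a hypothetical helper and not part of any SooSeF API:

```python
from typing import Iterator

MAX_ENTRIES_PER_REQUEST = 1000  # the documented per-request maximum


def entry_ranges(start: int, end: int,
                 page: int = MAX_ENTRIES_PER_REQUEST) -> Iterator[tuple[int, int]]:
    """Yield inclusive (start, end) pairs that together cover [start, end]."""
    if start < 0 or end < start:
        raise ValueError("invalid range")
    while start <= end:
        last = min(start + page - 1, end)
        yield start, last
        start = last + 1


for s, e in entry_ranges(0, 2499):
    print(f"GET /v1/entries?start={s}&end={e}")
```

For tree indices 0 through 2499 this produces three requests: 0-999, 1000-1999 and 2000-2499.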

### 4.6 Audit Summary

```
GET /v1/audit/summary?bundle_id={hex}
```

Returns the chain summary for a specific bundle without the encrypted payload.

**Response** (CBOR):
```cbor
{
    0: bundle_id,        # bytes[16]
    1: chain_summary,    # CBOR map (from bundle)
    2: tree_index,       # uint
    3: receipt_ts,       # int
    4: inclusion_proof,  # array of bytes[32] (against current STH)
}
```

**Auth**: Public.

## 5. Permission Tiers

### 5.1 Public Auditor

**Access**: Unauthenticated.

**Endpoints**: `/v1/sth`, `/v1/consistency-proof`, `/v1/inclusion-proof`, `/v1/audit/summary`

**Can verify**:
- The log exists and has a specific size at a specific time
- A specific bundle is included in the log at a specific position
- The log has not been forked (consistency proofs between STHs)
- Chain summary metadata (record count, hash range) for any bundle

**Cannot see**: Encrypted content, chain IDs, signer identities, raw bundles.

### 5.2 Federation Member

**Access**: Bearer token issued by server operator. Tokens are Ed25519-signed
credentials binding a public key to a set of permissions.

```cbor
{
    0: token_id,       # bytes[16] — UUID v7
    1: member_pubkey,  # bytes[32] — member's Ed25519 public key
    2: permissions,    # array of text — ["submit", "entries", "gossip"]
    3: issued_at,      # int — Unix µs
    4: expires_at,     # int — Unix µs (0 = no expiry)
    5: issuer_pubkey,  # bytes[32] — server's Ed25519 public key
    6: signature,      # bytes[64] — Ed25519(cbor(fields 0-5))
}
```

**Endpoints**: All public endpoints + `/v1/submit`, `/v1/entries`, gossip endpoints.

**Can see**: Everything a public auditor sees + chain IDs, signer public keys, full
encrypted bundles (but not decrypted content).
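
Once a token has been decoded and its signature checked, the remaining authorization decision is a lookup on fields 2 and 4. The sketch below covers only that part. `token_permits` is a hypothetical helper, the token is assumed to be a decoded map with the integer keys shown above, and treating the expiry instant itself as already expired is an assumption the spec does not settle.

```python
import time

# Integer keys from the token map above.
PERMISSIONS, EXPIRES_AT = 2, 4


def token_permits(token: dict, permission: str, now_us=None) -> bool:
    """Check expiry and permission only.

    The Ed25519 signature (field 6) must be verified separately before
    any of these fields are trusted; that step is not shown here.
    """
    if now_us is None:
        now_us = time.time_ns() // 1_000  # current Unix time in microseconds
    expires_at = token[EXPIRES_AT]
    if expires_at != 0 and now_us >= expires_at:  # 0 means "no expiry"
        return False
    return permission in token[PERMISSIONS]
```

A server handling `POST /v1/gossip/sth` would, under these assumptions, call `token_permits(token, "gossip")` after signature verification.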

### 5.3 Authorized Recipient

Not enforced server-side. Recipients hold Ed25519 private keys whose corresponding
public keys appear in the bundle's recipients array. They can decrypt bundle content
locally after retrieving the encrypted blob via the entries endpoint.

The server has no knowledge of who can or cannot decrypt a given bundle.

## 6. Gossip Protocol

### 6.1 Overview

Federation servers maintain a list of known peers. Periodically (default: every 5 minutes),
each server initiates gossip with its peers to:

1. Exchange STHs — detect if any peer has entries the local server doesn't
2. Verify consistency — ensure no peer is presenting a forked log
3. Sync entries — pull missing entries from peers that have them

### 6.2 Gossip Flow

```
Server A                        Server B
│                                      │
│── POST /v1/gossip/sth ──────────────>│  (A sends its STH)
│                                      │
│<── response: B's STH ───────────────│  (B responds with its STH)
│                                      │
│  (A compares tree sizes)             │
│  if B.tree_size > A.tree_size:       │
│                                      │
│── GET /v1/consistency-proof ────────>│  (verify B's tree extends A's)
│<── proof ────────────────────────────│
│                                      │
│  (verify consistency proof)          │
│                                      │
│── GET /v1/entries?start=...&end=... >│  (pull missing entries)
│<── entries ──────────────────────────│
│                                      │
│  (append entries to local tree)      │
│  (recompute STH)                     │
│                                      │
```

### 6.3 Gossip Endpoints

```
POST /v1/gossip/sth
```

**Request body** (CBOR): Sender's current STH.

**Response** (CBOR): Receiver's current STH.

**Auth**: Federation member token with `"gossip"` permission.

### 6.4 Fork Detection

If server A receives an STH from server B where:
- `B.tree_size <= A.tree_size`, but `B.root_hash` differs from the root hash of A's own tree at size `B.tree_size`

Then B is presenting a different history. This is a **fork** — a critical security event.
The server should:

1. Log the fork with both STHs as evidence
2. Alert the operator
3. Continue serving its own tree (do not merge the forked tree)
4. Refuse to gossip further with the forked peer until operator resolution
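
The comparison above can be sketched as a small classifier. This is illustrative only: `classify_peer_sth` and `local_root_at` are hypothetical names, and the sketch assumes the local server can recompute its own root hash at any earlier tree size.

```python
from typing import Callable


def classify_peer_sth(local_size: int,
                      local_root_at: Callable[[int], bytes],
                      peer_size: int,
                      peer_root: bytes) -> str:
    """Compare a peer's STH against the local tree.

    `local_root_at(n)` is a hypothetical helper returning the local tree's
    root hash at the point where it held exactly n leaves.
    """
    if peer_size > local_size:
        # Roots cannot be compared directly; a consistency proof is needed (§4.3).
        return "peer_ahead"
    if local_root_at(peer_size) != peer_root:
        # Same size, different root: the peer is presenting another history.
        return "fork"
    return "in_sync" if peer_size == local_size else "peer_behind"
```

Only the `"fork"` result triggers the four steps listed above; `"peer_ahead"` leads into the consistency check and entry sync from §6.2.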

### 6.5 Convergence

Under normal operation (no forks), servers converge to identical trees. The convergence
time depends on gossip interval and network topology. With a 5-minute interval and full
mesh topology among N servers, convergence after a new entry takes at most 5 minutes.

## 7. Receipts

### 7.1 Purpose

A receipt is the federation's proof that a bundle was received and included in the log
at a specific time. It is the critical artifact that closes the timestamp gap: the
offline device's claimed timestamp + the federation receipt = practical proof of timing.

### 7.2 Receipt Format

```cbor
{
    0: bundle_id,        # bytes[16] — from chain summary
    1: bundle_hash,      # bytes[32] — leaf hash in server's tree
    2: tree_size,        # uint — tree size at inclusion
    3: tree_index,       # uint — leaf position
    4: timestamp,        # int — Unix µs, server's clock
    5: inclusion_proof,  # array of bytes[32] — Merkle path
    6: sth,              # map — STH at time of inclusion
    7: server_id,        # text — server identifier
    8: server_pubkey,    # bytes[32] — Ed25519 public key
    9: receipt_sig,      # bytes[64] — Ed25519(cbor(fields 0-8))
}
```

### 7.3 Receipt Verification

To verify a receipt:

1. `Ed25519_Verify(server_pubkey, receipt_sig, cbor(fields 0-8))` — receipt is authentic
2. Verify `inclusion_proof` against `sth.root_hash` with `bundle_hash` at `tree_index`
3. Verify `sth.signature` — the STH itself is authentic
4. `sth.tree_size >= tree_size` — STH covers the inclusion
5. `sth.timestamp >= timestamp` — STH is at or after receipt time
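
Step 2 can be sketched with the inclusion-proof verification steps from RFC 9162. The signature checks in steps 1 and 3 need an Ed25519 library and are not shown. The sketch assumes SHA-256 with the RFC 6962 interior-node prefix `0x01` (the tree definition in §3 is authoritative), takes `bundle_hash` as the already-computed leaf hash, and expects `tree_size` to be the size of the tree that `root` belongs to. The function names are hypothetical.

```python
import hashlib


def node_hash(left: bytes, right: bytes) -> bytes:
    # Assumption: RFC 6962-style interior node, SHA-256(0x01 || left || right).
    return hashlib.sha256(b"\x01" + left + right).digest()


def verify_inclusion(leaf: bytes, tree_index: int, tree_size: int,
                     proof: list[bytes], root: bytes) -> bool:
    """Recompute the root from a leaf hash and its Merkle path."""
    if tree_index < 0 or tree_index >= tree_size:
        return False
    fn, sn = tree_index, tree_size - 1
    r = leaf
    for p in proof:
        if sn == 0:
            return False  # the path is longer than the tree is deep
        if (fn & 1) or fn == sn:
            r = node_hash(p, r)  # sibling is on the left
            while fn and not (fn & 1):
                fn >>= 1
                sn >>= 1
        else:
            r = node_hash(r, p)  # sibling is on the right
        fn >>= 1
        sn >>= 1
    return sn == 0 and r == root
```

With a decoded receipt this would be called with fields 1, 3 and 5 plus the root hash and tree size taken from the STH in field 6.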

### 7.4 Receipt Lifecycle

```
1. Loader submits bundle to federation server
2. Server issues receipt in submit response
3. Loader stores receipt locally (receipts/ directory)
4. Loader exports receipts to USB (CBOR file)
5. Offline device imports receipts
6. Receipt is stored alongside chain records as proof of federation timestamp
```

### 7.5 Multi-Server Receipts

A bundle submitted to N servers produces N independent receipts. Each receipt is from a
different server with a different timestamp and Merkle tree position. Multiple receipts
strengthen the timestamp claim — an adversary would need to compromise all N servers to
suppress evidence.

## 8. Storage Tiers

Federation servers manage bundle storage across three tiers based on age:

### 8.1 Hot Tier (0-30 days)

- **Format**: Individual files, one per bundle
- **Location**: `data/hot/{tree_index}.bundle`
- **Access**: Direct file read, O(1)
- **Purpose**: Fast access for recent entries, active gossip sync

### 8.2 Warm Tier (30-365 days)

- **Format**: Zstd-compressed segments, 1000 bundles per segment
- **Location**: `data/warm/segment-{start}-{end}.zst`
- **Access**: Decompress segment, extract entry
- **Compression**: Zstd level 3 (fast compression, moderate ratio)
- **Purpose**: Reduced storage for medium-term retention

### 8.3 Cold Tier (>1 year)

- **Format**: Zstd-compressed segments, maximum compression
- **Location**: `data/cold/segment-{start}-{end}.zst`
- **Access**: Decompress segment, extract entry
- **Compression**: Zstd level 19 (slow compression, best ratio)
- **Purpose**: Archival storage, rarely accessed

### 8.4 Tier Promotion

A background compaction process runs periodically (default: every 24 hours):

1. Identify hot entries older than 30 days
2. Group into segments of 1000
3. Compress and write to warm tier
4. Delete hot files
5. Repeat for warm → cold at 365 days
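
The selection and grouping steps can be sketched as below. The thresholds and segment size come from the tier definitions above; the helper names are hypothetical, compression and file handling are left out, and treating an entry that is exactly 30 (or 365) days old as still belonging to the younger tier is an assumption drawn from the "older than" wording.

```python
HOT_RETENTION_DAYS = 30
WARM_RETENTION_DAYS = 365
SEGMENT_SIZE = 1000


def tier_for_age(age_days: float) -> str:
    """Return the tier an entry of the given age belongs in."""
    if age_days <= HOT_RETENTION_DAYS:
        return "hot"
    if age_days <= WARM_RETENTION_DAYS:
        return "warm"
    return "cold"


def segments(tree_indices: list[int],
             size: int = SEGMENT_SIZE) -> list[tuple[int, int]]:
    """Group tree indices into (start, end) segments of at most `size` entries."""
    ordered = sorted(tree_indices)
    chunks = (ordered[i:i + size] for i in range(0, len(ordered), size))
    return [(chunk[0], chunk[-1]) for chunk in chunks]


for start, end in segments(list(range(2500))):
    print(f"data/warm/segment-{start}-{end}.zst")
```

Each printed path follows the `segment-{start}-{end}.zst` naming used by the warm and cold tiers.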

### 8.5 Merkle Tree Preservation

The Merkle tree is independent of storage tiers. Leaf hashes and the tree structure
are maintained in a separate data structure (compact tree format, stored in SQLite or
flat file). Moving bundles between storage tiers does not affect the tree.

Inclusion proofs and consistency proofs remain valid across tier promotions — they
reference the tree, not the storage location.

### 8.6 Metadata Database

SQLite database tracking all bundles:

```sql
CREATE TABLE bundles (
    tree_index    INTEGER PRIMARY KEY,
    bundle_id     BLOB NOT NULL,               -- UUID v7
    bundle_hash   BLOB NOT NULL,               -- leaf hash
    chain_id      BLOB NOT NULL,               -- source chain ID
    signer_pubkey BLOB NOT NULL,               -- Ed25519 public key
    record_count  INTEGER NOT NULL,            -- records in bundle
    range_start   INTEGER NOT NULL,            -- first chain index
    range_end     INTEGER NOT NULL,            -- last chain index
    receipt_ts    INTEGER NOT NULL,            -- Unix µs reception time
    storage_tier  TEXT NOT NULL DEFAULT 'hot', -- 'hot', 'warm', 'cold'
    storage_key   TEXT NOT NULL,               -- file path or segment reference
    created_at    TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%fZ', 'now'))
);

CREATE INDEX idx_bundles_bundle_id ON bundles(bundle_id);
CREATE INDEX idx_bundles_chain_id ON bundles(chain_id);
CREATE INDEX idx_bundles_bundle_hash ON bundles(bundle_hash);
CREATE INDEX idx_bundles_receipt_ts ON bundles(receipt_ts);
```
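
As a rough illustration of how the table supports lookups such as the inclusion-proof endpoint, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The inserted values are placeholders rather than real hashes or keys, the `created_at` column is omitted for brevity, and `locate` is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE bundles (
        tree_index    INTEGER PRIMARY KEY,
        bundle_id     BLOB NOT NULL,
        bundle_hash   BLOB NOT NULL,
        chain_id      BLOB NOT NULL,
        signer_pubkey BLOB NOT NULL,
        record_count  INTEGER NOT NULL,
        range_start   INTEGER NOT NULL,
        range_end     INTEGER NOT NULL,
        receipt_ts    INTEGER NOT NULL,
        storage_tier  TEXT NOT NULL DEFAULT 'hot',
        storage_key   TEXT NOT NULL
    )
""")
conn.execute("CREATE INDEX idx_bundles_bundle_hash ON bundles(bundle_hash)")

# Placeholder row: byte patterns stand in for real IDs, hashes and keys.
conn.execute(
    "INSERT INTO bundles (tree_index, bundle_id, bundle_hash, chain_id,"
    " signer_pubkey, record_count, range_start, range_end, receipt_ts, storage_key)"
    " VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
    (0, b"\x01" * 16, b"\xaa" * 32, b"\x02" * 32, b"\x03" * 32,
     12, 0, 11, 1_775_000_000_000_000, "data/hot/0.bundle"),
)


def locate(db: sqlite3.Connection, bundle_hash: bytes):
    """Return (tree_index, storage_tier, storage_key) for a leaf hash, or None."""
    return db.execute(
        "SELECT tree_index, storage_tier, storage_key"
        " FROM bundles WHERE bundle_hash = ?",
        (bundle_hash,),
    ).fetchone()


print(locate(conn, b"\xaa" * 32))  # (0, 'hot', 'data/hot/0.bundle')
```

Because `storage_tier` and `storage_key` live in the same row as the tree index, a tier promotion only needs to update those two columns.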

## 9. Server Configuration

```json
{
    "server_id": "my-server.example.org",
    "host": "0.0.0.0",
    "port": 8443,
    "data_dir": "/var/lib/soosef-federation",
    "identity_key_path": "/etc/soosef-federation/identity/private.pem",
    "peers": [
        {
            "url": "https://peer1.example.org:8443",
            "pubkey_hex": "abc123...",
            "name": "Peer One"
        }
    ],
    "gossip_interval_seconds": 300,
    "hot_retention_days": 30,
    "warm_retention_days": 365,
    "compaction_interval_hours": 24,
    "max_bundle_size_bytes": 10485760,
    "max_entries_per_request": 1000,
    "member_tokens": [
        {
            "name": "loader-1",
            "pubkey_hex": "def456...",
            "permissions": ["submit", "entries"]
        }
    ]
}
```

## 10. Error Codes

| HTTP Status | CBOR Error Code | Description |
|---|---|---|
| 400 | `"invalid_bundle"` | Bundle format invalid or signature verification failed |
| 400 | `"invalid_range"` | Requested entry range is invalid |
| 401 | `"unauthorized"` | Missing or invalid auth token |
| 403 | `"forbidden"` | Token lacks required permission |
| 404 | `"not_found"` | Bundle or entry not found |
| 409 | `"duplicate"` | Bundle already in log (returns existing receipt) |
| 413 | `"bundle_too_large"` | Bundle exceeds `max_bundle_size_bytes` |
| 507 | `"storage_full"` | Server cannot accept new entries |

Error response format:
```cbor
{
    0: error_code,  # text
    1: message,     # text — human-readable description
    2: details,     # map — optional additional context
}
```

## 11. Security Considerations

### 11.1 Server Compromise

A compromised server can:
- Read bundle metadata (chain IDs, signer pubkeys, timestamps) — **expected at member tier**
- Withhold entries from gossip — **detectable**: other servers will see inconsistent tree sizes
- Present a forked tree — **detectable**: consistency proofs will fail
- Issue false receipts — **detectable**: receipt's inclusion proof won't verify against other servers' STHs

A compromised server **cannot**:
- Read attestation content (encrypted with recipient keys)
- Forge attestation signatures (requires Ed25519 private key)
- Modify bundle contents (GCM authentication would fail)
- Reorder or remove entries from other servers' trees

### 11.2 Transport Security

All server-to-server and client-to-server communication should use TLS 1.3. The
federation protocol provides its own authentication (Ed25519 signatures on STHs and
receipts), but TLS prevents network-level attacks.

### 11.3 Clock Reliability

Federation server clocks should be synchronized via NTP. Receipt timestamps are only as
reliable as the server's clock. Deploying servers across multiple time zones and operators
provides cross-checks — wildly divergent receipt timestamps for the same bundle indicate
clock problems or compromise.
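
One way to act on that cross-check is to measure the spread of receipt timestamps collected for a single bundle. The sketch below is only an illustration. The helper names are hypothetical, and the spec defines no threshold for "wildly divergent", so the tolerance is left to the caller and has to allow for the time the loader legitimately takes to submit to each server in turn.

```python
def receipt_spread_us(receipt_timestamps_us: list[int]) -> int:
    """Largest gap, in microseconds, between any two receipts for one bundle."""
    return max(receipt_timestamps_us) - min(receipt_timestamps_us)


def is_suspicious(receipt_timestamps_us: list[int], tolerance_us: int) -> bool:
    # tolerance_us is an operator-chosen threshold; the spec does not define one.
    return receipt_spread_us(receipt_timestamps_us) > tolerance_us
```

A spread above the tolerance does not say which server is wrong, only that the receipts deserve a closer look.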
254
docs/architecture/federation.md
Normal file
@@ -0,0 +1,254 @@
# Federated Attestation System — Architecture Overview

**Status**: Design
**Version**: 0.1.0-draft
**Last updated**: 2026-04-01

## 1. Problem Statement

SooSeF operates offline-first: devices create Ed25519-signed attestations without network
access. This creates two fundamental challenges:

1. **Timestamp credibility** — An offline device's clock is untrusted. An adversary with
   physical access could backdate or postdate attestations.
2. **Distribution** — Attestations trapped on a single device are vulnerable to seizure,
   destruction, or loss. They must be replicated to survive.

The system must solve both problems while preserving the offline-first constraint and
protecting content confidentiality even from the distribution infrastructure.

## 2. Threat Model

### Adversaries

| Adversary | Capabilities | Goal |
|---|---|---|
| State actor | Physical device access, network surveillance, legal compulsion of server operators | Suppress or discredit attestations |
| Insider threat | Access to one federation server | Fork the log, selectively omit entries, or read content |
| Device thief | Physical access to offline device | Fabricate or backdate attestations |

### Security Properties

| Property | Guarantee | Mechanism |
|---|---|---|
| **Integrity** | Attestations cannot be modified after creation | Ed25519 signatures + hash chain |
| **Ordering** | Attestations cannot be reordered or inserted | Hash chain (each record includes hash of previous) |
| **Existence proof** | Attestation existed at or before time T | Federation receipt (server-signed timestamp) |
| **Confidentiality** | Content is hidden from infrastructure | Envelope encryption; servers store encrypted blobs |
| **Fork detection** | Log tampering is detectable | Merkle tree + consistency proofs (CT model) |
| **Availability** | Attestations survive device loss | Replication across federation servers via gossip |

### Non-Goals

- **Proving exact creation time** — Impossible without a trusted time source. We prove
  ordering (hash chain) and existence-before (federation receipt). The gap between
  claimed time and receipt time is the trust window.
- **Anonymity of attestors** — Federation members can see signer public keys. For anonymity,
  use a dedicated identity per context.
- **Preventing denial of service** — Federation servers are assumed cooperative. Byzantine
  fault tolerance is out of scope for v1.

## 3. System Architecture

```
  OFFLINE DEVICE                                        AIR GAP                INTERNET
  ──────────────                                        ───────                ────────

  ┌──────────┐     ┌──────────────┐     ┌──────────┐     USB/SD     ┌──────────┐     ┌────────────┐
  │ Verisoo  │────>│  Hash Chain  │────>│  Export  │───────────────>│  Loader  │────>│ Federation │
  │ Attest   │     │  (Layer 1)   │     │  Bundle  │                │  (App)   │     │   Server   │
  └──────────┘     └──────────────┘     │ (Layer 2)│                └────┬─────┘     └─────┬──────┘
                                        └──────────┘                     │                 │
                                             │                           │    ┌─────────┐  │
                                         Optional:                       │<───│ Receipt │<─┘
                                         DCT embed                       │    └─────────┘
                                          in JPEG                        │
                                             │                    USB carry-back
                                        ┌────v─────┐                     │
                                        │  Stego   │                ┌────v─────┐
                                        │  Image   │                │ Offline  │
                                        └──────────┘                │ Device   │
                                                                    │ (receipt │
                                                                    │  stored) │
                                                                    └──────────┘
```

### Layer 1: Hash-Chained Attestation Records (Local)

Each attestation is wrapped in a chain record that includes:
- A hash of the previous record (tamper-evident ordering)
- Entropy witnesses (system uptime, kernel state) that make timestamp fabrication expensive
- An Ed25519 signature over the entire record

The chain lives on the offline device at `~/.soosef/chain/`. It wraps existing Verisoo
attestation records — the Verisoo record's bytes become the `content_hash` input.

**See**: [chain-format.md](chain-format.md)
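
To make the tamper-evidence property concrete, here is a deliberately simplified sketch of a hash chain. It is not the record format from chain-format.md: it leaves out the Ed25519 signature, the entropy witnesses and CBOR encoding, and the all-zero genesis value and helper names are assumptions made for illustration.

```python
import hashlib

GENESIS_PREV = b"\x00" * 32  # hypothetical marker for "no previous record"


def record_hash(prev_hash: bytes, content_hash: bytes) -> bytes:
    # Simplified: the real record hash covers the full serialized record.
    return hashlib.sha256(prev_hash + content_hash).digest()


def append(chain: list[dict], content: bytes) -> None:
    """Append a record whose prev_hash commits to the previous record."""
    if chain:
        prev = record_hash(chain[-1]["prev_hash"], chain[-1]["content_hash"])
    else:
        prev = GENESIS_PREV
    chain.append({"prev_hash": prev,
                  "content_hash": hashlib.sha256(content).digest()})


def verify(chain: list[dict]) -> bool:
    """Check that every record points at the hash of its predecessor."""
    expected = GENESIS_PREV
    for rec in chain:
        if rec["prev_hash"] != expected:
            return False
        expected = record_hash(rec["prev_hash"], rec["content_hash"])
    return True
```

Changing any record other than the last one breaks the link held by its successor, which is why the newest record also needs a signature or an external anchor such as a federation receipt.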

### Layer 2: Encrypted Export Bundles

A range of chain records is packaged into a portable bundle:
1. Records serialized as CBOR, compressed with zstd
2. Encrypted with AES-256-GCM using a random data encryption key (DEK)
3. DEK wrapped per-recipient via X25519 ECDH (derived from Ed25519 identities)
4. An unencrypted `chain_summary` (record count, hash range, Merkle root, signature) allows
   auditing without decryption

Bundles can optionally be embedded in JPEG images via stegasoo's DCT steganography,
making them indistinguishable from normal photos on a USB stick.

**See**: [export-bundle.md](export-bundle.md)

### Layer 3: Federated Append-Only Log

Federation servers are "blind notaries" inspired by Certificate Transparency (RFC 6962):
- They receive encrypted bundles, verify the chain summary signature, and append to a
  Merkle tree
- They issue signed receipts with a federation timestamp (proof of existence)
- They gossip Signed Tree Heads (STH) with peers to ensure consistency
- They never decrypt content — they operate at the "federation member" permission tier

**See**: [federation-protocol.md](federation-protocol.md)

### The Loader (Air-Gap Bridge)

A separate application that runs on an internet-connected machine:
1. Receives a bundle (from USB) or extracts one from a steganographic image
2. Validates the bundle signature and chain summary
3. Pushes to configured federation servers
4. Collects signed receipts
5. Receipts are carried back to the offline device on the next USB round-trip

The loader never needs signing keys — bundles are already signed. It is a transport mechanism.

## 4. Key Domains

SooSeF distinguishes three cryptographic domains:

| Domain | Algorithm | Purpose | Key Location |
|---|---|---|---|
| **Signing** | Ed25519 | Attestation signatures, chain records, bundle summaries | `~/.soosef/identity/` |
| **Encryption** | X25519 + AES-256-GCM | Bundle payload encryption (envelope) | Derived from Ed25519 via birational map |
| **Steganography** | AES-256-GCM (from factors) | Stegasoo channel encryption | `~/.soosef/stegasoo/channel.key` |

The signing and encryption domains share a key lineage (Ed25519 → X25519 derivation) but
serve different purposes. The steganography domain remains fully independent — it protects
the stego carrier, not the attestation content.

### Private Key Storage Policy

The Ed25519 private key is stored **unencrypted** on disk (protected by 0o600 file
permissions). This is a deliberate design decision:

- The killswitch (secure deletion) is the primary defense for at-risk users, not key
  encryption. A password-protected key would require prompting on every attestation and
  chain operation, which is unworkable in field conditions.
- The `password` parameter on `generate_identity()` exists for interoperability but is
  not used by default. Chain operations (`_wrap_in_chain`, `backfill`) assume unencrypted keys.
- If the device is seized, the killswitch destroys key material. If the killswitch is not
  triggered, the adversary has physical access and can defeat key encryption via cold boot,
  memory forensics, or compelled disclosure.

## 5. Permission Model

Three tiers control who can see what:

| Tier | Sees | Can Verify | Typical Actor |
|---|---|---|---|
| **Public auditor** | Chain summaries, Merkle proofs, receipt timestamps | Existence, ordering, no forks | Anyone with server URL |
| **Federation member** | + chain IDs, signer public keys | + who attested, chain continuity | Peer servers, authorized monitors |
| **Authorized recipient** | + decrypted attestation content | Everything | Designated individuals with DEK access |

Federation servers themselves operate at the **federation member** tier. They can verify
chain integrity and detect forks, but they cannot read attestation content.

## 6. Timestamp Credibility

Since the device is offline, claimed timestamps are inherently untrusted. The system
provides layered evidence:

1. **Hash chain ordering** — Records are provably ordered. Even if timestamps are wrong,
   the sequence is authentic.
2. **Entropy witnesses** — Each record includes system uptime, kernel entropy pool state,
   and boot ID. Fabricating a convincing set of witnesses for a backdated record requires
   simulating the full system state at the claimed time.
3. **Federation receipt** — When the bundle reaches a federation server, the server signs
   a receipt with its own clock. This proves the chain existed at or before the receipt time.
4. **Cross-device corroboration** — If two independent devices attest overlapping events,
   their independent chains corroborate each other's timeline.

The trust window is: `receipt_timestamp - claimed_timestamp`. A smaller window means
more credible timestamps. Frequent USB sync trips shrink the window.
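
As a small worked example of that formula, the sketch below computes the trust window in microseconds, the unit used by receipt timestamps. `trust_window_us` is a hypothetical helper, and picking the earliest receipt when several servers have issued one is an assumption made here because it gives the tightest bound.

```python
def trust_window_us(claimed_ts_us: int, receipt_ts_us: list[int]) -> int:
    """Gap between the device's claimed time and the earliest federation receipt."""
    return min(receipt_ts_us) - claimed_ts_us


HOUR_US = 3_600 * 1_000_000
claimed = 1_775_000_000_000_000                        # device's claimed Unix time in µs
receipts = [claimed + 6 * HOUR_US, claimed + 7 * HOUR_US]
print(trust_window_us(claimed, receipts) / HOUR_US)    # 6.0
```

A negative result would mean the claimed time lies after the receipt, which points to a wrong or manipulated device clock.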

## 7. Data Lifecycle

```
1. CREATE     Offline device: attest file → sign → append to chain
2. EXPORT     Offline device: select range → compress → encrypt → bundle (→ optional stego embed)
3. TRANSFER   Physical media: USB/SD card carries bundle across air gap
4. LOAD       Internet machine: validate bundle → push to federation
5. RECEIPT    Federation server: verify → append to Merkle tree → sign receipt
6. RETURN     Physical media: receipt carried back to offline device
7. REPLICATE  Federation: servers gossip entries and STHs to each other
8. AUDIT      Anyone: verify Merkle proofs, check consistency across servers
```

## 8. Failure Modes

| Failure | Impact | Mitigation |
|---|---|---|
| Device lost/seized | Local chain lost | Bundles already sent to federation survive; regular exports reduce data loss window |
| USB intercepted | Adversary gets encrypted bundle (or stego image) | Encryption protects content; stego hides existence |
| Federation server compromised | Adversary reads metadata (chain IDs, pubkeys, timestamps) | Content remains encrypted; other servers detect log fork via consistency proofs |
| All federation servers down | No new receipts | Bundles queue on loader; chain integrity unaffected; retry when servers recover |
| Clock manipulation on device | Timestamps unreliable | Entropy witnesses increase fabrication cost; federation receipt provides external anchor |

## 9. Dependencies

| Package | Version | Purpose |
|---|---|---|
| `cbor2` | >=5.6.0 | Canonical CBOR serialization (RFC 8949) |
| `uuid-utils` | >=0.9.0 | UUID v7 generation (time-ordered) |
| `zstandard` | >=0.22.0 | Zstd compression for bundles and storage tiers |
| `cryptography` | >=41.0.0 | Ed25519, X25519, AES-256-GCM, HKDF, SHA-256 (already a dependency) |

## 10. File Layout

```
src/soosef/federation/
    __init__.py
    models.py              Chain record and state dataclasses
    serialization.py       CBOR canonical encoding
    entropy.py             System entropy collection
    chain.py               ChainStore — local append-only chain
    merkle.py              Merkle tree implementation
    export.py              Export bundle creation/parsing/decryption
    x25519.py              Ed25519→X25519 derivation, envelope encryption
    stego_bundle.py        Steganographic bundle embedding
    protocol.py            Shared federation types
    loader/
        __init__.py
        loader.py          Air-gap bridge application
        config.py          Loader configuration
        receipt.py         Receipt handling
        client.py          Federation server HTTP client
    server/
        __init__.py
        app.py             Federation server Flask app
        tree.py            CT-style Merkle tree
        storage.py         Tiered bundle storage
        gossip.py          Peer synchronization
        permissions.py     Access control
        config.py          Server configuration

~/.soosef/
    chain/                 Local hash chain
        chain.bin          Append-only record log
        state.cbor         Chain state checkpoint
    exports/               Generated export bundles
    loader/                Loader state
        config.json        Server list and settings
        receipts/          Federation receipts
    federation/            Federation server data (when running as server)
        servers.json       Known peer servers
```