# Export Bundle Specification

**Status**: Design

**Version**: 1 (bundle format version)

**Last updated**: 2026-04-01

## 1. Overview

An export bundle packages a contiguous range of chain records into a portable, encrypted file suitable for transfer across an air gap. The bundle format is designed so that:

- **Auditors** can verify chain integrity without decrypting content
- **Recipients** with the correct key can decrypt and read attestation records
- **Anyone** can detect tampering via Merkle root and signature verification
- **Steganographic embedding** is optional: bundles can be hidden in JPEG images via DCT

The format follows the pattern established by `keystore/export.py` (SOOBNDL): magic bytes, version, structured binary payload.

## 2. Binary Layout

```
Offset    Size       Field
───────   ─────────  ──────────────────────────────────────
0         8          magic: b"SOOSEFX1"
8         1          version: uint8 (1)
9         4          summary_len: uint32 BE
13        var        chain_summary: CBOR (see §3)
var       4          recipients_len: uint32 BE
var       var        recipients: CBOR array (see §4)
var       12         nonce: AES-256-GCM nonce
var       var        ciphertext: AES-256-GCM(zstd(CBOR(records)))
last 16   16         tag: AES-256-GCM authentication tag
```

All multi-byte integers are big-endian. The total bundle size is:

`9 + 4 + summary_len + 4 + recipients_len + 12 + ciphertext_len + 16`

### Parsing Without Decryption

To audit a bundle without decryption, read:

1. Magic (8 bytes): verify `b"SOOSEFX1"`
2. Version (1 byte): verify `1`
3. Summary length (4 bytes BE): read the next N bytes as CBOR
4. Chain summary: verify signature, inspect metadata

The encrypted payload and recipient list can be skipped for audit purposes.

## 3. Chain Summary

The chain summary sits **outside** the encryption envelope. It provides verifiable metadata about the bundle contents without revealing the actual attestation data.
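Because the summary sits outside the encryption envelope, an auditor can reach it with a handful of length-prefixed reads. The following is a minimal sketch, assuming the binary layout in §2. `split_bundle` and `_take` are hypothetical helper names, not part of the specification, and the summary is returned as raw CBOR bytes because decoding it would require a CBOR library.

```python
import struct

MAGIC = b"SOOSEFX1"
NONCE_LEN = 12   # AES-256-GCM nonce
TAG_LEN = 16     # AES-256-GCM authentication tag


def _take(data: bytes, pos: int, n: int) -> tuple[bytes, int]:
    """Return n bytes starting at pos plus the new offset; fail if truncated."""
    if pos + n > len(data):
        raise ValueError("truncated bundle")
    return data[pos:pos + n], pos + n


def split_bundle(data: bytes) -> dict:
    """Split a bundle into its raw fields without decrypting anything."""
    magic, pos = _take(data, 0, 8)
    if magic != MAGIC:
        raise ValueError("bad magic")
    version, pos = _take(data, pos, 1)

    # Each variable-length field is preceded by a uint32 big-endian length.
    raw_len, pos = _take(data, pos, 4)
    summary, pos = _take(data, pos, struct.unpack(">I", raw_len)[0])
    raw_len, pos = _take(data, pos, 4)
    recipients, pos = _take(data, pos, struct.unpack(">I", raw_len)[0])

    nonce, pos = _take(data, pos, NONCE_LEN)
    if len(data) - pos < TAG_LEN:
        raise ValueError("truncated bundle")

    # Everything between the nonce and the final 16 bytes is ciphertext.
    return {
        "version": version[0],
        "chain_summary": summary,        # raw CBOR, still to be decoded
        "recipients": recipients,        # raw CBOR array
        "nonce": nonce,
        "ciphertext": data[pos:len(data) - TAG_LEN],
        "tag": data[len(data) - TAG_LEN:],
    }
```

An auditor needs only `version` and `chain_summary` from the result; the remaining fields matter only to a recipient who intends to decrypt.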
CBOR map with integer keys:

| CBOR Key | Field | Type | Description |
|---|---|---|---|
| 0 | `bundle_id` | byte string (16) | UUID v7, unique bundle identifier |
| 1 | `chain_id` | byte string (32) | SHA-256(genesis record), identifies the source chain |
| 2 | `range_start` | unsigned int | First record index (inclusive) |
| 3 | `range_end` | unsigned int | Last record index (inclusive) |
| 4 | `record_count` | unsigned int | Number of records in bundle |
| 5 | `first_hash` | byte string (32) | `compute_record_hash(first_record)` |
| 6 | `last_hash` | byte string (32) | `compute_record_hash(last_record)` |
| 7 | `merkle_root` | byte string (32) | Root of Merkle tree over record hashes (see §5) |
| 8 | `created_ts` | integer | Bundle creation timestamp (Unix µs) |
| 9 | `signer_pubkey` | byte string (32) | Ed25519 public key of bundle creator |
| 10 | `bundle_sig` | byte string (64) | Ed25519 signature (see §3.1) |

### 3.1 Signature Computation

The signature covers all summary fields except `bundle_sig` itself:

```
summary_bytes = cbor2.dumps({
    0: bundle_id,
    1: chain_id,
    2: range_start,
    3: range_end,
    4: record_count,
    5: first_hash,
    6: last_hash,
    7: merkle_root,
    8: created_ts,
    9: signer_pubkey,
}, canonical=True)

bundle_sig = Ed25519_Sign(private_key, summary_bytes)
```

### 3.2 Verification Without Decryption

An auditor verifies a bundle as follows:

1. Parse the chain summary and rebuild `summary_bytes` as in §3.1 (all fields except `bundle_sig`, canonical CBOR)
2. `Ed25519_Verify(signer_pubkey, bundle_sig, summary_bytes)`: the summary is authentic
3. `record_count == range_end - range_start + 1`: the count matches the range
4. If previous bundles from the same `chain_id` exist, verify that `first_hash` matches the expected continuation

The auditor now knows: "A chain with ID X contains records [start, end], the creator signed this claim, and the Merkle root commits to specific record contents." All of this is established without decrypting.

## 4. Envelope Encryption

### 4.1 Key Derivation

Ed25519 signing keys are converted to X25519 Diffie-Hellman keys for encryption:

```
x25519_private = Ed25519_to_X25519_Private(ed25519_private_key)
x25519_public  = Ed25519_to_X25519_Public(ed25519_public_key_bytes)
```

This uses the birational map between Ed25519 and X25519 curves, supported natively by the `cryptography` library.

### 4.2 DEK Generation

A random 32-byte data encryption key (DEK) is generated per bundle:

```
dek = os.urandom(32)  # AES-256 key
```

### 4.3 DEK Wrapping (Per Recipient)

For each recipient, the DEK is wrapped using X25519 ECDH + HKDF + AES-256-GCM:

```
1. shared_secret = X25519_ECDH(sender_x25519_private, recipient_x25519_public)
2. derived_key = HKDF-SHA256(
       ikm=shared_secret,
       salt=bundle_id,              # binds to this specific bundle
       info=b"soosef-dek-wrap-v1",
       length=32
   )
3. wrap_nonce = os.urandom(12)      # stored in the recipient entry (§4.4)
4. wrapped_dek = AES-256-GCM_Encrypt(
       key=derived_key,
       nonce=wrap_nonce,
       plaintext=dek,
       aad=bundle_id                # additional authenticated data
   )
```

### 4.4 Recipients Array

CBOR array of recipient entries:

```cbor
[
    {
        0: recipient_pubkey,   # byte string (32): Ed25519 public key
        1: wrap_nonce,         # byte string (12): AES-GCM nonce for DEK wrap
        2: wrapped_dek,        # byte string (48): encrypted DEK (32) + GCM tag (16)
    },
    ...
]
```

### 4.5 Payload Encryption

```
1. records_cbor = cbor2.dumps([serialize_record(r) for r in records], canonical=True)
2. compressed = zstd.compress(records_cbor, level=3)
3. nonce = os.urandom(12)
4. ciphertext, tag = AES-256-GCM_Encrypt(
       key=dek,
       nonce=nonce,
       plaintext=compressed,
       aad=summary_bytes            # binds ciphertext to this summary
   )
```

The `summary_bytes` (the same bytes that are signed) are used as additional authenticated data (AAD). This cryptographically binds the encrypted payload to the chain summary: any modification to the summary causes GCM authentication to fail at decryption.

### 4.6 Decryption

A recipient decrypts a bundle as follows:

```
1. Parse chain summary, verify bundle_sig
2. Find own pubkey in recipients array
3. shared_secret = X25519_ECDH(recipient_x25519_private, sender_x25519_public)
   (sender_x25519_public derived from summary.signer_pubkey)
4. derived_key = HKDF-SHA256(shared_secret, salt=bundle_id, info=b"soosef-dek-wrap-v1", length=32)
5. dek = AES-256-GCM_Decrypt(derived_key, wrap_nonce, wrapped_dek, aad=bundle_id)
6. compressed = AES-256-GCM_Decrypt(dek, nonce, ciphertext, aad=summary_bytes)
7. records_cbor = zstd.decompress(compressed)
8. records = [deserialize_record(r) for r in cbor2.loads(records_cbor)]
9. Verify each record's signature and chain linkage
```

## 5. Merkle Tree

The Merkle tree provides compact proofs that specific records are included in a bundle.

### 5.1 Construction

Leaves are the record hashes in chain order:

```
leaf[i] = compute_record_hash(records[i])
```

Internal nodes:

```
node = SHA-256(left_child || right_child)
```

If a level has an odd number of nodes, which happens at some level whenever the number of leaves is not a power of 2, the last node on that level is promoted unchanged to the next level. No leaf is duplicated and no padding value is added.

### 5.2 Inclusion Proof

An inclusion proof for the record at index `i` is a list of `(sibling_hash, direction)` pairs from the leaf to the root, where `direction` states whether the sibling is the left (`"L"`) or right child. Levels at which the current node was promoted without a sibling contribute no pair.

Verification:

```
current = leaf[i]
for (sibling, direction) in proof:
    if direction == "L":
        current = SHA-256(sibling || current)
    else:
        current = SHA-256(current || sibling)
assert current == merkle_root
```

### 5.3 Usage

- **Export bundles**: `merkle_root` in chain summary commits to exact record contents
- **Federation servers**: Build a separate Merkle tree over bundle hashes (see federation-protocol.md)

These are two different trees:

1. **Record tree** (this section): leaves are record hashes within a bundle
2. **Bundle tree** (federation): leaves are bundle hashes across the federation log

## 6. Steganographic Embedding

Bundles can optionally be embedded in JPEG images using stegasoo's DCT steganography:

```
1. bundle_bytes = create_export_bundle(chain, start, end, private_key, recipients)
2. stego_image = stegasoo.encode(
       carrier=carrier_image,
       reference=reference_image,
       file_data=bundle_bytes,
       passphrase=passphrase,
       embed_mode="dct",
       channel_key=channel_key      # optional
   )
```

Extraction:

```
1. result = stegasoo.decode(
       carrier=stego_image,
       reference=reference_image,
       passphrase=passphrase,
       channel_key=channel_key
   )
2. bundle_bytes = result.file_data
3. assert bundle_bytes[:8] == b"SOOSEFX1"
```

### 6.1 Capacity Considerations

DCT steganography has limited capacity relative to the carrier image size. Approximate capacities:

| Carrier Size | Approximate DCT Capacity | Records (est.) |
|---|---|---|
| 1 MP (1024x1024) | ~10 KB | ~20-40 records |
| 4 MP (2048x2048) | ~40 KB | ~80-160 records |
| 12 MP (4000x3000) | ~100 KB | ~200-400 records |

Record size varies (~200-500 bytes each after CBOR serialization, before compression). Zstd compression typically achieves a 2-4x ratio on CBOR attestation data. Use `check_capacity()` before embedding.

### 6.2 Multiple Images

For large export ranges, split the range across multiple bundles, each embedded in its own carrier image. Each bundle is self-contained with its own chain summary. The receiving side can import them in any order, because the chain indices and hashes enable reassembly.

## 7. Recipient Management

### 7.1 Adding Recipients

Recipients are identified by their Ed25519 public keys. To encrypt a bundle for a recipient, the creator needs only their public key (no shared secret setup required).

### 7.2 Recipient Discovery

Recipients' Ed25519 public keys can be obtained via:

- Direct exchange (QR code, USB transfer, verbal fingerprint verification)
- Federation server identity registry (when available)
- Verisoo's existing `peers.json` file

### 7.3 Self-Encryption

The bundle creator should always include their own public key in the recipients list. This allows them to decrypt their own exports (e.g., when restoring from backup).

## 8. Error Handling

| Error | Cause | Response |
|---|---|---|
| Bad magic | Not a SOOSEFX1 bundle | Reject with `ExportError("not a SooSeF export bundle")` |
| Bad version | Unsupported format version | Reject with `ExportError("unsupported bundle version")` |
| Signature invalid | Tampered summary or wrong signer | Reject with `ExportError("bundle signature verification failed")` |
| No matching recipient | Decryptor's key not in recipients list | Reject with `ExportError("not an authorized recipient")` |
| GCM auth failure | Tampered ciphertext or wrong key | Reject with `ExportError("decryption failed — bundle may be corrupted")` |
| Decompression failure | Corrupted compressed data | Reject with `ExportError("decompression failed")` |
| Chain integrity failure | Records don't link correctly | Reject with `ChainIntegrityError(...)` after decryption |
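
The first two rows of the table can be checked before any cryptographic work is done. The following is a minimal sketch of those checks. The `ExportError` class body, the `check_header` function, and the `SUPPORTED_VERSIONS` set are assumptions made for illustration; only the exception name and the message strings come from the table above.

```python
class ExportError(Exception):
    """Assumed base error for bundles that cannot be parsed or decrypted."""


MAGIC = b"SOOSEFX1"
SUPPORTED_VERSIONS = {1}  # hypothetical: the set of format versions a reader accepts


def check_header(data: bytes) -> int:
    """Validate magic and version; return the version on success."""
    # Input too short to hold the 9-byte header is treated as "not a bundle".
    if len(data) < 9 or data[:8] != MAGIC:
        raise ExportError("not a SooSeF export bundle")
    version = data[8]  # uint8 immediately after the magic
    if version not in SUPPORTED_VERSIONS:
        raise ExportError("unsupported bundle version")
    return version
```

Running the cheap structural checks first means a reader never attempts signature verification or decryption on input that is not a bundle at all. The later rows (signature, recipient lookup, GCM, decompression, chain linkage) would raise in the same style at their respective stages.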