Chain Format Specification
Status: Design
Version: 1 (record format version)
Last updated: 2026-04-01
1. Overview
The attestation chain is an append-only sequence of signed records stored locally on the offline device. Each record includes a hash of the previous record, forming a tamper-evident chain analogous to git commits or blockchain blocks.
The chain wraps existing Attest attestation records. An Attest record's serialized bytes
become the input to content_hash, preserving the original attestation while adding
ordering, entropy witnesses, and chain integrity guarantees.
2. AttestationChainRecord
Field Definitions
| Field | CBOR Key | Type | Size | Description |
|---|---|---|---|---|
| version | 0 | unsigned int | 1 byte | Record format version. Currently 1. |
| record_id | 1 | byte string | 16 bytes | UUID v7 (RFC 9562). Time-ordered unique identifier. |
| chain_index | 2 | unsigned int | 8 bytes max | Monotonically increasing, 0-based. Genesis record is index 0. |
| prev_hash | 3 | byte string | 32 bytes | SHA-256 of canonical_bytes(previous_record). Genesis: 0x00 * 32. |
| content_hash | 4 | byte string | 32 bytes | SHA-256 of the wrapped content (e.g., Attest record bytes). |
| content_type | 5 | text string | variable | MIME-like type identifier. "attest/attestation-v1" for Attest records. |
| metadata | 6 | CBOR map | variable | Extensible key-value map. See §2.1. |
| claimed_ts | 7 | integer | 8 bytes max | Unix timestamp in microseconds (µs). Signed integer to handle pre-epoch dates. |
| entropy_witnesses | 8 | CBOR map | variable | System entropy snapshot. See §3. |
| signer_pubkey | 9 | byte string | 32 bytes | Ed25519 raw public key bytes. |
| signature | 10 | byte string | 64 bytes | Ed25519 signature over canonical_bytes(record) excluding the signature field. |
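As an illustration of the field table, the record could be modelled in memory roughly as follows. This is a minimal sketch, not the project's actual class: the names `new_record_id`, `check_sizes`, and `GENESIS_PREV_HASH` are hypothetical, and the UUID v7 helper is hand-rolled from the RFC 9562 layout so that it does not depend on a particular Python version.

```python
import os
import time
from dataclasses import dataclass
from typing import Optional

GENESIS_PREV_HASH = b"\x00" * 32  # prev_hash of the genesis record


def new_record_id(now_ms: Optional[int] = None) -> bytes:
    """Build a 16-byte UUID v7: 48-bit Unix ms timestamp, then random bits."""
    ts_ms = int(time.time() * 1000) if now_ms is None else now_ms
    rand = bytearray(os.urandom(10))
    rand[0] = (rand[0] & 0x0F) | 0x70  # version nibble = 7
    rand[2] = (rand[2] & 0x3F) | 0x80  # variant bits = 0b10
    return ts_ms.to_bytes(6, "big") + bytes(rand)


@dataclass
class AttestationChainRecord:
    version: int             # CBOR key 0
    record_id: bytes         # CBOR key 1, 16 bytes
    chain_index: int         # CBOR key 2
    prev_hash: bytes         # CBOR key 3, 32 bytes
    content_hash: bytes      # CBOR key 4, 32 bytes
    content_type: str        # CBOR key 5
    metadata: dict           # CBOR key 6
    claimed_ts: int          # CBOR key 7, Unix microseconds
    entropy_witnesses: dict  # CBOR key 8
    signer_pubkey: bytes     # CBOR key 9, 32 bytes
    signature: bytes         # CBOR key 10, 64 bytes

    def check_sizes(self) -> None:
        """Raise ValueError if a fixed-size field has the wrong length."""
        expected = {
            "record_id": 16,
            "prev_hash": 32,
            "content_hash": 32,
            "signer_pubkey": 32,
            "signature": 64,
        }
        for name, size in expected.items():
            if len(getattr(self, name)) != size:
                raise ValueError(f"{name} must be {size} bytes")
```

Checking the fixed sizes before hashing or signing keeps malformed records out of the chain early; whether the real implementation validates at this point is not stated in this document.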
2.1 Metadata Map
The metadata field is an open CBOR map with text string keys. Defined keys:
| Key | Type | Description |
|---|---|---|
| "backfilled" | bool | true if this record was created by the backfill migration |
| "caption" | text | Human-readable description of the attested content |
| "location" | text | Location name associated with the attestation |
| "original_ts" | integer | Original Attest timestamp (µs) if different from claimed_ts |
| "tags" | array of text | User-defined classification tags |
Applications may add custom keys. Unknown keys must be preserved during serialization.
3. Entropy Witnesses
Entropy witnesses are system-state snapshots collected at record creation time. They serve as soft evidence that the claimed timestamp is plausible. Fabricating convincing witnesses for a backdated record requires simulating the full system state at the claimed time.
| Field | CBOR Key | Type | Source | Fallback (non-Linux) |
|---|---|---|---|---|
| sys_uptime | 0 | float64 | time.monotonic() | Same (cross-platform) |
| fs_snapshot | 1 | byte string (16 bytes) | SHA-256 of os.stat() on chain DB, truncated to 16 bytes | SHA-256 of chain dir stat |
| proc_entropy | 2 | unsigned int | /proc/sys/kernel/random/entropy_avail | len(os.urandom(32)) (always 32, marker for non-Linux) |
| boot_id | 3 | text string | /proc/sys/kernel/random/boot_id | uuid.uuid4() cached per process lifetime |
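The sources in the table could be collected roughly as shown below. This is a sketch under assumptions: the function name `collect_entropy_witnesses` is hypothetical, the exact byte layout of the stat fields fed to SHA-256 is not specified in this document (a colon-joined string is assumed here), and the sketch falls back to the chain directory whenever the chain DB file does not exist yet.

```python
import hashlib
import os
import time
import uuid
from pathlib import Path

# Fallback boot_id, generated once and cached for the process lifetime.
_FALLBACK_BOOT_ID = str(uuid.uuid4())


def _read_text(path: str):
    """Return the stripped file contents, or None if the file is unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def collect_entropy_witnesses(chain_db: Path) -> dict:
    """Return the witness map keyed by CBOR key (0..3)."""
    # Assumption: stat the DB when present, otherwise its directory.
    target = chain_db if chain_db.exists() else chain_db.parent
    st = os.stat(target)
    stat_repr = f"{st.st_mtime_ns}:{st.st_ctime_ns}:{st.st_size}:{st.st_ino}"
    fs_snapshot = hashlib.sha256(stat_repr.encode()).digest()[:16]

    entropy = _read_text("/proc/sys/kernel/random/entropy_avail")
    if entropy and entropy.isdigit():
        proc_entropy = int(entropy)
    else:
        proc_entropy = len(os.urandom(32))  # always 32: non-Linux marker

    boot_id = _read_text("/proc/sys/kernel/random/boot_id") or _FALLBACK_BOOT_ID

    return {0: time.monotonic(), 1: fs_snapshot, 2: proc_entropy, 3: boot_id}
```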
Witness Properties
- sys_uptime: Monotonically increasing within a boot. Cannot decrease. A record with sys_uptime < previous_record.sys_uptime and claimed_ts > previous_record.claimed_ts is suspicious (reboot or clock manipulation).
- fs_snapshot: Changes with every write to the chain DB. Hash includes mtime, ctime, size, and inode number.
- proc_entropy: Varies naturally. On Linux, reflects kernel entropy pool state.
- boot_id: Changes on every reboot. Identical boot_id across records implies same boot session; combined with sys_uptime, this constrains the timeline.
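A verifier could turn the properties above into soft checks along these lines. This is only a sketch: the function name `witness_warnings`, the dict layout, and the `tolerance_s` threshold are hypothetical, and the drift comparison is one possible reading of "constrains the timeline" rather than something this document prescribes. A monotonic clock may not advance while the machine is suspended, so drift within one boot is a hint, not proof.

```python
def witness_warnings(prev: dict, curr: dict, tolerance_s: float = 5.0) -> list:
    """Compare two consecutive records' witnesses; return soft warnings.

    prev and curr are dicts with 'claimed_ts' (Unix microseconds),
    'sys_uptime' (seconds) and 'boot_id' (text).
    """
    warnings = []
    same_boot = prev["boot_id"] == curr["boot_id"]
    uptime_went_back = curr["sys_uptime"] < prev["sys_uptime"]
    ts_advanced = curr["claimed_ts"] > prev["claimed_ts"]

    if uptime_went_back and ts_advanced:
        if same_boot:
            warnings.append("uptime decreased within one boot: clock manipulation suspected")
        else:
            warnings.append("uptime decreased across boots: consistent with a reboot")

    if same_boot:
        # Within one boot, elapsed claimed time should roughly track uptime.
        dt_claimed = (curr["claimed_ts"] - prev["claimed_ts"]) / 1_000_000
        dt_uptime = curr["sys_uptime"] - prev["sys_uptime"]
        if abs(dt_claimed - dt_uptime) > tolerance_s:
            warnings.append("claimed_ts and sys_uptime drifted apart within one boot")

    return warnings
```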
4. Serialization
4.1 Canonical Bytes
canonical_bytes(record) produces the deterministic byte representation used for hashing
and signing. It is a CBOR map containing all fields except signature, encoded using
CBOR canonical encoding (RFC 8949 §4.2):
- Map keys sorted by integer value (0, 1, 2, ..., 9)
- Integers use minimal-length encoding
- No indefinite-length items
- No duplicate keys
canonical_bytes(record) = cbor2.dumps({
    0: record.version,
    1: record.record_id,
    2: record.chain_index,
    3: record.prev_hash,
    4: record.content_hash,
    5: record.content_type,
    6: record.metadata,
    7: record.claimed_ts,
    8: {
        0: record.entropy_witnesses.sys_uptime,
        1: record.entropy_witnesses.fs_snapshot,
        2: record.entropy_witnesses.proc_entropy,
        3: record.entropy_witnesses.boot_id,
    },
    9: record.signer_pubkey,
}, canonical=True)
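To make the encoding rules concrete, here is a small standard-library sketch of deterministic CBOR for a subset of types (integers, byte strings, text strings, booleans, arrays, and maps). It is illustrative only and not a replacement for cbor2: floats are deliberately left out because their shortest-form rules are more involved, so it cannot encode sys_uptime. Map entries are sorted by the bytewise order of their encoded keys, as described in RFC 8949 §4.2.1; for the small integer keys used by the record this is the same as sorting by integer value.

```python
import struct


def _head(major: int, value: int) -> bytes:
    """Encode a CBOR head using the shortest possible argument."""
    if value < 24:
        return bytes([(major << 5) | value])
    if value < 1 << 8:
        return bytes([(major << 5) | 24, value])
    if value < 1 << 16:
        return bytes([(major << 5) | 25]) + struct.pack(">H", value)
    if value < 1 << 32:
        return bytes([(major << 5) | 26]) + struct.pack(">I", value)
    if value < 1 << 64:
        return bytes([(major << 5) | 27]) + struct.pack(">Q", value)
    raise ValueError("integer out of CBOR range")


def encode(obj) -> bytes:
    """Deterministically encode a subset of Python values as CBOR."""
    if isinstance(obj, bool):  # must precede int: bool is an int subclass
        return b"\xf5" if obj else b"\xf4"
    if isinstance(obj, int):
        return _head(0, obj) if obj >= 0 else _head(1, -1 - obj)
    if isinstance(obj, bytes):
        return _head(2, len(obj)) + obj
    if isinstance(obj, str):
        raw = obj.encode("utf-8")
        return _head(3, len(raw)) + raw
    if isinstance(obj, (list, tuple)):
        return _head(4, len(obj)) + b"".join(encode(x) for x in obj)
    if isinstance(obj, dict):
        # Definite-length map, entries sorted by encoded key bytes.
        pairs = sorted((encode(k), encode(v)) for k, v in obj.items())
        return _head(5, len(pairs)) + b"".join(k + v for k, v in pairs)
    raise TypeError(f"unsupported type in this sketch: {type(obj).__name__}")
```

Because the output depends only on the values and not on dict insertion order, two implementations that follow the same rules produce identical bytes, which is what makes the hash and the signature reproducible.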
4.2 Record Hash
compute_record_hash(record) = SHA-256(canonical_bytes(record))
This hash is used as prev_hash in the next record and as a Merkle tree leaf in export
bundles.
4.3 Signature
record.signature = Ed25519_Sign(private_key, canonical_bytes(record))
Verification:
Ed25519_Verify(record.signer_pubkey, record.signature, canonical_bytes(record))
4.4 Full Serialization
serialize_record(record) produces the full CBOR encoding including the signature field
(CBOR key 10). This is used for storage and transmission.
serialize_record(record) = cbor2.dumps({
    0: record.version,
    1: record.record_id,
    ...
    9: record.signer_pubkey,
    10: record.signature,
}, canonical=True)
5. Chain Rules
5.1 Genesis Record
The first record in a chain (index 0) has:
- chain_index = 0
- prev_hash = b'\x00' * 32 (32 zero bytes)
The chain ID is defined as SHA-256(canonical_bytes(genesis_record)). This permanently
identifies the chain.
5.2 Append Rule
For record N (where N > 0):
record_N.chain_index == record_{N-1}.chain_index + 1
record_N.prev_hash == compute_record_hash(record_{N-1})
record_N.claimed_ts >= record_{N-1}.claimed_ts (SHOULD, not MUST — clock skew possible)
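The append rule can be sketched as a check that separates hard failures from the SHOULD-level timestamp condition. The `Link` tuple and `check_append` function are hypothetical names, and the `canonical` field stands in for the output of canonical_bytes(record), which is not recomputed here.

```python
import hashlib
from typing import List, NamedTuple, Tuple


class Link(NamedTuple):
    chain_index: int
    prev_hash: bytes
    claimed_ts: int   # Unix microseconds
    canonical: bytes  # stand-in for canonical_bytes(record)


def record_hash(link: Link) -> bytes:
    """SHA-256 over the record's canonical bytes."""
    return hashlib.sha256(link.canonical).digest()


def check_append(prev: Link, new: Link) -> Tuple[List[str], List[str]]:
    """Return (errors, warnings) for appending `new` after `prev`."""
    errors, warnings = [], []
    if new.chain_index != prev.chain_index + 1:
        errors.append("chain_index is not previous + 1")
    if new.prev_hash != record_hash(prev):
        errors.append("prev_hash does not match hash of previous record")
    if new.claimed_ts < prev.claimed_ts:
        # SHOULD, not MUST: clock skew can legitimately cause this.
        warnings.append("claimed_ts went backwards (possible clock skew)")
    return errors, warnings
```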
5.3 Verification
Full chain verification checks, for each record from index 0 to head:
1. Ed25519_Verify(record.signer_pubkey, record.signature, canonical_bytes(record)) (signature valid)
2. record.chain_index == expected_index (no gaps or duplicates)
3. record.prev_hash == compute_record_hash(previous_record) (chain link intact)
4. All signer_pubkey values are identical within a chain (single-signer chain)
Violation of rule 4 indicates a chain was signed by multiple identities, which may be legitimate (key rotation) or malicious (chain hijacking). Key rotation is out of scope for v1; implementations should flag this as a warning.
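A full verification pass over the four rules might look like the sketch below. Ed25519 is not in the Python standard library, so the signature check is passed in as a callable; `StoredRecord`, `verify_chain`, and the `verify_sig` parameter are hypothetical names, and `canonical` again stands in for canonical_bytes(record). Rule 4 is reported as a warning rather than an error, following the guidance above.

```python
import hashlib
from typing import Callable, Iterable, List, NamedTuple, Tuple


class StoredRecord(NamedTuple):
    chain_index: int
    prev_hash: bytes
    signer_pubkey: bytes
    signature: bytes
    canonical: bytes  # stand-in for canonical_bytes(record)


def verify_chain(
    records: Iterable[StoredRecord],
    verify_sig: Callable[[bytes, bytes, bytes], bool],
) -> Tuple[List[str], List[str]]:
    """Check rules 1-4 from index 0 to head; return (errors, warnings)."""
    errors, warnings = [], []
    expected_prev = b"\x00" * 32  # genesis prev_hash
    first_signer = None
    for expected_index, rec in enumerate(records):
        if not verify_sig(rec.signer_pubkey, rec.signature, rec.canonical):
            errors.append(f"record {expected_index}: invalid signature")
        if rec.chain_index != expected_index:
            errors.append(f"record {expected_index}: unexpected chain_index")
        if rec.prev_hash != expected_prev:
            errors.append(f"record {expected_index}: prev_hash mismatch")
        if first_signer is None:
            first_signer = rec.signer_pubkey
        elif rec.signer_pubkey != first_signer:
            warnings.append(f"record {expected_index}: signer differs from genesis")
        expected_prev = hashlib.sha256(rec.canonical).digest()
    return errors, warnings
```

In real use, `verify_sig` would wrap an Ed25519 verification routine from whichever cryptography library the project uses.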
6. Storage Format
6.1 chain.bin (Append-Only Log)
Records are stored sequentially as length-prefixed CBOR:
┌─────────────────────────────┐
│ uint32 BE: record_0 length │
│ bytes: serialize(record_0) │
├─────────────────────────────┤
│ uint32 BE: record_1 length │
│ bytes: serialize(record_1) │
├─────────────────────────────┤
│ ... │
└─────────────────────────────┘
- Length prefix is 4 bytes, big-endian unsigned 32-bit integer
- Maximum record size: 2^32 - 1 bytes, just under 4 GiB (practical limit much smaller)
- File is append-only; records are never modified or deleted
- File locking via fcntl.flock(LOCK_EX) for single-writer safety
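The log layout can be exercised with a short sketch like the one below. The function names `append_record` and `iter_records` are hypothetical, and the `fsync` call is an added assumption about durability that this document does not require. `fcntl` is POSIX-only, matching the locking call named above.

```python
import fcntl
import os
import struct
from typing import Iterator


def append_record(path: str, serialized: bytes) -> None:
    """Append one length-prefixed record under an exclusive lock."""
    with open(path, "ab") as f:
        fcntl.flock(f.fileno(), fcntl.LOCK_EX)
        try:
            # 4-byte big-endian length prefix, then the record bytes.
            f.write(struct.pack(">I", len(serialized)) + serialized)
            f.flush()
            os.fsync(f.fileno())  # assumption: force the write to disk
        finally:
            fcntl.flock(f.fileno(), fcntl.LOCK_UN)


def iter_records(path: str) -> Iterator[bytes]:
    """Yield each serialized record in order, detecting truncation."""
    with open(path, "rb") as f:
        while True:
            prefix = f.read(4)
            if not prefix:
                return  # clean end of file
            if len(prefix) < 4:
                raise ValueError("truncated length prefix")
            (length,) = struct.unpack(">I", prefix)
            body = f.read(length)
            if len(body) < length:
                raise ValueError("truncated record body")
            yield body
```

A truncated tail usually means an append was interrupted; how the real implementation recovers from that is not covered in this document.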
6.2 state.cbor (Chain State Checkpoint)
A single CBOR map, atomically rewritten after each append:
{
    "chain_id": bytes[32],     # SHA-256(canonical_bytes(genesis))
    "head_index": uint,        # Index of the most recent record
    "head_hash": bytes[32],    # Hash of the most recent record
    "record_count": uint,      # Total records in chain
    "created_at": int,         # Unix µs when chain was created
    "last_append_at": int      # Unix µs of last append
}
This file is a performance optimization — the canonical state is always derivable from
chain.bin. On corruption, state.cbor is rebuilt by scanning the log.
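This document does not say how the atomic rewrite is achieved. One common approach on POSIX systems, shown here purely as an assumption, is to write the new checkpoint to a temporary file in the same directory and then rename it over the old one. The function name `write_atomically` is hypothetical, and `payload` is taken to be the already-encoded CBOR map.

```python
import os
import tempfile


def write_atomically(path: str, payload: bytes) -> None:
    """Replace `path` with `payload` so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".state-", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, path)  # atomic rename on POSIX
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If the process dies before the rename, the previous state.cbor is still intact, and in the worst case the state can be rebuilt from chain.bin as described above.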
6.3 File Locations
~/.fwmetadata/chain/
    chain.bin     Append-only record log
    state.cbor    Chain state checkpoint
Paths are defined in src/fieldwitness/paths.py.
7. Migration from Attest-Only Attestations
Existing Attest attestations in ~/.fwmetadata/attestations/ are not modified. The chain
is a parallel structure. Migration is performed by the fieldwitness chain backfill command:
1. Iterate all records in Attest's LocalStorage (ordered by timestamp)
2. For each record, compute content_hash = SHA-256(record.to_bytes())
3. Create a chain record with:
   - content_type = "attest/attestation-v1"
   - claimed_ts set to the original Attest timestamp
   - metadata = {"backfilled": true, "original_ts": <attest_timestamp>}
   - Entropy witnesses collected at migration time (not original time)
4. Append to chain
Backfilled records are distinguishable via the backfilled metadata flag. Their entropy
witnesses reflect migration time, not original attestation time — this is honest and
intentional.
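The per-record part of the backfill can be sketched as follows. This is not the actual `fieldwitness chain backfill` implementation: `plan_backfill` is a hypothetical name, Attest records are represented as plain (timestamp_us, serialized_bytes) pairs instead of being read from LocalStorage, and entropy witnesses, prev_hash linking, and signing are omitted.

```python
import hashlib
from typing import Iterable, List, Tuple


def plan_backfill(
    attest_records: Iterable[Tuple[int, bytes]],
    start_index: int = 0,
) -> List[dict]:
    """Build the chain-record fields for each Attest record.

    attest_records holds (timestamp_us, serialized_bytes) pairs.
    """
    planned = []
    ordered = sorted(attest_records, key=lambda rec: rec[0])  # by timestamp
    for offset, (ts_us, raw) in enumerate(ordered):
        planned.append({
            "chain_index": start_index + offset,
            "content_hash": hashlib.sha256(raw).digest(),
            "content_type": "attest/attestation-v1",
            "claimed_ts": ts_us,  # original Attest timestamp
            "metadata": {"backfilled": True, "original_ts": ts_us},
        })
    return planned
```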
8. Content Types
The content_type field identifies what was hashed into content_hash. Defined types:
| Content Type | Description |
|---|---|
| attest/attestation-v1 | Attest AttestationRecord serialized bytes |
| fieldwitness/raw-file-v1 | Raw file bytes (for non-image attestations, future) |
| fieldwitness/metadata-only-v1 | No file content; metadata-only attestation (future) |
New content types may be added without changing the record format version.