# C2PA Integration Plan

**Audience:** FieldWitness developers and maintainers
**Status:** Planning (pre-implementation)
**Last updated:** 2026-04-01

## Overview

FieldWitness needs C2PA (Coalition for Content Provenance and Authenticity) export/import
capability. C2PA is the emerging industry standard for content provenance, backed by
Adobe, Microsoft, Google, and the BBC. ProofMode, Guardian Project, and Starling Lab
have all adopted C2PA. FieldWitness must speak C2PA to remain relevant in the provenance
space.

---

## C2PA Spec Essentials

- JUMBF-based provenance standard embedded in media files
- Core structures: **Manifest Store > Manifest > Claim + Assertions + Ingredients + Signature**
- Claims are CBOR maps with assertion references, signing algorithm, `claim_generator`,
  and timestamps
- Standard assertions:
  - `c2pa.actions` -- edit history
  - `c2pa.hash.data` -- hard binding (byte-range)
  - `c2pa.location.broad` -- city/region location
  - `c2pa.exif` -- EXIF metadata
  - `c2pa.creative.work` -- title, description, authorship
  - `c2pa.training-mining` -- AI training/mining consent
- Vendor-specific assertions under reverse-DNS (e.g., `org.fieldwitness.*`)
- Signing uses **COSE_Sign1** (RFC 9052)
- Supported algorithms: Ed25519 (OKP), ES256/ES384/ES512 (ECDSA), PS256/PS384/PS512 (RSA-PSS)
- **X.509 certificate chain required** -- embedded in COSE unprotected header; raw public
  keys are not sufficient
- Offline validation works with pre-installed trust anchors; self-signed certs work in
"local trust anchor" mode
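To make the nesting above concrete, here is a minimal sketch that assembles a manifest-like structure as a plain Python dict. The helper names (`assertion`, `manifest_skeleton`), the `claim_generator` value, and the payload fields are illustrative assumptions, not the C2PA schema; the labels are the ones listed in this section. The exact layout a C2PA library expects should be checked against its documentation.

```python
import json

STANDARD_PREFIX = "c2pa."
VENDOR_PREFIX = "org.fieldwitness."  # reverse-DNS namespace named in this plan


def assertion(label, data):
    """Build one assertion entry; reject labels outside the two known namespaces."""
    if not label.startswith((STANDARD_PREFIX, VENDOR_PREFIX)):
        raise ValueError(f"unexpected assertion namespace: {label}")
    return {"label": label, "data": data}


def manifest_skeleton(claim_generator, assertions):
    """Hypothetical manifest layout: a claim generator plus a list of assertions."""
    return {"claim_generator": claim_generator, "assertions": list(assertions)}


manifest = manifest_skeleton(
    "fieldwitness-sketch",  # placeholder value, not a real generator string
    [
        assertion("c2pa.actions", {"actions": [{"action": "c2pa.created"}]}),
        assertion(
            "org.fieldwitness.perceptual-hashes",
            {"version": 1, "phash": "0f0f0f0f0f0f0f0f"},  # dummy hash value
        ),
    ],
)
print(json.dumps(manifest, indent=2))
```

Keeping vendor data under its own namespace means a verifier that does not know `org.fieldwitness.*` can ignore those entries and still process the standard assertions.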

## Python Library: c2pa-python

- Canonical binding from C2PA org (PyPI: `c2pa-python`, GitHub: `contentauth/c2pa-python`)
- Rust extension (`c2pa-rs` via PyO3), not pure Python
- Version ~0.6.x, API not fully stable
- Platform wheels: manylinux2014 x86_64/aarch64, macOS, Windows
- **No armv6/armv7 wheels** -- affects Tier 1 Raspberry Pi deployments
- Core API: `c2pa.Reader`, `c2pa.Builder`, `builder.sign()`, `c2pa.create_signer()`
- `create_signer` takes a callback, algorithm, certs PEM, optional timestamp URL
- `timestamp_url=None` skips RFC 3161 timestamping (acceptable for offline use)

---

## Concept Mapping: FieldWitness to C2PA

### Clean mappings

| FieldWitness | C2PA |
|--------|------|
| `AttestationRecord` | C2PA Manifest |
| `attestor_fingerprint` | Signer cert subject (wrapped in X.509) |
| `AttestationRecord.timestamp` | Claim `created` (ISO 8601) |
| `CaptureMetadata.captured_at` | `c2pa.exif` DateTimeOriginal |
| `CaptureMetadata.location` | `c2pa.location.broad` |
| `CaptureMetadata.device` | `c2pa.exif` Make/Model |
| `CaptureMetadata.caption` | `c2pa.creative.work` description |
| `ImageHashes.sha256` | `c2pa.hash.data` (hard binding) |
| Ed25519 private key | COSE_Sign1 signing key (needs X.509 wrapper) |
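The capture-metadata rows of the table can be sketched as a small mapping function. The `CaptureMetadata` dataclass below is a hypothetical stand-in for the real FieldWitness model (in particular, splitting `device` into make and model fields is an assumption), and the payload keys inside each assertion are illustrative rather than taken from the C2PA schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class CaptureMetadata:
    """Hypothetical stand-in for the FieldWitness capture metadata model."""
    captured_at: Optional[datetime] = None
    location: Optional[str] = None      # assumed already reduced to city/region
    device_make: Optional[str] = None
    device_model: Optional[str] = None
    caption: Optional[str] = None


def to_assertions(meta):
    """Map populated metadata fields to the assertion labels from the table."""
    exif = {}
    if meta.captured_at is not None:
        exif["DateTimeOriginal"] = meta.captured_at.isoformat()
    if meta.device_make:
        exif["Make"] = meta.device_make
    if meta.device_model:
        exif["Model"] = meta.device_model

    assertions = []
    if exif:
        assertions.append({"label": "c2pa.exif", "data": exif})
    if meta.location:
        assertions.append({"label": "c2pa.location.broad", "data": {"location": meta.location}})
    if meta.caption:
        assertions.append({"label": "c2pa.creative.work", "data": {"description": meta.caption}})
    return assertions
```

Fields that are unset produce no assertion at all, which keeps the exported manifest limited to what the operator actually recorded.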

### FieldWitness has, C2PA does not

- Perceptual hashes (phash, dhash) -- map to vendor assertion `org.fieldwitness.perceptual-hashes`
- Merkle log inclusion proofs -- map to vendor assertion `org.fieldwitness.merkle-proof`
- Chain records with entropy witnesses -- map to vendor assertion `org.fieldwitness.chain-record`
- Delivery acknowledgment records (entirely FieldWitness-specific)
- Cross-org gossip federation
- Perceptual matching for verification (survives recompression)
- Selective disclosure / redaction

### C2PA has, FieldWitness does not

- Hard file binding (byte-range exclusion zones)
- X.509 certificate trust chains
- Actions history (`c2pa.actions`: crop, rotate, AI-generate, etc.)
- AI training/mining consent
- Ingredient DAG (content derivation graph)

---

## Privacy Design

Three tiers of identity disclosure:

1. **Org-level cert (preferred):** One self-signed X.509 cert per organization, not per
   person. Subject is org name. Individual reporters do not appear in the manifest.

2. **Pseudonym cert:** Subject is pseudonym or random UUID. Valid C2PA but unrecognized
   by external trust anchors.

3. **No C2PA export:** For critical-threat presets, evidence stays in FieldWitness format until
   reaching Tier 2.

### GPS handling

C2PA's `c2pa.location.broad` is city/region level. FieldWitness captures precise GPS. On
export, downsample to city-level unless the operator explicitly opts in. Precise GPS
stays in FieldWitness record only.
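One simple way to approximate the downsampling step is coordinate rounding. The sketch below is an assumption about how this could work, not the planned implementation: the function name `downsample_gps` and the `precise` and `decimals` parameters are hypothetical, and a gazetteer lookup to a named city would be an equally valid approach. One decimal place corresponds to roughly 11 km of latitude.

```python
def downsample_gps(lat, lon, precise=False, decimals=1):
    """Return coordinates for export, coarsened unless the operator opts in.

    `precise=True` models the explicit operator opt-in described above.
    """
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("coordinates out of range")
    if precise:
        return (lat, lon)
    # Rounding discards the fine-grained digits before they reach the manifest.
    return (round(lat, decimals), round(lon, decimals))


print(downsample_gps(48.85837, 2.29448))                 # coarse by default
print(downsample_gps(48.85837, 2.29448, precise=True))   # explicit opt-in
```

Making the coarse result the default means a caller has to pass `precise=True` deliberately, which matches the opt-in rule.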

### Metadata handling

Strip all EXIF from the output file except what is intentionally placed in the
`c2pa.exif` assertion.

---

## Offline-First Constraints

- **Tier 1 (field, no internet):** C2PA manifests without RFC 3161 timestamp. FieldWitness
  chain record provides timestamp anchoring via vendor assertion.
- **Tier 2 (org server, may have internet):** Optionally contact TSA at export time.
  Connects to existing `anchors.py` infrastructure.
- Entropy witnesses embedded as vendor assertions provide soft timestamp evidence.
- Evidence packages include org cert PEM alongside C2PA manifest for offline verification.
- `c2pa-python` availability gated behind `has_c2pa()` -- not all hardware can run it.
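The availability gate in the last bullet could look roughly like the sketch below. It assumes the import name is `c2pa` (as implied by `c2pa.Reader` earlier in this plan); `has_module` is a hypothetical helper added so the check can be exercised on machines without the library.

```python
from importlib.util import find_spec


def has_module(name):
    """True when a top-level module can be located without importing it."""
    try:
        return find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec can raise for broken or half-initialised modules.
        return False


def has_c2pa():
    """Gate for C2PA features; only checks that the package is installed.

    Locating the module does not prove the Rust extension loads on this
    hardware, so callers may still want to guard the real import.
    """
    return has_module("c2pa")


if has_c2pa():
    print("C2PA export available")
else:
    print("C2PA export unavailable; keeping FieldWitness format only")
```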

---

## Architecture

### New module: `src/fieldwitness/c2pa_bridge/`

```
src/fieldwitness/c2pa_bridge/
  __init__.py            # Public API: export, import, has_c2pa()
  cert.py                # Self-signed X.509 cert generation from Ed25519 key
  export.py              # AttestationRecord -> C2PA manifest
  importer.py            # C2PA manifest -> AttestationRecord (best-effort)
  vendor_assertions.py   # org.fieldwitness.* assertion schemas
  cli.py                 # CLI subcommands: fieldwitness c2pa export / verify / import
```

### Module relationships

- `export.py` reads from `attest/models.py`, `federation/chain.py`,
  `keystore/manager.py`; calls `cert.py` and `vendor_assertions.py`
- `importer.py` reads image bytes, writes `AttestationRecord` via
  `attest/attestation.py`, parses vendor assertions

### Web UI

New routes in the `attest.py` blueprint:
- `GET /attest/<record_id>/c2pa` -- download C2PA-embedded image
- `POST /attest/import-c2pa` -- upload and import C2PA manifest

### Evidence packages

`evidence.py` gains `include_c2pa=True` option. Adds C2PA-embedded file variants and
org cert to the ZIP.
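As a rough illustration of the `include_c2pa` option, the sketch below builds an in-memory ZIP with the standard library. The function name `build_evidence_zip`, its parameters, and the `originals/` and `c2pa/` directory layout are all assumptions made for this example; the real `evidence.py` interface may differ.

```python
import io
import zipfile


def build_evidence_zip(originals, c2pa_variants=None, org_cert_pem=None, include_c2pa=True):
    """Pack files into a ZIP; C2PA variants and the org cert are optional extras.

    `originals` and `c2pa_variants` map file names to bytes.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in originals.items():
            zf.writestr(f"originals/{name}", data)
        if include_c2pa:
            for name, data in (c2pa_variants or {}).items():
                zf.writestr(f"c2pa/{name}", data)
            if org_cert_pem is not None:
                # Bundled so a verifier can check signatures without network access.
                zf.writestr("c2pa/org_cert.pem", org_cert_pem)
    return buf.getvalue()
```

Passing `include_c2pa=False` yields a package with only the original files, which fits deployments where C2PA export is disabled.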

### pyproject.toml extra

```toml
c2pa = ["c2pa-python>=0.6.0", "fieldwitness[attest]"]
```

---

## Implementation Phases

### Phase 0 -- Prerequisites (~1h)

- `has_c2pa()` in `_availability.py`
- `c2pa` extra in `pyproject.toml`

### Phase 1 -- Certificate management (~3h)

- `c2pa_bridge/cert.py`
- Self-signed X.509 from Ed25519 identity key
- Configurable subject (org name default, pseudonym for high-threat)
- Store at `~/.fwmetadata/identity/c2pa_cert.pem`
- Regenerate on key rotation

### Phase 2 -- Export path (~6h)

- `c2pa_bridge/export.py` + `vendor_assertions.py`
- Core function `export_c2pa()` takes image data, `AttestationRecord`, key, cert, options
- Builds assertions: `c2pa.actions`, `c2pa.hash.data`, `c2pa.exif`, `c2pa.creative.work`,
  `org.fieldwitness.perceptual-hashes`, `org.fieldwitness.chain-record`, `org.fieldwitness.attestation-id`
- Vendor assertion schemas versioned (v1)
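A versioned vendor assertion could be sketched as follows. The payload fields (`version`, `phash`, `dhash`) and both function names are assumptions for illustration; only the label `org.fieldwitness.perceptual-hashes` and the v1 version come from this plan.

```python
SCHEMA_VERSION = 1  # "v1" from the plan
PERCEPTUAL_LABEL = "org.fieldwitness.perceptual-hashes"


def perceptual_hashes_assertion(phash, dhash):
    """Build the vendor assertion with an explicit schema version."""
    return {
        "label": PERCEPTUAL_LABEL,
        "data": {"version": SCHEMA_VERSION, "phash": phash, "dhash": dhash},
    }


def parse_perceptual_hashes(assertion):
    """Read the assertion back, refusing schema versions this code does not know."""
    if assertion.get("label") != PERCEPTUAL_LABEL:
        raise ValueError("not a perceptual-hashes assertion")
    data = assertion["data"]
    if data.get("version") != SCHEMA_VERSION:
        raise ValueError(f"unsupported schema version: {data.get('version')!r}")
    return {"phash": data["phash"], "dhash": data["dhash"]}
```

Carrying the version inside the payload lets the import path reject or migrate older records explicitly instead of misreading them.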

### Phase 3 -- Import path (~5h)

- `c2pa_bridge/importer.py`
- `import_c2pa()` reads C2PA manifest, produces `AttestationRecord`
- Maps C2PA fields to FieldWitness model
- Returns `C2PAImportResult` with `trust_status`
- Creates new FieldWitness attestation record over imported data

### Phase 4 -- CLI integration (~4h)

- `fieldwitness c2pa export/verify/import/show` subcommands
- Gated on `has_c2pa()`
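The subcommand layout and gating can be sketched with `argparse`. This is an assumption about structure only: the real CLI may use a different framework, and the `path` argument, the `c2pa_available` parameter, and the exit code are hypothetical.

```python
import argparse
import sys


def build_parser():
    """Parser for `fieldwitness c2pa <export|verify|import|show> <path>`."""
    parser = argparse.ArgumentParser(prog="fieldwitness")
    top = parser.add_subparsers(dest="command", required=True)
    c2pa = top.add_parser("c2pa", help="C2PA export, verification, and import")
    sub = c2pa.add_subparsers(dest="c2pa_command", required=True)
    for name in ("export", "verify", "import", "show"):
        cmd = sub.add_parser(name)
        cmd.add_argument("path", help="image file to operate on")
    return parser


def main(argv, c2pa_available):
    """Parse arguments, then refuse to run when the C2PA library is missing."""
    args = build_parser().parse_args(argv)
    if not c2pa_available:
        print("c2pa-python is not installed; install the c2pa extra", file=sys.stderr)
        return 2
    print(f"would run c2pa {args.c2pa_command} on {args.path}")
    return 0
```

In a real entry point, `c2pa_available` would come from `has_c2pa()`; it is passed in here so both branches can be exercised.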

### Phase 5 -- Web UI + evidence packages (~5h)

- Blueprint routes for export/import
- Evidence package C2PA option

### Phase 6 -- Threat-level presets (~2h)

- Add `c2pa` config block to each preset (`export_enabled`, `privacy_level`,
  `include_precise_gps`, `timestamp_url`)
- `C2PAConfig` sub-dataclass in `FieldWitnessConfig`
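A possible shape for the config block is sketched below. The four field names come from the bullet above; the default values, the allowed `privacy_level` strings, and the preset names `standard` and `critical` are assumptions, chosen so that the critical preset disables export as described under Privacy Design.

```python
from dataclasses import dataclass
from typing import Optional

ALLOWED_PRIVACY_LEVELS = ("org", "pseudonym")  # assumed values


@dataclass(frozen=True)
class C2PAConfig:
    """Hypothetical C2PA settings carried by each threat-level preset."""
    export_enabled: bool = True
    privacy_level: str = "org"
    include_precise_gps: bool = False      # precise GPS is opt-in
    timestamp_url: Optional[str] = None    # None skips RFC 3161 timestamping

    def __post_init__(self):
        if self.privacy_level not in ALLOWED_PRIVACY_LEVELS:
            raise ValueError(f"unknown privacy_level: {self.privacy_level!r}")


# Preset names are placeholders for illustration.
PRESETS = {
    "standard": C2PAConfig(),
    "critical": C2PAConfig(export_enabled=False, privacy_level="pseudonym"),
}
```

Defaults lean toward the private choice (no precise GPS, no network timestamping), so a preset has to opt in to anything that discloses more.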
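
### MVP scope

**Phases 0-2 (~10h):** Produces C2PA-compatible images viewable in Adobe Content
Credentials and other C2PA verifiers. Because the signing cert is self-signed, verifiers
that do not have the org cert installed as a trust anchor will report the signer as
unrecognized.

---
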
## Key Decisions (Before Coding)

1. **Use existing Ed25519 identity key for cert** (not a separate key) -- preserves
   single-key-domain design.
2. **Cert stored at `~/.fwmetadata/identity/c2pa_cert.pem`**, regenerated on key rotation.
3. **Tier 1 ARM fallback:** Tier 1 produces FieldWitness records; Tier 2 generates the C2PA
   export on Tier 1's behalf.
4. **Require `c2pa-python>=0.6.0`** (a minimum version, not an exact pin), add shim layer
   for API stability.
5. **Hard binding computed by `c2pa-python` Builder** automatically.

---

## FieldWitness's Unique C2PA Value

- **Cross-org chain of custody** via gossip federation (delivery ack records as ingredients)
- **Perceptual hash matching** embedded in C2PA (survives JPEG recompression via
  WhatsApp/Telegram)
- **Merkle log inclusion proofs** in manifest (proves attestation committed to append-only log)
- **Entropy witnesses** as soft timestamp attestation (makes backdating harder without
  RFC 3161)
- **Privacy-preserving by design** (org certs, GPS downsampling, zero-identity mode)
- **Fully offline end-to-end verification** (bundled cert + `c2pa-python`, no network needed)