fieldwitness/docs/source-dropbox.md
Aaron D. Lee 6325e86873
Comprehensive documentation for v0.2.0 release
README.md (700 lines):
- Three-tier deployment model with ASCII diagram
- Federation blueprint in web UI routes
- deploy/ directory in architecture tree
- Documentation index linking all guides

CLAUDE.md (256 lines):
- Updated architecture tree with all new docs and deploy files

New guides:
- docs/federation.md (317 lines) — gossip protocol mechanics, peer
  setup, trust filtering, offline bundles, relay deployment, jurisdiction
- docs/evidence-guide.md (283 lines) — evidence packages, cold archives,
  selective disclosure, chain anchoring, legal discovery workflow
- docs/source-dropbox.md (220 lines) — token management, client-side
  hashing, extract-then-strip pipeline, receipt mechanics, opsec
- docs/index.md — documentation hub linking all guides

Training materials:
- docs/training/reporter-quickstart.md (105 lines) — printable one-page
  card: boot USB, attest photo, encode message, check-in, emergency
- docs/training/emergency-card.md (79 lines) — wallet-sized laminated
  card: three destruction methods, 10-step order, key contacts
- docs/training/admin-reference.md (219 lines) — deployment tiers,
  CLI tables, backup checklist, hardening checklist, troubleshooting

Also includes existing architecture docs from the original repos.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 23:31:47 -04:00


# Source Drop Box Setup Guide
**Audience**: Administrators setting up SooSeF's anonymous source intake feature.
**Prerequisites**: A running SooSeF instance with web UI enabled (`soosef[web]` extra),
an admin account, and HTTPS configured (self-signed is acceptable).
---
## Overview
The source drop box is a SecureDrop-style anonymous file intake built into the SooSeF web
UI. Admins create time-limited upload tokens; sources open the token URL in a browser and
submit files without creating an account. Files are processed through the extract-then-strip
EXIF pipeline and automatically attested on receipt. Sources receive HMAC-derived receipt
codes that prove delivery.
> **Warning:** The drop box protects source identity through design -- no accounts, no
> branding, no IP logging. However, the security of the system depends on how the upload URL
> is shared. Never send drop box URLs over unencrypted email or SMS.
---
## How It Works
```
Admin                         Source                         SooSeF Server
  |                              |                                 |
  |-- Create token ------------->|                                 |
  |   (label, expiry, max_files) |                                 |
  |                              |                                 |
  |-- Share URL (secure channel) |                                 |
  |                              |                                 |
  |                              |-- Open URL in browser --------->|
  |                              |   (no login required)           |
  |                              |                                 |
  |                              |-- Select files                  |
  |                              |   Browser computes SHA-256      |
  |                              |   (SubtleCrypto, client-side)   |
  |                              |                                 |
  |                              |-- Upload files ---------------->|
  |                              |                                 |-- Extract EXIF
  |                              |                                 |-- Strip metadata
  |                              |                                 |-- Attest originals
  |                              |                                 |-- Save stripped copy
  |                              |                                 |
  |                              |<-- Receipt codes ---------------|
  |                              |   (HMAC of file hash + token)   |
```
---
## Setting Up the Drop Box
### Step 1: Ensure HTTPS is enabled
The drop box should always be served over HTTPS. Sources must be able to trust that their
connection is not being intercepted.
```bash
$ soosef serve --host 0.0.0.0
```
SooSeF auto-generates a self-signed certificate on first HTTPS start. For production use,
place a reverse proxy with a proper TLS certificate in front of SooSeF.
### Step 2: Create an upload token
Navigate to `/dropbox/admin` in the web UI (admin login required), or use the admin panel.
Each token has:
| Field | Default | Description |
|---|---|---|
| **Label** | "Unnamed source" | Human-readable name for the source (stored server-side only, never shown to the source) |
| **Expiry** | 24 hours | How long the upload link remains valid |
| **Max files** | 10 | Maximum number of uploads allowed on this link |
After creating the token, the admin receives a URL of the form:
```
https://<host>:<port>/dropbox/upload/<token>
```
The token is a 32-byte cryptographically random URL-safe string.
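
As a rough illustration of the token fields above, the following sketch models a token record in Python. The `UploadToken` class, its field names, and the reading of "32-byte" as 32 random bytes encoded with `secrets.token_urlsafe` are assumptions made for this example, not SooSeF's actual implementation.

```python
import secrets
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Optional


def _default_expiry() -> datetime:
    # Default expiry from the table above: 24 hours after creation.
    return datetime.now(timezone.utc) + timedelta(hours=24)


@dataclass
class UploadToken:
    """Hypothetical token record; defaults mirror the table above."""

    label: str = "Unnamed source"
    max_files: int = 10
    used: int = 0
    # Assumption: 32 random bytes, base64url-encoded (43 characters).
    value: str = field(default_factory=lambda: secrets.token_urlsafe(32))
    expires_at: datetime = field(default_factory=_default_expiry)

    def is_usable(self, now: Optional[datetime] = None) -> bool:
        """True while the link is unexpired and under its file limit."""
        now = now or datetime.now(timezone.utc)
        return now < self.expires_at and self.used < self.max_files

    def upload_url(self, host: str, port: int) -> str:
        """Build the URL in the form shown above."""
        return f"https://{host}:{port}/dropbox/upload/{self.value}"
```

Generating the value with the `secrets` module rather than `random` matters here, because the token is the only credential protecting the upload link.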
### Step 3: Share the URL with the source
Share the upload URL over an already-secure channel:
- **Best**: in person, on paper
- **Good**: encrypted messaging (Signal, Wire)
- **Acceptable**: verbal dictation over a secure voice call
- **Never**: unencrypted email, SMS, or any channel that could be intercepted
### Step 4: Source uploads files
The source opens the URL in their browser. The upload page is minimal -- no SooSeF branding,
no identifying marks, generic styling. It loads no external resources (no CDN, no fonts, no
analytics) and works in Tor Browser, provided JavaScript is enabled for client-side hashing.
When files are selected:
1. The browser computes SHA-256 fingerprints client-side using SubtleCrypto
2. The source sees the fingerprints and is prompted to save them before uploading
3. On upload, the server processes each file through the extract-then-strip pipeline
4. The source receives receipt codes for each file
### Step 5: Monitor submissions
The admin panel at `/dropbox/admin` shows:
- Active tokens with their usage counts
- Token expiry times
- Ability to revoke tokens immediately
---
## The Extract-Then-Strip Pipeline
Every file uploaded through the drop box is processed through SooSeF's EXIF pipeline:
1. **Extract**: all EXIF metadata is read from the original image bytes
2. **Classify**: fields are split into evidentiary (GPS coordinates, capture timestamp --
valuable for provenance) and dangerous (device serial number, firmware version -- could
identify the source's device)
3. **Attest**: the original bytes are attested (Ed25519 signed) with evidentiary metadata
included in the attestation record. The attestation hash matches what the source actually
submitted.
4. **Strip**: all metadata is removed from the stored copy. The stripped copy is saved to
disk. No device fingerprint persists on the server's storage.
This resolves the tension between protecting the source (no device-identifying metadata
remains in the stored file) and preserving evidence (GPS and timestamp are retained in the
attestation record for provenance).
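
The classify step can be sketched as a split over the extracted fields. The tag sets below use standard EXIF tag names chosen for illustration; the lists SooSeF actually uses are not given in this guide, and `classify_exif` is a hypothetical helper.

```python
from typing import Any, Dict, Tuple

# Illustrative tag sets only (assumption): the real classification
# lists are not shown in this guide.
EVIDENTIARY_TAGS = {"GPSLatitude", "GPSLongitude", "DateTimeOriginal"}
DANGEROUS_TAGS = {"BodySerialNumber", "Software"}


def classify_exif(fields: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split extracted EXIF fields into (evidentiary, dangerous).

    Fields in neither set are returned in neither dict. Under the
    pipeline described above they would not enter the attestation
    record, and the strip step removes them from the stored copy.
    """
    evidentiary = {k: v for k, v in fields.items() if k in EVIDENTIARY_TAGS}
    dangerous = {k: v for k, v in fields.items() if k in DANGEROUS_TAGS}
    return evidentiary, dangerous
```

Only the first returned dict would be carried into the attestation record; the second exists so an operator can see what was withheld.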
---
## Receipt Codes
Each uploaded file generates an HMAC-derived receipt code:
```
receipt_code = HMAC-SHA256(token, file_sha256)[:16]
```
The receipt code proves:
- The server received the specific file (tied to the file's SHA-256)
- The file was received under the specific token (tied to the token value)
Sources can verify their receipt by posting it to `/dropbox/verify-receipt`. This returns
the filename, SHA-256, and reception timestamp if the receipt is valid.
> **Note:** Receipt codes are deterministic. The source can compute the expected receipt
> themselves if they know the token value and the file's SHA-256 hash, providing
> independent verification.
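
A minimal sketch of that independent check, using Python's standard `hmac` module. The encoding details are assumptions: the formula above does not say whether the token is the HMAC key, whether the hash is fed in as hex text or raw bytes, or whether `[:16]` counts hex characters or bytes. This example uses the token as key, the hex digest as message, and the first 16 hex characters of the output.

```python
import hashlib
import hmac


def receipt_code(token: str, file_sha256_hex: str) -> str:
    """Compute HMAC-SHA256(token, file_sha256)[:16] under the assumed encoding."""
    mac = hmac.new(
        token.encode("utf-8"),            # assumption: token is the key
        file_sha256_hex.encode("ascii"),  # assumption: hex digest is the message
        hashlib.sha256,
    )
    return mac.hexdigest()[:16]


def receipt_matches(token: str, file_sha256_hex: str, claimed: str) -> bool:
    """Compare in constant time to avoid leaking how many characters match."""
    return hmac.compare_digest(receipt_code(token, file_sha256_hex), claimed)
```

If a locally computed code differs from the one the server issued, the mismatch may only mean the encoding assumptions above are wrong, so treat `/dropbox/verify-receipt` as the authoritative check.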
---
## Client-Side SHA-256
The upload page computes SHA-256 fingerprints in the browser before upload using the
SubtleCrypto Web API. This gives the source a verifiable record of exactly what they
submitted -- the hash is computed on their device, not the server.
The source should save these fingerprints before uploading. If the server later claims to
have received different content, the source can prove what they actually submitted by
comparing their locally computed hash with the SHA-256 the server returns for their receipt.
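
The upload page itself hashes with SubtleCrypto in the browser. As an offline cross-check, a source or admin could recompute the same fingerprint with Python's `hashlib`. The sketch below is not part of SooSeF, and the helper names are hypothetical.

```python
import hashlib
from typing import Iterable


def sha256_chunks(chunks: Iterable[bytes]) -> str:
    """Hash an iterable of byte chunks and return the hex digest."""
    digest = hashlib.sha256()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def sha256_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so large files need not fit in memory."""
    with open(path, "rb") as fh:
        # iter(callable, sentinel) keeps reading until read() returns b"".
        return sha256_chunks(iter(lambda: fh.read(chunk_size), b""))
```

The hex string returned by `sha256_file` should equal the fingerprint the upload page displayed for the same file, since SHA-256 output does not depend on the implementation.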
---
## Storage
| What | Where |
|---|---|
| Uploaded files (stripped) | `~/.soosef/temp/dropbox/` (mode 0700) |
| Token metadata | `~/.soosef/auth/dropbox.db` (SQLite) |
| Receipt codes | `~/.soosef/auth/dropbox.db` (SQLite) |
| Attestation records | `~/.soosef/attestations/` (standard attestation log) |
Expired tokens are cleaned up automatically on every admin page load.
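
To confirm that the upload directory really has the mode listed in the table, an admin could run a small check like the one below. This is a POSIX-only sketch, and `is_private_dir` is a hypothetical helper, not a SooSeF command.

```python
import os
import stat


def is_private_dir(path: str) -> bool:
    """True if path is a directory whose permission bits are exactly 0700."""
    st = os.stat(path)
    return stat.S_ISDIR(st.st_mode) and stat.S_IMODE(st.st_mode) == 0o700
```

For example, `is_private_dir(os.path.expanduser("~/.soosef/temp/dropbox"))` should return `True` on a correctly configured instance.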
---
## Operational Security
### Source safety
- **No SooSeF branding** on the upload page. Generic "Secure File Upload" title.
- **No authentication required** -- sources never create accounts or reveal identity.
- **No IP logging** -- SooSeF does not log source IP addresses. Ensure your reverse proxy
(if any) also does not log access requests to `/dropbox/upload/` paths.
- **Self-contained page** -- inline CSS and JavaScript only. No external resources, CDN
calls, web fonts, or analytics. Works with Tor Browser.
- **CSRF exempt** -- the upload endpoint does not require CSRF tokens because sources do
not have sessions.
### Token management
- **Short expiry** -- set token expiry as short as practical. 24 hours is the default; for
high-risk sources, consider 1-4 hours.
- **Low file limits** -- set `max_files` to the expected number of submissions.
Once reached, the link stops accepting uploads.
- **Revoke immediately** -- if a token is compromised or no longer needed, revoke it from
the admin panel. This deletes the token and all associated receipt records from SQLite.
- **Audit trail** -- token creation events are logged to `~/.soosef/audit.jsonl` with the
action `dropbox.token_created`.
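
To review the audit trail, the JSONL file can be filtered with a few lines of Python. The sketch assumes each line is a JSON object with an `action` key; the guide names the action value but not the record schema, so the key name is a guess.

```python
import json
from typing import Iterable, Iterator


def token_creation_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield audit entries whose action is dropbox.token_created."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partially written or corrupt lines
        # Assumption: the action is stored under an "action" key.
        if isinstance(entry, dict) and entry.get("action") == "dropbox.token_created":
            yield entry
```

A typical use would be to open `~/.soosef/audit.jsonl` (expanded with `os.path.expanduser`) and pass the file object to `token_creation_events`.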
### Running as a Tor hidden service
For maximum source protection, run SooSeF as a Tor hidden service:
1. Install Tor on the server
2. Configure a hidden service in `torrc` pointing to `127.0.0.1:5000`
3. Share the `.onion` URL instead of a LAN address
4. The source's real IP is never visible to the server
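
A `torrc` fragment for step 2 might look like the following. `HiddenServiceDir` and `HiddenServicePort` are standard Tor directives, but the directory path and the virtual port are assumptions. Pick the virtual port to match whether SooSeF is serving HTTP or HTTPS on port 5000.

```
# Hypothetical example: directory path and virtual port 80 are assumptions.
HiddenServiceDir /var/lib/tor/soosef_dropbox/
HiddenServicePort 80 127.0.0.1:5000
```

After Tor restarts, it writes the `.onion` address to a `hostname` file inside the `HiddenServiceDir` directory.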
> **Warning:** Even with Tor, timing analysis and traffic correlation attacks are possible
> at the network level. The drop box protects source identity at the application layer;
> network-layer protection requires operational discipline beyond what software can provide.