Files
relicario/docs/superpowers/specs/2026-05-01-recovery-qr-design.md
adlee-was-taken 39ae2ecbf3 style: capitalize "Relicario" in prose / UI / CLI help
Brand name uses capital R in user-facing text — extension UI strings,
CLI clap help / descriptions / error prose, markdown docs. Lowercase
preserved for the binary command, crate names, npm package, file
paths, env vars, and code identifiers.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-01 17:29:10 -04:00

242 lines
22 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Recovery QR + passphrase entropy floor — disaster recovery for lost reference image
**Status:** design
**Target release:** v0.4.0 (post-v0.3.0 train)
**Scope:** `relicario-core` (new `recovery_qr` module + extracted `normalize_passphrase`), `relicario-cli` (new `recovery-qr` subcommand group), `relicario-wasm` (bindings), extension (display/print route + vault-tab button + init-wizard zxcvbn gate)
**Out of scope:** passphrase-loss recovery (deliberate non-goal), online or server-mediated recovery, multi-device key sharing, threshold schemes, device onboarding "magic link" (separate effort), in-extension webcam QR scanning (a future feature; v1 unlocks via paste).
## Background
Relicario's two-factor model derives `master_key = Argon2id(len-prefixed(passphrase) || image_secret, salt, params)` (`crates/relicario-core/src/crypto.rs:207`). Lose either factor and the vault is unrecoverable. The reference image is the more loseable factor — it lives outside the user's head, often as a "dead drop" on social media or a personal site, and a single platform takedown or accidental deletion permanently bricks the vault.
The original design spec already sketched a post-V1 recovery path (`docs/superpowers/specs/2026-04-11-relicario-design.md:342-349`): a small encrypted file containing only `image_secret`, locked under the passphrase via a separate Argon2id derivation, stored offline. This spec finalizes that sketch with three refinements landed during brainstorming:
1. The artifact is a **QR code displayed on screen** (primary) or printed (secondary) — never written to disk. The user snaps the displayed QR with a phone or prints a hard copy. "Memory-only" is enforced architecturally: no API path produces a file.
2. **Domain separation** in the recovery KDF input prevents collision with the main `derive_master_key` output namespace under adversarial inputs.
3. A **passphrase entropy floor** is enforced at vault init. Recovery-QR security is exactly `passphrase_strength × Argon2id_cost`; without an entropy floor at init, a user can configure their vault into a state where the recovery QR is brute-forceable on commodity hardware.
## Goals
1. Provide an offline, paper-or-photo fallback that recovers `image_secret` when the reference image is lost but the passphrase is known.
2. Make it impossible — by API shape, not convention — to (a) write the recovery payload to disk, (b) generate it with weak Argon2id parameters, or (c) compute it without NFC-normalizing the passphrase identically to the main KDF.
3. Enforce a passphrase entropy floor at vault init so the recovery-QR security guarantee is not silently undermined.
4. Surface the feature in CLI, extension vault tab, and the new-vault wizard with parity (see `feedback_cli_extension_parity` in user memory).
## Non-goals
- Recovering from a forgotten passphrase. Forgotten passphrases remain unrecoverable; this is the deliberate stance for a self-hosted password manager with no recovery server.
- Re-introducing TOTP, online recovery, or any third factor. The brainstorm explicitly settled on 1-of-2 with a paper substitute for the second factor.
- Retroactively forcing existing vaults whose passphrases are below the new entropy floor to rotate. Existing vaults are grandfathered with a non-blocking warning.
- Vault format change. The recovery QR is a derived artifact; the vault on disk is unchanged.
## Threat model
| Attacker capability | What this protects | What it does not protect |
| --------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| Photographs the displayed QR or steals the printed paper | Recovery payload alone is useless: it's `image_secret` encrypted under Argon2id-of-passphrase. Attacker must additionally brute-force the passphrase, gated by Argon2id cost (m=64 MiB, t=3, p=4). With a passphrase at the enforced entropy floor (zxcvbn ≥ 3, ≈ 10¹⁰ guesses), brute-force is infeasible on commodity hardware. | A weak passphrase (zxcvbn < 3) below the floor — but the floor is enforced at init, so this only applies to grandfathered vaults that pre-date this feature. |
| Captures recovery payload + already knows passphrase | Nothing — equivalent to the existing "compromised reference image + passphrase" failure mode that the vault has always accepted as the universal worst case. | Same. |
| Reads files written to disk by relicario | Recovery payload is never written to disk by any code path. No file artifact exists to read. | OS print spooler may briefly cache a print job (Windows: `C:\Windows\System32\spool\PRINTERS\`). Print is the secondary path; users with concerns use the display path. |
| MitM on git transport | Recovery payload never traverses git or any network — it lives only in user-rendered output. | N/A |
| Crafts adversarial inputs to confuse vault KDF and recovery KDF outputs | Domain separation tag `b"relicario-recovery-v1\0"` prefixes the recovery KDF input, ensuring no input can produce identical Argon2id outputs across the two namespaces. | N/A |
## Cryptographic design
### Recovery KDF input
```text
recovery_kdf_input =
b"relicario-recovery-v1\0" // 22-byte domain separator
|| u64_be(len(nfc(passphrase))) // 8 bytes
|| nfc(passphrase) // variable
```
Fed to Argon2id with `RecoveryKdfParams::production()` and a fresh 32-byte salt generated at recovery-QR creation time (separate from the vault salt). Output is a 32-byte `wrap_key`.
Argon2id is a PRF, so distinct inputs yield uncorrelated outputs with negligible collision probability. The domain separator's role is to make inputs structurally distinguishable: the vault KDF input begins with `u64_be(passphrase_len)`, whose first 6+ bytes are zero for any realistic passphrase length (< 2⁴⁸ bytes), while the recovery KDF input begins with the literal ASCII `relicario-recovery-v1\0` — non-zero from byte 0. This is robust against any adversarially crafted passphrase value because the structural prefix difference is independent of passphrase content.
### Wrap
```text
nonce = OsRng(24)
ciphertext = XChaCha20-Poly1305(wrap_key, nonce, image_secret) // 32 + 16 = 48 bytes
```
Same AEAD primitive as the vault. Reuses `crypto::encrypt`/`crypto::decrypt` after the wrap key is derived.
### QR payload (binary)
```text
[magic "RREC" 4 bytes ] // matches the "RBAK" pattern from backup.rs:29
[version 0x01 1 byte ]
[salt 32 bytes ]
[nonce 24 bytes ]
[ciphertext 48 bytes ] // 32 plaintext + 16 Poly1305 tag
// ───────────
// 109 bytes total
```
Salt is included so recovery is self-sufficient — the user does not need to bring along the original `.relicario/salt`. The salt is not secret; storing it in the QR is not a confidentiality concern, and excluding it would tie recovery to a specific repo clone, which is the wrong invariant.
QR encoding: byte mode, error-correction level **M** (15% recovery — comfortable for paper-and-camera workflows). Payload + ECC fits in QR version 6 (41×41 modules, ≈ 30 mm at typical 300 DPI). Plenty of room.
### `RecoveryKdfParams` — type-level params floor
New type in `crates/relicario-core/src/recovery_qr.rs`:
```rust
pub struct RecoveryKdfParams {
argon2_m: u32, // private
argon2_t: u32, // private
argon2_p: u32, // private
}
impl RecoveryKdfParams {
pub const fn production() -> Self { /* m=65536, t=3, p=4 */ }
// No `new`, no `with_*`, no public field, no `Deserialize`.
// Test code that needs fast params must use a `#[cfg(test)]`-gated constructor.
}
```
This is the type-system enforcement of the "hard floor on KDF params" requirement. There is no runtime path — adversarial JSON, accidental `params.json` reuse, or developer error — that produces a `RecoveryKdfParams` with weak parameters. Test-only fast params (for unit and integration tests) are exposed via a feature-gated or `cfg(test)`-gated constructor; the exact mechanism (test feature flag vs. crate-internal helper accessed via a dedicated test-only re-export) is an implementation-time decision deferred to the plan, but the constraint is firm: no public path to weak params in release builds.
### Shared `normalize_passphrase` helper
Currently `derive_master_key` does NFC normalization inline (`crypto.rs:224-227`). Extract this into `pub(crate) fn normalize_passphrase(p: &[u8]) -> Vec<u8>` in `crypto.rs` and have both `derive_master_key` and the recovery KDF call it. Add a regression test that asserts the two paths use the same helper (a doctest or a test that compares both code paths' inputs to Argon2id is sufficient — the goal is to make drift fail loudly).
## Memory hygiene
All intermediate buffers are `Zeroizing<…>` end-to-end:
- `wrap_key``Zeroizing<[u8; 32]>` (already the convention; reuse `derive_master_key`'s pattern).
- The 32-byte `image_secret` going into the wrap — already wrapped in `Zeroizing` upstream by `imgsecret::extract`; the recovery path must not copy it into a non-Zeroizing buffer.
- The encrypted payload buffer (109 bytes, no plaintext) does not need Zeroizing — it's the artifact we display.
The wasm binding returns the encoded payload as `Vec<u8>` (the QR-encodable bytes) for the extension to render. The 32-byte `image_secret` never crosses the wasm boundary; only the encrypted blob does.
## Display + print pipeline (no on-disk path)
There is no API in any crate that writes a recovery payload to a file. Reviewer-visible invariant.
- **`relicario-core`** exposes `recovery_qr::generate(passphrase, image_secret) -> Vec<u8>` (returns the 109-byte payload). It does **not** expose `generate_to_file` or accept a `Path`.
- **`relicario-wasm`** exposes `generate_recovery_payload(passphrase, image_secret) -> Vec<u8>`. Same constraint.
- **`relicario-cli`** subcommand `recovery-qr generate` renders to TTY using a Unicode block-drawing QR (e.g. via `qrcode` crate's `render::unicode::Dense1x2`). Offers no `--out` flag. A `--print` flag pipes a PostScript QR to `lp` (Linux/macOS); on Windows the CLI's print path is best-effort and the in-app help recommends the extension's print flow instead, since the extension's `window.print()` integrates with the OS print dialog more cleanly than a one-off CLI shell-out.
- **Extension** routes to a dedicated `recovery-qr.html` page that renders the QR onto a `<canvas>`. Two buttons: **Display** (the page IS the display) and **Print** (calls `window.print()` on the same page with a `@media print` stylesheet that scales the canvas appropriately). No `<img>` or Blob URL — those create right-click-save attack vectors. The canvas itself is non-rightclick-save in practice but `oncontextmenu` is also blocked on this route as defense in depth.
The Windows print-spooler caveat (`C:\Windows\System32\spool\PRINTERS\` cache) is documented in the in-app copy on the Print button: "Display is recommended on Windows. The system print queue may briefly cache the QR before printing."
## Passphrase entropy floor
zxcvbn integration already exists in `crates/relicario-core/src/generators.rs` (`rate_passphrase` returning `score` and `guesses_log10`). This work wires it into the gate at vault init.
**Threshold:** zxcvbn `score >= 3` (= "safely unguessable: moderate protection from offline slow-hash scenarios", ≈ 10¹⁰ guesses). Score 4 is "very unguessable" and is the upper rung; we do not require it because user research consistently shows 4-word diceware (~51 bits, score 3) is the realistic ceiling for real-world adoption.
**Where enforced:**
| Surface | Enforcement |
| ------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `relicario init` (CLI) | Hard gate — refuses to create the vault, returns exit code 2 with `RelicarioError::WeakPassphrase { score, required: 3 }`. Suggests using `relicario generate-passphrase` (which already produces score-4 BIP39 outputs). |
| Extension setup wizard, "create new vault" branch | Hard gate at the passphrase step. The wizard already shows zxcvbn feedback; this change makes the Next button refuse to advance below score 3. Mirrors the existing attach-flow's structure (see `2026-04-27-attach-existing-vault-design.md` Step 3a). |
| Existing vaults at unlock (CLI + extension) | Soft warning: "Your passphrase scores below the current entropy floor. Consider rotating it to enable a secure recovery QR." Non-blocking. Surfaces once per session. |
| `recovery-qr generate` | Pre-flight check: if the unlock passphrase scores below 3, print a stronger warning and require a `--force-weak-passphrase` flag to proceed. The warning explains: "A recovery QR generated with a weak passphrase is feasibly brute-forceable from a photograph or printout." |
The weak-passphrase warning copy is the same in CLI and extension to keep the threat narrative consistent.
## Surfaces
### CLI
```bash
relicario recovery-qr generate # interactive: prompts passphrase, displays QR in TTY
relicario recovery-qr generate --print # secondary: pipes to system printer
relicario recovery-qr unlock --payload <hex> # one-shot recover image_secret from a scanned QR's hex
# (caller decoded the QR; we accept the payload bytes)
relicario unlock --recovery-qr-payload <hex> # alternative: full unlock using recovery payload + passphrase,
# bypassing the reference-image prompt for this invocation only
```
The `unlock --recovery-qr-payload` form is the actual disaster-recovery flow: the user is on a fresh device with no reference image, has just scanned their printed QR with a phone, and pastes the hex payload to unlock. After successful unlock, the CLI prints a recovery-completion notice and a pointer to the re-establishment flow:
> Recovered image_secret. Your reference image is currently lost — re-embed the recovered secret into a new carrier JPEG before relying on it. Run: `relicario imgsecret embed --carrier <new.jpg> --out <reference.jpg>` (uses the secret recovered in this session).
This requires a **new CLI subcommand `relicario imgsecret embed`** that wraps the existing `imgsecret::embed` function (already in `relicario-core/src/imgsecret.rs` and exposed via wasm at `relicario-wasm/src/lib.rs:273`). The command takes a fresh carrier JPEG and writes a reference image carrying the in-session-recovered secret. Bringing this to the CLI is in-scope for this spec because the disaster-recovery flow is incomplete without a path to re-establish the primary factor; the extension's existing image-creation flow already covers the equivalent there.
### Extension
Vault tab grows a **Disaster recovery** section with one button: **Generate recovery QR**. Clicking opens `recovery-qr.html` in a popup window (not a modal — popup gives `window.print()` cleaner ownership of the print dialog). Page contents:
```
┌──────────────────────────────────────────┐
│ Recovery QR │
│ │
│ [ canvas-rendered QR ] │
│ │
│ Snap with your phone, or click Print. │
│ This QR alone cannot unlock your vault. │
│ Combined with your passphrase, it can. │
│ │
│ [ Print ] [ Done ] │
│ │
│ ⚠ Windows users: prefer Display over │
│ Print. The system print queue may │
│ briefly cache the QR. │
└──────────────────────────────────────────┘
```
`Done` clears the canvas and closes the window. The wasm-returned 109-byte payload is held only in the popup's `window` scope; both `Done` and the `beforeunload` event handler zero it via `payload.fill(0)` before the window's JS context is torn down. (The 109-byte blob is encrypted, so its sensitivity is bounded by the passphrase strength regardless — but zeroing is cheap and removes one layer of "what if a browser extension snoops popup memory" worry.)
The init wizard's Step 3a (passphrase entry for new vaults) gains the score-3 hard gate — an inline change to `extension/src/setup/setup.ts` near where `rate_passphrase` is already called for the strength meter.
The unlock dialog gains a **Use recovery QR** link below the reference-image picker. Clicking opens a paste field for the hex payload; submitting recovers the image_secret in-process and continues the normal unlock flow with that recovered secret. After successful unlock, a banner suggests re-establishing the reference image.
### wasm bindings (additions to `relicario-wasm/src/lib.rs`)
```rust
#[wasm_bindgen]
pub fn generate_recovery_payload(passphrase: &str, image_secret: &[u8]) -> Result<Vec<u8>, JsError>;
#[wasm_bindgen]
pub fn unwrap_recovery_payload(passphrase: &str, payload: &[u8]) -> Result<Vec<u8>, JsError>;
// returns the 32-byte image_secret on success
```
## Migration & backwards compatibility
Additive only. No vault format change, no `params.json` change, no `manifest.enc` change. Existing vaults gain access to the feature on upgrade.
The passphrase entropy floor only gates **new** vault creation. Existing vaults (which may have weaker passphrases) continue to unlock normally; they receive a soft warning at unlock-time as described above. There is no forced rotation.
## Testing strategy
`crates/relicario-core/src/recovery_qr.rs`:
1. **Round-trip:** `image_secret = bytes; payload = generate(passphrase, image_secret); recovered = unwrap(passphrase, payload); assert_eq!(image_secret, recovered)`.
2. **Wrong passphrase rejected:** `unwrap("wrong", payload)` returns `RelicarioError::Decrypt`, no information leaked about which bit was wrong.
3. **Tampered payload rejected:** flip a byte anywhere in the 109 bytes — payload rejects.
4. **Domain separation:** assert the recovery KDF output for a given `(passphrase, salt)` differs from `derive_master_key`'s output for that same passphrase paired with the all-zero image_secret and the same salt. This regression guards against accidental input-shape collisions.
5. **NFC parity:** passphrase encoded as NFC vs NFD recovers identically — and explicitly call `normalize_passphrase` from both paths in the test setup to assert the helper is the single source of truth.
6. **Weak-params unconstructable:** type-level — there is no public path to construct `RecoveryKdfParams` with `argon2_m < 65536`. Asserted by a compile-fail test (trybuild) or by the absence of a public constructor (sufficient on its own; trybuild is gravy).
`crates/relicario-cli/tests/recovery_qr.rs`:
7. **No `--out` or file-write flag exists:** assert the clap surface for `recovery-qr generate` has no flags accepting a path. Negative test on the help output.
8. **End-to-end:** init a vault, generate a recovery QR (hex form for test purposes), purge the reference image, run `unlock --recovery-qr-payload <hex>` with the passphrase, assert the vault opens.
`crates/relicario-cli/tests/entropy_floor.rs`:
9. **Init rejects weak passphrase:** `relicario init` with passphrase `"correcthorse"` exits with code 2 and `WeakPassphrase` error.
10. **Init accepts strong passphrase:** `relicario init` with a fresh BIP39 4-word passphrase succeeds.
11. **Existing weak vault unlocks with warning:** simulate an existing vault with a weak passphrase; unlock succeeds and emits the soft warning to stderr.
Extension tests (Playwright or equivalent, following existing extension test patterns):
12. **Wizard rejects weak passphrase:** Next button disabled until score ≥ 3.
13. **Recovery QR popup never writes a file:** assert no `<a download>` or Blob URL appears in the popup DOM.
14. **`Done` clears canvas:** after Done, `getImageData` on the canvas returns all-zero bytes.
## Open questions
None remaining at design time. Defer to implementation:
- The exact CLI flag spelling (`--recovery-qr-payload` vs `--recover` vs `--recovery <hex>`). To be settled when the unlock-flow plan is written.
- Whether the extension popup's recovery flow accepts photographed-QR upload (image → QR-decode → payload) or only manual hex paste. The spec ships hex-paste only; image upload + decode is a follow-up that needs its own threat-model pass (uploading an image to the extension reintroduces a file-write vector that this design carefully avoided).