docs: add WASM + Chrome MV3 extension design spec

Plan 2 design covering idfoto-wasm crate, Chrome extension with
terminal-aesthetic popup, conservative autofill, Gitea/GitHub API
integration, and TOTP code generation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
adlee-was-taken
2026-04-12 00:05:31 -04:00
parent c50e0d448b
commit 596daf320a

@@ -0,0 +1,502 @@
# idfoto — WASM + Chrome MV3 Extension Design
The browser extension for idfoto. Compiles `idfoto-core` to WASM, wraps it in a Chrome MV3 extension with a terminal-aesthetic popup, conservative autofill, and direct Gitea/GitHub API access. No CLI dependency, no native messaging bridge.
## Scope
- `idfoto-wasm` crate — wasm-bindgen wrapper around `idfoto-core`
- Chrome MV3 extension:
  - One-time setup wizard (git host + token + repo + reference image)
  - Service worker: WASM runtime, master_key holder, vault operations, git API
  - Popup: unlock, search/list, group filtering, entry detail, TOTP countdown, keyboard-first
  - Content script: conservative login form detection, explicit-trigger autofill
- Data model addition: `group` field on entries for logical organization
## Data Model Changes
### Entry struct
```rust
pub struct Entry {
    pub name: String,
    pub url: Option<String>,
    pub username: Option<String>,
    pub password: Option<String>,
    pub notes: Option<String>,
    pub totp_secret: Option<String>,
    pub group: Option<String>, // NEW: None = ungrouped
    pub created_at: String,
    pub updated_at: String,
}
```
### ManifestEntry struct
```rust
pub struct ManifestEntry {
    pub name: String,
    pub url: Option<String>,
    pub username: Option<String>,
    pub group: Option<String>, // NEW: for popup filtering without decrypting entries
    pub updated_at: String,
}
```
The `group` field is a free-form string. No predefined list, no nesting. User types "work" or "family" and entries cluster. Backwards-compatible — existing vaults without `group` deserialize as `None` (ungrouped).
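On the JavaScript side, the same backwards-compatibility rule could look roughly like the sketch below. It assumes manifest entries reach the popup as plain JSON objects; `ManifestEntryJson`, `normalizeGroup`, and `groupTabs` are hypothetical names, and the `"all"` tab label is borrowed from the Entry List description.

```typescript
// Shape mirrors ManifestEntry from the spec; `group` may be absent in older vaults.
interface ManifestEntryJson {
  name: string;
  url?: string | null;
  username?: string | null;
  group?: string | null;
  updated_at: string;
}

// Hypothetical helper: a missing or null group means "ungrouped".
function normalizeGroup(entry: ManifestEntryJson): string | null {
  return entry.group ?? null;
}

// Hypothetical helper: "all" first, then the distinct groups in sorted order.
function groupTabs(entries: ManifestEntryJson[]): string[] {
  const groups = new Set<string>();
  for (const e of entries) {
    const g = normalizeGroup(e);
    if (g !== null) groups.add(g);
  }
  return ["all", ...Array.from(groups).sort()];
}
```

Because the tabs are derived from the entries themselves, no separate group list needs to be stored or migrated.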
## WASM Crate (`idfoto-wasm`)
Thin wasm-bindgen wrapper exposing `idfoto-core` functions to JavaScript. Lives at `crates/idfoto-wasm/`.
### Public API
```rust
// KDF + crypto
#[wasm_bindgen]
pub fn derive_master_key(passphrase: &str, image_secret: &[u8], salt: &[u8], params_json: &str) -> Result<Vec<u8>, JsValue>
#[wasm_bindgen]
pub fn encrypt(plaintext: &[u8], key: &[u8]) -> Result<Vec<u8>, JsValue>
#[wasm_bindgen]
pub fn decrypt(ciphertext: &[u8], key: &[u8]) -> Result<Vec<u8>, JsValue>
// Image secret extraction
#[wasm_bindgen]
pub fn extract_image_secret(jpeg_bytes: &[u8]) -> Result<Vec<u8>, JsValue>
// Vault operations (convenience wrappers — JSON in, encrypted bytes out)
#[wasm_bindgen]
pub fn encrypt_entry(entry_json: &str, key: &[u8]) -> Result<Vec<u8>, JsValue>
#[wasm_bindgen]
pub fn decrypt_entry(ciphertext: &[u8], key: &[u8]) -> Result<String, JsValue>
#[wasm_bindgen]
pub fn encrypt_manifest(manifest_json: &str, key: &[u8]) -> Result<Vec<u8>, JsValue>
#[wasm_bindgen]
pub fn decrypt_manifest(ciphertext: &[u8], key: &[u8]) -> Result<String, JsValue>
// TOTP — RFC 6238, HMAC-SHA1, 6-digit codes, 30-second step
#[wasm_bindgen]
pub fn generate_totp(secret_base32: &str, timestamp_secs: u64) -> Result<String, JsValue>
// Utilities
#[wasm_bindgen]
pub fn generate_password(length: u32) -> String
#[wasm_bindgen]
pub fn generate_entry_id() -> String
```
### Dependencies
```toml
[dependencies]
idfoto-core = { path = "../idfoto-core" }
wasm-bindgen = "0.2"
js-sys = "0.3"
serde_json = "1"
hmac = "0.12"
sha1 = "0.10" # HMAC-SHA1 is the RFC 6238 default algorithm
data-encoding = "2" # base32 decoding for TOTP secrets
```
### WASM build
```bash
wasm-pack build crates/idfoto-wasm --target web --out-dir ../../extension/wasm
```
Output: `idfoto_wasm.js` (JS glue) + `idfoto_wasm_bg.wasm` (binary). Expected size ~200-500 KB gzipped. The `image` crate's JPEG decoder is the heaviest component — optimize only if measured size is a problem.
### TOTP implementation
Standard RFC 6238:
1. Base32-decode the secret
2. Compute time step: `counter = timestamp_secs / 30`
3. HMAC-SHA1(secret, counter as big-endian u64)
4. Dynamic truncation → 6-digit code
5. Zero-pad to 6 digits
Implemented in the WASM crate, not in JavaScript. No JS crypto dependency.
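The arithmetic in steps 2, 4, and 5 can be illustrated on its own. The sketch below is TypeScript for readability only, since the spec places the real implementation in the Rust crate. It omits base32 decoding and the HMAC itself, and every function name is hypothetical.

```typescript
const STEP_SECONDS = 30;

// Step 2: number of whole 30-second steps since the Unix epoch.
function totpCounter(timestampSecs: number): number {
  return Math.floor(timestampSecs / STEP_SECONDS);
}

// Input to step 3: the counter encoded as an 8-byte big-endian value.
function counterToBytes(counter: number): Uint8Array {
  const bytes = new Uint8Array(8);
  const view = new DataView(bytes.buffer);
  view.setUint32(0, Math.floor(counter / 0x100000000)); // high 32 bits
  view.setUint32(4, counter % 0x100000000); // low 32 bits
  return bytes;
}

// Steps 4-5: dynamic truncation of a 20-byte HMAC-SHA1 digest, zero-padded to 6 digits.
function truncateToCode(digest: Uint8Array): string {
  const offset = digest[digest.length - 1] & 0x0f;
  const binary =
    ((digest[offset] & 0x7f) << 24) |
    (digest[offset + 1] << 16) |
    (digest[offset + 2] << 8) |
    digest[offset + 3];
  return ("000000" + (binary % 1000000)).slice(-6);
}

// Seconds until the current code expires, as a countdown display would need.
function remainingSeconds(timestampSecs: number): number {
  return STEP_SECONDS - (timestampSecs % STEP_SECONDS);
}
```

For the RFC 4226 Appendix D digest `cc93cf18508d94934c64b65d8ba7667fb7cde4b0`, `truncateToCode` yields `755224`, which matches the published HOTP value for counter 0.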
## Extension Architecture
### Approach: Monolith Service Worker
All logic lives in the service worker. Popup and content script are thin UI/DOM layers that communicate via `chrome.runtime.sendMessage`.
The master_key exists only in the service worker's memory. Chrome MV3 may terminate idle service workers after ~30 seconds — this clears the key and requires re-unlock. This is a feature: natural session timeout with zero additional code.
Mitigations for premature termination:
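- Chrome resets the worker's idle timer each time it handles an extension event, so messages from an open popup keep it alive (in recent Chrome versions an open port with no traffic does not)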
- Content scripts can send periodic keepalive pings on active tabs
- Re-unlock is fast enough (~1-2s for Argon2id in WASM) that it's not painful
### Service Worker State
```typescript
interface WorkerState {
  masterKey: Uint8Array | null; // held in memory after unlock, cleared on termination
  manifest: Manifest | null;    // cached after first decrypt, refreshed on sync
  config: VaultConfig | null;   // from chrome.storage.local
}

interface VaultConfig {
  hostType: "gitea" | "github";
  hostUrl: string;        // e.g. "https://git.adlee.work"
  repoPath: string;       // e.g. "alee/idfoto-vault"
  apiToken: string;       // personal access token
  imageBytes: Uint8Array; // reference JPEG, stored in chrome.storage.local
}
```
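One practical wrinkle: `chrome.storage.local` holds JSON-serializable values, so a `Uint8Array` such as `imageBytes` will most likely need an encoding step before it is stored. The sketch below shows one option, base64. The helper names are hypothetical and the spec does not prescribe an encoding.

```typescript
// Hypothetical helpers for persisting binary data in a JSON-only store.
function bytesToBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

function base64ToBytes(b64: string): Uint8Array {
  const binary = atob(b64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}
```

The same pair of helpers would also cover the base64 file content that both git host APIs return.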
### Message API
Popup and content script communicate with the service worker via typed messages:
```typescript
// Auth
{ type: "unlock", passphrase: string }                  → { ok: true } | { error: string }
{ type: "lock" }                                        → { ok: true }
{ type: "is_unlocked" }                                 → { unlocked: boolean }
// Vault reads
{ type: "list_entries", group?: string }                → ManifestEntry[]
{ type: "get_entry", id: string }                       → Entry
{ type: "search_entries", query: string }               → ManifestEntry[]
// Vault writes
{ type: "add_entry", entry: EntryInput }                → { id: string }
{ type: "update_entry", id: string, entry: EntryInput } → { ok: true }
{ type: "delete_entry", id: string }                    → { ok: true }
// TOTP
{ type: "get_totp", id: string }                        → { code: string, remaining_seconds: number }
// Autofill
{ type: "get_autofill_candidates", url: string }        → ManifestEntry[]
{ type: "get_credentials", id: string }                 → { username: string, password: string }
// Sync
{ type: "sync" }                                        → { ok: true } | { error: string }
```
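One way to make these messages "typed" in practice is a discriminated union keyed on `type`. The sketch below covers only the three auth messages. `AuthRequest`, `AuthResponse`, `AuthState`, and `route` are hypothetical names, and `deriveKey` is a stand-in for the WASM-backed unlock path rather than a real export.

```typescript
// A subset of the message table as a discriminated union (field names follow the spec).
type AuthRequest =
  | { type: "unlock"; passphrase: string }
  | { type: "lock" }
  | { type: "is_unlocked" };

type AuthResponse = { ok: true } | { error: string } | { unlocked: boolean };

interface AuthState {
  masterKey: Uint8Array | null;
}

// Hypothetical router; `deriveKey` returns null when unlocking fails.
function route(
  state: AuthState,
  msg: AuthRequest,
  deriveKey: (passphrase: string) => Uint8Array | null
): AuthResponse {
  switch (msg.type) {
    case "unlock": {
      const key = deriveKey(msg.passphrase);
      if (key === null) return { error: "unlock failed" };
      state.masterKey = key;
      return { ok: true };
    }
    case "lock":
      if (state.masterKey !== null) state.masterKey.fill(0); // best-effort wipe
      state.masterKey = null;
      return { ok: true };
    case "is_unlocked":
      return { unlocked: state.masterKey !== null };
  }
}
```

With the union in place, the compiler rejects a handler that forgets a message type or reads a field that the message does not carry.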
### Unlock Flow
1. User enters passphrase in popup
2. Popup sends `{ type: "unlock", passphrase }` to service worker
3. Service worker loads vault config from `chrome.storage.local` (includes image bytes)
4. WASM: `extract_image_secret(image_bytes)` → `image_secret`
5. Service worker fetches `.idfoto/salt` and `.idfoto/params.json` via git API
6. WASM: `derive_master_key(passphrase, image_secret, salt, params)` → `master_key`
7. Service worker fetches `manifest.enc` via git API
8. WASM: `decrypt_manifest(manifest_enc, master_key)` → manifest
9. Cache `master_key` and `manifest` in worker memory
10. Reply `{ ok: true }` to popup
Steps 4-6 take ~1-2 seconds (Argon2id dominates). Popup shows a spinner.
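Steps 4 through 9 can be sketched as a single async function. To keep the example runnable outside Chrome, the WASM exports and the git read are passed in as a hypothetical `UnlockDeps` bundle. The file paths come from the steps above and everything else is illustrative.

```typescript
// Hypothetical dependency bundle so the flow can be exercised without Chrome or WASM.
interface UnlockDeps {
  imageBytes: Uint8Array; // reference image from stored config (step 3)
  readFile(path: string): Promise<Uint8Array>; // git API read
  extractImageSecret(jpeg: Uint8Array): Uint8Array;
  deriveMasterKey(pass: string, secret: Uint8Array, salt: Uint8Array, paramsJson: string): Uint8Array;
  decryptManifest(ciphertext: Uint8Array, key: Uint8Array): string;
}

interface Unlocked {
  masterKey: Uint8Array;
  manifestJson: string;
}

async function unlock(passphrase: string, deps: UnlockDeps): Promise<Unlocked> {
  const imageSecret = deps.extractImageSecret(deps.imageBytes); // step 4
  const salt = await deps.readFile(".idfoto/salt"); // step 5
  const paramsBytes = await deps.readFile(".idfoto/params.json");
  const paramsJson = new TextDecoder().decode(paramsBytes);
  const masterKey = deps.deriveMasterKey(passphrase, imageSecret, salt, paramsJson); // step 6
  const manifestEnc = await deps.readFile("manifest.enc"); // step 7
  const manifestJson = deps.decryptManifest(manifestEnc, masterKey); // step 8
  return { masterKey, manifestJson }; // step 9: the caller caches both
}
```

Injecting the dependencies also gives a seam for unit-testing the ordering of the flow without a live git host.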
## Git API Layer
Abstracts Gitea and GitHub behind a common interface. Both use nearly identical REST APIs for file CRUD.
```typescript
interface GitHost {
  readFile(path: string): Promise<Uint8Array>;
  writeFile(path: string, content: Uint8Array, message: string): Promise<void>;
  deleteFile(path: string, message: string): Promise<void>;
  listDir(path: string): Promise<string[]>;
}
```
### GiteaHost
- Base: `{hostUrl}/api/v1/repos/{repoPath}/contents/{path}`
- Auth: `Authorization: token {apiToken}`
- File content returned as base64 in JSON response
- Write/delete requires the file's SHA (fetched first, then sent with the update)
### GitHubHost
- Base: `https://api.github.com/repos/{repoPath}/contents/{path}`
- Auth: `Authorization: Bearer {apiToken}`
- Same base64 content model, same SHA requirement for updates
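The two hosts differ mainly in base URL and auth scheme, which a small helper can capture. This is a sketch under the URL and header shapes listed above. `HostConfig`, `ContentsRequest`, and `contentsRequest` are hypothetical names, and the trailing-slash handling is an added convenience rather than something the spec requires.

```typescript
type HostType = "gitea" | "github";

interface HostConfig {
  hostType: HostType;
  hostUrl: string; // used for Gitea; GitHub always goes to api.github.com
  repoPath: string; // "owner/repo"
  apiToken: string;
}

interface ContentsRequest {
  url: string;
  headers: Record<string, string>;
}

// Hypothetical helper: build the contents-API URL and auth header for either host.
function contentsRequest(config: HostConfig, path: string): ContentsRequest {
  // Encode each path segment but keep the slashes between them.
  const encodedPath = path.split("/").map((s) => encodeURIComponent(s)).join("/");
  if (config.hostType === "gitea") {
    const base = config.hostUrl.replace(/\/+$/, ""); // tolerate a trailing slash
    return {
      url: `${base}/api/v1/repos/${config.repoPath}/contents/${encodedPath}`,
      headers: { Authorization: `token ${config.apiToken}` },
    };
  }
  return {
    url: `https://api.github.com/repos/${config.repoPath}/contents/${encodedPath}`,
    headers: { Authorization: `Bearer ${config.apiToken}` },
  };
}
```

Both `GitHost` implementations could then share the fetch, base64, and SHA logic and differ only in this request builder.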
### Sync behavior
- On unlock: fetch salt, params, manifest
- On entry access: fetch individual entry file on demand
- On write (add/edit/rm): two sequential API commits — entry file first, then updated manifest
- Each API call = one commit. A write operation is two commits (entry + manifest), linear history
- No branching, no merging, no conflict resolution in V1
- If the remote has changed since last read (SHA mismatch on write), the API returns 409 — surface the error, user re-syncs
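The write path described in the bullets above might be sketched as follows. The example assumes the host implementation throws an object carrying the HTTP status on failure. `WriteHost`, `commitEntry`, `isConflict`, and the commit messages are hypothetical, and the entry path is supplied by the caller because the spec does not fix a layout here.

```typescript
interface WriteHost {
  writeFile(path: string, content: Uint8Array, message: string): Promise<void>;
}

type WriteResult = { ok: true } | { error: string };

// Assumption: a failed request surfaces as a thrown object with a numeric `status`.
function isConflict(e: unknown): boolean {
  return typeof e === "object" && e !== null && (e as { status?: number }).status === 409;
}

// Two sequential commits: the entry file first, then the updated manifest.
async function commitEntry(
  host: WriteHost,
  entryPath: string,
  entryEnc: Uint8Array,
  manifestEnc: Uint8Array
): Promise<WriteResult> {
  try {
    await host.writeFile(entryPath, entryEnc, `update ${entryPath}`);
    await host.writeFile("manifest.enc", manifestEnc, "update manifest");
    return { ok: true };
  } catch (e) {
    if (isConflict(e)) {
      return { error: "remote changed since last read; sync and retry" };
    }
    throw e; // other failures (network, auth) propagate unchanged
  }
}
```

Note that a conflict on the second commit leaves the entry file written without a matching manifest update. Under the V1 rules the recovery is a re-sync followed by a retry.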
## Popup UI
### Design Language
- **Theme:** Dark background (#0d1117), monospace typography (system monospace stack, JetBrains Mono preferred)
- **Aesthetic:** Terminal/dev tool feel. Minimal chrome, tight spacing, no rounded corners beyond 2px
- **Colors:** Blue (#58a6ff) for interactive elements and branding, green (#3fb950) for TOTP codes, muted gray (#8b949e) for secondary text, dark surfaces (#161b22) for inputs
- **Interactions:** Keyboard-first. Every action has a single-key shortcut. Mouse works but isn't required.
### Popup States
The popup is a state machine with four primary states:
**1. Locked (unlock prompt)**
- Single passphrase input field
- ENTER to submit, ESC to close popup
- Spinner during Argon2id derivation
- Error message on bad passphrase (inline, red text)
**2. Entry List**
- Search bar at top (focused by `/`)
- Group filter tabs below search (all, personal, work, etc. — derived from entries)
- Scrollable entry list with keyboard navigation (↑↓)
- Each entry shows: name, username, domain (extracted from URL)
- Active entry highlighted with left blue border
- Footer: keybinding hints
- `+` to add new entry
- ENTER to open selected entry
**3. Entry Detail**
- Back navigation (ESC)
- Entry name as header, group label
- Fields: URL, username (c to copy), password masked (p to copy), TOTP code with countdown bar (t to copy)
- Notes section (if present)
- Actions: f = autofill active tab, e = edit, d = delete (with confirmation)
- TOTP countdown: green progress bar, updates every second, code regenerates at 0
**4. Setup Wizard**
- Three steps with progress bar:
  1. Git host config: host type toggle (Gitea/GitHub), host URL, repo path, API token
  2. Reference image: file upload (drag-and-drop or file picker), stored to `chrome.storage.local`
  3. Test unlock: enter passphrase, verify derivation succeeds against the remote vault
- Back/next navigation, validation on each step
### Additional Views (modal overlays)
- **Add/Edit Entry:** Form with fields for name, URL, username, password (with generate button), TOTP secret, group, notes. Save commits to git.
- **Delete Confirmation:** "Delete {name}? This commits a removal to the vault." Yes/No.
### Keyboard Shortcuts
| Key | Context | Action |
|-----|---------|--------|
| `/` | List | Focus search |
| `↑↓` | List | Navigate entries |
| `Enter` | List | Open selected entry |
| `Esc` | Detail/Edit | Back to list |
| `Esc` | List | Close popup |
| `+` | List | Add new entry |
| `c` | Detail | Copy username |
| `p` | Detail | Copy password |
| `t` | Detail | Copy TOTP code |
| `f` | Detail | Autofill active tab |
| `e` | Detail | Edit entry |
| `d` | Detail | Delete entry (with confirmation) |
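The table maps naturally onto a per-context lookup. In the sketch below the keys are `KeyboardEvent.key` values and the contexts follow the table. The action names and the `actionFor` helper are illustrative only.

```typescript
type ViewContext = "list" | "detail" | "edit";

// Action names are illustrative; keys and contexts follow the shortcut table.
const SHORTCUTS: Record<ViewContext, Record<string, string>> = {
  list: {
    "/": "focus_search",
    ArrowUp: "select_previous",
    ArrowDown: "select_next",
    Enter: "open_entry",
    Escape: "close_popup",
    "+": "add_entry",
  },
  detail: {
    Escape: "back_to_list",
    c: "copy_username",
    p: "copy_password",
    t: "copy_totp",
    f: "autofill",
    e: "edit_entry",
    d: "delete_entry",
  },
  edit: {
    Escape: "back_to_list",
  },
};

// Returns the action bound to a key in the given context, or null when unbound.
function actionFor(context: ViewContext, key: string): string | null {
  const bindings = SHORTCUTS[context];
  // An own-property check avoids matching inherited names such as "constructor".
  return Object.prototype.hasOwnProperty.call(bindings, key) ? bindings[key] : null;
}
```

A real handler would also need to ignore single-letter shortcuts while a text input has focus, which this sketch leaves out.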
### Popup Dimensions
Width: 360px. Height: auto, max 500px with scroll. Standard Chrome extension popup constraints.
## Content Script
Runs on all HTTP/HTTPS pages. Three responsibilities:
### 1. Login Form Detection
Conservative detection — standard selectors only:
```typescript
// Password field detection
const passwordFields = document.querySelectorAll('input[type="password"]');
// Username field detection (adjacent to password field)
// Priority order:
// 1. input[autocomplete="username"]
// 2. input[autocomplete="email"]
// 3. input[type="email"]
// 4. input[name] matching /user|email|login|account/i
// 5. Nearest preceding text/email input in the same form
```
No shadow DOM traversal. No heuristic scoring. No iframe inspection. If the form uses non-standard markup, the user copies from the popup manually.
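The priority order can be expressed over plain field descriptors, which keeps the rule testable without a DOM. This is a sketch only. `FieldInfo`, `usernameRank`, and `pickUsernameField` are hypothetical, and restricting candidates to `text` and `email` inputs is an assumption that goes beyond the list above.

```typescript
// Plain-data description of an input, so the ranking can run without a DOM.
interface FieldInfo {
  type: string; // e.g. "text", "email", "password"
  name: string; // the name attribute, "" if absent
  autocomplete: string; // the autocomplete attribute, "" if absent
}

const NAME_HINT = /user|email|login|account/i;

// Rank per the priority list: lower is better, null means "not a candidate".
function usernameRank(f: FieldInfo): number | null {
  if (f.type !== "text" && f.type !== "email") return null; // assumption: text-like inputs only
  if (f.autocomplete === "username") return 1;
  if (f.autocomplete === "email") return 2;
  if (f.type === "email") return 3;
  if (NAME_HINT.test(f.name)) return 4;
  return 5; // fallback: a preceding text input in the same form
}

// `preceding` holds the form's inputs located before the password field, in DOM order.
// Returns the index of the chosen field, or -1 if none qualifies.
function pickUsernameField(preceding: FieldInfo[]): number {
  let best = -1;
  let bestRank = Infinity;
  for (let i = 0; i < preceding.length; i++) {
    const rank = usernameRank(preceding[i]);
    if (rank === null) continue;
    // `<=` prefers the later field, the one nearest the password input, among equals.
    if (rank <= bestRank) {
      best = i;
      bestRank = rank;
    }
  }
  return best;
}
```

The content script would build the `FieldInfo` list from the form that contains the detected password field and fill whichever input the index points to.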
### 2. Field Icon Injection
When a password field is detected:
- Small idfoto icon (16x16, inline SVG) appears at the right edge of the password field
- Click triggers: send page URL to service worker → get matching entries
- Single match: fill immediately
- Multiple matches: show inline picker (small dropdown below the icon)
- Icon styled to not conflict with existing field content
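The spec does not say how a page URL is matched to entries. The sketch below assumes the most conservative rule, exact hostname equality, purely for illustration. `Candidate`, `hostnameOf`, and `autofillCandidates` are hypothetical names.

```typescript
interface Candidate {
  name: string;
  url: string | null;
}

// Hostname of a URL, or null if the string does not parse as one.
function hostnameOf(url: string): string | null {
  try {
    return new URL(url).hostname;
  } catch {
    return null;
  }
}

// Assumption: an entry matches only when its hostname equals the page's hostname exactly.
function autofillCandidates<T extends Candidate>(entries: T[], pageUrl: string): T[] {
  const pageHost = hostnameOf(pageUrl);
  if (pageHost === null) return [];
  return entries.filter((e) => e.url !== null && hostnameOf(e.url) === pageHost);
}
```

Exact matching means `login.example.com` does not match an entry saved for `example.com`. That errs on the side of not filling, in keeping with the conservative autofill stance, though a real implementation might relax it.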
### 3. Credential Fill
On fill trigger (from popup `f` key or field icon click):
1. Service worker sends `{ username, password }` to content script
2. Content script sets `.value` on detected fields
3. Dispatches bubbling `input` and `change` events so frameworks with controlled inputs (React/Vue/Angular) observe the new value
4. Focuses the next logical element (submit button or next field)
The content script never receives the master_key, manifest, or any vault data beyond the specific credentials being filled.
## Extension File Structure
```
extension/
├── manifest.json # MV3 manifest
├── package.json # TypeScript, build tooling
├── tsconfig.json
├── webpack.config.js # or vite.config.ts
├── src/
│ ├── service-worker/
│ │ ├── index.ts # WASM init, message router, state management
│ │ ├── vault.ts # vault CRUD operations
│ │ ├── git-host.ts # GitHost interface definition
│ │ ├── gitea.ts # Gitea API implementation
│ │ ├── github.ts # GitHub API implementation
│ │ ├── totp.ts # TOTP code request handling
│ │ └── autofill.ts # content script coordination
│ ├── popup/
│ │ ├── index.html # popup shell
│ │ ├── popup.ts # state machine: locked → list → detail → edit
│ │ ├── components/
│ │ │ ├── unlock.ts # passphrase prompt
│ │ │ ├── entry-list.ts # search + group filter + entry rows
│ │ │ ├── entry-detail.ts # field display + TOTP countdown
│ │ │ ├── entry-form.ts # add/edit form
│ │ │ └── setup-wizard.ts # three-step setup flow
│ │ └── styles.css # terminal dark theme
│ ├── content/
│ │ ├── detector.ts # login form field detection
│ │ ├── fill.ts # credential injection + event dispatch
│ │ └── icon.ts # field icon injection + inline picker
│ └── shared/
│ ├── messages.ts # typed message definitions
│ └── types.ts # Entry, ManifestEntry, VaultConfig, etc.
├── wasm/ # wasm-pack output (idfoto_wasm.js + .wasm)
├── icons/ # extension icons (16, 48, 128px)
└── dist/ # build output → load unpacked into Chrome
```
No framework. Vanilla TypeScript + DOM manipulation. The popup is small enough that a framework adds overhead without value. Bundle stays tiny.
## Build Pipeline
### WASM build
```bash
wasm-pack build crates/idfoto-wasm --target web --out-dir ../../extension/wasm
```
### Extension build
```bash
cd extension && npm run build # TypeScript → bundled JS via webpack/vite → dist/
```
### Combined
```bash
make extension # or: npm run build:all from extension/
```
Chains wasm-pack then webpack. Dev mode: `npm run dev` watches TypeScript and auto-rebuilds. WASM only needs rebuild when Rust source changes.
### Chrome manifest.json
```json
{
  "manifest_version": 3,
  "name": "idfoto",
  "version": "0.1.0",
  "description": "Two-factor encrypted password manager",
  "permissions": ["storage", "activeTab", "clipboardWrite"],
  "host_permissions": ["<all_urls>"],
  "background": {
    "service_worker": "service-worker.js",
    "type": "module"
  },
  "action": {
    "default_popup": "popup.html",
    "default_icon": {
      "16": "icons/icon-16.png",
      "48": "icons/icon-48.png",
      "128": "icons/icon-128.png"
    }
  },
  "content_scripts": [{
    "matches": ["<all_urls>"],
    "js": ["content.js"],
    "run_at": "document_idle"
  }],
  "content_security_policy": {
    "extension_pages": "script-src 'self' 'wasm-unsafe-eval'; object-src 'self'"
  }
}
```
Note: `wasm-unsafe-eval` is required in MV3 to instantiate WASM modules. This is the standard approach — Chrome explicitly added this directive for WASM use cases.
`host_permissions: ["<all_urls>"]` is needed for the service worker to make cross-origin API calls to arbitrary git hosts. The content script's reach comes from its own `matches` pattern under `content_scripts`, not from `host_permissions`.
## Security Considerations
### What's stored in `chrome.storage.local`
| Data | Sensitivity | Rationale |
|------|-------------|-----------|
| Reference image bytes | Low | Public in threat model (can live on social media). Provides image_secret but useless without passphrase. |
| API token | Medium | Grants repo access. Scoped to repo-only permissions. |
| Host URL, repo path | Low | Not secret. |
### What's never persisted
- Passphrase
- master_key (service worker memory only, cleared on termination)
- image_secret (derived in memory during unlock, not cached)
### Content script isolation
The content script runs in the page's DOM context but never receives vault-level data. It only gets the specific `{ username, password }` pair for a fill operation, delivered on demand by the service worker.
### API token security
The token is stored in `chrome.storage.local`, which is sandboxed per-extension and inaccessible to web pages. A compromised extension could leak it, but that's true of any credential stored by any extension. Mitigation: scope the token to minimum required permissions (repo read/write only).
## Testing Strategy
### WASM crate
- Unit tests: each wrapper function round-trips correctly (`wasm-pack test --node`)
- TOTP: SHA-1 test vectors from RFC 6238 Appendix B (the RFC lists 8-digit codes; the expected 6-digit code is the last six digits of each)
- Integration: derive key + encrypt + decrypt cycle matches `idfoto-core` output
### Extension (manual for V1)
- Setup wizard: configure Gitea host, upload reference image, test unlock
- CRUD: add, view, edit, delete entries through popup
- Groups: create entries in different groups, verify filter works
- Autofill: test on standard login forms (GitHub, Google, etc.)
- TOTP: verify generated codes match Google Authenticator for same seed
- Service worker lifecycle: close popup, wait >30s, reopen — verify re-unlock required
- Offline: verify graceful error when git host unreachable
### Future: automated extension testing with Puppeteer/Playwright
Not in V1 scope. The extension is small enough that manual testing covers it.
## Non-Goals
- Firefox/Safari extensions (later plan)
- Offline vault cache (extension always needs git host access)
- Conflict resolution (409 on write = re-sync, no merge)
- Framework (React, Vue, etc.) for popup UI
- Automated E2E testing (manual for V1)
- Multiple vaults per extension (single vault, groups for organization)