More video work, planning, etc. -- Need to mark things EXPERIMENTAL.
294
IdeasScout_PLANS_20260324.md
Normal file
@@ -0,0 +1,294 @@
# Stegasoo Ideas Scout — Implementation Plans (2026-03-24)

Baseline: v4.3.0, Python >=3.11, FORMAT_VERSION 5, no existing users (no backward compat constraints).

---

## Tier 1 — Quick Wins

### 1. Platform-Calibrated DCT Presets

**Description**: `--platform telegram|discord|signal|whatsapp` flag for DCT encode. Bakes in each platform's known recompression parameters. Pre-verifies payload survives before outputting.

**Implementation approach**:
- New file `src/stegasoo/platform_presets.py` — `PlatformPreset` dataclass + `PRESETS` dict mapping platform → tuned `quant_step`, `jpeg_quality`, `embed_positions`, `max_dimension`, `recompress_quality`
- `dct_steganography.py`: `_embed_scipy_dct_safe()` / `_embed_jpegio()` accept optional preset overrides for `QUANT_STEP`, `DEFAULT_EMBED_POSITIONS`, output quality
- New `pre_verify_survival()` function: encode → re-save at platform quality → extract → pass/fail
- Thread `platform` param through `encode.py` → `steganography.py` → DCT functions
- `cli.py`: add `--platform` as `click.Choice` + `--verify/--no-verify` (pre-verification doubles encode time)
- LSB + `--platform` should error early — LSB data is destroyed by any JPEG recompression

**Known platform params** (from research):
| Platform | Quality | Max Dimension | Notes |
|----------|---------|---------------|-------|
| Telegram | ~82 | 2560×2560 | ~81KB embeddable |
| Discord | ~85 | Varies (Nitro) | |
| Signal | ~80 | Aggressive | |
| WhatsApp | ~70 | 1600×1600 | Most lossy |
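
A minimal sketch of how the preset table and the pre-verification step could fit together. `PlatformPreset`, `PRESETS`, and `pre_verify_survival()` are named in the plan, but the field subset, the `as_of` stamp, and the callable-based signature are assumptions made here for illustration. The real embed, recompress, and extract steps would live in `dct_steganography.py`; they are passed in as plain functions so the sketch stays self-contained. Only the quality and dimension values the table actually lists are filled in.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class PlatformPreset:
    """Recompression parameters for one platform, taken from the table above."""
    name: str
    recompress_quality: int        # approximate JPEG quality the platform re-saves at
    max_dimension: Optional[int]   # longest side in pixels; None where the table gives no figure
    as_of: str = "2026-03-24"      # date stamp (hypothetical field; here, the date of this plan)


# quant_step, jpeg_quality and embed_positions still need tuning per platform,
# so they are deliberately left out rather than guessed.
PRESETS = {
    "telegram": PlatformPreset("telegram", 82, 2560),
    "discord": PlatformPreset("discord", 85, None),
    "signal": PlatformPreset("signal", 80, None),
    "whatsapp": PlatformPreset("whatsapp", 70, 1600),
}


def pre_verify_survival(
    carrier: bytes,
    payload: bytes,
    preset: PlatformPreset,
    embed: Callable[[bytes, bytes], bytes],
    recompress: Callable[[bytes, int], bytes],
    extract: Callable[[bytes], bytes],
) -> bool:
    """encode -> re-save at platform quality -> extract -> pass/fail."""
    stego = embed(carrier, payload)
    try:
        recovered = extract(recompress(stego, preset.recompress_quality))
    except Exception:
        # Any failure while extracting from the recompressed image counts as "did not survive".
        return False
    return recovered == payload
```

Injecting the three steps as callables keeps the pass/fail logic testable without a JPEG codec, and it makes the doubled encode cost visible: `embed` runs once and `extract` runs once more per verification.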

**Go/No-Go metrics**:
- Over 95% payload survival rate per platform at 1KB message size in automated tests
- Pre-verification correctly predicts real platform behavior (manual validation per platform at least once)

**Complexity**: **M** — new file + parameter threading through 4-5 functions

**Risks**: Platform params change without notice. Add version/date stamps to presets and a `stegasoo tools verify-platform` test command.

---

### 2. Steganalysis Self-Check (`stegasoo check`)

**Description**: New CLI command running chi-square and RS (Regular-Singular) statistical analysis on stego images. Outputs detectability risk level (low/medium/high).

**Implementation approach**:
- New file `src/stegasoo/steganalysis.py`:
  - `chi_square_analysis(image_data) -> float` — chi-square statistic on LSB distribution per channel
  - `rs_analysis(image_data) -> float` — Regular-Singular groups analysis (requires numpy)
  - `assess_risk(chi_p, rs_estimate) -> str` — maps to "low"/"medium"/"high"
  - `check_image(image_data) -> dict` — orchestrator
- `cli.py`: new `@cli.command("check")` with `IMAGE` arg, `--json`, `--mode lsb|dct|auto`
- `constants.py`: threshold constants for chi-square p-value and RS boundaries
- `__init__.py`: export `check_image` in `__all__`
- Start LSB-only; DCT steganalysis (calibration attack) deferred

**Go/No-Go metrics**:
- Clean images → consistently "low risk"
- Naive sequential LSB → "high risk"
- Stegasoo LSB at <50% capacity → "low" or "medium"
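
To make the chi-square idea concrete, here is a rough sketch of the pairs-of-values statistic on a flat sequence of 8-bit channel samples. It is a stand-in for the planned `chi_square_analysis()`, not its implementation: the name `chi_square_pov`, the pure-Python histogram, and the toy data are assumptions, and converting the statistic into the p-value that `assess_risk()` expects would need a chi-square CDF, which is omitted here.

```python
import random
from collections import Counter
from typing import Iterable


def chi_square_pov(samples: Iterable[int]) -> float:
    """Chi-square statistic over value pairs (2k, 2k+1) for 8-bit samples.

    LSB replacement with random bits pulls the counts of 2k and 2k+1 toward
    each other, so a small statistic is the suspicious case.
    """
    hist = Counter(samples)
    stat = 0.0
    for k in range(128):
        even, odd = hist.get(2 * k, 0), hist.get(2 * k + 1, 0)
        expected = (even + odd) / 2
        if expected > 0:
            stat += (even - expected) ** 2 / expected
    return stat


# Toy data, not a real image: a cover whose LSBs are all 0, then the same
# samples after naive full-capacity LSB replacement with random bits.
rng = random.Random(0)
cover = [2 * rng.randrange(128) for _ in range(4096)]
stego = [(v & ~1) | rng.getrandbits(1) for v in cover]
cover_stat = chi_square_pov(cover)
stego_stat = chi_square_pov(stego)
```

On this toy input the cover scores far higher than the stego version, which is the separation the go/no-go metrics rely on. Real photographs are much less extreme than an all-even cover, so the thresholds in `constants.py` would have to be calibrated on actual images.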

**Complexity**: **M** — ~150 lines numpy per test, straightforward CLI integration

---

### 3. Python 3.13 DCT Cleanup

**Description**: The `jpegio` → `jpeglib` migration is already done in code. Remaining work: rename stale `jpegio` references and verify on 3.13.

**Implementation approach**:
- `dct_steganography.py`: rename `HAS_JPEGIO` → `HAS_JPEGLIB`, `_jpegio_*` functions → `_jpeglib_*`, update constant names (`JPEGIO_MAGIC` → `JPEGLIB_MAGIC`, etc.)
- Verify `jpeglib.to_jpegio()` compatibility shim — if jpeglib plans to deprecate it, migrate to native API
- Run full test suite on Python 3.13

**Go/No-Go metrics**:
- All DCT tests pass on Python 3.13
- No deprecation warnings from jpeglib

**Complexity**: **S** — renaming and verification only

---

## Tier 2 — Strategic

### 4. Content-Adaptive Embedding (S-UNIWARD/WOW-inspired)

**Description**: Replace uniform-random pixel selection with texture-weighted cost functions. Embed preferentially in busy/textured regions where changes are least detectable. Expected to be 3-5x harder to detect statistically.

**Implementation approach**:
- New file `src/stegasoo/adaptive_cost.py`:
  - `compute_cost_map(image_data) -> np.ndarray` — per-pixel distortion cost via directional high-pass filters (Daubechies wavelet bank / KB filter)
  - `select_pixels_by_cost(cost_map, pixel_key, num_needed) -> list[int]` — weighted sampling, still ChaCha20-seeded for determinism
- `steganography.py`:
  - `generate_pixel_indices()`: add `cost_map` param, use weighted sampling when provided
  - `_embed_lsb()`: compute cost map when adaptive mode enabled
  - `_extract_lsb()`: must compute identical cost map to find same pixels, so the map can only depend on bits that embedding leaves unchanged
- `dct_steganography.py`: adapt `DEFAULT_EMBED_POSITIONS` per-block based on block texture energy
- Thread `adaptive: bool` through `encode.py`/`decode.py`
- `constants.py`: add `EMBED_MODE_ADAPTIVE_LSB`, filter kernels, cost thresholds

**Go/No-Go metrics**:
- Chi-square test (Feature 2) shows measurable improvement vs uniform-random
- **Critical**: cost map computation is deterministic across platforms (quantize to fixed-point integers)
- Round-trip decode succeeds on Linux x86, Linux ARM, macOS

**Complexity**: **L** — novel algorithm, cross-platform determinism requirement, touches core embedding

**Risks**: Floating-point differences in wavelet computation could break extraction. Mitigate with integer quantization. Increases encode/decode time ~2-3x.
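
The determinism requirement can be illustrated with a deliberately simplified, integer-only sketch. This is not S-UNIWARD or WOW: the cost here is a plain neighbour-difference measure, SHA-256 stands in for the ChaCha20-seeded generator, and the flat `List[int]` image layout is an assumption. What it does show is the property the plan calls critical: the map is computed with the LSB masked out, so encoder and decoder derive the same costs and the same pixel set.

```python
import hashlib
from typing import List


def compute_cost_map(pixels: List[int], width: int) -> List[int]:
    """Integer per-pixel cost: low where the neighbourhood is busy.

    The LSB is dropped first, so embedding (which only changes LSBs)
    cannot alter the map and the decoder recomputes identical costs.
    """
    base = [p >> 1 for p in pixels]  # ignore the bit that embedding may flip
    height = len(base) // width
    costs = []
    for y in range(height):
        for x in range(width):
            i = y * width + x
            right = base[i + 1] if x + 1 < width else base[i]
            down = base[i + width] if y + 1 < height else base[i]
            texture = abs(base[i] - right) + abs(base[i] - down)
            costs.append(256 - min(texture, 255))  # always >= 1; smooth areas cost most
    return costs


def select_pixels_by_cost(cost_map: List[int], pixel_key: bytes, num_needed: int) -> List[int]:
    """Keyed, cost-biased ordering using integer arithmetic only."""
    def rank(i: int) -> int:
        digest = hashlib.sha256(pixel_key + i.to_bytes(4, "big")).digest()
        # A low cost shrinks the keyed random value, so textured pixels sort earlier.
        return cost_map[i] * (1 + int.from_bytes(digest[:4], "big"))

    return sorted(range(len(cost_map)), key=lambda i: (rank(i), i))[:num_needed]
```

Because every operation is on Python integers, there is no floating-point rounding to diverge between x86 and ARM. A wavelet-based cost would need the fixed-point quantization mentioned in the metrics to get the same guarantee.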

---

### 5. Per-Message Forward Secrecy via HKDF

**Description**: Derive ephemeral per-message encryption keys using HKDF expansion from the Argon2id root key + random nonce. Compromising one message key doesn't reveal other messages; the root key itself must still stay secret.

**Implementation approach**:
- `crypto.py`:
  - Add `from cryptography.hazmat.primitives.kdf.hkdf import HKDFExpand`
  - `derive_message_key(root_key, nonce) -> bytes` — HKDF-Expand with SHA-256
  - `encrypt_message()`: generate 16-byte random nonce, derive per-message key, embed nonce in header
  - `decrypt_message()`: extract nonce, derive same key
  - Also derive pixel selection key via HKDF with different `info` param
- `constants.py`:
  - Bump `FORMAT_VERSION` to 6
  - `HKDF_INFO_ENCRYPTION = b"stegasoo-v6-encrypt"`, `HKDF_INFO_PIXEL = b"stegasoo-v6-pixel"`
  - `MESSAGE_NONCE_SIZE = 16`
- Header grows from 66 → 82 bytes: add `message_nonce(16)` field
- Update `HEADER_OVERHEAD` / `ENCRYPTION_OVERHEAD` in `steganography.py`

**Go/No-Go metrics**:
- Two messages with identical credentials produce different ciphertexts and different pixel locations
- `cryptography` library HKDF works with existing Argon2id output
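
The derivation can be sketched with the standard library alone. The function below follows HKDF-Expand as specified in RFC 5869 and is meant as a stand-in for `cryptography`'s `HKDFExpand`, which takes an algorithm, a length, and an `info` value but no separate nonce argument. Appending the nonce to the `info` label is therefore an assumption about how the two would be combined, as is the optional `info` parameter on `derive_message_key()`.

```python
import hashlib
import hmac
import os

HKDF_INFO_ENCRYPTION = b"stegasoo-v6-encrypt"
HKDF_INFO_PIXEL = b"stegasoo-v6-pixel"
MESSAGE_NONCE_SIZE = 16


def hkdf_expand_sha256(prk: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF-Expand (RFC 5869) with HMAC-SHA-256."""
    if length > 255 * 32:
        raise ValueError("requested length too large for HKDF-SHA-256")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        # T(i) = HMAC(PRK, T(i-1) | info | i)
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_message_key(root_key: bytes, nonce: bytes,
                       info: bytes = HKDF_INFO_ENCRYPTION) -> bytes:
    """Per-message key; the nonce is appended to the info label (assumption)."""
    if len(nonce) != MESSAGE_NONCE_SIZE:
        raise ValueError("nonce must be MESSAGE_NONCE_SIZE bytes")
    return hkdf_expand_sha256(root_key, info + nonce)


root_key = bytes(32)                     # placeholder for the Argon2id output
nonce = os.urandom(MESSAGE_NONCE_SIZE)   # stored in the header so the decoder can repeat this
enc_key = derive_message_key(root_key, nonce)
pixel_key = derive_message_key(root_key, nonce, HKDF_INFO_PIXEL)
```

The two `info` labels give independent keys from the same root key and nonce, which is the separation Feature 9 relies on. Expand-only HKDF is reasonable here on the assumption that the Argon2id output is already a uniformly random key.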

**Complexity**: **M** — well-defined crypto change, touches security-critical header format

---

### 6. PWA Mobile Interface

**Description**: Convert Flask Web UI to Progressive Web App. Mobile-optimized, installable, offline-capable static pages.

**Implementation approach**:
- New files in `frontends/web/static/`: `manifest.json`, `sw.js`, icon set (192×192, 512×512)
- Base template: add manifest link, theme-color meta, viewport meta, service worker registration
- `app.py`: serve manifest with correct MIME, add cache headers for static assets
- Responsive CSS for encode/decode accordion forms
- Camera capture: `<input type="file" accept="image/*" capture="environment">` for reference photo
- Service worker caches static assets only — NOT encode/decode API endpoints
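
As a rough illustration of the manifest item, the sketch below builds the JSON that `manifest.json` could contain. The app name, `start_url`, `display` mode, and icon paths are all placeholders chosen for the example; only the two icon sizes come from the plan. `application/manifest+json` is the registered media type for web app manifests, which is what "correct MIME" most likely refers to.

```python
import json

MANIFEST_MIME = "application/manifest+json"


def build_manifest(static_prefix: str = "/static") -> dict:
    """Return a minimal web app manifest (all values except icon sizes are placeholders)."""
    return {
        "name": "Stegasoo",
        "short_name": "Stegasoo",
        "start_url": "/",
        "display": "standalone",
        "icons": [
            {
                "src": f"{static_prefix}/icons/icon-{size}.png",  # hypothetical path
                "sizes": f"{size}x{size}",
                "type": "image/png",
            }
            for size in (192, 512)
        ],
    }


manifest_json = json.dumps(build_manifest(), indent=2)
```

Generating the manifest from code rather than shipping a static file is optional; it mainly helps if the static URL prefix differs between deployments.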

**Go/No-Go metrics**:
- Lighthouse PWA score >= 90
- Installable on Android Chrome and iOS Safari
- Offline: static pages load, encode/decode shows graceful "offline" message

**Complexity**: **M** — frontend only, no core library changes

**Risks**: Service worker registration, and with it installability, requires HTTPS (already supported via `ssl_utils.py`).

---

## Tier 3 — Moonshot

### 7. Plausible Deniability / Dual-Payload Mode

**Description**: Two independent encrypted payloads in one carrier, each with different credentials. Reveal decoy under coercion; real payload stays hidden.

**Implementation approach**:
- New file `src/stegasoo/dual_payload.py`:
  - `encode_dual(message_a, message_b, carrier, creds_a, creds_b)`
  - Partition available pixels into two disjoint pools using different seeds
  - **Critical**: ALL images (single or dual) must fill unused pixel pool with random data so single-payload and dual-payload images are indistinguishable
- `steganography.py`: `generate_pixel_indices()` gets `exclude_indices` param
- `decode.py`: each credential set finds a different valid payload; wrong credentials produce garbage
- CLI + Web UI: dual-payload encode workflow

**Go/No-Go metrics**:
- Single-payload and dual-payload images are statistically indistinguishable (chi-square can't differentiate)
- Each payload decodes independently
- Wrong credentials for one payload don't reveal other payload's existence
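
A small sketch of the pool partitioning and random fill, under stated assumptions: the signature of `generate_pixel_indices()` shown here is hypothetical apart from the `exclude_indices` parameter, SHA-256 replaces the project's real keyed generator, and `random_fill()` is an illustrative helper name.

```python
import hashlib
import os
from typing import Dict, Iterable, List, Set


def generate_pixel_indices(seed: bytes, total: int, count: int,
                           exclude_indices: Iterable[int] = ()) -> List[int]:
    """Keyed ordering of pixel indices, skipping any in exclude_indices."""
    excluded: Set[int] = set(exclude_indices)
    order = sorted(
        (i for i in range(total) if i not in excluded),
        key=lambda i: hashlib.sha256(seed + i.to_bytes(4, "big")).digest(),
    )
    if count > len(order):
        raise ValueError("not enough free pixels")
    return order[:count]


def random_fill(total: int, used: Iterable[int]) -> Dict[int, int]:
    """Random LSB for every pixel that no payload claimed."""
    taken = set(used)
    return {i: os.urandom(1)[0] & 1 for i in range(total) if i not in taken}


pool_a = generate_pixel_indices(b"seed-a", 100, 20)
pool_b = generate_pixel_indices(b"seed-b", 100, 20, exclude_indices=pool_a)
filler = random_fill(100, pool_a + pool_b)
```

One caveat this sketch makes visible: pool B depends on pool A, so a decoder holding only the second credential set cannot recompute the exclusion set on its own. How extraction locates pool B without the first credentials is left open by the plan and is part of the security analysis the complexity rating anticipates.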

**Complexity**: **XL** — novel design, halves capacity per payload, challenging UX, needs rigorous security analysis

**Dependencies**: Feature 2 (validation), Feature 4 (detectability reduction)

---

## Architectural Improvements

### 8. EmbeddingBackend Protocol

**Description**: Typed plugin interface for all embedding algorithms. Replace if/elif dispatch in `steganography.py` with a registry.

**Implementation approach**:
- New package `src/stegasoo/backends/`:
  - `protocol.py` — `EmbeddingBackend(Protocol)` with `embed()`, `extract()`, `calculate_capacity()`, `is_available()`
  - `lsb.py`, `dct.py` — wrap existing functions
  - `registry.py` — `BackendRegistry` mapping mode strings to backends
- `steganography.py`: `embed_in_image()` / `extract_from_image()` dispatch via registry
- `__init__.py`: export protocol and `register_backend()`

**Complexity**: **M** — implement before Features 4 and 7 (they become new backends)
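
The protocol-plus-registry shape might look roughly like this. The four method names and `BackendRegistry` come from the plan; the parameter lists, the error types, and the `ToyAppendBackend` used for demonstration are assumptions, and the toy backend is not steganography at all.

```python
from typing import Dict, Protocol, runtime_checkable


@runtime_checkable
class EmbeddingBackend(Protocol):
    """Method names follow the plan; parameter lists are assumptions."""

    def embed(self, carrier: bytes, payload: bytes) -> bytes: ...
    def extract(self, stego: bytes) -> bytes: ...
    def calculate_capacity(self, carrier: bytes) -> int: ...
    def is_available(self) -> bool: ...


class BackendRegistry:
    """Maps mode strings to backends, replacing if/elif dispatch."""

    def __init__(self) -> None:
        self._backends: Dict[str, EmbeddingBackend] = {}

    def register_backend(self, mode: str, backend: EmbeddingBackend) -> None:
        # runtime_checkable only verifies that the methods exist, not their signatures.
        if not isinstance(backend, EmbeddingBackend):
            raise TypeError(f"{type(backend).__name__} does not implement EmbeddingBackend")
        self._backends[mode] = backend

    def get(self, mode: str) -> EmbeddingBackend:
        if mode not in self._backends:
            raise ValueError(f"unknown embedding mode: {mode!r}")
        backend = self._backends[mode]
        if not backend.is_available():
            raise RuntimeError(f"backend {mode!r} is registered but unavailable")
        return backend


class ToyAppendBackend:
    """Demonstration only: appends the payload plus a 4-byte length."""

    def embed(self, carrier: bytes, payload: bytes) -> bytes:
        return carrier + payload + len(payload).to_bytes(4, "big")

    def extract(self, stego: bytes) -> bytes:
        size = int.from_bytes(stego[-4:], "big")
        return stego[-4 - size:-4]

    def calculate_capacity(self, carrier: bytes) -> int:
        return 2**32 - 1

    def is_available(self) -> bool:
        return True


registry = BackendRegistry()
registry.register_backend("toy", ToyAppendBackend())
backend = registry.get("toy")
```

Structural typing means the existing LSB and DCT functions can be wrapped in thin classes without inheriting from anything, and `is_available()` gives optional backends (for example one that needs `jpeglib`) a single place to report a missing dependency.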

---

### 9. HKDF Key Separation

Subsumed by Feature 5. The HKDF expansion provides:
- Encryption key: `HKDF-Expand(root_key, info="stegasoo-v6-encrypt", nonce)`
- Pixel selection key: `HKDF-Expand(root_key, info="stegasoo-v6-pixel", nonce)`
- Future: MAC key, padding key, etc.

---

### 10. `[core]` Extra with Minimal Deps

**Description**: Move Pillow to `[image]` extra, base deps = `cryptography` + `argon2-cffi` + `zstandard` only.

**Complexity**: **S** — but Pillow is used in `crypto.py` for photo hashing (core to security model). Only worth it with a concrete headless use case. **Low priority.**

---

## Ecosystem Features

### 11. Aletheia Integration

Optional `--engine aletheia` backend for Feature 2's `stegasoo check`. BSD-licensed, provides SPA/RS/WS attacks + ML classifiers. **Complexity: S** (after Feature 2). **Depends on**: Feature 2.

### 12. C2PA/AI Provenance Watermarking

Embed C2PA metadata alongside stego payloads. **Complexity: L** — C2PA is a complex standard. Potentially conflicts with stego goals (adds detectable metadata). Research-heavy.

### 13. Signal/Matrix Bot

Bot that decodes stego images in a channel using configured channel key. **Complexity: M** — integration work, uses existing `decode()` API.

### 14. Homebrew Tap + Nix Flake

Package distribution for macOS/NixOS. **Complexity: S** — packaging only, no code changes.

---

## Summary Table

| # | Feature | Tier | Size | Dependencies | Primary Files |
|---|---------|------|------|-------------|---------------|
| 1 | Platform DCT Presets | T1 | M | — | new `platform_presets.py`, `dct_steganography.py`, `encode.py`, `cli.py` |
| 2 | Steganalysis Self-Check | T1 | M | — | new `steganalysis.py`, `cli.py`, `constants.py` |
| 3 | Python 3.13 DCT Cleanup | T1 | S | — | `dct_steganography.py` |
| 4 | Content-Adaptive Embedding | T2 | L | numpy, #2 | new `adaptive_cost.py`, `steganography.py`, `constants.py` |
| 5 | HKDF Forward Secrecy | T2 | M | — | `crypto.py`, `constants.py`, `steganography.py` |
| 6 | PWA Mobile Interface | T2 | M | — | `frontends/web/` templates + static |
| 7 | Dual-Payload Mode | T3 | XL | #2, #4 | new `dual_payload.py`, `steganography.py`, `cli.py` |
| 8 | EmbeddingBackend Protocol | Arch | M | — | new `backends/` package, `steganography.py` |
| 9 | HKDF Key Separation | Arch | — | Included in #5 | `crypto.py` |
| 10 | `[core]` Extra | Arch | S | — | `pyproject.toml` |
| 11 | Aletheia Integration | Eco | S | #2 | `steganalysis.py` |
| 12 | C2PA Watermarking | Eco | L | — | new module |
| 13 | Signal/Matrix Bot | Eco | M | — | new `bots/` package |
| 14 | Homebrew + Nix | Eco | S | — | packaging files only |

---

## Suggested Roadmap

### Phase 1 — Foundations (v4.4.0)

1. **#3** Python 3.13 DCT Cleanup (S) — unblocks CI on 3.13
2. **#8** EmbeddingBackend Protocol (M) — architectural cleanup before new embedding work
3. **#2** Steganalysis Self-Check (M) — validation tooling for everything that follows

### Phase 2 — Security & Robustness (v4.5.0)

4. **#5** HKDF Forward Secrecy (M) — FORMAT_VERSION bump to 6, improved crypto
5. **#1** Platform-Calibrated DCT Presets (M) — high user value for social media
6. **#14** Homebrew + Nix (S) — distribution expansion

### Phase 3 — Advanced Steganography (v5.0.0)

7. **#4** Content-Adaptive Embedding (L) — major security improvement
8. **#6** PWA Mobile Interface (M) — parallel frontend work stream

### Phase 4 — Moonshot (v5.x+)

9. **#7** Dual-Payload Mode (XL) — after #2 and #4 are solid
10. **#12** C2PA Watermarking (L) — research-heavy
11. **#13** Signal/Matrix Bot (M) — community-driven

---

## Additional Ideas (Backlog)

- **Animated GIF steganography** — LSB in GIF frames, natural multi-media extension
- **PDF steganography** — whitespace/font metric/embedded image payloads
- **Batch encode** — `stegasoo batch-encode --dir /photos/` with auto carrier selection (BATCH_* constants suggest this was planned)
- **Stego identification** — `stegasoo identify image.png` probes for known stego signatures
- **Per-device credential sync via QR** — channel key as stego image of reference photo
- **`stegasoo verify`** — decode + confirm message matches expected hash without revealing contents