# Stegasoo Ideas Scout — Implementation Plans (2026-03-24)

Baseline: v4.3.0, Python >=3.11, FORMAT_VERSION 5, no existing users (no backward compat constraints).

---

## Tier 1 — Quick Wins

### 1. Platform-Calibrated DCT Presets

**Description**: `--platform telegram|discord|signal|whatsapp` flag for DCT encode. Bakes in each platform's known recompression parameters. Pre-verifies payload survives before outputting.

**Implementation approach**:
- New file `src/stegasoo/platform_presets.py` — `PlatformPreset` dataclass + `PRESETS` dict mapping platform → tuned `quant_step`, `jpeg_quality`, `embed_positions`, `max_dimension`, `recompress_quality`
- `dct_steganography.py`: `_embed_scipy_dct_safe()` / `_embed_jpegio()` accept optional preset overrides for `QUANT_STEP`, `DEFAULT_EMBED_POSITIONS`, output quality
- New `pre_verify_survival()` function: encode → re-save at platform quality → extract → pass/fail
- Thread `platform` param through `encode.py` → `steganography.py` → DCT functions
- `cli.py`: add `--platform` as `click.Choice` + `--verify/--no-verify` (pre-verification doubles encode time)
- LSB + `--platform` should error early — LSB data is destroyed by any JPEG recompression

**Known platform params** (from research):

| Platform | Quality | Max Dimension | Notes |
|----------|---------|---------------|-------|
| Telegram | ~82 | 2560×2560 | ~81KB embeddable |
| Discord | ~85 | Varies (Nitro) | |
| Signal | ~80 | Aggressive | |
| WhatsApp | ~70 | 1600×1600 | Most lossy |

**Go/No-Go metrics**:
- >95% payload survival rate per platform at 1KB message size in automated tests
- Pre-verification correctly predicts real platform behavior (manual validation per platform at least once)

**Complexity**: **M** — new file + parameter threading through 4-5 functions

**Risks**: Platform params change without notice. Add version/date stamps to presets and a `stegasoo tools verify-platform` test command.

---

### 2. Steganalysis Self-Check (`stegasoo check`)

**Description**: New CLI command running chi-square and RS (Regular-Singular) statistical analysis on stego images. Outputs detectability risk level (low/medium/high).

**Implementation approach**:
- New file `src/stegasoo/steganalysis.py`:
  - `chi_square_analysis(image_data) -> float` — chi-square statistic on LSB distribution per channel
  - `rs_analysis(image_data) -> float` — Regular-Singular groups analysis (requires numpy)
  - `assess_risk(chi_p, rs_estimate) -> str` — maps to "low"/"medium"/"high"
  - `check_image(image_data) -> dict` — orchestrator
- `cli.py`: new `@cli.command("check")` with `IMAGE` arg, `--json`, `--mode lsb|dct|auto`
- `constants.py`: threshold constants for chi-square p-value and RS boundaries
- `__init__.py`: export `check_image` in `__all__`
- Start LSB-only; DCT steganalysis (calibration attack) deferred

**Go/No-Go metrics**:
- Clean images → consistently "low risk"
- Naive sequential LSB → "high risk"
- Stegasoo LSB at <50% capacity → "low" or "medium"

**Complexity**: **M** — ~150 lines numpy per test, straightforward CLI integration

---

### 3. Python 3.13 DCT Cleanup

**Description**: The `jpegio` → `jpeglib` migration is already done in code. Remaining work: rename stale `jpegio` references and verify on 3.13.

**Implementation approach**:
- `dct_steganography.py`: rename `HAS_JPEGIO` → `HAS_JPEGLIB`, `_jpegio_*` functions → `_jpeglib_*`, update constant names (`JPEGIO_MAGIC` → `JPEGLIB_MAGIC`, etc.)
- Verify `jpeglib.to_jpegio()` compatibility shim — if jpeglib plans to deprecate it, migrate to native API
- Run full test suite on Python 3.13

**Go/No-Go metrics**:
- All DCT tests pass on Python 3.13
- No deprecation warnings from jpeglib

**Complexity**: **S** — renaming and verification only

---

## Tier 2 — Strategic

### 4. Content-Adaptive Embedding (S-UNIWARD/WOW-inspired)

**Description**: Replace uniform-random pixel selection with texture-weighted cost functions.
Embed preferentially in busy/textured regions where changes are least detectable. 3-5x harder to detect statistically.

**Implementation approach**:
- New file `src/stegasoo/adaptive_cost.py`:
  - `compute_cost_map(image_data) -> np.ndarray` — per-pixel distortion cost via directional high-pass filters (Daubechies wavelet bank / KB filter)
  - `select_pixels_by_cost(cost_map, pixel_key, num_needed) -> list[int]` — weighted sampling, still ChaCha20-seeded for determinism
- `steganography.py`:
  - `generate_pixel_indices()`: add `cost_map` param, use weighted sampling when provided
  - `_embed_lsb()`: compute cost map when adaptive mode enabled
  - `_extract_lsb()`: must compute identical cost map to find same pixels
- `dct_steganography.py`: adapt `DEFAULT_EMBED_POSITIONS` per-block based on block texture energy
- Thread `adaptive: bool` through `encode.py`/`decode.py`
- `constants.py`: add `EMBED_MODE_ADAPTIVE_LSB`, filter kernels, cost thresholds

**Go/No-Go metrics**:
- Chi-square test (Feature 2) shows measurable improvement vs uniform-random
- **Critical**: cost map computation is deterministic across platforms (quantize to fixed-point integers)
- Round-trip decode succeeds on Linux x86, Linux ARM, macOS

**Complexity**: **L** — novel algorithm, cross-platform determinism requirement, touches core embedding

**Risks**: Floating-point differences in wavelet computation could break extraction. Mitigate with integer quantization. Increases encode/decode time ~2-3x.

---

### 5. Per-Message Forward Secrecy via HKDF

**Description**: Derive ephemeral per-message encryption keys using HKDF expansion from the Argon2id root key + random nonce. Compromising one message key doesn't reveal the keys of other messages.

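
As a rough illustration of the per-message derivation this feature describes, the sketch below implements HKDF-Expand (RFC 5869) with HMAC-SHA-256 using only the standard library. It deliberately swaps in `hmac`/`hashlib` for the `cryptography` package's `HKDFExpand` named in this plan so that it runs without extra dependencies. The helper names `hkdf_expand` and `derive_message_keys`, the all-zero stand-in for the Argon2id root key, and the choice to append the nonce to the `info` label are assumptions for illustration; the plan does not pin down how the nonce enters the derivation.

```python
import hashlib
import hmac
import os

# Labels and nonce size as proposed for FORMAT_VERSION 6 in this plan.
HKDF_INFO_ENCRYPTION = b"stegasoo-v6-encrypt"
HKDF_INFO_PIXEL = b"stegasoo-v6-pixel"
MESSAGE_NONCE_SIZE = 16


def hkdf_expand(prk: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF-Expand from RFC 5869, using HMAC-SHA-256."""
    hash_len = hashlib.sha256().digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output is too long for HKDF-Expand")
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        # T(i) = HMAC(PRK, T(i-1) || info || i)
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_message_keys(root_key: bytes, nonce: bytes) -> tuple[bytes, bytes]:
    """Hypothetical helper: derive (encryption_key, pixel_key) for one message.

    Assumption: the per-message nonce is bound by appending it to the info label.
    """
    if len(nonce) != MESSAGE_NONCE_SIZE:
        raise ValueError("nonce must be MESSAGE_NONCE_SIZE bytes")
    enc_key = hkdf_expand(root_key, HKDF_INFO_ENCRYPTION + nonce)
    pixel_key = hkdf_expand(root_key, HKDF_INFO_PIXEL + nonce)
    return enc_key, pixel_key


# Stand-in for the 32-byte Argon2id output (assumption, not a real key).
root_key = bytes(32)

# Two messages under the same credentials get independent nonces.
nonce_a = os.urandom(MESSAGE_NONCE_SIZE)
nonce_b = os.urandom(MESSAGE_NONCE_SIZE)
enc_a, pix_a = derive_message_keys(root_key, nonce_a)
enc_b, pix_b = derive_message_keys(root_key, nonce_b)
```

Because the nonce travels in the header, the decoder can rerun `derive_message_keys` with the same root key and recover both keys. Distinct `info` labels keep the encryption key and the pixel selection key independent even though they share a root key and nonce.
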

**Implementation approach**:
- `crypto.py`:
  - Add `from cryptography.hazmat.primitives.kdf.hkdf import HKDFExpand`
  - `derive_message_key(root_key, nonce) -> bytes` — HKDF-Expand with SHA-256
  - `encrypt_message()`: generate 16-byte random nonce, derive per-message key, embed nonce in header
  - `decrypt_message()`: extract nonce, derive same key
  - Also derive pixel selection key via HKDF with different `info` param
- `constants.py`:
  - Bump `FORMAT_VERSION` to 6
  - `HKDF_INFO_ENCRYPTION = b"stegasoo-v6-encrypt"`, `HKDF_INFO_PIXEL = b"stegasoo-v6-pixel"`
  - `MESSAGE_NONCE_SIZE = 16`
- Header grows from 66 → 82 bytes: add `message_nonce(16)` field
- Update `HEADER_OVERHEAD` / `ENCRYPTION_OVERHEAD` in `steganography.py`

**Go/No-Go metrics**:
- Two messages with identical credentials produce different ciphertexts and different pixel locations
- `cryptography` library HKDF works with existing Argon2id output

**Complexity**: **M** — well-defined crypto change, touches security-critical header format

---

### 6. PWA Mobile Interface

**Description**: Convert Flask Web UI to Progressive Web App. Mobile-optimized, installable, offline-capable static pages.

**Implementation approach**:
- New files in `frontends/web/static/`: `manifest.json`, `sw.js`, icon set (192×192, 512×512)
- Base template: add manifest link, theme-color meta, viewport meta, service worker registration
- `app.py`: serve manifest with correct MIME, add cache headers for static assets
- Responsive CSS for encode/decode accordion forms
- Camera capture: `` for reference photo
- Service worker caches static assets only — NOT encode/decode API endpoints

**Go/No-Go metrics**:
- Lighthouse PWA score >= 90
- Installable on Android Chrome and iOS Safari
- Offline: static pages load, encode/decode shows graceful "offline" message

**Complexity**: **M** — frontend only, no core library changes

**Risks**: Camera capture requires HTTPS (already supported via `ssl_utils.py`).

---

## Tier 3 — Moonshot

### 7. Plausible Deniability / Dual-Payload Mode

**Description**: Two independent encrypted payloads in one carrier, each with different credentials. Reveal decoy under coercion; real payload stays hidden.

**Implementation approach**:
- New file `src/stegasoo/dual_payload.py`:
  - `encode_dual(message_a, message_b, carrier, creds_a, creds_b)`
  - Partition available pixels into two disjoint pools using different seeds
  - **Critical**: ALL images (single or dual) must fill unused pixel pool with random data so single-payload and dual-payload images are indistinguishable
- `steganography.py`: `generate_pixel_indices()` gets `exclude_indices` param
- `decode.py`: each credential set finds a different valid payload; wrong credentials produce garbage
- CLI + Web UI: dual-payload encode workflow

**Go/No-Go metrics**:
- Single-payload and dual-payload images are statistically indistinguishable (chi-square can't differentiate)
- Each payload decodes independently
- Wrong credentials for one payload don't reveal other payload's existence

**Complexity**: **XL** — novel design, halves capacity per payload, challenging UX, needs rigorous security analysis

**Dependencies**: Feature 2 (validation), Feature 4 (detectability reduction)

---

## Architectural Improvements

### 8. EmbeddingBackend Protocol

**Description**: Typed plugin interface for all embedding algorithms. Replace if/elif dispatch in `steganography.py` with a registry.

**Implementation approach**:
- New package `src/stegasoo/backends/`:
  - `protocol.py` — `EmbeddingBackend(Protocol)` with `embed()`, `extract()`, `calculate_capacity()`, `is_available()`
  - `lsb.py`, `dct.py` — wrap existing functions
  - `registry.py` — `BackendRegistry` mapping mode strings to backends
- `steganography.py`: `embed_in_image()` / `extract_from_image()` dispatch via registry
- `__init__.py`: export protocol and `register_backend()`

**Complexity**: **M** — implement before Features 4 and 7 (they become new backends)

---

### 9. HKDF Key Separation

Subsumed by Feature 5. The HKDF expansion provides:
- Encryption key: `HKDF-Expand(root_key, info="stegasoo-v6-encrypt", nonce)`
- Pixel selection key: `HKDF-Expand(root_key, info="stegasoo-v6-pixel", nonce)`
- Future: MAC key, padding key, etc.

---

### 10. `[core]` Extra with Minimal Deps

**Description**: Move Pillow to `[image]` extra, base deps = `cryptography` + `argon2-cffi` + `zstandard` only.

**Complexity**: **S** — but Pillow is used in `crypto.py` for photo hashing (core to security model). Only worth it with a concrete headless use case. **Low priority.**

---

## Ecosystem Features

### 11. Aletheia Integration

Optional `--engine aletheia` backend for Feature 2's `stegasoo check`. BSD-licensed, provides SPA/RS/WS attacks + ML classifiers. **Complexity: S** (after Feature 2). **Depends on**: Feature 2.

### 12. C2PA/AI Provenance Watermarking

Embed C2PA metadata alongside stego payloads. **Complexity: L** — C2PA is a complex standard. Potentially conflicts with stego goals (adds detectable metadata). Research-heavy.

### 13. Signal/Matrix Bot

Bot that decodes stego images in a channel using configured channel key. **Complexity: M** — integration work, uses existing `decode()` API.

### 14. Homebrew Tap + Nix Flake

Package distribution for macOS/NixOS. **Complexity: S** — packaging only, no code changes.


---

## Summary Table

| # | Feature | Tier | Size | Dependencies | Primary Files |
|---|---------|------|------|-------------|---------------|
| 1 | Platform DCT Presets | T1 | M | — | new `platform_presets.py`, `dct_steganography.py`, `encode.py`, `cli.py` |
| 2 | Steganalysis Self-Check | T1 | M | — | new `steganalysis.py`, `cli.py`, `constants.py` |
| 3 | Python 3.13 DCT Cleanup | T1 | S | — | `dct_steganography.py` |
| 4 | Content-Adaptive Embedding | T2 | L | numpy, #2 | new `adaptive_cost.py`, `steganography.py`, `constants.py` |
| 5 | HKDF Forward Secrecy | T2 | M | — | `crypto.py`, `constants.py`, `steganography.py` |
| 6 | PWA Mobile Interface | T2 | M | — | `frontends/web/` templates + static |
| 7 | Dual-Payload Mode | T3 | XL | #2, #4 | new `dual_payload.py`, `steganography.py`, `cli.py` |
| 8 | EmbeddingBackend Protocol | Arch | M | — | new `backends/` package, `steganography.py` |
| 9 | HKDF Key Separation | Arch | — | Included in #5 | `crypto.py` |
| 10 | `[core]` Extra | Arch | S | — | `pyproject.toml` |
| 11 | Aletheia Integration | Eco | S | #2 | `steganalysis.py` |
| 12 | C2PA Watermarking | Eco | L | — | new module |
| 13 | Signal/Matrix Bot | Eco | M | — | new `bots/` package |
| 14 | Homebrew + Nix | Eco | S | — | packaging files only |

---

## Suggested Roadmap

### Phase 1 — Foundations (v4.4.0)

1. **#3** Python 3.13 DCT Cleanup (S) — unblocks CI on 3.13
2. **#8** EmbeddingBackend Protocol (M) — architectural cleanup before new embedding work
3. **#2** Steganalysis Self-Check (M) — validation tooling for everything that follows

### Phase 2 — Security & Robustness (v4.5.0)

4. **#5** HKDF Forward Secrecy (M) — FORMAT_VERSION bump to 6, improved crypto
5. **#1** Platform-Calibrated DCT Presets (M) — high user value for social media
6. **#14** Homebrew + Nix (S) — distribution expansion

### Phase 3 — Advanced Steganography (v5.0.0)

7. **#4** Content-Adaptive Embedding (L) — major security improvement
8. **#6** PWA Mobile Interface (M) — parallel frontend work stream

### Phase 4 — Moonshot (v5.x+)

9. **#7** Dual-Payload Mode (XL) — after #2 and #4 are solid
10. **#12** C2PA Watermarking (L) — research-heavy
11. **#13** Signal/Matrix Bot (M) — community-driven

---

## Additional Ideas (Backlog)

- **Animated GIF steganography** — LSB in GIF frames, natural multi-media extension
- **PDF steganography** — whitespace/font metric/embedded image payloads
- **Batch encode** — `stegasoo batch-encode --dir /photos/` with auto carrier selection (BATCH_* constants suggest this was planned)
- **Stego identification** — `stegasoo identify image.png` probes for known stego signatures
- **Per-device credential sync via QR** — channel key as stego image of reference photo
- **`stegasoo verify`** — decode + confirm message matches expected hash without revealing contents