Document CLI and Web UI architecture for future devs

CLI module now explains: - Click command group hierarchy (tree diagram) - JSON output pattern for scriptability - Secure input handling (hide_input, confirmation_prompt) - Dry-run mode pattern - Batch processing with variadic args and progress callbacks Web UI now explains: - Flask architecture overview with ASCII diagram - Subprocess isolation pattern (why we run stegasoo in subprocesses) - Async job management with polling flow diagram - Context processors for template globals - Secret key persistence for session survival - Environment-based configuration (12-factor style) If you're reading this code trying to learn Flask/Click patterns, these comments should actually teach you something useful. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-06 23:58:59 -05:00
parent 93420704e8
commit aa99a258f4
2 changed files with 298 additions and 18 deletions
--- a/frontends/web/app.py
+++ b/frontends/web/app.py
@@ -2,23 +2,76 @@
 """
 Stegasoo Web Frontend (v4.0.0)

-Flask-based web UI for steganography operations.
-Supports both text messages and file embedding.
+A production Flask application demonstrating proper web architecture patterns.
+This isn't just a quick demo - it's built to run on a Raspberry Pi 24/7.
+
+ARCHITECTURE OVERVIEW
+=====================
+
+    ┌─────────────────────────────────────────────────────────────────────┐
+    │                         FLASK APPLICATION                            │
+    ├─────────────────────────────────────────────────────────────────────┤
+    │                                                                      │
+    │   Routes (/encode, /decode, /api/*)                                  │
+    │       │                                                              │
+    │       ├── auth.py           # Session management, user accounts      │
+    │       ├── temp_storage.py   # File-based temp storage with expiry    │
+    │       ├── subprocess_stego.py  # Isolated encode/decode workers      │
+    │       └── ssl_utils.py      # Self-signed cert generation            │
+    │                                                                      │
+    │   Templates (Jinja2)                                                 │
+    │       └── base.html → encode.html, decode.html, etc.                │
+    │                                                                      │
+    │   Static assets (CSS, JS)                                           │
+    │       └── Vanilla JS, no framework (keeps it simple)                │
+    │                                                                      │
+    └─────────────────────────────────────────────────────────────────────┘
+
+KEY PATTERNS
+============
+
+1. SUBPROCESS ISOLATION
+   Stegasoo's DCT mode uses scipy/jpegio which can crash on malformed input.
+   We run encode/decode in subprocesses so crashes don't take down the server:
+
+       subprocess_stego = SubprocessStego(timeout=180)
+       result = subprocess_stego.encode(carrier, ref, message, ...)
+
+   If the subprocess crashes, we catch it and return an error gracefully.
+
+2. ASYNC JOBS WITH PROGRESS
+   Encoding large images can take 30+ seconds. We use ThreadPoolExecutor
+   to run jobs in background threads with progress reporting:
+
+       job_id = generate_job_id()
+       _executor.submit(_run_encode_job, job_id, params)
+       # Client polls /api/encode/progress/<job_id> for updates
+
+3. CONTEXT PROCESSORS
+   @app.context_processor injects variables into ALL templates:
+
+       return {"version": __version__, "has_dct": has_dct_support()}
+
+   Now every template can use {{ version }} without passing it explicitly.
+
+4. BEFORE_REQUEST HOOKS
+   @app.before_request runs before every request. We use it for:
+   - First-run setup redirect (no users → /setup)
+   - Session validation
+   - Cleanup of old temp files
+
+5. SECURE SECRET KEY
+   Flask sessions need a secret key. We persist it to a file so sessions
+   survive server restarts (otherwise everyone gets logged out).

 CHANGES in v4.0.0:
 - Added channel key support for deployment/group isolation
 - New /api/channel/status endpoint
 - Channel key selector on encode/decode pages
- Messages encoded with channel key require same key to decode

 CHANGES in v3.2.0:
 - Removed date dependency from all operations
- Renamed day_phrase → passphrase
- No date selection or tracking needed
 - Simplified user experience for asynchronous communications
-
-NEW in v3.0: LSB and DCT embedding modes with advanced options.
-NEW in v3.0.1: DCT output format selection (PNG or JPEG) and color mode (grayscale or color).
 """

 import io
@@ -157,8 +210,35 @@ except ImportError:
 # ============================================================================
 # SUBPROCESS ISOLATION FOR STEGASOO OPERATIONS
 # ============================================================================
-# Runs encode/decode/compare in subprocesses to prevent jpegio/scipy crashes
-# from taking down the Flask server.
+#
+# This is a critical reliability pattern. Here's the problem:
+#
+# scipy's DCT and jpegio can crash (segfault) on:
+# - Malformed JPEG files
+# - Very large images that exhaust memory
+# - Certain edge cases in coefficient manipulation
+#
+# If these crash in the main Flask process, your whole server dies.
+# Users get a connection reset, and the service goes down.
+#
+# The solution: Run stegasoo operations in separate Python processes.
+#
+#     Main Flask process                 Worker subprocess
+#     ┌─────────────────┐               ┌─────────────────┐
+#     │                 │   spawn       │                 │
+#     │  /api/encode    │──────────────>│  encode()       │
+#     │                 │               │                 │
+#     │  wait for       │<──────────────│  return result  │
+#     │  result         │    or crash   │  (or crash)     │
+#     │                 │               │                 │
+#     │  handle error   │               │  (process dies) │
+#     └─────────────────┘               └─────────────────┘
+#
+# If the subprocess crashes, we catch the error and return a friendly message.
+# The main server keeps running. Users can try again with different input.
+#
+# The subprocess_stego module handles all the pickling/unpickling of data.
+
 from subprocess_stego import (
    SubprocessStego,
    cleanup_progress_file,
@@ -182,38 +262,89 @@ subprocess_stego = SubprocessStego(timeout=180)  # 3 minute timeout for large im
 # ============================================================================
 # FLASK APP CONFIGURATION
 # ============================================================================
+#
+# Flask configuration demonstrates several production patterns:
+#
+# 1. SECRET KEY PERSISTENCE
+#    Flask uses secret_key to sign session cookies. If it changes, all users
+#    get logged out. We save it to a file so it survives restarts.
+#
+# 2. CONTENT LENGTH LIMITS
+#    MAX_CONTENT_LENGTH prevents DoS via huge uploads. Flask will reject
+#    requests that exceed this before loading them into memory.
+#
+# 3. ENVIRONMENT-BASED CONFIG
+#    Settings come from environment variables, allowing:
+#    - Different settings per deployment (dev/staging/prod)
+#    - Docker/systemd to inject config without code changes
+#    - 12-factor app compliance
+#
+# 4. INSTANCE FOLDER
+#    Flask's instance_path is for per-deployment data (databases, keys).
+#    It's .gitignored by default - perfect for secrets.

 app = Flask(__name__)

 # Persist secret key so sessions survive restarts
+# Without this, every restart = everyone gets logged out
 _instance_path = Path(app.instance_path)
 _instance_path.mkdir(parents=True, exist_ok=True)
 _secret_key_file = _instance_path / ".secret_key"
 if _secret_key_file.exists():
    app.secret_key = _secret_key_file.read_text().strip()
 else:
-    app.secret_key = secrets.token_hex(32)
+    # First run: generate a new key and save it
+    app.secret_key = secrets.token_hex(32)  # 256 bits of randomness
    _secret_key_file.write_text(app.secret_key)
-    _secret_key_file.chmod(0o600)
+    _secret_key_file.chmod(0o600)  # Only owner can read

+# Reject uploads larger than this (prevents memory exhaustion)
 app.config["MAX_CONTENT_LENGTH"] = MAX_FILE_SIZE

 # Auth configuration from environment
+# STEGASOO_AUTH_ENABLED=false disables login (for local/dev use)
 app.config["AUTH_ENABLED"] = os.environ.get("STEGASOO_AUTH_ENABLED", "true").lower() == "true"
 app.config["HTTPS_ENABLED"] = os.environ.get("STEGASOO_HTTPS_ENABLED", "false").lower() == "true"

-# Initialize auth module
+# Initialize auth module (sets up session handling, user DB)
 init_auth(app)

 # ============================================================================
 # ASYNC JOB MANAGEMENT (v4.1.2)
 # ============================================================================
-# Encode operations can run in background threads with progress reporting
+#
+# Problem: DCT encoding a large image can take 30-60 seconds.
+# Solution: Run it in a background thread, let the client poll for progress.
+#
+# The flow:
+#
+#     Client                          Server
+#     ──────                          ──────
+#     POST /api/encode/async ──────>  Start background job
+#                            <──────  Return job_id
+#
+#     GET /api/encode/progress/123 ─>  Check job status
+#                            <──────  {"progress": 45, "phase": "embedding"}
+#
+#     GET /api/encode/progress/123 ─>  Check again
+#                            <──────  {"status": "complete", "file_id": "abc"}
+#
+#     GET /api/download/abc ────────>  Download result
+#                            <──────  Encoded image
+#
+# Why ThreadPoolExecutor instead of Celery/Redis?
+# - This runs on a Raspberry Pi with 1GB RAM
+# - We don't need distributed workers
+# - Keep it simple - threads are fine for 2 concurrent jobs
+#
+# The thread pool is limited to 2 workers because:
+# - Each encode loads the full image into memory
+# - Too many concurrent jobs = OOM on the Pi

-# Thread pool for background encode/decode operations
 _executor = ThreadPoolExecutor(max_workers=2)

-# Job storage: job_id -> {status, result, error, file_id, ...}
+# Job storage: job_id -> {status, result, error, file_id, created, ...}
+# We use a dict with a lock because threads access it concurrently
 _jobs = {}
 _jobs_lock = threading.Lock()

@@ -268,6 +399,27 @@ THUMBNAIL_FILES: dict[str, bytes] = {}  # Not used - see temp_storage.py
 # ============================================================================
 # TEMPLATE CONTEXT PROCESSOR
 # ============================================================================
+#
+# Context processors inject variables into EVERY template automatically.
+# Instead of passing the same data to every render_template() call:
+#
+#     # Bad: repetitive and error-prone
+#     return render_template("page.html", version=__version__, has_dct=...)
+#
+# We define it once here and it's available everywhere:
+#
+#     # In any template:
+#     <p>Version: {{ version }}</p>
+#     {% if has_dct %}DCT mode available{% endif %}
+#
+# This is great for:
+# - Version numbers (show in footer)
+# - Feature flags (has_dct, auth_enabled)
+# - User info (username, is_admin)
+# - Global config (max sizes, limits)
+#
+# The function runs on EVERY request, so keep it fast.
+# Don't do expensive database queries here.


@app.context_processor