docs: fix zig-zag position numbering and luminance rationale in imgsecret

Corrected zig-zag scan positions from 4-15 to 6-17 (verified against
the standard JPEG zig-zag ordering). Fixed the inverted human visual
system (HVS) rationale for luminance embedding: the Y channel is used
because JPEG does not spatially subsample it, not because of visual
sensitivity.
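
For reference, the 6-17 numbering can be re-checked with a small
standalone sketch. It is not part of imgsecret; `zigzag_order` and
`scan_position` are hypothetical helper names, and the walk assumes the
standard JPEG zig-zag convention (anti-diagonals traversed in
alternating directions, starting at the DC coefficient).

```rust
/// Build the zig-zag scan order for an n x n block (n >= 1) as
/// (row, col) pairs; the index into the returned Vec is the scan position.
fn zigzag_order(n: usize) -> Vec<(usize, usize)> {
    let mut order = Vec::with_capacity(n * n);
    // Each anti-diagonal holds the cells with row + col == s.
    for s in 0..(2 * n - 1) {
        let lo = s.saturating_sub(n - 1); // smallest in-bounds row on this diagonal
        let hi = s.min(n - 1); // largest in-bounds row on this diagonal
        if s % 2 == 0 {
            // Even diagonals are walked upward: row decreases, col increases.
            for row in (lo..=hi).rev() {
                order.push((row, s - row));
            }
        } else {
            // Odd diagonals are walked downward: row increases, col decreases.
            for row in lo..=hi {
                order.push((row, s - row));
            }
        }
    }
    order
}

/// Look up the scan position of a (row, col) cell, if it is in the block.
fn scan_position(order: &[(usize, usize)], cell: (usize, usize)) -> Option<usize> {
    order.iter().position(|&c| c == cell)
}
```

With `zigzag_order(8)`, the twelve cells in EMBED_POSITIONS map to scan
positions 6 through 17 as a set. The array itself is not listed in scan
order: (0, 4) is position 14 and (4, 0) is position 10, so the trailing
per-row comments describe the groups only loosely.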

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
adlee-was-taken
2026-04-12 09:23:16 -04:00
parent 847051216d
commit c7aab28484


@@ -83,7 +83,7 @@ const BLOCKS_PER_COPY: usize = (SECRET_BITS + BITS_PER_BLOCK - 1) / BITS_PER_BLO
/// Mid-frequency DCT coefficient positions for embedding, specified as
/// (row, col) indices into the 8x8 DCT coefficient matrix.
///
-/// These correspond to zig-zag scan positions 4 through 15 -- the "sweet spot"
+/// These correspond to zig-zag scan positions 6 through 17 -- the "sweet spot"
/// between low-frequency coefficients (which carry visible image structure and
/// are heavily quantized by JPEG) and high-frequency coefficients (which carry
/// noise/detail and are aggressively zeroed by JPEG compression).
@@ -93,23 +93,23 @@ const BLOCKS_PER_COPY: usize = (SECRET_BITS + BITS_PER_BLOCK - 1) / BITS_PER_BLO
///
/// The zig-zag ordering is the standard JPEG scan order:
/// ```text
-/// Zig-zag positions 4-7: (0,3) (1,2) (2,1) (3,0)
-/// Zig-zag positions 8-11: (0,4) (1,3) (2,2) (3,1)
-/// Zig-zag positions 12-15: (4,0) (0,5) (1,4) (2,3)
+/// Zig-zag positions 6-9: (0,3) (1,2) (2,1) (3,0)
+/// Zig-zag positions 10-13: (4,0) (3,1) (2,2) (1,3)
+/// Zig-zag positions 14-17: (0,4) (0,5) (1,4) (2,3)
/// ```
const EMBED_POSITIONS: [(usize, usize); 12] = [
(0, 3),
(1, 2),
(2, 1),
-(3, 0), // zig-zag 4-7
+(3, 0), // zig-zag 6-9
(0, 4),
(1, 3),
(2, 2),
-(3, 1), // zig-zag 8-11
+(3, 1), // zig-zag 10-13
(4, 0),
(0, 5),
(1, 4),
-(2, 3), // zig-zag 12-15
+(2, 3), // zig-zag 14-17
];
// ─── YChannel ────────────────────────────────────────────────────────────────
@@ -117,10 +117,10 @@ const EMBED_POSITIONS: [(usize, usize); 12] = [
/// The luminance (Y) channel of an image, stored as a flat array of f64 values.
///
/// We embed exclusively in the luminance channel because:
-/// - Human vision is more sensitive to luminance than chrominance, so the
-/// luminance channel has more "room" for watermarking in the frequency domain.
-/// - JPEG compresses chrominance channels more aggressively (typically 4:2:0
-/// subsampling), which would destroy embedded data.
+/// - Luminance is not spatially subsampled by JPEG (unlike chrominance which
+/// is typically 4:2:0), so the full DCT block grid is available for embedding.
+/// - JPEG's chrominance subsampling would destroy embedded data by halving
+/// the spatial resolution before DCT, misaligning our block positions.
/// - Working with a single channel keeps the DCT operations simple and fast.
struct YChannel {
/// Row-major luminance values. `data[y * width + x]` gives the luminance