feat(skills): add product-expert roadmap-audit + spec-review strategist

A standalone, self-triggering skill that acts as Relicario's product
strategist: audits the roadmap and reviews freshly-brainstormed release
specs for product/market fit, emitting PM-ready relay directive blocks.
Advisory only — the user stays the decision-maker.

- Two modes: roadmap audit (default) and spec review (verdict:
  PROCEED / RESCOPE / CUT / PIVOT).
- Four-lens engine run as parallel subagents: ground-truth (verify
  claims vs code/git, distinguishing an in-flight lift from real drift),
  jobs-to-be-done, market/competitive, and strategy synthesis.
- Fast by default; `deep` adds live competitive web research.
- Durable by design: lenses read living docs (README/ROADMAP/STATUS/
  CHANGELOG/specs) at runtime, so new surfaces/segments/features are
  picked up automatically. The one static asset, competitive-landscape.md,
  carries a last-reviewed date + freshness protocol.
- Wires a post-brainstorm product gate into CLAUDE.md's Planning section.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01VQbgrP6KQW5pibjbPEoTSs
adlee-was-taken
2026-06-20 22:30:48 -04:00
parent 59ebc28e7e
commit c3044ed5af
5 changed files with 540 additions and 0 deletions

@@ -0,0 +1,114 @@
# Competitive landscape — password managers
> **last-reviewed: 2026-06-20.** This file is the only static, rot-prone asset in
> the skill (the four lenses otherwise read living docs at runtime). The market
> moves: competitors ship features, get breached, change pricing, appear, and
> die. Treat every claim below as "true as of last-reviewed, verify if it
> matters."
**Freshness protocol:**
- If `last-reviewed` is **more than ~6 months** before today, treat this file as
suspect: prefer running the market lens in **deep** mode (live web research)
over trusting the snapshot, and at the end of the run *offer to refresh this
file* (re-research the competitors, rewrite the entries, bump `last-reviewed`).
- Any time a **deep**-mode run surfaces something this file gets wrong or misses
(a new competitor, a shipped feature, a breach), offer to fold it back in and
bump the date. The cheat-sheet should improve every time it's proven stale.
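The six-month threshold above could be checked mechanically before the market lens runs. The following is a minimal sketch, assuming the date is read from the `last-reviewed: YYYY-MM-DD` marker at the top of this file. The names `last_reviewed` and `is_stale` and the 183-day cutoff (one reading of "~6 months") are illustrative and not part of the skill.

```python
import re
from datetime import date

# Hypothetical cutoff: "~6 months" read as 183 days. The skill fixes no exact number.
STALE_AFTER_DAYS = 183


def last_reviewed(text: str) -> date:
    """Extract the date from a 'last-reviewed: YYYY-MM-DD' marker."""
    match = re.search(r"last-reviewed:\s*(\d{4})-(\d{2})-(\d{2})", text)
    if match is None:
        raise ValueError("no last-reviewed marker found")
    year, month, day = (int(part) for part in match.groups())
    return date(year, month, day)


def is_stale(text: str, today: date) -> bool:
    """True when the snapshot is older than the cutoff, so deep mode is the safer choice."""
    return (today - last_reviewed(text)).days > STALE_AFTER_DAYS
```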
This is a grounding cheat-sheet for the market lens in **fast** mode, so that it
reasons from a real map, not vibes.
The goal isn't to rank these for everyone — it's to locate Relicario's wedge
honestly: where the two-factor / self-host / git-backed / server-sees-ciphertext
thesis genuinely wins for the target user, and where Relicario is simply behind
on table stakes.
---
## The field
### Bitwarden
- Open-source, freemium, cloud-hosted by default; self-host possible (official
server is heavy; **vaultwarden** is the popular lightweight Rust reimpl).
- Single-factor KDF: master password (optionally with 2FA gating *login*, not the
KDF). After a server breach, a vault's resistance to offline guessing rests on
the master password's entropy alone.
- Strong on: ubiquity, mature mobile + browser autofill, painless import/export,
organizations & sharing, low/zero price.
- The default thing a privacy-conscious technical user reaches for. **This is
Relicario's primary reference competitor** — most "why not just use X" pressure
comes from here (specifically self-hosted vaultwarden).
### vaultwarden
- Community Rust server compatible with Bitwarden clients; trivial to self-host
(single container). Inherits Bitwarden's polished clients for free.
- This is the sharpest comparison for Relicario's self-host story: a user who
wants self-hosted secrets already has a turnkey, full-featured option with
mobile apps and autofill. Relicario must justify what it adds *over* this.
### KeePassXC (+ KeePass ecosystem)
- Local-first, file-based (`.kdbx`), no server at all; sync is BYO (Dropbox,
Syncthing, git, etc.). Open-source, free.
- Single-factor by default but supports key files / hardware keys as a second
factor — conceptually the closest mainstream analog to Relicario's "something
you have" image secret (a key file is the unglamorous version of the stego
image).
- Strong on: zero-trust-server (there is no server), longevity, plugin ecosystem.
- Weak on: clunky cross-device sync, dated UX, mobile is third-party.
- The other user Relicario competes for: the "I don't trust any cloud" crowd.
### 1Password
- Commercial, polished, cloud-only (no self-host). **Two-factor KDF**: master
password + a 128-bit Secret Key — the mainstream product whose security model
is closest in spirit to Relicario's (two factors into the key derivation).
- Strong on: best-in-class UX, mobile, autofill, family/team sharing, support.
- Relevant because it proves the two-factor-KDF idea is marketable — but it does
it with a boring random Secret Key, not steganography, and gives up self-host.
### Proton Pass
- Newer, from Proton (Mail/VPN); privacy-positioned, cloud, freemium, open-source
clients. Single-factor KDF; leans on brand trust and the Proton bundle.
- Relevant as the "privacy brand" competitor — it wins on trust + ecosystem, not
on a novel crypto model.
### LastPass (cautionary tale, not a competitor to chase)
- Repeated breaches (notably 2022) where exfiltrated vaults were only as strong
as users' master passwords — the canonical argument *for* a second KDF factor.
- Useful in positioning: Relicario's README already uses LastPass as the "~40–60
bits, single factor" baseline. The market lesson is real and on Relicario's
side, but invoking it is marketing, not differentiation.
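The case for a second KDF factor comes down to simple arithmetic: when independent secrets must all be guessed to derive the key, their entropies add. The sketch below assumes the factors are independent and that each one is required for key derivation. The 50-bit passphrase figure is an illustrative value, not a measurement, and KDF stretching cost is ignored.

```python
def combined_entropy_bits(*factor_bits: float) -> float:
    """Entropy of independent secrets that must all be guessed together: bits add."""
    return sum(factor_bits)


# Illustrative numbers only (assumptions, not measurements).
PASSPHRASE_BITS = 50     # a hypothetical human-chosen passphrase
SECRET_KEY_BITS = 128    # 1Password-style Secret Key, per the entry above
IMAGE_SECRET_BITS = 256  # the hidden secret Relicario carries in its reference image

single_factor = combined_entropy_bits(PASSPHRASE_BITS)
with_secret_key = combined_entropy_bits(PASSPHRASE_BITS, SECRET_KEY_BITS)
# Upper bound: real strength is still capped by the size of the derived cipher key.
with_image_secret = combined_entropy_bits(PASSPHRASE_BITS, IMAGE_SECRET_BITS)

# Extra offline guessing work versus the single-factor case.
extra_work_factor = 2 ** (with_secret_key - single_factor)
```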
---
## Where Relicario can win (the honest version)
- **Server-sees-only-ciphertext + no metadata**, in contrast to a self-host backend
that still stores structured data. This is a genuine, explainable edge over
vaultwarden for the threat-model-literate user.
- **Two factors into the KDF** (not just 2FA on login) — only 1Password really
matches this, and it isn't self-hostable. That intersection (two-factor KDF +
self-host) is close to empty. That's the wedge.
- **Git as audit log** — "when was this rotated?" answered by `git log` and field
history. Niche, but unique and real for the audit-conscious user.
## Where Relicario is behind (table stakes to be honest about)
- **Mobile.** Bitwarden/1Password/Proton all have first-class mobile apps with
autofill. Relicario is CLI + browser extension; the Rust core compiles to ARM
but there's no shipped mobile client. For most users this alone is
disqualifying — weigh it heavily.
- **Autofill quality & breadth.** Browser-extension autofill maturity is a moat
the incumbents have spent years on.
- **Frictionless import** from the incumbents (Bitwarden, 1Password) — LastPass
CSV exists; the others are on the roadmap. Import friction is a real adoption
tax.
- **Sharing / multi-user polish.** The org-vault track is new; incumbents have
mature org/family sharing.
## The uncomfortable question to keep asking
For a user who wants self-hosted secrets, **vaultwarden already exists and is
turnkey with great clients.** Every Relicario feature should be weighed against:
"does this widen the gap on the thesis (two-factor KDF, no-metadata, git audit),
or is it just trying to catch up to vaultwarden on table stakes I'll never win?"
The strategy lens should treat *catching up to vaultwarden's client polish* and
*deepening the unique thesis* as different bets with very different ROI.

@@ -0,0 +1,155 @@
# The four lenses — dispatch prompts
Spawn these as parallel subagents (the `Agent` tool). Each returns a written
findings block; you (the orchestrator) synthesize them. Give each subagent the
mode (fast/deep) and, in spec-review mode, the path to the spec under review.
Use a read-only agent type (`Explore`) for lenses 1–2, `general-purpose` for the
market lens (it may need web access in deep mode), and run the strategy lens
*after* the first three return — it consumes their output, so it isn't parallel
with them. Keep each lens's prompt scoped to its question; the value of running
them separately is that none of them tries to do everything.
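The dispatch order described above (three lenses in parallel, then strategy on their combined output) can be sketched as follows. This is an illustration only: `run_lens` is a hypothetical stand-in for spawning a subagent through the `Agent` tool, the lens labels and returned strings are placeholders, and the agent type used for the strategy lens is an assumption, since the guidance above does not name one.

```python
from concurrent.futures import ThreadPoolExecutor

# Agent types from the guidance above; the dict keys are illustrative labels.
PARALLEL_LENSES = {
    "ground-truth": "Explore",
    "jobs-to-be-done": "Explore",
    "market": "general-purpose",
}


def run_lens(name: str, agent_type: str, mode: str, context: str = "") -> str:
    """Hypothetical stand-in for one subagent call; returns a findings block."""
    return f"[{name} via {agent_type}, {mode}] findings ({len(context)} chars of input)"


def run_review(mode: str = "fast") -> dict:
    # Lenses 1-3 are independent, so they can run concurrently.
    with ThreadPoolExecutor(max_workers=len(PARALLEL_LENSES)) as pool:
        futures = {
            name: pool.submit(run_lens, name, agent_type, mode)
            for name, agent_type in PARALLEL_LENSES.items()
        }
        findings = {name: future.result() for name, future in futures.items()}
    # The strategy lens consumes the other three, so it runs only after they return.
    combined = "\n\n".join(findings[name] for name in PARALLEL_LENSES)
    # Assumption: "general-purpose" for strategy; the source does not specify a type.
    findings["strategy"] = run_lens("strategy", "general-purpose", mode, combined)
    return findings
```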
---
## Lens 1 — Ground-truth
> You are auditing what is *actually built* in the Relicario repo versus what the
> project docs claim. This is a reality check, not a code review — do not hunt
> for bugs.
>
> Do this:
> 1. Read ROADMAP.md, STATUS.md, CHANGELOG.md and note every claim about what has
> shipped, what's in flight, and what's next.
> 2. Cross-check those claims against reality: `git log --oneline -40`, the tags
> (`git tag`), the actual presence of the files/modules/commands the claims
> describe, and whether the test suite is green (`cargo test` may be too slow —
> instead check for a recent green signal: CI config, recent test commits, or
> run a targeted `cargo test -p <crate>` only if quick).
> 3. Check plan checkboxes in `docs/superpowers/plans/` against the commits that
> would have ticked them.
>
> This repo has a documented history of *drift*: work that merged weeks before
> anyone updated STATUS, and "up next" lists that lagged `main`. Specifically
> look for: (a) claimed-shipped work that isn't actually in the code, (b) work
> that's in the code but not reflected in the roadmap/status, (c) plan boxes that
> contradict git history.
>
> CRITICAL distinction — drift vs. in-flight lift. Docs lagging the code is NOT
> automatically "drift to fix." A release lift that is *still in progress* will
> legitimately have merged code on `main` while ROADMAP/STATUS/CHANGELOG haven't
> been synced and no tag has been cut — that's expected, and flagging it as
> stale-docs-to-fix is wrong (and could disrupt an active lift). Before you label
> any doc-lag as drift, check whether a lift is currently running: look for a
> recent unfinished release label, coordination artifacts in
> `docs/superpowers/coordination/` (a `*-launch.sh`, dev/PM prompt files with
> current dates), in-progress plan checkboxes, or feature branches still open for the
> current release (`git branch -a`). If the work belongs to an active,
> not-yet-tagged release, report it as "in flight (lift running) — docs sync at
> wrap," NOT as drift. Reserve "drift" for docs that contradict *finished/tagged*
> reality.
>
> Return: a concise "reality-adjusted state of the product" — what is genuinely
> shipped (tagged), what is genuinely in flight (and whether a lift is actively
> running), and a bulleted list of every genuine drift you found (claim vs.
> finished reality, with the commit or file that proves it). Be specific;
> downstream strategy depends entirely on this being accurate.
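The drift versus in-flight distinction in the prompt above amounts to a small decision rule. Here is a sketch under stated assumptions: the `DocLag` fields and the `classify` function are hypothetical names, and gathering the signals (from `git tag`, `git branch -a`, and the coordination directory) is left out.

```python
from dataclasses import dataclass


@dataclass
class DocLag:
    """One place where docs and code disagree (hypothetical structure)."""
    claim: str
    release_tagged: bool  # the work belongs to a finished, tagged release
    lift_running: bool    # coordination artifacts, open branches, or in-progress boxes exist


def classify(lag: DocLag) -> str:
    # Docs contradicting finished/tagged reality is genuine drift.
    if lag.release_tagged:
        return "drift"
    # Untagged work with an active lift is expected to lag; docs sync at wrap.
    if lag.lift_running:
        return "in flight (lift running)"
    # Untagged with no sign of an active lift is not covered explicitly by the
    # prompt, so this sketch flags it for a human rather than guessing.
    return "needs review"
```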
---
## Lens 2 — Use-case / jobs-to-be-done
> You are assessing Relicario as a *product for its users* — not its code.
>
> Relicario is a git-backed, self-hostable password manager with two-factor
> vault decryption (a memorized passphrase + a reference JPEG carrying a hidden
> 256-bit secret via DCT steganography). The server only ever sees ciphertext.
> Read README.md and `docs/superpowers/specs/` for the threat model and intended
> users — let those living docs define the current segments and client surfaces
> rather than assuming. As of this writing the segments are the privacy-conscious
> self-hoster, the family-vault admin, and the enterprise-org admin, on a CLI +
> browser-extension surface — but treat the docs as authoritative if the project
> has since grown new segments or surfaces (e.g. mobile, Safari).
>
> Answer:
> 1. Who is this really for, and what jobs do they hire it for? Map the major
> features to the jobs they serve.
> 2. Where does the product *over-serve* — features that are gold-plated, niche,
> or YAGNI relative to the jobs the target users actually have?
> 3. Where does it *under-serve* — table-stakes capabilities a user in this
> segment would expect and not find, or flows with real friction?
> 4. CLI/extension parity is an explicit design value in this project. Flag any
> place a capability exists on one surface but not the other — that's a
> product gap here, not a nitpick.
>
> Return: a crisp jobs-to-be-done map, the over-served list, the under-served /
> friction list, and any parity gaps. Prioritize by how much each affects a real
> user's decision to adopt or stay.
---
## Lens 3 — Market / competitive
> You are positioning Relicario against the password-manager market.
>
> Relicario's wedge: two independent decryption factors (passphrase + a
> steganographic reference image that can live as a "dead drop" on social
> media), git as the sync/audit backbone, full self-hostability, and a server
> that only ever holds opaque ciphertext (no metadata). It's GPL and open-source;
> check README.md / ROADMAP.md for the current release stage, client surfaces,
> and which tracks (e.g. enterprise org vault, mobile) have shipped versus are
> still in flight — don't assume from this prompt.
>
> Read `references/competitive-landscape.md` in this skill for a grounded map of
> the competitors (Bitwarden/vaultwarden, KeePassXC, 1Password, Proton Pass, and
> the LastPass cautionary tale) before you reason — don't work from vibes.
>
> {FAST MODE}: reason from that cheat-sheet plus your own knowledge.
> {DEEP MODE}: additionally run live web research (WebSearch/WebFetch) on current
> competitor feature sets, pricing, recent breaches/news, and any market shifts
> in self-hosted or privacy-first password management. Cite what you find.
>
> Answer:
> 1. Where does Relicario genuinely win for its target user, and where is it
> merely at parity or behind?
> 2. Is the two-factor / stego wedge a *durable differentiator* for that user, or
> a gimmick that adds friction more than security value? Argue it honestly.
> 3. What is table stakes in this market that Relicario lacks (e.g. mobile
> clients, autofill quality, painless import, secure sharing)?
> 4. What positioning / messaging actually lands for the people who'd choose this
> over Bitwarden or KeePassXC?
>
> Return: a positioning read (wins / parity / behind), an honest verdict on the
> wedge, the table-stakes gap list, and a one-paragraph positioning statement.
> State whether you ran fast or deep.
---
## Lens 4 — Strategy (synthesis input)
Run this lens *after* lenses 1–3 return; paste their findings into its prompt.
> You are the strategy synthesis for a Relicario product review. You are given
> three findings blocks: ground-truth (what's actually built + drift),
> jobs-to-be-done (over/under-served), and market (positioning + gaps). Below
> them is the current ROADMAP.md "up next" list.
>
> Produce a prioritized set of roadmap moves. Tag each move with exactly one of:
> - **ADD** — new work that should be on the roadmap and isn't.
> - **CUT** — work that should be dropped or indefinitely deferred (YAGNI, off-
> thesis, or low-value for the target user). Cutting is a first-class call.
> - **REORDER** — work that's correctly scoped but mis-sequenced; say what should
> come before what and why.
> - **PIVOT** — a larger directional change (segment, positioning, or thesis).
> Use sparingly and argue it hard.
>
> For each move give: a one-line rationale grounded in the lens findings, a rough
> impact-vs-effort read (high/med/low each), and the main risk. Order the whole
> list by leverage — the single highest-impact move first.
>
> Be opinionated and specific. "Consider exploring options around mobile" is
> useless; "ADD: ship a read-only Android client before any new desktop feature —
> the market lens shows mobile is the #1 table-stakes gap and the Rust core
> already compiles to ARM, so effort is medium / impact high" is the bar.
>
> Return: the tagged, leverage-ordered move list, nothing else.
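The tagging and ordering rules in the prompt above can be pictured as a small data structure. This is a sketch only: the `Move` class, the numeric mapping of high/med/low, and the impact-minus-effort score are assumptions made for illustration. The prompt itself leaves "leverage" to the strategist's judgment.

```python
from dataclasses import dataclass
from enum import Enum


class Tag(Enum):
    ADD = "ADD"
    CUT = "CUT"
    REORDER = "REORDER"
    PIVOT = "PIVOT"


# Hypothetical numeric scale for the high/med/low reads.
LEVEL = {"low": 1, "med": 2, "high": 3}


@dataclass
class Move:
    tag: Tag
    summary: str
    impact: str  # "high" | "med" | "low"
    effort: str  # "high" | "med" | "low"
    risk: str

    def leverage(self) -> int:
        # Assumed proxy: more impact and less effort means more leverage.
        return LEVEL[self.impact] - LEVEL[self.effort]


def order_by_leverage(moves: list[Move]) -> list[Move]:
    """Highest-leverage move first; ties keep their original order."""
    return sorted(moves, key=lambda move: move.leverage(), reverse=True)
```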

@@ -0,0 +1,94 @@
# Output templates
Use these verbatim in structure. Fill the brackets. Keep prose tight — the user
reads this to make a decision, not to admire it. Lead with the highest-leverage
point in every section.
---
## Roadmap-audit output
```markdown
# Product Audit — Relicario — [YYYY-MM-DD] · [fast | deep]
## Reality check
[One paragraph: where the product *actually* stands, reconciled against code +
git, not the docs' self-description.]
**Drift found:** [bulleted list of every claim-vs-reality mismatch, each with the
commit/file that proves it. Write "none — docs match reality" if clean.]
## Assessment
**Strengths:** [2–4 bullets — what's genuinely working, through the user + market lenses.]
**Gaps:** [2–4 bullets — table-stakes misses, friction, parity gaps.]
**Risks:** [1–3 bullets — what could undermine the product or the thesis.]
## Recommendations
[Leverage-ordered. Highest-impact first. Each line:]
- **[ADD|CUT|REORDER|PIVOT]** — [the move]. *Why:* [one line]. *Impact/Effort:* [H/M/L · H/M/L]. *Risk:* [one line].
## PM brief
[The paste-ready block — see "PM brief block" below.]
```
---
## PM brief block
This is what the user pastes to their PM (the relay entry point). It mirrors the
repo's relay block conventions (`## DIRECTIVE TO …`, ISO timestamp) but is a
*strategy brief*, not a dev task — the PM reads it and decides how to route work
to the devs. Keep it self-contained: the PM may act on it without the full audit.
```markdown
## PRODUCT DIRECTIVE TO PM
Time: [ISO-8601 local timestamp]
Source: /product-expert roadmap audit ([fast|deep])
Reality note: [one line — any drift the PM must know before acting, e.g. "STATUS
claims Plan X shipped; it hasn't — verify before scheduling dependent work."]
Roadmap changes (in priority order):
1. [ADD|CUT|REORDER|PIVOT] [the move] — [one-line why].
2.
Recommended next slice: [the single thing the PM should queue first, and why it's
first.]
Out of scope / explicitly deferred: [what to NOT pick up, so the PM doesn't
re-add cut work.]
```
The user edits this before relaying. Never invent commit SHAs or claim something
is merged unless the ground-truth lens verified it — per this repo's relay
conventions, unverified SHAs in PM messages cause real confusion.
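The rule about unverified SHAs could be checked before a brief is relayed. The following is a minimal sketch, assuming the ground-truth lens hands back a set of verified commit SHAs. The function names and the regex for spotting SHA-like tokens (7 to 40 lowercase hex characters, which can also match ordinary hex-looking words or long numbers) are illustrative heuristics, not part of the skill.

```python
import re
from datetime import datetime

# Heuristic: a standalone run of 7-40 lowercase hex characters looks like a SHA.
SHA_PATTERN = re.compile(r"\b[0-9a-f]{7,40}\b")


def unverified_shas(brief: str, verified: set[str]) -> list[str]:
    """Return SHA-like tokens in the brief that are not a prefix of any verified SHA."""
    found = SHA_PATTERN.findall(brief)
    return [sha for sha in found if not any(full.startswith(sha) for full in verified)]


def local_timestamp() -> str:
    """ISO-8601 timestamp with the local UTC offset, to the second."""
    return datetime.now().astimezone().isoformat(timespec="seconds")
```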
---
## Spec-review output
```markdown
# Spec Review — [spec filename] — [YYYY-MM-DD] · [fast | deep]
## Verdict: [PROCEED | RESCOPE | CUT | PIVOT]
[One sentence stating the call plainly.]
## Rationale
- **User job:** [does it serve a real job for a real segment? which?]
- **Positioning fit:** [does it strengthen or dilute the wedge?]
- **Opportunity cost:** [what does building this displace on the roadmap? is that
trade worth it?]
- **Scope:** [right-sized, gold-plated, or under-built? cite the YAGNI risks.]
## [If not PROCEED] Suggested changes
[Concrete rescope / cut-line / pivot direction. Be specific enough that the next
step is obvious.]
## Next step
[If PROCEED: "Spec holds up — proceed to writing-plans." Otherwise: "Revise the
spec per above, then re-review or proceed."]
```
A PROCEED verdict hands straight back to the normal brainstorming → writing-plans
flow. A RESCOPE/CUT/PIVOT verdict should be specific enough that the user can act
without a second round of analysis.