Claude Code plugin · open source

Multi-agent SDD framework. No more slop. Specs, then code.
Reviewed by humans, drafted by a team of agents.

Waterfall orchestrates eight specialized agents through a deterministic V-cycle — requirements, specs, design, implementation, review, validation, closure — so Claude Code stops freelancing and starts shipping artefacts you can actually audit.

quick install: /plugin install https://github.com/mgallet92i/waterfall
8 specialized agents
10 deterministic phases
0 vibe coding allowed
01 / Problem

LLMs are confident.
Confidence cascades into slop.

One vague prompt, one missed assumption, one improvised refactor — and the agent ships a tower of code that compiles, looks plausible, and is wrong end-to-end. Reviewing it after the fact is harder than writing it from scratch.

cascade failure

The bad cascade

prompt: ✓ understood
plan: ✓ plausible
design: ~ assumed
impl: ✗ drift
tests: ✗ rewritten to pass
PR: ✗ merged anyway

Each phase compounds the previous one's drift. By the time a human reviews the PR, the original intent is unrecoverable.

context window

The dumb zone

[Figure: the context window, from fresh context to compaction. Output goes from sharp, to degraded, to the dumb zone.]

Past a threshold, the agent loses the thread but keeps writing. Waterfall enforces small, scoped phases so every artefact is produced in fresh context.

The fix isn't a smarter model. It's a process that refuses to let one model do everything, and that produces written artefacts a human can sign off on at every gate.

02 / Agents

Eight agents, one human, one ledger.
Each role owns one job and hands off in writing.

Specialization is the point. A single super-prompt is a single point of failure; a team produces traceable artefacts with named owners.

OR

Orchestrator

Drives the state machine. Locks scope, sequences phases, calls the next agent. Never writes product content.

  • in ticket, ledger
  • out phase transitions
PM

Product Manager

Interviews HO, writes PRD.md. Owns the REQUIREMENTS phase end-to-end. Also drives CLOSURE (retro, PR).

  • in HO interview, trigger
  • out PRD.md, retro.md
PO

Product Owner

Reads PRD.md authored by PM. Writes functional specs and acceptance criteria. Starts at FUNCTIONAL_SPECS.

  • in PRD.md
  • out functional specs, AC
TL

Tech Lead

Owns the technical design. Picks the architecture, documents trade-offs, calls hosting and stack.

  • in functional specs
  • out technical design
RV

Reviewer

Independent gate at every phase boundary. Reads, challenges, blocks. Never writes the artefact under review.

  • in any artefact
  • out review verdict
DV

Developer

Writes the code. Only the code. Against frozen specs. Never re-opens a closed phase.

  • in tech design
  • out code, diffs
QA

Quality Assurance

Replays acceptance criteria against the build. Cross-browser, cross-flow. Reports verifiable failures.

  • in AC, build
  • out validation report
DS

Designer

Visual + interaction design when the deliverable has a UI. Pairs with PO on flows, TL on feasibility.

  • in functional specs
  • out mocks, design tokens
HO

Human Operator

You. The only non-agent role. Sets scope, signs every gate, and is accountable for what ships. Waterfall is built around your sign-off — not around removing it.

  • in any artefact, any diff
  • out approval, change request, block
Hand-off lane

Read left to right. Every arrow is a written artefact. HO checkpoints sit below — each one gates the transition above it.

  1. OR: ticket
  2. PM: requirements
  3. PO: specs
  4. TL: design
  5. DV: code
  6. QA: validation
  7. OR: closure
  1. HO sign-off: after the design RV gate · before any code
  2. HO sign-off: after the code review RV gate · before validation
  3. HO sign-off: after validation · before closure
03 / Methodology

A V-cycle, end to end.
Every phase has an artefact. Every artefact has a gate.

Verification on the way down (what we're going to do, written). Validation on the way up (what we did, tested against what we wrote). Coding sits at the bottom — last, not first.

Verification: going down · Validation: coming back up
  1. Bootstrap (OR)
  2. Requirements (PM)
  3. Functional specs (PO)
  4. Technical design (TL)
  5. Review (RV · HO)
  6. Planning (TL)
  7. Implementation (DV)
  8. Code review (RV · HO)
  9. Validation (QA)
  10. Closure (OR · HO)
HO: Human Operator. Signs every gate.
Artefact: every phase emits a written, reviewable file.
Gate: no phase advances without explicit approval.
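
The phase list and the gate rule above can be sketched as a small state machine. This is an illustrative sketch, not the plugin's implementation: the phase names and owners come from the list above, while `Cycle`, `GateError`, `advance`, and the artefact name `ticket.md` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Phase names and primary owners as listed above. HO co-signs some gates;
# that is modelled by the approved_by argument, not by this table.
PHASES: List[Tuple[str, str]] = [
    ("Bootstrap", "OR"),
    ("Requirements", "PM"),
    ("Functional specs", "PO"),
    ("Technical design", "TL"),
    ("Review", "RV"),
    ("Planning", "TL"),
    ("Implementation", "DV"),
    ("Code review", "RV"),
    ("Validation", "QA"),
    ("Closure", "OR"),
]


class GateError(RuntimeError):
    """Raised when a phase tries to advance without meeting its gate."""


@dataclass
class Cycle:
    index: int = 0          # position in PHASES
    closed: bool = False    # True once Closure has been signed off
    ledger: List[Tuple[str, str, str]] = field(default_factory=list)

    @property
    def phase(self) -> str:
        return PHASES[self.index][0]

    def advance(self, artefact: str, approved_by: Optional[str]) -> None:
        """Log the artefact and move on, only if the gate is approved."""
        if self.closed:
            raise GateError("cycle already closed")
        if not artefact:
            raise GateError(f"{self.phase}: no artefact produced")
        if approved_by is None:
            raise GateError(f"{self.phase}: gate not approved")
        self.ledger.append((self.phase, artefact, approved_by))
        if self.index == len(PHASES) - 1:
            self.closed = True
        else:
            self.index += 1


cycle = Cycle()
cycle.advance("ticket.md", approved_by="HO")  # hypothetical artefact name
print(cycle.phase)  # Requirements
```

The point of the sketch is the ordering: the ledger entry is written before the index moves, so every transition leaves a record of what was approved and by whom.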
Artefacts — the paper trail

Every acronym you'll see across the cycle. Each one is a file the next agent reads.

Code | Artefact | What it is | Owner
EX | Experience expectation | A user-facing expectation in specs.md, prioritized MUST / SHOULD / MAY. The unit of intent every downstream artefact traces back to. | PO
INV | Invariant | A non-negotiable property the system must always preserve, in specs.md. Cross-cutting; doesn't belong to any single EX. | PO
TF | Functional test | Testable GIVEN / WHEN / THEN scenario in acceptance.md. The contract QA replays at validation. | PO
T | Task | Sized unit of work in tasks.md, traced to one or more EX. The DV's worklist. | TL
B | Blocker | Review finding in review.md that prevents convergence. Must be resolved before the artefact is signed off. | RV
Q | Question | Review-time clarification in review.md, routed to PM, PO, or TL. Non-blocking but must be answered. | RV
ADR | Architecture decision | An architectural decision recorded in design.md with context, options considered, and rationale. | TL
DEC | Decision | A non-architectural decision logged in tracking.md or retro.md: scope, trade-off, process choice. | any
OBS | Observation | An in-vivo observation about the workflow itself, logged in tracking.md or or.log. Aggregated into retro.md at closure. | any
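
To make the EX-to-T trace concrete, here is a minimal sketch of a traceability check. It assumes IDs shaped like `EX-1` and `T-1`, which the table does not specify, and the `Expectation`, `Task`, `untraced_tasks`, and `uncovered_musts` names are hypothetical. The rule that every MUST needs at least one task is also an assumption added for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Expectation:
    id: str        # e.g. "EX-1" (ID format is an assumption)
    priority: str  # "MUST", "SHOULD" or "MAY"


@dataclass(frozen=True)
class Task:
    id: str                  # e.g. "T-1" (ID format is an assumption)
    traces: Tuple[str, ...]  # EX ids this task implements


def untraced_tasks(expectations: List[Expectation], tasks: List[Task]) -> List[str]:
    """Ids of tasks that cite no EX, or cite an EX that does not exist."""
    known = {ex.id for ex in expectations}
    return [t.id for t in tasks if not t.traces or not set(t.traces) <= known]


def uncovered_musts(expectations: List[Expectation], tasks: List[Task]) -> List[str]:
    """Ids of MUST expectations that no task traces back to."""
    covered = {ex_id for t in tasks for ex_id in t.traces}
    return [ex.id for ex in expectations
            if ex.priority == "MUST" and ex.id not in covered]


specs = [Expectation("EX-1", "MUST"), Expectation("EX-2", "MUST"),
         Expectation("EX-3", "MAY")]
worklist = [Task("T-1", ("EX-1",)), Task("T-2", ()), Task("T-3", ("EX-9",))]

print(untraced_tasks(specs, worklist))   # ['T-2', 'T-3']
print(uncovered_musts(specs, worklist))  # ['EX-2']
```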
05 / Install

Three commands. One plugin.
Then drive Claude Code with /waterfall.

Waterfall ships as a Claude Code plugin. Marketplace-ready, no build step on your side.

  1. Install from GitHub

    One slash command — Waterfall installs directly from this repository. Marketplace listing coming soon.

    claude
     /plugin install https://github.com/mgallet92i/waterfall
  2. Configure HO at the repo root

    Copy .wf-config.example.json to .wf-config.json and tune models per role, review-loop budgets, watchdog interval, and Dark Factory. Defaults apply if the file is absent. Schema lives in .wf-config.example.md.

    shell
    $ cp .wf-config.example.json .wf-config.json
    # edit models, review_loops, dark_factory…
  3. Open a ticket and let the orchestrator run

    OR walks you through Bootstrap → Requirements. Each gate pings you for approval.

    claude
     /waterfall:new add-google-workspace-sso
    # OR creates ticket WF-014, hands off to PM…
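
The "defaults apply if the file is absent" behaviour from step 2 can be sketched as follows. This is not the plugin's loader: `load_config` and `DEFAULTS` are hypothetical, the flat key layout is an assumption, and only the key names `review_loops`, `dark_factory`, and `agent_mode` (default `subagent`) appear on this page. The real schema is the one in .wf-config.example.md.

```python
import json
from pathlib import Path

# Hypothetical defaults. Only agent_mode = "subagent" is stated on this page;
# the other values and the flat layout are assumptions for illustration.
DEFAULTS = {
    "agent_mode": "subagent",
    "dark_factory": False,
    "review_loops": 3,
}


def load_config(repo_root="."):
    """Return DEFAULTS, overridden by .wf-config.json when it exists."""
    path = Path(repo_root) / ".wf-config.json"
    config = dict(DEFAULTS)  # copy, so the defaults are never mutated
    if path.is_file():
        config.update(json.loads(path.read_text(encoding="utf-8")))
    return config
```

With no file at the repo root, `load_config()` returns the defaults unchanged; with a file containing only `{"dark_factory": true}`, that one key is overridden and the rest keep their default values.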
06 / Trade-offs

Waterfall is not free.
Here's exactly what it costs.

We're transparent about fit. Some changes are too small to deserve a V-cycle. Some are too big to ship without one.

  • 3–5×

    process length

    vs. one-shot prompting. You write specs before code; that's the whole pitch.

  • 2–4×

    token consumption

    Multiple agents read each artefact. Context is fresh per phase, not bloated per session.

  • N/A

    tiny edits

    If your change is a typo or a one-line fix, skip Waterfall. Use it where review matters.

  • −15%

    tokens — subagent vs team

    Default agent_mode: subagent (Agent tool, no inter-agent SendMessage) runs slightly cheaper than team mode and is more deterministic — PM stays in charge of every spawn, no idle teammates to repoke. Switch to team only if agents need to chat directly.
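
As a rough worked example of the figures above, the arithmetic can be written out. It assumes the 2–4× multiplier is measured against a one-shot prompt and applies to the default subagent mode, and that "−15%" means subagent mode uses 15% fewer tokens than team mode; the page does not spell out either baseline. The function names are hypothetical.

```python
def waterfall_token_range(one_shot_tokens, low=2.0, high=4.0):
    """Apply the quoted 2-4x multiplier to a one-shot token baseline."""
    return one_shot_tokens * low, one_shot_tokens * high


def team_mode_estimate(subagent_tokens, saving=0.15):
    """If subagent = team * (1 - saving), then team = subagent / (1 - saving)."""
    return subagent_tokens / (1.0 - saving)


low, high = waterfall_token_range(50_000)
print(low, high)                          # 100000.0 200000.0
print(round(team_mode_estimate(85_000)))  # 100000
```

Read it as an order-of-magnitude budget, not a quote: a change that would cost about 50k tokens one-shot lands somewhere between 100k and 200k under these assumptions.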

07 / Why Waterfall

Anti-vibe-coding,
by construction.

Five reasons teams adopt Waterfall — and why each one is a deliberate stance, not a feature list.

01

Structured methodology

A 10-phase state machine. Phases are named, ordered, and every transition emits a written artefact. No cycle is shorter or longer than the work demands.

02

Multi-agent parallelism

OR sequences. PM, PO, TL, DS work in parallel where they can. RV is independent. DV doesn't start until specs are frozen — so it never has to redo work.

03

EX → specs → code traceability

Every line of code maps back through technical design, functional specs, and an EX (experience expectation). Reviewing a diff means re-reading three files, not guessing intent.

04

Dark Factory mode

Maximum autonomy. Agents run end-to-end. HO only reviews at named checkpoints — typically Review (after design) and Closure (after validation). For teams that trust the artefacts and want speed.

Bootstrap: auto
Requirements: auto
Functional specs: auto
Technical design: auto
Review: HO
Implementation: auto
Code review: auto
Validation: auto
Closure: HO
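
The checkpoint list above reads naturally as a lookup table. A minimal sketch, assuming the mapping is exactly the list shown; `DARK_FACTORY_GATES`, `needs_human`, and `human_checkpoints` are hypothetical names, not the plugin's API.

```python
# Phase -> who clears the gate in Dark Factory mode, copied from the list above.
DARK_FACTORY_GATES = {
    "Bootstrap": "auto",
    "Requirements": "auto",
    "Functional specs": "auto",
    "Technical design": "auto",
    "Review": "HO",
    "Implementation": "auto",
    "Code review": "auto",
    "Validation": "auto",
    "Closure": "HO",
}


def needs_human(phase, dark_factory=True):
    """True when HO must sign before this phase's gate clears."""
    if not dark_factory:
        return True  # outside Dark Factory, HO signs every gate
    return DARK_FACTORY_GATES[phase] == "HO"


def human_checkpoints():
    """Phases where Dark Factory still stops for HO, in cycle order."""
    return [phase for phase, who in DARK_FACTORY_GATES.items() if who == "HO"]


print(human_checkpoints())  # ['Review', 'Closure']
```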
05

Human control, by design

HO must review artefacts and code. That sentence is in the plugin's default config, on every gate, and on this homepage on purpose. Waterfall is not built to take humans out of the loop. It's built so the humans in the loop are reading the right thing at the right time.
