The bad cascade
Each phase compounds the previous one's drift. By the time a human reviews the PR, the original intent is unrecoverable.
Waterfall orchestrates eight specialized agents through a deterministic V-cycle — requirements, specs, design, implementation, review, validation, closure — so Claude Code stops freelancing and starts shipping artefacts you can actually audit.
› /plugin install https://github.com/mgallet92i/waterfall
One vague prompt, one missed assumption, one improvised refactor — and the agent ships a tower of code that compiles, looks plausible, and is wrong end-to-end. Reviewing it after the fact is harder than writing it from scratch.
Past a threshold, the agent loses the thread but keeps writing. Waterfall enforces small, scoped phases so every artefact is produced in fresh context.
The fix isn't a smarter model. It's a process that refuses to let one model do everything, and that produces written artefacts a human can sign off on at every gate.
Specialization is the point. A single super-prompt is a single point of failure; a team produces traceable artefacts with named owners.
OR: Drives the state machine. Locks scope, sequences phases, calls the next agent. Never writes product content.
PM: Interviews HO, writes PRD.md. Owns the REQUIREMENTS phase end-to-end. Also drives CLOSURE (retro, PR).
PO: Reads the PRD.md authored by PM. Writes functional specs and acceptance criteria. Starts at FUNCTIONAL_SPECS.
TL: Owns the technical design. Picks the architecture, documents trade-offs, decides hosting and stack.
RV: Independent gate at every phase boundary. Reads, challenges, blocks. Never writes the artefact under review.
DV: Writes the code. Only the code. Against frozen specs. Never re-opens a closed phase.
QA: Replays acceptance criteria against the build. Cross-browser, cross-flow. Reports verifiable failures.
DS: Visual and interaction design when the deliverable has a UI. Pairs with PO on flows, TL on feasibility.
HO: You. The only non-agent role. Sets scope, signs every gate, and is accountable for what ships. Waterfall is built around your sign-off, not around removing it.
Read left to right. Every arrow is a written artefact. HO checkpoints sit below — each one gates the transition above it.
Verification on the way down (what we're going to do, written). Validation on the way up (what we did, tested against what we wrote). Coding sits at the bottom — last, not first.
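As a rough illustration of the gated sequence described above, here is a minimal Python sketch. It uses only the phases named on this page (the plugin's real state machine may define more), and `Cycle`, `approve`, and `advance` are hypothetical names chosen for the example, not the plugin's API.

```python
from dataclasses import dataclass, field

# Phases named on this page, in order. Illustrative only: the plugin's
# actual state machine may define more phases or different names.
PHASES = [
    "requirements",
    "specs",
    "design",
    "implementation",
    "review",
    "validation",
    "closure",
]


@dataclass
class Cycle:
    """Hypothetical model of one gated cycle (not the plugin's code)."""

    index: int = 0
    signed_off: set = field(default_factory=set)

    @property
    def phase(self) -> str:
        return PHASES[self.index]

    def approve(self, artefact: str) -> None:
        """Record HO sign-off for the current phase's written artefact."""
        if not artefact.strip():
            raise ValueError("a gate needs a written artefact to sign")
        self.signed_off.add(self.phase)

    def advance(self) -> str:
        """Move to the next phase, but only if the current gate is signed."""
        if self.phase not in self.signed_off:
            raise PermissionError(f"gate for '{self.phase}' is not signed off")
        if self.index == len(PHASES) - 1:
            raise RuntimeError("cycle already closed")
        self.index += 1
        return self.phase


cycle = Cycle()
cycle.approve("PRD.md")
print(cycle.advance())  # specs
```

The point of the sketch is the ordering constraint: no transition happens without a signed artefact, so skipping a gate fails loudly instead of drifting silently.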
Every acronym you'll see across the cycle. Each one is a file the next agent reads.
| Code | Artefact | What it is | Owner |
|---|---|---|---|
| EX | Experience expectation | A user-facing expectation in specs.md, prioritized MUST / SHOULD / MAY. The unit of intent every downstream artefact traces back to. | PO |
| INV | Invariant | A non-negotiable property the system must always preserve, in specs.md. Cross-cutting; doesn't belong to any single EX. | PO |
| TF | Functional test | Testable GIVEN / WHEN / THEN scenario in acceptance.md. The contract QA replays at validation. | PO |
| T | Task | Sized unit of work in tasks.md, traced to one or more EX. The DV's worklist. | TL |
| B | Blocker | Review finding in review.md that prevents convergence. Must be resolved before the artefact is signed off. | RV |
| Q | Question | Review-time clarification in review.md, routed to PM, PO, or TL. Non-blocking but must be answered. | RV |
| ADR | Architecture decision | An architectural decision recorded in design.md with context, options considered, and rationale. | TL |
| DEC | Decision | A non-architectural decision logged in tracking.md or retro.md: scope, trade-off, or process choice. | any |
| OBS | Observation | An in-vivo observation about the workflow itself, logged in tracking.md or or.log. Aggregated into retro.md at closure. | any |
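The tracing rule in the table (each T points at one or more EX) lends itself to a mechanical check. The sketch below assumes IDs of the form `EX-1` and `T-1` and plain Python collections as input; both are illustrative assumptions, since this page does not specify the ID format or how specs.md and tasks.md are parsed. `check_traceability` is a hypothetical helper, not part of the plugin.

```python
def check_traceability(ex_ids, tasks):
    """Return (orphan_tasks, uncovered_ex).

    ex_ids: iterable of EX identifiers found in specs.md (format assumed).
    tasks:  dict mapping a task ID to the list of EX IDs it claims to trace to.
    """
    known = set(ex_ids)
    covered = set()
    orphans = []
    for task_id, traced in tasks.items():
        valid = [ex for ex in traced if ex in known]
        if not valid:
            # Task traces to no known EX, so its intent cannot be audited.
            orphans.append(task_id)
        covered.update(valid)
    uncovered = sorted(known - covered)
    return sorted(orphans), uncovered


orphans, uncovered = check_traceability(
    ["EX-1", "EX-2", "EX-3"],
    {"T-1": ["EX-1"], "T-2": ["EX-1", "EX-2"], "T-3": []},
)
print(orphans)    # ['T-3']
print(uncovered)  # ['EX-3']
```

An orphan task is work with no stated intent behind it; an uncovered EX is intent with no work planned. Either is the kind of finding a reviewer might raise as a B or a Q.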
/waterfall.
Waterfall ships as a Claude Code plugin. Marketplace-ready, no build step on your side.
One slash command — Waterfall installs directly from this repository. Marketplace listing coming soon.
› /plugin install https://github.com/mgallet92i/waterfall
Copy .wf-config.example.json to .wf-config.json and tune models per role, review-loop budgets, watchdog interval, and Dark Factory. Defaults apply if the file is absent. Schema lives in .wf-config.example.md.
$ cp .wf-config.example.json .wf-config.json
# edit models, review_loops, dark_factory…
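The "defaults apply if the file is absent" behaviour can be pictured as a shallow merge. This is a minimal sketch, not the plugin's loader: the key names come from this page, but every default value except `agent_mode` and the `load_config` helper itself are hypothetical. The authoritative schema is the one described in .wf-config.example.md.

```python
import json
from pathlib import Path

# Key names are taken from this page. Apart from agent_mode, the values
# are placeholders invented for illustration, not the plugin's defaults.
ASSUMED_DEFAULTS = {
    "agent_mode": "subagent",
    "review_loops": 3,
    "dark_factory": False,
}


def load_config(path=".wf-config.json"):
    """Return assumed defaults overlaid with any overrides found at `path`."""
    config = dict(ASSUMED_DEFAULTS)
    file = Path(path)
    if file.is_file():
        # Shallow merge: top-level keys in the file replace the defaults.
        config.update(json.loads(file.read_text(encoding="utf-8")))
    return config
```

A shallow merge is the simplest choice; if the real config nests settings per role, a deeper merge would be needed.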
OR walks you through Bootstrap → Requirements. Each gate pings you for approval.
› /waterfall:new add-google-workspace-sso
# OR creates ticket WF-014, hands off to PM…
We're transparent about fit. Some changes are too small to deserve a V-cycle. Some are too big to ship without one.
vs. one-shot prompting. You write specs before code; that's the whole pitch.
Multiple agents read each artefact. Context is fresh per phase, not bloated per session.
If your change is a typo or a one-line fix, skip Waterfall. Use it where review matters.
The default agent_mode, subagent (Agent tool, no inter-agent SendMessage), runs slightly cheaper than team mode and is more deterministic: PM stays in charge of every spawn, and there are no idle teammates to re-poke. Switch to team only if agents need to talk to each other directly.
Five reasons teams adopt Waterfall — and why each one is a deliberate stance, not a feature list.
A 10-phase state machine. Phases are named, ordered, and every transition emits a written artefact. No cycle is shorter or longer than the work demands.
OR sequences. PM, PO, TL, DS work in parallel where they can. RV is independent. DV doesn't start until specs are frozen — so it never has to redo work.
Every line of code maps back through technical design, functional specs, and an EX story. Reviewing a diff means re-reading three files, not guessing intent.
Maximum autonomy. Agents run end-to-end. HO only reviews at named checkpoints — typically Review (after design) and Closure (after validation). For teams that trust the artefacts and want speed.
HO must review artefacts and code. That sentence is in the plugin's default config, on every gate, and on this homepage on purpose. Waterfall is not built to take humans out of the loop. It's built so the humans in the loop are reading the right thing at the right time.