Build a 21-Agent Product Pipeline with Claude Skills
TL;DR: Instead of asking Claude to "build the app" in one giant prompt, split the work into 21 specialist agents implemented as sequential Claude skills: discovery, system analysis, product spec, architecture, data, UX flows, UI design, brand, copy, frontend, backend, infrastructure, migrations, integrations, maintainability, performance, accessibility, security, testing, SEO, and launch. Each skill reads the artefacts of earlier stages, writes its own artefact to an outputs/ folder, and ends by writing a pass or fail gate file. The next stage refuses to start until the previous gate passes, which stops half-baked specs from becoming half-baked code. Package the whole pipeline as one plugin, run it stage by stage in Cowork or Claude Code, and rerun any failed stage in isolation. The full folder structure, a stage skill template, and the gate format are below.
A Claude agent pipeline is a set of sequential skills, each acting as one specialist agent, that takes a product from spec to architecture to data model to UX to UI to build to QA to security to SEO to launch, with a quality gate between every stage. You build it as a single Claude plugin containing 21 numbered skills, and this guide shows the exact structure I use to ship real products.
Why split a product build into 21 agents?
Because one prompt cannot hold a whole product, and quality collapses when Claude tries to be analyst, architect, designer, developer, and tester simultaneously. Splitting the build into a Claude agent pipeline forces each stage to produce a reviewable artefact, and the gates between stages catch problems while they are still cheap to fix.
The 21 agents map to how real product teams work:
00 discovery-research     07 brand                  14 maintainability-review
01 system-analyst         08 copywriter             15 performance
02 product-spec           09 frontend-build         16 accessibility
03 architecture           10 backend-build          17 security
04 data-architect         11 infrastructure         18 test-architect
05 ux-flow-architect      12 database-migrations    19 seo-aeo
06 ui-designer            13 integrations           20 launch
Sequential matters more than parallel here. The data model depends on the spec, the UI depends on the UX flows, and security review is meaningless before code exists. Gates encode those dependencies explicitly.
How does the pipeline architecture work?
The pipeline is one plugin, 21 skills, one shared artefact folder, and one gate folder. Every stage follows the same contract: read upstream artefacts, do one job, write one artefact, write one gate verdict.
product-pipeline/
  .claude-plugin/
    plugin.json
  skills/
    00-discovery-research/SKILL.md
    01-system-analyst/SKILL.md
    02-product-spec/SKILL.md
    ...
    20-launch/SKILL.md
  outputs/          # stage artefacts, e.g. 02_spec.md
  gates/            # stage verdicts, e.g. 02_gate.json
  pipeline.yaml     # stage order, dependencies, gate criteria
The pipeline.yaml file is the contract every skill obeys:
pipeline: product-build
stages:
  - id: 02-product-spec
    requires: [01-system-analyst]
    artefact: outputs/02_spec.md
    gate:
      - every feature has user stories with acceptance criteria
      - out-of-scope section is present and non-empty
  - id: 03-architecture
    requires: [02-product-spec]
    artefact: outputs/03_architecture.md
    gate:
      - stack chosen with written rationale
      - every spec feature mapped to a component
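To make the requires field concrete, here is a minimal Python sketch of the dependency check a stage could run before starting. It is an illustration under assumptions, not part of the plugin: the STAGES list is a hand-written mirror of the two YAML entries above (so no YAML parser is needed), the helper names gate_path, gate_passed, and blocked_by are hypothetical, and the verdict field comes from the gate format defined in Step 3.

```python
import json
from pathlib import Path

# Hypothetical in-memory mirror of the two pipeline.yaml entries shown above.
STAGES = [
    {"id": "02-product-spec", "requires": ["01-system-analyst"]},
    {"id": "03-architecture", "requires": ["02-product-spec"]},
]


def gate_path(gates_dir: Path, stage_id: str) -> Path:
    """Map '02-product-spec' to gates/02_gate.json (naming follows the folder tree)."""
    number = stage_id.split("-", 1)[0]
    return gates_dir / f"{number}_gate.json"


def gate_passed(gates_dir: Path, stage_id: str) -> bool:
    """A gate counts as passed only if the file exists, parses, and says PASS."""
    path = gate_path(gates_dir, stage_id)
    if not path.is_file():
        return False
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except ValueError:
        return False
    return isinstance(data, dict) and data.get("verdict") == "PASS"


def blocked_by(stage: dict, gates_dir: Path) -> list:
    """Return the upstream stage ids whose gates are missing or failed."""
    return [dep for dep in stage["requires"] if not gate_passed(gates_dir, dep)]
```

An empty list from blocked_by means the stage may start; anything else names the upstream stages to fix first.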
If you have not built a plugin before, start with my pillar guide on how to build a Claude plugin from scratch, because the pipeline is just a large, disciplined plugin.
How do you build the pipeline step by step?
You scaffold the plugin, define the stages, write one skill template, adapt it for each of the 21 stages with a stage-specific checklist, then run it with gates enforced. Here is the sequence.
Step 1: Scaffold the plugin
mkdir -p product-pipeline/.claude-plugin
mkdir -p product-pipeline/{outputs,gates}
# Only the first four stages are listed here; extend the list to all 21 stage names.
for i in 00-discovery-research 01-system-analyst 02-product-spec 03-architecture; do
  mkdir -p "product-pipeline/skills/$i"
done
Add a minimal plugin.json with a name such as product-pipeline, a version, and a one-line description.
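If you would rather scaffold all 21 stage folders and the manifest in one go, a short Python sketch does the job. Treat it as one possible approach: the stage names are copied from the agent list earlier in this guide, while the scaffold function name, the version string, and the description wording are placeholders I made up for illustration.

```python
import json
from pathlib import Path

# Stage names copied from the 21-agent list earlier in this guide.
STAGES = [
    "00-discovery-research", "01-system-analyst", "02-product-spec",
    "03-architecture", "04-data-architect", "05-ux-flow-architect",
    "06-ui-designer", "07-brand", "08-copywriter", "09-frontend-build",
    "10-backend-build", "11-infrastructure", "12-database-migrations",
    "13-integrations", "14-maintainability-review", "15-performance",
    "16-accessibility", "17-security", "18-test-architect", "19-seo-aeo",
    "20-launch",
]


def scaffold(root: Path) -> None:
    """Create the plugin folders and a minimal plugin.json under root."""
    for folder in (".claude-plugin", "outputs", "gates"):
        (root / folder).mkdir(parents=True, exist_ok=True)
    for stage in STAGES:
        (root / "skills" / stage).mkdir(parents=True, exist_ok=True)
    manifest = {
        "name": "product-pipeline",
        "version": "0.1.0",  # placeholder version, pick your own
        "description": "21-stage product build pipeline with gates.",  # placeholder wording
    }
    manifest_path = root / ".claude-plugin" / "plugin.json"
    manifest_path.write_text(json.dumps(manifest, indent=2) + "\n", encoding="utf-8")


if __name__ == "__main__":
    scaffold(Path("product-pipeline"))
```

The function is safe to rerun because every mkdir uses exist_ok=True; only plugin.json is overwritten.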
Step 2: Write the stage skill template
Every stage skill shares the same skeleton. Here is stage 17 as a complete example:
---
name: 17-security
description: Security review stage of the product pipeline. Trigger after the accessibility stage (16) passes. Checks auth flows, secrets handling, input validation, and dependency risk before testing, SEO, and launch.
---
# Security agent (stage 17 of 21)
Inputs: outputs/09_frontend.md, outputs/10_backend.md
Output: outputs/17_security.md
Gate file: gates/17_gate.json
Refuse to run if gates/16_gate.json is missing or failed.
Checklist:
1. No secrets in code, config, or logs. Placeholders only (YOUR_API_KEY).
2. Every form input is validated server-side, not just in the browser.
3. Auth flows cover signup, login, reset, and session expiry errors.
4. Dependencies are pinned and scanned for known vulnerabilities.
Write PASS or FAIL with reasons to the gate file.
The two lines that make the pipeline reliable are the Refuse to run line and the gate write at the end. Without them you have 21 documents, not a pipeline.
Step 3: Define the gate format
Keep gates machine-readable so any stage, or you, can check pipeline state at a glance:
{
"stage": "17-security",
"verdict": "PASS",
"checked": 4,
"failures": [],
"timestamp": "2026-06-09T14:30:00Z"
}
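In practice Claude writes this file as the last step of each skill, but a small helper makes the format easier to reason about and to test. The sketch below is an assumption-laden illustration: write_gate is a hypothetical name, and deriving the verdict from an empty failures list is my reading of the example above rather than a rule stated anywhere else.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def write_gate(gates_dir: Path, stage_id: str, checked: int, failures: list) -> dict:
    """Write gates/NN_gate.json; the verdict is PASS only when nothing failed."""
    gate = {
        "stage": stage_id,
        "verdict": "PASS" if not failures else "FAIL",
        "checked": checked,
        "failures": list(failures),
        # UTC timestamp in the same shape as the example, e.g. 2026-06-09T14:30:00Z
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    gates_dir.mkdir(parents=True, exist_ok=True)
    number = stage_id.split("-", 1)[0]  # "17-security" -> "17"
    path = gates_dir / f"{number}_gate.json"
    path.write_text(json.dumps(gate, indent=2) + "\n", encoding="utf-8")
    return gate
```

Calling write_gate(Path("gates"), "17-security", 4, []) produces a file with the same keys as the example; passing a non-empty failures list flips the verdict to FAIL and records the reasons.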
Step 4: Write the stage-specific checklists
The template is shared; the checklist is where each agent earns its place. The spec agent checks for acceptance criteria and explicit non-goals. The data architect checks that every entity has an owner, a lifecycle, and an index plan. The QA stages are the deepest: my UI and frontend gates reuse the website checklist skill, and stage 19 reuses the SEO and AI search checklist skill so nothing ships unindexed.
Step 5: Run the pipeline with gates enforced
Package and install the plugin (the packaging commands are in the plugin pillar), then drive it with stage-scoped prompts in Cowork or Claude Code:
Run stage 02 of the product pipeline for the invoicing app.
Read the stage 01 artefact first, obey the gate rules.
Run one stage per session where you can. Fresh context per stage keeps each agent sharp and keeps token use predictable, a discipline that pairs well with a token budget skill. When a gate fails, fix and rerun that stage only, never the whole pipeline.
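Before opening a fresh session it helps to know which stage is actually next. The following Python sketch reads the gate files and reports the first stage that has not passed; the function names gate_status and next_stage are hypothetical, and the PENDING label for a missing or unreadable gate file is my own convention, not part of the gate format.

```python
import json
from pathlib import Path
from typing import Optional


def gate_status(gates_dir: Path, stage_id: str) -> str:
    """Return PASS, FAIL, or PENDING (no readable gate file yet)."""
    path = gates_dir / f"{stage_id.split('-', 1)[0]}_gate.json"
    try:
        verdict = json.loads(path.read_text(encoding="utf-8")).get("verdict")
    except (OSError, ValueError, AttributeError):
        # Missing file, invalid JSON, or JSON that is not an object.
        return "PENDING"
    return verdict if verdict in ("PASS", "FAIL") else "PENDING"


def next_stage(stage_ids: list, gates_dir: Path) -> Optional[str]:
    """First stage in pipeline order that has not passed; None when all gates pass."""
    for stage_id in stage_ids:
        if gate_status(gates_dir, stage_id) != "PASS":
            return stage_id
    return None
```

If next_stage returns a stage whose status is FAIL, that is the one stage to fix and rerun; everything before it already passed and stays untouched.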
Step 6: Review at the human checkpoints
Three gates deserve a human eye every time: the spec gate (stage 02), the architecture gate (stage 03), and the security gate (stage 17). Everything else can run on Claude's own verdicts until the final launch review.
What goes wrong without gates?
Without gates, errors compound silently: a vague spec becomes a wrong data model, which becomes rework in every later stage. The gate between stages is the cheapest error-correction point you will ever get.
Three failure patterns show up repeatedly. First, skipping discovery and writing the spec from assumptions. Second, letting the build stages start while the UX artefact still has open questions. Third, treating QA, security, accessibility, and performance as one stage; they are four different mindsets and they catch different bugs, which is why they are four different agents here.
Frequently asked questions
Do the 21 agents run automatically one after another?
You choose. I trigger stages manually because the human checkpoints are valuable, but nothing stops you wrapping the loop in a single orchestrator skill that reads pipeline.yaml and walks the stages, pausing only on gate failures.
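As a rough picture of what such an orchestrator does, here is a Python sketch of the loop. It is not how a skill is implemented, only the control flow: walk and read_verdict are hypothetical names, run_stage is a stand-in callable for whatever actually invokes Claude on a stage, and the stage order is passed in as a list rather than parsed from pipeline.yaml.

```python
import json
from pathlib import Path
from typing import Callable, Optional


def read_verdict(gates_dir: Path, stage_id: str) -> Optional[str]:
    """Return the verdict stored in the stage's gate file, or None if unreadable."""
    path = gates_dir / f"{stage_id.split('-', 1)[0]}_gate.json"
    try:
        return json.loads(path.read_text(encoding="utf-8")).get("verdict")
    except (OSError, ValueError, AttributeError):
        return None


def walk(stage_ids: list, gates_dir: Path, run_stage: Callable[[str], None]) -> Optional[str]:
    """Run stages in order and pause at the first gate that does not pass.

    Returns the id of the stage that stopped the walk, or None if every gate passed.
    """
    for stage_id in stage_ids:
        if read_verdict(gates_dir, stage_id) == "PASS":
            continue  # passed in an earlier session, do not rerun
        run_stage(stage_id)  # hypothetical hook: the stage writes its own gate file
        if read_verdict(gates_dir, stage_id) != "PASS":
            return stage_id  # pause here so the failure can be fixed and rerun
    return None
```

Because stages that already passed are skipped, calling walk again after fixing a failed stage resumes from that stage instead of restarting the pipeline.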
Is 21 stages overkill for a small tool?
For a weekend tool, yes: collapse to 6 or 7 (spec, architecture, build, QA, security, SEO, launch) by merging neighbours. Keep the gate contract even when you merge stages; the contract is the point, not the number 21.
Can the agents be subagents instead of skills?
Subagents work well for stages you want isolated, and Claude Code supports defining them in the same plugin under agents/. I default to skills because artefacts and gates live on disk either way, and skills keep the workflow portable between Cowork and Claude Code. The Claude documentation covers both options.
How long does a full pipeline run take?
A focused product takes me two to four working days end to end, with most wall-clock time in the build and QA stages. The pipeline does not make Claude faster; it makes rework rarer, which is where the days are saved.
About the author
Yanni Papoutsis builds AI products, automation pipelines, and technical documentation with Claude, and publishes free tooling and guides at yanni.uk.
Next step: Pair this build pipeline with its marketing twin, the 21-agent GTM pipeline, and explore 1,000+ free AI tools at yanni.uk/category/all/.
Sources
- Claude documentation (skills, subagents, plugins): https://docs.claude.com
- Anthropic (agent design guidance): https://www.anthropic.com
- Model Context Protocol: https://modelcontextprotocol.io