Build a 21-Agent Product Pipeline with Claude Skills

By Yanni Papoutsis | 7 min read | 2026-06-09

A Claude agent pipeline is a set of sequential skills, each acting as one specialist agent, that takes a product through spec, architecture, data model, UX, UI, build, QA, security, SEO, and launch, with a quality gate between every stage. You build it as a single Claude plugin containing 21 numbered skills, and this guide shows the exact structure I use to ship real products.

TL;DR: Instead of asking Claude to "build the app" in one giant prompt, split the work into 21 specialist agents implemented as sequential Claude skills: discovery, system analysis, product spec, architecture, data, UX flows, UI design, brand, copy, frontend, backend, infrastructure, migrations, integrations, maintainability, performance, accessibility, security, testing, SEO, and launch. Each skill reads the artefacts of earlier stages, writes its own artefact to an outputs/ folder, and ends by writing a pass or fail gate file. The next stage refuses to start until the previous gate passes, which stops half-baked specs from becoming half-baked code. Package the whole pipeline as one plugin, run it stage by stage in Cowork or Claude Code, and rerun any failed stage in isolation. The full folder structure, a stage skill template, and the gate format are below.

Why split a product build into 21 agents?

Because one prompt cannot hold a whole product, and quality collapses when Claude tries to be analyst, architect, designer, developer, and tester simultaneously. Splitting the build into a Claude agent pipeline forces each stage to produce a reviewable artefact, and the gates between stages catch problems while they are still cheap to fix.

The 21 agents map to how real product teams work:

00 discovery-research      07 brand                14 maintainability-review
01 system-analyst          08 copywriter           15 performance
02 product-spec            09 frontend-build       16 accessibility
03 architecture            10 backend-build        17 security
04 data-architect          11 infrastructure       18 test-architect
05 ux-flow-architect       12 database-migrations  19 seo-aeo
06 ui-designer             13 integrations         20 launch

Sequence matters more than parallelism here. The data model depends on the spec, the UI depends on the UX flows, and security review is meaningless before code exists. Gates encode those dependencies explicitly.

How does the pipeline architecture work?

The pipeline is one plugin, 21 skills, one shared artefact folder, and one gate folder. Every stage follows the same contract: read upstream artefacts, do one job, write one artefact, write one gate verdict.

product-pipeline/
  .claude-plugin/
    plugin.json
  skills/
    00-discovery-research/SKILL.md
    01-system-analyst/SKILL.md
    02-product-spec/SKILL.md
    ...
    20-launch/SKILL.md
  outputs/        # stage artefacts, e.g. 02_spec.md
  gates/          # stage verdicts, e.g. 02_gate.json
  pipeline.yaml   # stage order, dependencies, gate criteria

The pipeline.yaml file is the contract every skill obeys:

pipeline: product-build
stages:
  - id: 02-product-spec
    requires: [01-system-analyst]
    artefact: outputs/02_spec.md
    gate:
      - every feature has user stories with acceptance criteria
      - out-of-scope section is present and non-empty
  - id: 03-architecture
    requires: [02-product-spec]
    artefact: outputs/03_architecture.md
    gate:
      - stack chosen with written rationale
      - every spec feature mapped to a component
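
Because every stage's requires list should only point at stages that come earlier, the ordering can be sanity-checked before anything runs. The following is a minimal sketch, not part of the plugin itself: PIPELINE is a hand-copied dict of the two stages shown above (loading the real file would need a YAML parser, which is left out), and check_order is a hypothetical helper name.

```python
# Hypothetical in-memory copy of the two stages from the pipeline.yaml excerpt.
PIPELINE = {
    "pipeline": "product-build",
    "stages": [
        {"id": "02-product-spec", "requires": ["01-system-analyst"],
         "artefact": "outputs/02_spec.md"},
        {"id": "03-architecture", "requires": ["02-product-spec"],
         "artefact": "outputs/03_architecture.md"},
    ],
}


def check_order(stages, already_defined=()):
    """Return a list of problems: a stage may only require stages listed before it."""
    seen = set(already_defined)
    problems = []
    for stage in stages:
        for dep in stage.get("requires", []):
            if dep not in seen:
                problems.append(f"{stage['id']} requires unknown or later stage {dep}")
        seen.add(stage["id"])
    return problems


if __name__ == "__main__":
    # 01-system-analyst is outside the excerpt, so it is declared as already defined.
    print(check_order(PIPELINE["stages"], already_defined=["01-system-analyst"]))
```

An empty list means the declared order is consistent; any entry names the stage whose dependency is missing or listed too late.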

If you have not built a plugin before, start with my pillar guide on how to build a Claude plugin from scratch, because the pipeline is just a large, disciplined plugin.

How do you build the pipeline step by step?

You scaffold the plugin, define the stages, write one skill template, clone it 21 times with stage-specific checklists, then run it with gates enforced. Here is the sequence.

Step 1: Scaffold the plugin

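mkdir -p product-pipeline/.claude-plugin
mkdir -p product-pipeline/{outputs,gates}
# One skill folder per stage, using the 21 stage names listed earlier
for i in 00-discovery-research 01-system-analyst 02-product-spec 03-architecture \
  04-data-architect 05-ux-flow-architect 06-ui-designer 07-brand 08-copywriter \
  09-frontend-build 10-backend-build 11-infrastructure 12-database-migrations \
  13-integrations 14-maintainability-review 15-performance 16-accessibility \
  17-security 18-test-architect 19-seo-aeo 20-launch; do
  mkdir -p "product-pipeline/skills/$i"
done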

Add a minimal plugin.json with a name such as product-pipeline, a version, and a one-line description.

Step 2: Write the stage skill template

Every stage skill shares the same skeleton. Here is stage 17 as a complete example:

---
name: 17-security
description: Security review stage of the product pipeline. Trigger after the accessibility gate (stage 16) passes. Checks auth flows, secrets handling, input validation, and dependency risk before testing, SEO, and launch.
---

# Security agent (stage 17 of 21)

Inputs: outputs/09_frontend.md, outputs/10_backend.md
Output: outputs/17_security.md
Gate file: gates/17_gate.json

Refuse to run if gates/16_gate.json is missing or failed.

Checklist:
1. No secrets in code, config, or logs. Placeholders only (YOUR_API_KEY).
2. Every form input is validated server-side, not just in the browser.
3. Auth flows cover signup, login, reset, and session expiry errors.
4. Dependencies are pinned and scanned for known vulnerabilities.

Write PASS or FAIL with reasons to the gate file.

The two lines that make the pipeline reliable are the Refuse to run line and the gate write at the end. Without them you have 21 documents, not a pipeline.
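
The refuse-to-run rule is simple enough to check outside Claude as well, for example in a pre-flight script. This is a minimal sketch under two assumptions taken from this guide: gate files are named gates/NN_gate.json and carry a verdict field. The function name upstream_gate_passed is hypothetical.

```python
import json
from pathlib import Path


def upstream_gate_passed(gates_dir, stage_number):
    """True only if <gates_dir>/<NN>_gate.json exists, parses, and has verdict PASS."""
    gate_path = Path(gates_dir) / f"{stage_number}_gate.json"
    if not gate_path.is_file():
        return False  # a missing gate counts as "not passed"
    try:
        gate = json.loads(gate_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return False  # an unreadable gate counts as "not passed"
    return isinstance(gate, dict) and gate.get("verdict") == "PASS"
```

Treating a missing or malformed gate the same as a failed one keeps the check conservative: a stage only starts when there is positive evidence that its predecessor passed.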

Step 3: Define the gate format

Keep gates machine-readable so any stage, or you, can check pipeline state at a glance:

{
  "stage": "17-security",
  "verdict": "PASS",
  "checked": 4,
  "failures": [],
  "timestamp": "2026-06-09T14:30:00Z"
}
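
If you ever need to produce or repair a gate file by hand, a small helper keeps the fields consistent with the format above. This is a sketch, not something the pipeline requires: build_gate and write_gate are hypothetical names, and deriving the verdict from an empty failures list is my assumption about how the two fields relate.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def build_gate(stage, checked, failures):
    """Build a gate record; the verdict is PASS only when no check failed."""
    return {
        "stage": stage,
        "verdict": "PASS" if not failures else "FAIL",
        "checked": checked,
        "failures": list(failures),
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


def write_gate(gates_dir, gate):
    """Write the record to <gates_dir>/<NN>_gate.json, NN being the stage's numeric prefix."""
    number = gate["stage"].split("-", 1)[0]
    path = Path(gates_dir) / f"{number}_gate.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(gate, indent=2) + "\n", encoding="utf-8")
    return path
```

For example, write_gate("gates", build_gate("17-security", 4, [])) would write gates/17_gate.json with a PASS verdict.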

Step 4: Write the stage-specific checklists

The template is shared; the checklist is where each agent earns its place. The spec agent checks for acceptance criteria and explicit non-goals. The data architect checks that every entity has an owner, a lifecycle, and an index plan. The QA stages are the deepest: my UI and frontend gates reuse the website checklist skill, and stage 19 reuses the SEO and AI search checklist skill so nothing ships unindexed.
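
Since only the checklist and a few names change between stages, the 21 SKILL.md files can be generated from one skeleton. The sketch below shows one way to do that; render_skill and TEMPLATE are hypothetical names, and the template assumes each SKILL.md opens with name and description metadata between --- delimiters.

```python
# Shared skeleton; only the placeholders differ from stage to stage.
TEMPLATE = """---
name: {stage_id}
description: {description}
---

# {title} (stage {number} of 21)

Output: {artefact}
Gate file: gates/{number}_gate.json

Refuse to run if gates/{upstream}_gate.json is missing or failed.

Checklist:
{checklist}

Write PASS or FAIL with reasons to the gate file.
"""


def render_skill(stage_id, title, description, artefact, upstream, checks):
    """Fill the shared skeleton with one stage's names and numbered checklist."""
    number = stage_id.split("-", 1)[0]
    checklist = "\n".join(f"{i}. {check}" for i, check in enumerate(checks, start=1))
    return TEMPLATE.format(stage_id=stage_id, description=description, title=title,
                           number=number, artefact=artefact, upstream=upstream,
                           checklist=checklist)


if __name__ == "__main__":
    print(render_skill(
        "02-product-spec", "Product spec agent",
        "Product spec stage of the product pipeline.",
        "outputs/02_spec.md", "01",
        ["Every feature has user stories with acceptance criteria.",
         "Out-of-scope section is present and non-empty."],
    ))
```

Stage 00 has no upstream gate, so its refuse-to-run line would need to be dropped or replaced by hand.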

Step 5: Run the pipeline with gates enforced

Package and install the plugin (the packaging commands are in the plugin pillar guide), then drive it with stage-scoped prompts in Cowork or Claude Code:

Run stage 02 of the product pipeline for the invoicing app.
Read the stage 01 artefact first, obey the gate rules.

Run one stage per session where you can. Fresh context per stage keeps each agent sharp and keeps token use predictable, a discipline that pairs well with a token budget skill. When a gate fails, fix and rerun that stage only, never the whole pipeline.

Step 6: Review at the human checkpoints

Three gates deserve a human eye every time: the spec gate (stage 02), the architecture gate (stage 03), and the security gate (stage 17). Everything else can run on Claude's own verdicts until the final launch review.

What goes wrong without gates?

Without gates, errors compound silently: a vague spec becomes a wrong data model, which becomes rework in every later stage. The gate between stages is the cheapest error-correction point you will ever get.

Three failure patterns show up repeatedly. First, skipping discovery and writing the spec from assumptions. Second, letting the build stages start while the UX artefact still has open questions. Third, treating QA, security, accessibility, and performance as one stage; they are four different mindsets and they catch different bugs, which is why they are four different agents here.

Frequently asked questions

Do the 21 agents run automatically one after another?

You choose. I trigger stages manually because the human checkpoints are valuable, but nothing stops you wrapping the loop in a single orchestrator skill that reads pipeline.yaml and walks the stages, pausing only on gate failures.
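
The control flow of such an orchestrator can be sketched in a few lines. This is an illustration only, assuming the gates/NN_gate.json naming and verdict field used in this guide; walk_pipeline is a hypothetical name, and run_stage stands in for whatever actually triggers a stage (a prompt, a script, a person).

```python
import json
from pathlib import Path


def walk_pipeline(stage_ids, gates_dir, run_stage):
    """Run stages in order and stop at the first stage whose gate is not PASS.

    run_stage is a caller-supplied callable that triggers one stage and is
    expected to leave <gates_dir>/<NN>_gate.json behind.
    Returns the id of the stage that blocked the run, or None if all passed.
    """
    for stage_id in stage_ids:
        number = stage_id.split("-", 1)[0]
        gate_path = Path(gates_dir) / f"{number}_gate.json"
        if not _passed(gate_path):
            run_stage(stage_id)      # stages that already passed are skipped
            if not _passed(gate_path):
                return stage_id      # pause here so the stage can be fixed and rerun
    return None


def _passed(gate_path):
    """True if the gate file exists, parses, and records a PASS verdict."""
    if not gate_path.is_file():
        return False
    try:
        return json.loads(gate_path.read_text(encoding="utf-8")).get("verdict") == "PASS"
    except (json.JSONDecodeError, AttributeError):
        return False
```

Because stages with a passing gate are skipped, calling walk_pipeline again after a fix reruns only the stage that failed and then carries on from there.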

Is 21 stages overkill for a small tool?

For a weekend tool, yes: collapse to 6 or 7 (spec, architecture, build, QA, security, SEO, launch) by merging neighbours. Keep the gate contract even when you merge stages; the contract is the point, not the number 21.

Can the agents be subagents instead of skills?

Subagents work well for stages you want isolated, and Claude Code supports defining them in the same plugin under agents/. I default to skills because artefacts and gates live on disk either way, and skills keep the workflow portable between Cowork and Claude Code. The Claude documentation covers both options.

How long does a full pipeline run take?

A focused product takes me two to four working days end to end, with most wall-clock time in the build and QA stages. The pipeline does not make Claude faster; it makes rework rarer, which is where the days are saved.

About the author

Yanni Papoutsis builds AI products, automation pipelines, and technical documentation with Claude, and publishes free tooling and guides at yanni.uk.

Next step: Pair this build pipeline with its marketing twin, the 21-agent GTM pipeline, and explore 1,000+ free AI tools at yanni.uk/ai-tools/.

Sources

Claude documentation (skills, subagents, plugins): https://docs.claude.com
Anthropic (agent design guidance): https://www.anthropic.com
Model Context Protocol: https://modelcontextprotocol.io