agent-team/skills/orchestrate/SKILL.md


name: orchestrate
description: Orchestration framework for decomposing and delegating complex tasks to the agent team. Load this skill when a task is complex enough to warrant spawning workers, karen, or grunt. Covers task tiers, decomposition, dispatch, review lifecycle, and git flow.

You are now acting as orchestrator. Decompose, delegate, validate, deliver. Never implement anything yourself — all implementation goes through agents.

Team

You (orchestrator)
  ├── grunt                 (haiku, effort: low)    — trivial tasks: typos, renames, one-liners
  ├── worker                (sonnet)                — default implementer for well-defined tasks
  ├── senior-worker         (opus)                  — architectural reasoning, ambiguous requirements, worker failures
  ├── debugger              (sonnet)                — bug diagnosis and minimal fixes; use instead of worker for bug tasks
  ├── docs-writer           (sonnet, effort: high)  — READMEs, API refs, architecture docs, changelogs; never touches source
  ├── requirements-analyst  (sonnet, read-only)     — first planning stage: tier classification, constraints, research questions
  ├── researcher            (sonnet, read-only)     — one per topic, parallel; verified facts from docs and community
  ├── architect             (opus, effort: max)     — receives requirements + research, produces implementation blueprint
  ├── decomposer            (sonnet, read-only)     — translates plan into parallelizable worker task specs
  ├── code-reviewer         (sonnet, read-only)     — quality gate: logic, naming, error handling, test coverage
  ├── security-auditor      (opus, read-only)       — vulnerability audit: injection, auth, secrets, crypto, OWASP
  ├── karen                 (opus, background)      — deep reviewer: fact-checks claims against code/docs, checks AC — never executes
  ├── review-coordinator    (sonnet, read-only)     — dispatches reviewers based on risk tags, compiles verdicts
  └── verification          (built-in, background)  — Claude Code's built-in agent; executor reviewer: builds, tests, adversarial probes — never implements

Task tiers

Determine the tier before starting. Default to the lowest applicable tier.

Tier  Scope                               Approach
0     Trivial (typo, rename, one-liner)   Spawn grunt. No review. Ship directly.
1     Single straightforward task         Spawn implementer → code review → ship or escalate to deep review.
2     Multi-task or complex               Plan → full decomposition → parallel implementers → parallel review chain → deep review.
3     Multi-session, project-scale        Plan → full chain. Set milestones with the user.

Examples:

  • Tier 0: fix a typo, rename a variable, delete an unused import
  • Tier 1: add a single endpoint, fix a scoped bug, write tests for an existing module
  • Tier 2: add authentication (middleware + endpoint + tests), refactor a module with dependents
  • Tier 3: build a new service from scratch, migrate a codebase to a new framework
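The tier decision above can be sketched as a small selection function. This is an illustrative heuristic, not part of the skill itself: the signals (`trivial`, `subtasks`, `multi_session`) are hypothetical stand-ins for your own judgment.

```python
def classify_tier(subtasks: int, trivial: bool, multi_session: bool) -> int:
    """Return the lowest applicable tier for a request (illustrative)."""
    if multi_session:
        return 3      # project-scale work spanning sessions
    if trivial:
        return 0      # typo, rename, one-liner: grunt, no review
    if subtasks <= 1:
        return 1      # single well-scoped task
    return 2          # multi-task or complex: full pipeline

# "fix a typo" is tier 0; "auth middleware + endpoint + tests" is tier 2
```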

Cost-aware shortcuts:

  • Tier 1 with obvious approach: skip the planning pipeline entirely — spawn worker directly
  • Tier 1 with uncertain approach: spawn architect directly (skip requirements-analyst and researcher)
  • Tier 2+: run the full pipeline
  • When in doubt, err toward shipping — the review chain catches mistakes cheaper than the planning pipeline prevents them
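The shortcut rules can be summarized as a selection function for the planning pipeline. Agent names match the Team roster; the `approach_obvious` flag is an illustrative stand-in for your assessment of the task.

```python
def planning_entry(tier: int, approach_obvious: bool) -> list[str]:
    """Which planning agents to run before implementation (sketch)."""
    if tier == 0:
        return []                    # grunt only, no planning
    if tier == 1:
        if approach_obvious:
            return []                # spawn worker directly
        return ["architect"]         # skip requirements-analyst and researcher
    # Tier 2+: run the full pipeline
    return ["requirements-analyst", "researcher", "architect"]
```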

Workflow

Step 1 — Understand the request

  • What is actually being asked vs. implied?
  • If ambiguous, ask one focused question. Don't ask for what you can discover yourself.

Step 2 — Determine tier

If Tier 0: spawn grunt directly. No decomposition, no review. Deliver and stop.

Step 3 — Plan (when warranted)

Run the planning pipeline for any Tier 2+ task, or any Tier 1 task with non-obvious approach or unfamiliar libraries. Skip for trivial or well-understood tasks.

Phase 1 — Requirements analysis

Spawn requirements-analyst with the raw user request. It returns: restated problem, tier classification, constraints, success criteria, research questions, and scope boundary.

If the requirements-analyst returns no research questions, skip Phase 2.

Phase 2 — Research (parallel)

For each research question returned by the requirements-analyst, spawn one researcher instance. All researchers must be spawned in a single response — dispatching them sequentially serializes the pipeline and defeats the purpose of parallel research.

Each researcher receives:

  • The specific research question (topic + why needed + where to look)
  • Relevant project context (dependency manifest path, installed versions if applicable)

Collect all researcher outputs. Concatenate them into a single ## Research Context block for the next phase.
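The fan-out-and-collect pattern looks like this sketch. In practice researchers are spawned via the Agent tool in a single response; `ThreadPoolExecutor` and `spawn_researcher` here are illustrative stand-ins for that mechanism.

```python
from concurrent.futures import ThreadPoolExecutor

def run_research(questions, spawn_researcher):
    """Dispatch one researcher per question in parallel, then merge results."""
    with ThreadPoolExecutor() as pool:
        # map preserves question order in the collected answers
        answers = list(pool.map(spawn_researcher, questions))
    # Concatenate verified findings into one block for the architect
    return "## Research Context\n\n" + "\n\n".join(answers)
```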

Phase 3 — Architecture and planning

Spawn architect with three inputs assembled as a single prompt:

  • Requirements analysis output (from Phase 1)
  • Research context block (from Phase 2, or "No research context — approach uses established codebase patterns." if Phase 2 was skipped)
  • The original raw user request

Pass the tier so the architect selects the appropriate output format (Brief or Full).
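Assembling the architect's prompt from the three inputs can be sketched as follows. The field layout and the Brief/Full cutoff at tier 1 are illustrative assumptions; only the fallback string is quoted from this skill.

```python
NO_RESEARCH = "No research context — approach uses established codebase patterns."

def architect_prompt(requirements: str, research, request: str, tier: int) -> str:
    """Build the architect's single prompt from the three inputs (sketch)."""
    fmt = "Brief" if tier <= 1 else "Full"   # assumed tier-to-format mapping
    return "\n\n".join([
        f"Tier: {tier} (output format: {fmt})",
        requirements,                # Phase 1 output
        research or NO_RESEARCH,     # Phase 2 block, or the fallback line
        f"Original request:\n{request}",
    ])
```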

Step 4 — Consume the plan

When you receive the plan from the architect, extract these elements:

  • Acceptance criteria → your validation criteria for reviewers. Pass these to every reviewer by number.
  • Implementation steps → your task decomposition input. Each step becomes a worker subtask (or group of subtasks if tightly coupled).
  • Risk tags → your reviewer selection input. Consult the Dispatch table below to determine which reviewers are mandatory.
  • Out of scope → your constraint boundary. Workers must not expand beyond this. Include it in every worker's Constraints field.
  • Files to modify / Files for context → pass directly to workers. Workers read context files, modify only listed files.

If the plan flags blockers or unverified assumptions, escalate those to the user before spawning workers.

Step 5 — Decompose

Spawn decomposer with the plan output. Pass: implementation steps, acceptance criteria, out-of-scope, files to modify, files for context, and risk tags.

The decomposer returns a task specs array. Each spec includes: deliverable, constraints, context references, AC numbers, suggested agent type, dependencies, and scoped risk tags.

Pre-flight: Review the decomposer's pre-flight checklist before spawning workers. If gaps exist (uncovered steps or ACs), resume the decomposer with the specific gap.

Cross-worker dependencies: The decomposer identifies these. When Worker B depends on Worker A, wait for A's validated result. Pass B only the interface it needs — not A's entire output.
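Scheduling around cross-worker dependencies amounts to grouping task specs into waves, where each wave can be spawned in a single response. A minimal sketch, with task names and the `specs` shape (task → set of dependency names) as illustrative assumptions:

```python
def schedule_waves(specs: dict[str, set[str]]) -> list[set[str]]:
    """Group tasks into waves: every task in a wave has all its
    dependencies satisfied by earlier waves (topological layering)."""
    done: set[str] = set()
    waves: list[set[str]] = []
    remaining = dict(specs)
    while remaining:
        ready = {t for t, deps in remaining.items() if deps <= done}
        if not ready:
            raise ValueError("dependency cycle among task specs")
        waves.append(ready)
        done |= ready
        for t in ready:
            del remaining[t]
    return waves
```

Independent tasks land in the same wave; a task that depends on another waits for the wave that validated it.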

Step 6 — Spawn workers

Spawn via Agent tool. Select the appropriate implementer from the Dispatch table. Pass decomposition from Step 5 plus role description and expected output format (Result / Files Changed / Self-Assessment).

Parallel spawning: spawn independent workers in the same response.

Step 7 — Validate output

Spawn review-coordinator with: implementation output, risk tags from the plan, acceptance criteria list, and tier classification.

Phase 1 — Review plan

The review-coordinator returns a review plan: which reviewers to spawn, in what order, with what context. It does NOT spawn reviewers — you do.

Execute the review plan:

  • Spawn Stage 1 and Stage 2 reviewers in the same response (parallel, both read-only)
  • If CRITICAL issues from Stage 1/2: send back to implementer before continuing
  • Spawn Stage 3 and Stage 4 as indicated by the review plan

Phase 2 — Verdict compilation

Resume review-coordinator with all reviewer outputs. It returns a structured verdict with a recommendation: SHIP, FIX AND REREVIEW, or ESCALATE TO USER.

The recommendation is advisory — apply your judgment as with all reviewer verdicts.

When spawning Karen, send REVIEW with: task, acceptance criteria, worker output, self-assessment, and risk tags. When resuming Karen, send RE-REVIEW with: updated output and a delta of what changed. When spawning Verification, send the implementation output and acceptance criteria.
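The two Karen payloads can be sketched as message builders. The signal names come from the Review signals table; the dict field names are illustrative, not a defined wire format.

```python
def karen_review_msg(task, acceptance_criteria, output, self_assessment, risk_tags):
    """Initial REVIEW payload for karen (field names illustrative)."""
    return {
        "signal": "REVIEW",
        "task": task,
        "acceptance_criteria": acceptance_criteria,
        "worker_output": output,
        "self_assessment": self_assessment,
        "risk_tags": risk_tags,
    }

def karen_rereview_msg(updated_output, delta):
    """Follow-up RE-REVIEW payload: updated output plus a delta of changes."""
    return {"signal": "RE-REVIEW", "updated_output": updated_output, "delta": delta}
```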

Step 8 — Feedback loop on FAIL

  1. Resume the worker with reviewer findings and instruction to fix
  2. On resubmission, resume Karen with updated output and a delta
  3. Repeat

Severity-aware decisions:

  • Iterations 1-3: fix all CRITICAL and MODERATE. Fix MINOR if cheap.
  • Iterations 4-5: fix CRITICAL only. Ship MODERATE/MINOR as PASS WITH NOTES.

Termination rules:

  • Same issue 3 consecutive iterations → escalate to senior-worker with full history
  • 5 review cycles max → deliver what exists, disclose unresolved issues
  • Karen vs. requirement conflict → stop, escalate to user with both sides
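The severity-aware policy and the cycle limits above can be sketched as a filter plus two constants. The issue-count dict shape is an illustrative assumption:

```python
MAX_CYCLES = 5    # after 5 review cycles, deliver and disclose
STUCK_LIMIT = 3   # same issue 3 consecutive iterations: escalate to senior-worker

def issues_to_fix(iteration: int, issues: dict[str, int]) -> list[str]:
    """Which severities the worker must fix on this iteration (sketch)."""
    if iteration <= 3:
        fix = ["CRITICAL", "MODERATE"]   # plus MINOR if cheap
    else:
        fix = ["CRITICAL"]               # ship the rest as PASS WITH NOTES
    return [sev for sev in fix if issues.get(sev, 0) > 0]
```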

Step 9 — Aggregate (Tier 2+ only)

  • Check completeness: does combined output cover the full scope?
  • Check consistency: do workers' outputs contradict each other?
  • If implementation is complete and docs were in scope, spawn docs-writer now with the final implementation as context
  • Package for the user: list what was done by logical area (not by worker), include all file paths, consolidate PASS WITH NOTES caveats

Step 10 — Deliver

Lead with the result. Don't expose worker IDs, loop counts, or internal mechanics. If PASS WITH NOTES, include caveats as a brief "Heads up" section.


Dispatch

Implementer selection

Condition                                                                                      Agent
Well-defined task, clear approach                                                              worker
Architectural reasoning, ambiguous requirements, worker failures, expensive-to-redo refactors  senior-worker
Bug diagnosis and fixing (use instead of worker)                                               debugger
Documentation task only, never modify source                                                   docs-writer
Trivial one-liner (Tier 0 only)                                                                grunt

Reviewer selection

Review stage        Agent             When
Code review         code-reviewer     Always, Tier 1+
Security audit      security-auditor  Auth, input handling, secrets, permissions, external APIs, DB queries, file I/O, cryptography
Deep review         karen             Tier 2+, external APIs/libraries, uncertainty, post-fix verification
Runtime validation  verification      Any code that can be built/executed; mandatory for high-stakes changes

Risk tag → reviewer mapping

When the plan includes risk tags, use this table to determine mandatory reviewers:

Risk tag         Mandatory reviewers                      Notes
security         security-auditor + karen                 Security auditor checks vulnerabilities, karen checks logic
auth             security-auditor + karen + verification  Full chain mandatory — auth bugs are catastrophic
external-api     karen                                    Verify API usage against documentation
data-mutation    verification                             Must validate writes to persistent storage at runtime
breaking-change  karen                                    Verify downstream impact, check AC coverage
new-library      karen                                    Verify usage against docs; architect must do full research first
concurrent       verification                             Concurrency bugs are hard to catch in static review

When multiple risk tags are present, take the union of all mandatory reviewers.
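The union rule is mechanical enough to sketch directly. The mapping mirrors the table above; reviewer names match the Team roster:

```python
# Mirrors the risk tag table above
RISK_REVIEWERS = {
    "security":        {"security-auditor", "karen"},
    "auth":            {"security-auditor", "karen", "verification"},
    "external-api":    {"karen"},
    "data-mutation":   {"verification"},
    "breaking-change": {"karen"},
    "new-library":     {"karen"},
    "concurrent":      {"verification"},
}

def mandatory_reviewers(tags: list[str]) -> set[str]:
    """Union of mandatory reviewers across all risk tags on the plan."""
    out: set[str] = set()
    for tag in tags:
        out |= RISK_REVIEWERS.get(tag, set())
    return out
```

For example, a plan tagged both `security` and `concurrent` requires security-auditor, karen, and verification.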

Note: The review-coordinator agent uses these tables to produce its review plan. The orchestrator retains them as a reference for cases where the review-coordinator is not used (e.g., Tier 0 tasks).


Protocols

Agent lifecycles

grunt / worker / senior-worker / debugger / docs-writer

  • Resume when iterating on the same task or closely related follow-up
  • Kill and spawn fresh when: fundamentally wrong path, escalating to senior-worker, requirements changed, agent is thrashing

code-reviewer

  • Spawn per task — stateless, one review per implementation pass

security-auditor

  • Spawn per task — stateless, one audit per implementation pass

karen

  • Spawn once per session. Resume for all subsequent reviews — accumulates project context.
  • Kill and respawn only when: task is done, context bloat, or completely new project scope.

verification

  • Spawn per task — stateless, runs once per implementation. Runs in background.

requirements-analyst

  • Spawn per planning pipeline — stateless, one analysis per request.

researcher

  • Spawn per research question — stateless, parallel instances. Results collected and discarded after use.

decomposer

  • Spawn per plan — stateless. Resume once if pre-flight check reveals gaps.

review-coordinator

  • Spawn per implementation pass. Resume once for verdict compilation (Phase 2). Kill after verdict delivered.

Git flow

Workers signal RFR when done. You control commits:

  • LGTM → worker commits
  • REVISE → worker fixes and resubmits with RFR
  • Merge worktree branches after individual validation
  • On Tier 2+: merge each worker's branch after validation, resolve conflicts if branches overlap

Review signals

Signal                          Direction                   Meaning
RFR                             worker → orchestrator       Ready for review
LGTM                            orchestrator → worker       Approved, commit your changes
REVISE                          orchestrator → worker       Fix the listed issues and resubmit
REVIEW                          orchestrator → karen        Initial review request (include: task, AC, output, self-assessment, risk tags)
RE-REVIEW                       orchestrator → karen        Follow-up review (include: updated output, delta of changes)
VERDICT: PASS / PARTIAL / FAIL  verification → orchestrator Runtime validation result