diff --git a/README.md b/README.md index c3cffac..07ad725 100644 --- a/README.md +++ b/README.md @@ -143,16 +143,15 @@ Shared runtime intent is generated conservatively across tools: The adapters do not expose identical config surfaces. For example, Codex does not support Claude-style per-tool `allow` / `deny` / `ask` patterns directly. The shared protocol keeps the intent portable, then adapters derive the closest target behavior. -`runtime.filesystem`, `runtime.approval`, and `runtime.network_access` are the primary source of truth. `targets.codex.sandbox_mode`, `targets.codex.approval_policy`, and `targets.codex.network_access` are compatibility overrides for exceptional cases only. When set, they override the Codex-derived value. +`runtime.approval` and `runtime.network_access` are the primary source of truth. `targets.codex.approval_policy` and `targets.codex.network_access` are compatibility overrides for exceptional cases only. When set, they override the Codex-derived value. -This repo intentionally sets those Codex overrides to `sandbox_mode: danger-full-access`, `approval_policy: never`, and `network_access: true`. The reason is not that Codex has no approval controls at all, but that it lacks Claude-equivalent pattern-level permission controls for tool/path `allow` / `deny` / `ask`. In this repo, Codex therefore runs with a deliberately more permissive top-level policy than the portable runtime defaults. +This repo intentionally sets those Codex overrides to `approval_policy: never` and `network_access: true`. The reason is not that Codex has no approval controls at all, but that it lacks Claude-equivalent pattern-level permission controls for tool/path `allow` / `deny` / `ask`. In this repo, Codex therefore runs with a deliberately more permissive top-level policy than the portable runtime defaults. Use target-specific fields only when you intentionally need a target-only override: ```yaml targets: codex: - sandbox_mode: danger-full-access approval_policy: untrusted network_access: false claude: diff --git a/SETTINGS.yaml b/SETTINGS.yaml index 52c6f71..0844c7b 100644 --- a/SETTINGS.yaml +++ b/SETTINGS.yaml @@ -49,7 +49,6 @@ targets: codex: # Intentional target override: Codex does not expose Claude-equivalent # per-tool/path allow/deny/ask controls, so this repo runs Codex in - # full-auto with no sandbox and network enabled by default. - sandbox_mode: danger-full-access + # full-auto with network enabled by default. approval_policy: never network_access: true diff --git a/TEAM.yaml b/TEAM.yaml index 3661de1..379150b 100644 --- a/TEAM.yaml +++ b/TEAM.yaml @@ -274,6 +274,7 @@ rules: - 03-git - 04-tools - 05-verification + - 06-nix - 07-research items: 01-session: @@ -306,6 +307,12 @@ rules: applies_to: - claude - codex + 06-nix: + id: 06-nix + source_file: rules/06-nix.md + applies_to: + - claude + - codex 07-research: id: 07-research source_file: rules/07-research.md diff --git a/agents/architect.md b/agents/architect.md index c7212e0..5eb6de5 100644 --- a/agents/architect.md +++ b/agents/architect.md @@ -94,7 +94,7 @@ Triggered when the orchestrator resumes you with a `## Research Context` block ( 1. Surface any unresolved blockers from research before planning — do not plan around unverified assumptions 2. Analyze the codebase: files to change, files for context, existing patterns to follow 3. Design the architecture: define interfaces and contracts upfront so parallel workers don't need to coordinate -4. Decompose into waves: group steps by what runs in parallel vs. what has dependencies +4. Decompose into waves: group steps by what can run in parallel vs. what has dependencies 5. Write the plan file **If the request involves more than 8–10 steps**, decompose into multiple plans, each independently implementable and testable. State: "This is plan 1 of N." @@ -157,7 +157,7 @@ What could go wrong. Edge cases. Breaking changes. ## Implementation Waves ### Wave 1 — [description] -Tasks that run in parallel. No dependencies. +Tasks that can run in parallel. No dependencies. - [ ] **Step 1: [title]** — What/Where/How @@ -194,7 +194,7 @@ Key facts from research, organized by relevance. Include source URLs. Flag anyth Every file that will change, with a brief description and file:line references. ### Files for context (read-only) -Files workers should read when relevant to understand patterns, interfaces, or dependencies. +Files workers should read to understand patterns, interfaces, or dependencies. ### Current patterns Conventions, naming schemes, architectural patterns the implementation must follow. @@ -279,6 +279,6 @@ Format: comma-separated, e.g. `security, external-api`. Add a brief note if the - If documentation is ambiguous or missing, say so explicitly and fall back to codebase evidence - Surface gotchas and known issues prominently -- Use approaches already used elsewhere in the codebase over novel patterns +- Prefer approaches used elsewhere in the codebase over novel patterns - Flag any assumption you couldn't verify - For each non-trivial decision, evaluate at least two approaches and state why you chose one diff --git a/agents/auditor.md b/agents/auditor.md index 163a987..69d1c88 100644 --- a/agents/auditor.md +++ b/agents/auditor.md @@ -80,7 +80,7 @@ typecheck_status: pass | fail | skipped --- ``` -**Hard rule:** `security_findings.critical > 0` or `security_findings.high > 0` or `build_status: fail` or `test_status: fail` requires `signal: fail`. +**Hard rule:** `security_findings.critical > 0` or `build_status: fail` or `test_status: fail` requires `signal: fail`. Then the markdown body: diff --git a/agents/grunt.md b/agents/grunt.md index 6454d44..b243899 100644 --- a/agents/grunt.md +++ b/agents/grunt.md @@ -19,9 +19,9 @@ You are a grunt agent. You implement small, explicit tasks quickly and cheaply. Implement only what was assigned. Do not expand scope on your own judgment. -**Do not make architectural decisions.** If the task depends on an unclear interface, missing contract, or non-trivial judgment call, stop and report that the task must be escalated. +**Do not make architectural decisions.** If the task depends on an unclear interface, missing contract, or non-trivial judgment call, stop and report that the task should be escalated. -If the task grows beyond a small, tightly scoped change, stop and report that it must be reassigned to `worker`. Escalate to the orchestrator instead when the real issue is a missing plan, unclear requirement, or changed scope. +If the task grows beyond a small, tightly scoped change, stop and report that it should be reassigned to `worker`. Escalate to the orchestrator instead when the real issue is a missing plan, unclear requirement, or changed scope. If you are stuck after one focused attempt, stop and report what blocked you. diff --git a/agents/researcher.md b/agents/researcher.md index 871dda7..c8d91e4 100644 --- a/agents/researcher.md +++ b/agents/researcher.md @@ -24,7 +24,7 @@ Shell access is intentionally unavailable in this role to enforce read-only beha ## Verification standards - **Dependency versions** — check the project's dependency manifest first. Research the installed version, not the latest. -- **Official documentation** — fetch the authoritative docs. Use versioned documentation matching the installed version. +- **Official documentation** — fetch the authoritative docs. Prefer versioned documentation matching the installed version. - **Changelogs and migration guides** — fetch these when the question involves upgrades or version-sensitive behavior. - **Community examples** — search for real implementations, known gotchas, and battle-tested patterns. - **If verification fails** — state what you tried and could not verify. Do not fabricate an answer. Flag it as unverified. diff --git a/agents/reviewer.md b/agents/reviewer.md index 508b83f..aaa2d58 100644 --- a/agents/reviewer.md +++ b/agents/reviewer.md @@ -28,7 +28,7 @@ Shell access is intentionally unavailable in this role to enforce read-only beha ## Claim verification -- **Acceptance criteria** — when acceptance criteria are provided, walk each criterion explicitly by number. Clean code that doesn't do what was asked is a FAIL. +- **Acceptance criteria** — walk each criterion explicitly by number. Clean code that doesn't do what was asked is a FAIL. - **API and library usage** — verify against official docs ${WEB_SEARCH} when the implementation uses external APIs, libraries, or non-obvious patterns - **File and path claims** — do they exist? - **Logic correctness** — does the implementation actually solve the problem? @@ -57,8 +57,6 @@ ac_coverage: **Hard rule:** `critical_count > 0` requires `signal: fail`. -Omit `ac_coverage` when no acceptance criteria were provided in the assignment. - Then the markdown body: ### Review: [scope] @@ -66,7 +64,7 @@ Then the markdown body: **CRITICAL** — must fix before shipping - file:line — [what's wrong and why] -**MODERATE** — fix during active review cycles unless explicitly deferred by orchestrator policy +**MODERATE** — should fix - file:line — [what's wrong] **MINOR** — consider fixing @@ -77,8 +75,6 @@ Then the markdown body: - AC2: PASS / FAIL — [one line] - ... -Omit the **AC Coverage** section when no acceptance criteria were provided. - One line summary. --- diff --git a/agents/senior.md b/agents/senior.md index 9a4acef..1d7e4ea 100644 --- a/agents/senior.md +++ b/agents/senior.md @@ -19,7 +19,7 @@ You are a senior agent. You implement difficult or ambiguous tasks with strong t Implement only what was assigned. Do not expand scope unless the orchestrator explicitly revises the task. -You may resolve local implementation ambiguity when necessary, but **do not invent architecture** that must be specified by the plan. If a missing interface or contract changes the design boundary, stop and report the gap. +You may resolve local implementation ambiguity when necessary, but **do not invent architecture** that should have been specified by the plan. If a missing interface or contract changes the design boundary, stop and report the gap. If the plan appears wrong or incomplete, stop and explain the issue clearly rather than forcing a brittle implementation. @@ -27,12 +27,12 @@ If you are stuck after two serious attempts, stop and report what you tried and ## Escalation contract -- Stay local: difficult implementation, careful cross-file reasoning, and bounded ambiguity that is resolvable without changing the plan's design boundary. -- Escalate to the orchestrator: when the remaining work requires decomposition into a team, when coordination is now the main risk, or when the plan needs to be revised before safe implementation can continue. +- Stay local: difficult implementation, careful cross-file reasoning, and bounded ambiguity that can be resolved without changing the plan's design boundary. +- Escalate to the orchestrator: when the remaining work should be decomposed into a team, when coordination is now the main risk, or when the plan needs to be revised before safe implementation can continue. - Do not summon more seniors yourself. Re-decomposition is the orchestrator's responsibility. - If a stronger implementation wave is needed, report that explicitly so the orchestrator can spawn a senior team with clear ownership. When returning a typed envelope: -- Use `signal: blocked` when the orchestrator must re-decompose the work, amend the plan, or split the task into a senior wave. +- Use `signal: blocked` when the orchestrator should re-decompose the work, amend the plan, or split the task into a senior wave. - Use `signal: escalate` only when the issue requires a user decision rather than orchestration. - In the body, state the preferred next route explicitly: `Route: orchestrator (re-decompose)` or `Route: orchestrator (user decision required)`. diff --git a/agents/worker.md b/agents/worker.md index 5318d84..d0e4b93 100644 --- a/agents/worker.md +++ b/agents/worker.md @@ -33,6 +33,6 @@ If this task is more complex than it appeared (more files involved, unclear inte - Do not silently turn a plan gap into a design decision. When returning a typed envelope: -- Use `signal: blocked` when the work must be reassigned to `senior` or when the orchestrator needs to unblock you. +- Use `signal: blocked` when the work should be reassigned to `senior` or when the orchestrator needs to unblock you. - Use `signal: escalate` only when user-level clarification or approval is required. - In the body, state the preferred next route explicitly: `Route: senior` or `Route: orchestrator`. diff --git a/generate.sh b/generate.sh index 5b7c080..5b8002a 100755 --- a/generate.sh +++ b/generate.sh @@ -400,12 +400,6 @@ map_effort() { map_sandbox_mode() { local permission_mode="$1" local tools="$2" - local override="${3:-}" - - if [ -n "$override" ] && [ "$override" != "null" ]; then - echo "$override" - return - fi # plan mode is read-only if [ "$permission_mode" = "plan" ]; then @@ -431,12 +425,6 @@ map_sandbox_mode() { # --------------------------------------------------------------------------- map_default_sandbox_mode() { local default_mode="$1" - local override="${2:-}" - - if [ -n "$override" ] && [ "$override" != "null" ]; then - echo "$override" - return - fi case "$default_mode" in plan) echo "read-only" ;; @@ -575,7 +563,6 @@ generate_codex() { [ -n "$agent_id" ] || continue local name description model effort permission_mode tools disallowed_tools - local codex_sandbox_override local agent_skills local src_file dst_file name="$(yq -r ".agents.items.${agent_id}.name" "$TEAM_YAML")" @@ -585,7 +572,6 @@ generate_codex() { permission_mode="$(yq -r ".agents.items.${agent_id}.permission_mode // \"\"" "$TEAM_YAML")" tools="$(yq -r ".agents.items.${agent_id}.tools[]" "$TEAM_YAML" | csv_from_yaml_array)" disallowed_tools="$(yq -r ".agents.items.${agent_id}.disallowed_tools // [] | .[]" "$TEAM_YAML" | csv_from_yaml_array)" - codex_sandbox_override="$(yq -r '.targets.codex.sandbox_mode // ""' "$SETTINGS_SHARED_YAML")" agent_skills="$(yq -r ".agents.items.${agent_id}.skills[]" "$TEAM_YAML")" src_file="$SCRIPT_DIR/$(yq -r ".agents.items.${agent_id}.instruction_file" "$TEAM_YAML")" dst_file="$CODEX_AGENTS_DIR/${name}.toml" @@ -594,7 +580,7 @@ generate_codex() { local codex_model codex_effort codex_sandbox codex_model="$(map_model "$model")" codex_effort="$(map_effort "${effort:-medium}")" - codex_sandbox="$(map_sandbox_mode "$permission_mode" "$tools" "$codex_sandbox_override")" + codex_sandbox="$(map_sandbox_mode "$permission_mode" "$tools")" # Extract and expand body with Codex variable values local body expanded_body @@ -678,19 +664,17 @@ TOML echo "" echo "Generating codex/config.toml..." - local default_mode runtime_approval codex_approval_override codex_network_access codex_sandbox_override + local default_mode runtime_approval codex_approval_override codex_network_access default_mode="$(map_filesystem_intent_to_claude_mode "$(yq -r '.runtime.filesystem' "$SETTINGS_SHARED_YAML")")" runtime_approval="$(yq -r '.runtime.approval' "$SETTINGS_SHARED_YAML")" - codex_sandbox_override="$(yq -r '.targets.codex.sandbox_mode // ""' "$SETTINGS_SHARED_YAML")" codex_approval_override="$(yq -r '.targets.codex.approval_policy // ""' "$SETTINGS_SHARED_YAML")" codex_network_access="$(yq -r '.targets.codex.network_access // .runtime.network_access // false' "$SETTINGS_SHARED_YAML")" local config_sandbox config_approval - config_sandbox="$(map_default_sandbox_mode "$default_mode" "$codex_sandbox_override")" + config_sandbox="$(map_default_sandbox_mode "$default_mode")" config_approval="$(map_approval_policy "$runtime_approval" "$codex_approval_override")" - if [ "$config_sandbox" = "workspace-write" ]; then - cat > "$CODEX_DIR/config.toml" < "$CODEX_DIR/config.toml" < "$CODEX_DIR/config.toml" < 0` requires `signal: fail`. -Body: Findings by severity (CRITICAL / MODERATE / MINOR), then AC Coverage details when applicable, then one-line summary. +Body: Findings by severity (CRITICAL / MODERATE / MINOR), then AC Coverage details, then one-line summary. ### audit_verdict @@ -132,7 +131,7 @@ typecheck_status: pass | fail | skipped Required: `type`, `signal`, `security_findings`, `build_status`, `test_status` Optional: `typecheck_status` -**Hard rule:** `security_findings.critical > 0` or `security_findings.high > 0` or `build_status: fail` or `test_status: fail` requires `signal: fail`. +**Hard rule:** `security_findings.critical > 0` or `build_status: fail` or `test_status: fail` requires `signal: fail`. High-severity findings (`security_findings.high > 0`) do not require `fail` — use `pass_with_notes`. Body: Security findings by severity (or CLEAN), then Runtime section with tested/passed/failed. @@ -209,7 +208,7 @@ Body: Answer, Verified Facts with sources, Version Constraints, Gotchas, Unverif ### task_assignment -Sent to: grunt, worker, senior, debugger, documenter +Sent to: worker, debugger, documenter ```yaml --- @@ -229,7 +228,7 @@ Body: Task spec, Acceptance Criteria, Context (interface contracts, constraints, ### revision_request -Sent to: grunt, worker, senior, debugger, documenter +Sent to: worker, debugger, documenter ```yaml --- @@ -250,7 +249,7 @@ Body: Issues to fix (from reviewer and/or auditor), grouped by source, with guid ### approval -Sent to: grunt, worker, senior, debugger, documenter +Sent to: worker, debugger, documenter ```yaml --- diff --git a/skills/orchestrate/SKILL.md b/skills/orchestrate/SKILL.md index c743034..3e2b8da 100644 --- a/skills/orchestrate/SKILL.md +++ b/skills/orchestrate/SKILL.md @@ -105,7 +105,7 @@ For each wave in the plan: - `grunt -> worker` when the task is no longer mechanical but still well-defined - `worker -> senior` when the task is implementable but needs stronger judgment or broader reasoning - `grunt` or `worker` -> orchestrator when the real issue is a plan gap, changed scope, or missing requirement -- `senior -> orchestrator` when the work requires re-decomposition into a senior wave/team or when the plan boundary must change +- `senior -> orchestrator` when the work should be re-decomposed into a senior wave/team or the plan boundary must change ### Step 6 — Review @@ -146,7 +146,7 @@ Do not advance until both verdicts are collected. - **Docs:** if documentation was in scope, spawn `documenter` now with final implementation as context - **Package:** list what was done by logical area (not by worker). Include all file paths. Surface PASS WITH NOTES caveats as a brief "Heads up" section. -Lead with the result. Don't expose worker IDs, wave counts, or internal mechanics. When subagent results return to your context, use concise summaries over verbatim output — the full detail is in the code, not the report. +Lead with the result. Don't expose worker IDs, wave counts, or internal mechanics. When subagent results return to your context, prefer concise summaries over verbatim output — the full detail is in the code, not the report. --- @@ -211,7 +211,7 @@ The actual write protection for read-only agents comes from `disallowedTools: Wr **Reviewer and auditor must be spawned in a single response.** **All researchers must be spawned in a single response.** -Spawning agents sequentially when parallel dispatch is possible is a protocol violation, not a style choice. Parallel dispatch reduces wall-clock latency proportionally — N agents in parallel complete in the time of the slowest, not the sum of all. +Spawning agents sequentially when they could run in parallel is a protocol violation, not a style choice. Parallel dispatch reduces wall-clock latency proportionally — N agents in parallel complete in the time of the slowest, not the sum of all. ### Git flow diff --git a/skills/worker-protocol/SKILL.md b/skills/worker-protocol/SKILL.md index 1c616f7..25d8b8d 100644 --- a/skills/worker-protocol/SKILL.md +++ b/skills/worker-protocol/SKILL.md @@ -1,7 +1,7 @@ --- name: worker-protocol description: Standard output format, feedback handling, and operational procedures for all worker agents. -when_to_use: Loaded by grunt, worker, senior, debugger, and documenter agents. Defines the worker_submission envelope format and commit workflow. +when_to_use: Loaded by worker, debugger, and documenter agents. Defines the worker_submission envelope format and commit workflow. --- ## Output format @@ -29,7 +29,7 @@ Then the markdown body: [Your deliverable here] ## Self-Assessment -- Acceptance criteria met: [yes/no per criterion, one line each, or "No acceptance criteria were provided"] +- Acceptance criteria met: [yes/no per criterion, one line each] - Known limitations: [any, or "none"] ``` diff --git a/spec/agent-runtime-v1.md b/spec/agent-runtime-v1.md index 4e6d275..a172ff8 100644 --- a/spec/agent-runtime-v1.md +++ b/spec/agent-runtime-v1.md @@ -55,7 +55,6 @@ Target blocks are escape hatches, not the main schema. Current target-specific fields: - `targets.claude.claude_md_excludes` -- `targets.codex.sandbox_mode` (optional override of derived sandbox mode) - `targets.codex.approval_policy` (optional override of derived approval) - `targets.codex.network_access` (optional override of derived network access) @@ -64,7 +63,7 @@ Authority rules: - `runtime.approval` and `runtime.network_access` are the portable source of truth. - Codex target fields exist for explicit compatibility overrides and should normally be omitted. - When Codex target fields are set, they intentionally override the derived Codex value. -- In this repo, `targets.codex.sandbox_mode`, `targets.codex.approval_policy`, and `targets.codex.network_access` are intentionally set so Codex runs with `sandbox_mode = "danger-full-access"`, `approval_policy = "never"`, and network enabled by default. This is a deliberate target-specific compatibility choice, not an accidental divergence. +- In this repo, `targets.codex.approval_policy` and `targets.codex.network_access` are intentionally set so Codex runs with `approval_policy = "never"` and network enabled by default. This is a deliberate target-specific compatibility choice, not an accidental divergence. ## Adapter rules @@ -89,11 +88,10 @@ Lossiness: - `runtime.filesystem = read-only` -> `sandbox_mode = "read-only"` - `runtime.filesystem = workspace-write` -> `sandbox_mode = "workspace-write"` -- `targets.codex.sandbox_mode` -> overrides the derived `sandbox_mode` - `runtime.approval = manual` -> `approval_policy = "on-request"` (unless overridden) - `runtime.approval = guarded-auto` -> `approval_policy = "untrusted"` (unless overridden) - `runtime.approval = full-auto` -> `approval_policy = "never"` (unless overridden) -- `runtime.network_access` -> `[sandbox_workspace_write].network_access` when `sandbox_mode = "workspace-write"` +- `runtime.network_access` -> `[sandbox_workspace_write].network_access` Lossiness: