From 3a2d565aaa71005d3d2fad4fc22ba728e497a06a Mon Sep 17 00:00:00 2001
From: Bryan Ramos
Date: Fri, 3 Apr 2026 12:31:48 -0400
Subject: [PATCH] chore(config): harden shared agent rules

---
 README.md                         |  5 +++--
 SETTINGS.yaml                     |  3 ++-
 TEAM.yaml                         |  7 -------
 agents/architect.md               |  8 ++++----
 agents/grunt.md                   |  4 ++--
 agents/researcher.md              |  2 +-
 agents/reviewer.md                |  2 +-
 agents/senior.md                  |  8 ++++----
 agents/worker.md                  |  2 +-
 generate.sh                       | 33 +++++++++++++++++++++++++++----
 rules/01-session.md               |  2 +-
 rules/04-tools.md                 | 24 ++++++++++++++++++----
 rules/05-verification.md          |  2 +-
 rules/06-nix.md                   |  9 ---------
 rules/07-research.md              |  2 +-
 schemas/agent-runtime.schema.json |  9 +++++++++
 schemas/team.schema.json          | 12 ------------
 skills/conventions/SKILL.md       | 18 ++++++++---------
 skills/orchestrate/SKILL.md       |  6 +++---
 spec/agent-runtime-v1.md          |  6 ++++--
 20 files changed, 95 insertions(+), 69 deletions(-)
 delete mode 100644 rules/06-nix.md

diff --git a/README.md b/README.md
index 07ad725..c3cffac 100644
--- a/README.md
+++ b/README.md
@@ -143,15 +143,16 @@ Shared runtime intent is generated conservatively across tools:

 The adapters do not expose identical config surfaces. For example, Codex does not support Claude-style per-tool `allow` / `deny` / `ask` patterns directly. The shared protocol keeps the intent portable, then adapters derive the closest target behavior.

-`runtime.approval` and `runtime.network_access` are the primary source of truth. `targets.codex.approval_policy` and `targets.codex.network_access` are compatibility overrides for exceptional cases only. When set, they override the Codex-derived value.
+`runtime.filesystem`, `runtime.approval`, and `runtime.network_access` are the primary source of truth. `targets.codex.sandbox_mode`, `targets.codex.approval_policy`, and `targets.codex.network_access` are compatibility overrides for exceptional cases only. When set, they override the Codex-derived value.

-This repo intentionally sets those Codex overrides to `approval_policy: never` and `network_access: true`. The reason is not that Codex has no approval controls at all, but that it lacks Claude-equivalent pattern-level permission controls for tool/path `allow` / `deny` / `ask`. In this repo, Codex therefore runs with a deliberately more permissive top-level policy than the portable runtime defaults.
+This repo intentionally sets those Codex overrides to `sandbox_mode: danger-full-access`, `approval_policy: never`, and `network_access: true`. The reason is not that Codex has no approval controls at all, but that it lacks Claude-equivalent pattern-level permission controls for tool/path `allow` / `deny` / `ask`. In this repo, Codex therefore runs with a deliberately more permissive top-level policy than the portable runtime defaults.

 Use target-specific fields only when you intentionally need a target-only override:

 ```yaml
 targets:
   codex:
+    sandbox_mode: danger-full-access
     approval_policy: untrusted
     network_access: false
   claude:
diff --git a/SETTINGS.yaml b/SETTINGS.yaml
index 0844c7b..52c6f71 100644
--- a/SETTINGS.yaml
+++ b/SETTINGS.yaml
@@ -49,6 +49,7 @@ targets:
   codex:
     # Intentional target override: Codex does not expose Claude-equivalent
     # per-tool/path allow/deny/ask controls, so this repo runs Codex in
-    # full-auto with network enabled by default.
+    # full-auto with no sandbox and network enabled by default.
+    sandbox_mode: danger-full-access
     approval_policy: never
     network_access: true
diff --git a/TEAM.yaml b/TEAM.yaml
index 379150b..3661de1 100644
--- a/TEAM.yaml
+++ b/TEAM.yaml
@@ -274,7 +274,6 @@ rules:
     - 03-git
     - 04-tools
     - 05-verification
-    - 06-nix
     - 07-research
   items:
     01-session:
@@ -307,12 +306,6 @@ rules:
       applies_to:
         - claude
         - codex
-    06-nix:
-      id: 06-nix
-      source_file: rules/06-nix.md
-      applies_to:
-        - claude
-        - codex
     07-research:
       id: 07-research
       source_file: rules/07-research.md
diff --git a/agents/architect.md b/agents/architect.md
index 5eb6de5..c7212e0 100644
--- a/agents/architect.md
+++ b/agents/architect.md
@@ -94,7 +94,7 @@ Triggered when the orchestrator resumes you with a `## Research Context` block (
 1. Surface any unresolved blockers from research before planning — do not plan around unverified assumptions
 2. Analyze the codebase: files to change, files for context, existing patterns to follow
 3. Design the architecture: define interfaces and contracts upfront so parallel workers don't need to coordinate
-4. Decompose into waves: group steps by what can run in parallel vs. what has dependencies
+4. Decompose into waves: group steps by what runs in parallel vs. what has dependencies
 5. Write the plan file

 **If the request involves more than 8–10 steps**, decompose into multiple plans, each independently implementable and testable. State: "This is plan 1 of N."
@@ -157,7 +157,7 @@ What could go wrong. Edge cases. Breaking changes.
 ## Implementation Waves

 ### Wave 1 — [description]
-Tasks that can run in parallel. No dependencies.
+Tasks that run in parallel. No dependencies.

 - [ ] **Step 1: [title]** — What/Where/How

@@ -194,7 +194,7 @@ Key facts from research, organized by relevance. Include source URLs. Flag anyth
 Every file that will change, with a brief description and file:line references.

 ### Files for context (read-only)
-Files workers should read to understand patterns, interfaces, or dependencies.
+Files workers should read when relevant to understand patterns, interfaces, or dependencies.

 ### Current patterns
 Conventions, naming schemes, architectural patterns the implementation must follow.
@@ -279,6 +279,6 @@ Format: comma-separated, e.g. `security, external-api`. Add a brief note if the

 - If documentation is ambiguous or missing, say so explicitly and fall back to codebase evidence
 - Surface gotchas and known issues prominently
-- Prefer approaches used elsewhere in the codebase over novel patterns
+- Use approaches already used elsewhere in the codebase over novel patterns
 - Flag any assumption you couldn't verify
 - For each non-trivial decision, evaluate at least two approaches and state why you chose one
diff --git a/agents/grunt.md b/agents/grunt.md
index b243899..6454d44 100644
--- a/agents/grunt.md
+++ b/agents/grunt.md
@@ -19,9 +19,9 @@ You are a grunt agent. You implement small, explicit tasks quickly and cheaply.

 Implement only what was assigned. Do not expand scope on your own judgment.

-**Do not make architectural decisions.** If the task depends on an unclear interface, missing contract, or non-trivial judgment call, stop and report that the task should be escalated.
+**Do not make architectural decisions.** If the task depends on an unclear interface, missing contract, or non-trivial judgment call, stop and report that the task must be escalated.

-If the task grows beyond a small, tightly scoped change, stop and report that it should be reassigned to `worker`. Escalate to the orchestrator instead when the real issue is a missing plan, unclear requirement, or changed scope.
+If the task grows beyond a small, tightly scoped change, stop and report that it must be reassigned to `worker`. Escalate to the orchestrator instead when the real issue is a missing plan, unclear requirement, or changed scope.

 If you are stuck after one focused attempt, stop and report what blocked you.

diff --git a/agents/researcher.md b/agents/researcher.md
index c8d91e4..871dda7 100644
--- a/agents/researcher.md
+++ b/agents/researcher.md
@@ -24,7 +24,7 @@ Shell access is intentionally unavailable in this role to enforce read-only beha
 ## Verification standards

 - **Dependency versions** — check the project's dependency manifest first. Research the installed version, not the latest.
-- **Official documentation** — fetch the authoritative docs. Prefer versioned documentation matching the installed version.
+- **Official documentation** — fetch the authoritative docs. Use versioned documentation matching the installed version.
 - **Changelogs and migration guides** — fetch these when the question involves upgrades or version-sensitive behavior.
 - **Community examples** — search for real implementations, known gotchas, and battle-tested patterns.
 - **If verification fails** — state what you tried and could not verify. Do not fabricate an answer. Flag it as unverified.
diff --git a/agents/reviewer.md b/agents/reviewer.md
index aaa2d58..9afb8a6 100644
--- a/agents/reviewer.md
+++ b/agents/reviewer.md
@@ -64,7 +64,7 @@ Then the markdown body:
 **CRITICAL** — must fix before shipping
 - file:line — [what's wrong and why]

-**MODERATE** — should fix
+**MODERATE** — fix during active review cycles unless explicitly deferred by orchestrator policy
 - file:line — [what's wrong]

 **MINOR** — consider fixing
diff --git a/agents/senior.md b/agents/senior.md
index 1d7e4ea..9a4acef 100644
--- a/agents/senior.md
+++ b/agents/senior.md
@@ -19,7 +19,7 @@ You are a senior agent. You implement difficult or ambiguous tasks with strong t

 Implement only what was assigned. Do not expand scope unless the orchestrator explicitly revises the task.

-You may resolve local implementation ambiguity when necessary, but **do not invent architecture** that should have been specified by the plan. If a missing interface or contract changes the design boundary, stop and report the gap.
+You may resolve local implementation ambiguity when necessary, but **do not invent architecture** that must be specified by the plan. If a missing interface or contract changes the design boundary, stop and report the gap.

 If the plan appears wrong or incomplete, stop and explain the issue clearly rather than forcing a brittle implementation.

@@ -27,12 +27,12 @@ If you are stuck after two serious attempts, stop and report what you tried and

 ## Escalation contract

-- Stay local: difficult implementation, careful cross-file reasoning, and bounded ambiguity that can be resolved without changing the plan's design boundary.
-- Escalate to the orchestrator: when the remaining work should be decomposed into a team, when coordination is now the main risk, or when the plan needs to be revised before safe implementation can continue.
+- Stay local: difficult implementation, careful cross-file reasoning, and bounded ambiguity that is resolvable without changing the plan's design boundary.
+- Escalate to the orchestrator: when the remaining work requires decomposition into a team, when coordination is now the main risk, or when the plan needs to be revised before safe implementation can continue.
 - Do not summon more seniors yourself. Re-decomposition is the orchestrator's responsibility.
 - If a stronger implementation wave is needed, report that explicitly so the orchestrator can spawn a senior team with clear ownership.

 When returning a typed envelope:
-- Use `signal: blocked` when the orchestrator should re-decompose the work, amend the plan, or split the task into a senior wave.
+- Use `signal: blocked` when the orchestrator must re-decompose the work, amend the plan, or split the task into a senior wave.
 - Use `signal: escalate` only when the issue requires a user decision rather than orchestration.
 - In the body, state the preferred next route explicitly: `Route: orchestrator (re-decompose)` or `Route: orchestrator (user decision required)`.
diff --git a/agents/worker.md b/agents/worker.md
index d0e4b93..5318d84 100644
--- a/agents/worker.md
+++ b/agents/worker.md
@@ -33,6 +33,6 @@ If this task is more complex than it appeared (more files involved, unclear inte
 - Do not silently turn a plan gap into a design decision.

 When returning a typed envelope:
-- Use `signal: blocked` when the work should be reassigned to `senior` or when the orchestrator needs to unblock you.
+- Use `signal: blocked` when the work must be reassigned to `senior` or when the orchestrator needs to unblock you.
 - Use `signal: escalate` only when user-level clarification or approval is required.
 - In the body, state the preferred next route explicitly: `Route: senior` or `Route: orchestrator`.
diff --git a/generate.sh b/generate.sh
index 5b8002a..5b7c080 100755
--- a/generate.sh
+++ b/generate.sh
@@ -400,6 +400,12 @@ map_effort() {
 map_sandbox_mode() {
   local permission_mode="$1"
   local tools="$2"
+  local override="${3:-}"
+
+  if [ -n "$override" ] && [ "$override" != "null" ]; then
+    echo "$override"
+    return
+  fi

   # plan mode is read-only
   if [ "$permission_mode" = "plan" ]; then
@@ -425,6 +431,12 @@ map_sandbox_mode() {
 # ---------------------------------------------------------------------------
 map_default_sandbox_mode() {
   local default_mode="$1"
+  local override="${2:-}"
+
+  if [ -n "$override" ] && [ "$override" != "null" ]; then
+    echo "$override"
+    return
+  fi

   case "$default_mode" in
     plan) echo "read-only" ;;
@@ -563,6 +575,7 @@ generate_codex() {
     [ -n "$agent_id" ] || continue

     local name description model effort permission_mode tools disallowed_tools
+    local codex_sandbox_override
     local agent_skills
     local src_file dst_file
     name="$(yq -r ".agents.items.${agent_id}.name" "$TEAM_YAML")"
@@ -572,6 +585,7 @@ generate_codex() {
     permission_mode="$(yq -r ".agents.items.${agent_id}.permission_mode // \"\"" "$TEAM_YAML")"
     tools="$(yq -r ".agents.items.${agent_id}.tools[]" "$TEAM_YAML" | csv_from_yaml_array)"
     disallowed_tools="$(yq -r ".agents.items.${agent_id}.disallowed_tools // [] | .[]" "$TEAM_YAML" | csv_from_yaml_array)"
+    codex_sandbox_override="$(yq -r '.targets.codex.sandbox_mode // ""' "$SETTINGS_SHARED_YAML")"
     agent_skills="$(yq -r ".agents.items.${agent_id}.skills[]" "$TEAM_YAML")"
     src_file="$SCRIPT_DIR/$(yq -r ".agents.items.${agent_id}.instruction_file" "$TEAM_YAML")"
     dst_file="$CODEX_AGENTS_DIR/${name}.toml"
@@ -580,7 +594,7 @@ generate_codex() {
     local codex_model codex_effort codex_sandbox
     codex_model="$(map_model "$model")"
     codex_effort="$(map_effort "${effort:-medium}")"
-    codex_sandbox="$(map_sandbox_mode "$permission_mode" "$tools")"
+    codex_sandbox="$(map_sandbox_mode "$permission_mode" "$tools" "$codex_sandbox_override")"

     # Extract and expand body with Codex variable values
     local body expanded_body
@@ -664,17 +678,19 @@ TOML
   echo ""
   echo "Generating codex/config.toml..."

-  local default_mode runtime_approval codex_approval_override codex_network_access
+  local default_mode runtime_approval codex_approval_override codex_network_access codex_sandbox_override
   default_mode="$(map_filesystem_intent_to_claude_mode "$(yq -r '.runtime.filesystem' "$SETTINGS_SHARED_YAML")")"
   runtime_approval="$(yq -r '.runtime.approval' "$SETTINGS_SHARED_YAML")"
+  codex_sandbox_override="$(yq -r '.targets.codex.sandbox_mode // ""' "$SETTINGS_SHARED_YAML")"
   codex_approval_override="$(yq -r '.targets.codex.approval_policy // ""' "$SETTINGS_SHARED_YAML")"
   codex_network_access="$(yq -r '.targets.codex.network_access // .runtime.network_access // false' "$SETTINGS_SHARED_YAML")"

   local config_sandbox config_approval
-  config_sandbox="$(map_default_sandbox_mode "$default_mode")"
+  config_sandbox="$(map_default_sandbox_mode "$default_mode" "$codex_sandbox_override")"
   config_approval="$(map_approval_policy "$runtime_approval" "$codex_approval_override")"

-  cat > "$CODEX_DIR/config.toml" < "$CODEX_DIR/config.toml" < "$CODEX_DIR/config.toml" < worker` when the task is no longer mechanical but still well-defined
 - `worker -> senior` when the task is implementable but needs stronger judgment or broader reasoning
 - `grunt` or `worker` -> orchestrator when the real issue is a plan gap, changed scope, or missing requirement
-- `senior -> orchestrator` when the work should be re-decomposed into a senior wave/team or the plan boundary must change
+- `senior -> orchestrator` when the work requires re-decomposition into a senior wave/team or when the plan boundary must change

 ### Step 6 — Review

@@ -146,7 +146,7 @@ Do not advance until both verdicts are collected.
 - **Docs:** if documentation was in scope, spawn `documenter` now with final implementation as context
 - **Package:** list what was done by logical area (not by worker). Include all file paths. Surface PASS WITH NOTES caveats as a brief "Heads up" section.

-Lead with the result. Don't expose worker IDs, wave counts, or internal mechanics. When subagent results return to your context, prefer concise summaries over verbatim output — the full detail is in the code, not the report.
+Lead with the result. Don't expose worker IDs, wave counts, or internal mechanics. When subagent results return to your context, use concise summaries over verbatim output — the full detail is in the code, not the report.

 ---

@@ -211,7 +211,7 @@ The actual write protection for read-only agents comes from `disallowedTools: Wr
 **Reviewer and auditor must be spawned in a single response.**
 **All researchers must be spawned in a single response.**

-Spawning agents sequentially when they could run in parallel is a protocol violation, not a style choice. Parallel dispatch reduces wall-clock latency proportionally — N agents in parallel complete in the time of the slowest, not the sum of all.
+Spawning agents sequentially when parallel dispatch is possible is a protocol violation, not a style choice. Parallel dispatch reduces wall-clock latency proportionally — N agents in parallel complete in the time of the slowest, not the sum of all.

 ### Git flow

diff --git a/spec/agent-runtime-v1.md b/spec/agent-runtime-v1.md
index a172ff8..4e6d275 100644
--- a/spec/agent-runtime-v1.md
+++ b/spec/agent-runtime-v1.md
@@ -55,6 +55,7 @@ Target blocks are escape hatches, not the main schema.
 Current target-specific fields:

 - `targets.claude.claude_md_excludes`
+- `targets.codex.sandbox_mode` (optional override of derived sandbox mode)
 - `targets.codex.approval_policy` (optional override of derived approval)
 - `targets.codex.network_access` (optional override of derived network access)

@@ -63,7 +64,7 @@ Authority rules:
 - `runtime.approval` and `runtime.network_access` are the portable source of truth.
 - Codex target fields exist for explicit compatibility overrides and should normally be omitted.
 - When Codex target fields are set, they intentionally override the derived Codex value.
-- In this repo, `targets.codex.approval_policy` and `targets.codex.network_access` are intentionally set so Codex runs with `approval_policy = "never"` and network enabled by default. This is a deliberate target-specific compatibility choice, not an accidental divergence.
+- In this repo, `targets.codex.sandbox_mode`, `targets.codex.approval_policy`, and `targets.codex.network_access` are intentionally set so Codex runs with `sandbox_mode = "danger-full-access"`, `approval_policy = "never"`, and network enabled by default. This is a deliberate target-specific compatibility choice, not an accidental divergence.

 ## Adapter rules

@@ -88,10 +89,11 @@ Lossiness:

 - `runtime.filesystem = read-only` -> `sandbox_mode = "read-only"`
 - `runtime.filesystem = workspace-write` -> `sandbox_mode = "workspace-write"`
+- `targets.codex.sandbox_mode` -> overrides the derived `sandbox_mode`
 - `runtime.approval = manual` -> `approval_policy = "on-request"` (unless overridden)
 - `runtime.approval = guarded-auto` -> `approval_policy = "untrusted"` (unless overridden)
 - `runtime.approval = full-auto` -> `approval_policy = "never"` (unless overridden)
-- `runtime.network_access` -> `[sandbox_workspace_write].network_access`
+- `runtime.network_access` -> `[sandbox_workspace_write].network_access` when `sandbox_mode = "workspace-write"`

 Lossiness:
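
Reviewer note: the Codex mapping rules listed in the `spec/agent-runtime-v1.md` hunk above can be read as a small lookup with an override short-circuit. The following is a minimal sketch of that reading, not the code in `generate.sh`. The function names `sketch_approval_policy` and `sketch_sandbox_mode` are hypothetical, and the fallback branches for unknown values are assumptions that the spec does not state.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: these names are illustrative only and are not taken
# from generate.sh, whose real mapping bodies are not fully shown in this patch.

# Map the portable runtime.approval intent to a Codex approval_policy.
# A non-empty, non-"null" override wins over the derived value.
sketch_approval_policy() {
  local runtime_approval="$1"
  local override="${2:-}"

  if [ -n "$override" ] && [ "$override" != "null" ]; then
    echo "$override"
    return
  fi

  case "$runtime_approval" in
    manual)       echo "on-request" ;;
    guarded-auto) echo "untrusted" ;;
    full-auto)    echo "never" ;;
    *)            echo "on-request" ;;  # assumption: unknown intents get the strictest policy
  esac
}

# Map runtime.filesystem to a Codex sandbox_mode, using the same override rule.
sketch_sandbox_mode() {
  local filesystem="$1"
  local override="${2:-}"

  if [ -n "$override" ] && [ "$override" != "null" ]; then
    echo "$override"
    return
  fi

  case "$filesystem" in
    read-only)       echo "read-only" ;;
    workspace-write) echo "workspace-write" ;;
    *)               echo "read-only" ;;  # assumption: unknown intents stay read-only
  esac
}

sketch_approval_policy guarded-auto ""            # prints: untrusted
sketch_approval_policy manual never               # prints: never
sketch_sandbox_mode read-only danger-full-access  # prints: danger-full-access
```

The last call shows the effect this patch introduces: once `targets.codex.sandbox_mode` is set, the portable `runtime.filesystem` intent no longer influences the Codex sandbox.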