Agent Orchestration Policy
Contents
- Core Principle
- Universal Task Pipeline
- Confidence Policy
- Low Confidence Escalation
- Debate Policy
- Parallel Execution Policy
- Anti-Hallucination Policy
- Verification Policy
- Per-Role Agent Files
This document defines orchestration policy. The file-backed mechanism that makes the policy observable and auditable — scoped work units, quality gates, evidence ledger, and resource tracking — is specified in
WORK_UNIT_ORCHESTRATION.md.
Core principle
The main AI agent is always the orchestrator, not just an implementer.
Main Agent = Orchestrator / Planner / Judge
Sub Agents = Investigator / Implementer / Reviewer / Verifier
Strongest Model = Final reasoning authority for high-risk decisions
Universal task pipeline
Receive prompt
↓
Clarify intent internally
↓
Identify assumptions and risks
↓
Estimate confidence
↓
If confidence < threshold → bounded investigation
↓
If still low → recommend next best action with evidence
↓
Plan
↓
Split tasks
↓
Run non-overlapping work in parallel
↓
Execute
↓
Verify
↓
Report result, evidence, uncertainty
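The confidence gate in the middle of the pipeline can be sketched as a bounded loop. This is a minimal illustration, not the project's implementation: `confidenceGate`, `GateResult`, and the `investigate` callback are hypothetical names, and the threshold and round limit are supplied by the caller.

```typescript
// Hypothetical sketch of the "estimate confidence → bounded investigation →
// recommend or plan" steps. None of these names come from the codebase.
type GateResult = {
  next: "plan" | "recommend";
  confidence: number;
  rounds: number;
};

function confidenceGate(
  initialConfidence: number,
  threshold: number,
  maxRounds: number,
  // Assumed callback: runs one investigation round and returns the new estimate.
  investigate: (round: number, current: number) => number,
): GateResult {
  let confidence = initialConfidence;
  let rounds = 0;
  // Investigate only while below threshold, and never more than maxRounds times.
  while (confidence < threshold && rounds < maxRounds) {
    rounds += 1;
    confidence = investigate(rounds, confidence);
  }
  // Still low after the bounded investigation: recommend instead of planning.
  const next = confidence >= threshold ? "plan" : "recommend";
  return { next, confidence, rounds };
}
```

The loop ends either when the threshold is reached or when the round budget is spent, so it cannot run indefinitely.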
Confidence policy
Do not treat confidence below 1.0 as a reason to keep investigating indefinitely. Perfect certainty is rare.
Use a threshold that matches the task's risk level:
Formatting / documentation: 0.70
Simple code change: 0.80
Feature implementation: 0.85
Architecture decision: 0.90
Security / auth / payment: 0.95
Production deployment: 0.95+
If confidence is below threshold, the orchestrator must investigate within limits.
Recommended limits:
Max investigation rounds: 3
Max debate rounds: 2
Max retries per failed command: 2
Default max parallel agents: 3
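The thresholds and limits above could be held as plain data. The sketch below copies the numbers from this section; the type name `RiskLevel`, the key names, and `meetsThreshold` are illustrative assumptions, not identifiers from the codebase.

```typescript
// Numeric values are taken from the policy text; all names are hypothetical.
type RiskLevel =
  | "formatting"
  | "simple-change"
  | "feature"
  | "architecture"
  | "security"
  | "production-deploy";

const CONFIDENCE_THRESHOLDS: Record<RiskLevel, number> = {
  formatting: 0.7,
  "simple-change": 0.8,
  feature: 0.85,
  architecture: 0.9,
  security: 0.95,
  // The policy says "0.95+"; 0.95 is used here as the floor.
  "production-deploy": 0.95,
};

const LIMITS = {
  maxInvestigationRounds: 3,
  maxDebateRounds: 2,
  maxRetriesPerFailedCommand: 2,
  defaultMaxParallelAgents: 3,
} as const;

// True when the estimate is at or above the threshold for that risk level.
function meetsThreshold(level: RiskLevel, confidence: number): boolean {
  return confidence >= CONFIDENCE_THRESHOLDS[level];
}
```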
Low confidence escalation
The orchestrator must not ask the user an open-ended question such as:
What should I do next?
Instead, it must recommend the next best action.
Required format:
Current confidence:
Evidence found:
Evidence missing:
Why confidence is low:
Recommended next action:
Reasoning:
Risk of proceeding:
Risk of not proceeding:
Verification plan:
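One way to keep escalations in this shape is to model the report as a type and render it from there. This is a sketch under assumptions: `EscalationReport` and `renderEscalation` are hypothetical names, and only the field labels come from the required format above.

```typescript
// Hypothetical type mirroring the nine labels of the required format.
interface EscalationReport {
  currentConfidence: number;
  evidenceFound: string[];
  evidenceMissing: string[];
  whyConfidenceIsLow: string;
  recommendedNextAction: string;
  reasoning: string;
  riskOfProceeding: string;
  riskOfNotProceeding: string;
  verificationPlan: string;
}

function renderEscalation(r: EscalationReport): string {
  // Join list-valued fields on one line; show "none" for an empty list.
  const list = (items: string[]): string =>
    items.length > 0 ? items.join("; ") : "none";
  return [
    `Current confidence: ${r.currentConfidence.toFixed(2)}`,
    `Evidence found: ${list(r.evidenceFound)}`,
    `Evidence missing: ${list(r.evidenceMissing)}`,
    `Why confidence is low: ${r.whyConfidenceIsLow}`,
    `Recommended next action: ${r.recommendedNextAction}`,
    `Reasoning: ${r.reasoning}`,
    `Risk of proceeding: ${r.riskOfProceeding}`,
    `Risk of not proceeding: ${r.riskOfNotProceeding}`,
    `Verification plan: ${r.verificationPlan}`,
  ].join("\n");
}
```

Because every field is required by the type, a report that omits a field fails to compile instead of reaching the user incomplete.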
Ask for approval only when the next action has side effects or elevated risk.
Approval is required before:
- installing dependencies
- running unknown scripts
- modifying CI/CD
- changing authentication or authorization
- changing payment, billing, or security logic
- deleting files
- pushing commits
- opening pull requests
- deploying
- enabling a new external skill
- granting network, filesystem, or credential access
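The approval list can be expressed as a lookup. The action identifiers below are hypothetical labels invented for this sketch; only the categories come from the list above. Treating an unlisted action as not needing approval is also an assumption of the sketch, and a stricter default may be preferable in practice.

```typescript
// Hypothetical identifiers, one per entry in the approval list.
const APPROVAL_REQUIRED: ReadonlySet<string> = new Set([
  "install-dependency",
  "run-unknown-script",
  "modify-ci-cd",
  "change-authn-authz",
  "change-payment-billing-security",
  "delete-file",
  "push-commit",
  "open-pull-request",
  "deploy",
  "enable-external-skill",
  "grant-access", // network, filesystem, or credential access
]);

// Returns true when the action is on the approval list.
// Unlisted actions return false (an assumption of this sketch).
function requiresApproval(action: string): boolean {
  return APPROVAL_REQUIRED.has(action);
}
```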
Debate policy
For complex or high-risk tasks, run a debate before execution.
Minimum roles:
Planner Agent
Domain Specialist Agent
Skeptic / Risk Reviewer Agent
Verifier Agent
Debate questions:
- What are we trying to achieve?
- What evidence do we have?
- What assumptions exist?
- What can go wrong?
- What alternatives exist?
- Which approach is safest and most maintainable?
- How will the result be verified?
Parallel execution policy
Parallel work is allowed only when scopes do not overlap.
Safe examples:
- Backend API analysis
- Frontend UI analysis
- Test coverage review
- Documentation review
Unsafe examples:
- Two agents editing the same service
- One agent refactoring while another adds features in the same files
- CI/CD changes without coordination
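A scope-overlap check along these lines could gate parallel dispatch. It is a minimal sketch that assumes each work unit declares its scope as repository-relative paths with forward slashes; `WorkUnit`, `scopesOverlap`, and `canRunInParallel` are hypothetical names.

```typescript
// Hypothetical shape: each unit lists the files or directories it may touch.
interface WorkUnit {
  agent: string;
  paths: string[];
}

// Two scopes overlap when any path in one equals, contains, or is contained
// by a path in the other. A trailing slash is added before comparing so that
// "src/api" does not match "src/api-v2".
function scopesOverlap(a: string[], b: string[]): boolean {
  const norm = (p: string): string => p.replace(/\/+$/, "") + "/";
  return a.some((x) =>
    b.some((y) => {
      const nx = norm(x);
      const ny = norm(y);
      return nx.startsWith(ny) || ny.startsWith(nx);
    }),
  );
}

// Parallel execution is allowed only if no pair of units overlaps.
function canRunInParallel(units: WorkUnit[]): boolean {
  for (let i = 0; i < units.length; i++) {
    for (let j = i + 1; j < units.length; j++) {
      if (scopesOverlap(units[i].paths, units[j].paths)) return false;
    }
  }
  return true;
}
```

For example, a unit scoped to `src/api` and another scoped to `src/ui` pass the check, while adding a third unit scoped to `src/api/users.ts` fails it.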
Anti-hallucination policy
Agents must not invent:
- APIs
- library behavior
- file contents
- business requirements
- user intentions
- test results
- performance results
- security guarantees
Every factual claim about the repository must be backed by:
- file path
- code reference
- command output
- test result
- documentation source
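The evidence requirement can be checked mechanically if claims carry their sources. The sketch below is illustrative only: `Claim`, `Evidence`, and `unbackedClaims` are hypothetical names, and the evidence kinds mirror the list above.

```typescript
// Evidence kinds taken from the list above; the type names are hypothetical.
type EvidenceKind =
  | "file-path"
  | "code-reference"
  | "command-output"
  | "test-result"
  | "documentation";

interface Evidence {
  kind: EvidenceKind;
  ref: string; // e.g. a path, a command and its output, or a doc title
}

interface Claim {
  statement: string;
  evidence: Evidence[];
}

// Returns the claims that have no evidence, or only blank references.
function unbackedClaims(claims: Claim[]): Claim[] {
  return claims.filter((c) => c.evidence.every((e) => e.ref.trim() === ""));
}
```

This only confirms that a reference is present. Whether the reference actually supports the statement still needs a reviewer or a verifier agent.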
Verification policy
Before marking a task complete, verify with appropriate checks:
- read diff
- run tests
- run lint
- run type check
- run build
- inspect generated files
- check acceptance criteria
- ask reviewer agent to inspect
Final report must include:
- what changed
- why it changed
- how it was verified
- what remains uncertain
- recommended next action
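A completeness check for the final report might look like the following. It is a sketch under assumptions: `FinalReport` and `missingReportFields` are hypothetical names, with one field per required item above.

```typescript
// Hypothetical type with one field per required report item.
interface FinalReport {
  whatChanged: string;
  whyItChanged: string;
  howItWasVerified: string;
  whatRemainsUncertain: string;
  recommendedNextAction: string;
}

const REQUIRED_FIELDS: (keyof FinalReport)[] = [
  "whatChanged",
  "whyItChanged",
  "howItWasVerified",
  "whatRemainsUncertain",
  "recommendedNextAction",
];

// Lists the required fields that are absent or contain only whitespace.
function missingReportFields(
  report: Partial<FinalReport>,
): (keyof FinalReport)[] {
  return REQUIRED_FIELDS.filter((field) => {
    const value = report[field];
    return value === undefined || value.trim() === "";
  });
}
```

A task would be marked complete only when the returned list is empty.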
Per-role agent files
Each engine expects per-role agent files in a different format, so the orchestrator
renders all three from one canonical role definition
(src/agents/role.ts → src/agents/render.ts → src/agents/role-templates.ts).
vf init --agents writes all three; vf init --engine <e> writes only the format
that engine reads.
.claude/agents/<role>.md # Claude Code (Markdown body + YAML frontmatter)
.codex/agents/<role>.toml # Codex CLI (TOML: name, model, prompt, tools)
.github/agents/<role>.md # Copilot CLI (Markdown: frontmatter + body)
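The mapping from engine to file location can be sketched as a single function. The paths come from the list above; the `Engine` identifiers and function names are hypothetical and may not match the values that `vf init --engine <e>` accepts or the code in render.ts.

```typescript
// Hypothetical engine identifiers; the real CLI values are not shown here.
type Engine = "claude" | "codex" | "copilot";

// Returns the per-role file path for one engine, relative to the repo root.
function agentFilePath(engine: Engine, role: string): string {
  switch (engine) {
    case "claude":
      return `.claude/agents/${role}.md`;
    case "codex":
      return `.codex/agents/${role}.toml`;
    case "copilot":
      return `.github/agents/${role}.md`;
  }
}

// Returns all three paths for a role, in a fixed engine order.
function allAgentFilePaths(role: string): string[] {
  const engines: Engine[] = ["claude", "codex", "copilot"];
  return engines.map((engine) => agentFilePath(engine, role));
}
```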
render.ts is the single source of truth for the three formats; it enforces
the role taxonomy (project-fit roles vs tool/tweak roles, see
src/skills/SKILL_TAXONOMY.md) and the cross-platform rule (path comparisons
with path.sep — see HOOKS_AND_GUARDRAILS.md). A role without a matching
render target is reported at init time rather than silently dropped. When
vf init runs and a per-role renderer is present, the per-role files are
written alongside the engine-level files; vf init --engine <e> writes only
the engine’s matching format.
Related: Work-Unit Orchestration · Security Model