Advisor
The advisor is an optional per-harness hook that lets a stronger model answer focused strategic questions during a trial.
How it works
Inside a harness run, an adapter can issue an AdvisorRequest whenever it hits a decision point:
@dataclass(frozen=True)
class AdvisorRequest:
goal: str # what the agent is trying to do overall
problem: str # the specific thing it is stuck on now
attempt: str | None = NoneThe advisor returns a structured response:
@dataclass(frozen=True)
class AdvisorResponse:
advice: str # strategic guidance
suggested_action: str # one concrete next step
confidence: float # 0.0 to 1.0
reasoning: str # why this suggestionThe advisor returns guidance only. It does not write output files or invoke tools. The agent decides what to do with the response.
Configuration
Advisor settings live in the harness config, not the task:
[advisor]
model = "au.anthropic.claude-sonnet-4-6" # the advising model
max_uses = 5 # calls allowed per trial
max_response_tokens = 500 # keep advice concise
context_window = 10 # recent turns to include
enabled = truemax_uses is a hard cap per trial.
Per-adapter context
Different adapters give the advisor different context. A tool-loop adapter might show recent tool calls and their results. An RLM adapter might show recent REPL commands and scratchpad notes. The AdvisorContextStrategy protocol lets each adapter curate exactly what the advisor sees:
class AdvisorContextStrategy(Protocol):
def build_advisor_context(
self,
request: AdvisorRequest,
conversation_state: Any,
) -> list[dict[str, str]]: ...This is why the same advisor model can sensibly coach both a REPL agent and a tool-loop agent: the adapter shapes the context to match its own execution model.
Usage tracking
Advisor calls are tracked separately from the main agent's token budget:
class AdvisorUsageStats(StrictModel):
calls_made: int
calls_remaining: int
advisor_input_tokens: int
advisor_output_tokens: int
advisor_cost_usd: floatReports can surface these alongside the base cost so readers can see the split: "this run cost USD 0.42 to execute, plus USD 0.18 in advisor calls".
Failure handling
If the advisor call fails (network error, provider timeout, parsing failure), the agent receives a safe fallback response:
AdvisorResponse(
advice="Advisor unavailable - proceed on your own judgement",
suggested_action="continue",
confidence=0.0,
reasoning="advisor call failed",
)A fallback response means the trial continues without useful guidance; the zero confidence indicates the advice should be ignored.
When to use it
The advisor pays off when the base model is fast and cheap but stumbles on strategic choices, such as a Haiku executor with Sonnet as the coach. Two signals that it is worth wiring up:
- Your agent consistently makes early wrong-direction choices that ripple into the rest of the trial
- The cheap model is competent at execution but chooses poorly; raising its ceiling matters more than raising its throughput
For a strong base model, the advisor is usually overkill. Spend the tokens on longer turn budgets. For a weak base model, the advisor can lift a task's success rate at a fraction of the cost of upgrading the executor.