Agent Loop
Layer 3. Multi-step orchestration. Source: src/agent/.
Purpose and responsibilities
- Drive a conversation through successive LLMClient.complete() / LLMClient.stream() calls until the model returns no tool calls (or stop() is called).
- Maintain ConversationHistory (messages + ContextRegistry + token estimates).
- Dispatch tool calls and feed results back as tool-result messages.
- Enforce the maxSteps cap, guardrail tripwires, permission policy, and human-in-the-loop approval gates.
- Emit structured run/step/tool hooks for observability and control.
- Provide dump() / AgentLoop.restore() for checkpoint/resume across process restarts.
Does NOT: own a network queue, call fetch directly, or hold LLM credentials.
All HTTP still flows through LLMClient’s EngineFetch.
Key types
src/agent/types.ts

    interface AgentTool {
      definition: Tool; // schema sent to the LLM
      execute: (args, ctx: ToolExecutionContext) => Promise<string | ContentPart[]>;
    }

    interface ToolExecutionContext {
      step: number;
      callId: string;
      signal: AbortSignal; // pre-wired to toolTimeout AbortController
      metrics: Map<string, { value: number | string | boolean; type: string }>;
      trace?: TraceContext;
    }

    interface AgentRunReport {
      id: string; // runId (UUID)
      model: string;
      startedAt: number;
      completedAt: number;
      totalMs: number;
      reason: 'done' | 'stopped' | 'error' | 'guardrail' | 'max_steps';
      userMessage: string | ContentPart[] | Message[];
      finalText: string;
      error?: string;
      steps: StepReport[];
      stepCount: number;
      toolCallCount: number;
      totalUsage: Usage;
      totalLlmTimeMs: number;
      totalToolTimeMs: number;
    }

    interface StepReport {
      index: number;
      type: 'initial' | 'tool_followup';
      llmLatencyMs: number;
      usage: Usage;
      finishReason: string;
      toolCalls: ToolCallReport[];
      toolTotalMs: number;
    }

    interface ToolCallReport {
      callId: string;
      toolName: string;
      arguments: Record<string, unknown>;
      resultSizeBytes: number;
      latencyMs: number;
      skipped: boolean;
      error: string | null;
      metrics: Record<string, { value: number | string | boolean; type: string }>;
    }

    type AgentStreamEvent =
      | { type: 'step_start'; step: number }
      | { type: 'text'; text: string }
      | { type: 'thinking'; text: string }
      | { type: 'tool_call_start'; step: number; callId: string; toolName: string; arguments: ... }
      | { type: 'tool_call_end'; step: number; callId: string; latencyMs: number }
      | { type: 'step_end'; step: number; usage: Usage; latencyMs: number }
      | { type: 'done'; response: CompletionResponse };

src/agent/loop-step-state.ts

    interface StepState {
      stepText: string;
      stepThinking: string;
      stepToolCalls: ToolCallPart[];
      toolCallAccum: Map<string, ToolCallAccumEntry>; // id → { id, name, args, _meta? }
      stepUsage: Usage;
      stepFinishReason: string;
    }

StepState is a fresh mutable accumulator per streaming step, passed through
pure helpers in src/agent/loop-internals.ts. The main loop body does not
hold closure state across streaming events.
src/agent/loop-config.ts
AgentLoopConfig accepts: client, system (string or thunk),
context, tools, history, hooks, maxTokens, temperature, thinking,
cache, parallelToolCalls, toolTimeout, maxSteps, guardrails,
policy, approve, checkpoint.
ConversationHistory (src/agent/history.ts)
    class ConversationHistory {
      readonly registry: ContextRegistry; // layered system-prompt composition
      id: string;
      length: number;
      system: string | undefined; // composed read / legacy-layer write

      append(message: Message, meta?: { model?, usage?, latencyMs? }): HistoryEntry
      messages(): Message[]
      estimatedTokens(): number
      recordActualUsage(inputTokens: number): void
      spliceRange(from: number, to: number, replacement: Message): HistoryEntry[]
      fork(newId?: string): ConversationHistory
      export(): HistorySnapshot
      static import(snapshot: HistorySnapshot): ConversationHistory
    }

Token estimation hybrid (estimatedTokens()):
- Keeps _lastActualTotal and _lastActualEntryIndex anchored to the last provider-reported inputTokens.
- For entries at indices > _lastActualEntryIndex: estimates each message via estimateTokens(content) (4 chars per token, with a 250-token penalty for images/audio/video).
- If no anchor yet: estimates everything from scratch, including system text.
- Anchor update: when append() receives an assistant message with meta.usage.inputTokens > 0, it sets _lastActualTotal = inputTokens and _lastActualEntryIndex = entries.length - 1 (the pre-append index).
- spliceRange resets the anchor (_lastActualTotal = 0, _lastActualEntryIndex = -1) when the replaced range includes or precedes the anchor index, because ContextGuard compaction invalidates the prior measurement.
spliceRange(from, to, replacement): ContextGuard’s entry point for
compacting history. Replaces entries [from, to) with a single synthetic
message; re-indexes all entries; resets the token anchor when the replaced range includes or precedes the anchor index.
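The anchored estimate and the splice reset can be sketched as below. This is a minimal illustration under stated assumptions, not the SDK's code: the Part and Msg shapes and the TokenAnchorSketch class are hypothetical, rounding up is assumed (the description only says 4 chars per token), and the anchor index is read literally as the index of the last entry before the assistant message is appended.

```typescript
// Hypothetical, simplified shapes; the real Message/ContentPart types may differ.
type Part = { type: 'text'; text: string } | { type: 'image' | 'audio' | 'video' };
type Msg = { role: 'user' | 'assistant' | 'tool'; content: string | Part[] };

const CHARS_PER_TOKEN = 4; // from the description above
const MEDIA_PENALTY = 250; // flat cost per image/audio/video part

// Rounding up is an assumption; the source only states "4 chars per token".
function estimateTokens(content: string | Part[]): number {
  if (typeof content === 'string') return Math.ceil(content.length / CHARS_PER_TOKEN);
  let total = 0;
  for (const part of content) {
    total += part.type === 'text'
      ? Math.ceil(part.text.length / CHARS_PER_TOKEN)
      : MEDIA_PENALTY;
  }
  return total;
}

class TokenAnchorSketch {
  private entries: Msg[] = [];
  private system: string | undefined;
  private lastActualTotal = 0;
  private lastActualEntryIndex = -1;

  constructor(system?: string) {
    this.system = system;
  }

  append(message: Msg, meta?: { usage?: { inputTokens: number } }): void {
    const preAppendIndex = this.entries.length - 1; // last entry covered by inputTokens
    this.entries.push(message);
    const actual = meta?.usage?.inputTokens ?? 0;
    if (message.role === 'assistant' && actual > 0) {
      this.lastActualTotal = actual;
      this.lastActualEntryIndex = preAppendIndex;
    }
  }

  estimatedTokens(): number {
    const anchored = this.lastActualTotal > 0;
    // With an anchor, trust the provider total; otherwise start from the system text.
    let total = anchored ? this.lastActualTotal : estimateTokens(this.system ?? '');
    for (const entry of this.entries.slice(this.lastActualEntryIndex + 1)) {
      total += estimateTokens(entry.content);
    }
    return total;
  }

  // Replace [from, to) with one message; drop the anchor if the range includes or precedes it.
  spliceRange(from: number, to: number, replacement: Msg): void {
    this.entries.splice(from, to - from, replacement);
    if (from <= this.lastActualEntryIndex) {
      this.lastActualTotal = 0;
      this.lastActualEntryIndex = -1;
    }
  }
}
```

Once an assistant message carries a real inputTokens value, only the entries after the anchor are estimated, so estimation error does not accumulate over a long conversation.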
fork(): deep-clones entries and registry layers without copying event
subscribers. Used by delegate.ts to hand an independent history copy to a
sub-agent.
export() / static import(): round-trip via HistorySnapshot (includes
entries, registry snapshot, metadata). import handles legacy snapshots that
have a flat system field instead of a registry.
ContextRegistry (src/agent/context-registry/registry.ts)
A layered key-value store for ContextLayer objects:

    interface ContextLayer {
      name: string;
      content: string | ContentPart[];
      priority: number; // lower = earlier in composed output
      tags: string[];
      owner: string;
      mergeParent?: boolean; // additive vs. override when parent has same name
      metadata?: Record<string, unknown>;
    }

flat({ tag?, includeParent? }) sorts layers by priority and concatenates
their text content with separator (default '\n\n'). The result is the
composed system prompt string.
Parent chain: setParent(parent) wires event bubbling — changes in the parent
are reflected in child renders. Cycle detection throws.
Named layers written by the SDK (src/agent/context-registry/layers.ts):
- agentloop.system: priority 10 (stable agent role, prompt-cache-friendly prefix)
- _legacy_system: priority 50 (backward-compat history.system setter)
- agentloop.context: priority 100 (dynamic per-task context)
The priority ordering ensures stable content precedes dynamic content, which is critical for Anthropic prompt caching (the breakpoint is placed after the stable prefix).
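A minimal sketch of the priority-ordered composition, assuming string-only content and ignoring parent chains. LayerSketch and flatSketch are hypothetical names, and tagging all three layers with 'system' is an assumption made for the example.

```typescript
// Reduced ContextLayer: only the fields this sketch needs, string content only.
interface LayerSketch {
  name: string;
  content: string;
  priority: number; // lower = earlier in composed output
  tags: string[];
}

function flatSketch(
  layers: LayerSketch[],
  opts: { tag?: string; separator?: string } = {},
): string {
  const tag = opts.tag;
  const separator = opts.separator ?? '\n\n';
  return layers
    .filter((layer) => tag === undefined || layer.tags.indexOf(tag) !== -1)
    .sort((a, b) => a.priority - b.priority) // filter() returned a copy, so the input is untouched
    .map((layer) => layer.content)
    .join(separator);
}

// Layers registered out of order still compose stable-first.
const exampleLayers: LayerSketch[] = [
  { name: 'agentloop.context', content: 'Task: summarize the report.', priority: 100, tags: ['system'] },
  { name: 'agentloop.system', content: 'You are a careful analyst.', priority: 10, tags: ['system'] },
  { name: '_legacy_system', content: 'Answer in English.', priority: 50, tags: ['system'] },
];

const composedPrompt = flatSketch(exampleLayers, { tag: 'system' });
```

Because the priority-10 layer always sorts first, the start of the composed string stays byte-identical across tasks, which is what a cache breakpoint placed after the stable prefix relies on.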
AgentLoop class (src/agent/loop.ts)
Construction
- Validate client, resolve _system (string) or _systemThunk (function).
- Initialize _tools: Map<string, AgentTool> keyed by toolKey(t): the tool definition's name for function tools (MCP tools use "namespace__name"), or the builtin type string for builtin tools.
- Create or rehydrate _history: ConversationHistory.
- writeAgentLoopSystem(registry, system, 'agent-loop') and writeAgentLoopContext(registry, context, 'agent-loop') to publish the initial layers.
- this.id = this._history.id: agent identity is the conversation ID.
- Emit onAgentCreate.
complete(input, options?) flow
    beginRun(input)
      guard _running flag; set _stopRequested=false; new AbortController
      if _systemThunk: await to get current system, update registry if changed
      mint runId + timestamps
      emit onRunStart
      append user input to _history

    while (stepCount < _maxSteps):
      if _stopRequested: reason='stopped'; break

      emit onStepStart

      composedSystem = history.registry.flat({ tag: 'system' })
      run input guardrails (all kind='input', in order)

      lastResponse = await client.complete(history.messages(), {
        system: composedSystem,
        tools: toolDefinitions(options),   // own tools merged with options.tools
        ctx: { ...options.ctx, conversationId: history.id },
        signal: options.signal ?? _abortController.signal,
        ...maxTokens, temperature, thinking, cache
      })

      history.append({ role: 'assistant', content: lastResponse.content }, { model, usage, latencyMs })

      emit onStepComplete
      run output guardrails (all kind='output', in order)

      if hasToolCalls:
        executeToolCalls(runId, stepCount, toolCalls, reports, trace)
          → returns ContentPart[] (tool results)
        history.append({ role: 'tool', content: toolResults })
        stepCount++
        continue
      else:
        break

    finalizeRun(...)
      push AgentRunReport to _reports
      emit onRunComplete
    return CompletionResponse (last step's content + totalUsage)

stepCount starts at 0. The maxSteps cap check happens AFTER executing tools, so
the check is: if (stepCount >= _maxSteps) { reason='max_steps'; break }.
Constant: DEFAULT_MAX_STEPS = 16 (src/agent/loop.ts:1272).
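The control flow above can be reduced to a small runnable sketch. It leaves out hooks, guardrails, history metadata, and abort handling, and every name in it (runLoopSketch, CompleteFn, ToolFn, the string transcript) is a hypothetical stand-in rather than the SDK's API.

```typescript
type ToolCall = { id: string; name: string; args: Record<string, unknown> };
type StepResponse = { text: string; toolCalls: ToolCall[] };
type Reason = 'done' | 'stopped' | 'max_steps';

// Hypothetical stand-ins for LLMClient.complete() and tool dispatch.
type CompleteFn = (transcript: string[]) => Promise<StepResponse>;
type ToolFn = (call: ToolCall) => Promise<string>;

interface LoopResult {
  reason: Reason;
  finalText: string;
  stepCount: number;
  transcript: string[];
}

async function runLoopSketch(
  complete: CompleteFn,
  runTool: ToolFn,
  userInput: string,
  maxSteps: number,
  shouldStop: () => boolean = () => false,
): Promise<LoopResult> {
  const transcript = [`user: ${userInput}`];
  let stepCount = 0;
  let reason: Reason = 'done';
  let finalText = '';

  while (stepCount < maxSteps) {
    if (shouldStop()) {
      reason = 'stopped';
      break;
    }

    const response = await complete(transcript);
    transcript.push(`assistant: ${response.text}`);
    finalText = response.text;

    // No tool calls means the model has finished.
    if (response.toolCalls.length === 0) break;

    for (const call of response.toolCalls) {
      const result = await runTool(call);
      transcript.push(`tool(${call.id}): ${result}`);
    }

    // stepCount only advances after tool execution; the cap is checked here.
    stepCount++;
    if (stepCount >= maxSteps) {
      reason = 'max_steps';
      break;
    }
  }

  return { reason, finalText, stepCount, transcript };
}
```

With a scripted client that requests one tool call and then answers in plain text, the sketch finishes with reason 'done' and stepCount 1, since only the tool-followup step increments the counter.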
stream(input, options?) flow
Identical structure to complete(), except:
- yield { type: 'step_start', step } at each step start.
- client.stream(...) is iterated; each StreamEvent is passed to accumulateStreamEvent(event, state) (loop-internals.ts), which updates StepState and returns an AgentStreamEvent | null to yield upstream.
- After the stream loop: finalizeUnendedToolCalls(state) and buildStepResponse(state, model, stepStart).
- Tool call events are yielded as tool_call_start and tool_call_end.
- yield { type: 'step_end', step, usage, latencyMs }.
- yield { type: 'done', response: finalResponse } at the end.
Stream event accumulation (src/agent/loop-internals.ts)
accumulateStreamEvent(event, state):
- 'text' → state.stepText += text; return { type: 'text', text }.
- 'thinking' → state.stepThinking += text.
- 'tool_call_start' → push { id, name, args: '' } to toolCallAccum.
- 'tool_call_delta' → look up accumulator by event.id; acc.args += arguments.
- 'tool_call_end' → find accumulator by event.id (fallback: first unfinished); parse acc.args as JSON (silent empty-object on parse failure); push to state.stepToolCalls.
- 'usage' → state.stepUsage = event.usage.
- 'done' → state.stepFinishReason = event.finishReason.
finalizeUnendedToolCalls(state): after the stream loop, iterates toolCallAccum
and pushes any accumulator entries not already in stepToolCalls. This handles
Anthropic/OpenAI streaming where tool_call_end may not always fire.
buildStepResponse(state, model, stepStart): builds a CompletionResponse
from accumulated stepText, stepThinking, stepToolCalls, stepUsage.
finishReason is 'tool_use' if there are any tool calls, else state.stepFinishReason.
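The accumulation rules can be illustrated with a reduced sketch. It covers text, tool-call, and done events only, omits thinking, usage, and the "first unfinished" fallback, and yields only text events upstream. The event shapes and the *Sketch names are assumptions made for illustration.

```typescript
type StreamEventSketch =
  | { type: 'text'; text: string }
  | { type: 'tool_call_start'; id: string; name: string }
  | { type: 'tool_call_delta'; id: string; arguments: string }
  | { type: 'tool_call_end'; id: string }
  | { type: 'done'; finishReason: string };

type Accum = { id: string; name: string; args: string };
type ParsedCall = { id: string; name: string; arguments: Record<string, unknown> };

interface StepStateSketch {
  stepText: string;
  stepToolCalls: ParsedCall[];
  toolCallAccum: Map<string, Accum>;
  stepFinishReason: string;
}

function makeStepStateSketch(): StepStateSketch {
  return { stepText: '', stepToolCalls: [], toolCallAccum: new Map(), stepFinishReason: '' };
}

// Malformed argument JSON silently becomes an empty object.
function parseArgs(raw: string): Record<string, unknown> {
  try {
    return JSON.parse(raw) as Record<string, unknown>;
  } catch {
    return {};
  }
}

function toParsed(acc: Accum): ParsedCall {
  return { id: acc.id, name: acc.name, arguments: parseArgs(acc.args) };
}

function accumulateSketch(
  ev: StreamEventSketch,
  state: StepStateSketch,
): { type: 'text'; text: string } | null {
  switch (ev.type) {
    case 'text':
      state.stepText += ev.text;
      return { type: 'text', text: ev.text };
    case 'tool_call_start':
      state.toolCallAccum.set(ev.id, { id: ev.id, name: ev.name, args: '' });
      return null;
    case 'tool_call_delta': {
      const acc = state.toolCallAccum.get(ev.id);
      if (acc) acc.args += ev.arguments;
      return null;
    }
    case 'tool_call_end': {
      const acc = state.toolCallAccum.get(ev.id);
      if (acc) state.stepToolCalls.push(toParsed(acc));
      return null;
    }
    case 'done':
      state.stepFinishReason = ev.finishReason;
      return null;
  }
}

// Flush accumulators whose tool_call_end never arrived.
function finalizeUnendedSketch(state: StepStateSketch): void {
  const finished = new Set(state.stepToolCalls.map((call) => call.id));
  state.toolCallAccum.forEach((acc) => {
    if (!finished.has(acc.id)) state.stepToolCalls.push(toParsed(acc));
  });
}

// Any tool call forces 'tool_use', regardless of what the provider reported.
function finishReasonSketch(state: StepStateSketch): string {
  return state.stepToolCalls.length > 0 ? 'tool_use' : state.stepFinishReason;
}
```

Keeping all mutation on the state object passed in is what lets the loop create a fresh accumulator per step without closure variables.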
Tool execution (src/agent/loop.ts:695)
    executeToolCalls(runId, step, toolCalls, reports, trace):
      if parallelToolCalls && toolCalls.length > 1:
        return Promise.all(toolCalls.map(executeSingleTool))
      else:
        sequential for-loop

executeSingleTool(tc, ...):
- Emit onToolCallStart (async). Hook can set skip=true or overrideResult.
- If skip: return buildSkippedResult(tc, overrideResult, reports).
- If overrideResult set: return buildOverriddenResult(tc, result, reports).
- lookupToolOrError(tc.name, _tools, ...):
  - Found: return { found: true, tool }.
  - Not found: emit onToolCallError + onWarning, push error report, return { found: false, errorResult } (error result fed back to model, not a throw).
- Permission policy check (_policy.check('agent', { kind: 'tool', toolName }, 'execute')):
  - decision.allow=true → proceed.
  - decision.ask=true → runApprovalGate(...).
  - decision.allow=false → buildDeniedResult(tc, reason, reports).
- executeWithTimeout(tool, tc, baseCtx, _toolTimeout):
  - Creates AbortController; sets setTimeout(abort, timeoutMs).
  - Calls tool.execute(tc.arguments, ctx).
  - clearTimeout in finally.
- On success: buildSuccessResult(...) → emit onToolCallComplete, push report.
- On error: handleToolError(...):
  - Emit onToolCallError. Hook may set continueOnError=false (throws) or fallbackResult.
  - Return { type: 'tool_result', id, content: errMsg, isError: true }.
Tool results are always fed back to the model as ContentPart items under
{ role: 'tool' } — even errors (the model can recover). Thrown errors from
continueOnError=false propagate out of the step loop and set reason='error'.
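The timeout wrapper and the error-to-result conversion can be sketched as follows. This assumes the tool cooperates with the abort signal, since the description does not say whether the loop also races the promise. ToolExecute, ToolResultSketch, and the isError: false field on success are assumptions, and hooks and reports are left out.

```typescript
type ToolExecute = (
  args: Record<string, unknown>,
  ctx: { signal: AbortSignal },
) => Promise<string>;

interface ToolResultSketch {
  type: 'tool_result';
  id: string;
  content: string;
  isError: boolean;
}

// Arm an abort timer, run the tool, and always clear the timer afterwards.
async function executeWithTimeoutSketch(
  execute: ToolExecute,
  args: Record<string, unknown>,
  timeoutMs: number,
): Promise<string> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await execute(args, { signal: controller.signal });
  } finally {
    clearTimeout(timer);
  }
}

// Failures become an error tool result for the model instead of a thrown exception.
async function runToolSketch(
  callId: string,
  execute: ToolExecute,
  args: Record<string, unknown>,
  timeoutMs: number,
): Promise<ToolResultSketch> {
  try {
    const content = await executeWithTimeoutSketch(execute, args, timeoutMs);
    return { type: 'tool_result', id: callId, content, isError: false };
  } catch (err) {
    const errMsg = err instanceof Error ? err.message : String(err);
    return { type: 'tool_result', id: callId, content: errMsg, isError: true };
  }
}
```

A tool that ignores ctx.signal would keep running past the deadline in this sketch, so long-running tools should listen for the abort event and reject promptly.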
Guardrails (src/agent/guardrail-types.ts)
    interface Guardrail {
      name: string;
      kind: 'input' | 'output';
      check(ctx: GuardrailCheckContext): Promise<GuardrailDecision>;
    }

    interface GuardrailDecision {
      pass: boolean;
      tripwire?: boolean; // if true AND !pass: halt the run
      reason?: string;
      severity?: 'low' | 'medium' | 'high' | 'critical';
    }

Input guardrails fire before each LLM call (runInputGuardrails); output
guardrails fire after each step response (runOutputGuardrails). First tripwire
stops the loop with reason='guardrail'; the trip reason becomes finalText.
onGuardrailTriggered is emitted for monitoring.
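A minimal sketch of the in-order, first-tripwire-wins evaluation. It assumes check() receives plain text rather than a GuardrailCheckContext, and it treats a failed check without tripwire as non-halting, which the description implies but does not spell out. The *Sketch names and the fallback reason string are hypothetical.

```typescript
interface GuardrailDecisionSketch {
  pass: boolean;
  tripwire?: boolean;
  reason?: string;
}

interface GuardrailSketch {
  name: string;
  kind: 'input' | 'output';
  check(text: string): Promise<GuardrailDecisionSketch>;
}

type GuardrailOutcome =
  | { tripped: false }
  | { tripped: true; name: string; reason: string };

// Run guardrails of one kind in declaration order; stop at the first tripwire.
async function runGuardrailsSketch(
  guardrails: GuardrailSketch[],
  kind: 'input' | 'output',
  text: string,
): Promise<GuardrailOutcome> {
  for (const guardrail of guardrails) {
    if (guardrail.kind !== kind) continue;
    const decision = await guardrail.check(text);
    if (!decision.pass && decision.tripwire) {
      return {
        tripped: true,
        name: guardrail.name,
        reason: decision.reason ?? `guardrail ${guardrail.name} tripped`,
      };
    }
  }
  return { tripped: false };
}

// Example guardrail: halt the run when the input appears to contain a credential.
const noSecrets: GuardrailSketch = {
  name: 'no-secrets',
  kind: 'input',
  check: async (text) =>
    text.indexOf('password') === -1
      ? { pass: true }
      : { pass: false, tripwire: true, reason: 'secret detected' },
};
```

In the loop, a tripped outcome would end the run with reason='guardrail' and use the returned reason as finalText.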
Human-in-the-loop approval (src/agent/loop.ts:843)
runApprovalGate(tc, ...):
- Build ApprovalRequest and PendingToolCall; push to _pendingToolCalls.
- Emit onApprovalRequested.
- If _checkpoint != null: _checkpoint.set('agent-loop:' + id, dump()), which persists state at the suspension point.
- Await resolveApproval(req):
  - Check _prefedApprovals.get(req.callId) (consumed once; this is the resume path).
  - Fall back to _approve(req).
  - If no approver configured: default to 'deny'.
- Remove from _pendingToolCalls. Emit onApprovalResolved.
- decision.decision:
  - 'approve' → execute the tool normally (or use overrideResult).
  - 'skip' → buildSkippedResult.
  - 'deny' → buildDeniedResult.
Resume after restart:
- AgentLoop.restore(snapshot, config): rehydrates from AgentLoopSnapshot.
- agent.resumeWithApproval(callId, decision): removes from _pendingToolCalls, stores in _prefedApprovals.
- Re-run agent.complete(...) with an approver that returns the pre-fed decision.
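The resolution order (pre-fed decision, then approver, then deny) can be sketched as below. The decision is simplified to a bare string where the description uses a decision object with a decision field, and ApprovalResolverSketch with its members is a hypothetical stand-in for the private state on AgentLoop.

```typescript
type Decision = 'approve' | 'skip' | 'deny';

interface ApprovalRequestSketch {
  callId: string;
  toolName: string;
}

type Approver = (req: ApprovalRequestSketch) => Promise<Decision>;

class ApprovalResolverSketch {
  private prefed = new Map<string, Decision>();
  private approver: Approver | undefined;

  constructor(approver?: Approver) {
    this.approver = approver;
  }

  // Resume path: store a decision made while the process was suspended.
  resumeWithApproval(callId: string, decision: Decision): void {
    this.prefed.set(callId, decision);
  }

  async resolveApproval(req: ApprovalRequestSketch): Promise<Decision> {
    const prefed = this.prefed.get(req.callId);
    if (prefed !== undefined) {
      this.prefed.delete(req.callId); // consumed once
      return prefed;
    }
    if (this.approver) return this.approver(req);
    return 'deny'; // no approver configured
  }
}
```

Deleting the pre-fed entry on first read means a later call with the same ID goes through the normal approver path instead of silently reusing an old decision.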
Dump / restore (src/agent/loop.ts:1145)
dump() produces AgentLoopSnapshot (version 1):

    interface AgentLoopSnapshot {
      version: 1;
      system: string;
      context: string;
      history: HistorySnapshot;
      toolNames: string[];
      reports: AgentRunReport[];
      metadata: Record<string, unknown>;
      createdAt: number;
      savedAt: number;
      pendingToolCalls?: PendingToolCall[];
    }

AgentLoop.restore(snapshot, config):
- Constructs a new AgentLoop from the snapshot's system, context, history.
- Restores _reports, _metadata, _pendingToolCalls.
- Emits onWarning for tools present in the snapshot but absent from config.tools (code: 'tool_removed') and vice versa (code: 'tool_added').
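The tool comparison performed on restore amounts to a two-way set difference. The sketch below assumes warnings carry only a code and a tool name; ToolWarning and diffToolNames are hypothetical names, and the real onWarning payload may include more fields.

```typescript
type ToolWarning = { code: 'tool_removed' | 'tool_added'; toolName: string };

// Compare the tool names stored in a snapshot against the tools in the new config.
function diffToolNames(snapshotNames: string[], configNames: string[]): ToolWarning[] {
  const inSnapshot = new Set(snapshotNames);
  const inConfig = new Set(configNames);
  const warnings: ToolWarning[] = [];

  for (const toolName of snapshotNames) {
    if (!inConfig.has(toolName)) warnings.push({ code: 'tool_removed', toolName });
  }
  for (const toolName of configNames) {
    if (!inSnapshot.has(toolName)) warnings.push({ code: 'tool_added', toolName });
  }
  return warnings;
}

// Snapshot had search + calc; the restored config offers calc + fetch.
const restoreWarnings = diffToolNames(['search', 'calc'], ['calc', 'fetch']);
```

Reporting drift as warnings rather than errors lets a run resume after a deployment that changed the tool set, while still surfacing the difference to operators.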
Delegate, chain, consolidate helpers
src/helpers/delegate.ts, chain.ts, consolidate.ts are orchestration
helpers for multi-agent patterns. They are NOT part of AgentLoop itself.
- delegate: creates a sub-agent, hands it a forked history, returns its result.
- chain: sequences agents; each receives the prior's output in context.
- consolidate: aggregates results from parallel runs into one summary.
Extension points
- Adding tools: agent.addTool(tool) / AgentLoopConfig.tools. Tool is keyed by toolKey(t) (from src/agent/tool-key.ts).
- Guardrails: implement the Guardrail interface; pass in config.guardrails.
- Permission policy: implement PermissionPolicy from src/plugins/permissions/policy.ts; pass in config.policy.
- System prompt layers: write named layers to agent.history.registry from any plugin that has access to it.
- Hooks: subscribe on agent.hooks (HookBus) to agent events, tool events, and all network/LLM events from the shared bus.
Key invariants
- The loop appends messages in strict alternating order: user → assistant → [tool] → assistant → [tool] → ... Out-of-order messages would confuse providers.
- stepCount starts at 0 and only increments AFTER tool execution. The check if (stepCount >= _maxSteps) is evaluated at the BOTTOM of the loop (after tool dispatch), not at the top, so a run always executes at least one LLM step even with maxSteps = 1.
- stop() sets _stopRequested = true and calls _abortController.abort(). The loop exits after the CURRENT step completes (not mid-stream).
- StepState is created fresh per step (makeStepState()); no closure state leaks between steps in the streaming path.
- Tool-call IDs flow from ToolCallPart.id (assistant message) to ToolResultPart.id (tool message). Providers match them by ID.
- The loop never calls the network directly; it calls client.complete() or client.stream(), which route through EngineFetch.
- beginRun throws 'AgentLoop is already running' if _running is true, preventing concurrent runs on the same instance.