Internal Tools
Source: src/plugins/internal-tools/.
Purpose and responsibilities
A self-contained tool catalog and execution runtime, separate from the AgentTool type
used by AgentLoop. Internal tools are first-class, versioned, namespaced callable units
stored in pluggable backends.
Responsibilities:
- Discover and search tools across multiple backends (local in-memory registry, remote catalog, etc.).
- Select the best model for a tool call based on benchmark-derived compatibility data (CompatFile), with fallback to the tool-declared modelPreference.
- Execute tools through InternalToolRunner, which pools LLMClient instances (per provider by default, per model when a dedicated client is required), injects them into the tool’s execution context, validates input/output schemas, and emits observability hooks.
- Provide defineLLMTool and defineTemplateTool factory functions for building LLM-backed tools declaratively without writing execute logic manually.
Does NOT replace AgentTool. Internal tools are for SDK-internal or application-defined
operations that need their own LLM model selection and cost tracking, not for exposing
external tool schemas to the model during a chat loop.
Key types (src/plugins/internal-tools/types.ts)
    interface InternalTool {
      id: string;                    // "namespace:name@semver" format (see id.ts)
      namespace: string;
      name: string;
      version: string;
      description: string;
      inputSchema: JsonSchema;
      outputSchema?: JsonSchema;
      execute: (input: unknown, ctx: InternalToolContext) => Promise<unknown>;
      modelPreference?: ModelPreference;
      recommendedThreshold?: number; // min avg benchmark score for compat; default 100
      signature?: string;
      signedBy?: string;
      tags?: string[];
    }

    interface ModelPreference {
      preferredModel?: string;       // "provider/model" string
      fallbackModels?: string[];
      maxTokens?: number;
      temperature?: number;
    }

    interface InternalToolContext {
      hooks?: HookBus;
      client?: unknown;              // LLMClient pinned to chosen model (runner provides)
      modelId?: string;              // "provider/model" (runner provides)
      toolId?: string;
      counter?: TokenCounter;        // HybridTokenCounter (runner provides)
      recordLLMResponse?: (response: CompletionResponse) => void;
      [key: string]: unknown;        // open bag for tool-specific extras
    }

    interface ToolBackend {
      readonly name: string;
      list(): Promise<InternalTool[]>;
      get(id: string): Promise<InternalTool | null>;
    }

    type CompatFile = Record<string, { recommended: string[] }>;

InternalToolContext.recordLLMResponse is called by tool implementations (e.g.
defineLLMTool) after each internal LLM call. The runner captures response.usage so
cost tracking and benchmark metrics cover the tool’s own LLM spend, not just the outer
agent call.
Tool ID format (src/plugins/internal-tools/id.ts)
Format: "namespace:name@semver", e.g. "orxa:summarize@1.0.0".
Regex: /^([a-z0-9_-]+):([a-z0-9_-]+)@([0-9]+\.[0-9]+\.[0-9]+)$/
Functions:
- parseToolId(id) → ParsedToolId: throws on invalid format.
- tryParseToolId(id) → ParsedToolId | null: swallows errors.
- formatToolId(namespace, name, version): validates each component separately before concatenating.
- matchesVersion(requested, actual): exact equality only (no semver range resolution).
- idWithoutVersion(id) → "namespace:name": strips the @version suffix.
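The id helpers can be sketched from the regex quoted above. This is a minimal illustration, not the contents of id.ts: the error message, the field names of ParsedToolId, and the behavior of idWithoutVersion on ids without an @ are assumptions.

```typescript
// Assumed shape of ParsedToolId; the real type lives in id.ts.
interface ParsedToolId {
  namespace: string;
  name: string;
  version: string;
}

// Regex quoted in this section.
const TOOL_ID_RE = /^([a-z0-9_-]+):([a-z0-9_-]+)@([0-9]+\.[0-9]+\.[0-9]+)$/;

function parseToolId(id: string): ParsedToolId {
  const match = TOOL_ID_RE.exec(id);
  if (!match) {
    throw new Error(`Invalid tool id: ${id}`); // message text is an assumption
  }
  // Capture groups are always present on a match; the defaults only satisfy strict index checks.
  const [, namespace = '', name = '', version = ''] = match;
  return { namespace, name, version };
}

function tryParseToolId(id: string): ParsedToolId | null {
  try {
    return parseToolId(id);
  } catch {
    return null; // swallow the parse error
  }
}

function matchesVersion(requested: string, actual: string): boolean {
  return requested === actual; // exact equality, no semver ranges
}

function idWithoutVersion(id: string): string {
  const at = id.lastIndexOf('@');
  return at === -1 ? id : id.slice(0, at); // assumption: ids without '@' pass through unchanged
}
```

With this sketch, tryParseToolId('orxa:summarize') returns null because the version suffix is missing, which mirrors the exact-match rule described for registry lookups.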
ToolRegistry (src/plugins/internal-tools/registry.ts)
Manages a list of ToolBackend instances with an invalidatable flat cache.

    class ToolRegistry {
      addBackend(backend: ToolBackend): this   // throws on duplicate name; invalidates cache
      removeBackend(name: string): boolean     // invalidates cache
      invalidate(): void
      async get(id: string): Promise<InternalTool | null>
      async list(): Promise<InternalTool[]>
      async find(filter: ToolFilter, catalog?: ModelCatalog): Promise<InternalTool[]>
      async search(query: string, opts?: SearchOptions): Promise<InternalTool[]>
      modelsFor(toolId: string, opts?: { minScore?: number; catalog?: ModelCatalog }): string[]
    }

Cache: private cache: Map<string, InternalTool> | null. ensureCache() builds the
flat map by iterating backends in registration order. First-added backend wins on id
conflicts — if two backends export a tool with the same id, only the first is visible.
Cache is atomically replaced after rebuild (not updated in place).
Search scoring (scoreMatch, private method): exact name = 100, name prefix = 80,
name substring = 60, exact tag = 50, tag substring = 40, description substring = 20.
No embedding-based semantic search. Results are sorted by score descending and capped by
opts.limit (default 20).
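The scoring tiers can be sketched as follows. The weights come from the paragraph above; everything else is an assumption made for illustration: case-insensitive comparison, returning only the highest matching tier, dropping zero-score tools, and the SearchableTool and search names.

```typescript
// Trimmed, hypothetical view of a tool with only the fields scoring needs.
interface SearchableTool {
  name: string;
  description: string;
  tags?: string[];
}

// Returns the highest matching tier (assumption: tiers are not summed).
function scoreMatch(tool: SearchableTool, query: string): number {
  const q = query.toLowerCase();
  const toolName = tool.name.toLowerCase();
  const tags = (tool.tags ?? []).map((tag) => tag.toLowerCase());

  if (toolName === q) return 100;                              // exact name
  if (toolName.startsWith(q)) return 80;                       // name prefix
  if (toolName.includes(q)) return 60;                         // name substring
  if (tags.includes(q)) return 50;                             // exact tag
  if (tags.some((tag) => tag.includes(q))) return 40;          // tag substring
  if (tool.description.toLowerCase().includes(q)) return 20;   // description substring
  return 0;
}

// Sort by score descending and cap at `limit` (default 20, as documented).
function search<T extends SearchableTool>(tools: T[], query: string, limit = 20): T[] {
  return tools
    .map((tool) => ({ tool, score: scoreMatch(tool, query) }))
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((entry) => entry.tool);
}
```

Because matching is purely lexical, a query such as "condense" will not find a tool named summarize unless the word appears in its tags or description.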
find filter fields (ToolFilter):
- namespace: exact match on tool.namespace.
- prefix: tool.id.startsWith(prefix).
- tag: tool.tags?.includes(tag).
- model: checks catalog.get(provider, model)?.toolCompat?.[tool.id].score >= minScore (default minScore 0.8).
modelsFor(toolId, opts): iterates catalog.list(), reads
(info as { toolCompat? }).toolCompat?.[toolId].score, returns "${provider}/${model}"
strings scoring at or above minScore.
LocalToolBackend (src/plugins/internal-tools/backends/local.ts)
In-memory Map<string, InternalTool>. The primary backend for statically registered
tools. Built-in tools are registered via LocalToolBackend in
src/plugins/internal-tools/builtin/builtin.ts.
InternalToolRunner (src/plugins/internal-tools/runner/runner.ts)
    interface InternalToolRunnerConfig {
      hooks: HookBus;
      registry: ToolRegistry;
      catalog?: ModelCatalog;
      engine?: EngineHandle;                 // required for LLM-backed tools
      apiKeys: Partial<Record<ProviderName, string>>;
      defaultModel?: string;
      compat?: CompatFile;                   // benchmark-derived recommended chains
      clientOptions?: Partial<LLMClientConfig>;
      counter?: TokenCounter;                // auto-created from catalog when absent
    }

Model resolution (resolveModels, private)
Precedence order, deduped with Set:
- compat?.[tool.id]?.recommended: benchmark-derived, cheapest-first ordering.
- tool.modelPreference?.preferredModel.
- tool.modelPreference?.fallbackModels.
- config.defaultModel: only when all other sources are empty.
modelId strings must be "provider/model" format. parseModelId splits on the first
/ and throws when not found or at position 0.
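A minimal sketch of the precedence order and the id split, assuming trimmed-down types. ToolLike, the standalone function signatures, and the error message are hypothetical; the real methods are private to the runner.

```typescript
type CompatFile = Record<string, { recommended: string[] }>;

interface ModelPreference {
  preferredModel?: string;
  fallbackModels?: string[];
}

// Hypothetical trimmed view of InternalTool.
interface ToolLike {
  id: string;
  modelPreference?: ModelPreference;
}

function resolveModels(tool: ToolLike, compat?: CompatFile, defaultModel?: string): string[] {
  const ordered = new Set<string>(); // a Set dedupes while keeping insertion order

  for (const id of compat?.[tool.id]?.recommended ?? []) ordered.add(id);
  const preferred = tool.modelPreference?.preferredModel;
  if (preferred) ordered.add(preferred);
  for (const id of tool.modelPreference?.fallbackModels ?? []) ordered.add(id);

  // The default model is used only when every other source is empty.
  if (ordered.size === 0 && defaultModel) ordered.add(defaultModel);
  return Array.from(ordered);
}

function parseModelId(modelId: string): { provider: string; model: string } {
  const slash = modelId.indexOf('/');
  if (slash <= 0) {
    // Covers both "no slash" (-1) and "slash at position 0".
    throw new Error(`Invalid model id: ${modelId}`); // message text is an assumption
  }
  return { provider: modelId.slice(0, slash), model: modelId.slice(slash + 1) };
}
```

Splitting on the first slash means a model name may itself contain slashes: "acme/family/small" yields provider "acme" and model "family/small". The provider and model names here are placeholders.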
Key availability check (assertKeyAvailability, private)
Runs BEFORE execution (after tool lookup and model resolution). Collects the provider set from all model ids in
the resolved list, intersects with config.apiKeys, and throws when no usable provider
exists. This prevents silently trying every model only to fail on the last one.
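The check amounts to a set intersection. The sketch below is an assumption-laden illustration: the standalone signature, the error message, and the early return for an empty model list (non-LLM tools) are not taken from the source.

```typescript
function assertKeyAvailability(
  modelIds: string[],
  apiKeys: Partial<Record<string, string>>,
): void {
  // Assumption: an empty list means a non-LLM tool, so there is nothing to check.
  if (modelIds.length === 0) return;

  // Model ids are assumed to be validated "provider/model" strings at this point.
  const providers = Array.from(new Set(modelIds.map((id) => id.slice(0, id.indexOf('/')))));
  const usable = providers.filter((provider) => Boolean(apiKeys[provider]));

  if (usable.length === 0) {
    throw new Error(`No API key configured for any of: ${providers.join(', ')}`);
  }
}
```

Failing here gives one clear error up front instead of a chain of per-model failures.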
Client pooling (getClient, private)
Pool key is provider by default. When catalog.get(provider, model)?.requiresDedicatedClient
is true, the pool key is "provider/model" to prevent sharing clients across models that
need isolated configuration. Clients are built via createLLM({ engine, provider, model, apiKey, hooks, ...clientOptions }). The engine dependency means every pooled client’s
HTTP flows through the NetworkEngine queue.
destroy() calls client.destroy() on all pooled clients and clears the map.
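The pooling rule can be illustrated with a small generic pool. ClientPool, CatalogLike, and the build callback are hypothetical stand-ins; the real runner builds clients through createLLM and keeps the map privately.

```typescript
interface CatalogEntry {
  requiresDedicatedClient?: boolean;
}

// Hypothetical minimal view of ModelCatalog.
interface CatalogLike {
  get(provider: string, model: string): CatalogEntry | undefined;
}

interface Destroyable {
  destroy(): void;
}

class ClientPool<C extends Destroyable> {
  private readonly clients = new Map<string, C>();
  private readonly build: (provider: string, model: string) => C;
  private readonly catalog: CatalogLike | undefined;

  constructor(build: (provider: string, model: string) => C, catalog?: CatalogLike) {
    this.build = build;
    this.catalog = catalog;
  }

  getClient(provider: string, model: string): C {
    // Share one client per provider unless the catalog asks for isolation.
    const dedicated = this.catalog?.get(provider, model)?.requiresDedicatedClient === true;
    const key = dedicated ? `${provider}/${model}` : provider;

    let client = this.clients.get(key);
    if (!client) {
      client = this.build(provider, model);
      this.clients.set(key, client);
    }
    return client;
  }

  destroy(): void {
    this.clients.forEach((client) => client.destroy());
    this.clients.clear();
  }
}
```

One consequence of the provider-level key is that the first model requested for a provider determines how the shared client is constructed, so per-model settings belong behind requiresDedicatedClient.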
Execution lifecycle
Non-LLM tools (when resolveModels returns empty): executeNonLLM. Single attempt,
no model resolution, ctx.client is undefined. Emits onInternalToolCallStart (with
chosenModel: '') then onInternalToolCallComplete or onInternalToolCallError.
LLM-backed tools: executeLLM. Iterates the model list:
- Skip models whose provider has no API key.
- Get or create the pooled client.
- Emit onInternalToolCallStart { toolId, input, chosenModel, attempt }.
- Build InternalToolContext with client, modelId, counter, recordLLMResponse callback.
- Call tool.execute(input, ctx).
- Validate output against tool.outputSchema (validateOutput, private): emits onWarning with code 'output_schema_mismatch' on type mismatch (does not throw).
- Emit onInternalToolCallComplete { toolId, input, output, chosenModel, latencyMs, attempts, usage? }.
- On failure: collect the error, emit onInternalToolCallError, emit onWarning with code 'internal_tool_fallback' if more models remain, and continue to the next model.
If all models fail, throws "Tool ${toolId} failed on all N model(s): ${summary}" where
summary lists model: error pairs.
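The fallback loop can be sketched as below. It keeps only the control flow: hook emission is reduced to two hypothetical callbacks, client pooling and output validation are omitted, and the "; " separator in the summary and the use of the attempted-model count for N are assumptions.

```typescript
type ExecuteOnModel = (input: unknown, modelId: string) => Promise<unknown>;

// Hypothetical stand-ins for onInternalToolCallError and the 'internal_tool_fallback' warning.
interface FallbackEvents {
  onError(modelId: string, error: Error): void;
  onFallbackWarning(failedModel: string): void;
}

async function runWithFallback(
  toolId: string,
  input: unknown,
  models: string[],
  apiKeys: Partial<Record<string, string>>,
  execute: ExecuteOnModel,
  events: FallbackEvents,
): Promise<unknown> {
  const failures: string[] = [];
  let position = 0;

  for (const modelId of models) {
    position += 1;
    const provider = modelId.slice(0, modelId.indexOf('/'));
    if (!apiKeys[provider]) continue; // skip models whose provider has no key

    try {
      return await execute(input, modelId); // first success wins
    } catch (err) {
      const error = err instanceof Error ? err : new Error(String(err));
      failures.push(`${modelId}: ${error.message}`);
      events.onError(modelId, error);
      if (position < models.length) events.onFallbackWarning(modelId); // more models remain
    }
  }

  throw new Error(
    `Tool ${toolId} failed on all ${failures.length} model(s): ${failures.join('; ')}`,
  );
}
```

Since the list is ordered cheapest-first when it comes from compat, a transient failure on the first model escalates to a more expensive one rather than retrying the same model.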
Input validation (validateInput, private) checks tool.inputSchema.type === 'object',
verifies the input is an object (not array or primitive), and asserts all required fields
are present. Throws before execute is called.
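A sketch of the three input checks, under stated assumptions: a schema whose type is not 'object' is treated as an error, a required field counts as present when the key exists on the input, and the error messages are invented for illustration.

```typescript
// Trimmed, hypothetical view of JsonSchema with only the fields this check reads.
interface ObjectSchemaLike {
  type?: string;
  required?: string[];
}

function validateInput(schema: ObjectSchemaLike, input: unknown): Record<string, unknown> {
  if (schema.type !== 'object') {
    throw new Error("inputSchema.type must be 'object'"); // assumption: non-object schemas are rejected
  }
  if (typeof input !== 'object' || input === null || Array.isArray(input)) {
    throw new Error('Input must be an object, not an array or primitive');
  }

  const record = input as Record<string, unknown>;
  const missing = (schema.required ?? []).filter((key) => !(key in record));
  if (missing.length > 0) {
    throw new Error(`Missing required field(s): ${missing.join(', ')}`);
  }
  return record;
}
```

Note that this level of checking does not look at property types; a required field holding the wrong type still reaches execute.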
defineLLMTool (src/plugins/internal-tools/runner/define.ts)
Converts a declarative LLMToolDefinition into an InternalTool with a generated
execute function.
    interface LLMToolDefinition {
      id: string;
      namespace: string;
      name: string;
      version: string;
      description: string;
      inputSchema: JsonSchema;
      outputSchema?: JsonSchema;
      systemPrompt?: string;
      userTemplate?: string;
      outputFormat?: 'text' | 'json';
      prepareInput?: (input: Record<string, unknown>) => Record<string, unknown>;
      resolveMaxTokens?: (input, ctx: ResolveMaxTokensContext) => number;
      outputExample?: unknown;
      variants?: PromptVariant[];
      modelPreference: ModelPreference;
      recommendedThreshold?: number;
      tags?: string[];
    }

execute logic:
- Extract client and modelId from ctx; throws when absent (runner must provide).
- Split modelId into provider and modelName on the first /.
- Apply schema defaults (applySchemaDefaults, private function): fills in default values from inputSchema.properties for missing fields.
- Call def.prepareInput?.(vars) to transform the input.
- Select variant via selectVariant(variants, { provider, model, mode }).
- Render systemPrompt and userTemplate via renderTemplate from src/plugins/internal-tools/runner/template.ts.
- If outputFormat === 'json': call composeJsonSystemPrompt(withStructure) to prepend JSON_API_SYSTEM_PROMPT (src/plugins/internal-tools/runner/json-enforcement.ts).
- If resolveMaxTokens is set: call it with (vars, { provider, model, counter }).
- Call client.complete([{ role: 'user', content: user }], { system, maxTokens, temperature }).
- Call ctx.recordLLMResponse?.(response).
- If outputFormat === 'json': parse via parseJsonWithFences (src/plugins/internal-tools/runner/template.ts), which strips markdown code fences before JSON.parse. Throws with the raw output on parse failure.
- Return response.text for outputFormat === 'text'.
The tool has Symbol.for('orxa:llm_tool_def') set on it (constant LLM_DEF_KEY). Use
getLLMToolDefinition(tool) to retrieve the original LLMToolDefinition from a tool
instance.
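The schema-defaults step is the easiest part of this pipeline to show in isolation. The sketch assumes that a field counts as missing when its value is undefined and that the input object is copied rather than mutated; neither detail is confirmed by the source.

```typescript
// Trimmed, hypothetical view of JsonSchema with only what this step reads.
interface PropertySchemaLike {
  default?: unknown;
}

interface InputSchemaLike {
  properties?: Record<string, PropertySchemaLike>;
}

function applySchemaDefaults(
  schema: InputSchemaLike,
  input: Record<string, unknown>,
): Record<string, unknown> {
  const result: Record<string, unknown> = { ...input }; // shallow copy; caller's object is untouched

  for (const [key, property] of Object.entries(schema.properties ?? {})) {
    if (result[key] === undefined && property.default !== undefined) {
      result[key] = property.default;
    }
  }
  return result;
}
```

Running defaults before prepareInput means the transform and the templates can rely on every defaulted field being populated.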
JSON enforcement (src/plugins/internal-tools/runner/json-enforcement.ts)
JSON_API_SYSTEM_PROMPT is a hardcoded multi-line instruction string that mandates raw
JSON output with no markdown fences, no prose, and no comments. It is prepended to the
tool’s own system prompt via composeJsonSystemPrompt(toolSystemPrompt) using a
'---' separator.
attachStructureGuidance (private to define.ts) appends the outputSchema and
outputExample as fenced JSON blocks inside the tool’s system prompt before the JSON API
instruction is prepended.
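The composition order can be sketched as follows. The prompt text is a placeholder rather than the real JSON_API_SYSTEM_PROMPT, and the newlines around the '---' separator and the handling of a missing tool prompt are assumptions.

```typescript
// Placeholder wording; the real constant is a longer multi-line instruction.
const JSON_API_SYSTEM_PROMPT =
  'Respond with raw JSON only. No markdown fences, no prose, no comments.';

function composeJsonSystemPrompt(toolSystemPrompt?: string): string {
  // Assumption: with no tool prompt, the JSON instruction stands alone.
  if (!toolSystemPrompt) return JSON_API_SYSTEM_PROMPT;

  // The JSON instruction goes first, then the separator, then the tool's own prompt.
  return `${JSON_API_SYSTEM_PROMPT}\n---\n${toolSystemPrompt}`;
}
```

Placing the generic instruction first keeps the tool-specific prompt, including any schema and example guidance, closest to the user message.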
Variant selection (src/plugins/internal-tools/runner/variants.ts)
PromptVariant extends LLMToolDefinition fields with an id and an optional
providerMatch?: string | string[] pattern. selectVariant(variants, { provider, model, mode }) picks the first variant matching providerMatch (substring check on
"provider/model"), falling back to the variant with isDefault: true, then the first
variant. Variants allow provider-specific prompt tuning (e.g. different token limits or
JSON instructions for different model families).
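The selection order described above can be sketched like this. PromptVariantLike is a trimmed, hypothetical type, and the mode argument is left out because its effect is not described here.

```typescript
// Hypothetical trimmed variant type with only the fields selection reads.
interface PromptVariantLike {
  id: string;
  providerMatch?: string | string[];
  isDefault?: boolean;
}

function selectVariant(
  variants: PromptVariantLike[],
  target: { provider: string; model: string },
): PromptVariantLike | undefined {
  const fullId = `${target.provider}/${target.model}`;

  // 1. First variant whose providerMatch is a substring of "provider/model".
  const matched = variants.find((variant) => {
    if (variant.providerMatch === undefined) return false;
    const patterns = Array.isArray(variant.providerMatch)
      ? variant.providerMatch
      : [variant.providerMatch];
    return patterns.some((pattern) => fullId.includes(pattern));
  });

  // 2. Otherwise the variant flagged isDefault, 3. otherwise the first variant.
  return matched ?? variants.find((variant) => variant.isDefault === true) ?? variants[0];
}
```

Because the check is a plain substring match, a short pattern can match more models than intended; including the trailing slash of the provider (for example "acme/") narrows it to that provider prefix in most cases.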
Built-in tools (src/plugins/internal-tools/builtin/)
All registered via LocalToolBackend in builtin.ts. All use defineLLMTool:
| Tool id | File | Output format |
|---|---|---|
| orxa:summarize@1.0.0 | builtin/summarize.ts | text |
| orxa:classify@1.0.0 | builtin/classify.ts | json |
| orxa:score@1.0.0 | builtin/score.ts | json |
| orxa:structure@1.0.0 | builtin/structure.ts | json |
| orxa:clarify@1.0.0 | builtin/clarify.ts | json |
All five call the context client internally. RunnerContextTools in
src/plugins/context-guard/types.ts wires orxa:summarize@1.0.0 and
orxa:fact-extract@1.0.0 (not a builtin — ships in extensions) to ContextGuard
strategy compaction.
Extension points
Custom backend: implement ToolBackend (readonly name, list(), get(id)) and call
registry.addBackend(backend). addBackend invalidates the cache, and the new backend is queried lazily when the next registry read rebuilds it.
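A minimal custom backend might look like the sketch below, paired with a simplified cache build that shows the first-registered-wins rule. StaticToolBackend and buildCache are hypothetical names, InternalTool is trimmed to three fields, and the real registry keeps this logic inside ensureCache().

```typescript
// Trimmed view of InternalTool for illustration only.
interface InternalTool {
  id: string;
  description: string;
  execute: (input: unknown) => Promise<unknown>;
}

interface ToolBackend {
  readonly name: string;
  list(): Promise<InternalTool[]>;
  get(id: string): Promise<InternalTool | null>;
}

class StaticToolBackend implements ToolBackend {
  readonly name: string;
  private readonly tools: Map<string, InternalTool>;

  constructor(name: string, tools: InternalTool[]) {
    this.name = name;
    this.tools = new Map(tools.map((tool): [string, InternalTool] => [tool.id, tool]));
  }

  async list(): Promise<InternalTool[]> {
    return Array.from(this.tools.values());
  }

  async get(id: string): Promise<InternalTool | null> {
    return this.tools.get(id) ?? null;
  }
}

// Simplified rebuild: iterate backends in registration order, keep the first tool per id.
async function buildCache(backends: ToolBackend[]): Promise<Map<string, InternalTool>> {
  const next = new Map<string, InternalTool>();
  for (const backend of backends) {
    for (const tool of await backend.list()) {
      if (!next.has(tool.id)) next.set(tool.id, tool);
    }
  }
  return next; // the caller swaps this in as a whole rather than mutating the old map
}
```

To override a built-in tool with this ordering, the overriding backend has to be registered before the one that ships the original id.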
Custom tool: create an InternalTool directly with a hand-written execute function,
or use defineLLMTool for LLM-backed tools. Register via LocalToolBackend or a custom
backend.
Custom model selection: build a CompatFile object and pass it as config.compat to
InternalToolRunner. The recommended array for each tool id is tried ahead of modelPreference
(whose models remain as later fallbacks) and should be ordered cheapest-first.
Gotchas and edge cases
- ToolRegistry cache is invalidated on every addBackend/removeBackend. During cache rebuild, concurrent get/list calls all await the same ensureCache() invocation (the cache is set atomically at the end). There is no explicit lock; multiple concurrent callers trigger a single rebuild in practice due to the if (this.cache) return guard.
- Duplicate backend names throw immediately in addBackend. Duplicate tool ids across backends are silently resolved by first-registered wins.
- InternalToolRunner.run(toolId, input) looks up the tool from registry.get(toolId). registry.get is exact-match by id (including version). There is no semver range matching: registry.get('orxa:summarize') returns null; the full versioned id is required.
- assertKeyAvailability runs before execution but after model resolution. If compat or modelPreference lists only providers with no API keys, the runner throws before calling execute, even for tools that do not actually make LLM calls. For non-LLM tools, pass an empty model list in modelPreference or rely on resolveModels returning [].
- executeNonLLM emits onInternalToolCallStart with chosenModel: ''. Tools that introspect ctx.modelId will see undefined in this path.
- defineLLMTool wraps client.complete with a single user message. The system prompt and the user template are rendered separately. The context window is not managed, so very long inputs can overflow; resolveMaxTokens gives tools a hook to compute maxTokens dynamically based on input length.
- parseJsonWithFences in template.ts strips leading/trailing ```json (or ```) blocks before calling JSON.parse. If the model produces multiple fenced blocks, only the first is extracted. Output that is valid JSON without fences is parsed directly.
- outputSchema validation in validateOutput only checks the top-level type (object, array, string, number, boolean). It emits a warning and does NOT throw, so invalid outputs are returned to the caller. Use the warning to detect tool regressions.
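The fence-stripping behavior can be approximated with a single regular expression. This is a sketch rather than the code in template.ts: the exact pattern, the trimming, and the error message are assumptions, and the fence marker is built with repeat() only so the example nests cleanly inside a markdown code block.

```typescript
const FENCE = '`'.repeat(3); // three backticks

// Matches the first fenced block, with or without a "json" language tag.
const FIRST_FENCED_BLOCK = new RegExp(FENCE + '(?:json)?\\s*([\\s\\S]*?)' + FENCE);

function parseJsonWithFences(raw: string): unknown {
  const match = FIRST_FENCED_BLOCK.exec(raw);
  // No fence found: treat the whole output as JSON.
  const candidate = (match ? match[1] ?? '' : raw).trim();

  try {
    return JSON.parse(candidate);
  } catch {
    // Include the raw model output so the failure can be diagnosed.
    throw new Error('Failed to parse JSON from model output: ' + raw);
  }
}
```

The lazy quantifier stops at the first closing fence, which is why a second fenced block in the same response is ignored.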