List models

What you will achieve

Call listModels() to read the curated catalog (sync, no network), or listModelsLive() to fetch the live model list from a provider’s API. Each ModelInfo entry carries pricing per million tokens, context window size, capability flags, and the preferred API type. Results from listModelsLive() are cached in memory for 24 hours and deduped against the curated catalog.

When and why

Use listModels() when you need to:

Pick the right model programmatically: filter by capability (vision, tool use, structured output) or price before making a complete() call.
Run pre-flight cost estimates: the catalog is what estimate() reads for pricing data.
Build a model selector UI: list models with human-readable names, context windows, and prices.
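
The first use case, picking a model by capability and price, can be sketched as a small pure function. This is a minimal illustration rather than SDK code: ModelLike is a trimmed local stand-in for ModelInfo, pickCheapest is a hypothetical helper, and the sample entries and prices are made up. In real use the array would come from listModels().

```typescript
// Trimmed local stand-in for the SDK's ModelInfo (assumption: only the fields used here).
interface ModelLike {
  model: string;
  pricing: { inputPerMTok?: number };
  capabilities: { vision: boolean; toolUse: boolean; structuredOutput: boolean };
}

type Capability = keyof ModelLike['capabilities'];

// Hypothetical helper: cheapest model (by input price) that has every required capability.
// Entries without a known input price are skipped rather than treated as free.
function pickCheapest(models: ModelLike[], required: Capability[]): ModelLike | undefined {
  const candidates = models.filter(
    (m) => m.pricing.inputPerMTok !== undefined && required.every((cap) => m.capabilities[cap]),
  );
  candidates.sort((a, b) => (a.pricing.inputPerMTok ?? 0) - (b.pricing.inputPerMTok ?? 0));
  return candidates[0];
}

// Made-up entries for illustration; real data would come from listModels().
const sampleCatalog: ModelLike[] = [
  { model: 'example-large', pricing: { inputPerMTok: 5 }, capabilities: { vision: true, toolUse: true, structuredOutput: true } },
  { model: 'example-small', pricing: { inputPerMTok: 0.5 }, capabilities: { vision: true, toolUse: true, structuredOutput: false } },
  { model: 'example-text', pricing: { inputPerMTok: 0.1 }, capabilities: { vision: false, toolUse: true, structuredOutput: false } },
  { model: 'example-new', pricing: {}, capabilities: { vision: true, toolUse: true, structuredOutput: false } },
];

const choice = pickCheapest(sampleCatalog, ['vision', 'toolUse']);
console.log(choice?.model); // example-small
```

Skipping unpriced entries is a design choice: a model with unknown pricing should not win a "cheapest" comparison by default.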

Use listModelsLive() when you need to:

Check what your account can actually call: the live list reflects the models enabled for your API key.
Discover new models: provider catalogs update faster than this SDK, and a live fetch surfaces newly launched models immediately.
Work with OpenRouter: the curated catalog does not include OpenRouter models, so a live fetch builds ModelInfo objects directly from the OpenRouter API (pricing included).

Step by step

Step 1 — Query the curated catalog (no network)

import { listModels } from '@combycode/llm-sdk';

// All providers
const all = listModels();
console.log(all.length); // hundreds of models across openai, anthropic, google, xai

// Filter to one provider
const anthropic = listModels({ provider: 'anthropic' });
console.log(anthropic.map(m => m.model));
// ['claude-opus-4.8', 'claude-sonnet-4-5', 'claude-haiku-4-5', ...]

listModels() reads a bundled catalog JSON — no API call, no async, no key required. The cost engine uses the same catalog.

Step 2 — Inspect a model entry

import { listModels } from '@combycode/llm-sdk';

const model = listModels({ provider: 'openai' })
  .find(m => m.model === 'gpt-4o');
if (!model) throw new Error('gpt-4o is not in the curated catalog');

console.log(model.pricing.inputPerMTok);   // USD per 1M input tokens, e.g. 2.5
console.log(model.pricing.outputPerMTok);  // USD per 1M output tokens, e.g. 10
console.log(model.contextWindow);          // e.g. 128000
console.log(model.capabilities.vision);    // true
console.log(model.capabilities.toolUse);   // true
console.log(model.reasoning.supported);    // false for gpt-4o, true for o1/o3
console.log(model.preferredApi);           // 'responses' for OpenAI models
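
Because prices are expressed per million tokens, a pre-flight cost figure is simple arithmetic on the pricing fields. The sketch below shows that arithmetic only. estimateUsd is a hypothetical helper, not the SDK's estimate(), whose signature is not shown in this guide, and PricingLike is a local stand-in type. The prices reuse the example values from the comments above.

```typescript
// Local stand-in for the pricing part of ModelInfo (assumption).
interface PricingLike {
  inputPerMTok?: number;
  outputPerMTok?: number;
}

// Hypothetical helper: USD cost for one call, or undefined when pricing is unknown.
function estimateUsd(
  pricing: PricingLike,
  inputTokens: number,
  outputTokens: number,
): number | undefined {
  const { inputPerMTok, outputPerMTok } = pricing;
  if (inputPerMTok === undefined || outputPerMTok === undefined) return undefined;
  // Rates are per 1,000,000 tokens, so scale the weighted sum down once.
  return (inputTokens * inputPerMTok + outputTokens * outputPerMTok) / 1_000_000;
}

// 12,000 input tokens and 800 output tokens at 2.5 / 10 USD per 1M tokens.
const usd = estimateUsd({ inputPerMTok: 2.5, outputPerMTok: 10 }, 12_000, 800);
console.log(usd?.toFixed(4)); // 0.0380
```

Returning undefined instead of 0 for unpriced models keeps an unknown cost from being mistaken for a free call.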

Step 3 — Fetch live model ids

import { listModelsLive } from '@combycode/llm-sdk';

// Raw mode: just id strings
const ids = await listModelsLive({
  provider: 'openai',
  apiKey: process.env.OPENAI_API_KEY,
  raw: true,
});
console.log(ids); // ['gpt-4o', 'gpt-4o-mini', 'o1', ...]

raw: true returns string[] instead of ModelInfo[]. Use it for quick existence checks or dropdowns.

Step 4 — Fetch live models with catalog enrichment

import { listModelsLive } from '@combycode/llm-sdk';

// Default (raw: false): ModelInfo[] merged with curated catalog
const models = await listModelsLive({
  provider: 'anthropic',
  apiKey: process.env.ANTHROPIC_API_KEY,
});

// Models in the catalog get full pricing + capability data.
// Models the provider lists but the catalog doesn't know (new releases) get
// a minimal ModelInfo with toolUse: true, streaming: true, and no pricing.
const known = models.filter(m => m.pricing.inputPerMTok !== undefined);
console.log(known.length, 'models with pricing data');
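
The enrichment described in the comments above can be pictured as a lookup keyed by model id. This is a sketch of the idea under stated assumptions, not the SDK's implementation: ModelLike is a trimmed local type, enrich and minimalEntry are hypothetical functions, matching is by exact id, and the vision: false default for unknown models is an assumption (this guide only specifies toolUse: true, streaming: true, and no pricing).

```typescript
// Trimmed local stand-in for ModelInfo (assumption: only the fields used here).
interface ModelLike {
  model: string;
  pricing: { inputPerMTok?: number; outputPerMTok?: number };
  capabilities: { toolUse: boolean; streaming: boolean; vision: boolean };
}

// Placeholder entry for a model the catalog does not know yet.
function minimalEntry(id: string): ModelLike {
  return {
    model: id,
    pricing: {}, // no pricing data until the catalog catches up
    capabilities: { toolUse: true, streaming: true, vision: false },
  };
}

// Hypothetical merge: one entry per live id, taken from the catalog when available.
function enrich(liveIds: string[], catalog: ModelLike[]): ModelLike[] {
  const byId = new Map<string, ModelLike>();
  for (const entry of catalog) byId.set(entry.model, entry);
  return liveIds.map((id) => byId.get(id) ?? minimalEntry(id));
}

// Made-up data for illustration.
const catalogSample: ModelLike[] = [
  {
    model: 'known-model',
    pricing: { inputPerMTok: 3, outputPerMTok: 15 },
    capabilities: { toolUse: true, streaming: true, vision: true },
  },
];

const merged = enrich(['known-model', 'brand-new-model'], catalogSample);
console.log(merged.map((m) => [m.model, m.pricing.inputPerMTok]));
// [ [ 'known-model', 3 ], [ 'brand-new-model', undefined ] ]
```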

Step 5 — OpenRouter: build live from API (no bundled catalog)

OpenRouter is not in the bundled catalog. listModelsLive({ provider: 'openrouter' }) builds ModelInfo directly from the OpenRouter /api/v1/models response, including pricing and capability flags:

import { listModelsLive } from '@combycode/llm-sdk';

const orModels = await listModelsLive({
  provider: 'openrouter',
  apiKey: process.env.OPENROUTER_API_KEY,
});

// Find cheap models with vision (models without pricing are excluded)
const cheap = orModels
  .filter(m => m.capabilities.vision && (m.pricing.inputPerMTok ?? Infinity) < 1)
  .sort((a, b) => (a.pricing.inputPerMTok ?? Infinity) - (b.pricing.inputPerMTok ?? Infinity));

// cheap may be empty, so guard the first element
console.log(cheap[0]?.model, cheap[0]?.pricing.inputPerMTok);

Step 6 — Force refresh the 24h cache

import { listModelsLive } from '@combycode/llm-sdk';

const fresh = await listModelsLive({
  provider: 'openai',
  apiKey: process.env.OPENAI_API_KEY,
  refresh: true, // bypass cache, always re-fetch
});

Without refresh: true, concurrent callers share a single in-flight request, tracked in a Map<string, Promise>, so duplicate calls such as React StrictMode's double-invoked effects result in only one network request.
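
The combination of a 24-hour cache and in-flight sharing can be sketched as follows. This is an illustration of the general pattern, not the SDK's internal code: the class name LiveCache, its get method, the per-key layout, and the choice that refresh skips both the cache and any pending request are all assumptions.

```typescript
const TTL_MS = 24 * 60 * 60 * 1000; // 24 hours

interface CacheEntry<T> {
  value: T;
  fetchedAt: number;
}

// Hypothetical cache: settled results with a TTL, plus a map of pending requests.
class LiveCache<T> {
  private readonly settled = new Map<string, CacheEntry<T>>();
  private readonly inFlight = new Map<string, Promise<T>>();
  private readonly now: () => number;

  constructor(now: () => number = Date.now) {
    this.now = now; // injectable clock, handy in tests
  }

  get(key: string, fetcher: () => Promise<T>, refresh = false): Promise<T> {
    if (!refresh) {
      const hit = this.settled.get(key);
      if (hit && this.now() - hit.fetchedAt < TTL_MS) return Promise.resolve(hit.value);
      const pending = this.inFlight.get(key);
      if (pending) return pending; // share the request that is already running
    }
    const request: Promise<T> = fetcher()
      .then((value) => {
        this.settled.set(key, { value, fetchedAt: this.now() });
        return value;
      })
      .finally(() => {
        // Only clear the slot if a newer request has not replaced this one.
        if (this.inFlight.get(key) === request) this.inFlight.delete(key);
      });
    this.inFlight.set(key, request);
    return request;
  }
}

// Two back-to-back calls trigger one fetch and receive the same promise.
let networkCalls = 0;
const fakeFetch = (): Promise<string[]> => {
  networkCalls += 1;
  return Promise.resolve(['gpt-4o', 'gpt-4o-mini']);
};
const liveCache = new LiveCache<string[]>();
const first = liveCache.get('openai', fakeFetch);
const second = liveCache.get('openai', fakeFetch);
console.log(first === second, networkCalls); // true 1
```

A failed fetch is not stored in the settled map here, so the next call retries instead of serving a cached error.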

Your options

listModels(opts?) — catalog (sync)

Option	Type	Notes
`provider`	`ProviderName`	Filter to one provider. Omit for all.
`engine`	`EngineHandle`	Override the global engine (rarely needed).

Returns ModelInfo[] synchronously from the bundled JSON.

listModelsLive(opts) — live API (async)

Option	Type	Required	Notes
`provider`	`ProviderName`	Yes	Which provider’s `/models` endpoint to call.
`apiKey`	`string`	No	Falls back to `engine.apiKeys[provider]`.
`raw`	`boolean`	No	`true` returns `string[]` (bare ids). Default `false` returns `ModelInfo[]`.
`refresh`	`boolean`	No	Force re-fetch even if cached. Default `false` (24h TTL).
`engine`	`EngineHandle`	No	Override engine; call flows through `engine.fetch` (queue, hooks, retry).
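
The apiKey fallback in the table can be read as a simple "explicit value first, engine value second" lookup. The sketch below illustrates that order only. resolveApiKey and EngineLike are hypothetical, the provider list is taken from the endpoints named in this guide, and throwing when no key is found is an assumption about error handling rather than documented SDK behavior.

```typescript
// Provider names as they appear in this guide.
type ProviderName = 'openai' | 'anthropic' | 'google' | 'xai' | 'openrouter';

// Local stand-in for the part of the engine that holds keys (assumption).
interface EngineLike {
  apiKeys: Partial<Record<ProviderName, string>>;
}

// Hypothetical helper: an explicit key wins, otherwise fall back to the engine's key.
function resolveApiKey(
  provider: ProviderName,
  explicit: string | undefined,
  engine: EngineLike,
): string {
  const key = explicit ?? engine.apiKeys[provider];
  if (!key) throw new Error(`No API key configured for provider "${provider}"`);
  return key;
}

// Placeholder values, not real keys.
const engineSample: EngineLike = { apiKeys: { openai: 'key-from-engine' } };
console.log(resolveApiKey('openai', undefined, engineSample)); // key-from-engine
console.log(resolveApiKey('openai', 'key-from-call', engineSample)); // key-from-call
```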

ModelInfo key fields:

Field	Type	Notes
`model`	`string`	Canonical slug (e.g. `claude-opus-4.8`). Send this to `complete()`.
`providerModelName`	`string | undefined`	Exact wire name sent to the provider API when it differs from the slug.
`pricing.inputPerMTok`	`number | undefined`	USD per 1M input tokens.
`pricing.outputPerMTok`	`number | undefined`	USD per 1M output tokens.
`pricing.cacheReadPerMTok`	`number | undefined`	Prompt cache read rate (Anthropic, OpenAI).
`contextWindow`	`number | undefined`	Maximum total tokens (input + output).
`capabilities.toolUse`	`boolean`	Supports function/built-in tools.
`capabilities.vision`	`boolean`	Accepts image input.
`capabilities.structuredOutput`	`boolean`	Supports JSON schema output.
`capabilities.audio`	`boolean`	Accepts audio input.
`reasoning.supported`	`boolean`	Has extended reasoning / chain-of-thought.
`reasoning.effortControl`	`boolean`	Supports `effort` level parameter.
`preferredApi`	`ApiType`	The API this SDK uses by default for this model.
`active`	`boolean`	`false` for deprecated/retired models.
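
Read as a TypeScript type, the table above corresponds roughly to the shape below. This is a reconstruction from the table for orientation, not the SDK's exported declaration: the name ModelInfoSketch, the string placeholder for ApiType, and the helpers wireName and selectable are assumptions.

```typescript
// Placeholder: the SDK's actual ApiType union is not shown in this guide.
type ApiType = string;

// Shape reconstructed from the "ModelInfo key fields" table (sketch only).
interface ModelInfoSketch {
  model: string;
  providerModelName?: string;
  pricing: { inputPerMTok?: number; outputPerMTok?: number; cacheReadPerMTok?: number };
  contextWindow?: number;
  capabilities: { toolUse: boolean; vision: boolean; structuredOutput: boolean; audio: boolean };
  reasoning: { supported: boolean; effortControl: boolean };
  preferredApi: ApiType;
  active: boolean;
}

// Hypothetical helper: the name sent on the wire, falling back to the slug.
function wireName(m: ModelInfoSketch): string {
  return m.providerModelName ?? m.model;
}

// Hypothetical helper: hide deprecated or retired models, e.g. for a selector UI.
function selectable(models: ModelInfoSketch[]): ModelInfoSketch[] {
  return models.filter((m) => m.active);
}

// Made-up entries for illustration.
const entries: ModelInfoSketch[] = [
  {
    model: 'example-current',
    providerModelName: 'example-current-20250101',
    pricing: { inputPerMTok: 1, outputPerMTok: 4 },
    contextWindow: 128000,
    capabilities: { toolUse: true, vision: true, structuredOutput: true, audio: false },
    reasoning: { supported: false, effortControl: false },
    preferredApi: 'responses',
    active: true,
  },
  {
    model: 'example-retired',
    pricing: {},
    capabilities: { toolUse: false, vision: false, structuredOutput: false, audio: false },
    reasoning: { supported: false, effortControl: false },
    preferredApi: 'responses',
    active: false,
  },
];

console.log(selectable(entries).map(wireName)); // [ 'example-current-20250101' ]
```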

Live endpoint per provider:

Provider	URL
`openai`	`GET https://api.openai.com/v1/models`
`anthropic`	`GET https://api.anthropic.com/v1/models`
`google`	`GET https://generativelanguage.googleapis.com/v1beta/models`
`xai`	`GET https://api.x.ai/v1/models`
`openrouter`	`GET https://openrouter.ai/api/v1/models`

All calls flow through engine.fetch (rate-limit queue, retry, hooks). The queue name is {provider}/models to avoid sharing capacity with completion calls.

Compare the SDKs

import { listModelsLive } from '@combycode/llm-sdk';

// `listModels()` returns our curated catalog (pricing + capabilities) — the
// richer default. `listModelsLive()` is the thin live-discovery fetch that
// mirrors the official "list models" scenario.
const provider = (process.env.LLM_MODEL ?? '').split('/')[0];
const t0 = performance.now();
const ids = await listModelsLive({ provider: provider as never, apiKey: process.env.LLM_API_KEY });

console.log(JSON.stringify({ result: ids.length > 0 ? `found:${ids.length}` : 'none', ms: Math.round(performance.now() - t0) }));

Official SDKs return provider-specific objects: OpenAI's models.list() returns Model objects; Anthropic's returns ModelInfo; Google's returns Model (with a different schema). None covers more than one provider, and none carries pricing in the live response, so you must consult pricing pages separately. ORXA's listModels() returns a single ModelInfo[] type across all providers with pricing and capability data built in. listModelsLive() merges live availability with the catalog and fills gaps for new models with minimal entries.

Gotchas and next steps

The catalog is a snapshot. Provider catalogs update faster than this SDK. New models appear in listModelsLive() immediately but will show undefined pricing until the SDK catalog is updated. For production cost estimates, use listModelsLive() and check pricing.inputPerMTok !== undefined.

raw: true returns bare ids. Google's API returns names like models/gemini-2.0-flash, and the adapter strips the models/ prefix. OpenAI and Anthropic return the canonical id directly. OpenRouter returns namespaced ids like google/gemini-2.0-flash.
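
When comparing ids across providers, it can help to normalize these forms yourself. The helpers below are a hypothetical sketch: stripGooglePrefix mirrors the prefix stripping described above (the SDK's adapter already does this for you), and splitNamespaced is an illustrative way to separate an OpenRouter-style id into vendor and model parts. Neither function is part of the SDK.

```typescript
// Remove Google's "models/" prefix; other ids pass through unchanged.
function stripGooglePrefix(id: string): string {
  const prefix = 'models/';
  return id.startsWith(prefix) ? id.slice(prefix.length) : id;
}

// Split "vendor/model" at the first slash; ids without a slash have no vendor.
function splitNamespaced(id: string): { vendor?: string; model: string } {
  const slash = id.indexOf('/');
  if (slash === -1) return { model: id };
  return { vendor: id.slice(0, slash), model: id.slice(slash + 1) };
}

console.log(stripGooglePrefix('models/gemini-2.0-flash')); // gemini-2.0-flash
console.log(splitNamespaced('google/gemini-2.0-flash'));
// { vendor: 'google', model: 'gemini-2.0-flash' }
console.log(splitNamespaced('gpt-4o')); // { model: 'gpt-4o' }
```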

Anthropic requires browser opt-in. listModelsLive({ provider: 'anthropic' }) from a browser context automatically adds the anthropic-dangerous-direct-browser-access: true header — the same header the chat adapter adds. No extra configuration needed.

24h cache is per-provider, in-memory. Restarting the process clears it. Call clearLiveModelsCache() (internal utility) in tests to reset state between runs.

Next steps:

Cost tracking guide — estimate() reads the catalog for pre-flight cost checks
Provider routing — use the model list to build dynamic fallback chains
Models page — browse the full curated catalog in the browser