Compare commits

6 Commits

Author SHA1 Message Date
Jiayuan Zhang
aaa5529f61 fix(agents): honour default flag wire-side and stop masking model errors
Addresses three issues from the latest PR #1399 review.

1. Wire the `default` flag end-to-end. The Model struct tagged one
   entry per provider as Default=true so the UI can badge it, but
   the daemon's heartbeat-report writer serialised models as
   `map[string]string`, silently dropping the bool; handler's
   `ModelEntry` also lacked the field. Result: the dropdown always
   showed "Default (provider)" and never "Default — <name>".
   Replace the map with a typed wire struct on the daemon side and
   add `Default bool` to `handler.ModelEntry`, plus a regression
   test asserting the flag round-trips through the report body
   and the store.

2. Stop imposing a static Multica-side default on task execution.
   `runTask` resolved the model via a three-tier chain that ended
   in `agent.DefaultModel(provider)`, forcing `claude-sonnet-4-6`
   / `gpt-5.4` / etc. onto every task whose agent hadn't set a
   model. That's the same shape of bug that bit us on cursor: any
   Go-side guess drifts from what the user's account actually has
   access to. Drop the third tier — when both `agent.model` and
   `MULTICA_<PROVIDER>_MODEL` are empty we now pass `""` through,
   so each backend omits `--model` from the CLI and the provider
   resolves its own default. `DefaultModel` / `defaultStaticModelsFor`
   become dead and are removed along with their test; the per-entry
   `Default: true` markers remain as a *display* hint on
   discovery responses (and are now surfaced via Fix #1).

3. Hermes: fail the task when the user's chosen model can't be
   applied. `session/set_model` errors used to log a warn and
   continue on hermes' own default, so a successful task would
   mislead the user into thinking their explicit pick ran. Now,
   when `opts.Model != ""` and the RPC fails, we send a `failed`
   Result with the hermes error in `result.Error` and skip the
   prompt entirely.
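
The two-tier resolution from item 2 can be sketched roughly as below. This is a TypeScript illustration only (the daemon itself is Go); `resolveModel`, `buildCliArgs`, the upper-cased env key, and the `exec` argument are hypothetical names chosen for the example.

```typescript
// Hypothetical sketch of the two-tier model resolution; the daemon itself is Go.
type EnvVars = Record<string, string | undefined>;

// Returns agent.model if set, else MULTICA_<PROVIDER>_MODEL, else "".
// There is deliberately no static third tier.
function resolveModel(provider: string, agentModel: string, env: EnvVars): string {
  const fromAgent = agentModel.trim();
  if (fromAgent !== "") return fromAgent;
  // Assumption: the env key is the upper-cased provider name.
  const key = "MULTICA_" + provider.toUpperCase() + "_MODEL";
  return (env[key] || "").trim();
}

// Omits --model entirely when nothing was resolved, so the provider CLI
// falls back to its own default.
function buildCliArgs(model: string, rest: string[]): string[] {
  return model === "" ? rest.slice() : ["--model", model].concat(rest);
}

const demoEnv: EnvVars = { MULTICA_CODEX_MODEL: "env-model" };
console.log(buildCliArgs(resolveModel("codex", "", demoEnv), ["exec"]));
console.log(buildCliArgs(resolveModel("claude", "", demoEnv), ["exec"]));
```

The point of returning `""` instead of a guessed ID is that the argument list simply has no `--model` entry, so nothing on the Multica side can drift from what the account supports.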

Plus gofmt clean-up on `hermes.go` (the new sniffer block picked
up tab/space noise on the prior commit).
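
As an illustration of item 1, the sketch below shows why a string-only map cannot carry the flag while a typed wire shape can. It is written in TypeScript rather than the daemon's Go, and `ModelInfo`, `WireModel`, `toStringMap`, `toWireModel`, and `roundTrip` are hypothetical names, not the real types.

```typescript
// Hypothetical shapes; the real daemon and handler types are Go structs.
interface ModelInfo {
  id: string;
  label: string;
  provider?: string;
  default?: boolean;
}

// Lossy shape: every value must be a string (analogous to map[string]string),
// so the boolean flag has nowhere to go.
function toStringMap(m: ModelInfo): Record<string, string> {
  const out: Record<string, string> = { id: m.id, label: m.label };
  if (m.provider) out["provider"] = m.provider;
  return out;
}

// Typed wire shape: the flag is a first-class field.
interface WireModel {
  id: string;
  label: string;
  provider?: string;
  default?: boolean;
}

function toWireModel(m: ModelInfo): WireModel {
  const out: WireModel = { id: m.id, label: m.label };
  if (m.provider) out.provider = m.provider;
  if (m.default) out.default = true;
  return out;
}

// Simulates the trip through a JSON report body.
function roundTrip<T>(value: T): T {
  return JSON.parse(JSON.stringify(value)) as T;
}

const sample: ModelInfo = { id: "auto", label: "Auto", default: true };
console.log(roundTrip(toStringMap(sample)));
console.log(roundTrip(toWireModel(sample)));
```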
2026-04-21 00:00:34 +08:00
Jiayuan Zhang
a8e2ef4c4d fix(agent/hermes): surface provider errors instead of reporting empty output
When the model the user picked isn't usable by the configured
provider — e.g. ChatGPT-account auth rejecting `gpt-5.1-codex-mini`
as "not supported when using Codex with a ChatGPT account" — hermes
still acknowledges `session/prompt` with `stopReason=end_turn` and
an empty text payload. The daemon's task machinery then reports a
useless "hermes returned empty output" and the real HTTP 400 detail
(the one that actually tells the user what to change) stays buried
in the daemon's stderr log.

Tee stderr through a bounded provider-error sniffer that scans for
`⚠️` / `` / `[ERROR]` headers plus the `Error:` / `detail:` tails
hermes emits for API failures. When the task finishes `completed`
with empty output, promote it to `failed` and use the sniffed
message as the task error. Nothing changes for healthy runs — the
sniffer is an additional writer behind an `io.MultiWriter`, it
doesn't filter or replace the normal stderr log forwarder.

Reproduced against a local hermes configured for
`provider: openai-codex` + `chatgpt.com/backend-api/codex`: before
this change the task result was `{status: completed, comment:
"hermes returned empty output"}`; after, it's
`{status: failed, error: "hermes provider error: HTTP 400: ... The
'gpt-5.1-codex-mini' model is not supported when using Codex with
a ChatGPT account."}`.

Tests cover the real fixture shape, partial-line buffering across
Writer calls, log-line filtering (info-level lines never surface
as errors), and the bounded line buffer.
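
A rough sketch of the sniffing idea follows, in TypeScript rather than the daemon's Go. The class name, the marker regexes, the `INFO` filter, and the buffer sizes are all assumptions made for illustration; only the `hermes provider error:` prefix and the promote-on-empty-output rule come from the message above.

```typescript
// Hypothetical line-oriented sniffer. The matching rules are assumptions.
const ERROR_HEADER = /^(⚠️|\[ERROR\])/;
const ERROR_TAIL = /(Error:|detail:)/;
const INFO_LINE = /\bINFO\b/;

class ProviderErrorSniffer {
  private partial = "";
  private lines: string[] = [];
  private maxLines: number;
  private maxLineLen: number;

  constructor(maxLines: number = 8, maxLineLen: number = 512) {
    this.maxLines = maxLines;
    this.maxLineLen = maxLineLen;
  }

  // Accepts arbitrary chunks; a line split across two writes is reassembled.
  write(chunk: string): void {
    const pieces = (this.partial + chunk).split("\n");
    this.partial = (pieces.pop() || "").slice(0, this.maxLineLen);
    for (const raw of pieces) this.consider(raw.trim());
  }

  private consider(line: string): void {
    if (line === "" || INFO_LINE.test(line)) return;
    if (!ERROR_HEADER.test(line) && !ERROR_TAIL.test(line)) return;
    this.lines.push(line.slice(0, this.maxLineLen));
    // Bounded buffer: keep only the most recent matches.
    if (this.lines.length > this.maxLines) this.lines.shift();
  }

  message(): string {
    return this.lines.join(" ");
  }
}

interface TaskOutcome {
  status: "completed" | "failed";
  output: string;
  error?: string;
}

// Promotes "completed with empty output" to "failed" when an error was sniffed.
function promoteEmptyCompletion(outcome: TaskOutcome, sniffed: string): TaskOutcome {
  if (outcome.status === "completed" && outcome.output.trim() === "" && sniffed !== "") {
    return { status: "failed", output: "", error: "hermes provider error: " + sniffed };
  }
  return outcome;
}
```

Healthy runs are untouched in this sketch: if the output is non-empty or nothing was sniffed, `promoteEmptyCompletion` returns the outcome as it was.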
2026-04-20 23:45:49 +08:00
Jiayuan Zhang
9acaf248dc feat(agent/hermes): ACP-driven model discovery and per-task selection
Hermes' ACP server already exposes everything we need to treat its
model catalog like any other provider's:

* `acp_adapter/server.py::_build_model_state` attaches a full
  `SessionModelState` (availableModels + currentModelId) to the
  response of `session/new` / `session/load` / `session/resume`.
* `acp_adapter/server.py::set_session_model` is an ACP-standard RPC
  that switches the active session's provider + model.

Use both from Multica's daemon instead of treating hermes as a
"dropdown-disabled" special case.

Discovery — `discoverHermesModels` spawns a throwaway `hermes acp`
process, drives `initialize` + `session/new(cwd=<tmpdir>)`, reads the
model catalog straight out of the response, then kills the child.
Failures (hermes missing, bad credentials, config resolution error)
degrade to an empty list so the creatable dropdown still works. The
existing 60s TTL cache amortises the process spin-up, which takes a few seconds.

Per-task selection — `hermesBackend.Execute` now issues a
`session/set_model` RPC with the UI-chosen `opts.Model` before
`session/prompt`. Failure is logged but not fatal; the session
falls back to hermes' configured default rather than aborting the
task.

`ModelSelectionSupported` no longer special-cases hermes — every
provider in the registry now honours `agent.model` end-to-end. The
UI "disabled" branch of ModelDropdown is retained (dead for today's
catalog) so a future provider can opt out without reintroducing this
plumbing.

Tests cover the JSON parser for the real `SessionModelState` shape
(including `currentModelId` mapping to Default=true, dedup of
duplicate `modelId` entries, and graceful handling of missing /
malformed payloads), plus a regression guard on the support flag.
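
The parsing rules listed above (`currentModelId` maps to the default, duplicate `modelId` entries are dropped, malformed payloads yield an empty list) might look like this in a TypeScript sketch. The real parser is Go; `parseSessionModelState` and `ModelEntry` are hypothetical names here, and the `name` field used for the label is an assumption about the payload.

```typescript
interface ModelEntry {
  id: string;
  label: string;
  default: boolean;
}

// Hypothetical parser for a SessionModelState-like object.
// Assumption: each available model has `modelId` and an optional `name`.
function parseSessionModelState(payload: unknown): ModelEntry[] {
  if (typeof payload !== "object" || payload === null) return [];
  const state = payload as { availableModels?: unknown; currentModelId?: unknown };
  if (!Array.isArray(state.availableModels)) return [];
  const current = typeof state.currentModelId === "string" ? state.currentModelId : "";

  // Prototype-free object so IDs such as "constructor" cannot collide.
  const seen: Record<string, boolean> = Object.create(null);
  const out: ModelEntry[] = [];
  for (const raw of state.availableModels) {
    if (typeof raw !== "object" || raw === null) continue;
    const item = raw as { modelId?: unknown; name?: unknown };
    if (typeof item.modelId !== "string" || item.modelId === "") continue;
    const id = item.modelId;
    if (seen[id]) continue; // keep the first occurrence only
    seen[id] = true;
    const label = typeof item.name === "string" && item.name !== "" ? item.name : id;
    out.push({ id, label, default: id === current });
  }
  return out;
}

console.log(
  parseSessionModelState({
    currentModelId: "model-b",
    availableModels: [{ modelId: "model-a", name: "Model A" }, { modelId: "model-b" }],
  }),
);
```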
2026-04-20 23:34:17 +08:00
Jiayuan Zhang
ad5a2abaa2 fix(agent/cursor): discover models via cursor-agent --list-models
The previous cursor catalog hardcoded model IDs like `claude-sonnet-4-6`,
`gpt-5.4`, `gemini-2.5-pro` and `composer-1.5`. None of those are valid
cursor-agent model names today — the real IDs have shape
`composer-2-fast`, `claude-4.6-sonnet-medium`, `claude-opus-4-7-high`,
`gemini-3.1-pro`, etc. Selecting one of the stale IDs from the UI made
cursor-agent exit 1 during task execution with no actionable error,
visible in practice as "cursor-agent exited with error: exit status 1"
on a 2-second-long failed run.

Cursor's catalog is too volatile to ship statically (there are 60+
variants covering the fast/medium/high/xhigh/max × thinking matrix).
Switch to dynamic discovery via `cursor-agent --list-models`, with a
single-entry `auto` static fallback when the binary is missing.

- `discoverCursorModels` shells out and parses the `<id> - <label>`
  rows; the entry tagged `(default)` surfaces as Default=true so the
  dropdown badge matches cursor's own pick rather than ours.
- Strip the parenthetical suffix from the display label so
  "Composer 2 Fast (current, default)" renders as "Composer 2 Fast".
- Reuses the same identifier guard as the openclaw text parser to
  skip the "Available models" header / blank lines.
- Tests parse the real fixture from the live CLI.

DefaultModel("cursor") now returns "auto" (the static-fallback
default) instead of a specific ID, since we intentionally don't ship
an opinionated cursor model — cursor itself surfaces the real default
through the Default flag on whichever entry `--list-models` tags.
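
For illustration, the row parsing described above could be sketched as follows. This is TypeScript rather than the Go in `discoverCursorModels`; `parseCursorModels` and the identifier regex are hypothetical, and apart from the "Composer 2 Fast (current, default)" label the sample rows are made up.

```typescript
interface ModelEntry {
  id: string;
  label: string;
  default: boolean;
}

// Assumed identifier guard: letters, digits and "-_./" only.
const SAFE_ID = /^[A-Za-z0-9][A-Za-z0-9._\/-]*$/;

// Parses "<id> - <label>" rows; headers and blank lines fail the guards.
function parseCursorModels(output: string): ModelEntry[] {
  const out: ModelEntry[] = [];
  for (const raw of output.split("\n")) {
    const line = raw.trim();
    const sep = line.indexOf(" - ");
    if (sep <= 0) continue;
    const id = line.slice(0, sep);
    if (!SAFE_ID.test(id)) continue;
    const rest = line.slice(sep + 3).trim();
    // A trailing "(...)" holds tags such as "current" or "default".
    const tag = /\(([^)]*)\)\s*$/.exec(rest);
    const tags = tag ? tag[1].split(",").map((s) => s.trim()) : [];
    const label = (tag ? rest.slice(0, tag.index) : rest).trim() || id;
    out.push({ id, label, default: tags.indexOf("default") !== -1 });
  }
  return out;
}

const sampleOutput = [
  "Available models",
  "",
  "auto - Auto",
  "composer-2-fast - Composer 2 Fast (current, default)",
].join("\n");
console.log(parseCursorModels(sampleOutput));
```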
2026-04-20 23:09:56 +08:00
Jiayuan Zhang
6094a149ac fix(agent/openclaw): reject TUI decoration when enumerating agents
`openclaw agents list` prints a decorated banner with box-drawing
characters and section headers ("Agents:", "Identity:", "Workspace:")
surrounding the actual agent rows. The previous text parser picked
up the first whitespace-separated token on every non-empty,
non-dash-prefixed line, which surfaced the section headers and
single-character icons ("│", "◇") as selectable models in the
dropdown.

- Try structured JSON output first (`agents list --json` / `--output
  json` / `-o json`); on any of those the UI gets names straight from
  the CLI's source of truth, with the model shown as a label suffix.
- Fall back to a conservative text parser that only accepts rows with
  at least two whitespace-separated tokens where both look like
  safe identifiers (letters / digits / `-_./`, no trailing colon).
  Anything decorative or header-like is discarded.
- On ambiguity we return an empty list: a silently wrong enumeration
  would mislead users, whereas an empty list still leaves them the
  creatable dropdown's manual entry.

Adds a regression test using the exact banner shape seen in the field
plus two tests for the JSON paths (array and `{agents: [...]}` wrapped).
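
The token guard of the conservative text parser can be sketched like this. It is a TypeScript illustration, not the Go parser; `parseAgentRows`, the regex, the assumed `<name> <model>` column order, and the sample banner are all hypothetical, and only the guard itself is modelled.

```typescript
interface AgentRow {
  agentName: string;
  model: string;
}

// Letters, digits and "-_./" only; a trailing colon or a box-drawing
// character makes the token fail.
const SAFE_TOKEN = /^[A-Za-z0-9][A-Za-z0-9._\/-]*$/;

// Accepts a row only if its first two whitespace-separated tokens both
// look like safe identifiers. Assumption: rows read "<name> <model> ...".
function parseAgentRows(output: string): AgentRow[] {
  const rows: AgentRow[] = [];
  for (const raw of output.split("\n")) {
    const tokens = raw.trim().split(/\s+/);
    if (tokens.length < 2) continue;
    if (!SAFE_TOKEN.test(tokens[0]) || !SAFE_TOKEN.test(tokens[1])) continue;
    rows.push({ agentName: tokens[0], model: tokens[1] });
  }
  return rows;
}

const banner = ["◇ Example banner", "│", "Agents:", "main vendor/model-x", "│ Identity: bot"].join("\n");
console.log(parseAgentRows(banner));
```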
2026-04-20 22:51:52 +08:00
Jiayuan Zhang
a67e533742 feat(agents): add per-agent model field with provider-aware dropdown
Adds a first-class `model` field to agents so users can pick the LLM model
from the create / settings UI instead of editing custom_env / custom_args.
The previous "set MULTICA_<PROVIDER>_MODEL env var on the daemon" approach
forced one model per provider per machine and was easy to misconfigure
(e.g. -m as a custom_arg breaks codex app-server initialization).

Backend (server/pkg/agent):
- New `agent.ListModels(provider, path)` returns the models supported by a
  provider. Static catalogs for claude, codex, gemini, cursor, copilot;
  dynamic discovery for opencode (`opencode models`), pi (`pi --list-models`),
  openclaw (`openclaw agents list`); 60s TTL cache + empty-list fallback on
  failure. Hermes returns an empty list and `ModelSelectionSupported=false`
  because its model is configured out-of-band.
- `agent.DefaultModel(provider)` returns the recommended default per
  provider (Sonnet 4.6 for claude, GPT-5.4 for codex, Gemini 2.5 Pro for
  gemini, composer-1.5 for cursor); copilot/openclaw/hermes deliberately
  have no default. The static catalog tags one entry per provider with
  `Default: true` so the UI can render a badge.
- For openclaw, opts.Model is mapped to `--agent <name>` since the CLI
  rejects `--model` outright; custom_args `--agent` still wins for
  back-compat.

Daemon protocol (server/internal/daemon):
- Heartbeat response carries an optional `pending_model_list` request
  (same pattern as PingStore / UpdateStore). The daemon resolves models
  via `agent.ListModels`, including the `supported` flag, and reports
  back via /api/daemon/runtimes/{id}/models/{requestId}/result.
- Task dispatch uses a three-tier fallback for the runtime model:
  agent.model → MULTICA_<PROVIDER>_MODEL env → agent.DefaultModel(provider).

Server API (server/internal/handler):
- `agent.model` is a new column (migration 050) and surfaces in
  Agent / CreateAgent / UpdateAgent payloads.
- New endpoints under /api/runtimes/{id}/models: POST to initiate
  discovery, GET to poll the request, plus the daemon-side report
  endpoint above.

CLI (server/cmd/multica):
- `multica agent create / update --model <id>`. Help copy steers users
  away from passing --model via --custom-args, which fails on codex
  (app-server mode) and openclaw.

Frontend (packages/core, packages/views):
- `Agent.model`, `RuntimeModel`, `RuntimeModelListRequest`,
  `RuntimeModelsResult` types.
- `runtimes/models.ts` exports `runtimeModelsOptions(runtimeId)` which
  initiates discovery and polls the request to completion (500ms
  cadence, 30s ceiling).
- New `ModelDropdown` (packages/views/agents) — searchable popover,
  provider grouping, creatable manual entry, "default" badge on the
  shipped recommendation, disabled state when the provider reports
  `supported=false` (Hermes), and clears any stale model value in that
  case to avoid persisting a ghost configuration.
- Wired into create-agent-dialog and the agent settings tab.

Verification:
- gofmt clean on touched files
- `go build ./... && go test ./...` (server) green; new openclaw and
  models_test cases included
- `pnpm typecheck` green across all 6 packages

Closes the immediate UX gap behind MUL-1151. DeepSeek V4 (or any new
model) becomes a zero-code addition: add it to the relevant static
catalog, or rely on the creatable input for one-off use.
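
The "60s TTL cache + empty-list fallback on failure" behaviour can be sketched as below. This is a TypeScript illustration of the idea, not the Go `agent.ListModels` code; `TtlCache` and `listModelsCached` are hypothetical names, and caching the fallback empty list for the full TTL is an assumption.

```typescript
// Hypothetical TTL cache with an injectable clock for testing.
class TtlCache<T> {
  // Prototype-free object so arbitrary keys are safe.
  private entries: Record<string, { value: T; expiresAt: number }> = Object.create(null);
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached value while fresh, otherwise reloads and stores it.
  get(key: string, load: () => T): T {
    const t = this.now();
    const hit = this.entries[key];
    if (hit && hit.expiresAt > t) return hit.value;
    const value = load();
    this.entries[key] = { value, expiresAt: t + this.ttlMs };
    return value;
  }
}

// Discovery failures degrade to an empty list, so a creatable dropdown
// can still accept manual entry.
function listModelsCached(
  cache: TtlCache<string[]>,
  provider: string,
  discover: () => string[],
): string[] {
  return cache.get(provider, () => {
    try {
      return discover();
    } catch {
      return [];
    }
  });
}

const cache = new TtlCache<string[]>(60000);
console.log(listModelsCached(cache, "example", () => ["model-a", "model-b"]));
```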
2026-04-20 22:45:56 +08:00
31 changed files with 2229 additions and 53 deletions

@@ -59,6 +59,7 @@ export const mockAgents: Agent[] = [
custom_env_redacted: false,
visibility: "workspace",
max_concurrent_tasks: 3,
model: "",
owner_id: null,
skills: [],
created_at: "2026-01-01T00:00:00Z",

@@ -35,6 +35,7 @@ import type {
RuntimeHourlyActivity,
RuntimePing,
RuntimeUpdate,
RuntimeModelListRequest,
TimelineEntry,
AssigneeFrequencyEntry,
TaskMessagePayload,
@@ -470,6 +471,17 @@ export class ApiClient {
return this.fetch(`/api/runtimes/${runtimeId}/update/${updateId}`);
}
async initiateListModels(runtimeId: string): Promise<RuntimeModelListRequest> {
return this.fetch(`/api/runtimes/${runtimeId}/models`, { method: "POST" });
}
async getListModelsResult(
runtimeId: string,
requestId: string,
): Promise<RuntimeModelListRequest> {
return this.fetch(`/api/runtimes/${runtimeId}/models/${requestId}`);
}
async listAgentTasks(agentId: string): Promise<AgentTask[]> {
return this.fetch(`/api/agents/${agentId}/tasks`);
}

@@ -1,3 +1,4 @@
export * from "./queries";
export * from "./mutations";
export * from "./hooks";
export * from "./models";

@@ -0,0 +1,52 @@
import { queryOptions } from "@tanstack/react-query";
import { api } from "../api";
import type { RuntimeModelsResult } from "../types/agent";
export const runtimeModelsKeys = {
all: () => ["runtimes", "models"] as const,
forRuntime: (runtimeId: string) =>
[...runtimeModelsKeys.all(), runtimeId] as const,
};
const POLL_INTERVAL_MS = 500;
const POLL_TIMEOUT_MS = 30_000;
// resolveRuntimeModels initiates a list-models request against the daemon
// (via heartbeat piggyback) and polls until the daemon reports back or
// the request times out. Returns both the models list and a
// `supported` flag: `supported=false` means the provider ignores
// per-agent model selection entirely (hermes today) — the UI uses
// this to disable its dropdown instead of accepting a value that
// wouldn't be honoured at runtime.
export async function resolveRuntimeModels(
runtimeId: string,
): Promise<RuntimeModelsResult> {
const initial = await api.initiateListModels(runtimeId);
const start = Date.now();
let current = initial;
while (current.status === "pending" || current.status === "running") {
if (Date.now() - start > POLL_TIMEOUT_MS) {
throw new Error("model discovery timed out");
}
await new Promise((resolve) => setTimeout(resolve, POLL_INTERVAL_MS));
current = await api.getListModelsResult(runtimeId, initial.id);
}
if (current.status === "failed" || current.status === "timeout") {
throw new Error(current.error || "model discovery failed");
}
return { models: current.models ?? [], supported: current.supported };
}
export function runtimeModelsOptions(runtimeId: string | null | undefined) {
return queryOptions({
queryKey: runtimeId
? runtimeModelsKeys.forRuntime(runtimeId)
: runtimeModelsKeys.all(),
queryFn: () => resolveRuntimeModels(runtimeId as string),
enabled: Boolean(runtimeId),
// Models rarely change; cache for 60s to match the server-side
// cache in agent.ListModels.
staleTime: 60_000,
retry: false,
});
}
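
The polling contract in `resolveRuntimeModels` can be restated as a pure decision function, which may be easier to unit-test than the loop. This is a sketch only; `ListStatus`, `PollAction`, and `nextPollAction` are hypothetical names, and the 30-second default mirrors `POLL_TIMEOUT_MS`.

```typescript
type ListStatus = "pending" | "running" | "completed" | "failed" | "timeout";
type PollAction = "wait" | "done" | "fail";

// Decides what the poller should do next for a given request status.
// In-flight requests wait until the ceiling is exceeded; terminal
// statuses resolve immediately regardless of elapsed time.
function nextPollAction(
  status: ListStatus,
  elapsedMs: number,
  timeoutMs: number = 30000,
): PollAction {
  if (status === "pending" || status === "running") {
    return elapsedMs > timeoutMs ? "fail" : "wait";
  }
  return status === "completed" ? "done" : "fail";
}

console.log(nextPollAction("pending", 500));
console.log(nextPollAction("completed", 1500));
```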

@@ -54,6 +54,7 @@ export interface Agent {
visibility: AgentVisibility;
status: AgentStatus;
max_concurrent_tasks: number;
model: string;
owner_id: string | null;
skills: Skill[];
created_at: string;
@@ -73,6 +74,7 @@ export interface CreateAgentRequest {
custom_args?: string[];
visibility?: AgentVisibility;
max_concurrent_tasks?: number;
model?: string;
}
export interface UpdateAgentRequest {
@@ -87,6 +89,7 @@ export interface UpdateAgentRequest {
visibility?: AgentVisibility;
status?: AgentStatus;
max_concurrent_tasks?: number;
model?: string;
}
// Skills
@@ -187,3 +190,36 @@ export interface RuntimeUpdate {
created_at: string;
updated_at: string;
}
export interface RuntimeModel {
id: string;
label: string;
provider?: string;
default?: boolean;
}
export type RuntimeModelListStatus =
| "pending"
| "running"
| "completed"
| "failed"
| "timeout";
export interface RuntimeModelListRequest {
id: string;
runtime_id: string;
status: RuntimeModelListStatus;
models?: RuntimeModel[];
supported: boolean;
error?: string;
created_at: string;
updated_at: string;
}
// Result shape returned by resolveRuntimeModels — includes the
// "supported" bit so the UI can distinguish "no models discovered"
// from "provider does not honour per-agent model selection".
export interface RuntimeModelsResult {
models: RuntimeModel[];
supported: boolean;
}

@@ -20,6 +20,10 @@ export type {
RuntimePingStatus,
RuntimeUpdate,
RuntimeUpdateStatus,
RuntimeModel,
RuntimeModelListRequest,
RuntimeModelListStatus,
RuntimeModelsResult,
IssueUsageSummary,
} from "./agent";
export type { Workspace, WorkspaceRepo, Member, MemberRole, User, MemberWithUser, Invitation } from "./workspace";

@@ -4,6 +4,7 @@ import { useState, useEffect, useMemo } from "react";
import { Cloud, ChevronDown, Globe, Lock, Loader2 } from "lucide-react";
import { ProviderLogo } from "../../runtimes/components/provider-logo";
import { ActorAvatar } from "../../common/actor-avatar";
import { ModelDropdown } from "./model-dropdown";
import type {
AgentVisibility,
RuntimeDevice,
@@ -48,6 +49,7 @@ export function CreateAgentDialog({
const [name, setName] = useState("");
const [description, setDescription] = useState("");
const [visibility, setVisibility] = useState<AgentVisibility>("private");
const [model, setModel] = useState("");
const [creating, setCreating] = useState(false);
const [runtimeOpen, setRuntimeOpen] = useState(false);
const [runtimeFilter, setRuntimeFilter] = useState<RuntimeFilter>("mine");
@@ -89,6 +91,7 @@ export function CreateAgentDialog({
description: description.trim(),
runtime_id: selectedRuntime.id,
visibility,
model: model.trim() || undefined,
});
onClose();
} catch (err) {
@@ -275,6 +278,14 @@ export function CreateAgentDialog({
</PopoverContent>
</Popover>
</div>
<ModelDropdown
runtimeId={selectedRuntime?.id ?? null}
runtimeOnline={selectedRuntime?.status === "online"}
value={model}
onChange={setModel}
disabled={!selectedRuntime}
/>
</div>
<DialogFooter>

@@ -0,0 +1,252 @@
"use client";
import { useEffect, useMemo, useState } from "react";
import { useQuery } from "@tanstack/react-query";
import { ChevronDown, Cpu, Loader2, Plus, Check, Info } from "lucide-react";
import { runtimeModelsOptions } from "@multica/core/runtimes";
import type { RuntimeModel } from "@multica/core/types";
import {
Popover,
PopoverTrigger,
PopoverContent,
} from "@multica/ui/components/ui/popover";
import { Input } from "@multica/ui/components/ui/input";
import { Label } from "@multica/ui/components/ui/label";
// ModelDropdown renders a searchable, creatable model picker for an agent.
// It fetches the supported-model catalog from the selected runtime — the
// daemon enumerates models on demand via heartbeat piggyback. Providers
// that don't honour per-agent model selection at runtime (currently
// hermes) return supported=false, and the dropdown renders disabled
// with an explanation instead of silently accepting a value the
// backend would ignore.
export function ModelDropdown({
runtimeId,
runtimeOnline,
value,
onChange,
disabled,
}: {
runtimeId: string | null;
runtimeOnline: boolean;
value: string;
onChange: (value: string) => void;
disabled?: boolean;
}) {
const [open, setOpen] = useState(false);
const [search, setSearch] = useState("");
const modelsQuery = useQuery(
runtimeModelsOptions(runtimeOnline ? runtimeId : null),
);
const supported = modelsQuery.data?.supported ?? true;
const models = modelsQuery.data?.models ?? [];
const defaultModel = useMemo(() => models.find((m) => m.default), [models]);
const grouped = useMemo(() => groupByProvider(models), [models]);
// When the selected runtime reports it doesn't support per-agent
// model selection, clear any previously-saved value so we don't
// persist a ghost configuration that never takes effect.
useEffect(() => {
if (!supported && value !== "") {
onChange("");
}
}, [supported, value, onChange]);
const filtered = useMemo(() => {
if (!search.trim()) return grouped;
const needle = search.toLowerCase();
const out: Record<string, RuntimeModel[]> = {};
for (const [provider, list] of Object.entries(grouped)) {
const matches = list.filter(
(m) =>
m.id.toLowerCase().includes(needle) ||
m.label.toLowerCase().includes(needle),
);
if (matches.length > 0) out[provider] = matches;
}
return out;
}, [grouped, search]);
const trimmedSearch = search.trim();
const exactMatch = models.some(
(m) => m.id === trimmedSearch || m.label === trimmedSearch,
);
const canCreate = trimmedSearch.length > 0 && !exactMatch;
const select = (id: string) => {
onChange(id);
setOpen(false);
setSearch("");
};
const triggerLabel =
value ||
(disabled
? "Select a runtime first"
: runtimeOnline
? defaultModel
? `Default — ${defaultModel.label}`
: "Default (provider)"
: "Runtime offline — enter manually");
if (!supported && !modelsQuery.isLoading) {
// Provider doesn't honour per-agent model selection — show a
// clearly-disabled state so the user knows why the control is
// inert. (Hermes reads its model from ~/.hermes/.env.)
return (
<div className="min-w-0">
<Label className="text-xs text-muted-foreground">Model</Label>
<div className="mt-1.5 flex items-start gap-2 rounded-lg border border-dashed border-border bg-muted/30 px-3 py-2.5 text-sm text-muted-foreground">
<Info className="mt-0.5 h-4 w-4 shrink-0" />
<div className="min-w-0">
<div>Model selection is managed by this runtime.</div>
<div className="mt-0.5 text-xs">
Configure the model on the runtime host (e.g. Hermes reads it
from its own config file).
</div>
</div>
</div>
</div>
);
}
return (
<div className="min-w-0">
<div className="flex items-center justify-between">
<Label className="text-xs text-muted-foreground">Model</Label>
{modelsQuery.isError && (
<span className="text-xs text-muted-foreground">discovery failed</span>
)}
</div>
<Popover open={open} onOpenChange={setOpen}>
<PopoverTrigger
disabled={disabled}
className="flex w-full min-w-0 items-center gap-3 rounded-lg border border-border bg-background px-3 py-2.5 mt-1.5 text-left text-sm transition-colors hover:bg-muted disabled:pointer-events-none disabled:opacity-50"
>
<Cpu className="h-4 w-4 shrink-0 text-muted-foreground" />
<div className="min-w-0 flex-1">
<div className="truncate font-medium">
{triggerLabel}
</div>
{value && (
<div className="truncate text-xs text-muted-foreground">
{modelLabel(models, value)}
</div>
)}
</div>
<ChevronDown
className={`h-4 w-4 shrink-0 text-muted-foreground transition-transform ${open ? "rotate-180" : ""}`}
/>
</PopoverTrigger>
<PopoverContent
align="start"
className="w-[var(--anchor-width)] p-0 overflow-hidden"
>
<div className="border-b border-border p-2">
<Input
autoFocus
placeholder="Search or type a model ID"
value={search}
onChange={(e) => setSearch(e.target.value)}
className="h-8"
/>
</div>
<div className="max-h-72 overflow-y-auto p-1">
{modelsQuery.isLoading && (
<div className="flex items-center gap-2 px-3 py-6 text-sm text-muted-foreground">
<Loader2 className="h-4 w-4 animate-spin" />
Discovering models
</div>
)}
{!modelsQuery.isLoading &&
Object.entries(filtered).map(([provider, list]) => (
<div key={provider} className="mb-1">
{provider && (
<div className="px-2 pt-1.5 pb-0.5 text-xs font-medium uppercase tracking-wide text-muted-foreground">
{provider}
</div>
)}
{list.map((m) => (
<button
key={m.id}
onClick={() => select(m.id)}
className={`flex w-full items-center gap-2 rounded-md px-3 py-2 text-left text-sm transition-colors ${
m.id === value ? "bg-accent" : "hover:bg-accent/50"
}`}
>
<div className="min-w-0 flex-1">
<div className="flex items-center gap-1.5">
<span className="truncate font-medium">{m.label}</span>
{m.default && (
<span className="shrink-0 rounded bg-primary/10 px-1.5 py-0.5 text-xs font-medium text-primary">
default
</span>
)}
</div>
{m.label !== m.id && (
<div className="truncate text-xs text-muted-foreground">
{m.id}
</div>
)}
</div>
{m.id === value && (
<Check className="h-4 w-4 shrink-0 text-primary" />
)}
</button>
))}
</div>
))}
{!modelsQuery.isLoading &&
Object.keys(filtered).length === 0 &&
!canCreate && (
<div className="px-3 py-6 text-center text-sm text-muted-foreground">
No models available.
</div>
)}
{canCreate && (
<button
onClick={() => select(trimmedSearch)}
className="flex w-full items-center gap-2 rounded-md px-3 py-2 text-left text-sm text-primary transition-colors hover:bg-accent/50"
>
<Plus className="h-4 w-4 shrink-0" />
<span className="truncate">
Use {trimmedSearch}
</span>
</button>
)}
{value && (
<button
onClick={() => select("")}
className="mt-1 flex w-full items-center gap-2 border-t border-border px-3 py-2 text-left text-xs text-muted-foreground transition-colors hover:bg-accent/50"
>
Clear selection (use provider default)
</button>
)}
</div>
</PopoverContent>
</Popover>
</div>
);
}
function groupByProvider(models: RuntimeModel[]): Record<string, RuntimeModel[]> {
const out: Record<string, RuntimeModel[]> = {};
for (const m of models) {
const key = m.provider ?? "";
if (!out[key]) out[key] = [];
out[key].push(m);
}
return out;
}
function modelLabel(models: RuntimeModel[], id: string): string {
const found = models.find((m) => m.id === id);
if (!found) return "custom";
return found.provider ? found.provider : "model";
}

@@ -23,6 +23,7 @@ import { api } from "@multica/core/api";
import { useFileUpload } from "@multica/core/hooks/use-file-upload";
import { ActorAvatar } from "../../../common/actor-avatar";
import { ProviderLogo } from "../../../runtimes/components/provider-logo";
import { ModelDropdown } from "../model-dropdown";
type RuntimeFilter = "mine" | "all";
@@ -44,6 +45,7 @@ export function SettingsTab({
const [visibility, setVisibility] = useState<AgentVisibility>(agent.visibility);
const [maxTasks, setMaxTasks] = useState(agent.max_concurrent_tasks);
const [selectedRuntimeId, setSelectedRuntimeId] = useState(agent.runtime_id);
const [model, setModel] = useState(agent.model ?? "");
const [runtimeOpen, setRuntimeOpen] = useState(false);
const [runtimeFilter, setRuntimeFilter] = useState<RuntimeFilter>("mine");
const [saving, setSaving] = useState(false);
@@ -90,7 +92,8 @@ export function SettingsTab({
description !== (agent.description ?? "") ||
visibility !== agent.visibility ||
maxTasks !== agent.max_concurrent_tasks ||
-selectedRuntimeId !== agent.runtime_id;
+selectedRuntimeId !== agent.runtime_id ||
+model !== (agent.model ?? "");
const handleSave = async () => {
if (!name.trim()) {
@@ -106,6 +109,7 @@ export function SettingsTab({
visibility,
max_concurrent_tasks: maxTasks,
runtime_id: selectedRuntimeId,
model,
});
toast.success("Settings saved");
} catch {
@@ -321,6 +325,14 @@ export function SettingsTab({
</Popover>
</div>
<ModelDropdown
runtimeId={selectedRuntime?.id ?? null}
runtimeOnline={selectedRuntime?.status === "online"}
value={model}
onChange={setModel}
disabled={!selectedRuntime}
/>
<Button onClick={handleSave} disabled={!dirty || saving} size="sm">
{saving ? <Loader2 className="h-3.5 w-3.5 mr-1.5 animate-spin" /> : <Save className="h-3.5 w-3.5 mr-1.5" />}
Save Changes

@@ -55,6 +55,7 @@ const agent: Agent = {
visibility: "workspace",
status: "idle",
max_concurrent_tasks: 1,
model: "",
owner_id: null,
skills: [],
created_at: "2026-04-16T00:00:00Z",

@@ -114,7 +114,8 @@ func init() {
agentCreateCmd.Flags().String("instructions", "", "Agent instructions")
agentCreateCmd.Flags().String("runtime-id", "", "Runtime ID (required)")
agentCreateCmd.Flags().String("runtime-config", "", "Runtime config as JSON string")
-agentCreateCmd.Flags().String("custom-args", "", "Custom CLI arguments as JSON array (e.g. '[\"--model\", \"o3\"]')")
+agentCreateCmd.Flags().String("model", "", "Model identifier (e.g. claude-sonnet-4-6, openai/gpt-4o). Prefer this over passing --model in --custom-args.")
+agentCreateCmd.Flags().String("custom-args", "", "Custom CLI arguments as JSON array. For model selection prefer --model; some providers (codex app-server, openclaw) reject --model in custom_args.")
agentCreateCmd.Flags().String("visibility", "private", "Visibility: private or workspace")
agentCreateCmd.Flags().Int32("max-concurrent-tasks", 6, "Maximum concurrent tasks")
agentCreateCmd.Flags().String("output", "json", "Output format: table or json")
@@ -125,7 +126,8 @@ func init() {
agentUpdateCmd.Flags().String("instructions", "", "New instructions")
agentUpdateCmd.Flags().String("runtime-id", "", "New runtime ID")
agentUpdateCmd.Flags().String("runtime-config", "", "New runtime config as JSON string")
-agentUpdateCmd.Flags().String("custom-args", "", "New custom CLI arguments as JSON array (e.g. '[\"--model\", \"o3\"]')")
+agentUpdateCmd.Flags().String("model", "", "New model identifier. Pass an empty string to clear and fall back to the runtime default.")
+agentUpdateCmd.Flags().String("custom-args", "", "New custom CLI arguments as JSON array. For model selection prefer --model; some providers (codex app-server, openclaw) reject --model in custom_args.")
agentUpdateCmd.Flags().String("visibility", "", "New visibility: private or workspace")
agentUpdateCmd.Flags().String("status", "", "New status")
agentUpdateCmd.Flags().Int32("max-concurrent-tasks", 0, "New max concurrent tasks")
@@ -347,6 +349,10 @@ func runAgentCreate(cmd *cobra.Command, _ []string) error {
}
body["custom_args"] = ca
}
if cmd.Flags().Changed("model") {
v, _ := cmd.Flags().GetString("model")
body["model"] = v
}
if cmd.Flags().Changed("visibility") {
v, _ := cmd.Flags().GetString("visibility")
body["visibility"] = v
@@ -412,6 +418,10 @@ func runAgentUpdate(cmd *cobra.Command, args []string) error {
}
body["custom_args"] = ca
}
if cmd.Flags().Changed("model") {
v, _ := cmd.Flags().GetString("model")
body["model"] = v
}
if cmd.Flags().Changed("visibility") {
v, _ := cmd.Flags().GetString("visibility")
body["visibility"] = v
@@ -426,7 +436,7 @@ func runAgentUpdate(cmd *cobra.Command, args []string) error {
}
if len(body) == 0 {
-return fmt.Errorf("no fields to update; use --name, --description, --instructions, --runtime-id, --runtime-config, --custom-args, --visibility, --status, or --max-concurrent-tasks")
+return fmt.Errorf("no fields to update; use --name, --description, --instructions, --runtime-id, --runtime-config, --model, --custom-args, --visibility, --status, or --max-concurrent-tasks")
}
ctx, cancel := context.WithTimeout(context.Background(), 15*time.Second)

@@ -145,6 +145,7 @@ func NewRouter(pool *pgxpool.Pool, hub *realtime.Hub, bus *events.Bus) chi.Route
r.Get("/runtimes/{runtimeId}/tasks/pending", h.ListPendingTasksByRuntime)
r.Post("/runtimes/{runtimeId}/ping/{pingId}/result", h.ReportPingResult)
r.Post("/runtimes/{runtimeId}/update/{updateId}/result", h.ReportUpdateResult)
r.Post("/runtimes/{runtimeId}/models/{requestId}/result", h.ReportModelListResult)
r.Get("/tasks/{taskId}/status", h.GetTaskStatus)
r.Post("/tasks/{taskId}/start", h.StartTask)
@@ -346,6 +347,8 @@ func NewRouter(pool *pgxpool.Pool, hub *realtime.Hub, bus *events.Bus) chi.Route
r.Get("/ping/{pingId}", h.GetPing)
r.Post("/update", h.InitiateUpdate)
r.Get("/update/{updateId}", h.GetUpdate)
r.Post("/models", h.InitiateListModels)
r.Get("/models/{requestId}", h.GetModelListRequest)
r.Delete("/", h.DeleteAgentRuntime)
})
})

@@ -147,9 +147,10 @@ func (c *Client) GetTaskStatus(ctx context.Context, taskID string) (string, erro
// HeartbeatResponse contains the server's response to a heartbeat, including any pending actions.
type HeartbeatResponse struct {
-Status string `json:"status"`
-PendingPing *PendingPing `json:"pending_ping,omitempty"`
-PendingUpdate *PendingUpdate `json:"pending_update,omitempty"`
+Status string `json:"status"`
+PendingPing *PendingPing `json:"pending_ping,omitempty"`
+PendingUpdate *PendingUpdate `json:"pending_update,omitempty"`
+PendingModelList *PendingModelList `json:"pending_model_list,omitempty"`
}
// PendingPing represents a ping test request from the server.
@@ -163,6 +164,11 @@ type PendingUpdate struct {
TargetVersion string `json:"target_version"`
}
// PendingModelList represents a request to enumerate supported models.
type PendingModelList struct {
ID string `json:"id"`
}
func (c *Client) SendHeartbeat(ctx context.Context, runtimeID string) (*HeartbeatResponse, error) {
var resp HeartbeatResponse
if err := c.postJSON(ctx, "/api/daemon/heartbeat", map[string]string{
@@ -182,6 +188,11 @@ func (c *Client) ReportUpdateResult(ctx context.Context, runtimeID, updateID str
return c.postJSON(ctx, fmt.Sprintf("/api/daemon/runtimes/%s/update/%s/result", runtimeID, updateID), result, nil)
}
// ReportModelListResult sends the model-discovery result back to the server.
func (c *Client) ReportModelListResult(ctx context.Context, runtimeID, requestID string, result map[string]any) error {
return c.postJSON(ctx, fmt.Sprintf("/api/daemon/runtimes/%s/models/%s/result", runtimeID, requestID), result, nil)
}
// WorkspaceInfo holds minimal workspace metadata returned by the API.
type WorkspaceInfo struct {
ID string `json:"id"`


@@ -496,11 +496,70 @@ func (d *Daemon) heartbeatLoop(ctx context.Context) {
if resp.PendingUpdate != nil {
go d.handleUpdate(ctx, rid, resp.PendingUpdate)
}
// Handle pending model-list requests.
if resp.PendingModelList != nil {
rt := d.findRuntime(rid)
if rt != nil {
go d.handleModelList(ctx, *rt, resp.PendingModelList.ID)
}
}
}
}
}
}
// handleModelList resolves the provider's supported models (via static
// catalog or by shelling out to the agent CLI) and reports the result
// back to the server. Discovery failures are reported with status
// "failed" plus the error text, which the server records on the
// request.
func (d *Daemon) handleModelList(ctx context.Context, rt Runtime, requestID string) {
d.logger.Info("model list requested", "runtime_id", rt.ID, "request_id", requestID, "provider", rt.Provider)
entry, ok := d.cfg.Agents[rt.Provider]
if !ok {
d.client.ReportModelListResult(ctx, rt.ID, requestID, map[string]any{
"status": "failed",
"error": fmt.Sprintf("no agent configured for provider %q", rt.Provider),
})
return
}
models, err := agent.ListModels(ctx, rt.Provider, entry.Path)
if err != nil {
d.client.ReportModelListResult(ctx, rt.ID, requestID, map[string]any{
"status": "failed",
"error": err.Error(),
})
return
}
// Wire format matches handler.ModelEntry. Use a struct (not
// map[string]string) so the Default bool round-trips — without
// it the UI loses its "default" badge on the advertised pick.
type modelWire struct {
ID string `json:"id"`
Label string `json:"label"`
Provider string `json:"provider,omitempty"`
Default bool `json:"default,omitempty"`
}
wire := make([]modelWire, 0, len(models))
for _, m := range models {
wire = append(wire, modelWire{
ID: m.ID,
Label: m.Label,
Provider: m.Provider,
Default: m.Default,
})
}
d.client.ReportModelListResult(ctx, rt.ID, requestID, map[string]any{
"status": "completed",
"models": wire,
"supported": agent.ModelSelectionSupported(rt.Provider),
})
}
func (d *Daemon) handlePing(ctx context.Context, rt Runtime, pingID string) {
d.logger.Info("ping requested", "runtime_id", rt.ID, "ping_id", pingID, "provider", rt.Provider)
@@ -1018,9 +1077,25 @@ func (d *Daemon) runTask(ctx context.Context, task Task, provider string, taskLo
customArgs = task.Agent.CustomArgs
mcpConfig = task.Agent.McpConfig
}
// Two-tier model resolution: an explicit agent.model wins,
// then the daemon-wide MULTICA_<PROVIDER>_MODEL env var. If
// both are empty we deliberately pass "" through — each
// backend omits `--model` from the CLI invocation, so the
// provider picks its own default (Claude Code's shipped
// default, codex app-server's account-scoped default, etc.).
// Baking a Go-side "recommended default" here is how the
// cursor regression happened — static guesses drift from
// whatever the upstream CLI actually accepts.
model := ""
if task.Agent != nil && task.Agent.Model != "" {
model = task.Agent.Model
}
if model == "" {
model = entry.Model
}
execOpts := agent.ExecOptions{
Cwd: env.WorkDir,
Model: entry.Model,
Model: model,
Timeout: d.cfg.AgentTimeout,
ResumeSessionID: task.PriorSessionID,
CustomArgs: customArgs,
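
The two-tier resolution described in the comment above can be sketched in isolation. This is a minimal illustration, not the daemon's code: `resolveModel` and `modelArgs` are hypothetical helpers, and the `--model` flag handling is an assumption about how a backend might build its argument list.

```go
package main

import "fmt"

// resolveModel is a hypothetical helper mirroring the two-tier lookup:
// the agent's explicit model wins, then the daemon-wide default taken
// from the environment. An empty result means "no preference".
func resolveModel(agentModel, envModel string) string {
	if agentModel != "" {
		return agentModel
	}
	return envModel
}

// modelArgs is a hypothetical illustration of a backend omitting the
// flag entirely when no model was resolved, so the provider CLI falls
// back to its own default.
func modelArgs(model string) []string {
	if model == "" {
		return nil
	}
	return []string{"--model", model}
}

func main() {
	for _, c := range [][2]string{
		{"claude-opus-4-7", "claude-sonnet-4-6"}, // explicit agent model
		{"", "claude-sonnet-4-6"},                // env default only
		{"", ""},                                 // nothing set
	} {
		m := resolveModel(c[0], c[1])
		fmt.Printf("model=%q args=%v\n", m, modelArgs(m))
	}
}
```

The point of the third case is that no Go-side guess is substituted: the empty string travels all the way to the argument builder.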


@@ -49,6 +49,7 @@ type AgentData struct {
CustomEnv map[string]string `json:"custom_env,omitempty"`
CustomArgs []string `json:"custom_args,omitempty"`
McpConfig json.RawMessage `json:"mcp_config,omitempty"`
Model string `json:"model,omitempty"`
}
// SkillData represents a structured skill for task execution.


@@ -36,6 +36,7 @@ type AgentResponse struct {
Visibility string `json:"visibility"`
Status string `json:"status"`
MaxConcurrentTasks int32 `json:"max_concurrent_tasks"`
Model string `json:"model"`
OwnerID *string `json:"owner_id"`
Skills []SkillResponse `json:"skills"`
CreatedAt string `json:"created_at"`
@@ -94,6 +95,7 @@ func agentToResponse(a db.Agent) AgentResponse {
Visibility: a.Visibility,
Status: a.Status,
MaxConcurrentTasks: a.MaxConcurrentTasks,
Model: a.Model.String,
OwnerID: uuidToPtr(a.OwnerID),
Skills: []SkillResponse{},
CreatedAt: timestampToString(a.CreatedAt),
@@ -144,6 +146,7 @@ type TaskAgentData struct {
CustomEnv map[string]string `json:"custom_env,omitempty"`
CustomArgs []string `json:"custom_args,omitempty"`
McpConfig json.RawMessage `json:"mcp_config,omitempty"`
Model string `json:"model,omitempty"`
}
func taskToResponse(t db.AgentTaskQueue) AgentTaskResponse {
@@ -265,6 +268,7 @@ type CreateAgentRequest struct {
McpConfig json.RawMessage `json:"mcp_config"`
Visibility string `json:"visibility"`
MaxConcurrentTasks int32 `json:"max_concurrent_tasks"`
Model string `json:"model"`
}
func decodeJSONBodyWithRawFields(body io.Reader, dst any) (map[string]json.RawMessage, error) {
@@ -362,6 +366,7 @@ func (h *Handler) CreateAgent(w http.ResponseWriter, r *http.Request) {
CustomEnv: ce,
CustomArgs: ca,
McpConfig: mc,
Model: pgtype.Text{String: req.Model, Valid: req.Model != ""},
})
if err != nil {
// Unique constraint on (workspace_id, name) — return a clear conflict error
@@ -401,6 +406,7 @@ type UpdateAgentRequest struct {
Visibility *string `json:"visibility"`
Status *string `json:"status"`
MaxConcurrentTasks *int32 `json:"max_concurrent_tasks"`
Model *string `json:"model"`
}
// canViewAgentEnv checks whether the requesting user is allowed to see the
@@ -523,6 +529,9 @@ func (h *Handler) UpdateAgent(w http.ResponseWriter, r *http.Request) {
if req.MaxConcurrentTasks != nil {
params.MaxConcurrentTasks = pgtype.Int4{Int32: *req.MaxConcurrentTasks, Valid: true}
}
if req.Model != nil {
params.Model = pgtype.Text{String: *req.Model, Valid: true}
}
agent, err = h.Queries.UpdateAgent(r.Context(), params)
if err != nil {


@@ -494,6 +494,11 @@ func (h *Handler) DaemonHeartbeat(w http.ResponseWriter, r *http.Request) {
}
}
// Check for pending model-list requests for this runtime.
if pending := h.ModelListStore.PopPending(req.RuntimeID); pending != nil {
resp["pending_model_list"] = map[string]string{"id": pending.ID}
}
writeJSON(w, http.StatusOK, resp)
}
@@ -589,6 +594,7 @@ func (h *Handler) ClaimTaskByRuntime(w http.ResponseWriter, r *http.Request) {
CustomEnv: customEnv,
CustomArgs: customArgs,
McpConfig: mcpConfig,
Model: agent.Model.String,
}
}


@@ -48,6 +48,7 @@ type Handler struct {
EmailService *service.EmailService
PingStore *PingStore
UpdateStore *UpdateStore
ModelListStore *ModelListStore
Storage storage.Storage
CFSigner *auth.CloudFrontSigner
cfg Config
@@ -71,6 +72,7 @@ func New(queries *db.Queries, txStarter txStarter, hub *realtime.Hub, bus *event
EmailService: emailService,
PingStore: NewPingStore(),
UpdateStore: NewUpdateStore(),
ModelListStore: NewModelListStore(),
Storage: store,
CFSigner: cfSigner,
cfg: cfg,


@@ -0,0 +1,228 @@
package handler
import (
"encoding/json"
"log/slog"
"net/http"
"sync"
"time"
"github.com/go-chi/chi/v5"
)
// ---------------------------------------------------------------------------
// In-memory model-list request store
// ---------------------------------------------------------------------------
//
// The server cannot call the daemon directly (the daemon is behind the user's
// NAT and only polls the server). So "list models for this runtime" uses the
// same pattern as PingStore: server creates a pending request, daemon pops it
// on the next heartbeat, executes locally, and reports the result back.
// ModelListStatus represents the lifecycle of a model list request.
type ModelListStatus string
const (
ModelListPending ModelListStatus = "pending"
ModelListRunning ModelListStatus = "running"
ModelListCompleted ModelListStatus = "completed"
ModelListFailed ModelListStatus = "failed"
ModelListTimeout ModelListStatus = "timeout"
)
// ModelListRequest represents a pending or completed model list request.
// Supported is false when the provider ignores per-agent model
// selection entirely. No provider does today: agent.ModelSelectionSupported
// returns true for all of them, hermes included. The UI uses this to
// disable its dropdown rather than silently accepting a value the
// backend will drop.
type ModelListRequest struct {
ID string `json:"id"`
RuntimeID string `json:"runtime_id"`
Status ModelListStatus `json:"status"`
Models []ModelEntry `json:"models,omitempty"`
Supported bool `json:"supported"`
Error string `json:"error,omitempty"`
CreatedAt time.Time `json:"created_at"`
UpdatedAt time.Time `json:"updated_at"`
}
// ModelEntry mirrors agent.Model for the wire. `Default` tags the
// model the runtime advertises as its preferred pick (e.g. Claude
// Code's shipped default, or hermes' currentModelId) so the UI can
// badge it — don't drop it when marshalling.
type ModelEntry struct {
ID string `json:"id"`
Label string `json:"label"`
Provider string `json:"provider,omitempty"`
Default bool `json:"default,omitempty"`
}
// ModelListStore is a thread-safe in-memory store. Entries older than
// 2 minutes are garbage-collected on the next Create to bound memory use;
// the UI polls GET /models/{requestId} until status is terminal.
type ModelListStore struct {
mu sync.Mutex
requests map[string]*ModelListRequest
}
func NewModelListStore() *ModelListStore {
return &ModelListStore{requests: make(map[string]*ModelListRequest)}
}
func (s *ModelListStore) Create(runtimeID string) *ModelListRequest {
s.mu.Lock()
defer s.mu.Unlock()
// Garbage-collect stale entries so the map can't grow unbounded.
for id, req := range s.requests {
if time.Since(req.CreatedAt) > 2*time.Minute {
delete(s.requests, id)
}
}
req := &ModelListRequest{
ID: randomID(),
RuntimeID: runtimeID,
Status: ModelListPending,
// Default to true; the daemon overrides this in the report
// for providers that don't support per-agent model selection.
Supported: true,
CreatedAt: time.Now(),
UpdatedAt: time.Now(),
}
s.requests[req.ID] = req
return req
}
func (s *ModelListStore) Get(id string) *ModelListRequest {
s.mu.Lock()
defer s.mu.Unlock()
req, ok := s.requests[id]
if !ok {
return nil
}
if req.Status == ModelListPending && time.Since(req.CreatedAt) > 30*time.Second {
req.Status = ModelListTimeout
req.Error = "daemon did not respond within 30 seconds"
req.UpdatedAt = time.Now()
}
return req
}
// PopPending returns and marks-running the oldest pending request for a runtime.
func (s *ModelListStore) PopPending(runtimeID string) *ModelListRequest {
s.mu.Lock()
defer s.mu.Unlock()
var oldest *ModelListRequest
for _, req := range s.requests {
if req.RuntimeID == runtimeID && req.Status == ModelListPending {
if oldest == nil || req.CreatedAt.Before(oldest.CreatedAt) {
oldest = req
}
}
}
if oldest != nil {
oldest.Status = ModelListRunning
oldest.UpdatedAt = time.Now()
}
return oldest
}
func (s *ModelListStore) Complete(id string, models []ModelEntry, supported bool) {
s.mu.Lock()
defer s.mu.Unlock()
if req, ok := s.requests[id]; ok {
req.Status = ModelListCompleted
req.Models = models
req.Supported = supported
req.UpdatedAt = time.Now()
}
}
func (s *ModelListStore) Fail(id string, errMsg string) {
s.mu.Lock()
defer s.mu.Unlock()
if req, ok := s.requests[id]; ok {
req.Status = ModelListFailed
req.Error = errMsg
req.UpdatedAt = time.Now()
}
}
// ---------------------------------------------------------------------------
// Handlers
// ---------------------------------------------------------------------------
// InitiateListModels creates a pending model list request for a runtime.
// Called by the frontend; the daemon picks it up on its next heartbeat.
func (h *Handler) InitiateListModels(w http.ResponseWriter, r *http.Request) {
runtimeID := chi.URLParam(r, "runtimeId")
rt, err := h.Queries.GetAgentRuntime(r.Context(), parseUUID(runtimeID))
if err != nil {
writeError(w, http.StatusNotFound, "runtime not found")
return
}
if _, ok := h.requireWorkspaceMember(w, r, uuidToString(rt.WorkspaceID), "runtime not found"); !ok {
return
}
if rt.Status != "online" {
writeError(w, http.StatusServiceUnavailable, "runtime is offline")
return
}
req := h.ModelListStore.Create(runtimeID)
writeJSON(w, http.StatusOK, req)
}
// GetModelListRequest returns the status of a model list request.
func (h *Handler) GetModelListRequest(w http.ResponseWriter, r *http.Request) {
requestID := chi.URLParam(r, "requestId")
req := h.ModelListStore.Get(requestID)
if req == nil {
writeError(w, http.StatusNotFound, "request not found")
return
}
writeJSON(w, http.StatusOK, req)
}
// ReportModelListResult receives the list result from the daemon.
func (h *Handler) ReportModelListResult(w http.ResponseWriter, r *http.Request) {
runtimeID := chi.URLParam(r, "runtimeId")
if _, ok := h.requireDaemonRuntimeAccess(w, r, runtimeID); !ok {
return
}
requestID := chi.URLParam(r, "requestId")
var body struct {
Status string `json:"status"` // "completed" or "failed"
Models []ModelEntry `json:"models"`
Supported *bool `json:"supported"`
Error string `json:"error"`
}
if err := json.NewDecoder(r.Body).Decode(&body); err != nil {
writeError(w, http.StatusBadRequest, "invalid request body")
return
}
if body.Status == "completed" {
// Older daemons may omit `supported`; default to true to keep
// the UI usable while they haven't been redeployed yet.
supported := true
if body.Supported != nil {
supported = *body.Supported
}
h.ModelListStore.Complete(requestID, body.Models, supported)
} else {
h.ModelListStore.Fail(requestID, body.Error)
}
slog.Debug("model list report", "runtime_id", runtimeID, "request_id", requestID, "status", body.Status, "count", len(body.Models))
writeJSON(w, http.StatusOK, map[string]string{"status": "ok"})
}
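
The request lifecycle in this file (frontend creates a pending request, the daemon pops it on its next heartbeat, then reports a result) can be sketched with a stripped-down store. This is an illustrative reduction under stated assumptions: the type and method names are hypothetical, IDs come from a counter rather than `randomID()`, and expiry and timeout handling are left out.

```go
package main

import (
	"fmt"
	"sync"
)

type status string

const (
	pending   status = "pending"
	running   status = "running"
	completed status = "completed"
)

// request is a hypothetical, reduced stand-in for ModelListRequest.
type request struct {
	ID        string
	RuntimeID string
	Status    status
	Models    []string
	seq       int // creation order, used instead of timestamps
}

type store struct {
	mu       sync.Mutex
	next     int
	requests map[string]*request
}

func newStore() *store { return &store{requests: map[string]*request{}} }

// create registers a pending request (the frontend's POST).
func (s *store) create(runtimeID string) *request {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.next++
	r := &request{ID: fmt.Sprintf("req-%d", s.next), RuntimeID: runtimeID, Status: pending, seq: s.next}
	s.requests[r.ID] = r
	return r
}

// popPending hands the oldest pending request for a runtime to the
// daemon and marks it running so it is not handed out twice.
func (s *store) popPending(runtimeID string) *request {
	s.mu.Lock()
	defer s.mu.Unlock()
	var oldest *request
	for _, r := range s.requests {
		if r.RuntimeID != runtimeID || r.Status != pending {
			continue
		}
		if oldest == nil || r.seq < oldest.seq {
			oldest = r
		}
	}
	if oldest != nil {
		oldest.Status = running
	}
	return oldest
}

// complete records the daemon's report.
func (s *store) complete(id string, models []string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if r, ok := s.requests[id]; ok {
		r.Status = completed
		r.Models = models
	}
}

func main() {
	s := newStore()
	req := s.create("rt-1")                 // frontend initiates
	job := s.popPending("rt-1")             // daemon heartbeat picks it up
	s.complete(job.ID, []string{"model-a"}) // daemon reports back
	fmt.Println(req.Status, req.Models)     // frontend polls and sees the result
}
```

Marking the request running inside the same critical section as the lookup is what keeps two overlapping heartbeats from receiving the same request.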


@@ -0,0 +1,88 @@
package handler
import (
"bytes"
"encoding/json"
"net/http"
"net/http/httptest"
"testing"
)
// TestReportModelListResult_PreservesDefault guards the daemon → server
// → UI wire format for the model-discovery result. The `default` bool
// on each ModelEntry lights up the UI's "default" badge; if it gets
// dropped here (e.g. by going through a map[string]string), the badge
// silently disappears.
func TestReportModelListResult_PreservesDefault(t *testing.T) {
store := NewModelListStore()
req := store.Create("runtime-xyz")
// Report a completed result with one default entry and one not.
body := map[string]any{
"status": "completed",
"supported": true,
"models": []map[string]any{
{"id": "foo-default", "label": "Foo", "provider": "p", "default": true},
{"id": "bar", "label": "Bar", "provider": "p"},
},
}
raw, _ := json.Marshal(body)
// Use the store's Complete directly — we're verifying the wire
// shape, not HTTP auth. The handler itself unmarshals into
// []ModelEntry and forwards verbatim, which is the path we care
// about here.
var parsed struct {
Models []ModelEntry `json:"models"`
}
if err := json.Unmarshal(raw, &parsed); err != nil {
t.Fatalf("unmarshal report body: %v", err)
}
store.Complete(req.ID, parsed.Models, true)
got := store.Get(req.ID)
if got == nil {
t.Fatal("expected stored result")
}
if len(got.Models) != 2 {
t.Fatalf("expected 2 models, got %d: %+v", len(got.Models), got.Models)
}
if !got.Models[0].Default {
t.Errorf("first model should carry Default=true, got %+v", got.Models[0])
}
if got.Models[1].Default {
t.Errorf("second model should carry Default=false, got %+v", got.Models[1])
}
// Serialise the stored request back out (what the UI actually sees)
// and confirm `default: true` survives.
out, _ := json.Marshal(got)
if !bytes.Contains(out, []byte(`"default":true`)) {
t.Errorf(`expected "default":true in JSON response, got: %s`, out)
}
}
// TestReportModelListResult_DecodesJSONBodyDefault verifies the
// handler's request-body parsing accepts the `default` bool from
// the daemon POST — not just through the store API.
func TestReportModelListResult_DecodesJSONBodyDefault(t *testing.T) {
// Simulate the shape the daemon POSTs: status + models + supported
// with `default` on one entry.
payload := `{"status":"completed","supported":true,"models":[{"id":"a","label":"A","default":true},{"id":"b","label":"B"}]}`
r := httptest.NewRequest(http.MethodPost, "/api/daemon/runtimes/rt/models/req/result", bytes.NewBufferString(payload))
var body struct {
Status string `json:"status"`
Models []ModelEntry `json:"models"`
Supported *bool `json:"supported"`
}
if err := json.NewDecoder(r.Body).Decode(&body); err != nil {
t.Fatalf("decode: %v", err)
}
if len(body.Models) != 2 {
t.Fatalf("want 2 models, got %d", len(body.Models))
}
if !body.Models[0].Default {
t.Errorf("default flag lost on model[0]: %+v", body.Models[0])
}
}
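
The regression these tests guard against is easy to reproduce outside the handler. The sketch below assumes a reduced `entry` type (hypothetical, modelled on `ModelEntry`) and contrasts a `map[string]string` encoder, which has no way to carry a bool, with a typed struct encoder.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// entry is a hypothetical, reduced stand-in for handler.ModelEntry.
type entry struct {
	ID      string `json:"id"`
	Label   string `json:"label"`
	Default bool   `json:"default,omitempty"`
}

// encodeLossy mimics a map[string]string wire format: only string
// fields fit, so Default is silently dropped.
func encodeLossy(models []entry) ([]byte, error) {
	out := make([]map[string]string, 0, len(models))
	for _, m := range models {
		out = append(out, map[string]string{"id": m.ID, "label": m.Label})
	}
	return json.Marshal(out)
}

// encodeTyped marshals the struct directly, so the bool survives.
func encodeTyped(models []entry) ([]byte, error) {
	return json.Marshal(models)
}

// decode is what the receiving side would do with either payload.
func decode(raw []byte) ([]entry, error) {
	var got []entry
	err := json.Unmarshal(raw, &got)
	return got, err
}

func main() {
	models := []entry{{ID: "a", Label: "A", Default: true}, {ID: "b", Label: "B"}}

	lossy, _ := encodeLossy(models)
	typed, _ := encodeTyped(models)
	fmt.Println(string(lossy))
	fmt.Println(string(typed))
}
```

Both payloads decode without error, which is why the original bug was silent: the missing key simply becomes the zero value `false`.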


@@ -0,0 +1 @@
ALTER TABLE agent DROP COLUMN IF EXISTS model;


@@ -0,0 +1,5 @@
-- Adds an explicit per-agent model field. Previously the only way to
-- pick a model per agent was via custom_env / custom_args; a first-class
-- column lets the UI render a dropdown and keeps Codex-style app-server
-- providers (which reject -m in custom_args) working without CLI flags.
ALTER TABLE agent ADD COLUMN model TEXT;


@@ -5,7 +5,9 @@ import (
"context"
"encoding/json"
"fmt"
"io"
"os/exec"
"regexp"
"strings"
"sync"
"time"
@@ -64,7 +66,15 @@ func (b *hermesBackend) Execute(ctx context.Context, prompt string, opts ExecOpt
cancel()
return nil, fmt.Errorf("hermes stdin pipe: %w", err)
}
cmd.Stderr = newLogWriter(b.cfg.Logger, "[hermes:stderr] ")
// Forward stderr to the daemon log *and* sniff provider-level
// errors out of it so we can surface them in the task result.
// Hermes' session/prompt still reports stopReason=end_turn when
// the underlying HTTP call to the LLM returns 4xx/5xx, so
// without this we'd report a misleading "empty output" and hide
// the real cause (wrong model for the current provider, bad
// credentials, rate limit, …) in the daemon log.
providerErr := newHermesProviderErrorSniffer()
cmd.Stderr = io.MultiWriter(newLogWriter(b.cfg.Logger, "[hermes:stderr] "), providerErr)
if err := cmd.Start(); err != nil {
cancel()
@@ -82,8 +92,8 @@ func (b *hermesBackend) Execute(ctx context.Context, prompt string, opts ExecOpt
promptDone := make(chan hermesPromptResult, 1)
c := &hermesClient{
cfg: b.cfg,
stdin: stdin,
cfg: b.cfg,
stdin: stdin,
pending: make(map[int]*pendingRPC),
onMessage: func(msg Message) {
if msg.Type == MessageText {
@@ -190,13 +200,40 @@ func (b *hermesBackend) Execute(ctx context.Context, prompt string, opts ExecOpt
c.sessionID = sessionID
b.cfg.Logger.Info("hermes session created", "session_id", sessionID)
// 3. Build the prompt content. If we have a system prompt, prepend it.
// 3. If the caller picked a model (via agent.model from the
// UI dropdown), ask hermes to switch the session to it
// before we send any prompt. Hermes' _build_model_state
// exposes modelId as `provider:model` — we pass that
// through verbatim. This MUST fail the task on error:
// if we silently fell back to hermes' default model the
// user would think their pick was honoured while the
// task actually ran on something else.
if opts.Model != "" {
if _, err := c.request(runCtx, "session/set_model", map[string]any{
"sessionId": sessionID,
"modelId": opts.Model,
}); err != nil {
b.cfg.Logger.Warn("hermes set_session_model failed", "error", err, "requested_model", opts.Model)
finalStatus = "failed"
finalError = fmt.Sprintf("hermes could not switch to model %q: %v", opts.Model, err)
resCh <- Result{
Status: finalStatus,
Error: finalError,
DurationMs: time.Since(startTime).Milliseconds(),
SessionID: sessionID,
}
return
}
b.cfg.Logger.Info("hermes session model set", "model", opts.Model)
}
// 4. Build the prompt content. If we have a system prompt, prepend it.
userText := prompt
if opts.SystemPrompt != "" {
userText = opts.SystemPrompt + "\n\n---\n\n" + prompt
}
// 4. Send the prompt and wait for PromptResponse.
// 5. Send the prompt and wait for PromptResponse.
_, err = c.request(runCtx, "session/prompt", map[string]any{
"sessionId": sessionID,
"prompt": []map[string]any{
@@ -248,6 +285,20 @@ func (b *hermesBackend) Execute(ctx context.Context, prompt string, opts ExecOpt
finalOutput := output.String()
outputMu.Unlock()
// If hermes produced no visible output but we sniffed a
// provider-level error on stderr (typically HTTP 4xx from
// the configured LLM endpoint), promote the status to
// failed and surface the real reason. Without this the
// daemon reports a cryptic "hermes returned empty output"
// and the actionable error (e.g. "model X not supported
// with your ChatGPT account") stays buried in daemon logs.
if finalStatus == "completed" && finalOutput == "" {
if msg := providerErr.message(); msg != "" {
finalStatus = "failed"
finalError = msg
}
}
// Build usage map.
c.usageMu.Lock()
u := c.usage
@@ -283,13 +334,13 @@ type hermesPromptResult struct {
}
type hermesClient struct {
cfg Config
stdin interface{ Write([]byte) (int, error) }
mu sync.Mutex
nextID int
pending map[int]*pendingRPC
sessionID string
onMessage func(Message)
cfg Config
stdin interface{ Write([]byte) (int, error) }
mu sync.Mutex
nextID int
pending map[int]*pendingRPC
sessionID string
onMessage func(Message)
onPromptDone func(hermesPromptResult)
usageMu sync.Mutex
@@ -427,8 +478,8 @@ func (c *hermesClient) extractPromptResult(data json.RawMessage) {
}
if resp.Usage != nil {
pr.usage = TokenUsage{
InputTokens: resp.Usage.InputTokens,
OutputTokens: resp.Usage.OutputTokens,
InputTokens: resp.Usage.InputTokens,
OutputTokens: resp.Usage.OutputTokens,
CacheReadTokens: resp.Usage.CachedReadTokens,
}
}
@@ -509,9 +560,9 @@ func (c *hermesClient) handleAgentThought(data json.RawMessage) {
func (c *hermesClient) handleToolCallStart(data json.RawMessage) {
var msg struct {
ToolCallID string `json:"toolCallId"`
Title string `json:"title"`
Kind string `json:"kind"`
ToolCallID string `json:"toolCallId"`
Title string `json:"title"`
Kind string `json:"kind"`
RawInput map[string]any `json:"rawInput"`
}
if err := json.Unmarshal(data, &msg); err != nil {
@@ -649,3 +700,98 @@ func hermesToolNameFromTitle(title string, kind string) string {
return kind
}
}
// ── Provider-error sniffing ──
//
// hermes' session/prompt RPC reports stopReason=end_turn even when
// the underlying HTTP call to the configured LLM endpoint returned
// an error — the actionable detail only appears on stderr (e.g.
// `⚠️ API call failed (attempt 1/3): BadRequestError [HTTP 400]` and
// `Error: HTTP 400: Error code: 400 - {'detail': "The '...' model
// is not supported when using Codex with a ChatGPT account."}`).
// We scan for those patterns so the daemon can surface a real
// failure instead of a generic "empty output".
type hermesProviderErrorSniffer struct {
mu sync.Mutex
remains []byte // buffer for a partial trailing line across writes
lines []string // captured error lines, bounded
seen map[string]bool
}
// hermesErrorHeaderRe matches the first line of an API-error block.
// Hermes prefixes these with ⚠️ / ❌ and includes an HTTP status
// code or a non-retryable-error tag.
var hermesErrorHeaderRe = regexp.MustCompile(`(?:⚠️|❌|\[ERROR\]).*(?:BadRequestError|AuthenticationError|RateLimitError|HTTP [0-9]{3}|Non-retryable|API call failed)`)
// hermesErrorDetailRe pulls the most useful single-line messages
// out of the subsequent lines of the error block (the one whose
// "Error:" or "Details:" tag actually spells out what happened).
var hermesErrorDetailRe = regexp.MustCompile(`(?:Error:|detail:|Details:)\s*(.+)`)
const hermesMaxErrorLines = 8
func newHermesProviderErrorSniffer() *hermesProviderErrorSniffer {
return &hermesProviderErrorSniffer{seen: map[string]bool{}}
}
// Write implements io.Writer so the sniffer can sit behind an
// io.MultiWriter next to the normal stderr log forwarder.
func (s *hermesProviderErrorSniffer) Write(p []byte) (int, error) {
s.mu.Lock()
defer s.mu.Unlock()
data := append(s.remains, p...)
// Keep the final partial line (no trailing newline) for the
// next write so multi-line error blocks aren't split.
nl := strings.LastIndexByte(string(data), '\n')
var complete string
if nl < 0 {
s.remains = append(s.remains[:0], data...)
return len(p), nil
}
complete = string(data[:nl])
s.remains = append(s.remains[:0], data[nl+1:]...)
for _, line := range strings.Split(complete, "\n") {
line = strings.TrimSpace(line)
if line == "" {
continue
}
if !(hermesErrorHeaderRe.MatchString(line) || hermesErrorDetailRe.MatchString(line)) {
continue
}
if s.seen[line] {
continue
}
s.seen[line] = true
s.lines = append(s.lines, line)
if len(s.lines) > hermesMaxErrorLines {
s.lines = s.lines[len(s.lines)-hermesMaxErrorLines:]
}
}
return len(p), nil
}
// message returns a single-line summary suitable for the task
// error field. Prefers the most specific "Error:" / "detail:"
// fragment; falls back to the first captured header line; empty
// when nothing useful was seen.
func (s *hermesProviderErrorSniffer) message() string {
s.mu.Lock()
defer s.mu.Unlock()
for _, line := range s.lines {
if m := hermesErrorDetailRe.FindStringSubmatch(line); m != nil {
detail := strings.TrimSpace(m[1])
if detail != "" {
return "hermes provider error: " + detail
}
}
}
for _, line := range s.lines {
if hermesErrorHeaderRe.MatchString(line) {
return "hermes provider error: " + line
}
}
return ""
}


@@ -2,6 +2,7 @@ package agent
import (
"encoding/json"
"strings"
"testing"
)
@@ -375,3 +376,71 @@ func TestHermesClientIgnoresInvalidJSON(t *testing.T) {
c.handleLine("")
c.handleLine("{}")
}
func TestHermesProviderErrorSniffer(t *testing.T) {
t.Parallel()
// Real sample of the stderr hermes emits when the configured
// LLM endpoint rejects the requested model. We verify the
// sniffer extracts the `Error: ...` line so the task error
// tells the user *why* it failed.
s := newHermesProviderErrorSniffer()
lines := []string{
"2026-04-20 23:41:47 [INFO] acp_adapter.server: Prompt on session abc",
`⚠️ API call failed (attempt 1/3): BadRequestError [HTTP 400]`,
` 🔌 Provider: openai-codex Model: gpt-5.1-codex-mini`,
` 📝 Error: HTTP 400: Error code: 400 - {'detail': "The 'gpt-5.1-codex-mini' model is not supported when using Codex with a ChatGPT account."}`,
`⏱️ Elapsed: 1.17s`,
}
for _, line := range lines {
if _, err := s.Write([]byte(line + "\n")); err != nil {
t.Fatalf("Write: %v", err)
}
}
msg := s.message()
if msg == "" {
t.Fatal("expected a non-empty error message")
}
if !strings.Contains(msg, "model is not supported") {
t.Errorf("expected detail about model support, got %q", msg)
}
}
func TestHermesProviderErrorSnifferIgnoresInfoLines(t *testing.T) {
t.Parallel()
s := newHermesProviderErrorSniffer()
s.Write([]byte("2026-04-20 23:41:45 [INFO] acp_adapter.entry: Loaded env\n"))
s.Write([]byte("2026-04-20 23:41:47 [INFO] agent.auxiliary_client: Vision auto-detect...\n"))
if msg := s.message(); msg != "" {
t.Errorf("info lines should produce no error, got %q", msg)
}
}
func TestHermesProviderErrorSnifferHandlesPartialLines(t *testing.T) {
t.Parallel()
// Writer may be called mid-line; the sniffer must buffer until
// it sees a newline so the regex doesn't miss the header.
s := newHermesProviderErrorSniffer()
s.Write([]byte(`⚠️ API call failed (attempt 1/3):`))
s.Write([]byte(` BadRequestError [HTTP 400]` + "\n"))
s.Write([]byte(` 📝 Error: something went wrong` + "\n"))
msg := s.message()
if !strings.Contains(msg, "something went wrong") {
t.Errorf("expected buffered line to be captured, got %q", msg)
}
}
func TestHermesProviderErrorSnifferBoundedBuffer(t *testing.T) {
t.Parallel()
s := newHermesProviderErrorSniffer()
for i := 0; i < 20; i++ {
// Each line differs so dedup doesn't merge them.
s.Write([]byte(`⚠️ API call failed (HTTP 400) attempt ` + string(rune('a'+i%26)) + `: Non-retryable error` + "\n"))
}
if len(s.lines) > hermesMaxErrorLines {
t.Errorf("sniffer kept %d lines, limit is %d", len(s.lines), hermesMaxErrorLines)
}
}

server/pkg/agent/models.go Normal file

@@ -0,0 +1,741 @@
package agent
import (
"bufio"
"bytes"
"context"
"encoding/json"
"fmt"
"io"
"os"
"os/exec"
"strings"
"sync"
"time"
)
// Model describes a single LLM model exposed by an agent provider.
// The dropdown groups by Provider when the ID uses the
// `provider/model` form (e.g. "openai/gpt-4o" from opencode).
// Default is a *display* hint: the UI badges the entry the
// runtime advertises as its preferred pick (e.g. Claude Code's
// shipped default, or hermes' currentModelId). It has no effect
// at execution time — when agent.model is empty the daemon passes
// "" to the backend so each provider's own CLI resolves its own
// default, which is always closer to what the user's account /
// environment actually supports than a static guess here.
type Model struct {
ID string `json:"id"`
Label string `json:"label"`
Provider string `json:"provider,omitempty"`
Default bool `json:"default,omitempty"`
}
// modelCacheEntry is one memoized discovery result. The cache exists
// so repeated UI loads don't re-shell the agent CLI. Entries expire
// after modelCacheTTL.
type modelCacheEntry struct {
models []Model
expiresAt time.Time
}
var (
modelCacheMu sync.Mutex
modelCache = map[string]modelCacheEntry{}
)
const modelCacheTTL = 60 * time.Second
// ListModels returns the models supported by the given agent provider.
// For providers with a known static catalog it returns the baked-in
// list; for providers with a CLI discovery mechanism (cursor, hermes,
// opencode, pi, openclaw) it shells out with caching and falls back to
// the static list on failure.
//
// executablePath lets the caller point at a non-default binary; pass
// "" to use the provider's default name on PATH.
func ListModels(ctx context.Context, providerType, executablePath string) ([]Model, error) {
switch providerType {
case "claude":
return claudeStaticModels(), nil
case "codex":
return codexStaticModels(), nil
case "gemini":
return geminiStaticModels(), nil
case "cursor":
return cachedDiscovery(providerType, func() ([]Model, error) {
return discoverCursorModels(ctx, executablePath)
})
case "copilot":
return copilotStaticModels(), nil
case "hermes":
return cachedDiscovery(providerType, func() ([]Model, error) {
return discoverHermesModels(ctx, executablePath)
})
case "opencode":
return cachedDiscovery(providerType, func() ([]Model, error) {
return discoverOpenCodeModels(ctx, executablePath)
})
case "pi":
return cachedDiscovery(providerType, func() ([]Model, error) {
return discoverPiModels(ctx, executablePath)
})
case "openclaw":
return cachedDiscovery(providerType, func() ([]Model, error) {
return discoverOpenclawAgents(ctx, executablePath)
})
default:
return nil, fmt.Errorf("unknown agent type: %q", providerType)
}
}
// ModelSelectionSupported reports whether setting `agent.model` has
// any effect for the given provider. Today every provider in the
// registry honours `opts.Model` end-to-end: Hermes routes it through
// the ACP `session/set_model` RPC before each prompt, which means
// the UI's dropdown choice is carried all the way down to the LLM
// call. The helper is retained so we can add a `return false` branch
// the next time a provider legitimately ignores model selection.
func ModelSelectionSupported(providerType string) bool {
_ = providerType
return true
}
// cachedDiscovery invokes fn and caches the result for modelCacheTTL.
// The cache is keyed on providerType only; callers that need to
// distinguish discovery by host/user should include that in the key
// if we ever introduce such a mode.
func cachedDiscovery(key string, fn func() ([]Model, error)) ([]Model, error) {
modelCacheMu.Lock()
if entry, ok := modelCache[key]; ok && time.Now().Before(entry.expiresAt) {
out := entry.models
modelCacheMu.Unlock()
return out, nil
}
modelCacheMu.Unlock()
models, err := fn()
if err != nil {
return nil, err
}
modelCacheMu.Lock()
modelCache[key] = modelCacheEntry{models: models, expiresAt: time.Now().Add(modelCacheTTL)}
modelCacheMu.Unlock()
return models, nil
}
// ── Static catalogs ──
// claudeStaticModels reflects the Claude Code CLI's accepted --model
// values. Keep this list short and current; stale entries here
// mislead users more than they help. Default = Sonnet because it's
// the everyday workhorse (Opus is reserved for advisor-style flows).
func claudeStaticModels() []Model {
return []Model{
{ID: "claude-sonnet-4-6", Label: "Claude Sonnet 4.6", Provider: "anthropic", Default: true},
{ID: "claude-opus-4-7", Label: "Claude Opus 4.7", Provider: "anthropic"},
{ID: "claude-haiku-4-5-20251001", Label: "Claude Haiku 4.5", Provider: "anthropic"},
{ID: "claude-opus-4-6", Label: "Claude Opus 4.6", Provider: "anthropic"},
{ID: "claude-sonnet-4-5", Label: "Claude Sonnet 4.5", Provider: "anthropic"},
}
}
func codexStaticModels() []Model {
return []Model{
{ID: "gpt-5.4", Label: "GPT-5.4", Provider: "openai", Default: true},
{ID: "gpt-5.4-mini", Label: "GPT-5.4 mini", Provider: "openai"},
{ID: "gpt-5.3-codex", Label: "GPT-5.3 Codex", Provider: "openai"},
{ID: "gpt-5", Label: "GPT-5", Provider: "openai"},
{ID: "o3", Label: "o3", Provider: "openai"},
{ID: "o3-mini", Label: "o3-mini", Provider: "openai"},
}
}
func geminiStaticModels() []Model {
return []Model{
{ID: "gemini-2.5-pro", Label: "Gemini 2.5 Pro", Provider: "google", Default: true},
{ID: "gemini-2.5-flash", Label: "Gemini 2.5 Flash", Provider: "google"},
{ID: "gemini-2.0-flash", Label: "Gemini 2.0 Flash", Provider: "google"},
}
}
// cursorStaticModels is a minimal fallback used when
// `cursor-agent --list-models` isn't available (binary missing,
// offline, etc). The real catalog is fetched dynamically because
// Cursor's model IDs shift (e.g. `composer-2-fast`,
// `claude-4.6-sonnet-medium`, `gemini-3.1-pro`) and any static
// list we ship goes stale fast.
func cursorStaticModels() []Model {
return []Model{
{ID: "auto", Label: "Auto", Provider: "cursor", Default: true},
}
}
// copilotStaticModels — GitHub Copilot CLI resolves models via the
// user's GitHub account, not via CLI args. We deliberately mark no
// Default: the right model is whatever GitHub routes the request
// to, and forcing one here would override that.
func copilotStaticModels() []Model {
return []Model{
{ID: "gpt-5.4", Label: "GPT-5.4", Provider: "openai"},
{ID: "claude-sonnet-4-6", Label: "Claude Sonnet 4.6", Provider: "anthropic"},
}
}
// ── Dynamic discovery ──
// discoverOpenCodeModels runs `opencode models` and parses its tabular
// output. The CLI prints `provider/model` rows; we emit them verbatim
// as IDs so what the user sees matches what `--model` accepts.
// On any failure (CLI missing, parse error, timeout) we fall back to
// an empty list so the creatable UI still works.
func discoverOpenCodeModels(ctx context.Context, executablePath string) ([]Model, error) {
if executablePath == "" {
executablePath = "opencode"
}
if _, err := exec.LookPath(executablePath); err != nil {
return []Model{}, nil
}
runCtx, cancel := context.WithTimeout(ctx, 5*time.Second)
defer cancel()
cmd := exec.CommandContext(runCtx, executablePath, "models")
out, err := cmd.Output()
if err != nil {
return []Model{}, nil
}
return parseOpenCodeModels(string(out)), nil
}
// parseOpenCodeModels accepts the `opencode models` text output and
// extracts IDs. Output format (v0.x): a header row followed by rows
// whose first whitespace-delimited field is `provider/model`.
func parseOpenCodeModels(output string) []Model {
scanner := bufio.NewScanner(strings.NewReader(output))
scanner.Buffer(make([]byte, 0, 64*1024), 1024*1024)
var models []Model
seen := map[string]bool{}
for scanner.Scan() {
line := strings.TrimSpace(scanner.Text())
if line == "" {
continue
}
first := strings.Fields(line)
if len(first) == 0 {
continue
}
id := first[0]
if !strings.Contains(id, "/") {
continue
}
// Skip the header row (opencode prints e.g. PROVIDER/MODEL in caps).
if id == strings.ToUpper(id) {
continue
}
if seen[id] {
continue
}
seen[id] = true
provider := ""
if i := strings.Index(id, "/"); i > 0 {
provider = id[:i]
}
models = append(models, Model{ID: id, Label: id, Provider: provider})
}
return models
}
// discoverPiModels runs `pi --list-models` and parses its output.
// Older pi versions print the list to stderr; newer versions use
// stdout. We capture both, preferring stdout and falling back to
// stderr when stdout is empty.
func discoverPiModels(ctx context.Context, executablePath string) ([]Model, error) {
if executablePath == "" {
executablePath = "pi"
}
if _, err := exec.LookPath(executablePath); err != nil {
return []Model{}, nil
}
runCtx, cancel := context.WithTimeout(ctx, 5*time.Second)
defer cancel()
cmd := exec.CommandContext(runCtx, executablePath, "--list-models")
var stderr strings.Builder
cmd.Stderr = &stderr
stdout, err := cmd.Output()
if err != nil {
return []Model{}, nil
}
text := string(stdout)
if strings.TrimSpace(text) == "" {
text = stderr.String()
}
return parsePiModels(text), nil
}
// parsePiModels accepts the `pi --list-models` output and extracts
// model IDs. Pi's format uses `provider:model` rows; we normalize to
// the same `provider/model` form as opencode for UI consistency.
func parsePiModels(output string) []Model {
scanner := bufio.NewScanner(strings.NewReader(output))
scanner.Buffer(make([]byte, 0, 64*1024), 1024*1024)
var models []Model
seen := map[string]bool{}
for scanner.Scan() {
line := strings.TrimSpace(scanner.Text())
if line == "" {
continue
}
first := strings.Fields(line)
if len(first) == 0 {
continue
}
id := first[0]
if !strings.ContainsAny(id, ":/") {
continue
}
// Normalize ":" to "/" since pi uses colon but opencode/UI uses slash.
id = strings.Replace(id, ":", "/", 1)
if seen[id] {
continue
}
seen[id] = true
provider := ""
if i := strings.Index(id, "/"); i > 0 {
provider = id[:i]
}
models = append(models, Model{ID: id, Label: id, Provider: provider})
}
return models
}
// discoverHermesModels spins up a throwaway `hermes acp` process,
// drives just enough of the protocol to receive the model list
// advertised in the `session/new` response, and shuts it down. The
// list and the `current` flag both come from hermes' own
// `_build_model_state` so whatever ~/.hermes/config.yaml resolves
// to at runtime is exactly what the UI shows.
//
// Failure modes (hermes missing, no credentials, config resolution
// error) all return an empty list so the UI falls back to the
// creatable manual-entry input instead of blocking the form.
func discoverHermesModels(ctx context.Context, executablePath string) ([]Model, error) {
if executablePath == "" {
executablePath = "hermes"
}
if _, err := exec.LookPath(executablePath); err != nil {
return []Model{}, nil
}
runCtx, cancel := context.WithTimeout(ctx, 15*time.Second)
defer cancel()
cmd := exec.CommandContext(runCtx, executablePath, "acp")
// Mirror the real backend's auto-approve so init doesn't prompt.
cmd.Env = append(os.Environ(), "HERMES_YOLO_MODE=1")
stdin, err := cmd.StdinPipe()
if err != nil {
return []Model{}, nil
}
stdout, err := cmd.StdoutPipe()
if err != nil {
stdin.Close()
return []Model{}, nil
}
// Discard stderr; noisy logs here don't help us and we don't
// want them bleeding into the daemon log every 60s.
cmd.Stderr = io.Discard
if err := cmd.Start(); err != nil {
return []Model{}, nil
}
// Ensure the child process is always reaped.
defer func() {
_ = stdin.Close()
_ = cmd.Process.Kill()
_, _ = cmd.Process.Wait()
}()
writeACP := func(id int, method string, params map[string]any) error {
msg := map[string]any{
"jsonrpc": "2.0",
"id": id,
"method": method,
"params": params,
}
data, err := json.Marshal(msg)
if err != nil {
return err
}
data = append(data, '\n')
_, err = stdin.Write(data)
return err
}
// Send initialize + session/new.
if err := writeACP(1, "initialize", map[string]any{
"protocolVersion": 1,
"clientInfo": map[string]any{"name": "multica-model-discovery", "version": "0.1.0"},
"clientCapabilities": map[string]any{},
}); err != nil {
return []Model{}, nil
}
// Hermes requires a valid cwd for session/new — use a temp
// directory we clean up afterwards, not the daemon's workdir
// (which might be in the middle of another task's worktree).
tmp, err := os.MkdirTemp("", "multica-hermes-discovery-")
if err != nil {
return []Model{}, nil
}
defer os.RemoveAll(tmp)
if err := writeACP(2, "session/new", map[string]any{
"cwd": tmp,
"mcpServers": []any{},
}); err != nil {
return []Model{}, nil
}
// Read responses until we see the one for id=2 (session/new).
scanner := bufio.NewScanner(stdout)
scanner.Buffer(make([]byte, 0, 1024*1024), 4*1024*1024)
deadline := time.After(12 * time.Second)
done := make(chan []Model, 1)
go func() {
defer close(done)
for scanner.Scan() {
line := strings.TrimSpace(scanner.Text())
if line == "" {
continue
}
var env struct {
ID json.Number `json:"id"`
Result json.RawMessage `json:"result"`
}
if err := json.Unmarshal([]byte(line), &env); err != nil {
continue
}
if env.ID.String() != "2" || len(env.Result) == 0 {
continue
}
done <- parseHermesSessionNewModels(env.Result)
return
}
}()
select {
case models := <-done:
if models == nil {
return []Model{}, nil
}
return models, nil
case <-deadline:
return []Model{}, nil
case <-runCtx.Done():
return []Model{}, nil
}
}
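The discovery handshake above relies on newline-delimited JSON-RPC 2.0: each request is one JSON object followed by `\n`, and the reader skips every line that is not the response for the awaited id. A minimal sketch of that framing in isolation, assuming hypothetical helper names `encodeRequest` and `matchResponse`:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// encodeRequest builds one newline-terminated JSON-RPC 2.0 request line.
func encodeRequest(id int, method string, params map[string]any) ([]byte, error) {
	data, err := json.Marshal(map[string]any{
		"jsonrpc": "2.0",
		"id":      id,
		"method":  method,
		"params":  params,
	})
	if err != nil {
		return nil, err
	}
	return append(data, '\n'), nil
}

// matchResponse reports whether line is a response that carries a
// result for the given request id, returning the raw result if so.
// Notifications (no id), error responses, and non-JSON lines yield
// ok=false.
func matchResponse(line []byte, id string) (json.RawMessage, bool) {
	var env struct {
		ID     json.Number     `json:"id"`
		Result json.RawMessage `json:"result"`
	}
	if err := json.Unmarshal(line, &env); err != nil {
		return nil, false
	}
	if env.ID.String() != id || len(env.Result) == 0 {
		return nil, false
	}
	return env.Result, true
}

func main() {
	req, _ := encodeRequest(2, "session/new", map[string]any{"cwd": "/tmp"})
	fmt.Print(string(req))

	lines := []string{
		`{"jsonrpc":"2.0","method":"session/update","params":{}}`,
		`{"jsonrpc":"2.0","id":1,"result":{}}`,
		`{"jsonrpc":"2.0","id":2,"result":{"sessionId":"ses_123"}}`,
	}
	for _, line := range lines {
		if res, ok := matchResponse([]byte(line), "2"); ok {
			fmt.Println(string(res))
		}
	}
}
```

Decoding the id as `json.Number` keeps the comparison textual, so no numeric conversion is needed to match the request.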
// parseHermesSessionNewModels extracts the model catalog from a
// hermes `session/new` response. Hermes' ACP schema emits:
//
// {
// "sessionId": "...",
// "models": {
// "availableModels": [
// {"modelId": "...", "name": "...", "description": "... current"}
// ],
// "currentModelId": "..."
// }
// }
//
// Returns nil only when the payload is not valid JSON. A response
// that parses but lacks the `models` field yields an empty, non-nil
// slice, so callers should check len() rather than compare to nil.
func parseHermesSessionNewModels(raw json.RawMessage) []Model {
var resp struct {
Models struct {
AvailableModels []struct {
ModelID string `json:"modelId"`
Name string `json:"name"`
Description string `json:"description"`
} `json:"availableModels"`
CurrentModelID string `json:"currentModelId"`
} `json:"models"`
}
if err := json.Unmarshal(raw, &resp); err != nil {
return nil
}
models := make([]Model, 0, len(resp.Models.AvailableModels))
seen := map[string]bool{}
for _, m := range resp.Models.AvailableModels {
if m.ModelID == "" || seen[m.ModelID] {
continue
}
seen[m.ModelID] = true
label := m.Name
if label == "" {
label = m.ModelID
}
provider := ""
if idx := strings.Index(m.ModelID, ":"); idx > 0 {
provider = m.ModelID[:idx]
}
models = append(models, Model{
ID: m.ModelID,
Label: label,
Provider: provider,
Default: m.ModelID == resp.Models.CurrentModelID,
})
}
return models
}
// discoverCursorModels runs `cursor-agent --list-models` and parses
// the `id - Label` rows. Cursor's catalog changes often and ships
// many variants of the same base model (thinking / fast / max
// suffixes) — static baking would be obsolete within weeks. On any
// failure we fall back to the minimal static catalog so the UI
// stays usable when cursor-agent isn't installed on the daemon host.
func discoverCursorModels(ctx context.Context, executablePath string) ([]Model, error) {
if executablePath == "" {
executablePath = "cursor-agent"
}
if _, err := exec.LookPath(executablePath); err != nil {
return cursorStaticModels(), nil
}
runCtx, cancel := context.WithTimeout(ctx, 5*time.Second)
defer cancel()
cmd := exec.CommandContext(runCtx, executablePath, "--list-models")
out, err := cmd.Output()
if err != nil {
return cursorStaticModels(), nil
}
models := parseCursorModels(string(out))
if len(models) == 0 {
return cursorStaticModels(), nil
}
return models, nil
}
// parseCursorModels extracts model IDs from `cursor-agent --list-models`.
// Output format (as of cursor-agent 2026.04):
//
// Available models
// <blank>
// auto - Auto
// composer-2-fast - Composer 2 Fast (current, default)
// composer-2 - Composer 2
// …
//
// The model tagged `(default)` is surfaced as Default=true so the
// UI badge points at cursor's own recommendation rather than a
// hard-coded guess from our catalog.
func parseCursorModels(output string) []Model {
scanner := bufio.NewScanner(strings.NewReader(output))
scanner.Buffer(make([]byte, 0, 64*1024), 1024*1024)
var models []Model
seen := map[string]bool{}
for scanner.Scan() {
line := strings.TrimSpace(scanner.Text())
if line == "" {
continue
}
// Row format: "<id> - <label>". Skip the "Available models" header.
idx := strings.Index(line, " - ")
if idx <= 0 {
continue
}
id := strings.TrimSpace(line[:idx])
label := strings.TrimSpace(line[idx+3:])
if !isOpenclawIdentifier(id) {
// Reuse the identifier guard — cursor IDs are in the
// same character set (alnum + `-./_`), so anything
// that fails it is either malformed or a header line.
continue
}
if seen[id] {
continue
}
seen[id] = true
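// Detect the default tag only inside the parenthetical, e.g.
// "(current, default)", so a label that merely contains the word
// "default" is not flagged. Then strip the parenthetical from the
// display label since it is surfaced through the Default flag.
isDefault := false
if paren := strings.Index(label, "("); paren >= 0 {
isDefault = strings.Contains(label[paren:], "default")
label = strings.TrimSpace(label[:paren])
}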
if label == "" {
label = id
}
models = append(models, Model{
ID: id,
Label: label,
Provider: "cursor",
Default: isDefault,
})
}
return models
}
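The row handling in parseCursorModels can be illustrated on single lines. This is a sketch under assumptions, not the parser above: `splitCursorRow` is a hypothetical helper, and it matches `default` as an exact comma-separated tag inside a trailing parenthetical rather than as a substring.

```go
package main

import (
	"fmt"
	"strings"
)

// splitCursorRow parses one "<id> - <label> (tags)" row. ok is false for
// lines without the " - " separator, such as the header or blank lines.
func splitCursorRow(line string) (id, label string, isDefault, ok bool) {
	line = strings.TrimSpace(line)
	idx := strings.Index(line, " - ")
	if idx <= 0 {
		return "", "", false, false
	}
	id = strings.TrimSpace(line[:idx])
	label = strings.TrimSpace(line[idx+3:])

	// Only a trailing "(...)" group is treated as tags; a label that
	// merely contains the word "default" is left untouched.
	if paren := strings.LastIndex(label, "("); paren >= 0 && strings.HasSuffix(label, ")") {
		tags := label[paren+1 : len(label)-1]
		for _, tag := range strings.Split(tags, ",") {
			if strings.TrimSpace(tag) == "default" {
				isDefault = true
			}
		}
		label = strings.TrimSpace(label[:paren])
	}
	if label == "" {
		label = id
	}
	return id, label, isDefault, true
}

func main() {
	rows := []string{
		"Available models",
		"auto - Auto",
		"composer-2-fast - Composer 2 Fast (current, default)",
	}
	for _, row := range rows {
		id, label, def, ok := splitCursorRow(row)
		fmt.Printf("%q -> id=%q label=%q default=%v ok=%v\n", row, id, label, def, ok)
	}
}
```

Exact tag matching is stricter than a substring check, so it would miss a differently worded marker if the CLI output format changed.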
// discoverOpenclawAgents enumerates the pre-registered OpenClaw
// agents (which is where model selection actually lives in the
// OpenClaw world — each agent is bound to a model at `agents add`
// time). It tries structured JSON output first, falling back to a
// conservative text parser that rejects TUI decoration and section
// headers. On any ambiguity we return an empty list and let the
// creatable dropdown handle manual entry — a silently-wrong
// enumeration would be worse than none.
func discoverOpenclawAgents(ctx context.Context, executablePath string) ([]Model, error) {
if executablePath == "" {
executablePath = "openclaw"
}
if _, err := exec.LookPath(executablePath); err != nil {
return []Model{}, nil
}
runCtx, cancel := context.WithTimeout(ctx, 5*time.Second)
defer cancel()
// Try JSON modes first. Different openclaw builds expose the
// flag under different names; trying a few is cheap. All attempts,
// including the text fallback, share the 5s budget above.
for _, jsonArgs := range [][]string{
{"agents", "list", "--json"},
{"agents", "list", "--output", "json"},
{"agents", "list", "-o", "json"},
} {
cmd := exec.CommandContext(runCtx, executablePath, jsonArgs...)
out, err := cmd.Output()
if err != nil {
continue
}
if models, ok := parseOpenclawAgentsJSON(out); ok {
return models, nil
}
}
// Text fallback. Be strict — the default output is a decorated
// banner with box-drawing and section headers, and picking up
// the wrong tokens produces nonsense entries like "Identity:".
cmd := exec.CommandContext(runCtx, executablePath, "agents", "list")
out, err := cmd.Output()
if err != nil {
return []Model{}, nil
}
return parseOpenclawAgents(string(out)), nil
}
// openclawAgentEntry is the shape parseOpenclawAgentsJSON expects
// from `openclaw agents list --json`. Both `name` and `id` are
// accepted as the identifier (different openclaw versions ship
// different field names); `model` is optional and only used to
// enrich the dropdown label.
type openclawAgentEntry struct {
Name string `json:"name"`
ID string `json:"id"`
Model string `json:"model"`
}
// parseOpenclawAgentsJSON accepts `openclaw agents list --json`-style
// output. It handles two common shapes: a top-level array, or an
// object with an `agents` key whose value is an array. Returns
// ok=false if the input isn't valid JSON in either shape.
func parseOpenclawAgentsJSON(raw []byte) ([]Model, bool) {
raw = bytes.TrimSpace(raw)
if len(raw) == 0 {
return nil, false
}
var flat []openclawAgentEntry
if err := json.Unmarshal(raw, &flat); err == nil {
return openclawEntriesToModels(flat), true
}
var wrapped struct {
Agents []openclawAgentEntry `json:"agents"`
}
if err := json.Unmarshal(raw, &wrapped); err == nil && wrapped.Agents != nil {
return openclawEntriesToModels(wrapped.Agents), true
}
return nil, false
}
func openclawEntriesToModels(entries []openclawAgentEntry) []Model {
models := make([]Model, 0, len(entries))
seen := map[string]bool{}
for _, e := range entries {
name := e.Name
if name == "" {
name = e.ID
}
if name == "" || seen[name] {
continue
}
seen[name] = true
label := name
if e.Model != "" {
label = name + " (" + e.Model + ")"
}
models = append(models, Model{ID: name, Label: label, Provider: "openclaw"})
}
return models
}
// parseOpenclawAgents extracts agent names from the text output of
// `openclaw agents list`. The default CLI output is a decorated
// banner — section headers ending in `:`, box-drawing characters,
// and single-character icons — so we only accept lines that look
// like a proper `<name> <model>` row: at least two whitespace-
// separated tokens, both made of safe identifier characters, and
// neither ending in `:`. Anything else is discarded to avoid
// surfacing "Identity:" or `◇` as selectable models.
func parseOpenclawAgents(output string) []Model {
scanner := bufio.NewScanner(strings.NewReader(output))
scanner.Buffer(make([]byte, 0, 64*1024), 1024*1024)
var models []Model
seen := map[string]bool{}
for scanner.Scan() {
line := strings.TrimSpace(scanner.Text())
if line == "" {
continue
}
fields := strings.Fields(line)
if len(fields) < 2 {
continue
}
name, model := fields[0], fields[1]
if !isOpenclawIdentifier(name) || !isOpenclawIdentifier(model) {
continue
}
if seen[name] {
continue
}
seen[name] = true
models = append(models, Model{
ID: name,
Label: name + " (" + model + ")",
Provider: "openclaw",
})
}
return models
}
// isOpenclawIdentifier reports whether s looks like a valid
// agent-name or model-id token: starts with a letter, contains only
// identifier-safe characters, and isn't a section header
// (trailing colon). Rejects TUI decoration like `│`, `╭`, `◇`, `|`.
func isOpenclawIdentifier(s string) bool {
if s == "" || strings.HasSuffix(s, ":") {
return false
}
first := s[0]
if !((first >= 'a' && first <= 'z') || (first >= 'A' && first <= 'Z')) {
return false
}
for _, r := range s {
switch {
case r >= 'a' && r <= 'z':
case r >= 'A' && r <= 'Z':
case r >= '0' && r <= '9':
case r == '-' || r == '_' || r == '.' || r == '/':
default:
return false
}
}
return true
}

@@ -0,0 +1,324 @@
package agent
import (
"context"
"strings"
"testing"
)
func TestListModelsStaticProviders(t *testing.T) {
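ctx := context.Background()
// cursor goes through discovery rather than a static list, but it
// falls back to cursorStaticModels when cursor-agent is unavailable,
// so it must also return at least one model.
for _, provider := range []string{"claude", "codex", "gemini", "cursor", "copilot"} {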
got, err := ListModels(ctx, provider, "")
if err != nil {
t.Fatalf("ListModels(%q) error: %v", provider, err)
}
if len(got) == 0 {
t.Errorf("ListModels(%q) returned no models", provider)
}
for i, m := range got {
if m.ID == "" {
t.Errorf("ListModels(%q)[%d] has empty ID", provider, i)
}
if m.Label == "" {
t.Errorf("ListModels(%q)[%d] has empty Label", provider, i)
}
}
}
}
func TestListModelsHermesWithoutBinary(t *testing.T) {
// With no `hermes` binary on PATH the discovery fast-paths to
// an empty list (the UI then falls back to creatable manual
// entry). This test only verifies the fast-path; an actual
// ACP session is exercised in integration.
ctx := context.Background()
// Evict any cached entry so we hit the live discovery function.
modelCacheMu.Lock()
delete(modelCache, "hermes")
modelCacheMu.Unlock()
got, err := ListModels(ctx, "hermes", "/nonexistent/hermes")
if err != nil {
t.Fatalf("ListModels(hermes) error: %v", err)
}
if got == nil {
t.Error("expected non-nil slice even when binary is missing")
}
}
func TestListModelsUnknownProvider(t *testing.T) {
ctx := context.Background()
_, err := ListModels(ctx, "nonexistent", "")
if err == nil {
t.Fatal("ListModels(unknown) expected error")
}
}
func TestStaticCatalogsHaveAtMostOneDefault(t *testing.T) {
// Each catalog should tag at most one entry as the display
// default so the UI badge is unambiguous. More than one
// usually means a copy/paste slip when adding new models.
catalogs := map[string][]Model{
"claude": claudeStaticModels(),
"codex": codexStaticModels(),
"gemini": geminiStaticModels(),
"cursor": cursorStaticModels(),
"copilot": copilotStaticModels(),
}
for provider, models := range catalogs {
count := 0
for _, m := range models {
if m.Default {
count++
}
}
if count > 1 {
t.Errorf("%s: %d models marked Default, want 0 or 1", provider, count)
}
}
}
func TestParseOpenCodeModels(t *testing.T) {
input := `PROVIDER/MODEL CONTEXT MAX_OUT
openai/gpt-4o 128000 16384
anthropic/claude-sonnet-4-6 200000 8192
openai/gpt-4o 128000 16384
nonprefixed-line
`
models := parseOpenCodeModels(input)
if len(models) != 2 {
t.Fatalf("expected 2 models (header skipped, duplicate deduped, non-slash skipped), got %d: %+v", len(models), models)
}
if models[0].ID != "openai/gpt-4o" || models[0].Provider != "openai" {
t.Errorf("unexpected first model: %+v", models[0])
}
if models[1].ID != "anthropic/claude-sonnet-4-6" || models[1].Provider != "anthropic" {
t.Errorf("unexpected second model: %+v", models[1])
}
}
func TestParsePiModels(t *testing.T) {
input := `openai:gpt-4o
anthropic:claude-opus-4-7
openai:gpt-4o
bareword
`
models := parsePiModels(input)
if len(models) != 2 {
t.Fatalf("expected 2 models, got %d: %+v", len(models), models)
}
if models[0].ID != "openai/gpt-4o" {
t.Errorf("expected colon normalized to slash: %+v", models[0])
}
}
func TestParseOpenclawAgents(t *testing.T) {
input := `deepseek-v4 deepseek-v4
claude-sonnet claude-sonnet-4-6
deepseek-v4 deepseek-v4
`
models := parseOpenclawAgents(input)
// duplicate deduped; label includes model name.
if len(models) != 2 {
t.Fatalf("expected 2 agents, got %d: %+v", len(models), models)
}
if models[0].ID != "deepseek-v4" {
t.Errorf("unexpected first agent: %+v", models[0])
}
if models[0].Label != "deepseek-v4 (deepseek-v4)" {
t.Errorf("unexpected label: %+v", models[0])
}
if models[0].Provider != "openclaw" {
t.Errorf("expected provider openclaw, got %q", models[0].Provider)
}
}
func TestParseOpenclawAgentsRejectsDecoratedTUI(t *testing.T) {
// Reproduces the shape of real `openclaw agents list` output
// that leaked header tokens like "Identity:" / "Workspace:"
// and single-character box-drawing icons into the dropdown.
input := `╭───────────────────────────────╮
│ │
│ ◇ Agents: │
│ │ │
│ │ Identity: │
│ │ Workspace: │
│ │ Agent │
│ │ │
╰───────────────────────────────╯
deepseek-v4 deepseek-v4
claude-sonnet claude-sonnet-4-6
`
models := parseOpenclawAgents(input)
if len(models) != 2 {
t.Fatalf("expected 2 agents (decoration skipped), got %d: %+v", len(models), models)
}
for _, m := range models {
if strings.HasSuffix(m.ID, ":") {
t.Errorf("section header leaked into result: %+v", m)
}
}
if models[0].ID != "deepseek-v4" || models[1].ID != "claude-sonnet" {
t.Errorf("unexpected agents: %+v", models)
}
}
func TestParseOpenclawAgentsJSONArray(t *testing.T) {
input := []byte(`[
{"name": "deepseek-v4", "model": "deepseek-v4"},
{"name": "claude-sonnet", "model": "claude-sonnet-4-6"}
]`)
models, ok := parseOpenclawAgentsJSON(input)
if !ok {
t.Fatal("expected parseOpenclawAgentsJSON to accept an array")
}
if len(models) != 2 {
t.Fatalf("got %d, want 2: %+v", len(models), models)
}
if models[0].ID != "deepseek-v4" || models[0].Label != "deepseek-v4 (deepseek-v4)" {
t.Errorf("unexpected first entry: %+v", models[0])
}
}
func TestParseOpenclawAgentsJSONWrapped(t *testing.T) {
input := []byte(`{"agents": [{"name": "foo", "model": "bar"}]}`)
models, ok := parseOpenclawAgentsJSON(input)
if !ok {
t.Fatal("expected parseOpenclawAgentsJSON to accept wrapped object")
}
if len(models) != 1 || models[0].ID != "foo" {
t.Errorf("unexpected: %+v", models)
}
}
func TestParseOpenclawAgentsJSONRejectsGarbage(t *testing.T) {
if _, ok := parseOpenclawAgentsJSON([]byte("not json")); ok {
t.Error("expected ok=false for non-JSON")
}
}
func TestParseCursorModels(t *testing.T) {
input := `Available models
auto - Auto
composer-2-fast - Composer 2 Fast (current, default)
composer-2 - Composer 2
claude-4.6-sonnet-medium - Sonnet 4.6 1M
claude-opus-4-7-high - Opus 4.7 1M
gemini-3.1-pro - Gemini 3.1 Pro
`
models := parseCursorModels(input)
if len(models) != 6 {
t.Fatalf("expected 6 models, got %d: %+v", len(models), models)
}
ids := map[string]Model{}
for _, m := range models {
ids[m.ID] = m
}
for _, want := range []string{"auto", "composer-2-fast", "composer-2", "claude-4.6-sonnet-medium", "claude-opus-4-7-high", "gemini-3.1-pro"} {
if _, ok := ids[want]; !ok {
t.Errorf("missing expected model %q in: %+v", want, models)
}
}
if def := ids["composer-2-fast"]; !def.Default {
t.Errorf("composer-2-fast should be marked default, got %+v", def)
}
if def := ids["composer-2-fast"]; def.Label != "Composer 2 Fast" {
t.Errorf("default label should be stripped of parenthetical, got %q", def.Label)
}
// Non-default entry should not carry Default=true.
if auto := ids["auto"]; auto.Default {
t.Errorf("non-default entry should not be flagged default: %+v", auto)
}
}
func TestParseCursorModelsSkipsHeaderAndBlankLines(t *testing.T) {
input := `Available models
composer-2 - Composer 2
`
models := parseCursorModels(input)
if len(models) != 1 || models[0].ID != "composer-2" {
t.Fatalf("unexpected: %+v", models)
}
}
func TestParseHermesSessionNewModels(t *testing.T) {
// Mirrors the real shape emitted by hermes'
// acp_adapter/server.py _build_model_state -> SessionModelState.
raw := []byte(`{
"sessionId": "ses_123",
"models": {
"availableModels": [
{"modelId": "nous:moonshotai/kimi-k2.5", "name": "moonshotai/kimi-k2.5", "description": "Provider: Nous"},
{"modelId": "nous:anthropic/claude-opus-4.7", "name": "anthropic/claude-opus-4.7", "description": "Provider: Nous • current"},
{"modelId": "nous:moonshotai/kimi-k2.5", "name": "duplicate", "description": "dup"}
],
"currentModelId": "nous:anthropic/claude-opus-4.7"
}
}`)
models := parseHermesSessionNewModels(raw)
if len(models) != 2 {
t.Fatalf("expected 2 models (duplicate deduped), got %d: %+v", len(models), models)
}
if models[0].ID != "nous:moonshotai/kimi-k2.5" || models[0].Provider != "nous" {
t.Errorf("unexpected first model: %+v", models[0])
}
if models[0].Default {
t.Errorf("non-current entry must not be marked default: %+v", models[0])
}
if !models[1].Default {
t.Errorf("current entry must be marked default: %+v", models[1])
}
if models[1].ID != "nous:anthropic/claude-opus-4.7" {
t.Errorf("expected current model second: %+v", models[1])
}
}
func TestParseHermesSessionNewModelsMissingField(t *testing.T) {
// session/new without the models field (older hermes, or a
// failed _build_model_state) should yield no models. The parser
// returns an empty slice here; nil is reserved for invalid JSON.
raw := []byte(`{"sessionId": "ses_123"}`)
if got := parseHermesSessionNewModels(raw); len(got) != 0 {
t.Errorf("expected no models, got %+v", got)
}
}
func TestParseHermesSessionNewModelsGarbage(t *testing.T) {
if got := parseHermesSessionNewModels([]byte("not json")); got != nil {
t.Errorf("expected nil for non-JSON, got %+v", got)
}
}
func TestHermesModelSelectionSupported(t *testing.T) {
// Regression guard: hermes now supports model selection via
// the ACP session/set_model RPC, so the UI dropdown should
// not be disabled for it.
if !ModelSelectionSupported("hermes") {
t.Error("hermes should be model-selection-supported now that session/set_model is wired")
}
}
func TestCachedDiscovery(t *testing.T) {
calls := 0
fn := func() ([]Model, error) {
calls++
return []Model{{ID: "x", Label: "x"}}, nil
}
// Evict any stale entry so the first call below populates the cache.
modelCacheMu.Lock()
delete(modelCache, "testkey")
modelCacheMu.Unlock()
if _, err := cachedDiscovery("testkey", fn); err != nil {
t.Fatal(err)
}
if _, err := cachedDiscovery("testkey", fn); err != nil {
t.Fatal(err)
}
if calls != 1 {
t.Errorf("expected 1 underlying call due to cache, got %d", calls)
}
}

@@ -146,7 +146,17 @@ func buildOpenclawArgs(prompt, sessionID string, opts ExecOptions, logger *slog.
if opts.Timeout > 0 {
args = append(args, "--timeout", fmt.Sprintf("%d", int(opts.Timeout.Seconds())))
}
-args = append(args, filterCustomArgs(opts.CustomArgs, openclawBlockedArgs, logger)...)
+// OpenClaw binds models to pre-registered agents at `openclaw agents
+// add/update --model` time; the daemon selects one at runtime by
+// passing --agent <name>. The model dropdown populates its list from
+// `openclaw agents list`, so opts.Model here is an agent name. Only
+// inject when the user hasn't already set --agent via custom_args —
+// custom_args wins for backward compatibility with existing configs.
+customArgs := filterCustomArgs(opts.CustomArgs, openclawBlockedArgs, logger)
+if opts.Model != "" && !customArgsContains(customArgs, "--agent") {
+args = append(args, "--agent", opts.Model)
+}
+args = append(args, customArgs...)
if opts.SystemPrompt != "" {
prompt = opts.SystemPrompt + "\n\n" + prompt
@@ -155,6 +165,18 @@ func buildOpenclawArgs(prompt, sessionID string, opts ExecOptions, logger *slog.
return args
}
+// customArgsContains reports whether args contains the given flag
+// (either as a standalone token "--flag" or in "--flag=value" form).
+func customArgsContains(args []string, flag string) bool {
+prefix := flag + "="
+for _, a := range args {
+if a == flag || strings.HasPrefix(a, prefix) {
+return true
+}
+}
+return false
+}
// ── Event handlers ──
// openclawEventResult holds accumulated state from processing the event stream.
@@ -439,9 +461,9 @@ type openclawEvent struct {
CallID string `json:"callId,omitempty"`
Input json.RawMessage `json:"input,omitempty"`
Usage map[string]any `json:"usage,omitempty"`
-Phase string `json:"phase,omitempty"` // lifecycle event phase
-Error *openclawError `json:"error,omitempty"` // structured error object
-Message string `json:"message,omitempty"` // alternative error message field
+Phase string `json:"phase,omitempty"` // lifecycle event phase
+Error *openclawError `json:"error,omitempty"` // structured error object
+Message string `json:"message,omitempty"` // alternative error message field
}
// errorMessage extracts a human-readable error message from the event,

@@ -688,8 +688,8 @@ func TestOpenclawUsageAlternativeFieldNames(t *testing.T) {
// Test PaperClip-style field names (inputTokens, outputTokens, etc.)
data := map[string]any{
-"inputTokens": float64(500),
-"outputTokens": float64(200),
+"inputTokens": float64(500),
+"outputTokens": float64(200),
"cachedInputTokens": float64(100),
}
usage := parseOpenclawUsage(data)
@@ -711,8 +711,8 @@ func TestOpenclawUsageSnakeCaseFieldNames(t *testing.T) {
// Test snake_case field names (Anthropic API style)
data := map[string]any{
"input_tokens": float64(300),
-"output_tokens": float64(150),
-"cache_read_input_tokens": float64(80),
+"output_tokens": float64(150),
+"cache_read_input_tokens": float64(80),
"cache_creation_input_tokens": float64(40),
}
usage := parseOpenclawUsage(data)
@@ -796,8 +796,8 @@ func TestOpenclawUsageFinalResultAlternativeFields(t *testing.T) {
DurationMs: 1000,
AgentMeta: map[string]any{
"usage": map[string]any{
-"inputTokens": float64(400),
-"outputTokens": float64(180),
+"inputTokens": float64(400),
+"outputTokens": float64(180),
"cachedInputTokens": float64(90),
},
},
@@ -943,13 +943,15 @@ func TestBuildOpenclawArgsMinimal(t *testing.T) {
}
}
-func TestBuildOpenclawArgsDoesNotForwardModelOrSystemPrompt(t *testing.T) {
+func TestBuildOpenclawArgsMapsModelToAgent(t *testing.T) {
t.Parallel()
-// openclaw agent rejects --model and --system-prompt; verify they are
-// never emitted as flags even when Model and SystemPrompt are set.
+// For openclaw, agent.model stores the pre-registered agent name;
+// the daemon must translate that to `--agent <name>` because the
+// CLI rejects `--model` entirely. `--system-prompt` is also
+// rejected and must not be emitted as a flag.
args := buildOpenclawArgs("task", "ses-2", ExecOptions{
-Model: "gpt-4o",
+Model: "deepseek-v4-agent",
SystemPrompt: "You are a helpful agent.",
}, slog.Default())
@@ -959,6 +961,40 @@ func TestBuildOpenclawArgsDoesNotForwardModelOrSystemPrompt(t *testing.T) {
if idx := indexOf(args, "--system-prompt"); idx != -1 {
t.Fatalf("unexpected --system-prompt flag at %d: %v", idx, args)
}
agentIdx := indexOf(args, "--agent")
if agentIdx == -1 || agentIdx+1 >= len(args) {
t.Fatalf("expected --agent <value> in args: %v", args)
}
if got := args[agentIdx+1]; got != "deepseek-v4-agent" {
t.Errorf("--agent value = %q, want %q", got, "deepseek-v4-agent")
}
}
func TestBuildOpenclawArgsCustomAgentWinsOverModel(t *testing.T) {
t.Parallel()
// If the user already configured --agent via custom_args, their
// value wins — we don't double-inject. This keeps existing configs
// working when they later set agent.model.
args := buildOpenclawArgs("task", "ses-2b", ExecOptions{
Model: "from-dropdown",
CustomArgs: []string{"--agent", "from-custom-args"},
}, slog.Default())
count := 0
for _, a := range args {
if a == "--agent" {
count++
}
}
if count != 1 {
t.Fatalf("expected exactly one --agent flag, got %d: %v", count, args)
}
agentIdx := indexOf(args, "--agent")
if args[agentIdx+1] != "from-custom-args" {
t.Errorf("custom --agent should win, got %q", args[agentIdx+1])
}
}
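
Reviewer note: the two tests above pin down a precedence rule: `Model` becomes `--agent <name>`, unless `custom_args` already carries `--agent`, in which case the custom value is left alone and nothing is injected. A minimal sketch of that rule follows. `execOptionsSketch`, `hasFlag`, and `agentArgs` are hypothetical names; the real `buildOpenclawArgs` takes more parameters and is not reproduced in this hunk.

```go
package main

// execOptionsSketch is a hypothetical stand-in for the ExecOptions
// fields the tests above exercise.
type execOptionsSketch struct {
	Model        string
	SystemPrompt string
	CustomArgs   []string
}

// hasFlag reports whether flag appears as its own element in args.
func hasFlag(args []string, flag string) bool {
	for _, a := range args {
		if a == flag {
			return true
		}
	}
	return false
}

// agentArgs returns only the agent-selection part of the argument list.
func agentArgs(opts execOptionsSketch) []string {
	var args []string
	// Translate Model into --agent only when the user has not already
	// supplied --agent through custom args; the custom value wins.
	if opts.Model != "" && !hasFlag(opts.CustomArgs, "--agent") {
		args = append(args, "--agent", opts.Model)
	}
	// Neither --model nor --system-prompt is ever emitted here.
	return append(args, opts.CustomArgs...)
}
```

One limitation of this sketch: `hasFlag` matches only the separate-token form, so a custom arg written as `--agent=name` would not be detected. Whether the real implementation handles that form is not visible from the diff.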
func TestBuildOpenclawArgsPrependsSystemPromptToMessage(t *testing.T) {

@@ -14,7 +14,7 @@ import (
const archiveAgent = `-- name: ArchiveAgent :one
UPDATE agent SET archived_at = now(), archived_by = $2, updated_at = now()
WHERE id = $1
RETURNING id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config
RETURNING id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config, model
`
type ArchiveAgentParams struct {
@@ -46,6 +46,7 @@ func (q *Queries) ArchiveAgent(ctx context.Context, arg ArchiveAgentParams) (Age
&i.CustomEnv,
&i.CustomArgs,
&i.McpConfig,
&i.Model,
)
return i, err
}
@@ -161,7 +162,7 @@ func (q *Queries) ClaimAgentTask(ctx context.Context, agentID pgtype.UUID) (Agen
const clearAgentMcpConfig = `-- name: ClearAgentMcpConfig :one
UPDATE agent SET mcp_config = NULL, updated_at = now()
WHERE id = $1
RETURNING id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config
RETURNING id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config, model
`
func (q *Queries) ClearAgentMcpConfig(ctx context.Context, id pgtype.UUID) (Agent, error) {
@@ -188,6 +189,7 @@ func (q *Queries) ClearAgentMcpConfig(ctx context.Context, id pgtype.UUID) (Agen
&i.CustomEnv,
&i.CustomArgs,
&i.McpConfig,
&i.Model,
)
return i, err
}
@@ -253,9 +255,9 @@ const createAgent = `-- name: CreateAgent :one
INSERT INTO agent (
workspace_id, name, description, avatar_url, runtime_mode,
runtime_config, runtime_id, visibility, max_concurrent_tasks, owner_id,
instructions, custom_env, custom_args, mcp_config
) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14)
RETURNING id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config
instructions, custom_env, custom_args, mcp_config, model
) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15)
RETURNING id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config, model
`
type CreateAgentParams struct {
@@ -273,6 +275,7 @@ type CreateAgentParams struct {
CustomEnv []byte `json:"custom_env"`
CustomArgs []byte `json:"custom_args"`
McpConfig []byte `json:"mcp_config"`
Model pgtype.Text `json:"model"`
}
func (q *Queries) CreateAgent(ctx context.Context, arg CreateAgentParams) (Agent, error) {
@@ -291,6 +294,7 @@ func (q *Queries) CreateAgent(ctx context.Context, arg CreateAgentParams) (Agent
arg.CustomEnv,
arg.CustomArgs,
arg.McpConfig,
arg.Model,
)
var i Agent
err := row.Scan(
@@ -314,6 +318,7 @@ func (q *Queries) CreateAgent(ctx context.Context, arg CreateAgentParams) (Agent
&i.CustomEnv,
&i.CustomArgs,
&i.McpConfig,
&i.Model,
)
return i, err
}
@@ -462,7 +467,7 @@ func (q *Queries) FailStaleTasks(ctx context.Context, arg FailStaleTasksParams)
}
const getAgent = `-- name: GetAgent :one
SELECT id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config FROM agent
SELECT id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config, model FROM agent
WHERE id = $1
`
@@ -490,12 +495,13 @@ func (q *Queries) GetAgent(ctx context.Context, id pgtype.UUID) (Agent, error) {
&i.CustomEnv,
&i.CustomArgs,
&i.McpConfig,
&i.Model,
)
return i, err
}
const getAgentInWorkspace = `-- name: GetAgentInWorkspace :one
SELECT id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config FROM agent
SELECT id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config, model FROM agent
WHERE id = $1 AND workspace_id = $2
`
@@ -528,6 +534,7 @@ func (q *Queries) GetAgentInWorkspace(ctx context.Context, arg GetAgentInWorkspa
&i.CustomEnv,
&i.CustomArgs,
&i.McpConfig,
&i.Model,
)
return i, err
}
@@ -728,7 +735,7 @@ func (q *Queries) ListAgentTasks(ctx context.Context, agentID pgtype.UUID) ([]Ag
}
const listAgents = `-- name: ListAgents :many
SELECT id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config FROM agent
SELECT id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config, model FROM agent
WHERE workspace_id = $1 AND archived_at IS NULL
ORDER BY created_at ASC
`
@@ -763,6 +770,7 @@ func (q *Queries) ListAgents(ctx context.Context, workspaceID pgtype.UUID) ([]Ag
&i.CustomEnv,
&i.CustomArgs,
&i.McpConfig,
&i.Model,
); err != nil {
return nil, err
}
@@ -775,7 +783,7 @@ func (q *Queries) ListAgents(ctx context.Context, workspaceID pgtype.UUID) ([]Ag
}
const listAllAgents = `-- name: ListAllAgents :many
SELECT id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config FROM agent
SELECT id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config, model FROM agent
WHERE workspace_id = $1
ORDER BY created_at ASC
`
@@ -810,6 +818,7 @@ func (q *Queries) ListAllAgents(ctx context.Context, workspaceID pgtype.UUID) ([
&i.CustomEnv,
&i.CustomArgs,
&i.McpConfig,
&i.Model,
); err != nil {
return nil, err
}
@@ -914,7 +923,7 @@ func (q *Queries) ListTasksByIssue(ctx context.Context, issueID pgtype.UUID) ([]
const restoreAgent = `-- name: RestoreAgent :one
UPDATE agent SET archived_at = NULL, archived_by = NULL, updated_at = now()
WHERE id = $1
RETURNING id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config
RETURNING id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config, model
`
func (q *Queries) RestoreAgent(ctx context.Context, id pgtype.UUID) (Agent, error) {
@@ -941,6 +950,7 @@ func (q *Queries) RestoreAgent(ctx context.Context, id pgtype.UUID) (Agent, erro
&i.CustomEnv,
&i.CustomArgs,
&i.McpConfig,
&i.Model,
)
return i, err
}
@@ -993,9 +1003,10 @@ UPDATE agent SET
custom_env = COALESCE($12, custom_env),
custom_args = COALESCE($13, custom_args),
mcp_config = COALESCE($14, mcp_config),
model = COALESCE($15, model),
updated_at = now()
WHERE id = $1
RETURNING id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config
RETURNING id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config, model
`
type UpdateAgentParams struct {
@@ -1013,6 +1024,7 @@ type UpdateAgentParams struct {
CustomEnv []byte `json:"custom_env"`
CustomArgs []byte `json:"custom_args"`
McpConfig []byte `json:"mcp_config"`
Model pgtype.Text `json:"model"`
}
func (q *Queries) UpdateAgent(ctx context.Context, arg UpdateAgentParams) (Agent, error) {
@@ -1031,6 +1043,7 @@ func (q *Queries) UpdateAgent(ctx context.Context, arg UpdateAgentParams) (Agent
arg.CustomEnv,
arg.CustomArgs,
arg.McpConfig,
arg.Model,
)
var i Agent
err := row.Scan(
@@ -1054,6 +1067,7 @@ func (q *Queries) UpdateAgent(ctx context.Context, arg UpdateAgentParams) (Agent
&i.CustomEnv,
&i.CustomArgs,
&i.McpConfig,
&i.Model,
)
return i, err
}
@@ -1061,7 +1075,7 @@ func (q *Queries) UpdateAgent(ctx context.Context, arg UpdateAgentParams) (Agent
const updateAgentStatus = `-- name: UpdateAgentStatus :one
UPDATE agent SET status = $2, updated_at = now()
WHERE id = $1
RETURNING id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config
RETURNING id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config, model
`
type UpdateAgentStatusParams struct {
@@ -1093,6 +1107,7 @@ func (q *Queries) UpdateAgentStatus(ctx context.Context, arg UpdateAgentStatusPa
&i.CustomEnv,
&i.CustomArgs,
&i.McpConfig,
&i.Model,
)
return i, err
}

@@ -40,6 +40,7 @@ type Agent struct {
CustomEnv []byte `json:"custom_env"`
CustomArgs []byte `json:"custom_args"`
McpConfig []byte `json:"mcp_config"`
Model pgtype.Text `json:"model"`
}
type AgentRuntime struct {

@@ -20,8 +20,8 @@ WHERE id = $1 AND workspace_id = $2;
INSERT INTO agent (
workspace_id, name, description, avatar_url, runtime_mode,
runtime_config, runtime_id, visibility, max_concurrent_tasks, owner_id,
instructions, custom_env, custom_args, mcp_config
) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14)
instructions, custom_env, custom_args, mcp_config, model
) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15)
RETURNING *;
-- name: UpdateAgent :one
@@ -39,6 +39,7 @@ UPDATE agent SET
custom_env = COALESCE(sqlc.narg('custom_env'), custom_env),
custom_args = COALESCE(sqlc.narg('custom_args'), custom_args),
mcp_config = COALESCE(sqlc.narg('mcp_config'), mcp_config),
model = COALESCE(sqlc.narg('model'), model),
updated_at = now()
WHERE id = $1
RETURNING *;
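
Reviewer note: `model = COALESCE(sqlc.narg('model'), model)` follows the same pattern as the neighbouring columns, so a NULL parameter keeps the stored value and a non-NULL parameter overwrites it. A side effect worth keeping in mind is that this query cannot set `model` back to NULL. The sketch below models that behaviour in plain Go. `nullText` copies the `String`/`Valid` shape of `pgtype.Text` so the example needs no third-party import, and `coalesceModel` is a hypothetical helper, not code from this PR.

```go
package main

// nullText mirrors the String/Valid fields of pgtype.Text; it is a
// stand-in for illustration, not the real type.
type nullText struct {
	String string
	Valid  bool
}

// coalesceModel mimics `model = COALESCE($15, model)`: a NULL parameter
// (Valid == false) keeps the current value, any non-NULL parameter,
// including the empty string, replaces it.
func coalesceModel(param, current nullText) nullText {
	if !param.Valid {
		return current
	}
	return param
}
```

Under this model, a caller that wants to reset the model would have to send a valid empty string rather than NULL. Whether the handler does so is not shown in this part of the diff.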