fix(agents): honour default flag wire-side and stop masking model errors

Addresses three issues from the latest PR #1399 review.

1. Wire the `default` flag end-to-end. The Model struct tagged one entry per provider as Default=true so the UI can badge it, but the daemon's heartbeat-report writer serialised models as `map[string]string`, silently dropping the bool; handler's `ModelEntry` also lacked the field. Result: the dropdown always showed "Default (provider)" and never "Default — <name>". Replace the map with a typed wire struct on the daemon side and add `Default bool` to `handler.ModelEntry`, plus a regression test asserting the flag round-trips through the report body and the store.

2. Stop imposing a static Multica-side default on task execution. `runTask` resolved the model via a three-tier chain that ended in `agent.DefaultModel(provider)`, forcing `claude-sonnet-4-6` / `gpt-5.4` / etc. onto every task whose agent hadn't set a model. That's the same shape of bug that bit us on cursor: any Go-side guess drifts from what the user's account actually has access to. Drop the third tier: when both `agent.model` and `MULTICA_<PROVIDER>_MODEL` are empty we now pass `""` through, so each backend omits `--model` from the CLI and the provider resolves its own default. `DefaultModel` / `defaultStaticModelsFor` become dead and are removed along with their test; the per-entry `Default: true` markers remain as a *display* hint on discovery responses (and are now surfaced via Fix #1).

3. Hermes: fail the task when the user's chosen model can't be applied. `session/set_model` errors used to log a warn and continue on hermes' own default, so a successful task would mislead the user into thinking their explicit pick ran. Now, when `opts.Model != ""` and the RPC fails, we send a `failed` Result with the hermes error in `result.Error` and skip the prompt entirely.

Plus gofmt clean-up on `hermes.go` (the new sniffer block picked up tab/space noise on the prior commit).
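
To illustrate Fix #1, here is a minimal standalone sketch of why the flag was lost. It assumes only what the message states: the old writer used `map[string]string`, the new one uses a typed struct with a `Default bool` field. The helper names (`encodeAsMap`, `encodeAsStruct`, `decodeDefault`) and the sample model ID are hypothetical and not taken from the codebase.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// modelWire mirrors the typed wire struct described in the commit message.
type modelWire struct {
	ID       string `json:"id"`
	Label    string `json:"label"`
	Provider string `json:"provider,omitempty"`
	Default  bool   `json:"default,omitempty"`
}

// encodeAsMap reproduces the old shape (hypothetical helper): a
// map[string]string has no slot for a bool, so the default flag
// never reaches the wire.
func encodeAsMap(id, label string) string {
	b, _ := json.Marshal(map[string]string{"id": id, "label": label})
	return string(b)
}

// encodeAsStruct keeps the flag because the struct carries a bool field.
func encodeAsStruct(m modelWire) string {
	b, _ := json.Marshal(m)
	return string(b)
}

// decodeDefault reads the flag back the way a receiver might.
// A missing "default" key decodes to false.
func decodeDefault(payload string) bool {
	var m modelWire
	if err := json.Unmarshal([]byte(payload), &m); err != nil {
		return false
	}
	return m.Default
}

func main() {
	oldWire := encodeAsMap("example-model", "Example Model")
	newWire := encodeAsStruct(modelWire{ID: "example-model", Label: "Example Model", Default: true})
	fmt.Println(oldWire, decodeDefault(oldWire))
	fmt.Println(newWire, decodeDefault(newWire))
}
```

With the map encoding the receiver always sees `Default == false`, which matches the "Default (provider)" label the dropdown was stuck on.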
fix(agent/hermes): surface provider errors instead of reporting empty output
2026-06-18 04:09:13 +02:00 · 2026-04-21 00:00:34 +08:00 · 2026-04-20 23:45:49 +08:00 · 2026-04-20 23:34:17 +08:00 · 2026-04-20 23:09:56 +08:00 · 2026-04-20 22:51:52 +08:00
31 changed files with 2229 additions and 53 deletions
--- a/apps/web/test/helpers.tsx
+++ b/apps/web/test/helpers.tsx
@@ -59,6 +59,7 @@ export const mockAgents: Agent[] = [
    custom_env_redacted: false,
    visibility: "workspace",
    max_concurrent_tasks: 3,
+    model: "",
    owner_id: null,
    skills: [],
    created_at: "2026-01-01T00:00:00Z",
--- a/packages/core/api/client.ts
+++ b/packages/core/api/client.ts
@@ -35,6 +35,7 @@ import type {
  RuntimeHourlyActivity,
  RuntimePing,
  RuntimeUpdate,
+  RuntimeModelListRequest,
  TimelineEntry,
  AssigneeFrequencyEntry,
  TaskMessagePayload,
@@ -470,6 +471,17 @@ export class ApiClient {
    return this.fetch(`/api/runtimes/${runtimeId}/update/${updateId}`);
  }

+  async initiateListModels(runtimeId: string): Promise<RuntimeModelListRequest> {
+    return this.fetch(`/api/runtimes/${runtimeId}/models`, { method: "POST" });
+  }
+
+  async getListModelsResult(
+    runtimeId: string,
+    requestId: string,
+  ): Promise<RuntimeModelListRequest> {
+    return this.fetch(`/api/runtimes/${runtimeId}/models/${requestId}`);
+  }
+
  async listAgentTasks(agentId: string): Promise<AgentTask[]> {
    return this.fetch(`/api/agents/${agentId}/tasks`);
  }
--- a/packages/core/runtimes/index.ts
+++ b/packages/core/runtimes/index.ts
@@ -1,3 +1,4 @@
 export * from "./queries";
 export * from "./mutations";
 export * from "./hooks";
+export * from "./models";
--- a/packages/core/runtimes/models.ts
+++ b/packages/core/runtimes/models.ts
@@ -0,0 +1,52 @@
+import { queryOptions } from "@tanstack/react-query";
+import { api } from "../api";
+import type { RuntimeModelsResult } from "../types/agent";
+
+export const runtimeModelsKeys = {
+  all: () => ["runtimes", "models"] as const,
+  forRuntime: (runtimeId: string) =>
+    [...runtimeModelsKeys.all(), runtimeId] as const,
+};
+
+const POLL_INTERVAL_MS = 500;
+const POLL_TIMEOUT_MS = 30_000;
+
+// resolveRuntimeModels initiates a list-models request against the daemon
+// (via heartbeat piggyback) and polls until the daemon reports back or
+// the request times out. Returns both the models list and a
+// `supported` flag: `supported=false` means the provider ignores
+// per-agent model selection entirely (hermes today) — the UI uses
+// this to disable its dropdown instead of accepting a value that
+// wouldn't be honoured at runtime.
+export async function resolveRuntimeModels(
+  runtimeId: string,
+): Promise<RuntimeModelsResult> {
+  const initial = await api.initiateListModels(runtimeId);
+  const start = Date.now();
+  let current = initial;
+  while (current.status === "pending" || current.status === "running") {
+    if (Date.now() - start > POLL_TIMEOUT_MS) {
+      throw new Error("model discovery timed out");
+    }
+    await new Promise((resolve) => setTimeout(resolve, POLL_INTERVAL_MS));
+    current = await api.getListModelsResult(runtimeId, initial.id);
+  }
+  if (current.status === "failed" || current.status === "timeout") {
+    throw new Error(current.error || "model discovery failed");
+  }
+  return { models: current.models ?? [], supported: current.supported };
+}
+
+export function runtimeModelsOptions(runtimeId: string | null | undefined) {
+  return queryOptions({
+    queryKey: runtimeId
+      ? runtimeModelsKeys.forRuntime(runtimeId)
+      : runtimeModelsKeys.all(),
+    queryFn: () => resolveRuntimeModels(runtimeId as string),
+    enabled: Boolean(runtimeId),
+    // Models rarely change; cache for 60s to match the server-side
+    // cache in agent.ListModels.
+    staleTime: 60_000,
+    retry: false,
+  });
+}
--- a/packages/core/types/agent.ts
+++ b/packages/core/types/agent.ts
@@ -54,6 +54,7 @@ export interface Agent {
  visibility: AgentVisibility;
  status: AgentStatus;
  max_concurrent_tasks: number;
+  model: string;
  owner_id: string | null;
  skills: Skill[];
  created_at: string;
@@ -73,6 +74,7 @@ export interface CreateAgentRequest {
  custom_args?: string[];
  visibility?: AgentVisibility;
  max_concurrent_tasks?: number;
+  model?: string;
 }

 export interface UpdateAgentRequest {
@@ -87,6 +89,7 @@ export interface UpdateAgentRequest {
  visibility?: AgentVisibility;
  status?: AgentStatus;
  max_concurrent_tasks?: number;
+  model?: string;
 }

 // Skills
@@ -187,3 +190,36 @@ export interface RuntimeUpdate {
  created_at: string;
  updated_at: string;
 }
+
+export interface RuntimeModel {
+  id: string;
+  label: string;
+  provider?: string;
+  default?: boolean;
+}
+
+export type RuntimeModelListStatus =
+  | "pending"
+  | "running"
+  | "completed"
+  | "failed"
+  | "timeout";
+
+export interface RuntimeModelListRequest {
+  id: string;
+  runtime_id: string;
+  status: RuntimeModelListStatus;
+  models?: RuntimeModel[];
+  supported: boolean;
+  error?: string;
+  created_at: string;
+  updated_at: string;
+}
+
+// Result shape returned by resolveRuntimeModels — includes the
+// "supported" bit so the UI can distinguish "no models discovered"
+// from "provider does not honour per-agent model selection".
+export interface RuntimeModelsResult {
+  models: RuntimeModel[];
+  supported: boolean;
+}
--- a/packages/core/types/index.ts
+++ b/packages/core/types/index.ts
@@ -20,6 +20,10 @@ export type {
  RuntimePingStatus,
  RuntimeUpdate,
  RuntimeUpdateStatus,
+  RuntimeModel,
+  RuntimeModelListRequest,
+  RuntimeModelListStatus,
+  RuntimeModelsResult,
  IssueUsageSummary,
 } from "./agent";
 export type { Workspace, WorkspaceRepo, Member, MemberRole, User, MemberWithUser, Invitation } from "./workspace";
--- a/packages/views/agents/components/create-agent-dialog.tsx
+++ b/packages/views/agents/components/create-agent-dialog.tsx
@@ -4,6 +4,7 @@ import { useState, useEffect, useMemo } from "react";
 import { Cloud, ChevronDown, Globe, Lock, Loader2 } from "lucide-react";
 import { ProviderLogo } from "../../runtimes/components/provider-logo";
 import { ActorAvatar } from "../../common/actor-avatar";
+import { ModelDropdown } from "./model-dropdown";
 import type {
  AgentVisibility,
  RuntimeDevice,
@@ -48,6 +49,7 @@ export function CreateAgentDialog({
  const [name, setName] = useState("");
  const [description, setDescription] = useState("");
  const [visibility, setVisibility] = useState<AgentVisibility>("private");
+  const [model, setModel] = useState("");
  const [creating, setCreating] = useState(false);
  const [runtimeOpen, setRuntimeOpen] = useState(false);
  const [runtimeFilter, setRuntimeFilter] = useState<RuntimeFilter>("mine");
@@ -89,6 +91,7 @@ export function CreateAgentDialog({
        description: description.trim(),
        runtime_id: selectedRuntime.id,
        visibility,
+        model: model.trim() || undefined,
      });
      onClose();
    } catch (err) {
@@ -275,6 +278,14 @@ export function CreateAgentDialog({
              </PopoverContent>
            </Popover>
          </div>
+
+          <ModelDropdown
+            runtimeId={selectedRuntime?.id ?? null}
+            runtimeOnline={selectedRuntime?.status === "online"}
+            value={model}
+            onChange={setModel}
+            disabled={!selectedRuntime}
+          />
        </div>

        <DialogFooter>
--- a/packages/views/agents/components/model-dropdown.tsx
+++ b/packages/views/agents/components/model-dropdown.tsx
@@ -0,0 +1,252 @@
+"use client";
+
+import { useEffect, useMemo, useState } from "react";
+import { useQuery } from "@tanstack/react-query";
+import { ChevronDown, Cpu, Loader2, Plus, Check, Info } from "lucide-react";
+import { runtimeModelsOptions } from "@multica/core/runtimes";
+import type { RuntimeModel } from "@multica/core/types";
+import {
+  Popover,
+  PopoverTrigger,
+  PopoverContent,
+} from "@multica/ui/components/ui/popover";
+import { Input } from "@multica/ui/components/ui/input";
+import { Label } from "@multica/ui/components/ui/label";
+
+// ModelDropdown renders a searchable, creatable model picker for an agent.
+// It fetches the supported-model catalog from the selected runtime — the
+// daemon enumerates models on demand via heartbeat piggyback. Providers
+// that don't honour per-agent model selection at runtime (currently
+// hermes) return supported=false, and the dropdown renders disabled
+// with an explanation instead of silently accepting a value the
+// backend would ignore.
+export function ModelDropdown({
+  runtimeId,
+  runtimeOnline,
+  value,
+  onChange,
+  disabled,
+}: {
+  runtimeId: string | null;
+  runtimeOnline: boolean;
+  value: string;
+  onChange: (value: string) => void;
+  disabled?: boolean;
+}) {
+  const [open, setOpen] = useState(false);
+  const [search, setSearch] = useState("");
+
+  const modelsQuery = useQuery(
+    runtimeModelsOptions(runtimeOnline ? runtimeId : null),
+  );
+
+  const supported = modelsQuery.data?.supported ?? true;
+  const models = modelsQuery.data?.models ?? [];
+  const defaultModel = useMemo(() => models.find((m) => m.default), [models]);
+  const grouped = useMemo(() => groupByProvider(models), [models]);
+
+  // When the selected runtime reports it doesn't support per-agent
+  // model selection, clear any previously-saved value so we don't
+  // persist a ghost configuration that never takes effect.
+  useEffect(() => {
+    if (!supported && value !== "") {
+      onChange("");
+    }
+  }, [supported, value, onChange]);
+
+  const filtered = useMemo(() => {
+    if (!search.trim()) return grouped;
+    const needle = search.toLowerCase();
+    const out: Record<string, RuntimeModel[]> = {};
+    for (const [provider, list] of Object.entries(grouped)) {
+      const matches = list.filter(
+        (m) =>
+          m.id.toLowerCase().includes(needle) ||
+          m.label.toLowerCase().includes(needle),
+      );
+      if (matches.length > 0) out[provider] = matches;
+    }
+    return out;
+  }, [grouped, search]);
+
+  const trimmedSearch = search.trim();
+  const exactMatch = models.some(
+    (m) => m.id === trimmedSearch || m.label === trimmedSearch,
+  );
+  const canCreate = trimmedSearch.length > 0 && !exactMatch;
+
+  const select = (id: string) => {
+    onChange(id);
+    setOpen(false);
+    setSearch("");
+  };
+
+  const triggerLabel =
+    value ||
+    (disabled
+      ? "Select a runtime first"
+      : runtimeOnline
+        ? defaultModel
+          ? `Default — ${defaultModel.label}`
+          : "Default (provider)"
+        : "Runtime offline — enter manually");
+
+  if (!supported && !modelsQuery.isLoading) {
+    // Provider doesn't honour per-agent model selection — show a
+    // clearly-disabled state so the user knows why the control is
+    // inert. (Hermes reads its model from ~/.hermes/.env.)
+    return (
+      <div className="min-w-0">
+        <Label className="text-xs text-muted-foreground">Model</Label>
+        <div className="mt-1.5 flex items-start gap-2 rounded-lg border border-dashed border-border bg-muted/30 px-3 py-2.5 text-sm text-muted-foreground">
+          <Info className="mt-0.5 h-4 w-4 shrink-0" />
+          <div className="min-w-0">
+            <div>Model selection is managed by this runtime.</div>
+            <div className="mt-0.5 text-xs">
+              Configure the model on the runtime host (e.g. Hermes reads it
+              from its own config file).
+            </div>
+          </div>
+        </div>
+      </div>
+    );
+  }
+
+  return (
+    <div className="min-w-0">
+      <div className="flex items-center justify-between">
+        <Label className="text-xs text-muted-foreground">Model</Label>
+        {modelsQuery.isError && (
+          <span className="text-xs text-muted-foreground">discovery failed</span>
+        )}
+      </div>
+      <Popover open={open} onOpenChange={setOpen}>
+        <PopoverTrigger
+          disabled={disabled}
+          className="flex w-full min-w-0 items-center gap-3 rounded-lg border border-border bg-background px-3 py-2.5 mt-1.5 text-left text-sm transition-colors hover:bg-muted disabled:pointer-events-none disabled:opacity-50"
+        >
+          <Cpu className="h-4 w-4 shrink-0 text-muted-foreground" />
+          <div className="min-w-0 flex-1">
+            <div className="truncate font-medium">
+              {triggerLabel}
+            </div>
+            {value && (
+              <div className="truncate text-xs text-muted-foreground">
+                {modelLabel(models, value)}
+              </div>
+            )}
+          </div>
+          <ChevronDown
+            className={`h-4 w-4 shrink-0 text-muted-foreground transition-transform ${open ? "rotate-180" : ""}`}
+          />
+        </PopoverTrigger>
+        <PopoverContent
+          align="start"
+          className="w-[var(--anchor-width)] p-0 overflow-hidden"
+        >
+          <div className="border-b border-border p-2">
+            <Input
+              autoFocus
+              placeholder="Search or type a model ID"
+              value={search}
+              onChange={(e) => setSearch(e.target.value)}
+              className="h-8"
+            />
+          </div>
+          <div className="max-h-72 overflow-y-auto p-1">
+            {modelsQuery.isLoading && (
+              <div className="flex items-center gap-2 px-3 py-6 text-sm text-muted-foreground">
+                <Loader2 className="h-4 w-4 animate-spin" />
+                Discovering models…
+              </div>
+            )}
+
+            {!modelsQuery.isLoading &&
+              Object.entries(filtered).map(([provider, list]) => (
+                <div key={provider} className="mb-1">
+                  {provider && (
+                    <div className="px-2 pt-1.5 pb-0.5 text-xs font-medium uppercase tracking-wide text-muted-foreground">
+                      {provider}
+                    </div>
+                  )}
+                  {list.map((m) => (
+                    <button
+                      key={m.id}
+                      onClick={() => select(m.id)}
+                      className={`flex w-full items-center gap-2 rounded-md px-3 py-2 text-left text-sm transition-colors ${
+                        m.id === value ? "bg-accent" : "hover:bg-accent/50"
+                      }`}
+                    >
+                      <div className="min-w-0 flex-1">
+                        <div className="flex items-center gap-1.5">
+                          <span className="truncate font-medium">{m.label}</span>
+                          {m.default && (
+                            <span className="shrink-0 rounded bg-primary/10 px-1.5 py-0.5 text-xs font-medium text-primary">
+                              default
+                            </span>
+                          )}
+                        </div>
+                        {m.label !== m.id && (
+                          <div className="truncate text-xs text-muted-foreground">
+                            {m.id}
+                          </div>
+                        )}
+                      </div>
+                      {m.id === value && (
+                        <Check className="h-4 w-4 shrink-0 text-primary" />
+                      )}
+                    </button>
+                  ))}
+                </div>
+              ))}
+
+            {!modelsQuery.isLoading &&
+              Object.keys(filtered).length === 0 &&
+              !canCreate && (
+                <div className="px-3 py-6 text-center text-sm text-muted-foreground">
+                  No models available.
+                </div>
+              )}
+
+            {canCreate && (
+              <button
+                onClick={() => select(trimmedSearch)}
+                className="flex w-full items-center gap-2 rounded-md px-3 py-2 text-left text-sm text-primary transition-colors hover:bg-accent/50"
+              >
+                <Plus className="h-4 w-4 shrink-0" />
+                <span className="truncate">
+                  Use “{trimmedSearch}”
+                </span>
+              </button>
+            )}
+
+            {value && (
+              <button
+                onClick={() => select("")}
+                className="mt-1 flex w-full items-center gap-2 border-t border-border px-3 py-2 text-left text-xs text-muted-foreground transition-colors hover:bg-accent/50"
+              >
+                Clear selection (use provider default)
+              </button>
+            )}
+          </div>
+        </PopoverContent>
+      </Popover>
+    </div>
+  );
+}
+
+function groupByProvider(models: RuntimeModel[]): Record<string, RuntimeModel[]> {
+  const out: Record<string, RuntimeModel[]> = {};
+  for (const m of models) {
+    const key = m.provider ?? "";
+    if (!out[key]) out[key] = [];
+    out[key].push(m);
+  }
+  return out;
+}
+
+function modelLabel(models: RuntimeModel[], id: string): string {
+  const found = models.find((m) => m.id === id);
+  if (!found) return "custom";
+  return found.provider ? found.provider : "model";
+}
--- a/packages/views/agents/components/tabs/settings-tab.tsx
+++ b/packages/views/agents/components/tabs/settings-tab.tsx
@@ -23,6 +23,7 @@ import { api } from "@multica/core/api";
 import { useFileUpload } from "@multica/core/hooks/use-file-upload";
 import { ActorAvatar } from "../../../common/actor-avatar";
 import { ProviderLogo } from "../../../runtimes/components/provider-logo";
+import { ModelDropdown } from "../model-dropdown";

 type RuntimeFilter = "mine" | "all";

@@ -44,6 +45,7 @@ export function SettingsTab({
  const [visibility, setVisibility] = useState<AgentVisibility>(agent.visibility);
  const [maxTasks, setMaxTasks] = useState(agent.max_concurrent_tasks);
  const [selectedRuntimeId, setSelectedRuntimeId] = useState(agent.runtime_id);
+  const [model, setModel] = useState(agent.model ?? "");
  const [runtimeOpen, setRuntimeOpen] = useState(false);
  const [runtimeFilter, setRuntimeFilter] = useState<RuntimeFilter>("mine");
  const [saving, setSaving] = useState(false);
@@ -90,7 +92,8 @@ export function SettingsTab({
    description !== (agent.description ?? "") ||
    visibility !== agent.visibility ||
    maxTasks !== agent.max_concurrent_tasks ||
-    selectedRuntimeId !== agent.runtime_id;
+    selectedRuntimeId !== agent.runtime_id ||
+    model !== (agent.model ?? "");

  const handleSave = async () => {
    if (!name.trim()) {
@@ -106,6 +109,7 @@ export function SettingsTab({
        visibility,
        max_concurrent_tasks: maxTasks,
        runtime_id: selectedRuntimeId,
+        model,
      });
      toast.success("Settings saved");
    } catch {
@@ -321,6 +325,14 @@ export function SettingsTab({
        </Popover>
      </div>

+      <ModelDropdown
+        runtimeId={selectedRuntime?.id ?? null}
+        runtimeOnline={selectedRuntime?.status === "online"}
+        value={model}
+        onChange={setModel}
+        disabled={!selectedRuntime}
+      />
+
      <Button onClick={handleSave} disabled={!dirty || saving} size="sm">
        {saving ? <Loader2 className="h-3.5 w-3.5 mr-1.5 animate-spin" /> : <Save className="h-3.5 w-3.5 mr-1.5" />}
        Save Changes
--- a/packages/views/agents/components/tabs/tasks-tab.test.tsx
+++ b/packages/views/agents/components/tabs/tasks-tab.test.tsx
@@ -55,6 +55,7 @@ const agent: Agent = {
  visibility: "workspace",
  status: "idle",
  max_concurrent_tasks: 1,
+  model: "",
  owner_id: null,
  skills: [],
  created_at: "2026-04-16T00:00:00Z",
--- a/server/cmd/multica/cmd_agent.go
+++ b/server/cmd/multica/cmd_agent.go
@@ -114,7 +114,8 @@ func init() {
 	agentCreateCmd.Flags().String("instructions", "", "Agent instructions")
 	agentCreateCmd.Flags().String("runtime-id", "", "Runtime ID (required)")
 	agentCreateCmd.Flags().String("runtime-config", "", "Runtime config as JSON string")
-	agentCreateCmd.Flags().String("custom-args", "", "Custom CLI arguments as JSON array (e.g. '[\"--model\", \"o3\"]')")
+	agentCreateCmd.Flags().String("model", "", "Model identifier (e.g. claude-sonnet-4-6, openai/gpt-4o). Prefer this over passing --model in --custom-args.")
+	agentCreateCmd.Flags().String("custom-args", "", "Custom CLI arguments as JSON array. For model selection prefer --model; some providers (codex app-server, openclaw) reject --model in custom_args.")
 	agentCreateCmd.Flags().String("visibility", "private", "Visibility: private or workspace")
 	agentCreateCmd.Flags().Int32("max-concurrent-tasks", 6, "Maximum concurrent tasks")
 	agentCreateCmd.Flags().String("output", "json", "Output format: table or json")
@@ -125,7 +126,8 @@ func init() {
 	agentUpdateCmd.Flags().String("instructions", "", "New instructions")
 	agentUpdateCmd.Flags().String("runtime-id", "", "New runtime ID")
 	agentUpdateCmd.Flags().String("runtime-config", "", "New runtime config as JSON string")
-	agentUpdateCmd.Flags().String("custom-args", "", "New custom CLI arguments as JSON array (e.g. '[\"--model\", \"o3\"]')")
+	agentUpdateCmd.Flags().String("model", "", "New model identifier. Pass an empty string to clear and fall back to the runtime default.")
+	agentUpdateCmd.Flags().String("custom-args", "", "New custom CLI arguments as JSON array. For model selection prefer --model; some providers (codex app-server, openclaw) reject --model in custom_args.")
 	agentUpdateCmd.Flags().String("visibility", "", "New visibility: private or workspace")
 	agentUpdateCmd.Flags().String("status", "", "New status")
 	agentUpdateCmd.Flags().Int32("max-concurrent-tasks", 0, "New max concurrent tasks")
@@ -347,6 +349,10 @@ func runAgentCreate(cmd *cobra.Command, _ []string) error {
 		}
 		body["custom_args"] = ca
 	}
+	if cmd.Flags().Changed("model") {
+		v, _ := cmd.Flags().GetString("model")
+		body["model"] = v
+	}
 	if cmd.Flags().Changed("visibility") {
 		v, _ := cmd.Flags().GetString("visibility")
 		body["visibility"] = v
@@ -412,6 +418,10 @@ func runAgentUpdate(cmd *cobra.Command, args []string) error {
 		}
 		body["custom_args"] = ca
 	}
+	if cmd.Flags().Changed("model") {
+		v, _ := cmd.Flags().GetString("model")
+		body["model"] = v
+	}
 	if cmd.Flags().Changed("visibility") {
 		v, _ := cmd.Flags().GetString("visibility")
 		body["visibility"] = v
@@ -426,7 +436,7 @@ func runAgentUpdate(cmd *cobra.Command, args []string) error {
 	}

 	if len(body) == 0 {
-		return fmt.Errorf("no fields to update; use --name, --description, --instructions, --runtime-id, --runtime-config, --custom-args, --visibility, --status, or --max-concurrent-tasks")
+		return fmt.Errorf("no fields to update; use --name, --description, --instructions, --runtime-id, --runtime-config, --model, --custom-args, --visibility, --status, or --max-concurrent-tasks")
 	}

 	ctx, cancel := context.WithTimeout(context.Background(), 15*time.Second)
--- a/server/cmd/server/router.go
+++ b/server/cmd/server/router.go
@@ -145,6 +145,7 @@ func NewRouter(pool *pgxpool.Pool, hub *realtime.Hub, bus *events.Bus) chi.Route
 		r.Get("/runtimes/{runtimeId}/tasks/pending", h.ListPendingTasksByRuntime)
 		r.Post("/runtimes/{runtimeId}/ping/{pingId}/result", h.ReportPingResult)
 		r.Post("/runtimes/{runtimeId}/update/{updateId}/result", h.ReportUpdateResult)
+		r.Post("/runtimes/{runtimeId}/models/{requestId}/result", h.ReportModelListResult)

 		r.Get("/tasks/{taskId}/status", h.GetTaskStatus)
 		r.Post("/tasks/{taskId}/start", h.StartTask)
@@ -346,6 +347,8 @@ func NewRouter(pool *pgxpool.Pool, hub *realtime.Hub, bus *events.Bus) chi.Route
 					r.Get("/ping/{pingId}", h.GetPing)
 					r.Post("/update", h.InitiateUpdate)
 					r.Get("/update/{updateId}", h.GetUpdate)
+					r.Post("/models", h.InitiateListModels)
+					r.Get("/models/{requestId}", h.GetModelListRequest)
 					r.Delete("/", h.DeleteAgentRuntime)
 				})
 			})
--- a/server/internal/daemon/client.go
+++ b/server/internal/daemon/client.go
@@ -147,9 +147,10 @@ func (c *Client) GetTaskStatus(ctx context.Context, taskID string) (string, erro

 // HeartbeatResponse contains the server's response to a heartbeat, including any pending actions.
 type HeartbeatResponse struct {
-	Status        string         `json:"status"`
-	PendingPing   *PendingPing   `json:"pending_ping,omitempty"`
-	PendingUpdate *PendingUpdate `json:"pending_update,omitempty"`
+	Status           string            `json:"status"`
+	PendingPing      *PendingPing      `json:"pending_ping,omitempty"`
+	PendingUpdate    *PendingUpdate    `json:"pending_update,omitempty"`
+	PendingModelList *PendingModelList `json:"pending_model_list,omitempty"`
 }

 // PendingPing represents a ping test request from the server.
@@ -163,6 +164,11 @@ type PendingUpdate struct {
 	TargetVersion string `json:"target_version"`
 }

+// PendingModelList represents a request to enumerate supported models.
+type PendingModelList struct {
+	ID string `json:"id"`
+}
+
 func (c *Client) SendHeartbeat(ctx context.Context, runtimeID string) (*HeartbeatResponse, error) {
 	var resp HeartbeatResponse
 	if err := c.postJSON(ctx, "/api/daemon/heartbeat", map[string]string{
@@ -182,6 +188,11 @@ func (c *Client) ReportUpdateResult(ctx context.Context, runtimeID, updateID str
 	return c.postJSON(ctx, fmt.Sprintf("/api/daemon/runtimes/%s/update/%s/result", runtimeID, updateID), result, nil)
 }

+// ReportModelListResult sends the model-discovery result back to the server.
+func (c *Client) ReportModelListResult(ctx context.Context, runtimeID, requestID string, result map[string]any) error {
+	return c.postJSON(ctx, fmt.Sprintf("/api/daemon/runtimes/%s/models/%s/result", runtimeID, requestID), result, nil)
+}
+
 // WorkspaceInfo holds minimal workspace metadata returned by the API.
 type WorkspaceInfo struct {
 	ID   string `json:"id"`
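
The heartbeat piggyback added above can be sketched in isolation as follows. This is an illustrative, standalone example rather than the daemon's code: the lowercase type names and the `pendingModelListID` helper are hypothetical, and the `"ok"` status value and `req-1` ID are assumed sample data. It only shows that an absent `pending_model_list` key leaves the pointer nil, which is what the heartbeat loop checks before dispatching.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pendingModelList mirrors the PendingModelList payload from the diff.
type pendingModelList struct {
	ID string `json:"id"`
}

// heartbeatResponse is a trimmed copy of HeartbeatResponse that keeps
// only the fields this sketch needs.
type heartbeatResponse struct {
	Status           string            `json:"status"`
	PendingModelList *pendingModelList `json:"pending_model_list,omitempty"`
}

// pendingModelListID (hypothetical helper) returns the request ID carried
// by a heartbeat reply. The bool is false when no request is pending.
func pendingModelListID(body []byte) (string, bool, error) {
	var resp heartbeatResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return "", false, err
	}
	if resp.PendingModelList == nil {
		return "", false, nil
	}
	return resp.PendingModelList.ID, true, nil
}

func main() {
	bodies := []string{
		`{"status":"ok"}`,
		`{"status":"ok","pending_model_list":{"id":"req-1"}}`,
	}
	for _, body := range bodies {
		id, ok, err := pendingModelListID([]byte(body))
		fmt.Println(id, ok, err)
	}
}
```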
--- a/server/internal/daemon/daemon.go
+++ b/server/internal/daemon/daemon.go
@@ -496,11 +496,70 @@ func (d *Daemon) heartbeatLoop(ctx context.Context) {
 				if resp.PendingUpdate != nil {
 					go d.handleUpdate(ctx, rid, resp.PendingUpdate)
 				}
+
+				// Handle pending model-list requests.
+				if resp.PendingModelList != nil {
+					rt := d.findRuntime(rid)
+					if rt != nil {
+						go d.handleModelList(ctx, *rt, resp.PendingModelList.ID)
+					}
+				}
 			}
 		}
 	}
 }

+// handleModelList resolves the provider's supported models (via static
+// catalog or by shelling out to the agent CLI) and reports the result
+// back to the server. A missing agent config or a ListModels error is
+// reported as status "failed" with the error text; the UI shows
+// "discovery failed" and still accepts a manually typed model ID.
+func (d *Daemon) handleModelList(ctx context.Context, rt Runtime, requestID string) {
+	d.logger.Info("model list requested", "runtime_id", rt.ID, "request_id", requestID, "provider", rt.Provider)
+
+	entry, ok := d.cfg.Agents[rt.Provider]
+	if !ok {
+		d.client.ReportModelListResult(ctx, rt.ID, requestID, map[string]any{
+			"status": "failed",
+			"error":  fmt.Sprintf("no agent configured for provider %q", rt.Provider),
+		})
+		return
+	}
+
+	models, err := agent.ListModels(ctx, rt.Provider, entry.Path)
+	if err != nil {
+		d.client.ReportModelListResult(ctx, rt.ID, requestID, map[string]any{
+			"status": "failed",
+			"error":  err.Error(),
+		})
+		return
+	}
+
+	// Wire format matches handler.ModelEntry. Use a struct (not
+	// map[string]string) so the Default bool round-trips — without
+	// it the UI loses its "default" badge on the advertised pick.
+	type modelWire struct {
+		ID       string `json:"id"`
+		Label    string `json:"label"`
+		Provider string `json:"provider,omitempty"`
+		Default  bool   `json:"default,omitempty"`
+	}
+	wire := make([]modelWire, 0, len(models))
+	for _, m := range models {
+		wire = append(wire, modelWire{
+			ID:       m.ID,
+			Label:    m.Label,
+			Provider: m.Provider,
+			Default:  m.Default,
+		})
+	}
+	d.client.ReportModelListResult(ctx, rt.ID, requestID, map[string]any{
+		"status":    "completed",
+		"models":    wire,
+		"supported": agent.ModelSelectionSupported(rt.Provider),
+	})
+}
+
 func (d *Daemon) handlePing(ctx context.Context, rt Runtime, pingID string) {
 	d.logger.Info("ping requested", "runtime_id", rt.ID, "ping_id", pingID, "provider", rt.Provider)

@@ -1018,9 +1077,25 @@ func (d *Daemon) runTask(ctx context.Context, task Task, provider string, taskLo
 		customArgs = task.Agent.CustomArgs
 		mcpConfig = task.Agent.McpConfig
 	}
+	// Two-tier model resolution: an explicit agent.model wins,
+	// then the daemon-wide MULTICA_<PROVIDER>_MODEL env var. If
+	// both are empty we deliberately pass "" through — each
+	// backend omits `--model` from the CLI invocation, so the
+	// provider picks its own default (Claude Code's shipped
+	// default, codex app-server's account-scoped default, etc.).
+	// Baking a Go-side "recommended default" here is how the
+	// cursor regression happened — static guesses drift from
+	// whatever the upstream CLI actually accepts.
+	model := ""
+	if task.Agent != nil && task.Agent.Model != "" {
+		model = task.Agent.Model
+	}
+	if model == "" {
+		model = entry.Model
+	}
 	execOpts := agent.ExecOptions{
 		Cwd:             env.WorkDir,
-		Model:           entry.Model,
+		Model:           model,
 		Timeout:         d.cfg.AgentTimeout,
 		ResumeSessionID: task.PriorSessionID,
 		CustomArgs:      customArgs,
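
The two-tier resolution in the `runTask` hunk above can be sketched on its own like this. It is a simplified illustration under stated assumptions, not the daemon's implementation: `resolveModel` and `buildArgs` are hypothetical helpers, the `run` base argument and model names are made up, and the real backends may assemble their CLI arguments differently. The sketch only shows the intended rule: the agent's model wins, then the daemon-wide value, and an empty result means `--model` is omitted.

```go
package main

import "fmt"

// resolveModel applies the two tiers: an explicit agent model wins,
// otherwise the daemon-wide (env-derived) model is used. When both are
// empty the result is "", and no static default is substituted.
func resolveModel(agentModel, envModel string) string {
	if agentModel != "" {
		return agentModel
	}
	return envModel
}

// buildArgs (hypothetical) appends --model only when a model was
// resolved, so the provider CLI falls back to its own default otherwise.
func buildArgs(base []string, model string) []string {
	args := append([]string{}, base...) // copy so the caller's slice is untouched
	if model != "" {
		args = append(args, "--model", model)
	}
	return args
}

func main() {
	fmt.Println(buildArgs([]string{"run"}, resolveModel("agent-pick", "env-pick")))
	fmt.Println(buildArgs([]string{"run"}, resolveModel("", "env-pick")))
	fmt.Println(buildArgs([]string{"run"}, resolveModel("", "")))
}
```

Keeping the empty string as a meaningful "no preference" value is what lets each backend decide for itself, instead of the daemon guessing a model the account may not have.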
--- a/server/internal/daemon/types.go
+++ b/server/internal/daemon/types.go
@@ -49,6 +49,7 @@ type AgentData struct {
 	CustomEnv    map[string]string `json:"custom_env,omitempty"`
 	CustomArgs   []string          `json:"custom_args,omitempty"`
 	McpConfig    json.RawMessage   `json:"mcp_config,omitempty"`
+	Model        string            `json:"model,omitempty"`
 }

 // SkillData represents a structured skill for task execution.
--- a/server/internal/handler/agent.go
+++ b/server/internal/handler/agent.go
@@ -36,6 +36,7 @@ type AgentResponse struct {
 	Visibility         string            `json:"visibility"`
 	Status             string            `json:"status"`
 	MaxConcurrentTasks int32             `json:"max_concurrent_tasks"`
+	Model              string            `json:"model"`
 	OwnerID            *string           `json:"owner_id"`
 	Skills             []SkillResponse   `json:"skills"`
 	CreatedAt          string            `json:"created_at"`
@@ -94,6 +95,7 @@ func agentToResponse(a db.Agent) AgentResponse {
 		Visibility:         a.Visibility,
 		Status:             a.Status,
 		MaxConcurrentTasks: a.MaxConcurrentTasks,
+		Model:              a.Model.String,
 		OwnerID:            uuidToPtr(a.OwnerID),
 		Skills:             []SkillResponse{},
 		CreatedAt:          timestampToString(a.CreatedAt),
@@ -144,6 +146,7 @@ type TaskAgentData struct {
 	CustomEnv    map[string]string        `json:"custom_env,omitempty"`
 	CustomArgs   []string                 `json:"custom_args,omitempty"`
 	McpConfig    json.RawMessage          `json:"mcp_config,omitempty"`
+	Model        string                   `json:"model,omitempty"`
 }

 func taskToResponse(t db.AgentTaskQueue) AgentTaskResponse {
@@ -265,6 +268,7 @@ type CreateAgentRequest struct {
 	McpConfig          json.RawMessage   `json:"mcp_config"`
 	Visibility         string            `json:"visibility"`
 	MaxConcurrentTasks int32             `json:"max_concurrent_tasks"`
+	Model              string            `json:"model"`
 }

 func decodeJSONBodyWithRawFields(body io.Reader, dst any) (map[string]json.RawMessage, error) {
@@ -362,6 +366,7 @@ func (h *Handler) CreateAgent(w http.ResponseWriter, r *http.Request) {
 		CustomEnv:          ce,
 		CustomArgs:         ca,
 		McpConfig:          mc,
+		Model:              pgtype.Text{String: req.Model, Valid: req.Model != ""},
 	})
 	if err != nil {
 		// Unique constraint on (workspace_id, name) — return a clear conflict error
@@ -401,6 +406,7 @@ type UpdateAgentRequest struct {
 	Visibility         *string            `json:"visibility"`
 	Status             *string            `json:"status"`
 	MaxConcurrentTasks *int32             `json:"max_concurrent_tasks"`
+	Model              *string            `json:"model"`
 }

 // canViewAgentEnv checks whether the requesting user is allowed to see the
@@ -523,6 +529,9 @@ func (h *Handler) UpdateAgent(w http.ResponseWriter, r *http.Request) {
 	if req.MaxConcurrentTasks != nil {
 		params.MaxConcurrentTasks = pgtype.Int4{Int32: *req.MaxConcurrentTasks, Valid: true}
 	}
+	if req.Model != nil {
+		params.Model = pgtype.Text{String: *req.Model, Valid: true}
+	}

 	agent, err = h.Queries.UpdateAgent(r.Context(), params)
 	if err != nil {
--- a/server/internal/handler/daemon.go
+++ b/server/internal/handler/daemon.go
@@ -494,6 +494,11 @@ func (h *Handler) DaemonHeartbeat(w http.ResponseWriter, r *http.Request) {
 		}
 	}

+	// Check for pending model-list requests for this runtime.
+	if pending := h.ModelListStore.PopPending(req.RuntimeID); pending != nil {
+		resp["pending_model_list"] = map[string]string{"id": pending.ID}
+	}
+
 	writeJSON(w, http.StatusOK, resp)
 }

@@ -589,6 +594,7 @@ func (h *Handler) ClaimTaskByRuntime(w http.ResponseWriter, r *http.Request) {
 			CustomEnv:    customEnv,
 			CustomArgs:   customArgs,
 			McpConfig:    mcpConfig,
+			Model:        agent.Model.String,
 		}
 	}

--- a/server/internal/handler/handler.go
+++ b/server/internal/handler/handler.go
@@ -48,6 +48,7 @@ type Handler struct {
 	EmailService     *service.EmailService
 	PingStore        *PingStore
 	UpdateStore      *UpdateStore
+	ModelListStore   *ModelListStore
 	Storage          storage.Storage
 	CFSigner         *auth.CloudFrontSigner
 	cfg              Config
@@ -71,6 +72,7 @@ func New(queries *db.Queries, txStarter txStarter, hub *realtime.Hub, bus *event
 		EmailService:     emailService,
 		PingStore:        NewPingStore(),
 		UpdateStore:      NewUpdateStore(),
+		ModelListStore:   NewModelListStore(),
 		Storage:          store,
 		CFSigner:         cfSigner,
 		cfg:              cfg,
--- a/server/internal/handler/runtime_models.go
+++ b/server/internal/handler/runtime_models.go
@@ -0,0 +1,228 @@
+package handler
+
+import (
+	"encoding/json"
+	"log/slog"
+	"net/http"
+	"sync"
+	"time"
+
+	"github.com/go-chi/chi/v5"
+)
+
+// ---------------------------------------------------------------------------
+// In-memory model-list request store
+// ---------------------------------------------------------------------------
+//
+// The server cannot call the daemon directly (the daemon is behind the user's
+// NAT and only polls the server). So "list models for this runtime" uses the
+// same pattern as PingStore: server creates a pending request, daemon pops it
+// on the next heartbeat, executes locally, and reports the result back.
+
+// ModelListStatus represents the lifecycle of a model list request.
+type ModelListStatus string
+
+const (
+	ModelListPending   ModelListStatus = "pending"
+	ModelListRunning   ModelListStatus = "running"
+	ModelListCompleted ModelListStatus = "completed"
+	ModelListFailed    ModelListStatus = "failed"
+	ModelListTimeout   ModelListStatus = "timeout"
+)
+
+// ModelListRequest represents a pending or completed model list request.
+// Supported is false when the provider ignores per-agent model
+// selection entirely (currently none; hermes now applies it via
+// session/set_model). The UI uses this to disable its dropdown
+// rather than silently accepting a value the backend will drop.
+type ModelListRequest struct {
+	ID        string          `json:"id"`
+	RuntimeID string          `json:"runtime_id"`
+	Status    ModelListStatus `json:"status"`
+	Models    []ModelEntry    `json:"models,omitempty"`
+	Supported bool            `json:"supported"`
+	Error     string          `json:"error,omitempty"`
+	CreatedAt time.Time       `json:"created_at"`
+	UpdatedAt time.Time       `json:"updated_at"`
+}
+
+// ModelEntry mirrors agent.Model for the wire. `Default` tags the
+// model the runtime advertises as its preferred pick (e.g. Claude
+// Code's shipped default, or hermes' currentModelId) so the UI can
+// badge it — don't drop it when marshalling.
+type ModelEntry struct {
+	ID       string `json:"id"`
+	Label    string `json:"label"`
+	Provider string `json:"provider,omitempty"`
+	Default  bool   `json:"default,omitempty"`
+}
+
+// ModelListStore is a thread-safe in-memory store. Entries older than 2 min
+// are dropped on the next Create; the UI polls /requests/:id until terminal.
+type ModelListStore struct {
+	mu       sync.Mutex
+	requests map[string]*ModelListRequest
+}
+
+func NewModelListStore() *ModelListStore {
+	return &ModelListStore{requests: make(map[string]*ModelListRequest)}
+}
+
+func (s *ModelListStore) Create(runtimeID string) *ModelListRequest {
+	s.mu.Lock()
+	defer s.mu.Unlock()
+
+	// Garbage-collect stale entries so the map can't grow unbounded.
+	for id, req := range s.requests {
+		if time.Since(req.CreatedAt) > 2*time.Minute {
+			delete(s.requests, id)
+		}
+	}
+
+	req := &ModelListRequest{
+		ID:        randomID(),
+		RuntimeID: runtimeID,
+		Status:    ModelListPending,
+		// Default to true; the daemon overrides this in the report
+		// for providers that don't support per-agent model selection.
+		Supported: true,
+		CreatedAt: time.Now(),
+		UpdatedAt: time.Now(),
+	}
+	s.requests[req.ID] = req
+	return req
+}
+
+func (s *ModelListStore) Get(id string) *ModelListRequest {
+	s.mu.Lock()
+	defer s.mu.Unlock()
+
+	req, ok := s.requests[id]
+	if !ok {
+		return nil
+	}
+	if req.Status == ModelListPending && time.Since(req.CreatedAt) > 30*time.Second {
+		req.Status = ModelListTimeout
+		req.Error = "daemon did not respond within 30 seconds"
+		req.UpdatedAt = time.Now()
+	}
+	return req
+}
+
+// PopPending returns the oldest pending request for a runtime and marks it running.
+func (s *ModelListStore) PopPending(runtimeID string) *ModelListRequest {
+	s.mu.Lock()
+	defer s.mu.Unlock()
+
+	var oldest *ModelListRequest
+	for _, req := range s.requests {
+		if req.RuntimeID == runtimeID && req.Status == ModelListPending {
+			if oldest == nil || req.CreatedAt.Before(oldest.CreatedAt) {
+				oldest = req
+			}
+		}
+	}
+	if oldest != nil {
+		oldest.Status = ModelListRunning
+		oldest.UpdatedAt = time.Now()
+	}
+	return oldest
+}
+
+func (s *ModelListStore) Complete(id string, models []ModelEntry, supported bool) {
+	s.mu.Lock()
+	defer s.mu.Unlock()
+
+	if req, ok := s.requests[id]; ok {
+		req.Status = ModelListCompleted
+		req.Models = models
+		req.Supported = supported
+		req.UpdatedAt = time.Now()
+	}
+}
+
+func (s *ModelListStore) Fail(id string, errMsg string) {
+	s.mu.Lock()
+	defer s.mu.Unlock()
+
+	if req, ok := s.requests[id]; ok {
+		req.Status = ModelListFailed
+		req.Error = errMsg
+		req.UpdatedAt = time.Now()
+	}
+}
+
+// ---------------------------------------------------------------------------
+// Handlers
+// ---------------------------------------------------------------------------
+
+// InitiateListModels creates a pending model list request for a runtime.
+// Called by the frontend; the daemon picks it up on its next heartbeat.
+func (h *Handler) InitiateListModels(w http.ResponseWriter, r *http.Request) {
+	runtimeID := chi.URLParam(r, "runtimeId")
+
+	rt, err := h.Queries.GetAgentRuntime(r.Context(), parseUUID(runtimeID))
+	if err != nil {
+		writeError(w, http.StatusNotFound, "runtime not found")
+		return
+	}
+	if _, ok := h.requireWorkspaceMember(w, r, uuidToString(rt.WorkspaceID), "runtime not found"); !ok {
+		return
+	}
+	if rt.Status != "online" {
+		writeError(w, http.StatusServiceUnavailable, "runtime is offline")
+		return
+	}
+
+	req := h.ModelListStore.Create(runtimeID)
+	writeJSON(w, http.StatusOK, req)
+}
+
+// GetModelListRequest returns the status of a model list request.
+func (h *Handler) GetModelListRequest(w http.ResponseWriter, r *http.Request) {
+	requestID := chi.URLParam(r, "requestId")
+
+	req := h.ModelListStore.Get(requestID)
+	if req == nil {
+		writeError(w, http.StatusNotFound, "request not found")
+		return
+	}
+	writeJSON(w, http.StatusOK, req)
+}
+
+// ReportModelListResult receives the list result from the daemon.
+func (h *Handler) ReportModelListResult(w http.ResponseWriter, r *http.Request) {
+	runtimeID := chi.URLParam(r, "runtimeId")
+
+	if _, ok := h.requireDaemonRuntimeAccess(w, r, runtimeID); !ok {
+		return
+	}
+
+	requestID := chi.URLParam(r, "requestId")
+
+	var body struct {
+		Status    string       `json:"status"` // "completed" or "failed"
+		Models    []ModelEntry `json:"models"`
+		Supported *bool        `json:"supported"`
+		Error     string       `json:"error"`
+	}
+	if err := json.NewDecoder(r.Body).Decode(&body); err != nil {
+		writeError(w, http.StatusBadRequest, "invalid request body")
+		return
+	}
+
+	if body.Status == "completed" {
+		// Older daemons may omit `supported`; default to true to keep
+		// the UI usable while they haven't been redeployed yet.
+		supported := true
+		if body.Supported != nil {
+			supported = *body.Supported
+		}
+		h.ModelListStore.Complete(requestID, body.Models, supported)
+	} else {
+		h.ModelListStore.Fail(requestID, body.Error)
+	}
+
+	slog.Debug("model list report", "runtime_id", runtimeID, "request_id", requestID, "status", body.Status, "count", len(body.Models))
+	writeJSON(w, http.StatusOK, map[string]string{"status": "ok"})
+}
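Review note: the `Supported *bool` field in `ReportModelListResult` relies on a pointer to tell "field omitted" apart from "field set to false". A minimal sketch of that decoding rule follows, assuming a cut-down body type. `report` and `decodeSupported` are illustrative names, not part of this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// report is a cut-down, hypothetical version of the handler's request body.
type report struct {
	Status    string `json:"status"`
	Supported *bool  `json:"supported"` // nil when the daemon omits the key
}

// decodeSupported applies the "older daemons omit the field, so default
// to true" rule described in the handler comment.
func decodeSupported(raw string) (bool, error) {
	var r report
	if err := json.Unmarshal([]byte(raw), &r); err != nil {
		return false, err
	}
	if r.Supported == nil {
		return true, nil
	}
	return *r.Supported, nil
}

func main() {
	for _, raw := range []string{
		`{"status":"completed"}`,
		`{"status":"completed","supported":false}`,
	} {
		ok, err := decodeSupported(raw)
		fmt.Println(ok, err)
	}
}
```

A plain `bool` field would decode both payloads to `false`, which is why the pointer is needed for backward compatibility.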
--- a/server/internal/handler/runtime_models_test.go
+++ b/server/internal/handler/runtime_models_test.go
@@ -0,0 +1,88 @@
+package handler
+
+import (
+	"bytes"
+	"encoding/json"
+	"net/http"
+	"net/http/httptest"
+	"testing"
+)
+
+// TestReportModelListResult_PreservesDefault guards the daemon → server
+// → UI wire format for the model-discovery result. The `default` bool
+// on each ModelEntry lights up the UI's "default" badge; if it gets
+// dropped here (e.g. by going through a map[string]string), the badge
+// silently disappears.
+func TestReportModelListResult_PreservesDefault(t *testing.T) {
+	store := NewModelListStore()
+	req := store.Create("runtime-xyz")
+
+	// Report a completed result with one default entry and one not.
+	body := map[string]any{
+		"status":    "completed",
+		"supported": true,
+		"models": []map[string]any{
+			{"id": "foo-default", "label": "Foo", "provider": "p", "default": true},
+			{"id": "bar", "label": "Bar", "provider": "p"},
+		},
+	}
+	raw, _ := json.Marshal(body)
+
+	// Use the store's Complete directly — we're verifying the wire
+	// shape, not HTTP auth. The handler itself unmarshals into
+	// []ModelEntry and forwards verbatim, which is the path we care
+	// about here.
+	var parsed struct {
+		Models []ModelEntry `json:"models"`
+	}
+	if err := json.Unmarshal(raw, &parsed); err != nil {
+		t.Fatalf("unmarshal report body: %v", err)
+	}
+	store.Complete(req.ID, parsed.Models, true)
+
+	got := store.Get(req.ID)
+	if got == nil {
+		t.Fatal("expected stored result")
+	}
+	if len(got.Models) != 2 {
+		t.Fatalf("expected 2 models, got %d: %+v", len(got.Models), got.Models)
+	}
+	if !got.Models[0].Default {
+		t.Errorf("first model should carry Default=true, got %+v", got.Models[0])
+	}
+	if got.Models[1].Default {
+		t.Errorf("second model should carry Default=false, got %+v", got.Models[1])
+	}
+
+	// Serialise the stored request back out (what UI actually sees)
+	// and confirm `default: true` survives.
+	out, _ := json.Marshal(got)
+	if !bytes.Contains(out, []byte(`"default":true`)) {
+		t.Errorf(`expected "default":true in JSON response, got: %s`, out)
+	}
+}
+
+// TestReportModelListResult_DecodesJSONBodyDefault verifies the
+// handler's request-body parsing accepts the `default` bool from
+// the daemon POST — not just through the store API.
+func TestReportModelListResult_DecodesJSONBodyDefault(t *testing.T) {
+	// Simulate the shape the daemon POSTs: status + models + supported
+	// with `default` on one entry.
+	payload := `{"status":"completed","supported":true,"models":[{"id":"a","label":"A","default":true},{"id":"b","label":"B"}]}`
+	r := httptest.NewRequest(http.MethodPost, "/api/daemon/runtimes/rt/models/req/result", bytes.NewBufferString(payload))
+
+	var body struct {
+		Status    string       `json:"status"`
+		Models    []ModelEntry `json:"models"`
+		Supported *bool        `json:"supported"`
+	}
+	if err := json.NewDecoder(r.Body).Decode(&body); err != nil {
+		t.Fatalf("decode: %v", err)
+	}
+	if len(body.Models) != 2 {
+		t.Fatalf("want 2 models, got %d", len(body.Models))
+	}
+	if !body.Models[0].Default {
+		t.Errorf("default flag lost on model[0]: %+v", body.Models[0])
+	}
+}
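Review note: the regression these tests guard against can be reproduced in a few lines. This is a sketch under assumptions: `entry` stands in for the typed wire struct, and `lossy` imitates a `map[string]string` serialisation like the one the commit message describes. Neither name comes from the PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// entry is an illustrative stand-in for the typed wire struct.
type entry struct {
	ID      string `json:"id"`
	Label   string `json:"label"`
	Default bool   `json:"default,omitempty"`
}

// lossy flattens an entry into map[string]string, which has no slot
// for a bool, so the Default flag is silently discarded.
func lossy(e entry) map[string]string {
	return map[string]string{"id": e.ID, "label": e.Label}
}

// roundTrip marshals v to JSON and decodes it back into an entry.
func roundTrip(v any) (entry, error) {
	var out entry
	raw, err := json.Marshal(v)
	if err != nil {
		return out, err
	}
	err = json.Unmarshal(raw, &out)
	return out, err
}

func main() {
	src := entry{ID: "foo-default", Label: "Foo", Default: true}
	viaMap, _ := roundTrip(lossy(src))
	viaStruct, _ := roundTrip(src)
	fmt.Println(viaMap.Default, viaStruct.Default) // prints: false true
}
```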
--- a/server/migrations/050_agent_model.down.sql
+++ b/server/migrations/050_agent_model.down.sql
@@ -0,0 +1 @@
+ALTER TABLE agent DROP COLUMN IF EXISTS model;
--- a/server/migrations/050_agent_model.up.sql
+++ b/server/migrations/050_agent_model.up.sql
@@ -0,0 +1,5 @@
+-- Adds an explicit per-agent model field. Previously the only way to
+-- pick a model per agent was via custom_env / custom_args; a first-class
+-- column lets the UI render a dropdown and keeps Codex-style app-server
+-- providers (which reject -m in custom_args) working without CLI flags.
+ALTER TABLE agent ADD COLUMN model TEXT;
--- a/server/pkg/agent/hermes.go
+++ b/server/pkg/agent/hermes.go
@@ -5,7 +5,9 @@ import (
 	"context"
 	"encoding/json"
 	"fmt"
+	"io"
 	"os/exec"
+	"regexp"
 	"strings"
 	"sync"
 	"time"
@@ -64,7 +66,15 @@ func (b *hermesBackend) Execute(ctx context.Context, prompt string, opts ExecOpt
 		cancel()
 		return nil, fmt.Errorf("hermes stdin pipe: %w", err)
 	}
-	cmd.Stderr = newLogWriter(b.cfg.Logger, "[hermes:stderr] ")
+	// Forward stderr to the daemon log *and* sniff provider-level
+	// errors out of it so we can surface them in the task result.
+	// Hermes' session/prompt still reports stopReason=end_turn when
+	// the underlying HTTP call to the LLM returns 4xx/5xx, so
+	// without this we'd report a misleading "empty output" and hide
+	// the real cause (wrong model for the current provider, bad
+	// credentials, rate limit, …) in the daemon log.
+	providerErr := newHermesProviderErrorSniffer()
+	cmd.Stderr = io.MultiWriter(newLogWriter(b.cfg.Logger, "[hermes:stderr] "), providerErr)

 	if err := cmd.Start(); err != nil {
 		cancel()
@@ -82,8 +92,8 @@ func (b *hermesBackend) Execute(ctx context.Context, prompt string, opts ExecOpt
 	promptDone := make(chan hermesPromptResult, 1)

 	c := &hermesClient{
-		cfg:   b.cfg,
-		stdin: stdin,
+		cfg:     b.cfg,
+		stdin:   stdin,
 		pending: make(map[int]*pendingRPC),
 		onMessage: func(msg Message) {
 			if msg.Type == MessageText {
@@ -190,13 +200,40 @@ func (b *hermesBackend) Execute(ctx context.Context, prompt string, opts ExecOpt
 		c.sessionID = sessionID
 		b.cfg.Logger.Info("hermes session created", "session_id", sessionID)

-		// 3. Build the prompt content. If we have a system prompt, prepend it.
+		// 3. If the caller picked a model (agent.model from the UI
+		// dropdown, or the daemon-wide env override), ask hermes to
+		// switch the session to it before we send any prompt.
+		// Hermes' _build_model_state exposes modelId as
+		// `provider:model`; we pass that through verbatim. This MUST
+		// fail the task on error: if we silently fell back to hermes'
+		// default model the user would think their pick was honoured
+		// while the task actually ran on something else.
+		if opts.Model != "" {
+			if _, err := c.request(runCtx, "session/set_model", map[string]any{
+				"sessionId": sessionID,
+				"modelId":   opts.Model,
+			}); err != nil {
+				b.cfg.Logger.Warn("hermes session/set_model failed", "error", err, "requested_model", opts.Model)
+				finalStatus = "failed"
+				finalError = fmt.Sprintf("hermes could not switch to model %q: %v", opts.Model, err)
+				resCh <- Result{
+					Status:     finalStatus,
+					Error:      finalError,
+					DurationMs: time.Since(startTime).Milliseconds(),
+					SessionID:  sessionID,
+				}
+				return
+			}
+			b.cfg.Logger.Info("hermes session model set", "model", opts.Model)
+		}
+
+		// 4. Build the prompt content. If we have a system prompt, prepend it.
 		userText := prompt
 		if opts.SystemPrompt != "" {
 			userText = opts.SystemPrompt + "\n\n---\n\n" + prompt
 		}

-		// 4. Send the prompt and wait for PromptResponse.
+		// 5. Send the prompt and wait for PromptResponse.
 		_, err = c.request(runCtx, "session/prompt", map[string]any{
 			"sessionId": sessionID,
 			"prompt": []map[string]any{
@@ -248,6 +285,20 @@ func (b *hermesBackend) Execute(ctx context.Context, prompt string, opts ExecOpt
 		finalOutput := output.String()
 		outputMu.Unlock()

+		// If hermes produced no visible output but we sniffed a
+		// provider-level error on stderr (typically HTTP 4xx from
+		// the configured LLM endpoint), promote the status to
+		// failed and surface the real reason. Without this the
+		// daemon reports a cryptic "hermes returned empty output"
+		// and the actionable error (e.g. "model X not supported
+		// with your ChatGPT account") stays buried in daemon logs.
+		if finalStatus == "completed" && finalOutput == "" {
+			if msg := providerErr.message(); msg != "" {
+				finalStatus = "failed"
+				finalError = msg
+			}
+		}
+
 		// Build usage map.
 		c.usageMu.Lock()
 		u := c.usage
@@ -283,13 +334,13 @@ type hermesPromptResult struct {
 }

 type hermesClient struct {
-	cfg       Config
-	stdin     interface{ Write([]byte) (int, error) }
-	mu        sync.Mutex
-	nextID    int
-	pending   map[int]*pendingRPC
-	sessionID string
-	onMessage func(Message)
+	cfg          Config
+	stdin        interface{ Write([]byte) (int, error) }
+	mu           sync.Mutex
+	nextID       int
+	pending      map[int]*pendingRPC
+	sessionID    string
+	onMessage    func(Message)
 	onPromptDone func(hermesPromptResult)

 	usageMu sync.Mutex
@@ -427,8 +478,8 @@ func (c *hermesClient) extractPromptResult(data json.RawMessage) {
 	}
 	if resp.Usage != nil {
 		pr.usage = TokenUsage{
-			InputTokens:  resp.Usage.InputTokens,
-			OutputTokens: resp.Usage.OutputTokens,
+			InputTokens:     resp.Usage.InputTokens,
+			OutputTokens:    resp.Usage.OutputTokens,
 			CacheReadTokens: resp.Usage.CachedReadTokens,
 		}
 	}
@@ -509,9 +560,9 @@ func (c *hermesClient) handleAgentThought(data json.RawMessage) {

 func (c *hermesClient) handleToolCallStart(data json.RawMessage) {
 	var msg struct {
-		ToolCallID string `json:"toolCallId"`
-		Title      string `json:"title"`
-		Kind       string `json:"kind"`
+		ToolCallID string         `json:"toolCallId"`
+		Title      string         `json:"title"`
+		Kind       string         `json:"kind"`
 		RawInput   map[string]any `json:"rawInput"`
 	}
 	if err := json.Unmarshal(data, &msg); err != nil {
@@ -649,3 +700,98 @@ func hermesToolNameFromTitle(title string, kind string) string {
 		return kind
 	}
 }
+
+// ── Provider-error sniffing ──
+//
+// hermes' session/prompt RPC reports stopReason=end_turn even when
+// the underlying HTTP call to the configured LLM endpoint returned
+// an error — the actionable detail only appears on stderr (e.g.
+// `⚠️ API call failed (attempt 1/3): BadRequestError [HTTP 400]` and
+// `Error: HTTP 400: Error code: 400 - {'detail': "The '...' model
+// is not supported when using Codex with a ChatGPT account."}`).
+// We scan for those patterns so the daemon can surface a real
+// failure instead of a generic "empty output".
+type hermesProviderErrorSniffer struct {
+	mu      sync.Mutex
+	remains []byte   // buffer for a partial trailing line across writes
+	lines   []string // captured error lines, bounded
+	seen    map[string]bool
+}
+
+// hermesErrorHeaderRe matches the first line of an API-error block:
+// a ⚠️ / ❌ / [ERROR] prefix followed by an error class name, an
+// HTTP status code, or a non-retryable / "API call failed" tag.
+var hermesErrorHeaderRe = regexp.MustCompile(`(?:⚠️|❌|\[ERROR\]).*(?:BadRequestError|AuthenticationError|RateLimitError|HTTP [0-9]{3}|Non-retryable|API call failed)`)
+
+// hermesErrorDetailRe pulls the most useful single-line message
+// out of the subsequent lines of the error block: the one whose
+// "Error:", "detail:" or "Details:" tag spells out what happened.
+var hermesErrorDetailRe = regexp.MustCompile(`(?:Error:|detail:|Details:)\s*(.+)`)
+
+const hermesMaxErrorLines = 8
+
+func newHermesProviderErrorSniffer() *hermesProviderErrorSniffer {
+	return &hermesProviderErrorSniffer{seen: map[string]bool{}}
+}
+
+// Write implements io.Writer so the sniffer can sit behind an
+// io.MultiWriter next to the normal stderr log forwarder.
+func (s *hermesProviderErrorSniffer) Write(p []byte) (int, error) {
+	s.mu.Lock()
+	defer s.mu.Unlock()
+
+	data := append(s.remains, p...)
+	// Keep the final partial line (no trailing newline) for the
+	// next write so multi-line error blocks aren't split.
+	nl := strings.LastIndexByte(string(data), '\n')
+	var complete string
+	if nl < 0 {
+		s.remains = append(s.remains[:0], data...)
+		return len(p), nil
+	}
+	complete = string(data[:nl])
+	s.remains = append(s.remains[:0], data[nl+1:]...)
+
+	for _, line := range strings.Split(complete, "\n") {
+		line = strings.TrimSpace(line)
+		if line == "" {
+			continue
+		}
+		if !(hermesErrorHeaderRe.MatchString(line) || hermesErrorDetailRe.MatchString(line)) {
+			continue
+		}
+		if s.seen[line] {
+			continue
+		}
+		s.seen[line] = true
+		s.lines = append(s.lines, line)
+		if len(s.lines) > hermesMaxErrorLines {
+			s.lines = s.lines[len(s.lines)-hermesMaxErrorLines:]
+		}
+	}
+	return len(p), nil
+}
+
+// message returns a single-line summary suitable for the task
+// error field. Prefers the most specific "Error:" / "detail:"
+// fragment; falls back to the first captured header line; empty
+// when nothing useful was seen.
+func (s *hermesProviderErrorSniffer) message() string {
+	s.mu.Lock()
+	defer s.mu.Unlock()
+
+	for _, line := range s.lines {
+		if m := hermesErrorDetailRe.FindStringSubmatch(line); m != nil {
+			detail := strings.TrimSpace(m[1])
+			if detail != "" {
+				return "hermes provider error: " + detail
+			}
+		}
+	}
+	for _, line := range s.lines {
+		if hermesErrorHeaderRe.MatchString(line) {
+			return "hermes provider error: " + line
+		}
+	}
+	return ""
+}
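Review note: the sniffer depends on two mechanics, fanning stderr out through `io.MultiWriter` and holding back a partial trailing line until its newline arrives. The sketch below shows only those two mechanics. `lineCollector` and `collect` are hypothetical names, the regex matching and dedup from the real sniffer are left out, and the mutex is dropped for brevity (so this version is not safe for concurrent writers).

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// lineCollector is a simplified stand-in for the sniffer: it records
// complete lines and keeps any unterminated tail for the next Write.
type lineCollector struct {
	partial []byte
	lines   []string
}

func (c *lineCollector) Write(p []byte) (int, error) {
	c.partial = append(c.partial, p...)
	for {
		i := bytes.IndexByte(c.partial, '\n')
		if i < 0 {
			break // no complete line yet; wait for more data
		}
		c.lines = append(c.lines, string(c.partial[:i]))
		c.partial = c.partial[i+1:]
	}
	return len(p), nil
}

// collect feeds chunks through a MultiWriter, the same way stderr is
// sent to both the log forwarder and the sniffer.
func collect(chunks ...string) []string {
	var forwarded bytes.Buffer // plays the role of the log writer
	c := &lineCollector{}
	w := io.MultiWriter(&forwarded, c)
	for _, chunk := range chunks {
		io.WriteString(w, chunk)
	}
	return c.lines
}

func main() {
	for _, line := range collect("API call ", "failed\nErr", "or: boom\n") {
		fmt.Printf("%q\n", line)
	}
}
```

Because every chunk is reported as fully written (`len(p), nil`), the collector never causes `io.MultiWriter` to abort the write to the other destination.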
--- a/server/pkg/agent/hermes_test.go
+++ b/server/pkg/agent/hermes_test.go
@@ -2,6 +2,7 @@ package agent

 import (
 	"encoding/json"
+	"strings"
 	"testing"
 )

@@ -375,3 +376,71 @@ func TestHermesClientIgnoresInvalidJSON(t *testing.T) {
 	c.handleLine("")
 	c.handleLine("{}")
 }
+
+func TestHermesProviderErrorSniffer(t *testing.T) {
+	t.Parallel()
+
+	// Real sample of the stderr hermes emits when the configured
+	// LLM endpoint rejects the requested model. We verify the
+	// sniffer extracts the `Error: ...` line so the task error
+	// tells the user *why* it failed.
+	s := newHermesProviderErrorSniffer()
+	lines := []string{
+		"2026-04-20 23:41:47 [INFO] acp_adapter.server: Prompt on session abc",
+		`⚠️  API call failed (attempt 1/3): BadRequestError [HTTP 400]`,
+		`   🔌 Provider: openai-codex  Model: gpt-5.1-codex-mini`,
+		`   📝 Error: HTTP 400: Error code: 400 - {'detail': "The 'gpt-5.1-codex-mini' model is not supported when using Codex with a ChatGPT account."}`,
+		`⏱️  Elapsed: 1.17s`,
+	}
+	for _, line := range lines {
+		if _, err := s.Write([]byte(line + "\n")); err != nil {
+			t.Fatalf("Write: %v", err)
+		}
+	}
+	msg := s.message()
+	if msg == "" {
+		t.Fatal("expected a non-empty error message")
+	}
+	if !strings.Contains(msg, "model is not supported") {
+		t.Errorf("expected detail about model support, got %q", msg)
+	}
+}
+
+func TestHermesProviderErrorSnifferIgnoresInfoLines(t *testing.T) {
+	t.Parallel()
+
+	s := newHermesProviderErrorSniffer()
+	s.Write([]byte("2026-04-20 23:41:45 [INFO] acp_adapter.entry: Loaded env\n"))
+	s.Write([]byte("2026-04-20 23:41:47 [INFO] agent.auxiliary_client: Vision auto-detect...\n"))
+	if msg := s.message(); msg != "" {
+		t.Errorf("info lines should produce no error, got %q", msg)
+	}
+}
+
+func TestHermesProviderErrorSnifferHandlesPartialLines(t *testing.T) {
+	t.Parallel()
+
+	// Writer may be called mid-line; the sniffer must buffer until
+	// it sees a newline so the regex doesn't miss the header.
+	s := newHermesProviderErrorSniffer()
+	s.Write([]byte(`⚠️  API call failed (attempt 1/3):`))
+	s.Write([]byte(` BadRequestError [HTTP 400]` + "\n"))
+	s.Write([]byte(`   📝 Error: something went wrong` + "\n"))
+	msg := s.message()
+	if !strings.Contains(msg, "something went wrong") {
+		t.Errorf("expected buffered line to be captured, got %q", msg)
+	}
+}
+
+func TestHermesProviderErrorSnifferBoundedBuffer(t *testing.T) {
+	t.Parallel()
+
+	s := newHermesProviderErrorSniffer()
+	for i := 0; i < 20; i++ {
+		// Each line differs so dedup doesn't merge them.
+		s.Write([]byte(`⚠️  API call failed (HTTP 400) attempt ` + string(rune('a'+i%26)) + `: Non-retryable error` + "\n"))
+	}
+	if len(s.lines) > hermesMaxErrorLines {
+		t.Errorf("sniffer kept %d lines, limit is %d", len(s.lines), hermesMaxErrorLines)
+	}
+}
--- a/server/pkg/agent/models.go
+++ b/server/pkg/agent/models.go
@@ -0,0 +1,741 @@
+package agent
+
+import (
+	"bufio"
+	"bytes"
+	"context"
+	"encoding/json"
+	"fmt"
+	"io"
+	"os"
+	"os/exec"
+	"strings"
+	"sync"
+	"time"
+)
+
+// Model describes a single LLM model exposed by an agent provider.
+// The dropdown groups by Provider when the ID uses the
+// `provider/model` form (e.g. "openai/gpt-4o" from opencode).
+// Default is a *display* hint: the UI badges the entry the
+// runtime advertises as its preferred pick (e.g. Claude Code's
+// shipped default, or hermes' currentModelId). It has no effect
+// at execution time — when agent.model is empty the daemon passes
+// "" to the backend so each provider's own CLI resolves its own
+// default, which is always closer to what the user's account /
+// environment actually supports than a static guess here.
+type Model struct {
+	ID       string `json:"id"`
+	Label    string `json:"label"`
+	Provider string `json:"provider,omitempty"`
+	Default  bool   `json:"default,omitempty"`
+}
+
+// modelCacheEntry is one memoized discovery result, kept so repeated
+// UI loads don't re-shell the agent CLI. Entries expire after modelCacheTTL.
+type modelCacheEntry struct {
+	models    []Model
+	expiresAt time.Time
+}
+
+var (
+	modelCacheMu sync.Mutex
+	modelCache   = map[string]modelCacheEntry{}
+)
+
+const modelCacheTTL = 60 * time.Second
+
+// ListModels returns the models supported by the given agent provider.
+// For providers with a known static catalog (claude, codex, gemini,
+// copilot) it returns the baked-in list; for providers with a discovery
+// mechanism (cursor, hermes, opencode, pi, openclaw) it runs discovery
+// behind a cache, leaving any fallback on failure to the discover* function.
+//
+// executablePath lets the caller point at a non-default binary; pass
+// "" to use the provider's default name on PATH.
+func ListModels(ctx context.Context, providerType, executablePath string) ([]Model, error) {
+	switch providerType {
+	case "claude":
+		return claudeStaticModels(), nil
+	case "codex":
+		return codexStaticModels(), nil
+	case "gemini":
+		return geminiStaticModels(), nil
+	case "cursor":
+		return cachedDiscovery(providerType, func() ([]Model, error) {
+			return discoverCursorModels(ctx, executablePath)
+		})
+	case "copilot":
+		return copilotStaticModels(), nil
+	case "hermes":
+		return cachedDiscovery(providerType, func() ([]Model, error) {
+			return discoverHermesModels(ctx, executablePath)
+		})
+	case "opencode":
+		return cachedDiscovery(providerType, func() ([]Model, error) {
+			return discoverOpenCodeModels(ctx, executablePath)
+		})
+	case "pi":
+		return cachedDiscovery(providerType, func() ([]Model, error) {
+			return discoverPiModels(ctx, executablePath)
+		})
+	case "openclaw":
+		return cachedDiscovery(providerType, func() ([]Model, error) {
+			return discoverOpenclawAgents(ctx, executablePath)
+		})
+	default:
+		return nil, fmt.Errorf("unknown agent type: %q", providerType)
+	}
+}
+
+// ModelSelectionSupported reports whether setting `agent.model` has
+// any effect for the given provider. Today every provider in the
+// registry honours `opts.Model` end-to-end: Hermes routes it through
+// the ACP `session/set_model` RPC before each prompt, which means
+// the UI's dropdown choice is carried all the way down to the LLM
+// call. The helper is retained so we can add a `return false` branch
+// the next time a provider legitimately ignores model selection.
+func ModelSelectionSupported(providerType string) bool {
+	_ = providerType
+	return true
+}
+
+// cachedDiscovery invokes fn and caches the result for modelCacheTTL.
+// The cache is keyed on providerType only (executablePath is not part
+// of the key); if discovery ever needs to vary by host or user, the
+// caller must fold that into the key. Errors are not cached.
+func cachedDiscovery(key string, fn func() ([]Model, error)) ([]Model, error) {
+	modelCacheMu.Lock()
+	if entry, ok := modelCache[key]; ok && time.Now().Before(entry.expiresAt) {
+		out := entry.models
+		modelCacheMu.Unlock()
+		return out, nil
+	}
+	modelCacheMu.Unlock()
+
+	models, err := fn()
+	if err != nil {
+		return nil, err
+	}
+
+	modelCacheMu.Lock()
+	modelCache[key] = modelCacheEntry{models: models, expiresAt: time.Now().Add(modelCacheTTL)}
+	modelCacheMu.Unlock()
+	return models, nil
+}
+
+// ── Static catalogs ──
+
+// claudeStaticModels reflects the Claude Code CLI's accepted --model
+// values. Keep this list short and current; stale entries here
+// mislead users more than they help. Default = Sonnet because it's
+// the everyday workhorse (Opus is reserved for advisor-style flows).
+func claudeStaticModels() []Model {
+	return []Model{
+		{ID: "claude-sonnet-4-6", Label: "Claude Sonnet 4.6", Provider: "anthropic", Default: true},
+		{ID: "claude-opus-4-7", Label: "Claude Opus 4.7", Provider: "anthropic"},
+		{ID: "claude-haiku-4-5-20251001", Label: "Claude Haiku 4.5", Provider: "anthropic"},
+		{ID: "claude-opus-4-6", Label: "Claude Opus 4.6", Provider: "anthropic"},
+		{ID: "claude-sonnet-4-5", Label: "Claude Sonnet 4.5", Provider: "anthropic"},
+	}
+}
+
+func codexStaticModels() []Model {
+	return []Model{
+		{ID: "gpt-5.4", Label: "GPT-5.4", Provider: "openai", Default: true},
+		{ID: "gpt-5.4-mini", Label: "GPT-5.4 mini", Provider: "openai"},
+		{ID: "gpt-5.3-codex", Label: "GPT-5.3 Codex", Provider: "openai"},
+		{ID: "gpt-5", Label: "GPT-5", Provider: "openai"},
+		{ID: "o3", Label: "o3", Provider: "openai"},
+		{ID: "o3-mini", Label: "o3-mini", Provider: "openai"},
+	}
+}
+
+func geminiStaticModels() []Model {
+	return []Model{
+		{ID: "gemini-2.5-pro", Label: "Gemini 2.5 Pro", Provider: "google", Default: true},
+		{ID: "gemini-2.5-flash", Label: "Gemini 2.5 Flash", Provider: "google"},
+		{ID: "gemini-2.0-flash", Label: "Gemini 2.0 Flash", Provider: "google"},
+	}
+}
+
+// cursorStaticModels is a minimal fallback used when
+// `cursor-agent --list-models` isn't available (binary missing,
+// offline, etc). The real catalog is fetched dynamically because
+// Cursor's model IDs shift (e.g. `composer-2-fast`,
+// `claude-4.6-sonnet-medium`, `gemini-3.1-pro`) and any static
+// list we ship goes stale fast.
+func cursorStaticModels() []Model {
+	return []Model{
+		{ID: "auto", Label: "Auto", Provider: "cursor", Default: true},
+	}
+}
+
+// copilotStaticModels is the catalog for GitHub Copilot CLI, which
+// resolves models via the user's GitHub account, not via CLI args.
+// We deliberately mark no Default: the right model is whatever
+// GitHub routes the request to, and forcing one here would override that.
+func copilotStaticModels() []Model {
+	return []Model{
+		{ID: "gpt-5.4", Label: "GPT-5.4", Provider: "openai"},
+		{ID: "claude-sonnet-4-6", Label: "Claude Sonnet 4.6", Provider: "anthropic"},
+	}
+}
+
+// ── Dynamic discovery ──
+
+// discoverOpenCodeModels runs `opencode models` and parses its tabular
+// output. The CLI prints `provider/model` rows; we emit them verbatim
+// as IDs so what the user sees matches what `--model` accepts.
+// On any failure (CLI missing, parse error, timeout) we fall back to
+// an empty list so the creatable UI still works.
+func discoverOpenCodeModels(ctx context.Context, executablePath string) ([]Model, error) {
+	if executablePath == "" {
+		executablePath = "opencode"
+	}
+	if _, err := exec.LookPath(executablePath); err != nil {
+		return []Model{}, nil
+	}
+	runCtx, cancel := context.WithTimeout(ctx, 5*time.Second)
+	defer cancel()
+	cmd := exec.CommandContext(runCtx, executablePath, "models")
+	out, err := cmd.Output()
+	if err != nil {
+		return []Model{}, nil
+	}
+	return parseOpenCodeModels(string(out)), nil
+}
+
+// parseOpenCodeModels accepts the `opencode models` text output and
+// extracts IDs. Output format (v0.x): a header row followed by rows
+// whose first whitespace-delimited field is `provider/model`.
+func parseOpenCodeModels(output string) []Model {
+	scanner := bufio.NewScanner(strings.NewReader(output))
+	scanner.Buffer(make([]byte, 0, 64*1024), 1024*1024)
+	var models []Model
+	seen := map[string]bool{}
+	for scanner.Scan() {
+		line := strings.TrimSpace(scanner.Text())
+		if line == "" {
+			continue
+		}
+		first := strings.Fields(line)
+		if len(first) == 0 {
+			continue
+		}
+		id := first[0]
+		if !strings.Contains(id, "/") {
+			continue
+		}
+		// Skip the header row (opencode prints e.g. PROVIDER/MODEL in caps).
+		if id == strings.ToUpper(id) {
+			continue
+		}
+		if seen[id] {
+			continue
+		}
+		seen[id] = true
+		provider := ""
+		if i := strings.Index(id, "/"); i > 0 {
+			provider = id[:i]
+		}
+		models = append(models, Model{ID: id, Label: id, Provider: provider})
+	}
+	return models
+}
+
+// discoverPiModels runs `pi --list-models` and parses its output.
+// Older pi versions print the list to stderr; newer versions use
+// stdout. We capture both, preferring stdout when it is non-empty.
+func discoverPiModels(ctx context.Context, executablePath string) ([]Model, error) {
+	if executablePath == "" {
+		executablePath = "pi"
+	}
+	if _, err := exec.LookPath(executablePath); err != nil {
+		return []Model{}, nil
+	}
+	runCtx, cancel := context.WithTimeout(ctx, 5*time.Second)
+	defer cancel()
+	cmd := exec.CommandContext(runCtx, executablePath, "--list-models")
+	var stderr strings.Builder
+	cmd.Stderr = &stderr
+	stdout, err := cmd.Output()
+	if err != nil {
+		return []Model{}, nil
+	}
+	text := string(stdout)
+	if strings.TrimSpace(text) == "" {
+		text = stderr.String()
+	}
+	return parsePiModels(text), nil
+}
+
+// parsePiModels accepts the `pi --list-models` output and extracts
+// model IDs. Pi's format uses `provider:model` rows; we normalize to
+// the same `provider/model` form as opencode for UI consistency.
+func parsePiModels(output string) []Model {
+	scanner := bufio.NewScanner(strings.NewReader(output))
+	scanner.Buffer(make([]byte, 0, 64*1024), 1024*1024)
+	var models []Model
+	seen := map[string]bool{}
+	for scanner.Scan() {
+		line := strings.TrimSpace(scanner.Text())
+		if line == "" {
+			continue
+		}
+		first := strings.Fields(line)
+		if len(first) == 0 {
+			continue
+		}
+		id := first[0]
+		if !strings.ContainsAny(id, ":/") {
+			continue
+		}
+		// Normalize ":" to "/" since pi uses colon but opencode/UI uses slash.
+		id = strings.Replace(id, ":", "/", 1)
+		if seen[id] {
+			continue
+		}
+		seen[id] = true
+		provider := ""
+		if i := strings.Index(id, "/"); i > 0 {
+			provider = id[:i]
+		}
+		models = append(models, Model{ID: id, Label: id, Provider: provider})
+	}
+	return models
+}
+
+// discoverHermesModels spins up a throwaway `hermes acp` process,
+// drives just enough of the protocol to receive the model list
+// advertised in the `session/new` response, and shuts it down. The
+// list and the `current` flag both come from hermes' own
+// `_build_model_state` so whatever ~/.hermes/config.yaml resolves
+// to at runtime is exactly what the UI shows.
+//
+// Failure modes (hermes missing, no credentials, config resolution
+// error) all return an empty list so the UI falls back to the
+// creatable manual-entry input instead of blocking the form.
+func discoverHermesModels(ctx context.Context, executablePath string) ([]Model, error) {
+	if executablePath == "" {
+		executablePath = "hermes"
+	}
+	if _, err := exec.LookPath(executablePath); err != nil {
+		return []Model{}, nil
+	}
+	runCtx, cancel := context.WithTimeout(ctx, 15*time.Second)
+	defer cancel()
+
+	cmd := exec.CommandContext(runCtx, executablePath, "acp")
+	// Mirror the real backend's auto-approve so init doesn't prompt.
+	cmd.Env = append(os.Environ(), "HERMES_YOLO_MODE=1")
+	stdin, err := cmd.StdinPipe()
+	if err != nil {
+		return []Model{}, nil
+	}
+	stdout, err := cmd.StdoutPipe()
+	if err != nil {
+		stdin.Close()
+		return []Model{}, nil
+	}
+	// Discard stderr; noisy logs here don't help us and we don't
+	// want them bleeding into the daemon log every 60s.
+	cmd.Stderr = io.Discard
+	if err := cmd.Start(); err != nil {
+		return []Model{}, nil
+	}
+	// Ensure the child process is always reaped.
+	defer func() {
+		_ = stdin.Close()
+		_ = cmd.Process.Kill()
+		_, _ = cmd.Process.Wait()
+	}()
+
+	writeACP := func(id int, method string, params map[string]any) error {
+		msg := map[string]any{
+			"jsonrpc": "2.0",
+			"id":      id,
+			"method":  method,
+			"params":  params,
+		}
+		data, err := json.Marshal(msg)
+		if err != nil {
+			return err
+		}
+		data = append(data, '\n')
+		_, err = stdin.Write(data)
+		return err
+	}
+
+	// Send initialize + session/new.
+	if err := writeACP(1, "initialize", map[string]any{
+		"protocolVersion":    1,
+		"clientInfo":         map[string]any{"name": "multica-model-discovery", "version": "0.1.0"},
+		"clientCapabilities": map[string]any{},
+	}); err != nil {
+		return []Model{}, nil
+	}
+
+	// Hermes requires a valid cwd for session/new — use a temp
+	// directory we clean up afterwards, not the daemon's workdir
+	// (which might be in the middle of another task's worktree).
+	tmp, err := os.MkdirTemp("", "multica-hermes-discovery-")
+	if err != nil {
+		return []Model{}, nil
+	}
+	defer os.RemoveAll(tmp)
+
+	if err := writeACP(2, "session/new", map[string]any{
+		"cwd":        tmp,
+		"mcpServers": []any{},
+	}); err != nil {
+		return []Model{}, nil
+	}
+
+	// Read responses until we see the one for id=2 (session/new).
+	scanner := bufio.NewScanner(stdout)
+	scanner.Buffer(make([]byte, 0, 1024*1024), 4*1024*1024)
+	deadline := time.After(12 * time.Second)
+	done := make(chan []Model, 1)
+	go func() {
+		defer close(done)
+		for scanner.Scan() {
+			line := strings.TrimSpace(scanner.Text())
+			if line == "" {
+				continue
+			}
+			var env struct {
+				ID     json.Number     `json:"id"`
+				Result json.RawMessage `json:"result"`
+			}
+			if err := json.Unmarshal([]byte(line), &env); err != nil {
+				continue
+			}
+			if env.ID.String() != "2" || len(env.Result) == 0 {
+				continue
+			}
+			done <- parseHermesSessionNewModels(env.Result)
+			return
+		}
+	}()
+
+	select {
+	case models := <-done:
+		if models == nil {
+			return []Model{}, nil
+		}
+		return models, nil
+	case <-deadline:
+		return []Model{}, nil
+	case <-runCtx.Done():
+		return []Model{}, nil
+	}
+}
+
+// parseHermesSessionNewModels extracts the model catalog from a
+// hermes `session/new` response. Hermes' ACP schema emits:
+//
+//	{
+//	  "sessionId": "...",
+//	  "models": {
+//	    "availableModels": [
+//	      {"modelId": "...", "name": "...", "description": "... current"}
+//	    ],
+//	    "currentModelId": "..."
+//	  }
+//	}
+//
+// Returns nil only when raw fails to unmarshal. A well-formed
+// response without a `models` field yields an empty, non-nil slice,
+// so "unparseable" and "no catalog advertised" stay distinguishable.
+func parseHermesSessionNewModels(raw json.RawMessage) []Model {
+	var resp struct {
+		Models struct {
+			AvailableModels []struct {
+				ModelID     string `json:"modelId"`
+				Name        string `json:"name"`
+				Description string `json:"description"`
+			} `json:"availableModels"`
+			CurrentModelID string `json:"currentModelId"`
+		} `json:"models"`
+	}
+	if err := json.Unmarshal(raw, &resp); err != nil {
+		return nil
+	}
+	models := make([]Model, 0, len(resp.Models.AvailableModels))
+	seen := map[string]bool{}
+	for _, m := range resp.Models.AvailableModels {
+		if m.ModelID == "" || seen[m.ModelID] {
+			continue
+		}
+		seen[m.ModelID] = true
+		label := m.Name
+		if label == "" {
+			label = m.ModelID
+		}
+		provider := ""
+		if idx := strings.Index(m.ModelID, ":"); idx > 0 {
+			provider = m.ModelID[:idx]
+		}
+		models = append(models, Model{
+			ID:       m.ModelID,
+			Label:    label,
+			Provider: provider,
+			Default:  m.ModelID == resp.Models.CurrentModelID,
+		})
+	}
+	return models
+}
+
+// discoverCursorModels runs `cursor-agent --list-models` and parses
+// the `id - Label` rows. Cursor's catalog changes often and ships
+// many variants of the same base model (thinking / fast / max
+// suffixes) — static baking would be obsolete within weeks. On any
+// failure we fall back to the minimal static catalog so the UI
+// stays usable when cursor-agent isn't installed on the daemon host.
+func discoverCursorModels(ctx context.Context, executablePath string) ([]Model, error) {
+	if executablePath == "" {
+		executablePath = "cursor-agent"
+	}
+	if _, err := exec.LookPath(executablePath); err != nil {
+		return cursorStaticModels(), nil
+	}
+	runCtx, cancel := context.WithTimeout(ctx, 5*time.Second)
+	defer cancel()
+	cmd := exec.CommandContext(runCtx, executablePath, "--list-models")
+	out, err := cmd.Output()
+	if err != nil {
+		return cursorStaticModels(), nil
+	}
+	models := parseCursorModels(string(out))
+	if len(models) == 0 {
+		return cursorStaticModels(), nil
+	}
+	return models, nil
+}
+
+// parseCursorModels extracts model IDs from `cursor-agent --list-models`.
+// Output format (as of cursor-agent 2026.04):
+//
+//	Available models
+//	<blank>
+//	auto - Auto
+//	composer-2-fast - Composer 2 Fast (current, default)
+//	composer-2 - Composer 2
+//	…
+//
+// The model tagged `(default)` is surfaced as Default=true so the
+// UI badge points at cursor's own recommendation rather than a
+// hard-coded guess from our catalog.
+func parseCursorModels(output string) []Model {
+	scanner := bufio.NewScanner(strings.NewReader(output))
+	scanner.Buffer(make([]byte, 0, 64*1024), 1024*1024)
+	var models []Model
+	seen := map[string]bool{}
+	for scanner.Scan() {
+		line := strings.TrimSpace(scanner.Text())
+		if line == "" {
+			continue
+		}
+		// Row format: "<id> - <label>". Skip the "Available models" header.
+		idx := strings.Index(line, " - ")
+		if idx <= 0 {
+			continue
+		}
+		id := strings.TrimSpace(line[:idx])
+		label := strings.TrimSpace(line[idx+3:])
+		if !isOpenclawIdentifier(id) {
+			// Reuse the identifier guard — cursor IDs are in the
+			// same character set (alnum + `-./_`), so anything
+			// that fails it is either malformed or a header line.
+			continue
+		}
+		if seen[id] {
+			continue
+		}
+		seen[id] = true
+		isDefault := strings.Contains(label, "default")
+		// Strip the "(current, default)" suffix from the display
+		// label since we surface that through the Default flag.
+		if paren := strings.Index(label, "("); paren > 0 {
+			label = strings.TrimSpace(label[:paren])
+		}
+		if label == "" {
+			label = id
+		}
+		models = append(models, Model{
+			ID:       id,
+			Label:    label,
+			Provider: "cursor",
+			Default:  isDefault,
+		})
+	}
+	return models
+}
+
+// discoverOpenclawAgents enumerates the pre-registered OpenClaw
+// agents (which is where model selection actually lives in the
+// OpenClaw world — each agent is bound to a model at `agents add`
+// time). It tries structured JSON output first, falling back to a
+// conservative text parser that rejects TUI decoration and section
+// headers. On any ambiguity we return an empty list and let the
+// creatable dropdown handle manual entry — a silently-wrong
+// enumeration would be worse than none.
+func discoverOpenclawAgents(ctx context.Context, executablePath string) ([]Model, error) {
+	if executablePath == "" {
+		executablePath = "openclaw"
+	}
+	if _, err := exec.LookPath(executablePath); err != nil {
+		return []Model{}, nil
+	}
+	runCtx, cancel := context.WithTimeout(ctx, 5*time.Second)
+	defer cancel()
+
+	// Try JSON modes first. Different openclaw builds expose the
+	// flag under different names; trying a couple is cheap.
+	for _, jsonArgs := range [][]string{
+		{"agents", "list", "--json"},
+		{"agents", "list", "--output", "json"},
+		{"agents", "list", "-o", "json"},
+	} {
+		cmd := exec.CommandContext(runCtx, executablePath, jsonArgs...)
+		out, err := cmd.Output()
+		if err != nil {
+			continue
+		}
+		if models, ok := parseOpenclawAgentsJSON(out); ok {
+			return models, nil
+		}
+	}
+
+	// Text fallback. Be strict — the default output is a decorated
+	// banner with box-drawing and section headers, and picking up
+	// the wrong tokens produces nonsense entries like "Identity:".
+	cmd := exec.CommandContext(runCtx, executablePath, "agents", "list")
+	out, err := cmd.Output()
+	if err != nil {
+		return []Model{}, nil
+	}
+	return parseOpenclawAgents(string(out)), nil
+}
+
+// openclawAgentEntry is the shape parseOpenclawAgentsJSON expects
+// from `openclaw agents list --json`. Both `name` and `id` are
+// accepted as the identifier (different openclaw versions ship
+// different field names); `model` is optional and only used to
+// enrich the dropdown label.
+type openclawAgentEntry struct {
+	Name  string `json:"name"`
+	ID    string `json:"id"`
+	Model string `json:"model"`
+}
+
+// parseOpenclawAgentsJSON accepts `openclaw agents list --json`-style
+// output. It handles two common shapes: a top-level array, or an
+// object with an `agents` key whose value is an array. Returns
+// ok=false if the input isn't valid JSON in either shape.
+func parseOpenclawAgentsJSON(raw []byte) ([]Model, bool) {
+	raw = bytes.TrimSpace(raw)
+	if len(raw) == 0 {
+		return nil, false
+	}
+
+	var flat []openclawAgentEntry
+	if err := json.Unmarshal(raw, &flat); err == nil {
+		return openclawEntriesToModels(flat), true
+	}
+
+	var wrapped struct {
+		Agents []openclawAgentEntry `json:"agents"`
+	}
+	if err := json.Unmarshal(raw, &wrapped); err == nil && wrapped.Agents != nil {
+		return openclawEntriesToModels(wrapped.Agents), true
+	}
+
+	return nil, false
+}
+
+func openclawEntriesToModels(entries []openclawAgentEntry) []Model {
+	models := make([]Model, 0, len(entries))
+	seen := map[string]bool{}
+	for _, e := range entries {
+		name := e.Name
+		if name == "" {
+			name = e.ID
+		}
+		if name == "" || seen[name] {
+			continue
+		}
+		seen[name] = true
+		label := name
+		if e.Model != "" {
+			label = name + " (" + e.Model + ")"
+		}
+		models = append(models, Model{ID: name, Label: label, Provider: "openclaw"})
+	}
+	return models
+}
+
+// parseOpenclawAgents extracts agent names from the text output of
+// `openclaw agents list`. The default CLI output is a decorated
+// banner — section headers ending in `:`, box-drawing characters,
+// and single-character icons — so we only accept lines that look
+// like a proper `<name> <model>` row: at least two whitespace-
+// separated tokens, both made of safe identifier characters, and
+// neither ending in `:`. Anything else is discarded to avoid
+// surfacing "Identity:" or `◇` as selectable models.
+func parseOpenclawAgents(output string) []Model {
+	scanner := bufio.NewScanner(strings.NewReader(output))
+	scanner.Buffer(make([]byte, 0, 64*1024), 1024*1024)
+	var models []Model
+	seen := map[string]bool{}
+	for scanner.Scan() {
+		line := strings.TrimSpace(scanner.Text())
+		if line == "" {
+			continue
+		}
+		fields := strings.Fields(line)
+		if len(fields) < 2 {
+			continue
+		}
+		name, model := fields[0], fields[1]
+		if !isOpenclawIdentifier(name) || !isOpenclawIdentifier(model) {
+			continue
+		}
+		if seen[name] {
+			continue
+		}
+		seen[name] = true
+		models = append(models, Model{
+			ID:       name,
+			Label:    name + " (" + model + ")",
+			Provider: "openclaw",
+		})
+	}
+	return models
+}
+
+// isOpenclawIdentifier reports whether s looks like a valid
+// agent-name or model-id token: starts with a letter, contains only
+// identifier-safe characters, and isn't a section header
+// (trailing colon). Rejects TUI decoration like `│`, `╭`, `◇`, `|`.
+func isOpenclawIdentifier(s string) bool {
+	if s == "" || strings.HasSuffix(s, ":") {
+		return false
+	}
+	first := s[0]
+	if !((first >= 'a' && first <= 'z') || (first >= 'A' && first <= 'Z')) {
+		return false
+	}
+	for _, r := range s {
+		switch {
+		case r >= 'a' && r <= 'z':
+		case r >= 'A' && r <= 'Z':
+		case r >= '0' && r <= '9':
+		case r == '-' || r == '_' || r == '.' || r == '/':
+		default:
+			return false
+		}
+	}
+	return true
+}
--- a/server/pkg/agent/models_test.go
+++ b/server/pkg/agent/models_test.go
@@ -0,0 +1,324 @@
+package agent
+
+import (
+	"context"
+	"strings"
+	"testing"
+)
+
+func TestListModelsStaticProviders(t *testing.T) {
+	ctx := context.Background()
+	for _, provider := range []string{"claude", "codex", "gemini", "cursor", "copilot"} {
+		got, err := ListModels(ctx, provider, "")
+		if err != nil {
+			t.Fatalf("ListModels(%q) error: %v", provider, err)
+		}
+		if len(got) == 0 {
+			t.Errorf("ListModels(%q) returned no models", provider)
+		}
+		for i, m := range got {
+			if m.ID == "" {
+				t.Errorf("ListModels(%q)[%d] has empty ID", provider, i)
+			}
+			if m.Label == "" {
+				t.Errorf("ListModels(%q)[%d] has empty Label", provider, i)
+			}
+		}
+	}
+}
+
+func TestListModelsHermesWithoutBinary(t *testing.T) {
+	// When the hermes executable can't be resolved, discovery
+	// fast-paths to an empty list (the UI then falls back to
+	// creatable manual entry). This test only verifies the
+	// fast-path; an actual ACP session is exercised in integration.
+	ctx := context.Background()
+	// Evict any cached entry so we hit the live discovery function.
+	modelCacheMu.Lock()
+	delete(modelCache, "hermes")
+	modelCacheMu.Unlock()
+
+	got, err := ListModels(ctx, "hermes", "/nonexistent/hermes")
+	if err != nil {
+		t.Fatalf("ListModels(hermes) error: %v", err)
+	}
+	if got == nil {
+		t.Error("expected non-nil slice even when binary is missing")
+	}
+}
+
+func TestListModelsUnknownProvider(t *testing.T) {
+	ctx := context.Background()
+	_, err := ListModels(ctx, "nonexistent", "")
+	if err == nil {
+		t.Fatal("ListModels(unknown) expected error")
+	}
+}
+
+func TestStaticCatalogsHaveAtMostOneDefault(t *testing.T) {
+	// Each catalog should tag at most one entry as the display
+	// default so the UI badge is unambiguous. More than one
+	// usually means a copy/paste slip when adding new models.
+	catalogs := map[string][]Model{
+		"claude":  claudeStaticModels(),
+		"codex":   codexStaticModels(),
+		"gemini":  geminiStaticModels(),
+		"cursor":  cursorStaticModels(),
+		"copilot": copilotStaticModels(),
+	}
+	for provider, models := range catalogs {
+		count := 0
+		for _, m := range models {
+			if m.Default {
+				count++
+			}
+		}
+		if count > 1 {
+			t.Errorf("%s: %d models marked Default, want 0 or 1", provider, count)
+		}
+	}
+}
+
+func TestParseOpenCodeModels(t *testing.T) {
+	input := `PROVIDER/MODEL                     CONTEXT  MAX_OUT
+openai/gpt-4o                      128000   16384
+anthropic/claude-sonnet-4-6        200000   8192
+openai/gpt-4o                      128000   16384
+nonprefixed-line
+`
+	models := parseOpenCodeModels(input)
+	if len(models) != 2 {
+		t.Fatalf("expected 2 models (header skipped, duplicate deduped, non-slash skipped), got %d: %+v", len(models), models)
+	}
+	if models[0].ID != "openai/gpt-4o" || models[0].Provider != "openai" {
+		t.Errorf("unexpected first model: %+v", models[0])
+	}
+	if models[1].ID != "anthropic/claude-sonnet-4-6" || models[1].Provider != "anthropic" {
+		t.Errorf("unexpected second model: %+v", models[1])
+	}
+}
+
+func TestParsePiModels(t *testing.T) {
+	input := `openai:gpt-4o
+anthropic:claude-opus-4-7
+openai:gpt-4o
+bareword
+`
+	models := parsePiModels(input)
+	if len(models) != 2 {
+		t.Fatalf("expected 2 models, got %d: %+v", len(models), models)
+	}
+	if models[0].ID != "openai/gpt-4o" {
+		t.Errorf("expected colon normalized to slash: %+v", models[0])
+	}
+}
+
+func TestParseOpenclawAgents(t *testing.T) {
+	input := `deepseek-v4   deepseek-v4
+claude-sonnet claude-sonnet-4-6
+deepseek-v4   deepseek-v4
+`
+	models := parseOpenclawAgents(input)
+	// duplicate deduped; label includes model name.
+	if len(models) != 2 {
+		t.Fatalf("expected 2 agents, got %d: %+v", len(models), models)
+	}
+	if models[0].ID != "deepseek-v4" {
+		t.Errorf("unexpected first agent: %+v", models[0])
+	}
+	if models[0].Label != "deepseek-v4 (deepseek-v4)" {
+		t.Errorf("unexpected label: %+v", models[0])
+	}
+	if models[0].Provider != "openclaw" {
+		t.Errorf("expected provider openclaw, got %q", models[0].Provider)
+	}
+}
+
+func TestParseOpenclawAgentsRejectsDecoratedTUI(t *testing.T) {
+	// Reproduces the shape of real `openclaw agents list` output
+	// that leaked header tokens like "Identity:" / "Workspace:"
+	// and single-character box-drawing icons into the dropdown.
+	input := `╭───────────────────────────────╮
+│                               │
+│  ◇  Agents:                   │
+│  │                            │
+│  │    Identity:               │
+│  │    Workspace:              │
+│  │    Agent                   │
+│  │                            │
+╰───────────────────────────────╯
+deepseek-v4   deepseek-v4
+claude-sonnet claude-sonnet-4-6
+`
+	models := parseOpenclawAgents(input)
+	if len(models) != 2 {
+		t.Fatalf("expected 2 agents (decoration skipped), got %d: %+v", len(models), models)
+	}
+	for _, m := range models {
+		if strings.HasSuffix(m.ID, ":") {
+			t.Errorf("section header leaked into result: %+v", m)
+		}
+	}
+	if models[0].ID != "deepseek-v4" || models[1].ID != "claude-sonnet" {
+		t.Errorf("unexpected agents: %+v", models)
+	}
+}
+
+func TestParseOpenclawAgentsJSONArray(t *testing.T) {
+	input := []byte(`[
+    {"name": "deepseek-v4", "model": "deepseek-v4"},
+    {"name": "claude-sonnet", "model": "claude-sonnet-4-6"}
+]`)
+	models, ok := parseOpenclawAgentsJSON(input)
+	if !ok {
+		t.Fatal("expected parseOpenclawAgentsJSON to accept an array")
+	}
+	if len(models) != 2 {
+		t.Fatalf("got %d, want 2: %+v", len(models), models)
+	}
+	if models[0].ID != "deepseek-v4" || models[0].Label != "deepseek-v4 (deepseek-v4)" {
+		t.Errorf("unexpected first entry: %+v", models[0])
+	}
+}
+
+func TestParseOpenclawAgentsJSONWrapped(t *testing.T) {
+	input := []byte(`{"agents": [{"name": "foo", "model": "bar"}]}`)
+	models, ok := parseOpenclawAgentsJSON(input)
+	if !ok {
+		t.Fatal("expected parseOpenclawAgentsJSON to accept wrapped object")
+	}
+	if len(models) != 1 || models[0].ID != "foo" {
+		t.Errorf("unexpected: %+v", models)
+	}
+}
+
+func TestParseOpenclawAgentsJSONRejectsGarbage(t *testing.T) {
+	if _, ok := parseOpenclawAgentsJSON([]byte("not json")); ok {
+		t.Error("expected ok=false for non-JSON")
+	}
+}
+
+func TestParseCursorModels(t *testing.T) {
+	input := `Available models
+
+auto - Auto
+composer-2-fast - Composer 2 Fast (current, default)
+composer-2 - Composer 2
+claude-4.6-sonnet-medium - Sonnet 4.6 1M
+claude-opus-4-7-high - Opus 4.7 1M
+gemini-3.1-pro - Gemini 3.1 Pro
+`
+	models := parseCursorModels(input)
+	if len(models) != 6 {
+		t.Fatalf("expected 6 models, got %d: %+v", len(models), models)
+	}
+	ids := map[string]Model{}
+	for _, m := range models {
+		ids[m.ID] = m
+	}
+	for _, want := range []string{"auto", "composer-2-fast", "composer-2", "claude-4.6-sonnet-medium", "claude-opus-4-7-high", "gemini-3.1-pro"} {
+		if _, ok := ids[want]; !ok {
+			t.Errorf("missing expected model %q in: %+v", want, models)
+		}
+	}
+	if def := ids["composer-2-fast"]; !def.Default {
+		t.Errorf("composer-2-fast should be marked default, got %+v", def)
+	}
+	if def := ids["composer-2-fast"]; def.Label != "Composer 2 Fast" {
+		t.Errorf("default label should be stripped of parenthetical, got %q", def.Label)
+	}
+	// Non-default entry should not carry Default=true.
+	if auto := ids["auto"]; auto.Default {
+		t.Errorf("non-default entry should not be flagged default: %+v", auto)
+	}
+}
+
+func TestParseCursorModelsSkipsHeaderAndBlankLines(t *testing.T) {
+	input := `Available models
+
+composer-2 - Composer 2
+`
+	models := parseCursorModels(input)
+	if len(models) != 1 || models[0].ID != "composer-2" {
+		t.Fatalf("unexpected: %+v", models)
+	}
+}
+
+func TestParseHermesSessionNewModels(t *testing.T) {
+	// Mirrors the real shape emitted by hermes'
+	// acp_adapter/server.py _build_model_state -> SessionModelState.
+	raw := []byte(`{
+      "sessionId": "ses_123",
+      "models": {
+        "availableModels": [
+          {"modelId": "nous:moonshotai/kimi-k2.5", "name": "moonshotai/kimi-k2.5", "description": "Provider: Nous"},
+          {"modelId": "nous:anthropic/claude-opus-4.7", "name": "anthropic/claude-opus-4.7", "description": "Provider: Nous • current"},
+          {"modelId": "nous:moonshotai/kimi-k2.5", "name": "duplicate", "description": "dup"}
+        ],
+        "currentModelId": "nous:anthropic/claude-opus-4.7"
+      }
+    }`)
+	models := parseHermesSessionNewModels(raw)
+	if len(models) != 2 {
+		t.Fatalf("expected 2 models (duplicate deduped), got %d: %+v", len(models), models)
+	}
+	if models[0].ID != "nous:moonshotai/kimi-k2.5" || models[0].Provider != "nous" {
+		t.Errorf("unexpected first model: %+v", models[0])
+	}
+	if models[0].Default {
+		t.Errorf("non-current entry must not be marked default: %+v", models[0])
+	}
+	if !models[1].Default {
+		t.Errorf("current entry must be marked default: %+v", models[1])
+	}
+	if models[1].ID != "nous:anthropic/claude-opus-4.7" {
+		t.Errorf("expected current model second: %+v", models[1])
+	}
+}
+
+func TestParseHermesSessionNewModelsMissingField(t *testing.T) {
+	// session/new without the models field (older hermes, or a
+	// failed _build_model_state) still parses cleanly, so the result
+	// is an empty catalog; the check below tolerates nil or empty.
+	raw := []byte(`{"sessionId": "ses_123"}`)
+	if got := parseHermesSessionNewModels(raw); got != nil && len(got) != 0 {
+		t.Errorf("expected nil/empty, got %+v", got)
+	}
+}
+
+func TestParseHermesSessionNewModelsGarbage(t *testing.T) {
+	if got := parseHermesSessionNewModels([]byte("not json")); got != nil {
+		t.Errorf("expected nil for non-JSON, got %+v", got)
+	}
+}
+
+func TestHermesModelSelectionSupported(t *testing.T) {
+	// Regression guard: hermes now supports model selection via
+	// the ACP session/set_model RPC, so the UI dropdown should
+	// not be disabled for it.
+	if !ModelSelectionSupported("hermes") {
+		t.Error("hermes should be model-selection-supported now that set_session_model is wired")
+	}
+}
+
+func TestCachedDiscovery(t *testing.T) {
+	calls := 0
+	fn := func() ([]Model, error) {
+		calls++
+		return []Model{{ID: "x", Label: "x"}}, nil
+	}
+	// Evict any stale entry so the first call below is a guaranteed miss.
+	modelCacheMu.Lock()
+	delete(modelCache, "testkey")
+	modelCacheMu.Unlock()
+
+	if _, err := cachedDiscovery("testkey", fn); err != nil {
+		t.Fatal(err)
+	}
+	if _, err := cachedDiscovery("testkey", fn); err != nil {
+		t.Fatal(err)
+	}
+	if calls != 1 {
+		t.Errorf("expected 1 underlying call due to cache, got %d", calls)
+	}
+}
--- a/server/pkg/agent/openclaw.go
+++ b/server/pkg/agent/openclaw.go
@@ -146,7 +146,17 @@ func buildOpenclawArgs(prompt, sessionID string, opts ExecOptions, logger *slog.
 	if opts.Timeout > 0 {
 		args = append(args, "--timeout", fmt.Sprintf("%d", int(opts.Timeout.Seconds())))
 	}
-	args = append(args, filterCustomArgs(opts.CustomArgs, openclawBlockedArgs, logger)...)
+	// OpenClaw binds models to pre-registered agents at `openclaw agents
+	// add/update --model` time; the daemon selects one at runtime by
+	// passing --agent <name>. The model dropdown populates its list from
+	// `openclaw agents list`, so opts.Model here is an agent name. Only
+	// inject when the user hasn't already set --agent via custom_args —
+	// custom_args wins for backward compatibility with existing configs.
+	customArgs := filterCustomArgs(opts.CustomArgs, openclawBlockedArgs, logger)
+	if opts.Model != "" && !customArgsContains(customArgs, "--agent") {
+		args = append(args, "--agent", opts.Model)
+	}
+	args = append(args, customArgs...)

 	if opts.SystemPrompt != "" {
 		prompt = opts.SystemPrompt + "\n\n" + prompt
@@ -155,6 +165,18 @@ func buildOpenclawArgs(prompt, sessionID string, opts ExecOptions, logger *slog.
 	return args
 }

+// customArgsContains reports whether args contains the given flag
+// (either as a standalone token "--flag" or in "--flag=value" form).
+func customArgsContains(args []string, flag string) bool {
+	prefix := flag + "="
+	for _, a := range args {
+		if a == flag || strings.HasPrefix(a, prefix) {
+			return true
+		}
+	}
+	return false
+}
+
 // ── Event handlers ──

 // openclawEventResult holds accumulated state from processing the event stream.
@@ -439,9 +461,9 @@ type openclawEvent struct {
 	CallID    string          `json:"callId,omitempty"`
 	Input     json.RawMessage `json:"input,omitempty"`
 	Usage     map[string]any  `json:"usage,omitempty"`
-	Phase     string          `json:"phase,omitempty"`     // lifecycle event phase
-	Error     *openclawError  `json:"error,omitempty"`     // structured error object
-	Message   string          `json:"message,omitempty"`   // alternative error message field
+	Phase     string          `json:"phase,omitempty"`   // lifecycle event phase
+	Error     *openclawError  `json:"error,omitempty"`   // structured error object
+	Message   string          `json:"message,omitempty"` // alternative error message field
 }

 // errorMessage extracts a human-readable error message from the event,
--- a/server/pkg/agent/openclaw_test.go
+++ b/server/pkg/agent/openclaw_test.go
@@ -688,8 +688,8 @@ func TestOpenclawUsageAlternativeFieldNames(t *testing.T) {

 	// Test PaperClip-style field names (inputTokens, outputTokens, etc.)
 	data := map[string]any{
-		"inputTokens":      float64(500),
-		"outputTokens":     float64(200),
+		"inputTokens":       float64(500),
+		"outputTokens":      float64(200),
 		"cachedInputTokens": float64(100),
 	}
 	usage := parseOpenclawUsage(data)
@@ -711,8 +711,8 @@ func TestOpenclawUsageSnakeCaseFieldNames(t *testing.T) {
 	// Test snake_case field names (Anthropic API style)
 	data := map[string]any{
 		"input_tokens":                float64(300),
-		"output_tokens":              float64(150),
-		"cache_read_input_tokens":    float64(80),
+		"output_tokens":               float64(150),
+		"cache_read_input_tokens":     float64(80),
 		"cache_creation_input_tokens": float64(40),
 	}
 	usage := parseOpenclawUsage(data)
@@ -796,8 +796,8 @@ func TestOpenclawUsageFinalResultAlternativeFields(t *testing.T) {
 			DurationMs: 1000,
 			AgentMeta: map[string]any{
 				"usage": map[string]any{
-					"inputTokens":      float64(400),
-					"outputTokens":     float64(180),
+					"inputTokens":       float64(400),
+					"outputTokens":      float64(180),
 					"cachedInputTokens": float64(90),
 				},
 			},
@@ -943,13 +943,15 @@ func TestBuildOpenclawArgsMinimal(t *testing.T) {
 	}
 }

-func TestBuildOpenclawArgsDoesNotForwardModelOrSystemPrompt(t *testing.T) {
+func TestBuildOpenclawArgsMapsModelToAgent(t *testing.T) {
 	t.Parallel()

-	// openclaw agent rejects --model and --system-prompt; verify they are
-	// never emitted as flags even when Model and SystemPrompt are set.
+	// For openclaw, agent.model stores the pre-registered agent name;
+	// the daemon must translate that to `--agent <name>` because the
+	// CLI rejects `--model` entirely. `--system-prompt` is also
+	// rejected and must not be emitted as a flag.
 	args := buildOpenclawArgs("task", "ses-2", ExecOptions{
-		Model:        "gpt-4o",
+		Model:        "deepseek-v4-agent",
 		SystemPrompt: "You are a helpful agent.",
 	}, slog.Default())

@@ -959,6 +961,40 @@ func TestBuildOpenclawArgsDoesNotForwardModelOrSystemPrompt(t *testing.T) {
 	if idx := indexOf(args, "--system-prompt"); idx != -1 {
 		t.Fatalf("unexpected --system-prompt flag at %d: %v", idx, args)
 	}
+
+	agentIdx := indexOf(args, "--agent")
+	if agentIdx == -1 || agentIdx+1 >= len(args) {
+		t.Fatalf("expected --agent <value> in args: %v", args)
+	}
+	if got := args[agentIdx+1]; got != "deepseek-v4-agent" {
+		t.Errorf("--agent value = %q, want %q", got, "deepseek-v4-agent")
+	}
+}
+
+func TestBuildOpenclawArgsCustomAgentWinsOverModel(t *testing.T) {
+	t.Parallel()
+
+	// If the user already configured --agent via custom_args, their
+	// value wins — we don't double-inject. This keeps existing configs
+	// working when they later set agent.model.
+	args := buildOpenclawArgs("task", "ses-2b", ExecOptions{
+		Model:      "from-dropdown",
+		CustomArgs: []string{"--agent", "from-custom-args"},
+	}, slog.Default())
+
+	count := 0
+	for _, a := range args {
+		if a == "--agent" {
+			count++
+		}
+	}
+	if count != 1 {
+		t.Fatalf("expected exactly one --agent flag, got %d: %v", count, args)
+	}
+	agentIdx := indexOf(args, "--agent")
+	if agentIdx+1 >= len(args) || args[agentIdx+1] != "from-custom-args" {
+		t.Errorf("custom --agent should win, got args: %v", args)
+	}
 }

 func TestBuildOpenclawArgsPrependsSystemPromptToMessage(t *testing.T) {
--- a/server/pkg/db/generated/agent.sql.go
+++ b/server/pkg/db/generated/agent.sql.go
@@ -14,7 +14,7 @@ import (
 const archiveAgent = `-- name: ArchiveAgent :one
 UPDATE agent SET archived_at = now(), archived_by = $2, updated_at = now()
 WHERE id = $1
-RETURNING id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config
+RETURNING id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config, model
 `

 type ArchiveAgentParams struct {
@@ -46,6 +46,7 @@ func (q *Queries) ArchiveAgent(ctx context.Context, arg ArchiveAgentParams) (Age
 		&i.CustomEnv,
 		&i.CustomArgs,
 		&i.McpConfig,
+		&i.Model,
 	)
 	return i, err
 }
@@ -161,7 +162,7 @@ func (q *Queries) ClaimAgentTask(ctx context.Context, agentID pgtype.UUID) (Agen
 const clearAgentMcpConfig = `-- name: ClearAgentMcpConfig :one
 UPDATE agent SET mcp_config = NULL, updated_at = now()
 WHERE id = $1
-RETURNING id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config
+RETURNING id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config, model
 `

 func (q *Queries) ClearAgentMcpConfig(ctx context.Context, id pgtype.UUID) (Agent, error) {
@@ -188,6 +189,7 @@ func (q *Queries) ClearAgentMcpConfig(ctx context.Context, id pgtype.UUID) (Agen
 		&i.CustomEnv,
 		&i.CustomArgs,
 		&i.McpConfig,
+		&i.Model,
 	)
 	return i, err
 }
@@ -253,9 +255,9 @@ const createAgent = `-- name: CreateAgent :one
 INSERT INTO agent (
    workspace_id, name, description, avatar_url, runtime_mode,
    runtime_config, runtime_id, visibility, max_concurrent_tasks, owner_id,
-    instructions, custom_env, custom_args, mcp_config
-) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14)
-RETURNING id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config
+    instructions, custom_env, custom_args, mcp_config, model
+) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15)
+RETURNING id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config, model
 `

 type CreateAgentParams struct {
@@ -273,6 +275,7 @@ type CreateAgentParams struct {
 	CustomEnv          []byte      `json:"custom_env"`
 	CustomArgs         []byte      `json:"custom_args"`
 	McpConfig          []byte      `json:"mcp_config"`
+	Model              pgtype.Text `json:"model"`
 }

 func (q *Queries) CreateAgent(ctx context.Context, arg CreateAgentParams) (Agent, error) {
@@ -291,6 +294,7 @@ func (q *Queries) CreateAgent(ctx context.Context, arg CreateAgentParams) (Agent
 		arg.CustomEnv,
 		arg.CustomArgs,
 		arg.McpConfig,
+		arg.Model,
 	)
 	var i Agent
 	err := row.Scan(
@@ -314,6 +318,7 @@ func (q *Queries) CreateAgent(ctx context.Context, arg CreateAgentParams) (Agent
 		&i.CustomEnv,
 		&i.CustomArgs,
 		&i.McpConfig,
+		&i.Model,
 	)
 	return i, err
 }
@@ -462,7 +467,7 @@ func (q *Queries) FailStaleTasks(ctx context.Context, arg FailStaleTasksParams)
 }

 const getAgent = `-- name: GetAgent :one
-SELECT id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config FROM agent
+SELECT id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config, model FROM agent
 WHERE id = $1
 `

@@ -490,12 +495,13 @@ func (q *Queries) GetAgent(ctx context.Context, id pgtype.UUID) (Agent, error) {
 		&i.CustomEnv,
 		&i.CustomArgs,
 		&i.McpConfig,
+		&i.Model,
 	)
 	return i, err
 }

 const getAgentInWorkspace = `-- name: GetAgentInWorkspace :one
-SELECT id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config FROM agent
+SELECT id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config, model FROM agent
 WHERE id = $1 AND workspace_id = $2
 `

@@ -528,6 +534,7 @@ func (q *Queries) GetAgentInWorkspace(ctx context.Context, arg GetAgentInWorkspa
 		&i.CustomEnv,
 		&i.CustomArgs,
 		&i.McpConfig,
+		&i.Model,
 	)
 	return i, err
 }
@@ -728,7 +735,7 @@ func (q *Queries) ListAgentTasks(ctx context.Context, agentID pgtype.UUID) ([]Ag
 }

 const listAgents = `-- name: ListAgents :many
-SELECT id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config FROM agent
+SELECT id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config, model FROM agent
 WHERE workspace_id = $1 AND archived_at IS NULL
 ORDER BY created_at ASC
 `
@@ -763,6 +770,7 @@ func (q *Queries) ListAgents(ctx context.Context, workspaceID pgtype.UUID) ([]Ag
 			&i.CustomEnv,
 			&i.CustomArgs,
 			&i.McpConfig,
+			&i.Model,
 		); err != nil {
 			return nil, err
 		}
@@ -775,7 +783,7 @@ func (q *Queries) ListAgents(ctx context.Context, workspaceID pgtype.UUID) ([]Ag
 }

 const listAllAgents = `-- name: ListAllAgents :many
-SELECT id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config FROM agent
+SELECT id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config, model FROM agent
 WHERE workspace_id = $1
 ORDER BY created_at ASC
 `
@@ -810,6 +818,7 @@ func (q *Queries) ListAllAgents(ctx context.Context, workspaceID pgtype.UUID) ([
 			&i.CustomEnv,
 			&i.CustomArgs,
 			&i.McpConfig,
+			&i.Model,
 		); err != nil {
 			return nil, err
 		}
@@ -914,7 +923,7 @@ func (q *Queries) ListTasksByIssue(ctx context.Context, issueID pgtype.UUID) ([]
 const restoreAgent = `-- name: RestoreAgent :one
 UPDATE agent SET archived_at = NULL, archived_by = NULL, updated_at = now()
 WHERE id = $1
-RETURNING id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config
+RETURNING id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config, model
 `

 func (q *Queries) RestoreAgent(ctx context.Context, id pgtype.UUID) (Agent, error) {
@@ -941,6 +950,7 @@ func (q *Queries) RestoreAgent(ctx context.Context, id pgtype.UUID) (Agent, erro
 		&i.CustomEnv,
 		&i.CustomArgs,
 		&i.McpConfig,
+		&i.Model,
 	)
 	return i, err
 }
@@ -993,9 +1003,10 @@ UPDATE agent SET
    custom_env = COALESCE($12, custom_env),
    custom_args = COALESCE($13, custom_args),
    mcp_config = COALESCE($14, mcp_config),
+    model = COALESCE($15, model),
    updated_at = now()
 WHERE id = $1
-RETURNING id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config
+RETURNING id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config, model
 `

 type UpdateAgentParams struct {
@@ -1013,6 +1024,7 @@ type UpdateAgentParams struct {
 	CustomEnv          []byte      `json:"custom_env"`
 	CustomArgs         []byte      `json:"custom_args"`
 	McpConfig          []byte      `json:"mcp_config"`
+	Model              pgtype.Text `json:"model"`
 }

 func (q *Queries) UpdateAgent(ctx context.Context, arg UpdateAgentParams) (Agent, error) {
@@ -1031,6 +1043,7 @@ func (q *Queries) UpdateAgent(ctx context.Context, arg UpdateAgentParams) (Agent
 		arg.CustomEnv,
 		arg.CustomArgs,
 		arg.McpConfig,
+		arg.Model,
 	)
 	var i Agent
 	err := row.Scan(
@@ -1054,6 +1067,7 @@ func (q *Queries) UpdateAgent(ctx context.Context, arg UpdateAgentParams) (Agent
 		&i.CustomEnv,
 		&i.CustomArgs,
 		&i.McpConfig,
+		&i.Model,
 	)
 	return i, err
 }
@@ -1061,7 +1075,7 @@ func (q *Queries) UpdateAgent(ctx context.Context, arg UpdateAgentParams) (Agent
 const updateAgentStatus = `-- name: UpdateAgentStatus :one
 UPDATE agent SET status = $2, updated_at = now()
 WHERE id = $1
-RETURNING id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config
+RETURNING id, workspace_id, name, avatar_url, runtime_mode, runtime_config, visibility, status, max_concurrent_tasks, owner_id, created_at, updated_at, description, runtime_id, instructions, archived_at, archived_by, custom_env, custom_args, mcp_config, model
 `

 type UpdateAgentStatusParams struct {
@@ -1093,6 +1107,7 @@ func (q *Queries) UpdateAgentStatus(ctx context.Context, arg UpdateAgentStatusPa
 		&i.CustomEnv,
 		&i.CustomArgs,
 		&i.McpConfig,
+		&i.Model,
 	)
 	return i, err
 }
--- a/server/pkg/db/generated/models.go
+++ b/server/pkg/db/generated/models.go
@@ -40,6 +40,7 @@ type Agent struct {
 	CustomEnv          []byte             `json:"custom_env"`
 	CustomArgs         []byte             `json:"custom_args"`
 	McpConfig          []byte             `json:"mcp_config"`
+	Model              pgtype.Text        `json:"model"`
 }

 type AgentRuntime struct {
--- a/server/pkg/db/queries/agent.sql
+++ b/server/pkg/db/queries/agent.sql
@@ -20,8 +20,8 @@ WHERE id = $1 AND workspace_id = $2;
 INSERT INTO agent (
    workspace_id, name, description, avatar_url, runtime_mode,
    runtime_config, runtime_id, visibility, max_concurrent_tasks, owner_id,
-    instructions, custom_env, custom_args, mcp_config
-) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14)
+    instructions, custom_env, custom_args, mcp_config, model
+) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15)
 RETURNING *;

 -- name: UpdateAgent :one
@@ -39,6 +39,7 @@ UPDATE agent SET
    custom_env = COALESCE(sqlc.narg('custom_env'), custom_env),
    custom_args = COALESCE(sqlc.narg('custom_args'), custom_args),
    mcp_config = COALESCE(sqlc.narg('mcp_config'), mcp_config),
+    model = COALESCE(sqlc.narg('model'), model),
    updated_at = now()
 WHERE id = $1
 RETURNING *;
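
The `model = COALESCE(sqlc.narg('model'), model)` clause in `UpdateAgent` means a NULL parameter leaves the stored model untouched, while any non-NULL value (including an empty string) overwrites it. A minimal Go sketch of that rule follows. `nullableText` is a hypothetical stand-in for `pgtype.Text`, and `coalesceModel` is an illustrative helper, not code from this PR.

```go
package main

import "fmt"

// nullableText is a hypothetical stand-in for pgtype.Text:
// Valid=false models SQL NULL.
type nullableText struct {
	String string
	Valid  bool
}

// coalesceModel mirrors `model = COALESCE($15, model)`:
// the parameter wins when it is non-NULL, otherwise the
// current column value is kept.
func coalesceModel(param, current nullableText) nullableText {
	if param.Valid {
		return param
	}
	return current
}

func main() {
	current := nullableText{String: "claude-sonnet-4-6", Valid: true}

	kept := coalesceModel(nullableText{}, current)                                   // NULL parameter
	replaced := coalesceModel(nullableText{String: "gpt-5.4", Valid: true}, current) // new value
	emptied := coalesceModel(nullableText{String: "", Valid: true}, current)         // empty, non-NULL

	fmt.Printf("%q %q %q\n", kept.String, replaced.String, emptied.String)
	// prints: "claude-sonnet-4-6" "gpt-5.4" ""
}
```

One consequence of this shape is that the update query alone cannot set the column back to NULL. Clearing a model would presumably have to go through an empty string, though how the handler treats that case is not shown in this diff.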
Author	SHA1	Message	Date
Jiayuan Zhang	aaa5529f61	fix(agents): honour default flag wire-side and stop masking model errors Addresses three issues from the latest PR #1399 review. 1. Wire the `default` flag end-to-end. The Model struct tagged one entry per provider as Default=true so the UI can badge it, but the daemon's heartbeat-report writer serialised models as `map[string]string`, silently dropping the bool; handler's `ModelEntry` also lacked the field. Result: the dropdown always showed "Default (provider)" and never "Default — <name>". Replace the map with a typed wire struct on the daemon side and add `Default bool` to `handler.ModelEntry`, plus a regression test asserting the flag round-trips through the report body and the store. 2. Stop imposing a static Multica-side default on task execution. `runTask` resolved the model via a three-tier chain that ended in `agent.DefaultModel(provider)`, forcing `claude-sonnet-4-6` / `gpt-5.4` / etc. onto every task whose agent hadn't set a model. That's the same shape of bug that bit us on cursor: any Go-side guess drifts from what the user's account actually has access to. Drop the third tier — when both `agent.model` and `MULTICA_<PROVIDER>_MODEL` are empty we now pass `""` through, so each backend omits `--model` from the CLI and the provider resolves its own default. `DefaultModel` / `defaultStaticModelsFor` become dead and are removed along with their test; the per-entry `Default: true` markers remain as a display hint on discovery responses (and are now surfaced via Fix #1). 3. Hermes: fail the task when the user's chosen model can't be applied. `session/set_model` errors used to log a warn and continue on hermes' own default, so a successful task would mislead the user into thinking their explicit pick ran. Now, when `opts.Model != ""` and the RPC fails, we send a `failed` Result with the hermes error in `result.Error` and skip the prompt entirely. 
Plus gofmt clean-up on `hermes.go` (the new sniffer block picked up tab/space noise on the prior commit).	2026-04-21 00:00:34 +08:00
Jiayuan Zhang	a8e2ef4c4d	fix(agent/hermes): surface provider errors instead of reporting empty output When the model the user picked isn't usable by the configured provider — e.g. ChatGPT-account auth rejecting `gpt-5.1-codex-mini` as "not supported when using Codex with a ChatGPT account" — hermes still acknowledges `session/prompt` with `stopReason=end_turn` and an empty text payload. The daemon's task machinery then reports a useless "hermes returned empty output" and the real HTTP 400 detail (the one that actually tells the user what to change) stays buried in the daemon's stderr log. Tee stderr through a bounded provider-error sniffer that scans for `⚠️` / `❌` / `[ERROR]` headers plus the `Error:` / `detail:` tails hermes emits for API failures. When the task finishes `completed` with empty output, promote it to `failed` and use the sniffed message as the task error. Nothing changes for healthy runs — the sniffer is an additional writer behind an `io.MultiWriter`, it doesn't filter or replace the normal stderr log forwarder. Reproduced against a local hermes configured for `provider: openai-codex` + `chatgpt.com/backend-api/codex`: before this change the task result was `{status: completed, comment: "hermes returned empty output"}`; after, it's `{status: failed, error: "hermes provider error: HTTP 400: ... The 'gpt-5.1-codex-mini' model is not supported when using Codex with a ChatGPT account."}`. Tests cover the real fixture shape, partial-line buffering across Writer calls, log-line filtering (info-level lines never surface as errors), and the bounded line buffer.	2026-04-20 23:45:49 +08:00
Jiayuan Zhang	9acaf248dc	feat(agent/hermes): ACP-driven model discovery and per-task selection Hermes' ACP server already exposes everything we need to treat its model catalog like any other provider's: * `acp_adapter/server.py::_build_model_state` attaches a full `SessionModelState` (availableModels + currentModelId) to the response of `session/new` / `session/load` / `session/resume`. * `acp_adapter/server.py::set_session_model` is an ACP-standard RPC that switches the active session's provider + model. Use both from Multica's daemon instead of treating hermes as a "dropdown-disabled" special case. Discovery — `discoverHermesModels` spawns a throwaway `hermes acp` process, drives `initialize` + `session/new(cwd=<tmpdir>)`, reads the model catalog straight out of the response, then kills the child. Failures (hermes missing, bad credentials, config resolution error) degrade to an empty list so the creatable dropdown still works. The existing 60s TTL cache amortises the ~few-second process spin-up. Per-task selection — `hermesBackend.Execute` now issues a `session/set_model` RPC with the UI-chosen `opts.Model` before `session/prompt`. Failure is logged but not fatal; the session falls back to hermes' configured default rather than aborting the task. `ModelSelectionSupported` no longer special-cases hermes — every provider in the registry now honours `agent.model` end-to-end. The UI "disabled" branch of ModelDropdown is retained (dead for today's catalog) so a future provider can opt out without reintroducing this plumbing. Tests cover the JSON parser for the real `SessionModelState` shape (including `currentModelId` mapping to Default=true, dedup of duplicate `modelId` entries, and graceful handling of missing / malformed payloads), plus a regression guard on the support flag.	2026-04-20 23:34:17 +08:00
Jiayuan Zhang	ad5a2abaa2	fix(agent/cursor): discover models via `cursor-agent --list-models` The previous cursor catalog hardcoded model IDs like `claude-sonnet-4-6`, `gpt-5.4`, `gemini-2.5-pro` and `composer-1.5`. None of those are valid cursor-agent model names today — the real IDs have shape `composer-2-fast`, `claude-4.6-sonnet-medium`, `claude-opus-4-7-high`, `gemini-3.1-pro`, etc. Selecting one of the stale IDs from the UI made cursor-agent exit 1 during task execution with no actionable error, visible in practice as "cursor-agent exited with error: exit status 1" on a 2-second-long failed run. Cursor's catalog is too volatile to ship statically (there are ~60+ variants covering the fast/medium/high/xhigh/max × thinking matrix). Switch to dynamic discovery via `cursor-agent --list-models`, with a single-entry `auto` static fallback when the binary is missing. - `discoverCursorModels` shells out and parses the `<id> - <label>` rows; the entry tagged `(default)` surfaces as Default=true so the dropdown badge matches cursor's own pick rather than ours. - Strip the parenthetical suffix from the display label so "Composer 2 Fast (current, default)" renders as "Composer 2 Fast". - Reuses the same identifier guard as the openclaw text parser to skip the "Available models" header / blank lines. - Tests parse the real fixture from the live CLI. DefaultModel("cursor") now returns "auto" (the static-fallback default) instead of a specific ID, since we intentionally don't ship an opinionated cursor model — cursor itself surfaces the real default through the Default flag on whichever entry `--list-models` tags.	2026-04-20 23:09:56 +08:00
Jiayuan Zhang	6094a149ac	fix(agent/openclaw): reject TUI decoration when enumerating agents `openclaw agents list` prints a decorated banner with box-drawing characters and section headers ("Agents:", "Identity:", "Workspace:") surrounding the actual agent rows. The previous text parser picked up the first whitespace-separated token on every non-empty, non-dash-prefixed line, which surfaced the section headers and single-character icons ("│", "◇") as selectable models in the dropdown. - Try structured JSON output first (`agents list --json` / `--output json` / `-o json`); on any of those the UI gets names straight from the CLI's source of truth, with the model shown as a label suffix. - Fall back to a conservative text parser that only accepts rows with at least two whitespace-separated tokens where both look like safe identifiers (letters / digits / `-_./`, no trailing colon). Anything decorative or header-like is discarded. - On ambiguity we return an empty list — a silently-wrong enumeration would mislead users more than the creatable dropdown's manual entry. Adds a regression test using the exact banner shape seen in the field plus two tests for the JSON paths (array and `{agents: [...]}` wrapped).	2026-04-20 22:51:52 +08:00
Jiayuan Zhang	a67e533742	feat(agents): add per-agent model field with provider-aware dropdown Adds a first-class `model` field to agents so users can pick the LLM model from the create / settings UI instead of editing custom_env / custom_args. The previous "set MULTICA_<PROVIDER>_MODEL env var on the daemon" approach forced one model per provider per machine and was easy to misconfigure (e.g. -m as a custom_arg breaks codex app-server initialization). Backend (server/pkg/agent): - New `agent.ListModels(provider, path)` returns the models supported by a provider. Static catalogs for claude, codex, gemini, cursor, copilot; dynamic discovery for opencode (`opencode models`), pi (`pi --list-models`), openclaw (`openclaw agents list`); 60s TTL cache + empty-list fallback on failure. Hermes returns an empty list and `ModelSelectionSupported=false` because its model is configured out-of-band. - `agent.DefaultModel(provider)` returns the recommended default per provider (Sonnet 4.6 for claude, GPT-5.4 for codex, Gemini 2.5 Pro for gemini, composer-1.5 for cursor); copilot/openclaw/hermes deliberately have no default. The static catalog tags one entry per provider with `Default: true` so the UI can render a badge. - For openclaw, opts.Model is mapped to `--agent <name>` since the CLI rejects `--model` outright; custom_args `--agent` still wins for back-compat. Daemon protocol (server/internal/daemon): - Heartbeat response carries an optional `pending_model_list` request (same pattern as PingStore / UpdateStore). The daemon resolves models via `agent.ListModels`, including the `supported` flag, and reports back via /api/daemon/runtimes/{id}/models/{requestId}/result. - Task dispatch uses a three-tier fallback for the runtime model: agent.model → MULTICA_<PROVIDER>_MODEL env → agent.DefaultModel(provider). Server API (server/internal/handler): - `agent.model` is a new column (migration 050) and surfaces in Agent / CreateAgent / UpdateAgent payloads. 
- New endpoints under /api/runtimes/{id}/models: POST to initiate discovery, GET to poll the request, plus the daemon-side report endpoint above. CLI (server/cmd/multica): - `multica agent create / update --model <id>`. Help copy steers users away from passing --model via --custom-args, which fails on codex (app-server mode) and openclaw. Frontend (packages/core, packages/views): - `Agent.model`, `RuntimeModel`, `RuntimeModelListRequest`, `RuntimeModelsResult` types. - `runtimes/models.ts` exports `runtimeModelsOptions(runtimeId)` which initiates discovery and polls the request to completion (500ms cadence, 30s ceiling). - New `ModelDropdown` (packages/views/agents) — searchable popover, provider grouping, creatable manual entry, "default" badge on the shipped recommendation, disabled state when the provider reports `supported=false` (Hermes), and clears any stale model value in that case to avoid persisting a ghost configuration. - Wired into create-agent-dialog and the agent settings tab. Verification: - gofmt clean on touched files - `go build ./... && go test ./...` (server) green; new openclaw and models_test cases included - `pnpm typecheck` green across all 6 packages Closes the immediate UX gap behind MUL-1151. DeepSeek V4 (or any new model) becomes a zero-code addition: add it to the relevant static catalog, or rely on the creatable input for one-off use.	2026-04-20 22:45:56 +08:00
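
The two-tier model resolution described in the `fix(agents)` commit at the top of this table (`agent.model`, then `MULTICA_<PROVIDER>_MODEL`, then an empty string so the backend omits `--model`) can be sketched roughly as follows. `resolveModel` and `modelArgs` are hypothetical names rather than the daemon's actual functions, the environment lookup is injected so the example stays self-contained, and upper-casing the provider name to build the variable is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveModel returns the agent's own model when set, otherwise the
// provider-level environment override, otherwise "" (no static default).
func resolveModel(agentModel, provider string, getenv func(string) string) string {
	if agentModel != "" {
		return agentModel
	}
	// Assumption: the provider name is upper-cased to form the variable name.
	return getenv("MULTICA_" + strings.ToUpper(provider) + "_MODEL")
}

// modelArgs builds the CLI flags for a resolved model. An empty model
// yields no flags, so the provider CLI falls back to its own default.
func modelArgs(model string) []string {
	if model == "" {
		return nil
	}
	return []string{"--model", model}
}

func main() {
	env := map[string]string{"MULTICA_CODEX_MODEL": "gpt-5.4"}
	getenv := func(key string) string { return env[key] }

	fmt.Println(modelArgs(resolveModel("my-pick", "codex", getenv))) // [--model my-pick]
	fmt.Println(modelArgs(resolveModel("", "codex", getenv)))        // [--model gpt-5.4]
	fmt.Println(modelArgs(resolveModel("", "claude", getenv)))       // []
}
```

The sketch uses a generic `--model` flag for illustration only. Per the diff above, openclaw maps the resolved value to `--agent <name>` instead, and other backends may apply the model differently (hermes, for instance, uses the `session/set_model` RPC).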
				`@@ -0,0 +1 @@`
				`ALTER TABLE agent DROP COLUMN IF EXISTS model;`