feat(agent): add Kimi CLI as agent runtime (#1400)

* feat(agent): add Kimi CLI as agent runtime

Adds support for Moonshot AI's Kimi Code CLI (https://github.com/MoonshotAI/kimi-cli)
as a new agent runtime, alongside Claude, Codex, OpenCode, OpenClaw, Hermes,
Gemini, Pi, Cursor and Copilot.

Kimi Code CLI implements the standard Agent Client Protocol (ACP) via the
`kimi acp` subcommand, so the new `kimiBackend` reuses the existing
hermesClient JSON-RPC transport in the agent package — only the binary,
client identity, log prefix, and tool-name extraction differ.

Wiring:
- server/pkg/agent: new kimiBackend + kimi_test.go; registered in New(),
  LaunchHeader map, and the supported-types coverage test.
- server/internal/daemon/config.go: probes `kimi` (overridable via
  MULTICA_KIMI_PATH / MULTICA_KIMI_MODEL).
- server/internal/daemon/execenv: writes AGENTS.md as the runtime context
  file (Kimi reads AGENTS.md natively via /init), and writes skills under
  `.kimi/skills/` so they are auto-discovered by the project-level skill
  loader.
- packages/views/runtimes: ProviderLogo gains a Kimi mark.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(agent/kimi): support per-agent model selection via ACP set_model

Wire Kimi into the model dropdown introduced in #1399:

- ListModels gets a 'kimi' case that drives the same ACP
  initialize + session/new handshake as Hermes; both share a new
  discoverACPModels helper and parseACPSessionNewModels parser
  so future ACP backends only need a small provider entry.
- kimiBackend now issues session/set_model after session/new when
  opts.Model is non-empty, mirroring the Hermes flow. Failures
  fail the task instead of silently falling back to Kimi's
  default model — silent fallback would hide that the dropdown
  pick wasn't honoured.

Verified: go build ./..., go test ./pkg/agent/... ./internal/daemon/... ./internal/handler/..., pnpm typecheck and pnpm test (138 passed).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor(agent): address code review feedback on Kimi runtime

- Share ACP provider-error sniffer between hermes and kimi. Previously
  only hermes promoted stderr-observed 4xx/5xx into a failed task;
  kimi would report "completed + empty output" when the Moonshot
  upstream rejected a request (expired token, rate limit, …). Rename
  hermesProviderErrorSniffer → acpProviderErrorSniffer and parameterise
  the provider name; wire it into kimiBackend.Execute the same way.
- Rename extractHermesSessionID → extractACPSessionID (shared by all
  ACP backends) so the name matches parseACPSessionNewModels.
- Drop the redundant second argument to kimiToolNameFromTitle; the
  Message struct has only one relevant field (Tool), so passing it
  twice was a dead fallback. Document that the function normalises
  residual capitalised kimi titles not caught by hermesToolNameFromTitle.
- Remove kimi-only cmd.WaitDelay override; the hermes baseline is
  fine for both and divergence adds noise.
- Add TestKimiBackendSetModelFailureFailsTask: fake `kimi acp` binary
  that returns a JSON-RPC error for session/set_model, asserts that
  the task result surfaces status=failed with the model name + upstream
  message and preserves the session id.
- Fix stale agent listings in agent.go / daemon/config.go doc comments
  (missing cursor, gemini, copilot).

All: `go build ./...`, `go vet ./...`, `go test ./pkg/agent/...
./internal/daemon/... ./internal/handler/...` green.

* fix(agent/kimi): pass --yolo so Shell tools don't hang on approval

Kimi's default config has `default_yolo = false`. Every Shell/file-mutating
tool call causes kimi acp to send a `session/request_permission` request
and block (up to 300s) waiting for a response. The daemon's hermesClient
only handles `session/update` notifications — permission requests go
unanswered, the tool call times out, and the UI loop eventually dies
("UI loop timed out"). Observed with the first real kimi task: agent sat
as Live for ~7 minutes before the daemon killed it.

The fix mirrors hermes' HERMES_YOLO_MODE=1 override: pass `--yolo` to
`kimi` so it auto-approves everything. `--yolo` is a top-level flag on
the `kimi` CLI (not a flag on `kimi acp`), so it must come before the
`acp` subcommand in argv. Added to kimiBlockedArgs so user custom_args
can't strip it.

While here, fix a related bug that made kimi tool names show up empty
in the daemon log ("tool #1: "): hermesToolNameFromTitle's fallback
returned `kind` when neither title-with-colon nor kind matched a known
tool. Kimi's ACP `tool_call` emits bare titles like "Shell" or "Read
file" with no `kind` at all, so we'd drop the title on the floor before
kimiToolNameFromTitle ever got a chance to map it. Now: preserve the
title when kind is unclassified; hermes titles always carry a colon so
this branch never fires for hermes.

Tests:
- TestKimiBackendPassesYoloFlag — fake binary that records its argv,
  asserts --yolo comes before acp.
- TestHermesToolNameFromTitle rows for bare kimi-style titles.
- Existing suite green: go build, go vet, full pkg/agent + daemon +
  handler test packages.

* fix(agent/acp): auto-approve session/request_permission from agent

The previous attempt (`kimi --yolo acp`) was a no-op. Inspected the
kimi-cli source: the `acp` Typer subcommand takes no parameters, so
flags on the root `kimi` command are dropped before `acp_main()` runs
— it's impossible to opt into YOLO mode through CLI flags for ACP.

The real fix is on our side: respond to session/request_permission.

ACP is bidirectional. When kimi runs a Shell or file-write tool, it
sends `session/request_permission` (agent → client, JSON-RPC request
with id + method) and waits up to 300s for a response. Our existing
hermesClient.handleLine only dispatched: (id + result/error) →
handleResponse, and (no id + method) → handleNotification. A request
with BOTH id and method fell through and got silently dropped — kimi
timed out, UI loop died, task sat stuck for 7 minutes.

Add handleAgentRequest: for session/request_permission, echo the id
and respond with outcome=selected, optionId=approve_for_session. The
daemon is headless; there's no user to prompt. `approve_for_session`
lets the agent remember the action so subsequent identical calls
(every Shell, every file write) skip the round-trip entirely. For any
other agent → client method, reply with standard -32601 method-not-
found so the agent doesn't block.

Also:
- Add writeMu so request() (main goroutine) and handleAgentRequest
  (reader goroutine) don't interleave JSON frames on stdin.
- Revert the `--yolo acp` flag — it's a no-op, and carrying it in
  kimiBlockedArgs gives the wrong impression that it does something.
  Comment in kimi.go now points at handleAgentRequest as the real fix.

Tests:
- TestHermesClientAutoApprovesPermissionRequest: inject a
  session/request_permission, assert the reply echoes the id and
  carries {outcome: selected, optionId: approve_for_session}.
- TestHermesClientReplesMethodNotFoundForUnknownAgentRequest: confirm
  unknown agent → client methods get JSON-RPC -32601 instead of silence.
- TestKimiBackendInvokesACPSubcommand replaces the yolo-flag assertion
  with a negative assertion: no dead --yolo / --auto-approve / -y on
  argv, since they'd pretend to do something they can't.

All: go build ./..., go vet ./..., go test ./pkg/agent/... green.

* fix(agent/acp): surface kimi tool input/output via content blocks

Kimi-cli emits tool_call and tool_call_update ACP frames with the
input/output inside a `content` array of ContentToolCallContent
blocks (shape: {type:"content", content:{type:"text", text:"..."}}),
not in the hermes-style `rawInput` map / `rawOutput` string. Our
parser only looked at rawInput/rawOutput, so the daemon recorded
empty Input and Output for every kimi tool — the execution-history
UI showed blank terminal panels even for commands that ran fine.

Add extractACPToolCallText() and a fallback in handleToolCallStart /
handleToolCallUpdate: when rawInput is nil / rawOutput is empty, pull
the text out of the content blocks. rawInput / rawOutput still take
precedence so hermes' behaviour is untouched. Terminal /
FileEditToolCallContent blocks are skipped (we have nothing to render
them as — kimi only emits TerminalToolCallContent when the client
advertises terminal capability, which we don't).

Tests:
- TestHermesClientHandleToolCallStartKimiContent — content array →
  Input.text populated.
- TestHermesClientHandleToolCallCompleteKimiContent — multi-block
  content → Output concatenated with newline separator.
- TestHermesClientHandleToolCallRawOutputTakesPrecedence — hermes
  rawOutput still wins when both are present.
- TestExtractACPToolCallText — unit coverage for the helper
  (single/multiple text blocks, terminal-block skip, empty input).

* fix(agent/acp): buffer streaming tool args so Input isn't empty in UI

kimi-cli streams tool args token-by-token via tool_call_update frames
— the initial tool_call carries an empty content block and each
subsequent in_progress update carries the cumulative JSON so far
(`{`, `{"comma`, `{"command": "echo`, …). The final completed update
then carries the tool's stdout, not the args. Observed per kimi-cli
acp/session.py::_send_tool_call{,_part,_result} and confirmed by
driving a real Shell call end-to-end: 10 in_progress frames, last
with `{"command": "echo hello world"}`, then completed with `hello
world\n`.

Our previous handleToolCallStart emitted MessageToolUse on the first
tool_call frame, capturing the empty content — so every kimi tool
appeared in the execution-history UI with a blank input. Output was
correct (fix 4335c198) but command was missing.

Changes:
- hermesClient now tracks pending tool calls per toolCallId. Hermes
  path is unchanged — rawInput is present at tool_call time, so
  emit-immediately-then-flag-emitted still fires on the initial frame.
- kimi path defers MessageToolUse until status=completed / failed.
  tool_call_update in_progress frames update the buffered argsText
  (cumulative, so overwrite); on completion we parse the accumulated
  JSON into Message.Input. Malformed JSON falls back to `{"text": …}`
  so non-JSON tool args still render.
- Orphan completion frames (no matching tool_call seen — e.g. daemon
  restarted mid-task) synthesise ToolUse from the update's own
  title/kind/rawInput so the UI still gets a header.
- extractACPToolCallText now also renders FileEditToolCallContent
  blocks as a compact header ("--- path / +++ path / (edited: N → M
  bytes)"). kimi emits these for Write / StrReplaceFile / Patch when
  the tool's display block is a DiffDisplayBlock.

Tests:
- TestHermesClientKimiStreamingToolCall: empty tool_call + 5 streaming
  in_progress + completed. Asserts no emission until complete, then
  [ToolUse(Input.command="echo hi"), ToolResult(Output="hi\n")].
- TestHermesClientKimiMalformedArgsFallback: non-JSON argsText → falls
  back to Input.text.
- TestHermesClientHandleToolCallCompleteOrphan: completed frame
  without a start → ToolUse synthesised from update's rawInput.
- TestExtractACPToolCallText: diff + new-file-diff cases.

All agent / daemon / handler test packages green.

---------

Co-authored-by: Eve <8b0578a3-cf72-4394-9e38-b328eca92463@users.noreply.multica.ai>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Eve <eve@multica.ai>
Co-authored-by: Lambda <f252c2c5-7d1d-4f3c-b394-a61abfe673fc@users.noreply.multica.ai>
This commit is contained in:
devv-eve
2026-04-20 11:18:30 -07:00
committed by GitHub
parent b291db11c2
commit 9e47b83f02
12 changed files with 1448 additions and 96 deletions

View File

@@ -111,6 +111,20 @@ function CursorLogo({ className }: { className: string }) {
);
}
// Kimi (Moonshot AI) — wordmark "K" mark in Moonshot brand purple, simple
// rounded-square logotype suitable for small icon sizes.
function KimiLogo({ className }: { className: string }) {
return (
<svg viewBox="0 0 24 24" fill="none" className={className}>
<rect width="24" height="24" rx="5" fill="#1F1147" />
<path
d="M7.2 6h2.4v5.1l4.3-5.1h2.9l-4.4 5.1L17 18h-2.9l-3.2-5.2-1.3 1.5V18H7.2V6z"
fill="#FFFFFF"
/>
</svg>
);
}
export function ProviderLogo({
provider,
className = "h-4 w-4",
@@ -135,6 +149,8 @@ export function ProviderLogo({
return <CopilotLogo className={className} />;
case "cursor":
return <CursorLogo className={className} />;
case "kimi":
return <KimiLogo className={className} />;
default:
return <Monitor className={className} />;
}

View File

@@ -34,7 +34,7 @@ type Config struct {
CLIVersion string // multica CLI version (e.g. "0.1.13")
LaunchedBy string // "desktop" when spawned by the Electron app, empty for standalone
Profile string // profile name (empty = default)
Agents map[string]AgentEntry // keyed by provider: claude, codex, opencode, openclaw, hermes, gemini, pi
Agents map[string]AgentEntry // keyed by provider: claude, codex, copilot, opencode, openclaw, hermes, gemini, pi, cursor, kimi
WorkspacesRoot string // base path for execution envs (default: ~/multica_workspaces)
KeepEnvAfterTask bool // preserve env after task for debugging
HealthPort int // local HTTP port for health checks (default: 19514)
@@ -142,8 +142,15 @@ func LoadConfig(overrides Overrides) (Config, error) {
Model: strings.TrimSpace(os.Getenv("MULTICA_COPILOT_MODEL")),
}
}
kimiPath := envOrDefault("MULTICA_KIMI_PATH", "kimi")
if _, err := exec.LookPath(kimiPath); err == nil {
agents["kimi"] = AgentEntry{
Path: kimiPath,
Model: strings.TrimSpace(os.Getenv("MULTICA_KIMI_MODEL")),
}
}
if len(agents) == 0 {
return Config{}, fmt.Errorf("no agent CLI found: install claude, codex, copilot, opencode, openclaw, hermes, gemini, pi, or cursor-agent and ensure it is on PATH")
return Config{}, fmt.Errorf("no agent CLI found: install claude, codex, copilot, opencode, openclaw, hermes, gemini, pi, cursor-agent, or kimi and ensure it is on PATH")
}
// Host info

View File

@@ -17,6 +17,7 @@ import (
// OpenCode: skills → {workDir}/.config/opencode/skills/{name}/SKILL.md (native discovery)
// Pi: skills → {workDir}/.pi/agent/skills/{name}/SKILL.md (native discovery)
// Cursor: skills → {workDir}/.cursor/skills/{name}/SKILL.md (native discovery)
// Kimi: skills → {workDir}/.kimi/skills/{name}/SKILL.md (native discovery)
// Default: skills → {workDir}/.agent_context/skills/{name}/SKILL.md
func writeContextFiles(workDir, provider string, ctx TaskContextForEnv) error {
contextDir := filepath.Join(workDir, ".agent_context")
@@ -69,6 +70,10 @@ func resolveSkillsDir(workDir, provider string) (string, error) {
case "cursor":
// Cursor natively discovers skills from .cursor/skills/ in the workdir.
skillsDir = filepath.Join(workDir, ".cursor", "skills")
case "kimi":
// Kimi Code CLI auto-discovers project-level skills from .kimi/skills/
// in the workdir. See https://moonshotai.github.io/kimi-cli/en/customization/skills.html
skillsDir = filepath.Join(workDir, ".kimi", "skills")
default:
// Fallback: write to .agent_context/skills/ (referenced by meta config).
skillsDir = filepath.Join(workDir, ".agent_context", "skills")

View File

@@ -18,13 +18,14 @@ import (
// For Gemini: writes {workDir}/GEMINI.md (discovered natively by the Gemini CLI)
// For Pi: writes {workDir}/AGENTS.md (skills discovered natively from ~/.pi/agent/skills/)
// For Cursor: writes {workDir}/AGENTS.md (skills discovered natively from .cursor/skills/)
// For Kimi: writes {workDir}/AGENTS.md (Kimi Code CLI reads AGENTS.md natively; skills auto-discovered from project skills dirs)
func InjectRuntimeConfig(workDir, provider string, ctx TaskContextForEnv) error {
content := buildMetaSkillContent(provider, ctx)
switch provider {
case "claude":
return os.WriteFile(filepath.Join(workDir, "CLAUDE.md"), []byte(content), 0o644)
case "codex", "copilot", "opencode", "openclaw", "pi", "cursor":
case "codex", "copilot", "opencode", "openclaw", "pi", "cursor", "kimi":
return os.WriteFile(filepath.Join(workDir, "AGENTS.md"), []byte(content), 0o644)
case "gemini":
return os.WriteFile(filepath.Join(workDir, "GEMINI.md"), []byte(content), 0o644)
@@ -151,8 +152,8 @@ func buildMetaSkillContent(provider string, ctx TaskContextForEnv) string {
case "claude":
// Claude discovers skills natively from .claude/skills/ — just list names.
b.WriteString("You have the following skills installed (discovered automatically):\n\n")
case "codex", "copilot", "opencode", "openclaw", "pi", "cursor":
// Codex, Copilot, OpenCode, OpenClaw, Pi, and Cursor discover skills natively from their respective paths — just list names.
case "codex", "copilot", "opencode", "openclaw", "pi", "cursor", "kimi":
// Codex, Copilot, OpenCode, OpenClaw, Pi, Cursor, and Kimi discover skills natively from their respective paths — just list names.
b.WriteString("You have the following skills installed (discovered automatically):\n\n")
case "gemini":
// Gemini reads GEMINI.md directly; point it at the fallback skills dir.

View File

@@ -1,5 +1,6 @@
// Package agent provides a unified interface for executing prompts via
// coding agents (Claude Code, Codex, OpenCode, OpenClaw, Hermes, Pi). It mirrors the happy-cli AgentBackend
// coding agents (Claude Code, Codex, Copilot, OpenCode, OpenClaw, Hermes,
// Gemini, Pi, Cursor, Kimi). It mirrors the happy-cli AgentBackend
// pattern, translated to idiomatic Go.
package agent
@@ -85,13 +86,13 @@ type Result struct {
// Config configures a Backend instance.
type Config struct {
ExecutablePath string // path to CLI binary (claude, codex, copilot, opencode, openclaw, hermes, gemini, or pi)
ExecutablePath string // path to CLI binary (claude, codex, copilot, opencode, openclaw, hermes, gemini, pi, cursor, kimi)
Env map[string]string // extra environment variables
Logger *slog.Logger
}
// New creates a Backend for the given agent type.
// Supported types: "claude", "codex", "copilot", "opencode", "openclaw", "hermes", "gemini", "pi", "cursor".
// Supported types: "claude", "codex", "copilot", "opencode", "openclaw", "hermes", "gemini", "pi", "cursor", "kimi".
func New(agentType string, cfg Config) (Backend, error) {
if cfg.Logger == nil {
cfg.Logger = slog.Default()
@@ -116,8 +117,10 @@ func New(agentType string, cfg Config) (Backend, error) {
return &piBackend{cfg: cfg}, nil
case "cursor":
return &cursorBackend{cfg: cfg}, nil
case "kimi":
return &kimiBackend{cfg: cfg}, nil
default:
return nil, fmt.Errorf("unknown agent type: %q (supported: claude, codex, copilot, opencode, openclaw, hermes, gemini, pi, cursor)", agentType)
return nil, fmt.Errorf("unknown agent type: %q (supported: claude, codex, copilot, opencode, openclaw, hermes, gemini, pi, cursor, kimi)", agentType)
}
}
@@ -142,6 +145,7 @@ var launchHeaders = map[string]string{
"openclaw": "openclaw agent (json)",
"opencode": "opencode run (json)",
"pi": "pi (json mode)",
"kimi": "kimi acp",
}
// LaunchHeader returns the user-visible launch skeleton for agentType, or an

View File

@@ -72,7 +72,7 @@ func TestLaunchHeaderCoversAllSupportedBackends(t *testing.T) {
// entry to launchHeaders in agent.go and extend this list.
supported := []string{
"claude", "codex", "copilot", "cursor", "gemini",
"hermes", "openclaw", "opencode", "pi",
"hermes", "kimi", "openclaw", "opencode", "pi",
}
for _, t_ := range supported {
if header := LaunchHeader(t_); header == "" {

View File

@@ -8,6 +8,7 @@ import (
"io"
"os/exec"
"regexp"
"strconv"
"strings"
"sync"
"time"
@@ -73,7 +74,7 @@ func (b *hermesBackend) Execute(ctx context.Context, prompt string, opts ExecOpt
// without this we'd report a misleading "empty output" and hide
// the real cause (wrong model for the current provider, bad
// credentials, rate limit, …) in the daemon log.
providerErr := newHermesProviderErrorSniffer()
providerErr := newACPProviderErrorSniffer("hermes")
cmd.Stderr = io.MultiWriter(newLogWriter(b.cfg.Logger, "[hermes:stderr] "), providerErr)
if err := cmd.Start(); err != nil {
@@ -92,9 +93,10 @@ func (b *hermesBackend) Execute(ctx context.Context, prompt string, opts ExecOpt
promptDone := make(chan hermesPromptResult, 1)
c := &hermesClient{
cfg: b.cfg,
stdin: stdin,
pending: make(map[int]*pendingRPC),
cfg: b.cfg,
stdin: stdin,
pending: make(map[int]*pendingRPC),
pendingTools: make(map[string]*pendingToolCall),
onMessage: func(msg Message) {
if msg.Type == MessageText {
outputMu.Lock()
@@ -188,7 +190,7 @@ func (b *hermesBackend) Execute(ctx context.Context, prompt string, opts ExecOpt
resCh <- Result{Status: finalStatus, Error: finalError, DurationMs: time.Since(startTime).Milliseconds()}
return
}
sessionID = extractHermesSessionID(result)
sessionID = extractACPSessionID(result)
if sessionID == "" {
finalStatus = "failed"
finalError = "hermes session/new returned no session ID"
@@ -336,6 +338,7 @@ type hermesPromptResult struct {
type hermesClient struct {
cfg Config
stdin interface{ Write([]byte) (int, error) }
writeMu sync.Mutex // serialises stdin.Write calls across goroutines
mu sync.Mutex
nextID int
pending map[int]*pendingRPC
@@ -343,10 +346,39 @@ type hermesClient struct {
onMessage func(Message)
onPromptDone func(hermesPromptResult)
// pendingTools buffers the args for tool calls whose input streams in
// across multiple ACP tool_call_update messages (kimi does this —
// tokens from the LLM arrive one at a time, and each update carries
// the cumulative args JSON so far). We defer emitting MessageToolUse
// until we either see status=completed/failed or have a full arg set,
// so the UI never sees a half-written command like `{"comma`.
toolMu sync.Mutex
pendingTools map[string]*pendingToolCall
usageMu sync.Mutex
usage TokenUsage
}
// pendingToolCall buffers state for a tool call while its arguments
// are streaming in. One entry per ACP toolCallId.
type pendingToolCall struct {
toolName string // already mapped via hermesToolNameFromTitle
input map[string]any // from rawInput when the agent sends it up front (hermes)
argsText string // accumulated `content[].text` args (kimi, cumulative)
emitted bool // whether we've already sent MessageToolUse
}
// writeLine serialises concurrent JSON-RPC writes so request() (main
// goroutine) and handleAgentRequest() (reader goroutine) don't
// interleave frames. The pipe itself is atomic for small writes, but
// we also want deterministic ordering under contention.
func (c *hermesClient) writeLine(data []byte) error {
c.writeMu.Lock()
defer c.writeMu.Unlock()
_, err := c.stdin.Write(data)
return err
}
func (c *hermesClient) request(ctx context.Context, method string, params any) (json.RawMessage, error) {
c.mu.Lock()
id := c.nextID
@@ -369,7 +401,7 @@ func (c *hermesClient) request(ctx context.Context, method string, params any) (
return nil, err
}
data = append(data, '\n')
if _, err := c.stdin.Write(data); err != nil {
if err := c.writeLine(data); err != nil {
c.mu.Lock()
delete(c.pending, id)
c.mu.Unlock()
@@ -402,7 +434,11 @@ func (c *hermesClient) handleLine(line string) {
return
}
// Check if it's a response to our request (has id + result or error).
// Agent → client request: has id + method (no result / error yet).
// Kimi uses this for session/request_permission; if we don't answer,
// the agent blocks for 300s and the task hangs. Hermes doesn't send
// these when launched with HERMES_YOLO_MODE=1, but we still handle
// the case generically for any future ACP backend we bolt on.
if _, hasID := raw["id"]; hasID {
if _, hasResult := raw["result"]; hasResult {
c.handleResponse(raw)
@@ -412,6 +448,10 @@ func (c *hermesClient) handleLine(line string) {
c.handleResponse(raw)
return
}
if _, hasMethod := raw["method"]; hasMethod {
c.handleAgentRequest(raw)
return
}
}
// Notification (no id, has method) — session updates from Hermes.
@@ -420,6 +460,62 @@ func (c *hermesClient) handleLine(line string) {
}
}
// handleAgentRequest replies to JSON-RPC requests the agent sends
// us (agent → client direction). The only one we care about today is
// `session/request_permission`: the daemon is headless and cannot
// actually prompt a user, so we auto-approve every action. Using
// `approve_for_session` rather than `approve` means subsequent
// identical actions (every Shell invocation, every file write) don't
// round-trip through us — the agent remembers them locally.
func (c *hermesClient) handleAgentRequest(raw map[string]json.RawMessage) {
var method string
_ = json.Unmarshal(raw["method"], &method)
rawID, ok := raw["id"]
if !ok {
return
}
var resp map[string]any
switch method {
case "session/request_permission":
resp = map[string]any{
"jsonrpc": "2.0",
"id": json.RawMessage(rawID),
"result": map[string]any{
"outcome": map[string]any{
"outcome": "selected",
"optionId": "approve_for_session",
},
},
}
c.cfg.Logger.Debug("auto-approved agent permission request", "method", method)
default:
// Unknown agent→client method — reply with standard "method
// not found" so the agent doesn't block waiting for us. Better
// than silence: the agent can decide how to proceed.
resp = map[string]any{
"jsonrpc": "2.0",
"id": json.RawMessage(rawID),
"error": map[string]any{
"code": -32601,
"message": "method not found: " + method,
},
}
c.cfg.Logger.Debug("unhandled agent→client request", "method", method)
}
data, err := json.Marshal(resp)
if err != nil {
c.cfg.Logger.Warn("marshal agent-request response", "method", method, "error", err)
return
}
data = append(data, '\n')
if err := c.writeLine(data); err != nil {
c.cfg.Logger.Warn("write agent-request response", "method", method, "error", err)
}
}
func (c *hermesClient) handleResponse(raw map[string]json.RawMessage) {
var id int
if err := json.Unmarshal(raw["id"], &id); err != nil {
@@ -560,51 +656,283 @@ func (c *hermesClient) handleAgentThought(data json.RawMessage) {
func (c *hermesClient) handleToolCallStart(data json.RawMessage) {
var msg struct {
ToolCallID string `json:"toolCallId"`
Title string `json:"title"`
Kind string `json:"kind"`
RawInput map[string]any `json:"rawInput"`
ToolCallID string `json:"toolCallId"`
Title string `json:"title"`
Kind string `json:"kind"`
RawInput map[string]any `json:"rawInput"`
Content []json.RawMessage `json:"content"`
}
if err := json.Unmarshal(data, &msg); err != nil {
return
}
toolName := hermesToolNameFromTitle(msg.Title, msg.Kind)
if c.onMessage != nil {
c.onMessage(Message{
Type: MessageToolUse,
Tool: toolName,
CallID: msg.ToolCallID,
Input: msg.RawInput,
// Hermes pre-populates rawInput on the initial tool_call — emit
// MessageToolUse immediately so the UI can show the tool invocation
// live. Record the emission so handleToolCallUpdate doesn't re-emit
// on completion.
if msg.RawInput != nil {
c.trackTool(msg.ToolCallID, &pendingToolCall{
toolName: toolName,
input: msg.RawInput,
emitted: true,
})
if c.onMessage != nil {
c.onMessage(Message{
Type: MessageToolUse,
Tool: toolName,
CallID: msg.ToolCallID,
Input: msg.RawInput,
})
}
return
}
// Kimi streams args token-by-token across tool_call_update messages;
// the initial tool_call often carries an empty content block. Buffer
// the tool and defer MessageToolUse emission to avoid the UI seeing
// a command with `{""` as its input.
c.trackTool(msg.ToolCallID, &pendingToolCall{
toolName: toolName,
argsText: extractACPToolCallText(msg.Content),
emitted: false,
})
}
func (c *hermesClient) handleToolCallUpdate(data json.RawMessage) {
var msg struct {
ToolCallID string `json:"toolCallId"`
Status string `json:"status"`
Kind string `json:"kind"`
RawOutput string `json:"rawOutput"`
ToolCallID string `json:"toolCallId"`
Status string `json:"status"`
Title string `json:"title"`
Kind string `json:"kind"`
RawInput map[string]any `json:"rawInput"`
RawOutput string `json:"rawOutput"`
Content []json.RawMessage `json:"content"`
}
if err := json.Unmarshal(data, &msg); err != nil {
return
}
// Only emit tool result when the call is completed.
// Mid-stream: only buffer updates. Kimi emits many of these per
// tool call, each carrying the cumulative args JSON so far.
if msg.Status != "completed" && msg.Status != "failed" {
if pending := c.getPendingTool(msg.ToolCallID); pending != nil && !pending.emitted {
if text := extractACPToolCallText(msg.Content); text != "" {
// kimi streams the full cumulative args on every frame;
// overwrite rather than concatenate.
pending.argsText = text
}
}
return
}
// Completion: emit any deferred MessageToolUse first, then the result.
pending := c.takePendingTool(msg.ToolCallID)
c.emitDeferredToolUse(pending, msg.ToolCallID, msg.Title, msg.Kind, msg.RawInput)
output := msg.RawOutput
if output == "" {
output = extractACPToolCallText(msg.Content)
}
if c.onMessage != nil {
c.onMessage(Message{
Type: MessageToolResult,
CallID: msg.ToolCallID,
Output: msg.RawOutput,
Output: output,
})
}
}
// trackTool stores pending-tool state for a given callID. Lazy-inits
// the map so zero-value hermesClient values (common in tests) don't
// panic on the first tool call.
func (c *hermesClient) trackTool(callID string, p *pendingToolCall) {
c.toolMu.Lock()
defer c.toolMu.Unlock()
if c.pendingTools == nil {
c.pendingTools = make(map[string]*pendingToolCall)
}
c.pendingTools[callID] = p
}
// getPendingTool returns the pending entry (may be nil) without
// removing it. Safe to call on a zero-value hermesClient.
func (c *hermesClient) getPendingTool(callID string) *pendingToolCall {
c.toolMu.Lock()
defer c.toolMu.Unlock()
if c.pendingTools == nil {
return nil
}
return c.pendingTools[callID]
}
// takePendingTool removes and returns the pending entry, or nil if
// none was tracked (e.g. the tool completed before we saw its start,
// or we missed the start frame).
func (c *hermesClient) takePendingTool(callID string) *pendingToolCall {
c.toolMu.Lock()
defer c.toolMu.Unlock()
if c.pendingTools == nil {
return nil
}
p := c.pendingTools[callID]
delete(c.pendingTools, callID)
return p
}
// emitDeferredToolUse emits a buffered MessageToolUse right before the
// matching MessageToolResult. Handles three cases:
// - hermes tool: already emitted on tool_call → skip
// - kimi tool with streamed args → parse accumulated JSON as Input
// - unknown tool (completed arrived without a start frame) →
// synthesize minimal info from the update's own fields
func (c *hermesClient) emitDeferredToolUse(
p *pendingToolCall,
callID, updateTitle, updateKind string,
updateRawInput map[string]any,
) {
if p != nil && p.emitted {
return
}
var toolName string
var input map[string]any
switch {
case p != nil && p.input != nil:
// Pre-buffered rawInput path — shouldn't happen because we set
// emitted=true in that case, but handle defensively.
toolName = p.toolName
input = p.input
case p != nil:
toolName = p.toolName
input = parseToolArgsJSON(p.argsText)
default:
// No record of the start frame — fall back to the update's own
// title/kind/rawInput so the UI at least sees the tool name.
toolName = hermesToolNameFromTitle(updateTitle, updateKind)
input = updateRawInput
}
if c.onMessage == nil {
return
}
c.onMessage(Message{
Type: MessageToolUse,
Tool: toolName,
CallID: callID,
Input: input,
})
}
// parseToolArgsJSON turns kimi's accumulated args string into the
// structured map the UI expects under Message.Input. Kimi sends args
// as a JSON-encoded object (`{"command":"echo hi"}`), so a full JSON
// parse recovers the original tool-arg shape. On malformed input
// (streaming glitch, non-JSON tool) we preserve the raw text under a
// `text` key so the UI still has something to render.
func parseToolArgsJSON(argsText string) map[string]any {
argsText = strings.TrimSpace(argsText)
if argsText == "" {
return nil
}
var m map[string]any
if err := json.Unmarshal([]byte(argsText), &m); err == nil {
return m
}
return map[string]any{"text": argsText}
}
// extractACPToolCallText concatenates the rendered text of every ACP
// block in a tool_call / tool_call_update's `content` array.
//
// Handles the two block types kimi emits:
// - {type:"content", content:{type:"text", text:"..."}} — plain text
// (shell output, tool args). Text is concatenated verbatim.
// - {type:"diff", path, oldText, newText} — FileEdit output. Rendered
// as a minimal unified-diff header so the UI distinguishes writes
// from reads without needing a diff viewer.
//
// Terminal blocks ({type:"terminal", terminalId}) reference a remote
// terminal the client would normally subscribe to via terminal/output;
// we don't advertise terminal capability so we never receive those in
// practice, but if one slips through we skip it (nothing useful to
// surface from a bare ID).
func extractACPToolCallText(blocks []json.RawMessage) string {
var b strings.Builder
appendPiece := func(piece string) {
if piece == "" {
return
}
if b.Len() > 0 {
b.WriteByte('\n')
}
b.WriteString(piece)
}
for _, raw := range blocks {
var kind struct {
Type string `json:"type"`
}
if err := json.Unmarshal(raw, &kind); err != nil {
continue
}
switch kind.Type {
case "content":
var outer struct {
Content json.RawMessage `json:"content"`
}
if err := json.Unmarshal(raw, &outer); err != nil || len(outer.Content) == 0 {
continue
}
var inner struct {
Type string `json:"type"`
Text string `json:"text"`
}
if err := json.Unmarshal(outer.Content, &inner); err != nil {
continue
}
if inner.Type != "text" {
continue
}
appendPiece(inner.Text)
case "diff":
var diff struct {
Path string `json:"path"`
OldText string `json:"oldText"`
NewText string `json:"newText"`
}
if err := json.Unmarshal(raw, &diff); err != nil || diff.Path == "" {
continue
}
// Keep it tiny — a full unified diff can be huge and we're
// really just recording "this tool wrote to this file".
// The UI can re-read the file if it needs the actual content.
var piece strings.Builder
piece.WriteString("--- ")
piece.WriteString(diff.Path)
piece.WriteString("\n+++ ")
piece.WriteString(diff.Path)
if diff.OldText == "" {
piece.WriteString("\n(new file, ")
piece.WriteString(strconv.Itoa(len(diff.NewText)))
piece.WriteString(" bytes)")
} else {
piece.WriteString("\n(edited: ")
piece.WriteString(strconv.Itoa(len(diff.OldText)))
piece.WriteString(" → ")
piece.WriteString(strconv.Itoa(len(diff.NewText)))
piece.WriteString(" bytes)")
}
appendPiece(piece.String())
default:
// terminal blocks, image blocks, unknown future types —
// ignore. We have no way to inline-render them.
}
}
return b.String()
}
func (c *hermesClient) handleUsageUpdate(data json.RawMessage) {
var msg struct {
Usage struct {
@@ -634,7 +962,10 @@ func (c *hermesClient) handleUsageUpdate(data json.RawMessage) {
// ── Helpers ──
func extractHermesSessionID(result json.RawMessage) string {
// extractACPSessionID pulls `sessionId` out of a session/new or
// session/resume response. Shared by all ACP backends (hermes, kimi,
// and anything else that follows the standard ACP schema).
func extractACPSessionID(result json.RawMessage) string {
var r struct {
SessionID string `json:"sessionId"`
}
@@ -697,46 +1028,64 @@ func hermesToolNameFromTitle(title string, kind string) string {
case "think":
return "thinking"
default:
// Preserve a non-empty title when we can't classify it: kimi
// emits bare titles like "Shell" or "Read file" without any
// `kind`, so returning an empty string here drops the tool
// name entirely before kimiToolNameFromTitle can map it.
// Hermes titles always carry a colon, so hermes never reaches
// this branch with a non-empty title.
if title != "" {
return title
}
return kind
}
}
// ── Provider-error sniffing ──
//
// hermes' session/prompt RPC reports stopReason=end_turn even when
// the underlying HTTP call to the configured LLM endpoint returned
// an error — the actionable detail only appears on stderr (e.g.
// ACP agents (hermes, kimi, …) all have the same failure mode:
// session/prompt reports stopReason=end_turn even when the underlying
// HTTP call to the configured LLM endpoint returned an error — the
// actionable detail only appears on stderr (e.g.
// `⚠️ API call failed (attempt 1/3): BadRequestError [HTTP 400]` and
// `Error: HTTP 400: Error code: 400 - {'detail': "The '...' model
// is not supported when using Codex with a ChatGPT account."}`).
// We scan for those patterns so the daemon can surface a real
// failure instead of a generic "empty output".
type hermesProviderErrorSniffer struct {
mu sync.Mutex
remains []byte // buffer for a partial trailing line across writes
lines []string // captured error lines, bounded
seen map[string]bool
// The sniffer scans for those patterns so the daemon can surface a
// real failure instead of a generic "empty output".
//
// Parameterised by provider name so both hermes and kimi can share
// the transport: the regexes match format-level signals (HTTP status,
// error-kind tags, "API call failed" banner) that both runtimes emit.
type acpProviderErrorSniffer struct {
provider string
mu sync.Mutex
remains []byte // buffer for a partial trailing line across writes
lines []string // captured error lines, bounded
seen map[string]bool
}
// hermesErrorHeaderRe matches the first line of an API-error block.
// Hermes prefixes these with ⚠️ / ❌ and includes an HTTP status
// code or a non-retryable-error tag.
var hermesErrorHeaderRe = regexp.MustCompile(`(?:⚠️|❌|\[ERROR\]).*(?:BadRequestError|AuthenticationError|RateLimitError|HTTP [0-9]{3}|Non-retryable|API call failed)`)
// acpErrorHeaderRe matches the first line of an API-error block.
// ACP agents typically prefix these with ⚠️ / ❌ and include an HTTP
// status code or a non-retryable-error tag.
var acpErrorHeaderRe = regexp.MustCompile(`(?:⚠️|❌|\[ERROR\]).*(?:BadRequestError|AuthenticationError|RateLimitError|HTTP [0-9]{3}|Non-retryable|API call failed)`)
// hermesErrorDetailRe pulls the most useful single-line messages
// out of the subsequent lines of the error block (the one whose
// "Error:" or "Details:" tag actually spells out what happened).
var hermesErrorDetailRe = regexp.MustCompile(`(?:Error:|detail:|Details:)\s*(.+)`)
// acpErrorDetailRe pulls the most useful single-line messages out of
// the subsequent lines of the error block (the one whose "Error:" or
// "Details:" tag actually spells out what happened).
var acpErrorDetailRe = regexp.MustCompile(`(?:Error:|detail:|Details:)\s*(.+)`)
const hermesMaxErrorLines = 8
const acpMaxErrorLines = 8
func newHermesProviderErrorSniffer() *hermesProviderErrorSniffer {
return &hermesProviderErrorSniffer{seen: map[string]bool{}}
// newACPProviderErrorSniffer returns a sniffer that tags its messages
// with the given provider name (e.g. "hermes", "kimi") so failure
// strings make it obvious which runtime produced the error.
func newACPProviderErrorSniffer(provider string) *acpProviderErrorSniffer {
return &acpProviderErrorSniffer{provider: provider, seen: map[string]bool{}}
}
// Write implements io.Writer so the sniffer can sit behind an
// io.MultiWriter next to the normal stderr log forwarder.
func (s *hermesProviderErrorSniffer) Write(p []byte) (int, error) {
func (s *acpProviderErrorSniffer) Write(p []byte) (int, error) {
s.mu.Lock()
defer s.mu.Unlock()
@@ -757,7 +1106,7 @@ func (s *hermesProviderErrorSniffer) Write(p []byte) (int, error) {
if line == "" {
continue
}
if !(hermesErrorHeaderRe.MatchString(line) || hermesErrorDetailRe.MatchString(line)) {
if !(acpErrorHeaderRe.MatchString(line) || acpErrorDetailRe.MatchString(line)) {
continue
}
if s.seen[line] {
@@ -765,8 +1114,8 @@ func (s *hermesProviderErrorSniffer) Write(p []byte) (int, error) {
}
s.seen[line] = true
s.lines = append(s.lines, line)
if len(s.lines) > hermesMaxErrorLines {
s.lines = s.lines[len(s.lines)-hermesMaxErrorLines:]
if len(s.lines) > acpMaxErrorLines {
s.lines = s.lines[len(s.lines)-acpMaxErrorLines:]
}
}
return len(p), nil
@@ -776,21 +1125,22 @@ func (s *hermesProviderErrorSniffer) Write(p []byte) (int, error) {
// error field. Prefers the most specific "Error:" / "detail:"
// fragment; falls back to the first captured header line; empty
// when nothing useful was seen.
func (s *hermesProviderErrorSniffer) message() string {
func (s *acpProviderErrorSniffer) message() string {
s.mu.Lock()
defer s.mu.Unlock()
prefix := s.provider + " provider error: "
for _, line := range s.lines {
if m := hermesErrorDetailRe.FindStringSubmatch(line); m != nil {
if m := acpErrorDetailRe.FindStringSubmatch(line); m != nil {
detail := strings.TrimSpace(m[1])
if detail != "" {
return "hermes provider error: " + detail
return prefix + detail
}
}
}
for _, line := range s.lines {
if hermesErrorHeaderRe.MatchString(line) {
return "hermes provider error: " + line
if acpErrorHeaderRe.MatchString(line) {
return prefix + line
}
}
return ""

View File

@@ -2,7 +2,9 @@ package agent
import (
"encoding/json"
"log/slog"
"strings"
"sync"
"testing"
)
@@ -17,30 +19,30 @@ func TestNewReturnsHermesBackend(t *testing.T) {
}
}
// ── extractHermesSessionID ──
// ── extractACPSessionID ──
func TestExtractHermesSessionID(t *testing.T) {
func TestExtractACPSessionID(t *testing.T) {
t.Parallel()
raw := json.RawMessage(`{"sessionId":"20260410_141145_47260c"}`)
got := extractHermesSessionID(raw)
got := extractACPSessionID(raw)
if got != "20260410_141145_47260c" {
t.Errorf("got %q, want %q", got, "20260410_141145_47260c")
}
}
func TestExtractHermesSessionIDEmpty(t *testing.T) {
func TestExtractACPSessionIDEmpty(t *testing.T) {
t.Parallel()
raw := json.RawMessage(`{}`)
got := extractHermesSessionID(raw)
got := extractACPSessionID(raw)
if got != "" {
t.Errorf("got %q, want empty", got)
}
}
func TestExtractHermesSessionIDInvalidJSON(t *testing.T) {
func TestExtractACPSessionIDInvalidJSON(t *testing.T) {
t.Parallel()
raw := json.RawMessage(`not json`)
got := extractHermesSessionID(raw)
got := extractACPSessionID(raw)
if got != "" {
t.Errorf("got %q, want empty", got)
}
@@ -65,14 +67,24 @@ func TestHermesToolNameFromTitle(t *testing.T) {
{"delegate: fix the bug", "execute", "delegate_task"},
{"analyze image: what is this?", "read", "vision_analyze"},
{"execute code", "execute", "execute_code"},
// Fallback to kind when no colon in title.
// Fallback to kind when no colon in title but kind is known.
{"unknownTool", "read", "read_file"},
{"unknownTool", "edit", "write_file"},
{"unknownTool", "execute", "terminal"},
{"unknownTool", "search", "search_files"},
{"unknownTool", "fetch", "web_search"},
{"unknownTool", "think", "thinking"},
{"unknownTool", "other", "other"},
// Bare title (no colon, no known kind) — preserve the title
// itself rather than falling back to an unclassified kind.
// Matters for kimi: its ACP `tool_call` updates emit a bare
// `title: "Shell"` with no `kind`, and we need downstream
// normalisation (kimiToolNameFromTitle) to see "Shell" rather
// than an empty string.
{"Shell", "", "Shell"},
{"Read file", "", "Read file"},
{"unknownTool", "other", "unknownTool"},
// Empty title falls back to kind, even when kind isn't known.
{"", "other", "other"},
// Tool with colon but not in known map.
{"custom_tool: args", "other", "custom_tool"},
}
@@ -101,7 +113,7 @@ func TestHermesClientHandleLineResponse(t *testing.T) {
if res.err != nil {
t.Fatalf("unexpected error: %v", res.err)
}
sid := extractHermesSessionID(res.result)
sid := extractACPSessionID(res.result)
if sid != "ses_abc" {
t.Errorf("sessionId: got %q, want %q", sid, "ses_abc")
}
@@ -127,6 +139,111 @@ func TestHermesClientHandleLineError(t *testing.T) {
}
}
// ── agent → client request handling ──
// bufferWriter is a test stand-in for cmd.StdinPipe that captures
// writes in-memory so we can assert what handleAgentRequest emitted.
type bufferWriter struct {
mu sync.Mutex
buf strings.Builder
}
func (b *bufferWriter) Write(p []byte) (int, error) {
b.mu.Lock()
defer b.mu.Unlock()
return b.buf.WriteString(string(p))
}
func (b *bufferWriter) String() string {
b.mu.Lock()
defer b.mu.Unlock()
return b.buf.String()
}
// TestHermesClientAutoApprovesPermissionRequest asserts that when an
// ACP agent sends us `session/request_permission` (kimi does this on
// every Shell / file-mutating tool call), the client replies with
// `approve_for_session` — without this the agent blocks 300s and the
// task hangs. The id in the reply must match the agent's request id
// so its in-flight future resolves.
func TestHermesClientAutoApprovesPermissionRequest(t *testing.T) {
t.Parallel()
w := &bufferWriter{}
c := &hermesClient{
cfg: Config{Logger: slog.Default()},
stdin: w,
pending: make(map[int]*pendingRPC),
}
c.handleLine(`{"jsonrpc":"2.0","id":42,"method":"session/request_permission","params":{"sessionId":"ses_1","options":[{"optionId":"approve","name":"Approve once","kind":"allow_once"},{"optionId":"approve_for_session","name":"Approve for this session","kind":"allow_always"},{"optionId":"reject","name":"Reject","kind":"reject_once"}],"toolCall":{"toolCallId":"tc_1","title":"Shell","content":[]}}}`)
got := w.String()
var resp struct {
JSONRPC string `json:"jsonrpc"`
ID int `json:"id"`
Result struct {
Outcome struct {
Outcome string `json:"outcome"`
OptionID string `json:"optionId"`
} `json:"outcome"`
} `json:"result"`
}
if err := json.Unmarshal([]byte(strings.TrimSpace(got)), &resp); err != nil {
t.Fatalf("reply is not valid JSON: %q err=%v", got, err)
}
if resp.JSONRPC != "2.0" {
t.Errorf("jsonrpc: got %q, want 2.0", resp.JSONRPC)
}
if resp.ID != 42 {
t.Errorf("id: got %d, want 42 (must echo agent's request id)", resp.ID)
}
if resp.Result.Outcome.Outcome != "selected" {
t.Errorf("outcome.outcome: got %q, want %q", resp.Result.Outcome.Outcome, "selected")
}
if resp.Result.Outcome.OptionID != "approve_for_session" {
t.Errorf("outcome.optionId: got %q, want %q", resp.Result.Outcome.OptionID, "approve_for_session")
}
}
// TestHermesClientReplesMethodNotFoundForUnknownAgentRequest ensures
// that any agent → client request we don't explicitly handle gets a
// proper JSON-RPC error back, not silence. Silence would block the
// agent for however long its internal timeout is, same as the
// session/request_permission hang this change fixes.
func TestHermesClientReplesMethodNotFoundForUnknownAgentRequest(t *testing.T) {
t.Parallel()
w := &bufferWriter{}
c := &hermesClient{
cfg: Config{Logger: slog.Default()},
stdin: w,
pending: make(map[int]*pendingRPC),
}
c.handleLine(`{"jsonrpc":"2.0","id":7,"method":"fs/read_text_file","params":{"path":"/tmp/x"}}`)
got := w.String()
var resp struct {
ID int `json:"id"`
Error struct {
Code int `json:"code"`
Message string `json:"message"`
} `json:"error"`
}
if err := json.Unmarshal([]byte(strings.TrimSpace(got)), &resp); err != nil {
t.Fatalf("reply not valid JSON: %q err=%v", got, err)
}
if resp.ID != 7 {
t.Errorf("id echo: got %d, want 7", resp.ID)
}
if resp.Error.Code != -32601 {
t.Errorf("error code: got %d, want -32601 (method not found)", resp.Error.Code)
}
if !strings.Contains(resp.Error.Message, "fs/read_text_file") {
t.Errorf("error message should name the unhandled method, got %q", resp.Error.Message)
}
}
// ── session/update notification handling ──
func TestHermesClientHandleAgentMessage(t *testing.T) {
@@ -226,6 +343,211 @@ func TestHermesClientHandleToolCallComplete(t *testing.T) {
}
}
// TestHermesClientKimiStreamingToolCall walks the real kimi frame
// sequence for a single Shell call:
// 1. tool_call with empty content (LLM hasn't started emitting args yet)
// 2. tool_call_update status=in_progress carrying the cumulative args
// JSON character-by-character ("{", "{\"command", …)
// 3. tool_call_update status=completed carrying the command's stdout
//
// The client must defer MessageToolUse until we have the full args so
// the UI doesn't show a command like `{"comma` — and the MessageToolUse
// must carry the parsed args as the Input map (`{"command": "echo hi"}`
// → Input["command"] = "echo hi") rather than a raw string.
func TestHermesClientKimiStreamingToolCall(t *testing.T) {
t.Parallel()
var got []Message
c := &hermesClient{
pending: make(map[int]*pendingRPC),
onMessage: func(msg Message) {
got = append(got, msg)
},
}
// 1. tool_call: empty content (classic kimi start frame).
c.handleLine(`{"jsonrpc":"2.0","method":"session/update","params":{"sessionId":"ses_1","update":{"sessionUpdate":"tool_call","toolCallId":"tc-kimi-1","title":"Shell","status":"in_progress","content":[{"type":"content","content":{"type":"text","text":""}}]}}}`)
if len(got) != 0 {
t.Fatalf("expected nothing emitted yet (args empty), got %+v", got)
}
// 2. Streaming updates — cumulative args JSON.
partials := []string{
`{"`,
`{"command`,
`{"command":`,
`{"command":"echo `,
`{"command":"echo hi"}`,
}
for _, args := range partials {
// JSON-encode args so embedded quotes are escaped properly.
argsJSON, _ := json.Marshal(args)
line := `{"jsonrpc":"2.0","method":"session/update","params":{"sessionId":"ses_1","update":{"sessionUpdate":"tool_call_update","toolCallId":"tc-kimi-1","status":"in_progress","content":[{"type":"content","content":{"type":"text","text":` + string(argsJSON) + `}}]}}}`
c.handleLine(line)
}
if len(got) != 0 {
t.Fatalf("expected nothing emitted mid-stream, got %+v", got)
}
// 3. Completed — stdout.
c.handleLine(`{"jsonrpc":"2.0","method":"session/update","params":{"sessionId":"ses_1","update":{"sessionUpdate":"tool_call_update","toolCallId":"tc-kimi-1","status":"completed","content":[{"type":"content","content":{"type":"text","text":"hi\n"}}]}}}`)
if len(got) != 2 {
t.Fatalf("expected [MessageToolUse, MessageToolResult], got %d: %+v", len(got), got)
}
if got[0].Type != MessageToolUse {
t.Errorf("first message: got %v, want MessageToolUse", got[0].Type)
}
if got[0].CallID != "tc-kimi-1" {
t.Errorf("first.callID: got %q", got[0].CallID)
}
if cmd, _ := got[0].Input["command"].(string); cmd != "echo hi" {
t.Errorf("first.Input.command: got %v, want %q", got[0].Input["command"], "echo hi")
}
if got[1].Type != MessageToolResult {
t.Errorf("second message: got %v, want MessageToolResult", got[1].Type)
}
if got[1].Output != "hi\n" {
t.Errorf("second.output: got %q, want %q", got[1].Output, "hi\n")
}
}
// TestHermesClientKimiMalformedArgsFallback: if the accumulated args
// aren't valid JSON (streaming glitch, tool with non-JSON args), we
// still surface the text under Input.text rather than silently
// dropping it.
func TestHermesClientKimiMalformedArgsFallback(t *testing.T) {
t.Parallel()
var got []Message
c := &hermesClient{
pending: make(map[int]*pendingRPC),
onMessage: func(msg Message) {
got = append(got, msg)
},
}
c.handleLine(`{"jsonrpc":"2.0","method":"session/update","params":{"sessionId":"ses_1","update":{"sessionUpdate":"tool_call","toolCallId":"tc","title":"Shell","status":"in_progress","content":[{"type":"content","content":{"type":"text","text":"not-json"}}]}}}`)
c.handleLine(`{"jsonrpc":"2.0","method":"session/update","params":{"sessionId":"ses_1","update":{"sessionUpdate":"tool_call_update","toolCallId":"tc","status":"completed","content":[{"type":"content","content":{"type":"text","text":"output"}}]}}}`)
if len(got) < 1 {
t.Fatalf("expected ToolUse+ToolResult, got %+v", got)
}
if text, _ := got[0].Input["text"].(string); text != "not-json" {
t.Errorf("fallback Input.text: got %v", got[0].Input["text"])
}
}
// TestHermesClientHandleToolCallCompleteOrphan: if a completion frame
// arrives without a preceding tool_call (out-of-order / missed frame),
// still emit ToolUse synthesised from the update's own title/rawInput
// before ToolResult. Keeps the UI from showing a bare result with no
// header.
func TestHermesClientHandleToolCallCompleteOrphan(t *testing.T) {
t.Parallel()
var got []Message
c := &hermesClient{
pending: make(map[int]*pendingRPC),
onMessage: func(msg Message) {
got = append(got, msg)
},
}
c.handleLine(`{"jsonrpc":"2.0","method":"session/update","params":{"sessionId":"ses_1","update":{"sessionUpdate":"tool_call_update","toolCallId":"tc","status":"completed","title":"terminal: ls","kind":"execute","rawInput":{"command":"ls"},"content":[{"type":"content","content":{"type":"text","text":"file.go\n"}}]}}}`)
if len(got) != 2 || got[0].Type != MessageToolUse || got[1].Type != MessageToolResult {
t.Fatalf("expected [ToolUse, ToolResult], got %+v", got)
}
if got[0].Tool != "terminal" {
t.Errorf("orphan ToolUse tool: got %q", got[0].Tool)
}
if cmd, _ := got[0].Input["command"].(string); cmd != "ls" {
t.Errorf("orphan ToolUse input.command: got %v", got[0].Input["command"])
}
if got[1].Output != "file.go\n" {
t.Errorf("ToolResult output: got %q", got[1].Output)
}
}
// TestHermesClientHandleToolCallRawOutputTakesPrecedence keeps hermes
// behaviour unchanged: when the update has both `rawOutput` (hermes
// convention) and `content` (would be ambiguous), honour rawOutput.
func TestHermesClientHandleToolCallRawOutputTakesPrecedence(t *testing.T) {
t.Parallel()
var got Message
c := &hermesClient{
pending: make(map[int]*pendingRPC),
onMessage: func(msg Message) {
got = msg
},
}
line := `{"jsonrpc":"2.0","method":"session/update","params":{"sessionId":"ses_1","update":{"sessionUpdate":"tool_call_update","toolCallId":"tc","status":"completed","rawOutput":"raw wins","content":[{"type":"content","content":{"type":"text","text":"ignored"}}]}}}`
c.handleLine(line)
if got.Output != "raw wins" {
t.Errorf("output: got %q, want %q", got.Output, "raw wins")
}
}
func TestExtractACPToolCallText(t *testing.T) {
t.Parallel()
tests := []struct {
name string
json string
want string
}{
{
name: "single text block",
json: `[{"type":"content","content":{"type":"text","text":"hello"}}]`,
want: "hello",
},
{
name: "multiple text blocks join with newline",
json: `[{"type":"content","content":{"type":"text","text":"a"}},{"type":"content","content":{"type":"text","text":"b"}}]`,
want: "a\nb",
},
{
name: "terminal blocks skipped",
json: `[{"type":"terminal","terminalId":"t1"},{"type":"content","content":{"type":"text","text":"shell out"}}]`,
want: "shell out",
},
{
name: "diff block renders as mini header",
json: `[{"type":"diff","path":"foo.go","oldText":"abc","newText":"abcdef"}]`,
want: "--- foo.go\n+++ foo.go\n(edited: 3 → 6 bytes)",
},
{
name: "new-file diff (no oldText)",
json: `[{"type":"diff","path":"new.go","oldText":"","newText":"hi"}]`,
want: "--- new.go\n+++ new.go\n(new file, 2 bytes)",
},
{
name: "empty array returns empty",
json: `[]`,
want: "",
},
{
name: "no text content",
json: `[{"type":"terminal","terminalId":"t1"}]`,
want: "",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
var blocks []json.RawMessage
if err := json.Unmarshal([]byte(tt.json), &blocks); err != nil {
t.Fatalf("unmarshal: %v", err)
}
if got := extractACPToolCallText(blocks); got != tt.want {
t.Errorf("got %q, want %q", got, tt.want)
}
})
}
}
func TestHermesClientHandleToolCallInProgressIgnored(t *testing.T) {
t.Parallel()
@@ -384,7 +706,7 @@ func TestHermesProviderErrorSniffer(t *testing.T) {
// LLM endpoint rejects the requested model. We verify the
// sniffer extracts the `Error: ...` line so the task error
// tells the user *why* it failed.
s := newHermesProviderErrorSniffer()
s := newACPProviderErrorSniffer("hermes")
lines := []string{
"2026-04-20 23:41:47 [INFO] acp_adapter.server: Prompt on session abc",
`⚠️ API call failed (attempt 1/3): BadRequestError [HTTP 400]`,
@@ -409,7 +731,7 @@ func TestHermesProviderErrorSniffer(t *testing.T) {
func TestHermesProviderErrorSnifferIgnoresInfoLines(t *testing.T) {
t.Parallel()
s := newHermesProviderErrorSniffer()
s := newACPProviderErrorSniffer("hermes")
s.Write([]byte("2026-04-20 23:41:45 [INFO] acp_adapter.entry: Loaded env\n"))
s.Write([]byte("2026-04-20 23:41:47 [INFO] agent.auxiliary_client: Vision auto-detect...\n"))
if msg := s.message(); msg != "" {
@@ -422,7 +744,7 @@ func TestHermesProviderErrorSnifferHandlesPartialLines(t *testing.T) {
// Writer may be called mid-line; the sniffer must buffer until
// it sees a newline so the regex doesn't miss the header.
s := newHermesProviderErrorSniffer()
s := newACPProviderErrorSniffer("hermes")
s.Write([]byte(`⚠️ API call failed (attempt 1/3):`))
s.Write([]byte(` BadRequestError [HTTP 400]` + "\n"))
s.Write([]byte(` 📝 Error: something went wrong` + "\n"))
@@ -435,12 +757,12 @@ func TestHermesProviderErrorSnifferHandlesPartialLines(t *testing.T) {
func TestHermesProviderErrorSnifferBoundedBuffer(t *testing.T) {
t.Parallel()
s := newHermesProviderErrorSniffer()
s := newACPProviderErrorSniffer("hermes")
for i := 0; i < 20; i++ {
// Each line differs so dedup doesn't merge them.
s.Write([]byte(`⚠️ API call failed (HTTP 400) attempt ` + string(rune('a'+i%26)) + `: Non-retryable error` + "\n"))
}
if len(s.lines) > hermesMaxErrorLines {
t.Errorf("sniffer kept %d lines, limit is %d", len(s.lines), hermesMaxErrorLines)
if len(s.lines) > acpMaxErrorLines {
t.Errorf("sniffer kept %d lines, limit is %d", len(s.lines), acpMaxErrorLines)
}
}

382
server/pkg/agent/kimi.go Normal file
View File

@@ -0,0 +1,382 @@
package agent
import (
"bufio"
"context"
"fmt"
"io"
"os/exec"
"strings"
"sync"
"time"
)
// kimiBlockedArgs are flags hardcoded by the daemon that must not be
// overridden by user-configured custom_args. `acp` is the protocol
// subcommand that drives the ACP JSON-RPC transport for Kimi Code CLI;
// overriding it would break the daemon↔Kimi communication contract.
var kimiBlockedArgs = map[string]blockedArgMode{
"acp": blockedStandalone,
}
// kimiBackend implements Backend by spawning `kimi acp` and communicating
// via the ACP (Agent Client Protocol) JSON-RPC 2.0 over stdin/stdout.
//
// Kimi Code CLI (https://github.com/MoonshotAI/kimi-cli) supports ACP out of
// the box via the `kimi acp` subcommand. We reuse the existing hermesClient
// ACP transport since both runtimes speak the same protocol — only the
// binary, env, and tool-name extraction differ.
type kimiBackend struct {
cfg Config
}
func (b *kimiBackend) Execute(ctx context.Context, prompt string, opts ExecOptions) (*Session, error) {
execPath := b.cfg.ExecutablePath
if execPath == "" {
execPath = "kimi"
}
if _, err := exec.LookPath(execPath); err != nil {
return nil, fmt.Errorf("kimi executable not found at %q: %w", execPath, err)
}
timeout := opts.Timeout
if timeout == 0 {
timeout = 20 * time.Minute
}
runCtx, cancel := context.WithTimeout(ctx, timeout)
// `kimi acp` ignores --yolo / --auto-approve (they're flags on the
// root `kimi` command, not on the `acp` subcommand). Instead, the
// daemon auto-approves in hermesClient.handleAgentRequest by replying
// "approve_for_session" to every session/request_permission request.
kimiArgs := append([]string{"acp"}, filterCustomArgs(opts.CustomArgs, kimiBlockedArgs, b.cfg.Logger)...)
cmd := exec.CommandContext(runCtx, execPath, kimiArgs...)
b.cfg.Logger.Debug("agent command", "exec", execPath, "args", kimiArgs)
if opts.Cwd != "" {
cmd.Dir = opts.Cwd
}
cmd.Env = buildEnv(b.cfg.Env)
stdout, err := cmd.StdoutPipe()
if err != nil {
cancel()
return nil, fmt.Errorf("kimi stdout pipe: %w", err)
}
stdin, err := cmd.StdinPipe()
if err != nil {
cancel()
return nil, fmt.Errorf("kimi stdin pipe: %w", err)
}
// Forward stderr to the daemon log *and* sniff provider-level
// errors out of it so we can surface them in the task result.
// Kimi's session/prompt still reports stopReason=end_turn when
// the underlying HTTP call to api.kimi.com returns 4xx/5xx, so
// without this the daemon reports a misleading "empty output"
// and the actionable error (expired token, rate limit, upstream
// 5xx, …) stays buried in the daemon log.
providerErr := newACPProviderErrorSniffer("kimi")
cmd.Stderr = io.MultiWriter(newLogWriter(b.cfg.Logger, "[kimi:stderr] "), providerErr)
if err := cmd.Start(); err != nil {
cancel()
return nil, fmt.Errorf("start kimi: %w", err)
}
b.cfg.Logger.Info("kimi acp started", "pid", cmd.Process.Pid, "cwd", opts.Cwd)
msgCh := make(chan Message, 256)
resCh := make(chan Result, 1)
var outputMu sync.Mutex
var output strings.Builder
promptDone := make(chan hermesPromptResult, 1)
// Reuse the hermesClient ACP transport — Kimi speaks the same protocol.
c := &hermesClient{
cfg: b.cfg,
stdin: stdin,
pending: make(map[int]*pendingRPC),
pendingTools: make(map[string]*pendingToolCall),
onMessage: func(msg Message) {
// hermesClient.handleToolCallStart has already mapped
// the raw ACP title via hermesToolNameFromTitle — which
// covers lowercase hermes-style titles ("read:", "patch
// (replace)", …) but not capitalised kimi-style ones
// ("Read file: …", "Run command: …"). Re-normalise so
// the UI sees consistent snake_case identifiers across
// both backends. No-op when the name is already normal
// form (e.g. already mapped to "read_file").
if msg.Type == MessageToolUse {
msg.Tool = kimiToolNameFromTitle(msg.Tool)
}
if msg.Type == MessageText {
outputMu.Lock()
output.WriteString(msg.Content)
outputMu.Unlock()
}
trySend(msgCh, msg)
},
onPromptDone: func(result hermesPromptResult) {
select {
case promptDone <- result:
default:
}
},
}
// Start reading stdout in background.
readerDone := make(chan struct{})
go func() {
defer close(readerDone)
scanner := bufio.NewScanner(stdout)
scanner.Buffer(make([]byte, 0, 1024*1024), 10*1024*1024)
for scanner.Scan() {
line := strings.TrimSpace(scanner.Text())
if line == "" {
continue
}
c.handleLine(line)
}
c.closeAllPending(fmt.Errorf("kimi process exited"))
}()
// Drive the ACP session lifecycle in a goroutine.
go func() {
defer cancel()
defer close(msgCh)
defer close(resCh)
defer func() {
stdin.Close()
_ = cmd.Wait()
}()
startTime := time.Now()
finalStatus := "completed"
var finalError string
var sessionID string
// 1. Initialize handshake.
_, err := c.request(runCtx, "initialize", map[string]any{
"protocolVersion": 1,
"clientInfo": map[string]any{
"name": "multica-agent-sdk",
"version": "0.2.0",
},
"clientCapabilities": map[string]any{},
})
if err != nil {
finalStatus = "failed"
finalError = fmt.Sprintf("kimi initialize failed: %v", err)
resCh <- Result{Status: finalStatus, Error: finalError, DurationMs: time.Since(startTime).Milliseconds()}
return
}
// 2. Create or resume a session.
cwd := opts.Cwd
if cwd == "" {
cwd = "."
}
if opts.ResumeSessionID != "" {
result, err := c.request(runCtx, "session/resume", map[string]any{
"cwd": cwd,
"sessionId": opts.ResumeSessionID,
})
if err != nil {
finalStatus = "failed"
finalError = fmt.Sprintf("kimi session/resume failed: %v", err)
resCh <- Result{Status: finalStatus, Error: finalError, DurationMs: time.Since(startTime).Milliseconds()}
return
}
sessionID = opts.ResumeSessionID
_ = result
} else {
result, err := c.request(runCtx, "session/new", map[string]any{
"cwd": cwd,
"mcpServers": []any{},
})
if err != nil {
finalStatus = "failed"
finalError = fmt.Sprintf("kimi session/new failed: %v", err)
resCh <- Result{Status: finalStatus, Error: finalError, DurationMs: time.Since(startTime).Milliseconds()}
return
}
sessionID = extractACPSessionID(result)
if sessionID == "" {
finalStatus = "failed"
finalError = "kimi session/new returned no session ID"
resCh <- Result{Status: finalStatus, Error: finalError, DurationMs: time.Since(startTime).Milliseconds()}
return
}
}
c.sessionID = sessionID
b.cfg.Logger.Info("kimi session created", "session_id", sessionID)
// 3. If the caller picked a model (via agent.model from the
// UI dropdown), ask kimi to switch the session to it before
// we send any prompt. Kimi's ACP server exposes
// `session/set_model` and advertises available models via
// the `models.availableModels` block returned by
// `session/new` — we pass the chosen modelId through
// verbatim. This MUST fail the task on error: silently
// falling back to kimi's default model would let the user
// believe their pick was honoured while the task actually
// ran on something else.
if opts.Model != "" {
if _, err := c.request(runCtx, "session/set_model", map[string]any{
"sessionId": sessionID,
"modelId": opts.Model,
}); err != nil {
b.cfg.Logger.Warn("kimi set_session_model failed", "error", err, "requested_model", opts.Model)
finalStatus = "failed"
finalError = fmt.Sprintf("kimi could not switch to model %q: %v", opts.Model, err)
resCh <- Result{
Status: finalStatus,
Error: finalError,
DurationMs: time.Since(startTime).Milliseconds(),
SessionID: sessionID,
}
return
}
b.cfg.Logger.Info("kimi session model set", "model", opts.Model)
}
// 4. Build the prompt content. If we have a system prompt, prepend it.
userText := prompt
if opts.SystemPrompt != "" {
userText = opts.SystemPrompt + "\n\n---\n\n" + prompt
}
// 5. Send the prompt and wait for PromptResponse.
_, err = c.request(runCtx, "session/prompt", map[string]any{
"sessionId": sessionID,
"prompt": []map[string]any{
{"type": "text", "text": userText},
},
})
if err != nil {
if runCtx.Err() == context.DeadlineExceeded {
finalStatus = "timeout"
finalError = fmt.Sprintf("kimi timed out after %s", timeout)
} else if runCtx.Err() == context.Canceled {
finalStatus = "aborted"
finalError = "execution cancelled"
} else {
finalStatus = "failed"
finalError = fmt.Sprintf("kimi session/prompt failed: %v", err)
}
} else {
select {
case pr := <-promptDone:
if pr.stopReason == "cancelled" {
finalStatus = "aborted"
finalError = "kimi cancelled the prompt"
}
c.usageMu.Lock()
c.usage.InputTokens += pr.usage.InputTokens
c.usage.OutputTokens += pr.usage.OutputTokens
c.usageMu.Unlock()
default:
}
}
duration := time.Since(startTime)
b.cfg.Logger.Info("kimi finished", "pid", cmd.Process.Pid, "status", finalStatus, "duration", duration.Round(time.Millisecond).String())
stdin.Close()
cancel()
<-readerDone
outputMu.Lock()
finalOutput := output.String()
outputMu.Unlock()
// If kimi produced no visible output but we sniffed a
// provider-level error on stderr (typically HTTP 4xx from
// api.kimi.com — token expired, rate-limited, upstream
// 5xx, …), promote the status to failed and surface the
// real reason. Without this the daemon reports a cryptic
// "completed + empty output" and the actionable error
// stays buried in daemon logs.
if finalStatus == "completed" && finalOutput == "" {
if msg := providerErr.message(); msg != "" {
finalStatus = "failed"
finalError = msg
}
}
c.usageMu.Lock()
u := c.usage
c.usageMu.Unlock()
var usageMap map[string]TokenUsage
if u.InputTokens > 0 || u.OutputTokens > 0 || u.CacheReadTokens > 0 {
model := opts.Model
if model == "" {
model = "unknown"
}
usageMap = map[string]TokenUsage{model: u}
}
resCh <- Result{
Status: finalStatus,
Output: finalOutput,
Error: finalError,
DurationMs: duration.Milliseconds(),
SessionID: sessionID,
Usage: usageMap,
}
}()
return &Session{Messages: msgCh, Result: resCh}, nil
}
// kimiToolNameFromTitle normalises tool names emitted by Kimi's ACP
// server into the snake_case identifiers the Multica UI expects.
//
// Kimi follows the ACP spec where `title` is a short human-readable
// label such as "Read file: /path/to/foo.go" or "Run command: ls".
// hermesToolNameFromTitle upstream handles hermes' lowercase
// convention ("read:", "patch (replace)") but not kimi's capitalised
// format — so we get called on the already-mapped name from hermes
// and fix up anything that slipped through. Empty input returns "".
func kimiToolNameFromTitle(title string) string {
t := strings.TrimSpace(title)
if t == "" {
return ""
}
// Strip everything after the first colon — ACP titles often look like
// "Tool Name: argument detail" and we want only the tool name.
if idx := strings.Index(t, ":"); idx > 0 {
t = strings.TrimSpace(t[:idx])
}
lower := strings.ToLower(t)
switch lower {
case "read", "read file":
return "read_file"
case "write", "write file":
return "write_file"
case "edit", "patch":
return "edit_file"
case "shell", "bash", "terminal", "run command", "run shell command":
return "terminal"
case "search", "grep", "find":
return "search_files"
case "glob":
return "glob"
case "web search":
return "web_search"
case "fetch", "web fetch":
return "web_fetch"
case "todo", "todo write":
return "todo_write"
}
// Fallback: snake_case the title so the UI gets a stable identifier.
return strings.ReplaceAll(lower, " ", "_")
}

View File

@@ -0,0 +1,215 @@
package agent
import (
"context"
"log/slog"
"os"
"path/filepath"
"strings"
"testing"
"time"
)
func TestNewReturnsKimiBackend(t *testing.T) {
t.Parallel()
b, err := New("kimi", Config{ExecutablePath: "/nonexistent/kimi"})
if err != nil {
t.Fatalf("New(kimi) error: %v", err)
}
if _, ok := b.(*kimiBackend); !ok {
t.Fatalf("expected *kimiBackend, got %T", b)
}
}
func TestKimiToolNameFromTitle(t *testing.T) {
t.Parallel()
tests := []struct {
title string
want string
}{
{"Read file: /tmp/foo.go", "read_file"},
{"read", "read_file"},
{"Write: /tmp/bar.go", "write_file"},
{"Edit", "edit_file"},
{"Patch: /tmp/x", "edit_file"},
{"Shell: ls -la", "terminal"},
{"Bash", "terminal"},
{"Run command: pwd", "terminal"},
{"Search: foo", "search_files"},
{"Glob: *.go", "glob"},
{"Web search: golang acp", "web_search"},
{"Fetch: https://example.com", "web_fetch"},
{"Todo Write", "todo_write"},
// Fallback: snake_case the title.
{"Custom Thing", "custom_thing"},
// Empty input returns empty — caller decides how to react.
{"", ""},
}
for _, tt := range tests {
got := kimiToolNameFromTitle(tt.title)
if got != tt.want {
t.Errorf("kimiToolNameFromTitle(%q) = %q, want %q", tt.title, got, tt.want)
}
}
}
// fakeKimiACPScript returns a POSIX-sh script that impersonates
// `kimi acp` for a single short ACP session: it acks initialize /
// session/new and then replies to session/set_model with a JSON-RPC
// error — the scenario the kimiBackend must propagate as a failed
// task rather than silently falling back to the default model.
func fakeKimiACPScript() string {
return `#!/bin/sh
# Fake ` + "`kimi`" + ` binary — used by TestKimiBackendSetModelFailureFailsTask
# and TestKimiBackendPassesYoloFlag.
#
# Writes the full argv (one arg per line) to $KIMI_ARGS_FILE if that env
# var is set, so tests can assert that the daemon invokes us with the
# right flags (`+"`--yolo acp`"+`, not bare `+"`acp`"+`).
#
# Then reads one JSON-RPC request per line from stdin, matches on the
# method name, and writes back a canned response. Exits after set_model
# so the kimiBackend cleanup path can run.
if [ -n "$KIMI_ARGS_FILE" ]; then
for arg in "$@"; do
printf '%s\n' "$arg" >> "$KIMI_ARGS_FILE"
done
fi
while IFS= read -r line; do
id=$(printf '%s' "$line" | sed -n 's/.*"id":\([0-9]*\).*/\1/p')
case "$line" in
*'"method":"initialize"'*)
printf '{"jsonrpc":"2.0","id":%s,"result":{"protocolVersion":1,"agentCapabilities":{}}}\n' "$id"
;;
*'"method":"session/new"'*)
printf '{"jsonrpc":"2.0","id":%s,"result":{"sessionId":"ses_fake"}}\n' "$id"
;;
*'"method":"session/set_model"'*)
printf '{"jsonrpc":"2.0","id":%s,"error":{"code":-32602,"message":"model not available: bogus-model"}}\n' "$id"
exit 0
;;
esac
done
`
}
// TestKimiBackendSetModelFailureFailsTask pins the "don't silently
// fall back" behaviour that landed in this PR: when kimi rejects the
// caller-selected model via session/set_model, the task result must
// report status=failed with a message that names the model and the
// upstream error — not claim success while actually running on the
// default model.
func TestKimiBackendSetModelFailureFailsTask(t *testing.T) {
t.Parallel()
fakePath := filepath.Join(t.TempDir(), "kimi")
if err := os.WriteFile(fakePath, []byte(fakeKimiACPScript()), 0o755); err != nil {
t.Fatalf("write fake kimi: %v", err)
}
backend, err := New("kimi", Config{ExecutablePath: fakePath, Logger: slog.Default()})
if err != nil {
t.Fatalf("new kimi backend: %v", err)
}
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
session, err := backend.Execute(ctx, "prompt-ignored", ExecOptions{
Model: "bogus-model",
Timeout: 5 * time.Second,
})
if err != nil {
t.Fatalf("execute: %v", err)
}
// Drain message stream so the lifecycle goroutine can progress.
go func() {
for range session.Messages {
}
}()
select {
case result, ok := <-session.Result:
if !ok {
t.Fatal("result channel closed without a value")
}
if result.Status != "failed" {
t.Fatalf("expected status=failed, got %q (error=%q)", result.Status, result.Error)
}
if !strings.Contains(result.Error, `could not switch to model "bogus-model"`) {
t.Errorf("expected error to name the requested model, got %q", result.Error)
}
if !strings.Contains(result.Error, "model not available") {
t.Errorf("expected error to surface upstream message, got %q", result.Error)
}
if result.SessionID != "ses_fake" {
t.Errorf("expected session id to be preserved on failure, got %q", result.SessionID)
}
case <-time.After(10 * time.Second):
t.Fatal("timeout waiting for result")
}
}
// TestKimiBackendInvokesACPSubcommand pins the argv for `kimi`. An
// earlier fix tried passing `--yolo` to bypass per-tool approval
// prompts, but the `acp` subcommand in kimi-cli takes no options
// (see cli/__init__.py @cli.command def acp()), so `--yolo` was a
// no-op and the daemon still hung for 5 min on the first Shell call.
// The actual bypass is in hermesClient.handleAgentRequest, which
// auto-approves session/request_permission. This test catches
// accidental re-introduction of the dead flag.
func TestKimiBackendInvokesACPSubcommand(t *testing.T) {
t.Parallel()
tempDir := t.TempDir()
argsFile := filepath.Join(tempDir, "argv.txt")
fakePath := filepath.Join(tempDir, "kimi")
if err := os.WriteFile(fakePath, []byte(fakeKimiACPScript()), 0o755); err != nil {
t.Fatalf("write fake kimi: %v", err)
}
backend, err := New("kimi", Config{
ExecutablePath: fakePath,
Logger: slog.Default(),
Env: map[string]string{"KIMI_ARGS_FILE": argsFile},
})
if err != nil {
t.Fatalf("new kimi backend: %v", err)
}
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
// Set Model so the fake binary exits on set_model and we don't
// have to wait for the prompt branch. We only care about argv here.
session, err := backend.Execute(ctx, "prompt-ignored", ExecOptions{
Model: "bogus-model",
Timeout: 5 * time.Second,
})
if err != nil {
t.Fatalf("execute: %v", err)
}
go func() {
for range session.Messages {
}
}()
<-session.Result
raw, err := os.ReadFile(argsFile)
if err != nil {
t.Fatalf("read args file: %v", err)
}
lines := strings.Split(strings.TrimSpace(string(raw)), "\n")
if len(lines) < 1 {
t.Fatalf("expected at least 1 arg (acp), got %d: %q", len(lines), lines)
}
if lines[0] != "acp" {
t.Errorf("expected first arg to be acp, got %q (full: %q)", lines[0], lines)
}
for _, l := range lines {
switch l {
case "--yolo", "--auto-approve", "--yes", "-y":
t.Errorf("kimi acp doesn't accept %q; auto-approval is handled in hermesClient.handleAgentRequest", l)
}
}
}

View File

@@ -71,6 +71,10 @@ func ListModels(ctx context.Context, providerType, executablePath string) ([]Mod
return cachedDiscovery(providerType, func() ([]Model, error) {
return discoverHermesModels(ctx, executablePath)
})
case "kimi":
return cachedDiscovery(providerType, func() ([]Model, error) {
return discoverKimiModels(ctx, executablePath)
})
case "opencode":
return cachedDiscovery(providerType, func() ([]Model, error) {
return discoverOpenCodeModels(ctx, executablePath)
@@ -317,8 +321,52 @@ func parsePiModels(output string) []Model {
// error) all return an empty list so the UI falls back to the
// creatable manual-entry input instead of blocking the form.
func discoverHermesModels(ctx context.Context, executablePath string) ([]Model, error) {
return discoverACPModels(ctx, executablePath, acpDiscoveryProvider{
defaultBin: "hermes",
clientName: "multica-model-discovery",
extraEnv: []string{"HERMES_YOLO_MODE=1"},
tmpdirPrefix: "multica-hermes-discovery-",
})
}
// discoverKimiModels spins up a throwaway `kimi acp` process and
// drives the same minimal ACP handshake as Hermes to surface the
// model catalog advertised by Kimi's `session/new` response. Kimi's
// ACPServer.new_session returns a `models` block of the same shape
// (`availableModels`/`currentModelId`) so the parsing path is shared.
//
// Failure modes (kimi missing, not logged in, config error) all
// return an empty list so the UI falls back to manual entry.
func discoverKimiModels(ctx context.Context, executablePath string) ([]Model, error) {
return discoverACPModels(ctx, executablePath, acpDiscoveryProvider{
defaultBin: "kimi",
clientName: "multica-model-discovery",
tmpdirPrefix: "multica-kimi-discovery-",
})
}
// acpDiscoveryProvider configures how discoverACPModels launches an
// ACP-speaking agent CLI. The shared helper drives every CLI in
// the same way (initialize → session/new → parse models block) — the
// per-provider differences are which binary to spawn, which env
// vars suppress interactive prompts during init, and what to label
// temporary work directories so they're easy to identify in logs.
type acpDiscoveryProvider struct {
defaultBin string
clientName string
extraEnv []string
tmpdirPrefix string
}
// discoverACPModels runs the ACP handshake for any agent CLI that
// implements the standard `initialize` + `session/new` flow and
// advertises its model catalog in the response under
// `models.availableModels` / `models.currentModelId`. This covers
// Hermes and Kimi today; future ACP backends can plug in by adding
// an acpDiscoveryProvider entry instead of duplicating the loop.
func discoverACPModels(ctx context.Context, executablePath string, p acpDiscoveryProvider) ([]Model, error) {
if executablePath == "" {
executablePath = "hermes"
executablePath = p.defaultBin
}
if _, err := exec.LookPath(executablePath); err != nil {
return []Model{}, nil
@@ -327,8 +375,9 @@ func discoverHermesModels(ctx context.Context, executablePath string) ([]Model,
defer cancel()
cmd := exec.CommandContext(runCtx, executablePath, "acp")
// Mirror the real backend's auto-approve so init doesn't prompt.
cmd.Env = append(os.Environ(), "HERMES_YOLO_MODE=1")
if len(p.extraEnv) > 0 {
cmd.Env = append(os.Environ(), p.extraEnv...)
}
stdin, err := cmd.StdinPipe()
if err != nil {
return []Model{}, nil
@@ -370,16 +419,16 @@ func discoverHermesModels(ctx context.Context, executablePath string) ([]Model,
// Send initialize + session/new.
if err := writeACP(1, "initialize", map[string]any{
"protocolVersion": 1,
"clientInfo": map[string]any{"name": "multica-model-discovery", "version": "0.1.0"},
"clientInfo": map[string]any{"name": p.clientName, "version": "0.1.0"},
"clientCapabilities": map[string]any{},
}); err != nil {
return []Model{}, nil
}
// Hermes requires a valid cwd for session/new — use a temp
// directory we clean up afterwards, not the daemon's workdir
// (which might be in the middle of another task's worktree).
tmp, err := os.MkdirTemp("", "multica-hermes-discovery-")
// session/new requires a valid cwd — use a temp directory we
// clean up afterwards, not the daemon's workdir (which might
// be in the middle of another task's worktree).
tmp, err := os.MkdirTemp("", p.tmpdirPrefix)
if err != nil {
return []Model{}, nil
}
@@ -414,7 +463,7 @@ func discoverHermesModels(ctx context.Context, executablePath string) ([]Model,
if env.ID.String() != "2" || len(env.Result) == 0 {
continue
}
done <- parseHermesSessionNewModels(env.Result)
done <- parseACPSessionNewModels(env.Result)
return
}
}()
@@ -432,14 +481,15 @@ func discoverHermesModels(ctx context.Context, executablePath string) ([]Model,
}
}
// parseHermesSessionNewModels extracts the model catalog from a
// hermes `session/new` response. Hermes' ACP schema emits:
// parseACPSessionNewModels extracts the model catalog from an ACP
// `session/new` response. Both Hermes and Kimi (and any other ACP
// agent that follows the standard schema) emit:
//
// {
// "sessionId": "...",
// "models": {
// "availableModels": [
// {"modelId": "...", "name": "...", "description": "... current"}
// {"modelId": "...", "name": "...", "description": "..."}
// ],
// "currentModelId": "..."
// }
@@ -448,7 +498,7 @@ func discoverHermesModels(ctx context.Context, executablePath string) ([]Model,
// Returns nil (not an empty slice) when the payload is missing so
// the caller can distinguish "parsed with no models" (valid but
// empty catalog) from "couldn't find the structure at all".
func parseHermesSessionNewModels(raw json.RawMessage) []Model {
func parseACPSessionNewModels(raw json.RawMessage) []Model {
var resp struct {
Models struct {
AvailableModels []struct {

View File

@@ -258,7 +258,7 @@ func TestParseHermesSessionNewModels(t *testing.T) {
"currentModelId": "nous:anthropic/claude-opus-4.7"
}
}`)
models := parseHermesSessionNewModels(raw)
models := parseACPSessionNewModels(raw)
if len(models) != 2 {
t.Fatalf("expected 2 models (duplicate deduped), got %d: %+v", len(models), models)
}
@@ -281,13 +281,13 @@ func TestParseHermesSessionNewModelsMissingField(t *testing.T) {
// failed _build_model_state — should yield nil so the caller
// can distinguish "no catalog" from "empty catalog".
raw := []byte(`{"sessionId": "ses_123"}`)
if got := parseHermesSessionNewModels(raw); got != nil && len(got) != 0 {
if got := parseACPSessionNewModels(raw); got != nil && len(got) != 0 {
t.Errorf("expected nil/empty, got %+v", got)
}
}
func TestParseHermesSessionNewModelsGarbage(t *testing.T) {
if got := parseHermesSessionNewModels([]byte("not json")); got != nil {
if got := parseACPSessionNewModels([]byte("not json")); got != nil {
t.Errorf("expected nil for non-JSON, got %+v", got)
}
}