mirror of
https://github.com/multica-ai/multica.git
synced 2026-06-17 03:38:32 +02:00
feat(agent): add Kimi CLI as agent runtime (#1400)

* feat(agent): add Kimi CLI as agent runtime

Adds support for Moonshot AI's Kimi Code CLI (https://github.com/MoonshotAI/kimi-cli)
as a new agent runtime, alongside Claude, Codex, OpenCode, OpenClaw, Hermes,
Gemini, Pi, Cursor and Copilot.

Kimi Code CLI implements the standard Agent Client Protocol (ACP) via the
`kimi acp` subcommand, so the new `kimiBackend` reuses the existing
hermesClient JSON-RPC transport in the agent package — only the binary,
client identity, log prefix, and tool-name extraction differ.

Wiring:
- server/pkg/agent: new kimiBackend + kimi_test.go; registered in New(),
  LaunchHeader map, and the supported-types coverage test.
- server/internal/daemon/config.go: probes `kimi` (overridable via
  MULTICA_KIMI_PATH / MULTICA_KIMI_MODEL).
- server/internal/daemon/execenv: writes AGENTS.md as the runtime context
  file (Kimi reads AGENTS.md natively via /init), and writes skills under
  `.kimi/skills/` so they are auto-discovered by the project-level skill
  loader.
- packages/views/runtimes: ProviderLogo gains a Kimi mark.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(agent/kimi): support per-agent model selection via ACP set_model

Wire Kimi into the model dropdown introduced in #1399:
- ListModels gets a 'kimi' case that drives the same ACP
  initialize + session/new handshake as Hermes; both share a new
  discoverACPModels helper and parseACPSessionNewModels parser
  so future ACP backends only need a small provider entry.
- kimiBackend now issues session/set_model after session/new when
  opts.Model is non-empty, mirroring the Hermes flow. Failures
  fail the task instead of silently falling back to Kimi's
  default model — silent fallback would hide that the dropdown
  pick wasn't honoured.

Verified: go build ./..., go test ./pkg/agent/... ./internal/daemon/... ./internal/handler/..., pnpm typecheck and pnpm test (138 passed).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor(agent): address code review feedback on Kimi runtime

- Share ACP provider-error sniffer between hermes and kimi. Previously
  only hermes promoted stderr-observed 4xx/5xx into a failed task;
  kimi would report "completed + empty output" when the Moonshot
  upstream rejected a request (expired token, rate limit, …). Rename
  hermesProviderErrorSniffer → acpProviderErrorSniffer and parameterise
  the provider name; wire it into kimiBackend.Execute the same way.
- Rename extractHermesSessionID → extractACPSessionID (shared by all
  ACP backends) so the name matches parseACPSessionNewModels.
- Drop the redundant second argument to kimiToolNameFromTitle; the
  Message struct has only one relevant field (Tool), so passing it
  twice was a dead fallback. Document that the function normalises
  residual capitalised kimi titles not caught by hermesToolNameFromTitle.
- Remove kimi-only cmd.WaitDelay override; the hermes baseline is
  fine for both and divergence adds noise.
- Add TestKimiBackendSetModelFailureFailsTask: fake `kimi acp` binary
  that returns a JSON-RPC error for session/set_model, asserts that
  the task result surfaces status=failed with the model name + upstream
  message and preserves the session id.
- Fix stale agent listings in agent.go / daemon/config.go doc comments
  (missing cursor, gemini, copilot).

All: `go build ./...`, `go vet ./...`, `go test ./pkg/agent/...
./internal/daemon/... ./internal/handler/...` green.

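A stderr sniffer of the kind described in the first bullet could be sketched as an `io.Writer` that remembers the first line reporting an upstream 4xx/5xx. The matching pattern and the type below are assumptions made for illustration; the real acpProviderErrorSniffer is not shown in this part of the diff.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"sync"
)

// statusRe is an assumed pattern for stderr lines that report an upstream
// HTTP 4xx/5xx; the real sniffer's rules may differ.
var statusRe = regexp.MustCompile(`(?i)\b(?:error|status)(?: code)?[:= ]+([45]\d{2})\b`)

// providerErrorSniffer remembers the first stderr line that looks like a
// provider rejection. It is safe to use from the stderr-copying goroutine.
type providerErrorSniffer struct {
	provider string
	mu       sync.Mutex
	first    string
}

func (s *providerErrorSniffer) Write(p []byte) (int, error) {
	// Simplification: assumes each Write carries whole lines.
	for _, line := range strings.Split(string(p), "\n") {
		if !statusRe.MatchString(line) {
			continue
		}
		s.mu.Lock()
		if s.first == "" {
			s.first = strings.TrimSpace(line)
		}
		s.mu.Unlock()
	}
	return len(p), nil
}

// Err returns a task-level error message, or "" when nothing was observed.
func (s *providerErrorSniffer) Err() string {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.first == "" {
		return ""
	}
	return fmt.Sprintf("%s provider error: %s", s.provider, s.first)
}

func main() {
	sniffer := &providerErrorSniffer{provider: "kimi"}
	fmt.Fprintln(sniffer, "INFO starting session")
	fmt.Fprintln(sniffer, "Error code: 401 - invalid authentication")
	fmt.Println(sniffer.Err())
}
```

Because it only implements `Write`, such a sniffer can sit next to the log writer inside `io.MultiWriter`, which is how the diff attaches `providerErr` to `cmd.Stderr`.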
* fix(agent/kimi): pass --yolo so Shell tools don't hang on approval

Kimi's default config has `default_yolo = false`. Every Shell/file-mutating
tool call causes kimi acp to send a `session/request_permission` request
and block (up to 300s) waiting for a response. The daemon's hermesClient
only handles `session/update` notifications — permission requests go
unanswered, the tool call times out, and the UI loop eventually dies
("UI loop timed out"). Observed with the first real kimi task: agent sat
as Live for ~7 minutes before the daemon killed it.

The fix mirrors hermes' HERMES_YOLO_MODE=1 override: pass `--yolo` to
`kimi` so it auto-approves everything. `--yolo` is a top-level flag on
the `kimi` CLI (not a flag on `kimi acp`), so it must come before the
`acp` subcommand in argv. Added to kimiBlockedArgs so user custom_args
can't strip it.

While here, fix a related bug that made kimi tool names show up empty
in the daemon log ("tool #1: "): hermesToolNameFromTitle's fallback
returned `kind` when neither title-with-colon nor kind matched a known
tool. Kimi's ACP `tool_call` emits bare titles like "Shell" or "Read
file" with no `kind` at all, so we'd drop the title on the floor before
kimiToolNameFromTitle ever got a chance to map it. Now: preserve the
title when kind is unclassified; hermes titles always carry a colon so
this branch never fires for hermes.

Tests:
- TestKimiBackendPassesYoloFlag — fake binary that records its argv,
  asserts --yolo comes before acp.
- TestHermesToolNameFromTitle rows for bare kimi-style titles.
- Existing suite green: go build, go vet, full pkg/agent + daemon +
  handler test packages.

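The title-preserving fallback can be illustrated with a small sketch. The body of hermesToolNameFromTitle is not part of this diff, so `toolNameFromTitle` and the `kindToTool` table below are hypothetical; only the rule itself (keep a bare title when the kind is unclassified) comes from the message above.

```go
package main

import (
	"fmt"
	"strings"
)

// kindToTool is an illustrative mapping; the real table is not in this diff.
var kindToTool = map[string]string{
	"read":    "read_file",
	"execute": "terminal",
}

// toolNameFromTitle (hypothetical) resolves a tool name from an ACP
// tool_call title and kind.
func toolNameFromTitle(title, kind string) string {
	// "tool: detail" titles: the text before the colon names the tool.
	if name, _, ok := strings.Cut(title, ":"); ok {
		return strings.TrimSpace(name)
	}
	if tool, ok := kindToTool[kind]; ok {
		return tool
	}
	// Bare title with an unclassified (often empty) kind: keep the title so
	// a later mapper can still normalise it, instead of returning the kind.
	if t := strings.TrimSpace(title); t != "" {
		return t
	}
	return kind
}

func main() {
	fmt.Printf("%q\n", toolNameFromTitle("terminal: ls -la", "execute"))
	fmt.Printf("%q\n", toolNameFromTitle("Shell", ""))
	fmt.Printf("%q\n", toolNameFromTitle("Read file", ""))
}
```

With the earlier behaviour, the second and third calls would have produced an empty name, which is the `tool #1: ` symptom quoted above.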
* fix(agent/acp): auto-approve session/request_permission from agent

The previous attempt (`kimi --yolo acp`) was a no-op. Inspected the
kimi-cli source: the `acp` Typer subcommand takes no parameters, so
flags on the root `kimi` command are dropped before `acp_main()` runs
— it's impossible to opt into YOLO mode through CLI flags for ACP.
The real fix is on our side: respond to session/request_permission.

ACP is bidirectional. When kimi runs a Shell or file-write tool, it
sends `session/request_permission` (agent → client, JSON-RPC request
with id + method) and waits up to 300s for a response. Our existing
hermesClient.handleLine only dispatched: (id + result/error) →
handleResponse, and (no id + method) → handleNotification. A request
with BOTH id and method fell through and got silently dropped — kimi
timed out, UI loop died, task sat stuck for 7 minutes.

Add handleAgentRequest: for session/request_permission, echo the id
and respond with outcome=selected, optionId=approve_for_session. The
daemon is headless; there's no user to prompt. `approve_for_session`
lets the agent remember the action so subsequent identical calls
(every Shell, every file write) skip the round-trip entirely. For any
other agent → client method, reply with standard -32601 method-not-
found so the agent doesn't block.

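The three frame shapes named above (response, agent request, notification) can be told apart by which top-level keys are present. The following is a sketch of that dispatch rule only; `classifyFrame` and `frameKind` are hypothetical names, not the client's actual API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type frameKind int

const (
	frameUnknown      frameKind = iota
	frameResponse               // id + result or error: reply to our request
	frameAgentRequest           // id + method: agent asks us something
	frameNotification           // method without id: fire-and-forget update
)

// classifyFrame inspects only the top-level JSON-RPC keys of one line.
func classifyFrame(line []byte) (frameKind, error) {
	var raw map[string]json.RawMessage
	if err := json.Unmarshal(line, &raw); err != nil {
		return frameUnknown, err
	}
	_, hasID := raw["id"]
	_, hasMethod := raw["method"]
	_, hasResult := raw["result"]
	_, hasError := raw["error"]
	switch {
	case hasID && (hasResult || hasError):
		return frameResponse, nil
	case hasID && hasMethod:
		return frameAgentRequest, nil
	case hasMethod:
		return frameNotification, nil
	}
	return frameUnknown, nil
}

func main() {
	names := []string{"unknown", "response", "agent request", "notification"}
	frames := []string{
		`{"jsonrpc":"2.0","id":1,"result":{}}`,
		`{"jsonrpc":"2.0","id":7,"method":"session/request_permission","params":{}}`,
		`{"jsonrpc":"2.0","method":"session/update","params":{}}`,
	}
	for _, f := range frames {
		kind, err := classifyFrame([]byte(f))
		if err != nil {
			fmt.Println("bad frame:", err)
			continue
		}
		fmt.Println(names[kind])
	}
}
```

The second case is the one that previously fell through: without a branch for "id plus method", a permission request is neither a response nor a notification and gets dropped.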
Also:
- Add writeMu so request() (main goroutine) and handleAgentRequest
  (reader goroutine) don't interleave JSON frames on stdin.
- Revert the `--yolo acp` flag — it's a no-op, and carrying it in
  kimiBlockedArgs gives the wrong impression that it does something.
  Comment in kimi.go now points at handleAgentRequest as the real fix.

Tests:
- TestHermesClientAutoApprovesPermissionRequest: inject a
  session/request_permission, assert the reply echoes the id and
  carries {outcome: selected, optionId: approve_for_session}.
- TestHermesClientReplesMethodNotFoundForUnknownAgentRequest: confirm
  unknown agent → client methods get JSON-RPC -32601 instead of silence.
- TestKimiBackendInvokesACPSubcommand replaces the yolo-flag assertion
  with a negative assertion: no dead --yolo / --auto-approve / -y on
  argv, since they'd pretend to do something they can't.

All: go build ./..., go vet ./..., go test ./pkg/agent/... green.

* fix(agent/acp): surface kimi tool input/output via content blocks

Kimi-cli emits tool_call and tool_call_update ACP frames with the
input/output inside a `content` array of ContentToolCallContent
blocks (shape: {type:"content", content:{type:"text", text:"..."}}),
not in the hermes-style `rawInput` map / `rawOutput` string. Our
parser only looked at rawInput/rawOutput, so the daemon recorded
empty Input and Output for every kimi tool — the execution-history
UI showed blank terminal panels even for commands that ran fine.

Add extractACPToolCallText() and a fallback in handleToolCallStart /
handleToolCallUpdate: when rawInput is nil / rawOutput is empty, pull
the text out of the content blocks. rawInput / rawOutput still take
precedence so hermes' behaviour is untouched. Terminal /
FileEditToolCallContent blocks are skipped (we have nothing to render
them as — kimi only emits TerminalToolCallContent when the client
advertises terminal capability, which we don't).

Tests:
- TestHermesClientHandleToolCallStartKimiContent — content array →
  Input.text populated.
- TestHermesClientHandleToolCallCompleteKimiContent — multi-block
  content → Output concatenated with newline separator.
- TestHermesClientHandleToolCallRawOutputTakesPrecedence — hermes
  rawOutput still wins when both are present.
- TestExtractACPToolCallText — unit coverage for the helper
  (single/multiple text blocks, terminal-block skip, empty input).

* fix(agent/acp): buffer streaming tool args so Input isn't empty in UI

kimi-cli streams tool args token-by-token via tool_call_update frames
— the initial tool_call carries an empty content block and each
subsequent in_progress update carries the cumulative JSON so far
(`{`, `{"comma`, `{"command": "echo`, …). The final completed update
then carries the tool's stdout, not the args. Observed per kimi-cli
acp/session.py::_send_tool_call{,_part,_result} and confirmed by
driving a real Shell call end-to-end: 10 in_progress frames, last
with `{"command": "echo hello world"}`, then completed with `hello
world\n`.

Our previous handleToolCallStart emitted MessageToolUse on the first
tool_call frame, capturing the empty content — so every kimi tool
appeared in the execution-history UI with a blank input. Output was
correct (fix 4335c198) but command was missing.

Changes:
- hermesClient now tracks pending tool calls per toolCallId. Hermes
  path is unchanged — rawInput is present at tool_call time, so
  emit-immediately-then-flag-emitted still fires on the initial frame.
- kimi path defers MessageToolUse until status=completed / failed.
  tool_call_update in_progress frames update the buffered argsText
  (cumulative, so overwrite); on completion we parse the accumulated
  JSON into Message.Input. Malformed JSON falls back to `{"text": …}`
  so non-JSON tool args still render.
- Orphan completion frames (no matching tool_call seen — e.g. daemon
  restarted mid-task) synthesise ToolUse from the update's own
  title/kind/rawInput so the UI still gets a header.
- extractACPToolCallText now also renders FileEditToolCallContent
  blocks as a compact header ("--- path / +++ path / (edited: N → M
  bytes)"). kimi emits these for Write / StrReplaceFile / Patch when
  the tool's display block is a DiffDisplayBlock.

Tests:
- TestHermesClientKimiStreamingToolCall: empty tool_call + 5 streaming
  in_progress + completed. Asserts no emission until complete, then
  [ToolUse(Input.command="echo hi"), ToolResult(Output="hi\n")].
- TestHermesClientKimiMalformedArgsFallback: non-JSON argsText → falls
  back to Input.text.
- TestHermesClientHandleToolCallCompleteOrphan: completed frame
  without a start → ToolUse synthesised from update's rawInput.
- TestExtractACPToolCallText: diff + new-file-diff cases.

All agent / daemon / handler test packages green.

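The overwrite-then-parse behaviour for streamed arguments can be sketched on its own. `toolBuffer` and `parseArgs` are hypothetical stand-ins: the diff refers to pendingToolCall and parseToolArgsJSON, but the latter's body is not shown, so the fallback here follows only the `{"text": …}` rule stated above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolBuffer (hypothetical) keeps the latest cumulative args per toolCallId.
type toolBuffer struct {
	pending map[string]string
}

func newToolBuffer() *toolBuffer {
	return &toolBuffer{pending: map[string]string{}}
}

// update overwrites rather than appends, because every in_progress frame
// repeats the full argument text received so far.
func (b *toolBuffer) update(callID, cumulative string) {
	if cumulative != "" {
		b.pending[callID] = cumulative
	}
}

// complete removes the entry and parses whatever was buffered.
func (b *toolBuffer) complete(callID string) map[string]any {
	text := b.pending[callID]
	delete(b.pending, callID)
	return parseArgs(text)
}

// parseArgs decodes a JSON object, falling back to {"text": raw} so that
// non-JSON arguments still render. Empty input yields nil.
func parseArgs(text string) map[string]any {
	if text == "" {
		return nil
	}
	var m map[string]any
	if err := json.Unmarshal([]byte(text), &m); err == nil && m != nil {
		return m
	}
	return map[string]any{"text": text}
}

func main() {
	buf := newToolBuffer()
	for _, frame := range []string{`{`, `{"comma`, `{"command": "echo hello world"}`} {
		buf.update("call-1", frame)
	}
	fmt.Println(buf.complete("call-1")["command"])
}
```

Appending instead of overwriting would yield `{{"comma{"command": ...`, which never parses, so the overwrite is what makes the final frame usable.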
---------
Co-authored-by: Eve <8b0578a3-cf72-4394-9e38-b328eca92463@users.noreply.multica.ai>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Eve <eve@multica.ai>
Co-authored-by: Lambda <f252c2c5-7d1d-4f3c-b394-a61abfe673fc@users.noreply.multica.ai>
@@ -111,6 +111,20 @@ function CursorLogo({ className }: { className: string }) {
   );
 }

+// Kimi (Moonshot AI) — wordmark "K" mark in Moonshot brand purple, simple
+// rounded-square logotype suitable for small icon sizes.
+function KimiLogo({ className }: { className: string }) {
+  return (
+    <svg viewBox="0 0 24 24" fill="none" className={className}>
+      <rect width="24" height="24" rx="5" fill="#1F1147" />
+      <path
+        d="M7.2 6h2.4v5.1l4.3-5.1h2.9l-4.4 5.1L17 18h-2.9l-3.2-5.2-1.3 1.5V18H7.2V6z"
+        fill="#FFFFFF"
+      />
+    </svg>
+  );
+}
+
 export function ProviderLogo({
   provider,
   className = "h-4 w-4",
@@ -135,6 +149,8 @@ export function ProviderLogo({
       return <CopilotLogo className={className} />;
     case "cursor":
       return <CursorLogo className={className} />;
+    case "kimi":
+      return <KimiLogo className={className} />;
     default:
       return <Monitor className={className} />;
   }

@@ -34,7 +34,7 @@ type Config struct {
 	CLIVersion string // multica CLI version (e.g. "0.1.13")
 	LaunchedBy string // "desktop" when spawned by the Electron app, empty for standalone
 	Profile string // profile name (empty = default)
-	Agents map[string]AgentEntry // keyed by provider: claude, codex, opencode, openclaw, hermes, gemini, pi
+	Agents map[string]AgentEntry // keyed by provider: claude, codex, copilot, opencode, openclaw, hermes, gemini, pi, cursor, kimi
 	WorkspacesRoot string // base path for execution envs (default: ~/multica_workspaces)
 	KeepEnvAfterTask bool // preserve env after task for debugging
 	HealthPort int // local HTTP port for health checks (default: 19514)
@@ -142,8 +142,15 @@ func LoadConfig(overrides Overrides) (Config, error) {
 			Model: strings.TrimSpace(os.Getenv("MULTICA_COPILOT_MODEL")),
 		}
 	}
+	kimiPath := envOrDefault("MULTICA_KIMI_PATH", "kimi")
+	if _, err := exec.LookPath(kimiPath); err == nil {
+		agents["kimi"] = AgentEntry{
+			Path:  kimiPath,
+			Model: strings.TrimSpace(os.Getenv("MULTICA_KIMI_MODEL")),
+		}
+	}
 	if len(agents) == 0 {
-		return Config{}, fmt.Errorf("no agent CLI found: install claude, codex, copilot, opencode, openclaw, hermes, gemini, pi, or cursor-agent and ensure it is on PATH")
+		return Config{}, fmt.Errorf("no agent CLI found: install claude, codex, copilot, opencode, openclaw, hermes, gemini, pi, cursor-agent, or kimi and ensure it is on PATH")
 	}

 	// Host info

@@ -17,6 +17,7 @@ import (
 // OpenCode: skills → {workDir}/.config/opencode/skills/{name}/SKILL.md (native discovery)
 // Pi: skills → {workDir}/.pi/agent/skills/{name}/SKILL.md (native discovery)
 // Cursor: skills → {workDir}/.cursor/skills/{name}/SKILL.md (native discovery)
+// Kimi: skills → {workDir}/.kimi/skills/{name}/SKILL.md (native discovery)
 // Default: skills → {workDir}/.agent_context/skills/{name}/SKILL.md
 func writeContextFiles(workDir, provider string, ctx TaskContextForEnv) error {
 	contextDir := filepath.Join(workDir, ".agent_context")
@@ -69,6 +70,10 @@ func resolveSkillsDir(workDir, provider string) (string, error) {
 	case "cursor":
 		// Cursor natively discovers skills from .cursor/skills/ in the workdir.
 		skillsDir = filepath.Join(workDir, ".cursor", "skills")
+	case "kimi":
+		// Kimi Code CLI auto-discovers project-level skills from .kimi/skills/
+		// in the workdir. See https://moonshotai.github.io/kimi-cli/en/customization/skills.html
+		skillsDir = filepath.Join(workDir, ".kimi", "skills")
 	default:
 		// Fallback: write to .agent_context/skills/ (referenced by meta config).
 		skillsDir = filepath.Join(workDir, ".agent_context", "skills")

@@ -18,13 +18,14 @@ import (
 // For Gemini: writes {workDir}/GEMINI.md (discovered natively by the Gemini CLI)
 // For Pi: writes {workDir}/AGENTS.md (skills discovered natively from ~/.pi/agent/skills/)
 // For Cursor: writes {workDir}/AGENTS.md (skills discovered natively from .cursor/skills/)
+// For Kimi: writes {workDir}/AGENTS.md (Kimi Code CLI reads AGENTS.md natively; skills auto-discovered from project skills dirs)
 func InjectRuntimeConfig(workDir, provider string, ctx TaskContextForEnv) error {
 	content := buildMetaSkillContent(provider, ctx)

 	switch provider {
 	case "claude":
 		return os.WriteFile(filepath.Join(workDir, "CLAUDE.md"), []byte(content), 0o644)
-	case "codex", "copilot", "opencode", "openclaw", "pi", "cursor":
+	case "codex", "copilot", "opencode", "openclaw", "pi", "cursor", "kimi":
 		return os.WriteFile(filepath.Join(workDir, "AGENTS.md"), []byte(content), 0o644)
 	case "gemini":
 		return os.WriteFile(filepath.Join(workDir, "GEMINI.md"), []byte(content), 0o644)
@@ -151,8 +152,8 @@ func buildMetaSkillContent(provider string, ctx TaskContextForEnv) string {
 	case "claude":
 		// Claude discovers skills natively from .claude/skills/ — just list names.
 		b.WriteString("You have the following skills installed (discovered automatically):\n\n")
-	case "codex", "copilot", "opencode", "openclaw", "pi", "cursor":
-		// Codex, Copilot, OpenCode, OpenClaw, Pi, and Cursor discover skills natively from their respective paths — just list names.
+	case "codex", "copilot", "opencode", "openclaw", "pi", "cursor", "kimi":
+		// Codex, Copilot, OpenCode, OpenClaw, Pi, Cursor, and Kimi discover skills natively from their respective paths — just list names.
 		b.WriteString("You have the following skills installed (discovered automatically):\n\n")
 	case "gemini":
 		// Gemini reads GEMINI.md directly; point it at the fallback skills dir.

@@ -1,5 +1,6 @@
 // Package agent provides a unified interface for executing prompts via
-// coding agents (Claude Code, Codex, OpenCode, OpenClaw, Hermes, Pi). It mirrors the happy-cli AgentBackend
+// coding agents (Claude Code, Codex, Copilot, OpenCode, OpenClaw, Hermes,
+// Gemini, Pi, Cursor, Kimi). It mirrors the happy-cli AgentBackend
 // pattern, translated to idiomatic Go.
 package agent

@@ -85,13 +86,13 @@ type Result struct {

 // Config configures a Backend instance.
 type Config struct {
-	ExecutablePath string            // path to CLI binary (claude, codex, copilot, opencode, openclaw, hermes, gemini, or pi)
+	ExecutablePath string            // path to CLI binary (claude, codex, copilot, opencode, openclaw, hermes, gemini, pi, cursor, kimi)
 	Env            map[string]string // extra environment variables
 	Logger         *slog.Logger
 }

 // New creates a Backend for the given agent type.
-// Supported types: "claude", "codex", "copilot", "opencode", "openclaw", "hermes", "gemini", "pi", "cursor".
+// Supported types: "claude", "codex", "copilot", "opencode", "openclaw", "hermes", "gemini", "pi", "cursor", "kimi".
 func New(agentType string, cfg Config) (Backend, error) {
 	if cfg.Logger == nil {
 		cfg.Logger = slog.Default()
@@ -116,8 +117,10 @@ func New(agentType string, cfg Config) (Backend, error) {
 		return &piBackend{cfg: cfg}, nil
 	case "cursor":
 		return &cursorBackend{cfg: cfg}, nil
+	case "kimi":
+		return &kimiBackend{cfg: cfg}, nil
 	default:
-		return nil, fmt.Errorf("unknown agent type: %q (supported: claude, codex, copilot, opencode, openclaw, hermes, gemini, pi, cursor)", agentType)
+		return nil, fmt.Errorf("unknown agent type: %q (supported: claude, codex, copilot, opencode, openclaw, hermes, gemini, pi, cursor, kimi)", agentType)
 	}
 }

@@ -142,6 +145,7 @@ var launchHeaders = map[string]string{
 	"openclaw": "openclaw agent (json)",
 	"opencode": "opencode run (json)",
 	"pi": "pi (json mode)",
+	"kimi": "kimi acp",
 }

 // LaunchHeader returns the user-visible launch skeleton for agentType, or an

@@ -72,7 +72,7 @@ func TestLaunchHeaderCoversAllSupportedBackends(t *testing.T) {
 	// entry to launchHeaders in agent.go and extend this list.
 	supported := []string{
 		"claude", "codex", "copilot", "cursor", "gemini",
-		"hermes", "openclaw", "opencode", "pi",
+		"hermes", "kimi", "openclaw", "opencode", "pi",
 	}
 	for _, t_ := range supported {
 		if header := LaunchHeader(t_); header == "" {

@@ -8,6 +8,7 @@ import (
 	"io"
 	"os/exec"
 	"regexp"
+	"strconv"
 	"strings"
 	"sync"
 	"time"
@@ -73,7 +74,7 @@ func (b *hermesBackend) Execute(ctx context.Context, prompt string, opts ExecOpt
 	// without this we'd report a misleading "empty output" and hide
 	// the real cause (wrong model for the current provider, bad
 	// credentials, rate limit, …) in the daemon log.
-	providerErr := newHermesProviderErrorSniffer()
+	providerErr := newACPProviderErrorSniffer("hermes")
 	cmd.Stderr = io.MultiWriter(newLogWriter(b.cfg.Logger, "[hermes:stderr] "), providerErr)

 	if err := cmd.Start(); err != nil {
@@ -92,9 +93,10 @@ func (b *hermesBackend) Execute(ctx context.Context, prompt string, opts ExecOpt
 	promptDone := make(chan hermesPromptResult, 1)

 	c := &hermesClient{
-		cfg:     b.cfg,
-		stdin:   stdin,
-		pending: make(map[int]*pendingRPC),
+		cfg:          b.cfg,
+		stdin:        stdin,
+		pending:      make(map[int]*pendingRPC),
+		pendingTools: make(map[string]*pendingToolCall),
 		onMessage: func(msg Message) {
 			if msg.Type == MessageText {
 				outputMu.Lock()
@@ -188,7 +190,7 @@ func (b *hermesBackend) Execute(ctx context.Context, prompt string, opts ExecOpt
 			resCh <- Result{Status: finalStatus, Error: finalError, DurationMs: time.Since(startTime).Milliseconds()}
 			return
 		}
-		sessionID = extractHermesSessionID(result)
+		sessionID = extractACPSessionID(result)
 		if sessionID == "" {
 			finalStatus = "failed"
 			finalError = "hermes session/new returned no session ID"
@@ -336,6 +338,7 @@ type hermesPromptResult struct {
 type hermesClient struct {
 	cfg Config
 	stdin interface{ Write([]byte) (int, error) }
+	writeMu sync.Mutex // serialises stdin.Write calls across goroutines
 	mu sync.Mutex
 	nextID int
 	pending map[int]*pendingRPC
@@ -343,10 +346,39 @@ type hermesClient struct {
 	onMessage func(Message)
 	onPromptDone func(hermesPromptResult)

+	// pendingTools buffers the args for tool calls whose input streams in
+	// across multiple ACP tool_call_update messages (kimi does this —
+	// tokens from the LLM arrive one at a time, and each update carries
+	// the cumulative args JSON so far). We defer emitting MessageToolUse
+	// until we either see status=completed/failed or have a full arg set,
+	// so the UI never sees a half-written command like `{"comma`.
+	toolMu       sync.Mutex
+	pendingTools map[string]*pendingToolCall
+
 	usageMu sync.Mutex
 	usage   TokenUsage
 }

+// pendingToolCall buffers state for a tool call while its arguments
+// are streaming in. One entry per ACP toolCallId.
+type pendingToolCall struct {
+	toolName string         // already mapped via hermesToolNameFromTitle
+	input    map[string]any // from rawInput when the agent sends it up front (hermes)
+	argsText string         // accumulated `content[].text` args (kimi, cumulative)
+	emitted  bool           // whether we've already sent MessageToolUse
+}
+
+// writeLine serialises concurrent JSON-RPC writes so request() (main
+// goroutine) and handleAgentRequest() (reader goroutine) don't
+// interleave frames. The pipe itself is atomic for small writes, but
+// we also want deterministic ordering under contention.
+func (c *hermesClient) writeLine(data []byte) error {
+	c.writeMu.Lock()
+	defer c.writeMu.Unlock()
+	_, err := c.stdin.Write(data)
+	return err
+}
+
 func (c *hermesClient) request(ctx context.Context, method string, params any) (json.RawMessage, error) {
 	c.mu.Lock()
 	id := c.nextID
@@ -369,7 +401,7 @@ func (c *hermesClient) request(ctx context.Context, method string, params any) (
 		return nil, err
 	}
 	data = append(data, '\n')
-	if _, err := c.stdin.Write(data); err != nil {
+	if err := c.writeLine(data); err != nil {
 		c.mu.Lock()
 		delete(c.pending, id)
 		c.mu.Unlock()
@@ -402,7 +434,11 @@ func (c *hermesClient) handleLine(line string) {
 		return
 	}

-	// Check if it's a response to our request (has id + result or error).
+	// Agent → client request: has id + method (no result / error yet).
+	// Kimi uses this for session/request_permission; if we don't answer,
+	// the agent blocks for 300s and the task hangs. Hermes doesn't send
+	// these when launched with HERMES_YOLO_MODE=1, but we still handle
+	// the case generically for any future ACP backend we bolt on.
 	if _, hasID := raw["id"]; hasID {
 		if _, hasResult := raw["result"]; hasResult {
 			c.handleResponse(raw)
@@ -412,6 +448,10 @@ func (c *hermesClient) handleLine(line string) {
 			c.handleResponse(raw)
 			return
 		}
+		if _, hasMethod := raw["method"]; hasMethod {
+			c.handleAgentRequest(raw)
+			return
+		}
 	}

 	// Notification (no id, has method) — session updates from Hermes.
@@ -420,6 +460,62 @@ func (c *hermesClient) handleLine(line string) {
 	}
 }

+// handleAgentRequest replies to JSON-RPC requests the agent sends
+// us (agent → client direction). The only one we care about today is
+// `session/request_permission`: the daemon is headless and cannot
+// actually prompt a user, so we auto-approve every action. Using
+// `approve_for_session` rather than `approve` means subsequent
+// identical actions (every Shell invocation, every file write) don't
+// round-trip through us — the agent remembers them locally.
+func (c *hermesClient) handleAgentRequest(raw map[string]json.RawMessage) {
+	var method string
+	_ = json.Unmarshal(raw["method"], &method)
+
+	rawID, ok := raw["id"]
+	if !ok {
+		return
+	}
+
+	var resp map[string]any
+	switch method {
+	case "session/request_permission":
+		resp = map[string]any{
+			"jsonrpc": "2.0",
+			"id":      json.RawMessage(rawID),
+			"result": map[string]any{
+				"outcome": map[string]any{
+					"outcome":  "selected",
+					"optionId": "approve_for_session",
+				},
+			},
+		}
+		c.cfg.Logger.Debug("auto-approved agent permission request", "method", method)
+	default:
+		// Unknown agent→client method — reply with standard "method
+		// not found" so the agent doesn't block waiting for us. Better
+		// than silence: the agent can decide how to proceed.
+		resp = map[string]any{
+			"jsonrpc": "2.0",
+			"id":      json.RawMessage(rawID),
+			"error": map[string]any{
+				"code":    -32601,
+				"message": "method not found: " + method,
+			},
+		}
+		c.cfg.Logger.Debug("unhandled agent→client request", "method", method)
+	}
+
+	data, err := json.Marshal(resp)
+	if err != nil {
+		c.cfg.Logger.Warn("marshal agent-request response", "method", method, "error", err)
+		return
+	}
+	data = append(data, '\n')
+	if err := c.writeLine(data); err != nil {
+		c.cfg.Logger.Warn("write agent-request response", "method", method, "error", err)
+	}
+}
+
 func (c *hermesClient) handleResponse(raw map[string]json.RawMessage) {
 	var id int
 	if err := json.Unmarshal(raw["id"], &id); err != nil {
@@ -560,51 +656,283 @@ func (c *hermesClient) handleAgentThought(data json.RawMessage) {
|
||||
|
||||
func (c *hermesClient) handleToolCallStart(data json.RawMessage) {
|
||||
var msg struct {
|
||||
ToolCallID string `json:"toolCallId"`
|
||||
Title string `json:"title"`
|
||||
Kind string `json:"kind"`
|
||||
RawInput map[string]any `json:"rawInput"`
|
||||
ToolCallID string `json:"toolCallId"`
|
||||
Title string `json:"title"`
|
||||
Kind string `json:"kind"`
|
||||
RawInput map[string]any `json:"rawInput"`
|
||||
Content []json.RawMessage `json:"content"`
|
||||
}
|
||||
if err := json.Unmarshal(data, &msg); err != nil {
|
||||
return
|
||||
}
|
||||
|
||||
toolName := hermesToolNameFromTitle(msg.Title, msg.Kind)
|
||||
if c.onMessage != nil {
|
||||
c.onMessage(Message{
|
||||
Type: MessageToolUse,
|
||||
Tool: toolName,
|
||||
CallID: msg.ToolCallID,
|
||||
Input: msg.RawInput,
|
||||
|
||||
// Hermes pre-populates rawInput on the initial tool_call — emit
|
||||
// MessageToolUse immediately so the UI can show the tool invocation
|
||||
// live. Record the emission so handleToolCallUpdate doesn't re-emit
|
||||
// on completion.
|
||||
if msg.RawInput != nil {
|
||||
c.trackTool(msg.ToolCallID, &pendingToolCall{
|
||||
toolName: toolName,
|
||||
input: msg.RawInput,
|
||||
emitted: true,
|
||||
})
|
||||
if c.onMessage != nil {
|
||||
c.onMessage(Message{
|
||||
Type: MessageToolUse,
|
||||
Tool: toolName,
|
||||
CallID: msg.ToolCallID,
|
||||
Input: msg.RawInput,
|
||||
})
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
// Kimi streams args token-by-token across tool_call_update messages;
|
||||
// the initial tool_call often carries an empty content block. Buffer
|
||||
// the tool and defer MessageToolUse emission to avoid the UI seeing
|
||||
// a command with `{""` as its input.
|
||||
c.trackTool(msg.ToolCallID, &pendingToolCall{
|
||||
toolName: toolName,
|
||||
argsText: extractACPToolCallText(msg.Content),
|
||||
emitted: false,
|
||||
})
|
||||
}
|
||||
|
||||
func (c *hermesClient) handleToolCallUpdate(data json.RawMessage) {
|
||||
var msg struct {
|
||||
ToolCallID string `json:"toolCallId"`
|
||||
Status string `json:"status"`
|
||||
Kind string `json:"kind"`
|
||||
RawOutput string `json:"rawOutput"`
|
||||
ToolCallID string `json:"toolCallId"`
|
||||
Status string `json:"status"`
|
||||
Title string `json:"title"`
|
||||
Kind string `json:"kind"`
|
||||
RawInput map[string]any `json:"rawInput"`
|
||||
RawOutput string `json:"rawOutput"`
|
||||
Content []json.RawMessage `json:"content"`
|
||||
}
|
||||
if err := json.Unmarshal(data, &msg); err != nil {
|
||||
return
|
||||
}
|
||||
|
||||
// Only emit tool result when the call is completed.
|
||||
// Mid-stream: only buffer updates. Kimi emits many of these per
|
||||
// tool call, each carrying the cumulative args JSON so far.
|
||||
if msg.Status != "completed" && msg.Status != "failed" {
|
||||
if pending := c.getPendingTool(msg.ToolCallID); pending != nil && !pending.emitted {
|
||||
if text := extractACPToolCallText(msg.Content); text != "" {
|
||||
// kimi streams the full cumulative args on every frame;
|
||||
// overwrite rather than concatenate.
|
||||
pending.argsText = text
|
||||
}
|
||||
}
|
||||
return
|
||||
}
|
||||
|
||||
// Completion: emit any deferred MessageToolUse first, then the result.
|
||||
pending := c.takePendingTool(msg.ToolCallID)
|
||||
c.emitDeferredToolUse(pending, msg.ToolCallID, msg.Title, msg.Kind, msg.RawInput)
|
||||
|
||||
output := msg.RawOutput
|
||||
if output == "" {
|
||||
output = extractACPToolCallText(msg.Content)
|
||||
}
|
||||
if c.onMessage != nil {
|
||||
c.onMessage(Message{
|
||||
Type: MessageToolResult,
|
||||
CallID: msg.ToolCallID,
|
||||
Output: msg.RawOutput,
|
||||
Output: output,
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
// trackTool stores pending-tool state for a given callID. Lazy-inits
|
||||
// the map so zero-value hermesClient values (common in tests) don't
|
||||
// panic on the first tool call.
|
||||
func (c *hermesClient) trackTool(callID string, p *pendingToolCall) {
|
||||
c.toolMu.Lock()
|
||||
defer c.toolMu.Unlock()
|
||||
if c.pendingTools == nil {
|
||||
c.pendingTools = make(map[string]*pendingToolCall)
|
||||
}
|
||||
c.pendingTools[callID] = p
|
||||
}
|
||||
|
||||
// getPendingTool returns the pending entry (may be nil) without
|
||||
// removing it. Safe to call on a zero-value hermesClient.
|
||||
func (c *hermesClient) getPendingTool(callID string) *pendingToolCall {
|
||||
c.toolMu.Lock()
|
||||
defer c.toolMu.Unlock()
|
||||
if c.pendingTools == nil {
|
||||
return nil
|
||||
}
|
||||
return c.pendingTools[callID]
|
||||
}
|
||||
|
||||
// takePendingTool removes and returns the pending entry, or nil if
// none was tracked (e.g. the start frame was missed or arrived after
// the completion).
|
||||
func (c *hermesClient) takePendingTool(callID string) *pendingToolCall {
|
||||
c.toolMu.Lock()
|
||||
defer c.toolMu.Unlock()
|
||||
if c.pendingTools == nil {
|
||||
return nil
|
||||
}
|
||||
p := c.pendingTools[callID]
|
||||
delete(c.pendingTools, callID)
|
||||
return p
|
||||
}
|
||||
|
||||
// emitDeferredToolUse emits a buffered MessageToolUse right before the
|
||||
// matching MessageToolResult. Handles three cases:
|
||||
// - hermes tool: already emitted on tool_call → skip
|
||||
// - kimi tool with streamed args → parse accumulated JSON as Input
|
||||
// - unknown tool (completed arrived without a start frame) →
|
||||
// synthesize minimal info from the update's own fields
|
||||
func (c *hermesClient) emitDeferredToolUse(
|
||||
p *pendingToolCall,
|
||||
callID, updateTitle, updateKind string,
|
||||
updateRawInput map[string]any,
|
||||
) {
|
||||
if p != nil && p.emitted {
|
||||
return
|
||||
}
|
||||
|
||||
var toolName string
|
||||
var input map[string]any
|
||||
|
||||
switch {
|
||||
case p != nil && p.input != nil:
|
||||
// Pre-buffered rawInput path — shouldn't happen because we set
|
||||
// emitted=true in that case, but handle defensively.
|
||||
toolName = p.toolName
|
||||
input = p.input
|
||||
case p != nil:
|
||||
toolName = p.toolName
|
||||
input = parseToolArgsJSON(p.argsText)
|
||||
default:
|
||||
// No record of the start frame — fall back to the update's own
|
||||
// title/kind/rawInput so the UI at least sees the tool name.
|
||||
toolName = hermesToolNameFromTitle(updateTitle, updateKind)
|
||||
input = updateRawInput
|
||||
}
|
||||
|
||||
if c.onMessage == nil {
|
||||
return
|
||||
}
|
||||
c.onMessage(Message{
|
||||
Type: MessageToolUse,
|
||||
Tool: toolName,
|
||||
CallID: callID,
|
||||
Input: input,
|
||||
})
|
||||
}
|
||||
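The deferral handled by emitDeferredToolUse can be sketched in isolation. This is a minimal illustration under stated assumptions, not the hermesClient code: `tracker`, `start`, `update`, and `complete` are hypothetical names, and events are recorded as plain strings rather than Message values.

```go
package main

import "fmt"

// pendingCall is a hypothetical stand-in for the buffered tool state.
type pendingCall struct {
	tool string
	args string // latest cumulative args text
}

// tracker records emitted events as strings, for illustration only.
type tracker struct {
	pending map[string]*pendingCall
	events  []string
}

// start buffers a tool call without emitting anything yet.
func (tr *tracker) start(id, tool string) {
	if tr.pending == nil {
		tr.pending = make(map[string]*pendingCall)
	}
	tr.pending[id] = &pendingCall{tool: tool}
}

// update overwrites the buffered args, because each frame is cumulative.
func (tr *tracker) update(id, cumulativeArgs string) {
	if p := tr.pending[id]; p != nil {
		p.args = cumulativeArgs
	}
}

// complete emits the deferred "use" event first, then the result.
func (tr *tracker) complete(id, output string) {
	p := tr.pending[id]
	delete(tr.pending, id) // no-op on a nil map
	if p != nil {
		tr.events = append(tr.events, "use "+p.tool+" "+p.args)
	} else {
		tr.events = append(tr.events, "use (unknown tool)")
	}
	tr.events = append(tr.events, "result "+output)
}

func main() {
	tr := &tracker{}
	tr.start("tc-1", "Shell")
	tr.update("tc-1", `{"comma`)
	tr.update("tc-1", `{"command":"echo hi"}`)
	tr.complete("tc-1", "hi")
	for _, e := range tr.events {
		fmt.Println(e)
	}
}
```

Overwriting instead of appending is what keeps a half-streamed fragment such as `{"comma` from ever reaching the consumer.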
|
||||
// parseToolArgsJSON turns kimi's accumulated args string into the
|
||||
// structured map the UI expects under Message.Input. Kimi sends args
|
||||
// as a JSON-encoded object (`{"command":"echo hi"}`), so a full JSON
|
||||
// parse recovers the original tool-arg shape. On malformed input
|
||||
// (streaming glitch, non-JSON tool) we preserve the raw text under a
|
||||
// `text` key so the UI still has something to render.
|
||||
func parseToolArgsJSON(argsText string) map[string]any {
|
||||
argsText = strings.TrimSpace(argsText)
|
||||
if argsText == "" {
|
||||
return nil
|
||||
}
|
||||
var m map[string]any
|
||||
if err := json.Unmarshal([]byte(argsText), &m); err == nil {
|
||||
return m
|
||||
}
|
||||
return map[string]any{"text": argsText}
|
||||
}
|
||||
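The parse-or-fallback behaviour of parseToolArgsJSON can be tried out in a standalone program. The sketch below copies the logic into a hypothetical `parseArgs` function so it runs without the rest of the agent package.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// parseArgs is a hypothetical copy of the parse-or-fallback idea:
// a valid JSON object becomes a map, anything else is kept under "text".
func parseArgs(argsText string) map[string]any {
	argsText = strings.TrimSpace(argsText)
	if argsText == "" {
		return nil
	}
	var m map[string]any
	if err := json.Unmarshal([]byte(argsText), &m); err == nil {
		return m
	}
	return map[string]any{"text": argsText}
}

func main() {
	fmt.Println(parseArgs(`{"command":"echo hi"}`)) // structured args
	fmt.Println(parseArgs(`{"comma`))               // truncated stream, kept as text
	fmt.Println(parseArgs("   ") == nil)            // blank input yields nil
}
```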
|
||||
// extractACPToolCallText concatenates the rendered text of every ACP
|
||||
// block in a tool_call / tool_call_update's `content` array.
|
||||
//
|
||||
// Handles the two block types kimi emits:
|
||||
// - {type:"content", content:{type:"text", text:"..."}} — plain text
|
||||
// (shell output, tool args). Text is concatenated verbatim.
|
||||
// - {type:"diff", path, oldText, newText} — FileEdit output. Rendered
|
||||
// as a minimal unified-diff header so the UI distinguishes writes
|
||||
// from reads without needing a diff viewer.
|
||||
//
|
||||
// Terminal blocks ({type:"terminal", terminalId}) reference a remote
|
||||
// terminal the client would normally subscribe to via terminal/output;
|
||||
// we don't advertise terminal capability so we never receive those in
|
||||
// practice, but if one slips through we skip it (nothing useful to
|
||||
// surface from a bare ID).
|
||||
func extractACPToolCallText(blocks []json.RawMessage) string {
|
||||
var b strings.Builder
|
||||
appendPiece := func(piece string) {
|
||||
if piece == "" {
|
||||
return
|
||||
}
|
||||
if b.Len() > 0 {
|
||||
b.WriteByte('\n')
|
||||
}
|
||||
b.WriteString(piece)
|
||||
}
|
||||
for _, raw := range blocks {
|
||||
var kind struct {
|
||||
Type string `json:"type"`
|
||||
}
|
||||
if err := json.Unmarshal(raw, &kind); err != nil {
|
||||
continue
|
||||
}
|
||||
switch kind.Type {
|
||||
case "content":
|
||||
var outer struct {
|
||||
Content json.RawMessage `json:"content"`
|
||||
}
|
||||
if err := json.Unmarshal(raw, &outer); err != nil || len(outer.Content) == 0 {
|
||||
continue
|
||||
}
|
||||
var inner struct {
|
||||
Type string `json:"type"`
|
||||
Text string `json:"text"`
|
||||
}
|
||||
if err := json.Unmarshal(outer.Content, &inner); err != nil {
|
||||
continue
|
||||
}
|
||||
if inner.Type != "text" {
|
||||
continue
|
||||
}
|
||||
appendPiece(inner.Text)
|
||||
case "diff":
|
||||
var diff struct {
|
||||
Path string `json:"path"`
|
||||
OldText string `json:"oldText"`
|
||||
NewText string `json:"newText"`
|
||||
}
|
||||
if err := json.Unmarshal(raw, &diff); err != nil || diff.Path == "" {
|
||||
continue
|
||||
}
|
||||
// Keep it tiny — a full unified diff can be huge and we're
|
||||
// really just recording "this tool wrote to this file".
|
||||
// The UI can re-read the file if it needs the actual content.
|
||||
var piece strings.Builder
|
||||
piece.WriteString("--- ")
|
||||
piece.WriteString(diff.Path)
|
||||
piece.WriteString("\n+++ ")
|
||||
piece.WriteString(diff.Path)
|
||||
if diff.OldText == "" {
|
||||
piece.WriteString("\n(new file, ")
|
||||
piece.WriteString(strconv.Itoa(len(diff.NewText)))
|
||||
piece.WriteString(" bytes)")
|
||||
} else {
|
||||
piece.WriteString("\n(edited: ")
|
||||
piece.WriteString(strconv.Itoa(len(diff.OldText)))
|
||||
piece.WriteString(" → ")
|
||||
piece.WriteString(strconv.Itoa(len(diff.NewText)))
|
||||
piece.WriteString(" bytes)")
|
||||
}
|
||||
appendPiece(piece.String())
|
||||
default:
|
||||
// terminal blocks, image blocks, unknown future types —
|
||||
// ignore. We have no way to inline-render them.
|
||||
}
|
||||
}
|
||||
return b.String()
|
||||
}
|
||||
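A simplified variant of the block handling above, for experimentation. It is a sketch under assumptions, not the function from this change: `block` and `renderBlocks` are hypothetical names, it decodes into one union struct instead of peeking at `type` first, and it collapses the new-file and edited cases into a single byte-count line.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// block is a hypothetical union of the fields used by the two block shapes.
type block struct {
	Type    string          `json:"type"`
	Content json.RawMessage `json:"content"` // present on "content" blocks
	Path    string          `json:"path"`    // present on "diff" blocks
	OldText string          `json:"oldText"`
	NewText string          `json:"newText"`
}

// renderBlocks joins text blocks and short diff summaries with newlines.
// Unknown block types (terminal, image, ...) are skipped.
func renderBlocks(raw string) (string, error) {
	var blocks []block
	if err := json.Unmarshal([]byte(raw), &blocks); err != nil {
		return "", err
	}
	var parts []string
	for _, b := range blocks {
		switch b.Type {
		case "content":
			var inner struct {
				Type string `json:"type"`
				Text string `json:"text"`
			}
			if json.Unmarshal(b.Content, &inner) == nil && inner.Type == "text" && inner.Text != "" {
				parts = append(parts, inner.Text)
			}
		case "diff":
			if b.Path != "" {
				parts = append(parts, fmt.Sprintf("--- %s\n+++ %s\n(%d -> %d bytes)",
					b.Path, b.Path, len(b.OldText), len(b.NewText)))
			}
		}
	}
	return strings.Join(parts, "\n"), nil
}

func main() {
	out, err := renderBlocks(`[{"type":"terminal","terminalId":"t1"},{"type":"content","content":{"type":"text","text":"shell out"}}]`)
	fmt.Println(out, err)
}
```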
|
||||
func (c *hermesClient) handleUsageUpdate(data json.RawMessage) {
|
||||
var msg struct {
|
||||
Usage struct {
|
||||
@@ -634,7 +962,10 @@ func (c *hermesClient) handleUsageUpdate(data json.RawMessage) {
|
||||
|
||||
// ── Helpers ──
|
||||
|
||||
func extractHermesSessionID(result json.RawMessage) string {
|
||||
// extractACPSessionID pulls `sessionId` out of a session/new or
|
||||
// session/resume response. Shared by all ACP backends (hermes, kimi,
|
||||
// and anything else that follows the standard ACP schema).
|
||||
func extractACPSessionID(result json.RawMessage) string {
|
||||
var r struct {
|
||||
SessionID string `json:"sessionId"`
|
||||
}
|
||||
@@ -697,46 +1028,64 @@ func hermesToolNameFromTitle(title string, kind string) string {
|
||||
case "think":
|
||||
return "thinking"
|
||||
default:
|
||||
// Preserve a non-empty title when we can't classify it: kimi
|
||||
// emits bare titles like "Shell" or "Read file" without any
|
||||
// `kind`, so returning an empty string here drops the tool
|
||||
// name entirely before kimiToolNameFromTitle can map it.
|
||||
// Hermes titles always carry a colon, so hermes never reaches
|
||||
// this branch with a non-empty title.
|
||||
if title != "" {
|
||||
return title
|
||||
}
|
||||
return kind
|
||||
}
|
||||
}
|
||||
|
||||
// ── Provider-error sniffing ──
|
||||
//
|
||||
// hermes' session/prompt RPC reports stopReason=end_turn even when
|
||||
// the underlying HTTP call to the configured LLM endpoint returned
|
||||
// an error — the actionable detail only appears on stderr (e.g.
|
||||
// ACP agents (hermes, kimi, …) all have the same failure mode:
|
||||
// session/prompt reports stopReason=end_turn even when the underlying
|
||||
// HTTP call to the configured LLM endpoint returned an error — the
|
||||
// actionable detail only appears on stderr (e.g.
|
||||
// `⚠️ API call failed (attempt 1/3): BadRequestError [HTTP 400]` and
|
||||
// `Error: HTTP 400: Error code: 400 - {'detail': "The '...' model
|
||||
// is not supported when using Codex with a ChatGPT account."}`).
|
||||
// We scan for those patterns so the daemon can surface a real
|
||||
// failure instead of a generic "empty output".
|
||||
type hermesProviderErrorSniffer struct {
|
||||
mu sync.Mutex
|
||||
remains []byte // buffer for a partial trailing line across writes
|
||||
lines []string // captured error lines, bounded
|
||||
seen map[string]bool
|
||||
// The sniffer scans for those patterns so the daemon can surface a
|
||||
// real failure instead of a generic "empty output".
|
||||
//
|
||||
// Parameterised by provider name so both hermes and kimi can share
|
||||
// the transport: the regexes match format-level signals (HTTP status,
|
||||
// error-kind tags, "API call failed" banner) that both runtimes emit.
|
||||
type acpProviderErrorSniffer struct {
|
||||
provider string
|
||||
mu sync.Mutex
|
||||
remains []byte // buffer for a partial trailing line across writes
|
||||
lines []string // captured error lines, bounded
|
||||
seen map[string]bool
|
||||
}
|
||||
|
||||
// hermesErrorHeaderRe matches the first line of an API-error block.
|
||||
// Hermes prefixes these with ⚠️ / ❌ and includes an HTTP status
|
||||
// code or a non-retryable-error tag.
|
||||
var hermesErrorHeaderRe = regexp.MustCompile(`(?:⚠️|❌|\[ERROR\]).*(?:BadRequestError|AuthenticationError|RateLimitError|HTTP [0-9]{3}|Non-retryable|API call failed)`)
|
||||
// acpErrorHeaderRe matches the first line of an API-error block.
|
||||
// ACP agents typically prefix these with ⚠️ / ❌ and include an HTTP
|
||||
// status code or a non-retryable-error tag.
|
||||
var acpErrorHeaderRe = regexp.MustCompile(`(?:⚠️|❌|\[ERROR\]).*(?:BadRequestError|AuthenticationError|RateLimitError|HTTP [0-9]{3}|Non-retryable|API call failed)`)
|
||||
|
||||
// hermesErrorDetailRe pulls the most useful single-line messages
|
||||
// out of the subsequent lines of the error block (the one whose
|
||||
// "Error:" or "Details:" tag actually spells out what happened).
|
||||
var hermesErrorDetailRe = regexp.MustCompile(`(?:Error:|detail:|Details:)\s*(.+)`)
|
||||
// acpErrorDetailRe pulls the most useful single-line messages out of
|
||||
// the subsequent lines of the error block (the one whose "Error:" or
|
||||
// "Details:" tag actually spells out what happened).
|
||||
var acpErrorDetailRe = regexp.MustCompile(`(?:Error:|detail:|Details:)\s*(.+)`)
|
||||
|
||||
const hermesMaxErrorLines = 8
|
||||
const acpMaxErrorLines = 8
|
||||
|
||||
func newHermesProviderErrorSniffer() *hermesProviderErrorSniffer {
|
||||
return &hermesProviderErrorSniffer{seen: map[string]bool{}}
|
||||
// newACPProviderErrorSniffer returns a sniffer that tags its messages
|
||||
// with the given provider name (e.g. "hermes", "kimi") so failure
|
||||
// strings make it obvious which runtime produced the error.
|
||||
func newACPProviderErrorSniffer(provider string) *acpProviderErrorSniffer {
|
||||
return &acpProviderErrorSniffer{provider: provider, seen: map[string]bool{}}
|
||||
}
|
||||
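The sniffer relies on buffering a partial trailing line between writes. A minimal sketch of that idea follows; `lineSniffer` is a hypothetical type, it reuses only the detail pattern shown above, and it leaves out the deduplication and line cap of the real sniffer.

```go
package main

import (
	"bytes"
	"fmt"
	"regexp"
	"strings"
)

// detailRe mirrors the detail pattern from this change.
var detailRe = regexp.MustCompile(`(?:Error:|detail:|Details:)\s*(.+)`)

// lineSniffer is a hypothetical, unbounded line-buffering io.Writer.
type lineSniffer struct {
	remains []byte   // partial trailing line carried across writes
	details []string // captured detail fragments
}

// Write splits the input on newlines and only inspects complete lines.
func (s *lineSniffer) Write(p []byte) (int, error) {
	s.remains = append(s.remains, p...)
	for {
		i := bytes.IndexByte(s.remains, '\n')
		if i < 0 {
			break // keep the incomplete tail for the next write
		}
		line := strings.TrimSpace(string(s.remains[:i]))
		s.remains = s.remains[i+1:]
		if m := detailRe.FindStringSubmatch(line); m != nil {
			s.details = append(s.details, strings.TrimSpace(m[1]))
		}
	}
	return len(p), nil
}

func main() {
	s := &lineSniffer{}
	s.Write([]byte("   Err"))
	s.Write([]byte("or: something went wrong\n"))
	fmt.Println(s.details)
}
```

Because Write always reports len(p) and a nil error, a writer like this can sit next to a log forwarder behind an io.MultiWriter without interrupting the stream.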
|
||||
// Write implements io.Writer so the sniffer can sit behind an
|
||||
// io.MultiWriter next to the normal stderr log forwarder.
|
||||
func (s *hermesProviderErrorSniffer) Write(p []byte) (int, error) {
|
||||
func (s *acpProviderErrorSniffer) Write(p []byte) (int, error) {
|
||||
s.mu.Lock()
|
||||
defer s.mu.Unlock()
|
||||
|
||||
@@ -757,7 +1106,7 @@ func (s *hermesProviderErrorSniffer) Write(p []byte) (int, error) {
|
||||
if line == "" {
|
||||
continue
|
||||
}
|
||||
if !(hermesErrorHeaderRe.MatchString(line) || hermesErrorDetailRe.MatchString(line)) {
|
||||
if !(acpErrorHeaderRe.MatchString(line) || acpErrorDetailRe.MatchString(line)) {
|
||||
continue
|
||||
}
|
||||
if s.seen[line] {
|
||||
@@ -765,8 +1114,8 @@ func (s *hermesProviderErrorSniffer) Write(p []byte) (int, error) {
|
||||
}
|
||||
s.seen[line] = true
|
||||
s.lines = append(s.lines, line)
|
||||
if len(s.lines) > hermesMaxErrorLines {
|
||||
s.lines = s.lines[len(s.lines)-hermesMaxErrorLines:]
|
||||
if len(s.lines) > acpMaxErrorLines {
|
||||
s.lines = s.lines[len(s.lines)-acpMaxErrorLines:]
|
||||
}
|
||||
}
|
||||
return len(p), nil
|
||||
@@ -776,21 +1125,22 @@ func (s *hermesProviderErrorSniffer) Write(p []byte) (int, error) {
|
||||
// error field. Prefers the most specific "Error:" / "detail:"
|
||||
// fragment; falls back to the first captured header line; empty
|
||||
// when nothing useful was seen.
|
||||
func (s *hermesProviderErrorSniffer) message() string {
|
||||
func (s *acpProviderErrorSniffer) message() string {
|
||||
s.mu.Lock()
|
||||
defer s.mu.Unlock()
|
||||
|
||||
prefix := s.provider + " provider error: "
|
||||
for _, line := range s.lines {
|
||||
if m := hermesErrorDetailRe.FindStringSubmatch(line); m != nil {
|
||||
if m := acpErrorDetailRe.FindStringSubmatch(line); m != nil {
|
||||
detail := strings.TrimSpace(m[1])
|
||||
if detail != "" {
|
||||
return "hermes provider error: " + detail
|
||||
return prefix + detail
|
||||
}
|
||||
}
|
||||
}
|
||||
for _, line := range s.lines {
|
||||
if hermesErrorHeaderRe.MatchString(line) {
|
||||
return "hermes provider error: " + line
|
||||
if acpErrorHeaderRe.MatchString(line) {
|
||||
return prefix + line
|
||||
}
|
||||
}
|
||||
return ""
|
||||
|
||||
@@ -2,7 +2,9 @@ package agent
|
||||
|
||||
import (
|
||||
"encoding/json"
|
||||
"log/slog"
|
||||
"strings"
|
||||
"sync"
|
||||
"testing"
|
||||
)
|
||||
|
||||
@@ -17,30 +19,30 @@ func TestNewReturnsHermesBackend(t *testing.T) {
|
||||
}
|
||||
}
|
||||
|
||||
// ── extractHermesSessionID ──
|
||||
// ── extractACPSessionID ──
|
||||
|
||||
func TestExtractHermesSessionID(t *testing.T) {
|
||||
func TestExtractACPSessionID(t *testing.T) {
|
||||
t.Parallel()
|
||||
raw := json.RawMessage(`{"sessionId":"20260410_141145_47260c"}`)
|
||||
got := extractHermesSessionID(raw)
|
||||
got := extractACPSessionID(raw)
|
||||
if got != "20260410_141145_47260c" {
|
||||
t.Errorf("got %q, want %q", got, "20260410_141145_47260c")
|
||||
}
|
||||
}
|
||||
|
||||
func TestExtractHermesSessionIDEmpty(t *testing.T) {
|
||||
func TestExtractACPSessionIDEmpty(t *testing.T) {
|
||||
t.Parallel()
|
||||
raw := json.RawMessage(`{}`)
|
||||
got := extractHermesSessionID(raw)
|
||||
got := extractACPSessionID(raw)
|
||||
if got != "" {
|
||||
t.Errorf("got %q, want empty", got)
|
||||
}
|
||||
}
|
||||
|
||||
func TestExtractHermesSessionIDInvalidJSON(t *testing.T) {
|
||||
func TestExtractACPSessionIDInvalidJSON(t *testing.T) {
|
||||
t.Parallel()
|
||||
raw := json.RawMessage(`not json`)
|
||||
got := extractHermesSessionID(raw)
|
||||
got := extractACPSessionID(raw)
|
||||
if got != "" {
|
||||
t.Errorf("got %q, want empty", got)
|
||||
}
|
||||
@@ -65,14 +67,24 @@ func TestHermesToolNameFromTitle(t *testing.T) {
|
||||
{"delegate: fix the bug", "execute", "delegate_task"},
|
||||
{"analyze image: what is this?", "read", "vision_analyze"},
|
||||
{"execute code", "execute", "execute_code"},
|
||||
// Fallback to kind when no colon in title.
|
||||
// Fallback to kind when no colon in title but kind is known.
|
||||
{"unknownTool", "read", "read_file"},
|
||||
{"unknownTool", "edit", "write_file"},
|
||||
{"unknownTool", "execute", "terminal"},
|
||||
{"unknownTool", "search", "search_files"},
|
||||
{"unknownTool", "fetch", "web_search"},
|
||||
{"unknownTool", "think", "thinking"},
|
||||
{"unknownTool", "other", "other"},
|
||||
// Bare title (no colon, no known kind) — preserve the title
|
||||
// itself rather than falling back to an unclassified kind.
|
||||
// Matters for kimi: its ACP `tool_call` updates emit a bare
|
||||
// `title: "Shell"` with no `kind`, and we need downstream
|
||||
// normalisation (kimiToolNameFromTitle) to see "Shell" rather
|
||||
// than an empty string.
|
||||
{"Shell", "", "Shell"},
|
||||
{"Read file", "", "Read file"},
|
||||
{"unknownTool", "other", "unknownTool"},
|
||||
// Empty title falls back to kind, even when kind isn't known.
|
||||
{"", "other", "other"},
|
||||
// Tool with colon but not in known map.
|
||||
{"custom_tool: args", "other", "custom_tool"},
|
||||
}
|
||||
@@ -101,7 +113,7 @@ func TestHermesClientHandleLineResponse(t *testing.T) {
|
||||
if res.err != nil {
|
||||
t.Fatalf("unexpected error: %v", res.err)
|
||||
}
|
||||
sid := extractHermesSessionID(res.result)
|
||||
sid := extractACPSessionID(res.result)
|
||||
if sid != "ses_abc" {
|
||||
t.Errorf("sessionId: got %q, want %q", sid, "ses_abc")
|
||||
}
|
||||
@@ -127,6 +139,111 @@ func TestHermesClientHandleLineError(t *testing.T) {
|
||||
}
|
||||
}
|
||||
|
||||
// ── agent → client request handling ──
|
||||
|
||||
// bufferWriter is a test stand-in for cmd.StdinPipe that captures
|
||||
// writes in-memory so we can assert what handleAgentRequest emitted.
|
||||
type bufferWriter struct {
|
||||
mu sync.Mutex
|
||||
buf strings.Builder
|
||||
}
|
||||
|
||||
func (b *bufferWriter) Write(p []byte) (int, error) {
|
||||
b.mu.Lock()
|
||||
defer b.mu.Unlock()
|
||||
return b.buf.Write(p)
|
||||
}
|
||||
|
||||
func (b *bufferWriter) String() string {
|
||||
b.mu.Lock()
|
||||
defer b.mu.Unlock()
|
||||
return b.buf.String()
|
||||
}
|
||||
|
||||
// TestHermesClientAutoApprovesPermissionRequest asserts that when an
|
||||
// ACP agent sends us `session/request_permission` (kimi does this on
|
||||
// every Shell / file-mutating tool call), the client replies with
|
||||
// `approve_for_session` — without this the agent blocks 300s and the
|
||||
// task hangs. The id in the reply must match the agent's request id
|
||||
// so its in-flight future resolves.
|
||||
func TestHermesClientAutoApprovesPermissionRequest(t *testing.T) {
|
||||
t.Parallel()
|
||||
|
||||
w := &bufferWriter{}
|
||||
c := &hermesClient{
|
||||
cfg: Config{Logger: slog.Default()},
|
||||
stdin: w,
|
||||
pending: make(map[int]*pendingRPC),
|
||||
}
|
||||
|
||||
c.handleLine(`{"jsonrpc":"2.0","id":42,"method":"session/request_permission","params":{"sessionId":"ses_1","options":[{"optionId":"approve","name":"Approve once","kind":"allow_once"},{"optionId":"approve_for_session","name":"Approve for this session","kind":"allow_always"},{"optionId":"reject","name":"Reject","kind":"reject_once"}],"toolCall":{"toolCallId":"tc_1","title":"Shell","content":[]}}}`)
|
||||
|
||||
got := w.String()
|
||||
var resp struct {
|
||||
JSONRPC string `json:"jsonrpc"`
|
||||
ID int `json:"id"`
|
||||
Result struct {
|
||||
Outcome struct {
|
||||
Outcome string `json:"outcome"`
|
||||
OptionID string `json:"optionId"`
|
||||
} `json:"outcome"`
|
||||
} `json:"result"`
|
||||
}
|
||||
if err := json.Unmarshal([]byte(strings.TrimSpace(got)), &resp); err != nil {
|
||||
t.Fatalf("reply is not valid JSON: %q err=%v", got, err)
|
||||
}
|
||||
if resp.JSONRPC != "2.0" {
|
||||
t.Errorf("jsonrpc: got %q, want 2.0", resp.JSONRPC)
|
||||
}
|
||||
if resp.ID != 42 {
|
||||
t.Errorf("id: got %d, want 42 (must echo agent's request id)", resp.ID)
|
||||
}
|
||||
if resp.Result.Outcome.Outcome != "selected" {
|
||||
t.Errorf("outcome.outcome: got %q, want %q", resp.Result.Outcome.Outcome, "selected")
|
||||
}
|
||||
if resp.Result.Outcome.OptionID != "approve_for_session" {
|
||||
t.Errorf("outcome.optionId: got %q, want %q", resp.Result.Outcome.OptionID, "approve_for_session")
|
||||
}
|
||||
}
|
||||
|
||||
// TestHermesClientReplesMethodNotFoundForUnknownAgentRequest ensures
|
||||
// that any agent → client request we don't explicitly handle gets a
|
||||
// proper JSON-RPC error back, not silence. Silence would block the
|
||||
// agent for however long its internal timeout is, same as the
|
||||
// session/request_permission hang this change fixes.
|
||||
func TestHermesClientReplesMethodNotFoundForUnknownAgentRequest(t *testing.T) {
|
||||
t.Parallel()
|
||||
|
||||
w := &bufferWriter{}
|
||||
c := &hermesClient{
|
||||
cfg: Config{Logger: slog.Default()},
|
||||
stdin: w,
|
||||
pending: make(map[int]*pendingRPC),
|
||||
}
|
||||
c.handleLine(`{"jsonrpc":"2.0","id":7,"method":"fs/read_text_file","params":{"path":"/tmp/x"}}`)
|
||||
|
||||
got := w.String()
|
||||
var resp struct {
|
||||
ID int `json:"id"`
|
||||
Error struct {
|
||||
Code int `json:"code"`
|
||||
Message string `json:"message"`
|
||||
} `json:"error"`
|
||||
}
|
||||
if err := json.Unmarshal([]byte(strings.TrimSpace(got)), &resp); err != nil {
|
||||
t.Fatalf("reply not valid JSON: %q err=%v", got, err)
|
||||
}
|
||||
if resp.ID != 7 {
|
||||
t.Errorf("id echo: got %d, want 7", resp.ID)
|
||||
}
|
||||
if resp.Error.Code != -32601 {
|
||||
t.Errorf("error code: got %d, want -32601 (method not found)", resp.Error.Code)
|
||||
}
|
||||
if !strings.Contains(resp.Error.Message, "fs/read_text_file") {
|
||||
t.Errorf("error message should name the unhandled method, got %q", resp.Error.Message)
|
||||
}
|
||||
}
|
||||
|
||||
// ── session/update notification handling ──
|
||||
|
||||
func TestHermesClientHandleAgentMessage(t *testing.T) {
|
||||
@@ -226,6 +343,211 @@ func TestHermesClientHandleToolCallComplete(t *testing.T) {
|
||||
}
|
||||
}
|
||||
|
||||
// TestHermesClientKimiStreamingToolCall walks the real kimi frame
|
||||
// sequence for a single Shell call:
|
||||
// 1. tool_call with empty content (LLM hasn't started emitting args yet)
|
||||
// 2. tool_call_update status=in_progress carrying the cumulative args
|
||||
// JSON character-by-character ("{", "{\"command", …)
|
||||
// 3. tool_call_update status=completed carrying the command's stdout
|
||||
//
|
||||
// The client must defer MessageToolUse until we have the full args so
|
||||
// the UI doesn't show a command like `{"comma` — and the MessageToolUse
|
||||
// must carry the parsed args as the Input map (`{"command": "echo hi"}`
|
||||
// → Input["command"] = "echo hi") rather than a raw string.
|
||||
func TestHermesClientKimiStreamingToolCall(t *testing.T) {
|
||||
t.Parallel()
|
||||
|
||||
var got []Message
|
||||
c := &hermesClient{
|
||||
pending: make(map[int]*pendingRPC),
|
||||
onMessage: func(msg Message) {
|
||||
got = append(got, msg)
|
||||
},
|
||||
}
|
||||
|
||||
// 1. tool_call: empty content (classic kimi start frame).
|
||||
c.handleLine(`{"jsonrpc":"2.0","method":"session/update","params":{"sessionId":"ses_1","update":{"sessionUpdate":"tool_call","toolCallId":"tc-kimi-1","title":"Shell","status":"in_progress","content":[{"type":"content","content":{"type":"text","text":""}}]}}}`)
|
||||
if len(got) != 0 {
|
||||
t.Fatalf("expected nothing emitted yet (args empty), got %+v", got)
|
||||
}
|
||||
|
||||
// 2. Streaming updates — cumulative args JSON.
|
||||
partials := []string{
|
||||
`{"`,
|
||||
`{"command`,
|
||||
`{"command":`,
|
||||
`{"command":"echo `,
|
||||
`{"command":"echo hi"}`,
|
||||
}
|
||||
for _, args := range partials {
|
||||
// JSON-encode args so embedded quotes are escaped properly.
|
||||
argsJSON, _ := json.Marshal(args)
|
||||
line := `{"jsonrpc":"2.0","method":"session/update","params":{"sessionId":"ses_1","update":{"sessionUpdate":"tool_call_update","toolCallId":"tc-kimi-1","status":"in_progress","content":[{"type":"content","content":{"type":"text","text":` + string(argsJSON) + `}}]}}}`
|
||||
c.handleLine(line)
|
||||
}
|
||||
if len(got) != 0 {
|
||||
t.Fatalf("expected nothing emitted mid-stream, got %+v", got)
|
||||
}
|
||||
|
||||
// 3. Completed — stdout.
|
||||
c.handleLine(`{"jsonrpc":"2.0","method":"session/update","params":{"sessionId":"ses_1","update":{"sessionUpdate":"tool_call_update","toolCallId":"tc-kimi-1","status":"completed","content":[{"type":"content","content":{"type":"text","text":"hi\n"}}]}}}`)
|
||||
|
||||
if len(got) != 2 {
|
||||
t.Fatalf("expected [MessageToolUse, MessageToolResult], got %d: %+v", len(got), got)
|
||||
}
|
||||
if got[0].Type != MessageToolUse {
|
||||
t.Errorf("first message: got %v, want MessageToolUse", got[0].Type)
|
||||
}
|
||||
if got[0].CallID != "tc-kimi-1" {
|
||||
t.Errorf("first.callID: got %q", got[0].CallID)
|
||||
}
|
||||
if cmd, _ := got[0].Input["command"].(string); cmd != "echo hi" {
|
||||
t.Errorf("first.Input.command: got %v, want %q", got[0].Input["command"], "echo hi")
|
||||
}
|
||||
if got[1].Type != MessageToolResult {
|
||||
t.Errorf("second message: got %v, want MessageToolResult", got[1].Type)
|
||||
}
|
||||
if got[1].Output != "hi\n" {
|
||||
t.Errorf("second.output: got %q, want %q", got[1].Output, "hi\n")
|
||||
}
|
||||
}
|
||||
|
||||
// TestHermesClientKimiMalformedArgsFallback: if the accumulated args
|
||||
// aren't valid JSON (streaming glitch, tool with non-JSON args), we
|
||||
// still surface the text under Input.text rather than silently
|
||||
// dropping it.
|
||||
func TestHermesClientKimiMalformedArgsFallback(t *testing.T) {
|
||||
t.Parallel()
|
||||
|
||||
var got []Message
|
||||
c := &hermesClient{
|
||||
pending: make(map[int]*pendingRPC),
|
||||
onMessage: func(msg Message) {
|
||||
got = append(got, msg)
|
||||
},
|
||||
}
|
||||
|
||||
c.handleLine(`{"jsonrpc":"2.0","method":"session/update","params":{"sessionId":"ses_1","update":{"sessionUpdate":"tool_call","toolCallId":"tc","title":"Shell","status":"in_progress","content":[{"type":"content","content":{"type":"text","text":"not-json"}}]}}}`)
|
||||
c.handleLine(`{"jsonrpc":"2.0","method":"session/update","params":{"sessionId":"ses_1","update":{"sessionUpdate":"tool_call_update","toolCallId":"tc","status":"completed","content":[{"type":"content","content":{"type":"text","text":"output"}}]}}}`)
|
||||
|
||||
if len(got) < 1 {
|
||||
t.Fatalf("expected ToolUse+ToolResult, got %+v", got)
|
||||
}
|
||||
if text, _ := got[0].Input["text"].(string); text != "not-json" {
|
||||
t.Errorf("fallback Input.text: got %v", got[0].Input["text"])
|
||||
}
|
||||
}
|
||||
|
||||
// TestHermesClientHandleToolCallCompleteOrphan: if a completion frame
|
||||
// arrives without a preceding tool_call (out-of-order / missed frame),
|
||||
// still emit ToolUse synthesised from the update's own title/rawInput
|
||||
// before ToolResult. Keeps the UI from showing a bare result with no
|
||||
// header.
|
||||
func TestHermesClientHandleToolCallCompleteOrphan(t *testing.T) {
|
||||
t.Parallel()
|
||||
|
||||
var got []Message
|
||||
c := &hermesClient{
|
||||
pending: make(map[int]*pendingRPC),
|
||||
onMessage: func(msg Message) {
|
||||
got = append(got, msg)
|
||||
},
|
||||
}
|
||||
|
||||
c.handleLine(`{"jsonrpc":"2.0","method":"session/update","params":{"sessionId":"ses_1","update":{"sessionUpdate":"tool_call_update","toolCallId":"tc","status":"completed","title":"terminal: ls","kind":"execute","rawInput":{"command":"ls"},"content":[{"type":"content","content":{"type":"text","text":"file.go\n"}}]}}}`)
|
||||
|
||||
if len(got) != 2 || got[0].Type != MessageToolUse || got[1].Type != MessageToolResult {
|
||||
t.Fatalf("expected [ToolUse, ToolResult], got %+v", got)
|
||||
}
|
||||
if got[0].Tool != "terminal" {
|
||||
t.Errorf("orphan ToolUse tool: got %q", got[0].Tool)
|
||||
}
|
||||
if cmd, _ := got[0].Input["command"].(string); cmd != "ls" {
|
||||
t.Errorf("orphan ToolUse input.command: got %v", got[0].Input["command"])
|
||||
}
|
||||
if got[1].Output != "file.go\n" {
|
||||
t.Errorf("ToolResult output: got %q", got[1].Output)
|
||||
}
|
||||
}
|
||||
|
||||
// TestHermesClientHandleToolCallRawOutputTakesPrecedence keeps hermes
// behaviour unchanged: when an update carries both `rawOutput` (the
// hermes convention) and `content`, rawOutput wins.
|
||||
func TestHermesClientHandleToolCallRawOutputTakesPrecedence(t *testing.T) {
|
||||
t.Parallel()
|
||||
|
||||
var got Message
|
||||
c := &hermesClient{
|
||||
pending: make(map[int]*pendingRPC),
|
||||
onMessage: func(msg Message) {
|
||||
got = msg
|
||||
},
|
||||
}
|
||||
|
||||
line := `{"jsonrpc":"2.0","method":"session/update","params":{"sessionId":"ses_1","update":{"sessionUpdate":"tool_call_update","toolCallId":"tc","status":"completed","rawOutput":"raw wins","content":[{"type":"content","content":{"type":"text","text":"ignored"}}]}}}`
|
||||
c.handleLine(line)
|
||||
|
||||
if got.Output != "raw wins" {
|
||||
t.Errorf("output: got %q, want %q", got.Output, "raw wins")
|
||||
}
|
||||
}
|
||||
|
||||
func TestExtractACPToolCallText(t *testing.T) {
|
||||
t.Parallel()
|
||||
tests := []struct {
|
||||
name string
|
||||
json string
|
||||
want string
|
||||
}{
|
||||
{
|
||||
name: "single text block",
|
||||
json: `[{"type":"content","content":{"type":"text","text":"hello"}}]`,
|
||||
want: "hello",
|
||||
},
|
||||
{
|
||||
name: "multiple text blocks join with newline",
|
||||
json: `[{"type":"content","content":{"type":"text","text":"a"}},{"type":"content","content":{"type":"text","text":"b"}}]`,
|
||||
want: "a\nb",
|
||||
},
|
||||
{
|
||||
name: "terminal blocks skipped",
|
||||
json: `[{"type":"terminal","terminalId":"t1"},{"type":"content","content":{"type":"text","text":"shell out"}}]`,
|
||||
want: "shell out",
|
||||
},
|
||||
{
|
||||
name: "diff block renders as mini header",
|
||||
json: `[{"type":"diff","path":"foo.go","oldText":"abc","newText":"abcdef"}]`,
|
||||
want: "--- foo.go\n+++ foo.go\n(edited: 3 → 6 bytes)",
|
||||
},
|
||||
{
|
||||
name: "new-file diff (no oldText)",
|
||||
json: `[{"type":"diff","path":"new.go","oldText":"","newText":"hi"}]`,
|
||||
want: "--- new.go\n+++ new.go\n(new file, 2 bytes)",
|
||||
},
|
||||
{
|
||||
name: "empty array returns empty",
|
||||
json: `[]`,
|
||||
want: "",
|
||||
},
|
||||
{
|
||||
name: "no text content",
|
||||
json: `[{"type":"terminal","terminalId":"t1"}]`,
|
||||
want: "",
|
||||
},
|
||||
}
|
||||
for _, tt := range tests {
|
||||
t.Run(tt.name, func(t *testing.T) {
|
||||
var blocks []json.RawMessage
|
||||
if err := json.Unmarshal([]byte(tt.json), &blocks); err != nil {
|
||||
t.Fatalf("unmarshal: %v", err)
|
||||
}
|
||||
if got := extractACPToolCallText(blocks); got != tt.want {
|
||||
t.Errorf("got %q, want %q", got, tt.want)
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
func TestHermesClientHandleToolCallInProgressIgnored(t *testing.T) {
|
||||
t.Parallel()
|
||||
|
||||
@@ -384,7 +706,7 @@ func TestHermesProviderErrorSniffer(t *testing.T) {
|
||||
// LLM endpoint rejects the requested model. We verify the
|
||||
// sniffer extracts the `Error: ...` line so the task error
|
||||
// tells the user *why* it failed.
|
||||
s := newHermesProviderErrorSniffer()
|
||||
s := newACPProviderErrorSniffer("hermes")
|
||||
lines := []string{
|
||||
"2026-04-20 23:41:47 [INFO] acp_adapter.server: Prompt on session abc",
|
||||
`⚠️ API call failed (attempt 1/3): BadRequestError [HTTP 400]`,
|
||||
@@ -409,7 +731,7 @@ func TestHermesProviderErrorSniffer(t *testing.T) {
|
||||
func TestHermesProviderErrorSnifferIgnoresInfoLines(t *testing.T) {
|
||||
t.Parallel()
|
||||
|
||||
s := newHermesProviderErrorSniffer()
|
||||
s := newACPProviderErrorSniffer("hermes")
|
||||
s.Write([]byte("2026-04-20 23:41:45 [INFO] acp_adapter.entry: Loaded env\n"))
|
||||
s.Write([]byte("2026-04-20 23:41:47 [INFO] agent.auxiliary_client: Vision auto-detect...\n"))
|
||||
if msg := s.message(); msg != "" {
|
||||
@@ -422,7 +744,7 @@ func TestHermesProviderErrorSnifferHandlesPartialLines(t *testing.T) {
|
||||
|
||||
// Writer may be called mid-line; the sniffer must buffer until
|
||||
// it sees a newline so the regex doesn't miss the header.
|
||||
s := newHermesProviderErrorSniffer()
|
||||
s := newACPProviderErrorSniffer("hermes")
|
||||
s.Write([]byte(`⚠️ API call failed (attempt 1/3):`))
|
||||
s.Write([]byte(` BadRequestError [HTTP 400]` + "\n"))
|
||||
s.Write([]byte(` 📝 Error: something went wrong` + "\n"))
|
||||
@@ -435,12 +757,12 @@ func TestHermesProviderErrorSnifferHandlesPartialLines(t *testing.T) {
|
||||
func TestHermesProviderErrorSnifferBoundedBuffer(t *testing.T) {
|
||||
t.Parallel()
|
||||
|
||||
- s := newHermesProviderErrorSniffer()
+ s := newACPProviderErrorSniffer("hermes")
|
||||
for i := 0; i < 20; i++ {
|
||||
// Each line differs so dedup doesn't merge them.
|
||||
s.Write([]byte(`⚠️ API call failed (HTTP 400) attempt ` + string(rune('a'+i%26)) + `: Non-retryable error` + "\n"))
|
||||
}
|
||||
- if len(s.lines) > hermesMaxErrorLines {
- t.Errorf("sniffer kept %d lines, limit is %d", len(s.lines), hermesMaxErrorLines)
+ if len(s.lines) > acpMaxErrorLines {
+ t.Errorf("sniffer kept %d lines, limit is %d", len(s.lines), acpMaxErrorLines)
|
||||
}
|
||||
}
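The sniffer tests above pin three behaviours: partial writes are buffered until a newline arrives, informational lines are ignored, and the number of retained lines is bounded. A minimal sketch of a writer with those properties follows. It is not the real `newACPProviderErrorSniffer`; the type name, the `maxErrorLines` value, and the substring matching rule are all assumptions made for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
	"sync"
)

// maxErrorLines is a hypothetical stand-in for acpMaxErrorLines.
const maxErrorLines = 8

// errorSniffer is an illustrative io.Writer, not the real sniffer.
type errorSniffer struct {
	mu    sync.Mutex
	buf   bytes.Buffer    // bytes of the current, unterminated line
	lines []string        // retained error lines, capped at maxErrorLines
	seen  map[string]bool // exact-match dedup
}

func newErrorSniffer() *errorSniffer {
	return &errorSniffer{seen: make(map[string]bool)}
}

// Write buffers input and inspects only complete, newline-terminated lines.
func (s *errorSniffer) Write(p []byte) (int, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.buf.Write(p)
	for {
		i := bytes.IndexByte(s.buf.Bytes(), '\n')
		if i < 0 {
			return len(p), nil
		}
		line := strings.TrimSpace(string(s.buf.Next(i + 1)))
		s.keep(line)
	}
}

// keep retains a line if it looks like a provider error (assumed rule).
func (s *errorSniffer) keep(line string) {
	if line == "" || s.seen[line] || len(s.lines) >= maxErrorLines {
		return
	}
	if strings.Contains(line, "API call failed") || strings.Contains(line, "Error:") {
		s.seen[line] = true
		s.lines = append(s.lines, line)
	}
}

// message joins the retained lines for use as a task error.
func (s *errorSniffer) message() string {
	s.mu.Lock()
	defer s.mu.Unlock()
	return strings.Join(s.lines, "\n")
}

func main() {
	s := newErrorSniffer()
	s.Write([]byte("API call failed (attempt 1/3):")) // no newline yet
	s.Write([]byte(" BadRequestError [HTTP 400]\n"))
	s.Write([]byte("[INFO] acp_adapter.entry: Loaded env\n"))
	fmt.Println(s.message())
}
```

Because the type satisfies `io.Writer`, it can sit next to a log writer inside `io.MultiWriter`, which is how the backend attaches its sniffer to stderr.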
|
||||
|
||||
382
server/pkg/agent/kimi.go
Normal file
@@ -0,0 +1,382 @@
|
||||
package agent
|
||||
|
||||
import (
|
||||
"bufio"
|
||||
"context"
|
||||
"fmt"
|
||||
"io"
|
||||
"os/exec"
|
||||
"strings"
|
||||
"sync"
|
||||
"time"
|
||||
)
|
||||
|
||||
// kimiBlockedArgs are arguments hardcoded by the daemon that must not be
|
||||
// overridden by user-configured custom_args. `acp` is the protocol
|
||||
// subcommand that drives the ACP JSON-RPC transport for Kimi Code CLI;
|
||||
// overriding it would break the daemon↔Kimi communication contract.
|
||||
var kimiBlockedArgs = map[string]blockedArgMode{
|
||||
"acp": blockedStandalone,
|
||||
}
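To make the blocked-argument idea concrete, here is a simplified sketch of how user-supplied arguments might be filtered before they are appended after `acp`. The real `filterCustomArgs` is not shown in this diff, so `filterArgs`, `blockedMode`, and the `--example-flag` placeholder (not a real kimi option) are hypothetical, and logging is omitted.

```go
package main

import "fmt"

// blockedMode is a hypothetical stand-in for blockedArgMode.
type blockedMode int

// blockedStandalone marks an argument that is dropped on its own,
// without consuming a following value.
const blockedStandalone blockedMode = iota

// filterArgs returns custom with every blocked argument removed.
func filterArgs(custom []string, blocked map[string]blockedMode) []string {
	kept := make([]string, 0, len(custom))
	for _, arg := range custom {
		if _, isBlocked := blocked[arg]; isBlocked {
			continue // the daemon owns this argument
		}
		kept = append(kept, arg)
	}
	return kept
}

func main() {
	blocked := map[string]blockedMode{"acp": blockedStandalone}
	// "--example-flag" is a placeholder, not a documented kimi flag.
	custom := []string{"acp", "--example-flag"}
	argv := append([]string{"acp"}, filterArgs(custom, blocked)...)
	fmt.Println(argv)
}
```

With this shape the subcommand appears exactly once, in the first position, regardless of what the user configured.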
|
||||
|
||||
// kimiBackend implements Backend by spawning `kimi acp` and communicating
|
||||
// via the ACP (Agent Client Protocol) JSON-RPC 2.0 over stdin/stdout.
|
||||
//
|
||||
// Kimi Code CLI (https://github.com/MoonshotAI/kimi-cli) supports ACP out of
|
||||
// the box via the `kimi acp` subcommand. We reuse the existing hermesClient
|
||||
// ACP transport since both runtimes speak the same protocol — only the
|
||||
// binary, env, and tool-name extraction differ.
|
||||
type kimiBackend struct {
|
||||
cfg Config
|
||||
}
|
||||
|
||||
func (b *kimiBackend) Execute(ctx context.Context, prompt string, opts ExecOptions) (*Session, error) {
|
||||
execPath := b.cfg.ExecutablePath
|
||||
if execPath == "" {
|
||||
execPath = "kimi"
|
||||
}
|
||||
if _, err := exec.LookPath(execPath); err != nil {
|
||||
return nil, fmt.Errorf("kimi executable not found at %q: %w", execPath, err)
|
||||
}
|
||||
|
||||
timeout := opts.Timeout
|
||||
if timeout == 0 {
|
||||
timeout = 20 * time.Minute
|
||||
}
|
||||
runCtx, cancel := context.WithTimeout(ctx, timeout)
|
||||
|
||||
// `kimi acp` ignores --yolo / --auto-approve (they're flags on the
|
||||
// root `kimi` command, not on the `acp` subcommand). Instead, the
|
||||
// daemon auto-approves in hermesClient.handleAgentRequest by replying
|
||||
// "approve_for_session" to every session/request_permission request.
|
||||
kimiArgs := append([]string{"acp"}, filterCustomArgs(opts.CustomArgs, kimiBlockedArgs, b.cfg.Logger)...)
|
||||
cmd := exec.CommandContext(runCtx, execPath, kimiArgs...)
|
||||
b.cfg.Logger.Debug("agent command", "exec", execPath, "args", kimiArgs)
|
||||
if opts.Cwd != "" {
|
||||
cmd.Dir = opts.Cwd
|
||||
}
|
||||
cmd.Env = buildEnv(b.cfg.Env)
|
||||
|
||||
stdout, err := cmd.StdoutPipe()
|
||||
if err != nil {
|
||||
cancel()
|
||||
return nil, fmt.Errorf("kimi stdout pipe: %w", err)
|
||||
}
|
||||
stdin, err := cmd.StdinPipe()
|
||||
if err != nil {
|
||||
cancel()
|
||||
return nil, fmt.Errorf("kimi stdin pipe: %w", err)
|
||||
}
|
||||
// Forward stderr to the daemon log *and* sniff provider-level
|
||||
// errors out of it so we can surface them in the task result.
|
||||
// Kimi's session/prompt still reports stopReason=end_turn when
|
||||
// the underlying HTTP call to api.kimi.com returns 4xx/5xx, so
|
||||
// without this the daemon reports a misleading "empty output"
|
||||
// and the actionable error (expired token, rate limit, upstream
|
||||
// 5xx, …) stays buried in the daemon log.
|
||||
providerErr := newACPProviderErrorSniffer("kimi")
|
||||
cmd.Stderr = io.MultiWriter(newLogWriter(b.cfg.Logger, "[kimi:stderr] "), providerErr)
|
||||
|
||||
if err := cmd.Start(); err != nil {
|
||||
cancel()
|
||||
return nil, fmt.Errorf("start kimi: %w", err)
|
||||
}
|
||||
|
||||
b.cfg.Logger.Info("kimi acp started", "pid", cmd.Process.Pid, "cwd", opts.Cwd)
|
||||
|
||||
msgCh := make(chan Message, 256)
|
||||
resCh := make(chan Result, 1)
|
||||
|
||||
var outputMu sync.Mutex
|
||||
var output strings.Builder
|
||||
|
||||
promptDone := make(chan hermesPromptResult, 1)
|
||||
|
||||
// Reuse the hermesClient ACP transport — Kimi speaks the same protocol.
|
||||
c := &hermesClient{
|
||||
cfg: b.cfg,
|
||||
stdin: stdin,
|
||||
pending: make(map[int]*pendingRPC),
|
||||
pendingTools: make(map[string]*pendingToolCall),
|
||||
onMessage: func(msg Message) {
|
||||
// hermesClient.handleToolCallStart has already mapped
|
||||
// the raw ACP title via hermesToolNameFromTitle — which
|
||||
// covers lowercase hermes-style titles ("read:", "patch
|
||||
// (replace)", …) but not capitalised kimi-style ones
|
||||
// ("Read file: …", "Run command: …"). Re-normalise so
|
||||
// the UI sees consistent snake_case identifiers across
|
||||
// both backends. No-op when the name is already normal
|
||||
// form (e.g. already mapped to "read_file").
|
||||
if msg.Type == MessageToolUse {
|
||||
msg.Tool = kimiToolNameFromTitle(msg.Tool)
|
||||
}
|
||||
if msg.Type == MessageText {
|
||||
outputMu.Lock()
|
||||
output.WriteString(msg.Content)
|
||||
outputMu.Unlock()
|
||||
}
|
||||
trySend(msgCh, msg)
|
||||
},
|
||||
onPromptDone: func(result hermesPromptResult) {
|
||||
select {
|
||||
case promptDone <- result:
|
||||
default:
|
||||
}
|
||||
},
|
||||
}
|
||||
|
||||
// Start reading stdout in background.
|
||||
readerDone := make(chan struct{})
|
||||
go func() {
|
||||
defer close(readerDone)
|
||||
scanner := bufio.NewScanner(stdout)
|
||||
scanner.Buffer(make([]byte, 0, 1024*1024), 10*1024*1024)
|
||||
for scanner.Scan() {
|
||||
line := strings.TrimSpace(scanner.Text())
|
||||
if line == "" {
|
||||
continue
|
||||
}
|
||||
c.handleLine(line)
|
||||
}
|
||||
c.closeAllPending(fmt.Errorf("kimi process exited"))
|
||||
}()
|
||||
|
||||
// Drive the ACP session lifecycle in a goroutine.
|
||||
go func() {
|
||||
defer cancel()
|
||||
defer close(msgCh)
|
||||
defer close(resCh)
|
||||
defer func() {
|
||||
stdin.Close()
|
||||
_ = cmd.Wait()
|
||||
}()
|
||||
|
||||
startTime := time.Now()
|
||||
finalStatus := "completed"
|
||||
var finalError string
|
||||
var sessionID string
|
||||
|
||||
// 1. Initialize handshake.
|
||||
_, err := c.request(runCtx, "initialize", map[string]any{
|
||||
"protocolVersion": 1,
|
||||
"clientInfo": map[string]any{
|
||||
"name": "multica-agent-sdk",
|
||||
"version": "0.2.0",
|
||||
},
|
||||
"clientCapabilities": map[string]any{},
|
||||
})
|
||||
if err != nil {
|
||||
finalStatus = "failed"
|
||||
finalError = fmt.Sprintf("kimi initialize failed: %v", err)
|
||||
resCh <- Result{Status: finalStatus, Error: finalError, DurationMs: time.Since(startTime).Milliseconds()}
|
||||
return
|
||||
}
|
||||
|
||||
// 2. Create or resume a session.
|
||||
cwd := opts.Cwd
|
||||
if cwd == "" {
|
||||
cwd = "."
|
||||
}
|
||||
|
||||
if opts.ResumeSessionID != "" {
|
||||
result, err := c.request(runCtx, "session/resume", map[string]any{
|
||||
"cwd": cwd,
|
||||
"sessionId": opts.ResumeSessionID,
|
||||
})
|
||||
if err != nil {
|
||||
finalStatus = "failed"
|
||||
finalError = fmt.Sprintf("kimi session/resume failed: %v", err)
|
||||
resCh <- Result{Status: finalStatus, Error: finalError, DurationMs: time.Since(startTime).Milliseconds()}
|
||||
return
|
||||
}
|
||||
sessionID = opts.ResumeSessionID
|
||||
_ = result
|
||||
} else {
|
||||
result, err := c.request(runCtx, "session/new", map[string]any{
|
||||
"cwd": cwd,
|
||||
"mcpServers": []any{},
|
||||
})
|
||||
if err != nil {
|
||||
finalStatus = "failed"
|
||||
finalError = fmt.Sprintf("kimi session/new failed: %v", err)
|
||||
resCh <- Result{Status: finalStatus, Error: finalError, DurationMs: time.Since(startTime).Milliseconds()}
|
||||
return
|
||||
}
|
||||
sessionID = extractACPSessionID(result)
|
||||
if sessionID == "" {
|
||||
finalStatus = "failed"
|
||||
finalError = "kimi session/new returned no session ID"
|
||||
resCh <- Result{Status: finalStatus, Error: finalError, DurationMs: time.Since(startTime).Milliseconds()}
|
||||
return
|
||||
}
|
||||
}
|
||||
|
||||
c.sessionID = sessionID
|
||||
b.cfg.Logger.Info("kimi session created", "session_id", sessionID)
|
||||
|
||||
// 3. If the caller picked a model (via agent.model from the
|
||||
// UI dropdown), ask kimi to switch the session to it before
|
||||
// we send any prompt. Kimi's ACP server exposes
|
||||
// `session/set_model` and advertises available models via
|
||||
// the `models.availableModels` block returned by
|
||||
// `session/new` — we pass the chosen modelId through
|
||||
// verbatim. This MUST fail the task on error: silently
|
||||
// falling back to kimi's default model would let the user
|
||||
// believe their pick was honoured while the task actually
|
||||
// ran on something else.
|
||||
if opts.Model != "" {
|
||||
if _, err := c.request(runCtx, "session/set_model", map[string]any{
|
||||
"sessionId": sessionID,
|
||||
"modelId": opts.Model,
|
||||
}); err != nil {
|
||||
b.cfg.Logger.Warn("kimi session/set_model failed", "error", err, "requested_model", opts.Model)
|
||||
finalStatus = "failed"
|
||||
finalError = fmt.Sprintf("kimi could not switch to model %q: %v", opts.Model, err)
|
||||
resCh <- Result{
|
||||
Status: finalStatus,
|
||||
Error: finalError,
|
||||
DurationMs: time.Since(startTime).Milliseconds(),
|
||||
SessionID: sessionID,
|
||||
}
|
||||
return
|
||||
}
|
||||
b.cfg.Logger.Info("kimi session model set", "model", opts.Model)
|
||||
}
|
||||
|
||||
// 4. Build the prompt content. If we have a system prompt, prepend it.
|
||||
userText := prompt
|
||||
if opts.SystemPrompt != "" {
|
||||
userText = opts.SystemPrompt + "\n\n---\n\n" + prompt
|
||||
}
|
||||
|
||||
// 5. Send the prompt and wait for PromptResponse.
|
||||
_, err = c.request(runCtx, "session/prompt", map[string]any{
|
||||
"sessionId": sessionID,
|
||||
"prompt": []map[string]any{
|
||||
{"type": "text", "text": userText},
|
||||
},
|
||||
})
|
||||
if err != nil {
|
||||
if runCtx.Err() == context.DeadlineExceeded {
|
||||
finalStatus = "timeout"
|
||||
finalError = fmt.Sprintf("kimi timed out after %s", timeout)
|
||||
} else if runCtx.Err() == context.Canceled {
|
||||
finalStatus = "aborted"
|
||||
finalError = "execution cancelled"
|
||||
} else {
|
||||
finalStatus = "failed"
|
||||
finalError = fmt.Sprintf("kimi session/prompt failed: %v", err)
|
||||
}
|
||||
} else {
|
||||
select {
|
||||
case pr := <-promptDone:
|
||||
if pr.stopReason == "cancelled" {
|
||||
finalStatus = "aborted"
|
||||
finalError = "kimi cancelled the prompt"
|
||||
}
|
||||
c.usageMu.Lock()
|
||||
c.usage.InputTokens += pr.usage.InputTokens
|
||||
c.usage.OutputTokens += pr.usage.OutputTokens
|
||||
c.usageMu.Unlock()
|
||||
default:
|
||||
}
|
||||
}
|
||||
|
||||
duration := time.Since(startTime)
|
||||
b.cfg.Logger.Info("kimi finished", "pid", cmd.Process.Pid, "status", finalStatus, "duration", duration.Round(time.Millisecond).String())
|
||||
|
||||
stdin.Close()
|
||||
cancel()
|
||||
|
||||
<-readerDone
|
||||
|
||||
outputMu.Lock()
|
||||
finalOutput := output.String()
|
||||
outputMu.Unlock()
|
||||
|
||||
// If kimi produced no visible output but we sniffed a
|
||||
// provider-level error on stderr (typically HTTP 4xx from
|
||||
// api.kimi.com — token expired, rate-limited, upstream
|
||||
// 5xx, …), promote the status to failed and surface the
|
||||
// real reason. Without this the daemon reports a cryptic
|
||||
// "completed + empty output" and the actionable error
|
||||
// stays buried in daemon logs.
|
||||
if finalStatus == "completed" && finalOutput == "" {
|
||||
if msg := providerErr.message(); msg != "" {
|
||||
finalStatus = "failed"
|
||||
finalError = msg
|
||||
}
|
||||
}
|
||||
|
||||
c.usageMu.Lock()
|
||||
u := c.usage
|
||||
c.usageMu.Unlock()
|
||||
|
||||
var usageMap map[string]TokenUsage
|
||||
if u.InputTokens > 0 || u.OutputTokens > 0 || u.CacheReadTokens > 0 {
|
||||
model := opts.Model
|
||||
if model == "" {
|
||||
model = "unknown"
|
||||
}
|
||||
usageMap = map[string]TokenUsage{model: u}
|
||||
}
|
||||
|
||||
resCh <- Result{
|
||||
Status: finalStatus,
|
||||
Output: finalOutput,
|
||||
Error: finalError,
|
||||
DurationMs: duration.Milliseconds(),
|
||||
SessionID: sessionID,
|
||||
Usage: usageMap,
|
||||
}
|
||||
}()
|
||||
|
||||
return &Session{Messages: msgCh, Result: resCh}, nil
|
||||
}
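The lifecycle in `Execute` boils down to a short sequence of JSON-RPC requests written one per line to the child's stdin. The sketch below lists that sequence for the new-session path, using the method names and parameters shown above. The `rpcRequest` type, the sequential ID numbering, and the `example-model` value are assumptions; the real `hermesClient.request` may frame and number requests differently, and `sessionID` is taken as already read from the `session/new` response.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcRequest is an illustrative JSON-RPC 2.0 request envelope.
type rpcRequest struct {
	JSONRPC string         `json:"jsonrpc"`
	ID      int            `json:"id"`
	Method  string         `json:"method"`
	Params  map[string]any `json:"params"`
}

// buildRequests lists the requests in the order the lifecycle sends them.
// session/set_model is included only when a model was selected.
func buildRequests(cwd, sessionID, model, prompt string) []rpcRequest {
	reqs := []rpcRequest{
		{Method: "initialize", Params: map[string]any{
			"protocolVersion":    1,
			"clientInfo":         map[string]any{"name": "multica-agent-sdk", "version": "0.2.0"},
			"clientCapabilities": map[string]any{},
		}},
		{Method: "session/new", Params: map[string]any{
			"cwd":        cwd,
			"mcpServers": []any{},
		}},
	}
	if model != "" {
		reqs = append(reqs, rpcRequest{Method: "session/set_model", Params: map[string]any{
			"sessionId": sessionID,
			"modelId":   model,
		}})
	}
	reqs = append(reqs, rpcRequest{Method: "session/prompt", Params: map[string]any{
		"sessionId": sessionID,
		"prompt":    []map[string]any{{"type": "text", "text": prompt}},
	}})
	for i := range reqs {
		reqs[i].JSONRPC = "2.0"
		reqs[i].ID = i + 1 // assumed numbering, for illustration only
	}
	return reqs
}

func main() {
	for _, r := range buildRequests(".", "ses_fake", "example-model", "hello") {
		line, err := json.Marshal(r)
		if err != nil {
			panic(err)
		}
		fmt.Println(string(line)) // one request per line
	}
}
```

In the real backend each request waits for its response before the next one is sent, and a failure at any step ends the task with `status=failed`.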
|
||||
|
||||
// kimiToolNameFromTitle normalises tool names emitted by Kimi's ACP
|
||||
// server into the snake_case identifiers the Multica UI expects.
|
||||
//
|
||||
// Kimi follows the ACP spec where `title` is a short human-readable
|
||||
// label such as "Read file: /path/to/foo.go" or "Run command: ls".
|
||||
// hermesToolNameFromTitle upstream handles hermes' lowercase
|
||||
// convention ("read:", "patch (replace)") but not kimi's capitalised
|
||||
// format — so we get called on the already-mapped name from hermes
|
||||
// and fix up anything that slipped through. Empty input returns "".
|
||||
func kimiToolNameFromTitle(title string) string {
|
||||
t := strings.TrimSpace(title)
|
||||
if t == "" {
|
||||
return ""
|
||||
}
|
||||
|
||||
// Strip everything after the first colon — ACP titles often look like
|
||||
// "Tool Name: argument detail" and we want only the tool name.
|
||||
if idx := strings.Index(t, ":"); idx > 0 {
|
||||
t = strings.TrimSpace(t[:idx])
|
||||
}
|
||||
|
||||
lower := strings.ToLower(t)
|
||||
switch lower {
|
||||
case "read", "read file":
|
||||
return "read_file"
|
||||
case "write", "write file":
|
||||
return "write_file"
|
||||
case "edit", "patch":
|
||||
return "edit_file"
|
||||
case "shell", "bash", "terminal", "run command", "run shell command":
|
||||
return "terminal"
|
||||
case "search", "grep", "find":
|
||||
return "search_files"
|
||||
case "glob":
|
||||
return "glob"
|
||||
case "web search":
|
||||
return "web_search"
|
||||
case "fetch", "web fetch":
|
||||
return "web_fetch"
|
||||
case "todo", "todo write":
|
||||
return "todo_write"
|
||||
}
|
||||
|
||||
// Fallback: snake_case the title so the UI gets a stable identifier.
|
||||
return strings.ReplaceAll(lower, " ", "_")
|
||||
}
|
||||
215
server/pkg/agent/kimi_test.go
Normal file
@@ -0,0 +1,215 @@
|
||||
package agent
|
||||
|
||||
import (
|
||||
"context"
|
||||
"log/slog"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"strings"
|
||||
"testing"
|
||||
"time"
|
||||
)
|
||||
|
||||
func TestNewReturnsKimiBackend(t *testing.T) {
|
||||
t.Parallel()
|
||||
b, err := New("kimi", Config{ExecutablePath: "/nonexistent/kimi"})
|
||||
if err != nil {
|
||||
t.Fatalf("New(kimi) error: %v", err)
|
||||
}
|
||||
if _, ok := b.(*kimiBackend); !ok {
|
||||
t.Fatalf("expected *kimiBackend, got %T", b)
|
||||
}
|
||||
}
|
||||
|
||||
func TestKimiToolNameFromTitle(t *testing.T) {
|
||||
t.Parallel()
|
||||
tests := []struct {
|
||||
title string
|
||||
want string
|
||||
}{
|
||||
{"Read file: /tmp/foo.go", "read_file"},
|
||||
{"read", "read_file"},
|
||||
{"Write: /tmp/bar.go", "write_file"},
|
||||
{"Edit", "edit_file"},
|
||||
{"Patch: /tmp/x", "edit_file"},
|
||||
{"Shell: ls -la", "terminal"},
|
||||
{"Bash", "terminal"},
|
||||
{"Run command: pwd", "terminal"},
|
||||
{"Search: foo", "search_files"},
|
||||
{"Glob: *.go", "glob"},
|
||||
{"Web search: golang acp", "web_search"},
|
||||
{"Fetch: https://example.com", "web_fetch"},
|
||||
{"Todo Write", "todo_write"},
|
||||
// Fallback: snake_case the title.
|
||||
{"Custom Thing", "custom_thing"},
|
||||
// Empty input returns empty — caller decides how to react.
|
||||
{"", ""},
|
||||
}
|
||||
for _, tt := range tests {
|
||||
got := kimiToolNameFromTitle(tt.title)
|
||||
if got != tt.want {
|
||||
t.Errorf("kimiToolNameFromTitle(%q) = %q, want %q", tt.title, got, tt.want)
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// fakeKimiACPScript returns a POSIX-sh script that impersonates
|
||||
// `kimi acp` for a single short ACP session: it acks initialize /
|
||||
// session/new and then replies to session/set_model with a JSON-RPC
|
||||
// error — the scenario the kimiBackend must propagate as a failed
|
||||
// task rather than silently falling back to the default model.
|
||||
func fakeKimiACPScript() string {
|
||||
return `#!/bin/sh
|
||||
# Fake ` + "`kimi`" + ` binary — used by TestKimiBackendSetModelFailureFailsTask
|
||||
# and TestKimiBackendInvokesACPSubcommand.
|
||||
#
|
||||
# Writes the full argv (one arg per line) to $KIMI_ARGS_FILE if that env
|
||||
# var is set, so tests can assert that the daemon invokes us with the
|
||||
# right flags (bare `+"`acp`"+`, with no `+"`--yolo`"+`).
|
||||
#
|
||||
# Then reads one JSON-RPC request per line from stdin, matches on the
|
||||
# method name, and writes back a canned response. Exits after set_model
|
||||
# so the kimiBackend cleanup path can run.
|
||||
if [ -n "$KIMI_ARGS_FILE" ]; then
|
||||
for arg in "$@"; do
|
||||
printf '%s\n' "$arg" >> "$KIMI_ARGS_FILE"
|
||||
done
|
||||
fi
|
||||
while IFS= read -r line; do
|
||||
id=$(printf '%s' "$line" | sed -n 's/.*"id":\([0-9]*\).*/\1/p')
|
||||
case "$line" in
|
||||
*'"method":"initialize"'*)
|
||||
printf '{"jsonrpc":"2.0","id":%s,"result":{"protocolVersion":1,"agentCapabilities":{}}}\n' "$id"
|
||||
;;
|
||||
*'"method":"session/new"'*)
|
||||
printf '{"jsonrpc":"2.0","id":%s,"result":{"sessionId":"ses_fake"}}\n' "$id"
|
||||
;;
|
||||
*'"method":"session/set_model"'*)
|
||||
printf '{"jsonrpc":"2.0","id":%s,"error":{"code":-32602,"message":"model not available: bogus-model"}}\n' "$id"
|
||||
exit 0
|
||||
;;
|
||||
esac
|
||||
done
|
||||
`
|
||||
}
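For readers who find the shell script hard to follow, the same canned-response behaviour can be sketched in Go. This is an illustrative equivalent, not part of the change: `serve`, `reply`, and the `request` type are hypothetical names, and the responses are copied from the script above.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// request captures only the fields the fake needs to route a call.
type request struct {
	ID     json.RawMessage `json:"id"`
	Method string          `json:"method"`
}

// serve reads one JSON-RPC request per line and writes a canned response.
// It stops after session/set_model, as the shell fake does.
func serve(in io.Reader, out io.Writer) error {
	sc := bufio.NewScanner(in)
	for sc.Scan() {
		var req request
		if err := json.Unmarshal(sc.Bytes(), &req); err != nil || len(req.ID) == 0 {
			continue // skip malformed lines and notifications
		}
		switch req.Method {
		case "initialize":
			fmt.Fprintf(out, `{"jsonrpc":"2.0","id":%s,"result":{"protocolVersion":1,"agentCapabilities":{}}}`+"\n", req.ID)
		case "session/new":
			fmt.Fprintf(out, `{"jsonrpc":"2.0","id":%s,"result":{"sessionId":"ses_fake"}}`+"\n", req.ID)
		case "session/set_model":
			fmt.Fprintf(out, `{"jsonrpc":"2.0","id":%s,"error":{"code":-32602,"message":"model not available: bogus-model"}}`+"\n", req.ID)
			return nil
		}
	}
	return sc.Err()
}

// reply runs serve over an in-memory input and returns everything written.
func reply(input string) string {
	var out strings.Builder
	if err := serve(strings.NewReader(input), &out); err != nil {
		return "error: " + err.Error()
	}
	return out.String()
}

func main() {
	fmt.Print(reply(
		`{"jsonrpc":"2.0","id":1,"method":"initialize","params":{}}` + "\n" +
			`{"jsonrpc":"2.0","id":2,"method":"session/new","params":{}}` + "\n" +
			`{"jsonrpc":"2.0","id":3,"method":"session/set_model","params":{}}` + "\n"))
}
```

Parsing the request with `encoding/json` avoids the `sed` pattern match on `"id":`, at the cost of needing a compiled helper binary in the test.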
|
||||
|
||||
// TestKimiBackendSetModelFailureFailsTask pins the "don't silently
|
||||
// fall back" behaviour that landed in this PR: when kimi rejects the
|
||||
// caller-selected model via session/set_model, the task result must
|
||||
// report status=failed with a message that names the model and the
|
||||
// upstream error — not claim success while actually running on the
|
||||
// default model.
|
||||
func TestKimiBackendSetModelFailureFailsTask(t *testing.T) {
|
||||
t.Parallel()
|
||||
|
||||
fakePath := filepath.Join(t.TempDir(), "kimi")
|
||||
if err := os.WriteFile(fakePath, []byte(fakeKimiACPScript()), 0o755); err != nil {
|
||||
t.Fatalf("write fake kimi: %v", err)
|
||||
}
|
||||
|
||||
backend, err := New("kimi", Config{ExecutablePath: fakePath, Logger: slog.Default()})
|
||||
if err != nil {
|
||||
t.Fatalf("new kimi backend: %v", err)
|
||||
}
|
||||
|
||||
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
|
||||
defer cancel()
|
||||
|
||||
session, err := backend.Execute(ctx, "prompt-ignored", ExecOptions{
|
||||
Model: "bogus-model",
|
||||
Timeout: 5 * time.Second,
|
||||
})
|
||||
if err != nil {
|
||||
t.Fatalf("execute: %v", err)
|
||||
}
|
||||
// Drain message stream so the lifecycle goroutine can progress.
|
||||
go func() {
|
||||
for range session.Messages {
|
||||
}
|
||||
}()
|
||||
|
||||
select {
|
||||
case result, ok := <-session.Result:
|
||||
if !ok {
|
||||
t.Fatal("result channel closed without a value")
|
||||
}
|
||||
if result.Status != "failed" {
|
||||
t.Fatalf("expected status=failed, got %q (error=%q)", result.Status, result.Error)
|
||||
}
|
||||
if !strings.Contains(result.Error, `could not switch to model "bogus-model"`) {
|
||||
t.Errorf("expected error to name the requested model, got %q", result.Error)
|
||||
}
|
||||
if !strings.Contains(result.Error, "model not available") {
|
||||
t.Errorf("expected error to surface upstream message, got %q", result.Error)
|
||||
}
|
||||
if result.SessionID != "ses_fake" {
|
||||
t.Errorf("expected session id to be preserved on failure, got %q", result.SessionID)
|
||||
}
|
||||
case <-time.After(10 * time.Second):
|
||||
t.Fatal("timeout waiting for result")
|
||||
}
|
||||
}
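The test above relies on a pattern any caller of `Execute` needs: drain `Messages` in the background so the lifecycle goroutine never blocks, then wait for exactly one `Result` with a deadline. A minimal sketch of that pattern follows. The `Message`, `Result`, and `Session` types here are cut-down stand-ins for the real ones, and `waitResult` and `demo` are hypothetical helpers.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Cut-down stand-ins for the agent package types.
type Message struct{ Content string }
type Result struct{ Status, Error string }
type Session struct {
	Messages <-chan Message
	Result   <-chan Result
}

// waitResult drains the message stream and returns the single result,
// or an error if the channel closes empty or the deadline passes.
func waitResult(s *Session, limit time.Duration) (Result, error) {
	go func() {
		for range s.Messages {
			// Discard; a real caller would forward these to the UI.
		}
	}()
	select {
	case r, ok := <-s.Result:
		if !ok {
			return Result{}, errors.New("result channel closed without a value")
		}
		return r, nil
	case <-time.After(limit):
		return Result{}, errors.New("timeout waiting for result")
	}
}

// demo builds a finished fake session and waits on it.
func demo() (Result, error) {
	msgs := make(chan Message, 1)
	res := make(chan Result, 1)
	msgs <- Message{Content: "partial output"}
	close(msgs)
	res <- Result{Status: "failed", Error: `kimi could not switch to model "bogus-model"`}
	close(res)
	return waitResult(&Session{Messages: msgs, Result: res}, time.Second)
}

func main() {
	r, err := demo()
	fmt.Println(r.Status, r.Error, err)
}
```

Checking the `ok` flag matters because the backend closes `resCh` in a deferred call, so a receive on a closed, empty channel would otherwise yield a zero `Result` that looks like success.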
|
||||
|
||||
// TestKimiBackendInvokesACPSubcommand pins the argv for `kimi`. An
|
||||
// earlier fix tried passing `--yolo` to bypass per-tool approval
|
||||
// prompts, but the `acp` subcommand in kimi-cli takes no options
|
||||
// (see cli/__init__.py @cli.command def acp()), so `--yolo` was a
|
||||
// no-op and the daemon still hung for 5 min on the first Shell call.
|
||||
// The actual bypass is in hermesClient.handleAgentRequest, which
|
||||
// auto-approves session/request_permission. This test catches
|
||||
// accidental re-introduction of the dead flag.
|
||||
func TestKimiBackendInvokesACPSubcommand(t *testing.T) {
|
||||
t.Parallel()
|
||||
|
||||
tempDir := t.TempDir()
|
||||
argsFile := filepath.Join(tempDir, "argv.txt")
|
||||
fakePath := filepath.Join(tempDir, "kimi")
|
||||
if err := os.WriteFile(fakePath, []byte(fakeKimiACPScript()), 0o755); err != nil {
|
||||
t.Fatalf("write fake kimi: %v", err)
|
||||
}
|
||||
|
||||
backend, err := New("kimi", Config{
|
||||
ExecutablePath: fakePath,
|
||||
Logger: slog.Default(),
|
||||
Env: map[string]string{"KIMI_ARGS_FILE": argsFile},
|
||||
})
|
||||
if err != nil {
|
||||
t.Fatalf("new kimi backend: %v", err)
|
||||
}
|
||||
|
||||
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
|
||||
defer cancel()
|
||||
|
||||
// Set Model so the fake binary exits on set_model and we don't
|
||||
// have to wait for the prompt branch. We only care about argv here.
|
||||
session, err := backend.Execute(ctx, "prompt-ignored", ExecOptions{
|
||||
Model: "bogus-model",
|
||||
Timeout: 5 * time.Second,
|
||||
})
|
||||
if err != nil {
|
||||
t.Fatalf("execute: %v", err)
|
||||
}
|
||||
go func() {
|
||||
for range session.Messages {
|
||||
}
|
||||
}()
|
||||
<-session.Result
|
||||
|
||||
raw, err := os.ReadFile(argsFile)
|
||||
if err != nil {
|
||||
t.Fatalf("read args file: %v", err)
|
||||
}
|
||||
lines := strings.Split(strings.TrimSpace(string(raw)), "\n")
|
||||
if len(lines) < 1 {
|
||||
t.Fatalf("expected at least 1 arg (acp), got %d: %q", len(lines), lines)
|
||||
}
|
||||
if lines[0] != "acp" {
|
||||
t.Errorf("expected first arg to be acp, got %q (full: %q)", lines[0], lines)
|
||||
}
|
||||
for _, l := range lines {
|
||||
switch l {
|
||||
case "--yolo", "--auto-approve", "--yes", "-y":
|
||||
t.Errorf("kimi acp doesn't accept %q; auto-approval is handled in hermesClient.handleAgentRequest", l)
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -71,6 +71,10 @@ func ListModels(ctx context.Context, providerType, executablePath string) ([]Mod
|
||||
return cachedDiscovery(providerType, func() ([]Model, error) {
|
||||
return discoverHermesModels(ctx, executablePath)
|
||||
})
|
||||
case "kimi":
|
||||
return cachedDiscovery(providerType, func() ([]Model, error) {
|
||||
return discoverKimiModels(ctx, executablePath)
|
||||
})
|
||||
case "opencode":
|
||||
return cachedDiscovery(providerType, func() ([]Model, error) {
|
||||
return discoverOpenCodeModels(ctx, executablePath)
|
||||
@@ -317,8 +321,52 @@ func parsePiModels(output string) []Model {
|
||||
// error) all return an empty list so the UI falls back to the
|
||||
// creatable manual-entry input instead of blocking the form.
|
||||
func discoverHermesModels(ctx context.Context, executablePath string) ([]Model, error) {
|
||||
return discoverACPModels(ctx, executablePath, acpDiscoveryProvider{
|
||||
defaultBin: "hermes",
|
||||
clientName: "multica-model-discovery",
|
||||
extraEnv: []string{"HERMES_YOLO_MODE=1"},
|
||||
tmpdirPrefix: "multica-hermes-discovery-",
|
||||
})
|
||||
}
|
||||
|
||||
// discoverKimiModels spins up a throwaway `kimi acp` process and
|
||||
// drives the same minimal ACP handshake as Hermes to surface the
|
||||
// model catalog advertised by Kimi's `session/new` response. Kimi's
|
||||
// ACPServer.new_session returns a `models` block of the same shape
|
||||
// (`availableModels`/`currentModelId`) so the parsing path is shared.
|
||||
//
|
||||
// Failure modes (kimi missing, not logged in, config error) all
|
||||
// return an empty list so the UI falls back to manual entry.
|
||||
func discoverKimiModels(ctx context.Context, executablePath string) ([]Model, error) {
|
||||
return discoverACPModels(ctx, executablePath, acpDiscoveryProvider{
|
||||
defaultBin: "kimi",
|
||||
clientName: "multica-model-discovery",
|
||||
tmpdirPrefix: "multica-kimi-discovery-",
|
||||
})
|
||||
}
|
||||
|
||||
// acpDiscoveryProvider configures how discoverACPModels launches an
|
||||
// ACP-speaking agent CLI. The shared helper drives every CLI in
|
||||
// the same way (initialize → session/new → parse models block) — the
|
||||
// per-provider differences are which binary to spawn, which env
|
||||
// vars suppress interactive prompts during init, and what to label
|
||||
// temporary work directories so they're easy to identify in logs.
|
||||
type acpDiscoveryProvider struct {
|
||||
defaultBin string
|
||||
clientName string
|
||||
extraEnv []string
|
||||
tmpdirPrefix string
|
||||
}
|
||||
|
||||
// discoverACPModels runs the ACP handshake for any agent CLI that
|
||||
// implements the standard `initialize` + `session/new` flow and
|
||||
// advertises its model catalog in the response under
|
||||
// `models.availableModels` / `models.currentModelId`. This covers
|
||||
// Hermes and Kimi today; future ACP backends can plug in by adding
|
||||
// an acpDiscoveryProvider entry instead of duplicating the loop.
|
||||
func discoverACPModels(ctx context.Context, executablePath string, p acpDiscoveryProvider) ([]Model, error) {
|
||||
if executablePath == "" {
|
||||
- executablePath = "hermes"
+ executablePath = p.defaultBin
|
||||
}
|
||||
if _, err := exec.LookPath(executablePath); err != nil {
|
||||
return []Model{}, nil
|
||||
@@ -327,8 +375,9 @@ func discoverHermesModels(ctx context.Context, executablePath string) ([]Model,
|
||||
defer cancel()
|
||||
|
||||
cmd := exec.CommandContext(runCtx, executablePath, "acp")
|
||||
- // Mirror the real backend's auto-approve so init doesn't prompt.
- cmd.Env = append(os.Environ(), "HERMES_YOLO_MODE=1")
+ if len(p.extraEnv) > 0 {
+ cmd.Env = append(os.Environ(), p.extraEnv...)
+ }
|
||||
stdin, err := cmd.StdinPipe()
|
||||
if err != nil {
|
||||
return []Model{}, nil
|
||||
@@ -370,16 +419,16 @@ func discoverHermesModels(ctx context.Context, executablePath string) ([]Model,
|
||||
// Send initialize + session/new.
|
||||
if err := writeACP(1, "initialize", map[string]any{
|
||||
"protocolVersion": 1,
|
||||
- "clientInfo": map[string]any{"name": "multica-model-discovery", "version": "0.1.0"},
+ "clientInfo": map[string]any{"name": p.clientName, "version": "0.1.0"},
|
||||
"clientCapabilities": map[string]any{},
|
||||
}); err != nil {
|
||||
return []Model{}, nil
|
||||
}
|
||||
|
||||
- // Hermes requires a valid cwd for session/new — use a temp
- // directory we clean up afterwards, not the daemon's workdir
- // (which might be in the middle of another task's worktree).
- tmp, err := os.MkdirTemp("", "multica-hermes-discovery-")
+ // session/new requires a valid cwd — use a temp directory we
+ // clean up afterwards, not the daemon's workdir (which might
+ // be in the middle of another task's worktree).
+ tmp, err := os.MkdirTemp("", p.tmpdirPrefix)
|
||||
if err != nil {
|
||||
return []Model{}, nil
|
||||
}
|
||||
@@ -414,7 +463,7 @@ func discoverHermesModels(ctx context.Context, executablePath string) ([]Model,
|
||||
if env.ID.String() != "2" || len(env.Result) == 0 {
|
||||
continue
|
||||
}
|
||||
- done <- parseHermesSessionNewModels(env.Result)
+ done <- parseACPSessionNewModels(env.Result)
|
||||
return
|
||||
}
|
||||
}()
|
||||
@@ -432,14 +481,15 @@ func discoverHermesModels(ctx context.Context, executablePath string) ([]Model,
|
||||
}
|
||||
}
|
||||
|
||||
- // parseHermesSessionNewModels extracts the model catalog from a
- // hermes `session/new` response. Hermes' ACP schema emits:
+ // parseACPSessionNewModels extracts the model catalog from an ACP
+ // `session/new` response. Both Hermes and Kimi (and any other ACP
+ // agent that follows the standard schema) emit:
|
||||
//
|
||||
// {
|
||||
// "sessionId": "...",
|
||||
// "models": {
|
||||
// "availableModels": [
|
||||
- // {"modelId": "...", "name": "...", "description": "... current"}
+ // {"modelId": "...", "name": "...", "description": "..."}
|
||||
// ],
|
||||
// "currentModelId": "..."
|
||||
// }
|
||||
@@ -448,7 +498,7 @@ func discoverHermesModels(ctx context.Context, executablePath string) ([]Model,
|
||||
// Returns nil (not an empty slice) when the payload is missing so
|
||||
// the caller can distinguish "parsed with no models" (valid but
|
||||
// empty catalog) from "couldn't find the structure at all".
|
||||
- func parseHermesSessionNewModels(raw json.RawMessage) []Model {
+ func parseACPSessionNewModels(raw json.RawMessage) []Model {
|
||||
var resp struct {
|
||||
Models struct {
|
||||
AvailableModels []struct {
|
||||
|
||||
@@ -258,7 +258,7 @@ func TestParseHermesSessionNewModels(t *testing.T) {
|
||||
"currentModelId": "nous:anthropic/claude-opus-4.7"
|
||||
}
|
||||
}`)
|
||||
- models := parseHermesSessionNewModels(raw)
+ models := parseACPSessionNewModels(raw)
|
||||
if len(models) != 2 {
|
||||
t.Fatalf("expected 2 models (duplicate deduped), got %d: %+v", len(models), models)
|
||||
}
|
||||
@@ -281,13 +281,13 @@ func TestParseHermesSessionNewModelsMissingField(t *testing.T) {
|
||||
// failed _build_model_state — should yield nil so the caller
|
||||
// can distinguish "no catalog" from "empty catalog".
|
||||
raw := []byte(`{"sessionId": "ses_123"}`)
|
||||
- if got := parseHermesSessionNewModels(raw); got != nil && len(got) != 0 {
+ if got := parseACPSessionNewModels(raw); got != nil && len(got) != 0 {
|
||||
t.Errorf("expected nil/empty, got %+v", got)
|
||||
}
|
||||
}
|
||||
|
||||
func TestParseHermesSessionNewModelsGarbage(t *testing.T) {
|
||||
- if got := parseHermesSessionNewModels([]byte("not json")); got != nil {
+ if got := parseACPSessionNewModels([]byte("not json")); got != nil {
|
||||
t.Errorf("expected nil for non-JSON, got %+v", got)
|
||||
}
|
||||
}
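Since the body of `parseACPSessionNewModels` is cut off by the hunk boundary, the following sketch shows one way the documented contract could be met: return nil for non-JSON input or a missing `models` block, and drop duplicate `modelId` entries otherwise. The `Model` fields and the `parseModels` name are assumptions; the real type in the agent package may carry different fields.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Model is a cut-down stand-in for the agent package's Model type.
type Model struct {
	ID    string
	Label string
}

// parseModels extracts models.availableModels from a session/new result.
// It returns nil when the payload is not JSON or has no models block,
// and a non-nil (possibly empty) slice when the block is present.
func parseModels(raw []byte) []Model {
	var resp struct {
		Models *struct {
			AvailableModels []struct {
				ModelID string `json:"modelId"`
				Name    string `json:"name"`
			} `json:"availableModels"`
		} `json:"models"`
	}
	if err := json.Unmarshal(raw, &resp); err != nil || resp.Models == nil {
		return nil
	}
	seen := make(map[string]bool)
	out := []Model{}
	for _, m := range resp.Models.AvailableModels {
		if m.ModelID == "" || seen[m.ModelID] {
			continue // skip blanks and duplicates
		}
		seen[m.ModelID] = true
		label := m.Name
		if label == "" {
			label = m.ModelID // fall back to the ID when no name is given
		}
		out = append(out, Model{ID: m.ModelID, Label: label})
	}
	return out
}

func main() {
	raw := []byte(`{"sessionId":"ses_fake","models":{"availableModels":[` +
		`{"modelId":"a","name":"Model A"},{"modelId":"a","name":"Model A"},{"modelId":"b"}],` +
		`"currentModelId":"a"}}`)
	for _, m := range parseModels(raw) {
		fmt.Println(m.ID, m.Label)
	}
}
```

Using a pointer for the `models` field is what lets the function tell a missing block (nil) apart from a present but empty catalog (empty slice).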
|
||||
|
||||