Files
multica/server/pkg/agent/agent.go
Bohan Jiang 960befa56f feat(agent): per-agent toggle to isolate host-machine skills (MUL-2603) (#3200)
* feat(agent): per-agent toggle to isolate host-machine skills (MUL-2603)

Adds an agent-scoped `skills_local` switch ("ignore" default / "merge") so
shared agents stop inheriting the operator's user-global Claude skill
directory. A single broken local skill on one operator's machine was
crashing the Claude CLI before it ever read stdin — the daemon saw a
"broken pipe" with no recoverable signal (GitHub #3052).

- DB: migration 108 adds `agent.skills_local` (NOT NULL DEFAULT 'ignore'),
  with sqlc CreateAgent/UpdateAgent updates and handler validation.
- Claude runtime: when the agent is in "ignore" mode the backend points
  CLAUDE_CONFIG_DIR at an empty per-task scratch dir under the task cwd
  (fallback: OS temp), strips any inherited override, and cleans up after
  the run. Workspace skills under `{cwd}/.claude/skills/` still load.
  "merge" preserves the legacy inherit-from-machine behavior; Codex and
  other isolated backends are no-ops.
- UI: new Skills toggle in the Create Agent dialog and the Agent → Skills
  tab, with EN/zh-Hans copy and SkillsLocalToggle shared between the two.
- Tests: unit coverage for the new env helper, isolation dir lifecycle,
  full Claude execute paths (ignore + merge), and the handler tristate
  contract. Existing skills-tab test updated for the new copy.
- Docs: updated `/skills` docs (EN + ZH) and added a 0.3.7 changelog entry
  in the landing-page i18n.

Co-authored-by: multica-agent <github@multica.ai>

* fix(agent): preserve claude login + validate skills_local input (MUL-2603)

Address Elon's review on PR #3200:

1. Skill isolation no longer drops the operator's Claude login. The
   per-task scratch dir now mirrors every entry under `~/.claude/`
   as symlinks except `skills/`, so `.credentials.json`, settings,
   plugins, etc. reach the CLI exactly as on the host while the
   user-global skills directory stays hidden. Without this, default
   `ignore` would have broken every Claude agent on a non-API-key
   host the moment migration 108 landed.

2. Internal CreateAgent callers (agent_template, onboarding_shim)
   now set `SkillsLocal: "ignore"`. The Go zero value was about to
   trip the migration-108 CHECK constraint and 500 template /
   onboarding agent creation.

3. Create / update handler validation no longer normalizes garbage
   to "ignore". The strict 400 path is now reachable on bad client
   input; the drift-safe `normalizeSkillsLocal` stays on the read
   side only.

UI copy + docs clarified that the toggle is Claude-only; other
runtimes ignore the setting.

Verification:
- `go test ./...` green (full suite locally).
- `pnpm --filter @multica/views exec vitest run agents/components/tabs/skills-tab.test.tsx` green.
- Handler DB-backed tests still skip locally without docker (same
  as Elon's run) — CI will validate the create / update paths
  against migration 108.

Co-authored-by: multica-agent <github@multica.ai>

* fix(agent): mirror effective claude config dir with windows fallback (MUL-2603)

Address Elon's second-round review on PR #3200:

1. The per-task scratch dir now mirrors the *effective* host Claude
   config dir, not unconditionally `~/.claude/`. Precedence: agent
   `custom_env` CLAUDE_CONFIG_DIR > parent process env > `~/.claude/`.
   Without this, an operator who pinned Claude at a managed install
   (custom env CLAUDE_CONFIG_DIR) would get the wrong credentials in
   the scratch dir, because `buildClaudeEnv` strips that env before
   handing it to the child. We resolve the source up front and feed
   it to the mirror, so the override env still points at the right
   bytes.

2. Mirror entries now go through platform-aware linkers. On Windows
   without Developer Mode / admin, `os.Symlink` is denied, which
   previously left the scratch dir empty and broke Claude Code auth
   on default `ignore`. The new helpers try symlink first, then fall
   back to a directory junction (`mklink /J`) for dirs or a hardlink
   (same-volume content share) / copy for files. Mirrors the
   execenv/codex_home_link_windows.go pattern.

3. Tests:
   - `TestResolveHostClaudeConfigDir` locks in the custom_env >
     parent_env > `~/.claude` precedence.
   - `TestNewIsolatedClaudeConfigDirMirrorsCustomHostDir` confirms
     the scratch dir picks up `.credentials.json` from a synthetic
     custom host dir, proving the source resolution actually
     propagates into the mirror.
   - `TestNewIsolatedClaudeConfigDirEmptyHostIsNoop` documents the
     env-var-auth-only case (no host source ⇒ empty scratch dir).
   - `TestMirrorHostClaudeExceptSkillsWith_FallbackWhenSymlinkFails`
     exercises the Windows-no-Developer-Mode path via the new
     `mirrorHostClaudeExceptSkillsWith` seam, asserting credentials
     and sub-dir children still reach the scratch dir after the
     symlink stand-in fails.
   - `TestMirrorHostClaudeExceptSkillsWith_PropagatesFirstLinkError`
     confirms callers see the per-entry error when even fallback
     fails (so the warn-log fires on broken Windows installs).
   - `TestCopyFileRoundTrip` covers the last-resort copy fallback
     and its EXCL no-overwrite contract.
   - `TestClaudeExecuteIsolatesUsesCustomEnvSource` is the
     end-to-end check: an agent with custom_env CLAUDE_CONFIG_DIR
     reads its credentials from the pinned dir, not `~/.claude/`.

4. Docs: `apps/docs/content/docs/skills.{mdx,zh.mdx}` updated to
   describe the effective-source resolution and the Windows
   fallback chain so the docs match the runtime behaviour.

Verification:
- `go test ./...` green (full server suite locally, including
  `pkg/agent` 23 cases covering the new + existing isolation
  paths).
- `GOOS=windows GOARCH=amd64 go vet ./pkg/agent/...` and
  `go test -c -o /dev/null` both compile clean, confirming the
  Windows-tagged linker file builds.

Co-authored-by: multica-agent <github@multica.ai>

* fix(agent): default skills_local to merge to preserve legacy behavior (MUL-2603)

Per Bohan's product decision on PR #3200, the per-agent host-skill toggle
defaults to "merge" — the pre-MUL-2603 inherit-from-machine behavior —
so existing personal workflows that rely on locally installed Claude
Skills keep working unchanged. Agent owners explicitly opt into "ignore"
when they need to harden a shared agent against a broken local skill on
one operator's machine (GitHub #3052).

Also audited all 11 runtimes for user-global skill discovery paths and
documented the scope of the toggle. Only Claude reads a user-global
`~/.claude/skills/`; Codex isolates via `CODEX_HOME`, the ACP backends
(Hermes / Kimi / Kiro) and the JSON-stream backends (Copilot / Cursor /
Gemini / Pi / OpenCode / OpenClaw) anchor discovery to the task workdir
and never read a user-global skill directory. UI copy and docs now say
"for runtimes that support it (currently Claude Code)" everywhere so
the scope is explicit.

Changes:

- Migration 108: column default flipped to 'merge'.
- Handler CreateAgent: missing field → "merge"; explicit "ignore" /
  "merge" still validated, garbage still 400.
- normalizeSkillsLocal: drift-safe coercion now lands on "merge" for
  anything that isn't the exact literal "ignore".
- agent_template.go / onboarding_shim.go: internal CreateAgent callers
  send "merge" instead of "ignore" to match the new default.
- Claude runtime (`claude.go`): isolate-mode gate flipped from
  `SkillsLocal != "merge"` to `SkillsLocal == "ignore"`, so "" (legacy
  daemons / older clients) and "merge" both walk `~/.claude/` directly.
- Create Agent dialog + Skills tab: toggle defaults to on (merge); only
  duplicate of an explicit "ignore" agent carries through. The
  isolation opt-in is now `skills_local: "ignore"` when the user flips
  off; "merge" is omitted from the request body.
- i18n (EN + zh-Hans): copy reframed — "On (default) — merged"; "Off —
  ignored. Recommended for shared agents".
- Docs (`/skills`, `/guides/agents.zh`): describe new default and
  enumerate which runtimes act on the toggle.
- Landing changelog 0.3.7: retitled "Per-Agent Local-Skill Toggle"; note
  the on-by-default behavior + off-to-isolate framing.
- Tests:
  - `TestClaudeExecuteIsolatesHostSkillsWhenIgnoreOptedIn` replaces the
    old by-default isolation case (now requires explicit "ignore").
  - New `TestClaudeExecuteDefaultModeKeepsHostConfigDir` locks in that
    default ExecOptions preserve the host CLAUDE_CONFIG_DIR.
  - `TestClaudeExecuteIsolatesUsesCustomEnvSource` now explicitly opts
    into "ignore" mode.
  - Handler tests: omitted → "merge"; explicit "ignore" round-trips;
    preserve-existing test seeds "ignore" and asserts "merge" flip-back.
  - `TestNormalizeSkillsLocal_DriftStaysSafe`: only literal "ignore"
    maps to ignore; everything else → "merge".
  - `skills-tab.test.tsx`: toggle ON by default; flip OFF when agent
    opted into "ignore". Intro-text matcher anchored to a more specific
    phrase so it no longer collides with the toggle hint copy.

Verification:
- `go test ./...` green (full server suite locally).
- `GOOS=windows GOARCH=amd64 go vet ./pkg/agent/...` and
  `go test -c -o /dev/null` both compile clean (windows-tagged linker
  file still builds).
- `pnpm typecheck` green across all packages and apps.
- `pnpm --filter @multica/views test` 88 files / 771 tests green.
- `pnpm --filter @multica/core test` 43 files / 390 tests green.
- Handler DB-backed tests still skip locally without docker; CI will
  validate the create / update paths against migration 108.

Co-authored-by: multica-agent <github@multica.ai>

* chore(landing): drop 0.3.7 changelog entry from this PR (MUL-2603)

The landing-page release notes belong in a separate release-prep PR, not in the feature PR.

Co-authored-by: multica-agent <github@multica.ai>

* fix(agent): propagate skills_local=ignore to codex user-skill seed (MUL-2603)

Make the per-agent skills_local toggle real for Codex too, not just Claude.
Previously the toggle was only consumed by the Claude backend, while the
daemon's execenv layer always seeded Codex's per-task CODEX_HOME with the
host machine's user-installed skills from ~/.codex/skills/. A shared Codex
agent with skills_local=ignore could still inherit a broken local skill
from one operator's machine.

Now: PrepareParams/ReuseParams carry SkillsLocal; hydrateCodexSkills
skips seedUserCodexSkills when SkillsLocal == "ignore" so the per-task
CODEX_HOME exposes only workspace skills to the codex CLI. Default
("merge", or empty from older servers/clients) preserves existing
inherit-from-machine behavior. UI / docs are updated to reflect the
contract honestly: Claude Code and Codex honor the toggle; other
runtimes (Hermes / Kimi / Kiro / Copilot / Cursor / Gemini / Pi /
OpenCode / OpenClaw) leave $HOME untouched and discover user-level
skills natively, so the toggle is a no-op for them today.

New tests: TestPrepareCodexSkillsLocalIgnoreSkipsUserSeed,
TestPrepareCodexSkillsLocalMergeSeedsUserSkills, and
TestReuseCodexSkillsLocalIgnoreSkipsUserSeed cover Prepare(ignore),
Prepare(merge), and the toggle-flip-on-reuse path.

Co-authored-by: multica-agent <github@multica.ai>

* docs(skills): scope skills_local toggle copy to Claude Code + Codex (MUL-2603)

Off-state hint and Skills tab intro now explicitly call out Claude Code +
Codex as the only runtimes that honor the toggle, with "other runtimes
ignore this setting" wired into both states (en + zh-Hans), so users on
non-Claude/Codex agents don't read "Off" as runtime-wide isolation.

Docs (skills.mdx, skills.zh.mdx, guides/agents.zh.mdx) stop describing
Hermes / Kimi / Gemini / Copilot / Cursor / Pi / OpenCode / OpenClaw / Kiro
as having native user-level skill discovery; the daemon simply does not
manage user-level skill discovery for those runtimes today, and the toggle
is a no-op regardless of where it is set.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-26 13:26:33 +08:00

188 lines
7.4 KiB
Go

// Package agent provides a unified interface for executing prompts via
// coding agents (Claude Code, Codex, Copilot, OpenCode, OpenClaw, Hermes,
// Gemini, Pi, Cursor, Kimi, Kiro). It mirrors the happy-cli AgentBackend
// pattern, translated to idiomatic Go.
package agent
import (
"context"
"encoding/json"
"fmt"
"log/slog"
"time"
)
// Backend is the unified interface for executing prompts via coding agents.
type Backend interface {
// Execute runs a prompt and returns a Session for streaming results.
// The caller should read from Session.Messages (optional) and wait on
// Session.Result for the final outcome.
Execute(ctx context.Context, prompt string, opts ExecOptions) (*Session, error)
}
// ExecOptions configures a single execution.
type ExecOptions struct {
Cwd string
Model string
// SystemPrompt is consumed only by providers that can pass or safely inline
// developer/system instructions. Hermes ACP intentionally ignores it and
// relies on cwd-scoped context files such as AGENTS.md instead.
SystemPrompt string
MaxTurns int
Timeout time.Duration
SemanticInactivityTimeout time.Duration
ResumeSessionID string // if non-empty, resume a previous agent session
ExtraArgs []string // daemon-wide default CLI arguments appended before CustomArgs; currently read by claude and codex backends only
CustomArgs []string // per-agent CLI arguments appended after ExtraArgs
McpConfig json.RawMessage // if non-nil, MCP server config to pass via --mcp-config
// ThinkingLevel is the runtime-native reasoning/effort value (e.g.
// Claude's "low|medium|high|xhigh|max", Codex's "none|minimal|low|
// medium|high|xhigh"). Empty means "use the runtime/model default" —
// every backend that consumes this skips its --effort / reasoning_effort
// injection so the upstream CLI's own default applies. Currently honoured
// by the claude and codex backends only; other backends ignore the
// field rather than fail (so MUL-2339 can grow runtime support
// incrementally without breaking unrelated agents).
ThinkingLevel string
// SkillsLocal opts the runtime into ("merge") or out of ("ignore") the
// host machine's user-global skill directory. Honoured by the Claude
// backend in this package — "merge" (platform default; anything other
// than "ignore") preserves the inherit-from-machine behavior so the
// CLI walks `~/.claude/` directly, and "ignore" redirects
// CLAUDE_CONFIG_DIR to an isolated per-task scratch dir so the CLI
// never reads `~/.claude/skills/` (used to harden shared agents
// against broken host skills — GitHub #3052). Codex enforces the same
// contract one layer up in the daemon: `execenv.hydrateCodexSkills`
// skips the `~/.codex/skills/` seed when this value is "ignore", so
// the per-task CODEX_HOME exposes only workspace skills. Other
// backends ignore this option today (no-op).
SkillsLocal string
}
// Session represents a running agent execution.
type Session struct {
// Messages streams events as the agent works. The channel is closed
// when the agent finishes (before Result is sent).
Messages <-chan Message
// Result receives exactly one value — the final outcome — then closes.
Result <-chan Result
}
// MessageType identifies the kind of Message.
type MessageType string
const (
MessageText MessageType = "text"
MessageThinking MessageType = "thinking"
MessageToolUse MessageType = "tool-use"
MessageToolResult MessageType = "tool-result"
MessageStatus MessageType = "status"
MessageError MessageType = "error"
MessageLog MessageType = "log"
)
// Message is a unified event emitted by an agent during execution.
type Message struct {
Type MessageType
Content string // text content (Text, Error, Log)
Tool string // tool name (ToolUse, ToolResult)
CallID string // tool call ID (ToolUse, ToolResult)
Input map[string]any // tool input (ToolUse)
Output string // tool output (ToolResult)
Status string // agent status string (Status)
Level string // log level (Log)
SessionID string // backend session id (Status), for early resume-pointer pinning
}
// TokenUsage tracks token consumption for a single model.
type TokenUsage struct {
InputTokens int64
OutputTokens int64
CacheReadTokens int64
CacheWriteTokens int64
}
// Result is the final outcome after an agent session completes.
type Result struct {
Status string // "completed", "failed", "aborted", "timeout", "cancelled"
Output string // accumulated text output
Error string // error message if failed
DurationMs int64
SessionID string
Usage map[string]TokenUsage // keyed by model name
}
// Config configures a Backend instance.
type Config struct {
ExecutablePath string // path to CLI binary (claude, codex, copilot, opencode, openclaw, hermes, gemini, pi, cursor, kimi, kiro-cli)
Env map[string]string // extra environment variables
Logger *slog.Logger
}
// New creates a Backend for the given agent type.
// Supported types: "claude", "codex", "copilot", "opencode", "openclaw", "hermes", "gemini", "pi", "cursor", "kimi", "kiro".
func New(agentType string, cfg Config) (Backend, error) {
if cfg.Logger == nil {
cfg.Logger = slog.Default()
}
switch agentType {
case "claude":
return &claudeBackend{cfg: cfg}, nil
case "codex":
return &codexBackend{cfg: cfg}, nil
case "copilot":
return &copilotBackend{cfg: cfg}, nil
case "opencode":
return &opencodeBackend{cfg: cfg}, nil
case "openclaw":
return &openclawBackend{cfg: cfg}, nil
case "hermes":
return &hermesBackend{cfg: cfg}, nil
case "gemini":
return &geminiBackend{cfg: cfg}, nil
case "pi":
return &piBackend{cfg: cfg}, nil
case "cursor":
return &cursorBackend{cfg: cfg}, nil
case "kimi":
return &kimiBackend{cfg: cfg}, nil
case "kiro":
return &kiroBackend{cfg: cfg}, nil
default:
return nil, fmt.Errorf("unknown agent type: %q (supported: claude, codex, copilot, opencode, openclaw, hermes, gemini, pi, cursor, kimi, kiro)", agentType)
}
}
// DetectVersion runs the agent CLI with --version and returns the output.
func DetectVersion(ctx context.Context, executablePath string) (string, error) {
return detectCLIVersion(ctx, executablePath)
}
// launchHeaders maps each supported agent type to the user-visible skeleton
// that the daemon spawns before any custom_args are appended. This is
// intentionally minimal — only the command + subcommand (or a short mode
// label when there is no subcommand). Internal flags, transport values, and
// environment variables are deliberately omitted so the string is a hint
// about *what* users are extending, not a dump of the full command line.
var launchHeaders = map[string]string{
"claude": "claude (stream-json)",
"codex": "codex app-server",
"copilot": "copilot (json)",
"cursor": "cursor-agent (stream-json)",
"gemini": "gemini (stream-json)",
"hermes": "hermes acp",
"openclaw": "openclaw agent (json)",
"opencode": "opencode run (json)",
"pi": "pi (json mode)",
"kimi": "kimi acp",
"kiro": "kiro-cli acp",
}
// LaunchHeader returns the user-visible launch skeleton for agentType, or an
// empty string if the type is unknown. Callers render this as a preview so
// users understand which command their custom_args get appended to.
func LaunchHeader(agentType string) string {
return launchHeaders[agentType]
}