multica

mirror of https://github.com/multica-ai/multica.git synced 2026-07-05 13:29:44 +02:00

Author	SHA1	Message	Date
Bohan Jiang	e0e91fc792	feat(daemon): harden agent mention-loop instructions (#1581 ) * feat(daemon): harden agent mention-loop instructions Two agents that mention each other via `mention://agent/<id>` can fall into an infinite reply loop — each says "I'm done" in prose but keeps `@mentioning` the other, which re-enqueues their run. Adding hard caps on agent-to-agent turns conflicts with Multica's design principle of giving agents the same authorship freedom as humans, so this change hardens the instructions that the harness injects instead. - Replace the terse "mentions are actions" blurb with a full Mentions protocol: `side-effecting` warning, explicit "when NOT to mention" (replying to another agent, sign-offs, thanks) and "when a mention IS appropriate" (human escalation, first-time delegation, user asked). - Add a pre-workflow decision step for comment-triggered runs: decide whether a reply is warranted at all, decide whether to include any `@mention`, and clarify that the post-a-comment rule is mandatory if you reply — silence is a valid exit for agent-to-agent threads. - Thread the triggering comment's author kind + display name (`TriggerAuthorType` / `TriggerAuthorName`) from the claim endpoint through the daemon task type, per-turn prompt, and CLAUDE.md workflow. When the author is another agent, both surfaces now name that agent and warn against sign-off mentions. - Soften the old closing line that told agents to `always` use the mention format — the word generalized to member/agent mentions and encouraged the very behavior that causes loops. Refs GH#1576, MUL-1323. * fix(daemon): remove MUST-respond conflict and sanitize trigger author name Addresses two blocking points on PR #1581: 1. buildCommentPrompt told the agent "You MUST respond to THIS comment" and unconditionally appended the reply command — directly conflicting with the new agent-to-agent silence-as-valid-exit workflow. 
Models were likely to keep following the older must-reply rule and fall back into the loop this PR is trying to close. Rewrite the header as "Focus on THIS comment — do not confuse it with previous ones" (keeps the anti-stale-comment signal) and change BuildCommentReplyInstructions to open with "If you decide to reply, post it by running exactly this command" so the reply command is available but conditional across both prompt surfaces. 2. Raw agent/user display names were being embedded directly into the high-priority prompt and CLAUDE.md via TriggerAuthorName. Agent and member names are only validated as non-empty at write time, so a name containing newlines, backticks, or fake mention markup would turn the field into a cross-agent prompt-injection surface. Add execenv.SanitizePromptField — strip control runes, collapse whitespace, drop markdown structural characters (backtick, asterisk, brackets, pipe, angle brackets, hash, backslash), truncate to 64 runes — and apply it at both embed sites (per-turn prompt and CLAUDE.md). Defense-in-depth at the consumption layer so this works for already-stored names without a migration. Tests: TestSanitizePromptField covers the policy; TestBuildPromptSanitizesAgentName plants an attack payload in TriggerAuthorName and checks the rendered prompt does not leak the newline-anchored injection or the fake mention markup. TestBuildPromptCommentTriggered{,ByMember} updated to lock in the conditional reply-command framing. refactor(daemon): trim redundant CLAUDE.md preamble and drop name sanitizer Per PR #1581 feedback: 1. Remove the `if ctx.TriggerAuthorType == "agent"` preamble block in runtime_config.go. It duplicated what workflow steps 4 and 5 already say ("Decide whether a reply is warranted", "Never @mention the agent you are replying to as a thank-you or sign-off"), so the signal lands the same without the extra ~7 lines of CLAUDE.md. 
The per-turn prompt preamble in prompt.go stays — that surface has no numbered workflow below it and would otherwise lose the silence-as-exit signal. 2. Delete execenv.SanitizePromptField + its test. Workspace agents are created by trusted team members, so the cross-agent name-injection surface it defended isn't realistic in the current trust model. 3. Drop TriggerAuthorType/Name from execenv.TaskContextForEnv and stop populating them in daemon.go — they're no longer read by the execenv package. The same fields on daemon.Task stay because prompt.go still needs them to label the triggering author in the per-turn prompt. Tests simplified to match the leaner shape: CLAUDE.md regression guards now assert that the anti-loop phrases live in the numbered workflow, and the sanitizer-specific tests are removed.	2026-04-24 01:39:12 +08:00
LinYushen	d97aec83d7	fix: pass model to Hermes ACP and add hermes to InjectRuntimeConfig (#1203 ) * fix: pass model to Hermes ACP session/new and add hermes to InjectRuntimeConfig - hermes.go: include opts.Model in session/new params so Hermes uses the configured model instead of its default (fixes local LLM failures) - runtime_config.go: add "hermes" to the AGENTS.md provider list so Hermes receives the Multica runtime instructions and skill discovery Fixes: https://github.com/multica-ai/multica/issues/1195 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(hermes): drop false native-skill claim and add regression tests The previous change added 'hermes' to the 'skills discovered automatically' branch of buildMetaSkillContent, but resolveSkillsDir has no Hermes case so skills still land in the .agent_context/skills/ fallback. AGENTS.md ended up claiming native discovery while the files were somewhere else, which would mislead Hermes (and future debuggers). - Move 'hermes' to the fallback branch alongside 'gemini' so AGENTS.md points Hermes at .agent_context/skills/ — matching where writeContextFiles actually writes them. - Extract buildHermesSessionParams so the session/new payload is unit-testable. - Add regression tests covering: * buildHermesSessionParams includes/omits 'model' correctly * InjectRuntimeConfig('hermes') writes AGENTS.md with the fallback hint * writeContextFiles('hermes') writes skills to .agent_context/skills/ Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: CC-Girl <cc-girl@multica.ai>	2026-04-23 12:43:30 +08:00
Bohan Jiang	c76c790b32	fix(daemon/execenv): make posting result comment an explicit workflow step (#1372 ) Agents were silently finishing tasks without ever posting results to the issue — their final reply stayed in terminal/log output only. See MUL-1124. Root cause: the injected CLAUDE.md / AGENTS.md put "post a comment with results" inside the body of step 4 (a nested clause in the default workflow description), so skill-driven flows jumped straight from "do the work" to `status in_review`. - Hoist posting the result comment into its own explicit, numbered step in both assignment-triggered and comment-triggered workflows, with the exact `multica issue comment add` invocation inlined. - Add a hard warning at the top of the Output section that terminal / chat text is never delivered to the user. - Add regression test covering both workflow branches.	2026-04-20 17:48:06 +08:00
devv-eve	b2307a5ee9	fix(execenv): write Copilot skills to .github/skills/ for native discovery (#1270) GitHub Copilot CLI scans project-level skills from .github/skills/<name>/SKILL.md (per the official cli-config-dir-reference docs), not from .agent_context/skills/. Previously, skills injected for the copilot provider were placed under .agent_context/skills/ and only referenced by name in AGENTS.md, meaning Copilot would not actually pick them up. - resolveSkillsDir: add a dedicated copilot case writing to .github/skills/ - Update doc comments in context.go and runtime_config.go - Add TestWriteContextFilesCopilotNativeSkills covering the new path and ensuring .agent_context/skills/ is not created for copilot Co-authored-by: Devv <devv@Devvs-Mac-mini.local> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-17 03:07:32 -07:00
LinYushen	b5de04da59	fix(daemon): platform-aware Codex sandbox config to unbreak macOS network (MUL-963) (#1246 ) * fix(daemon): platform-aware Codex sandbox config to unbreak macOS network On macOS, Codex's Seatbelt sandbox in workspace-write mode silently ignores '[sandbox_workspace_write] network_access = true' (see openai/codex#10390). That blocks DNS inside the sandbox, so 'multica issue get' and other CLI calls fail with 'dial tcp: lookup ...: no such host' — this is what caused MUL-963. Changes: - New server/internal/daemon/execenv/codex_sandbox.go: picks a sandbox policy based on runtime.GOOS and the detected Codex CLI version. Non-darwin or darwin with a known-fixed version keeps workspace-write + network_access=true; older darwin falls back to danger-full-access and logs a warn with upgrade hint. The fix-version threshold is a single constant (CodexDarwinNetworkAccessFixedVersion) so it's easy to bump once upstream ships. - Per-task config.toml now gets a 'multica-managed' marker block (BEGIN/END comments) rewritten idempotently; user-owned keys outside the markers are preserved. Legacy inline sandbox directives from earlier daemon versions are stripped on migration. - execenv.PrepareParams gains CodexVersion; execenv.Reuse takes a codexVersion arg; daemon.go caches detected versions at registration and threads them through to Prepare/Reuse. - Replaces the old ensureCodexNetworkAccess tests with platform-parameterised coverage (linux vs darwin, idempotency, legacy-migration, policy matrix). - docs/codex-sandbox-troubleshooting.md: symptom fingerprint table, decision matrix, self-check commands, trade-offs. Refs: MUL-963 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(daemon): hoist managed sandbox block above user tables (MUL-963) Review on #1246 flagged that upsertMulticaManagedBlock appended the managed block to EOF. If the user's config.toml ends inside a TOML table (e.g. 
[permissions.multica] or [profiles.foo]), a trailing bare sandbox_mode = "..." is parsed as a key of that preceding table, so Codex silently ignores the policy the daemon meant to apply. Two changes make the block position-independent: - renderMulticaManagedBlock now emits only top-level key=value lines and uses TOML dotted-key form (sandbox_workspace_write.network_access = true) instead of opening a [sandbox_workspace_write] header. The block therefore neither inherits from nor leaks into any surrounding table. - upsertMulticaManagedBlock always hoists the block to the top of the file (stripping any previously written managed block first), so the sandbox_mode line is always at the TOML root regardless of what the user put below it. This also migrates configs written by the original PR #1246 logic where the block was trapped behind a user table. Added tests for the regression scenario (pre-existing [permissions.*] table) and the legacy-trailing-block migration; updated the existing Linux default test and the troubleshooting runbook to reflect the dotted-key form. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: CC-Girl <cc-girl@multica.ai> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-17 14:03:13 +08:00
Jiayuan Zhang	bc1185f525	Merge pull request #755 from sanjay3290/feat/gemini-backend feat(daemon): add Google Gemini CLI backend	2026-04-14 02:46:20 +08:00
yushen	20809052f5	fix(daemon): address GC review feedback - Move WriteGCMeta from runTask() to handleTask() so it runs after task completion, not at start. Mid-task crashes leave orphan dirs that get cleaned by GCOrphanTTL. - Strengthen isBareRepo to check both HEAD and objects/ directory. - Remove empty workspace directories after all task dirs are cleaned. - Add 30s context timeout to git worktree prune to prevent hangs. - Add comprehensive unit tests for shouldCleanTaskDir (8 scenarios), cleanTaskDir, gcWorkspace empty-dir cleanup, isBareRepo, and WriteGCMeta/ReadGCMeta roundtrip. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 15:00:37 +08:00
bulai0408	47eb6cb612	fix(agent): enable network access for Codex sandbox so Multica CLI can reach API Codex tasks running in workspace-write sandbox mode could not resolve api.multica.ai because the hardcoded sandbox parameter in thread/start overrode any config.toml settings, and the default sandbox policy blocks network access. Changes: - Remove hardcoded `sandbox: "workspace-write"` from thread/start RPC — let Codex read sandbox config from its own config.toml instead - Auto-generate config.toml in per-task CODEX_HOME with `sandbox_mode = "workspace-write"` and `network_access = true`, preserving any existing user settings - Fix Reuse() to restore CodexHome for Codex provider on workdir reuse Closes #368	2026-04-13 01:03:43 +08:00
Sanjay Ramadugu	f99f50eb0c	feat(daemon): add Google Gemini CLI backend Registers `gemini` as a sixth supported agent provider alongside claude, codex, opencode, openclaw, and hermes. - Daemon config probes for `gemini` on PATH (MULTICA_GEMINI_PATH / MULTICA_GEMINI_MODEL env overrides mirror the other providers). - New agent.geminiBackend in pkg/agent/gemini.go: spawns `gemini -p <prompt> --yolo -o text [-m <model>] [-r <session>]`, reads stdout to completion, and returns a single MessageText plus the standard Result struct (Status / Output / DurationMs). - Execution environment writes a GEMINI.md file into the task workdir (mirroring the existing CLAUDE.md / AGENTS.md injection for other providers) so Gemini discovers the Multica runtime meta-skill through its native mechanism. Tests: - pkg/agent/gemini_test.go — unit coverage for buildGeminiArgs (baseline, model override, resume session, omit-when-empty). - internal/daemon/execenv/TestInjectRuntimeConfigGemini — verifies GEMINI.md is written and that CLAUDE.md/AGENTS.md are NOT. Scope (intentional for v1): - Text output only (`-o text`). Streaming tool events via `--output-format stream-json` is a follow-up once we have a reliable reproduction of Gemini's event schema. - No MCP config plumbing. Gemini's `--allowed-mcp-server-names` filter pairs well with the per-agent MCP work on feat/per-agent-mcp; stacking the two can land as a follow-up. - No token usage scraping (Gemini's accounting lives on the Google Cloud side, not a local JSONL log like claude/codex). - No session resume wiring beyond accepting the ExecOptions field — the daemon does not yet persist Gemini session IDs because the text output mode does not expose them. Migration / env changes: - New optional environment variables MULTICA_GEMINI_PATH and MULTICA_GEMINI_MODEL. Default path is the string "gemini" (resolved via PATH at daemon startup). 
If no Gemini install is detected, the provider is simply absent from the runtime — no behavior change for existing deployments.	2026-04-11 22:58:49 -04:00
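The command line quoted in this entry, `gemini -p <prompt> --yolo -o text [-m <model>] [-r <session>]`, suggests an argument builder along these lines. The flags come from the commit message; the function signature and parameter names are assumptions, and the real `buildGeminiArgs` may take an options struct instead.

```go
package main

import "fmt"

// buildGeminiArgs is a hypothetical sketch of the argument list described
// in the commit message. Optional flags are omitted when their value is empty.
func buildGeminiArgs(prompt, model, session string) []string {
	args := []string{"-p", prompt, "--yolo", "-o", "text"}
	if model != "" {
		args = append(args, "-m", model)
	}
	if session != "" {
		args = append(args, "-r", session)
	}
	return args
}

func main() {
	fmt.Println(buildGeminiArgs("summarize the issue", "", ""))
	fmt.Println(buildGeminiArgs("summarize the issue", "some-model", "session-1"))
}
```

Keeping the builder as a pure function is what makes the four unit-test cases listed in the commit (baseline, model override, resume session, omit-when-empty) straightforward to write.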
Jiayuan Zhang	2c1d1d989c	fix(daemon): symlink Codex sessions dir to shared home for discoverability (#627) Per-task CODEX_HOME isolated session logs in per-task directories, making them invisible from the global ~/.codex/sessions/ where users expect to find them. Symlink the sessions directory back to the shared home so Codex writes session logs to the global location while keeping skills isolated per task. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-10 15:38:34 +08:00
Quake Wang	36db325d50	feat(daemon): add opencode as supported agent provider (#341) * feat(daemon): add opencode as supported agent provider Add opencode backend alongside claude and codex. The backend spawns `opencode run --format json`, parses streaming JSON events (text, tool_use, error, step_start/finish), and supports --prompt for system prompts. Includes CLI detection, AGENTS.md runtime config, native skill discovery via .config/opencode/skills/, and 21 tests covering handlers, JSON parsing, and integration-level processEvents scenarios. * chore: add .tool-versions to gitignore	2026-04-02 17:52:07 +08:00
Jiayuan	1054e218ed	fix(daemon): update execenv tests to match current renderIssueContext output CLI hints like "multica issue get" were moved to CLAUDE.md and are no longer rendered into issue_context.md. Remove stale assertions.	2026-03-31 15:15:06 +08:00
Jiayuan	7d126cc549	feat(daemon): group task directories by workspace ID Task execution environments were all created flat under WorkspacesRoot, mixing tasks from different workspaces. Now tasks are nested under their workspace ID for clearer organization and easier per-workspace cleanup.	2026-03-30 20:13:30 +08:00
Jiayuan	cdc1ac708e	feat(daemon): agent-driven repo checkout with bare clone cache Agents now decide which repo to use based on issue context and check out repos on demand via `multica repo checkout <url>`. Workspace repos are cached locally as bare clones for fast worktree creation. Key changes: - Add repocache package for bare clone management (clone, fetch, worktree) - Add `multica repo checkout` CLI command that talks to local daemon - Add POST /repo/checkout endpoint on daemon health server - Pass workspace repos metadata through register + task claim responses - Remove pre-created worktrees from execenv (workdir starts empty) - Update CLAUDE.md template to instruct agents to use `multica repo checkout` - Pass MULTICA_DAEMON_PORT, WORKSPACE_ID, AGENT_NAME, TASK_ID env vars to agent	2026-03-29 19:37:48 +08:00
Jiayuan	46144646c5	feat(daemon): inject skills into agent-native directories Write skills to provider-native paths so agents discover them automatically instead of relying on manual path references in CLAUDE.md/AGENTS.md. - Claude: write to {workDir}/.claude/skills/ (native discovery) - Codex: write to per-task CODEX_HOME/skills/ with auth/config seeded from ~/.codex/ (symlink auth.json, copy config files) - Fallback: keep .agent_context/skills/ for unknown providers Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-28 00:47:00 +08:00
LinYushen	6d2a0b45d2	refactor: decouple task lifecycle from issue status (#151 ) * refactor: decouple task lifecycle from issue status, add daemon health server - Remove automatic issue status changes from StartTask (in_progress), CompleteTask (in_review), and FailTask (blocked) in task service. Issue status is now fully managed by the agent via `multica issue status`. - Update agent prompt and meta skill to instruct agents to manage issue status themselves (in_progress → done/in_review/blocked). - Add daemon health HTTP server on 127.0.0.1:19514 with /health endpoint exposing pid, uptime, agents, and workspaces. Fail fast if port is taken (another daemon already running). - Update `multica status` to check both server and daemon health. - Add Save button to repos section in workspace settings UI. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor(daemon): simplify prompt, fix runtime config path, improve task error logging - Slim down BuildPrompt to a minimal hint; detailed workflow now lives in CLAUDE.md/AGENTS.md - Write CLAUDE.md to workDir root instead of .claude/CLAUDE.md - Fix git-exclude pattern (.claude → CLAUDE.md) - Decouple task queue reconciliation from issue status changes (agents manage status via CLI) - Add diagnostic logging when CompleteTask/FailTask fail due to unexpected task state Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(task): use task_completed/task_failed inbox notification types FailTask was sending "agent_blocked" which conflates agent crash with issue-level blocked status. Align notification types with the new decoupled model: task_completed and task_failed. Update frontend types and labels accordingly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-27 18:30:21 +08:00
yushen	1deae2a1e9	refactor(daemon): remove context snapshot, let agent fetch data via CLI Replace the frozen context snapshot pattern with a CLI-driven approach: agents now use `multica` CLI commands to fetch issue details, comments, and workspace context on demand, always getting the latest data. - Remove buildContextSnapshot and snapshot generation from enqueue - Claim endpoint now returns fresh agent name + skills from DB - Daemon resolves provider from local runtimeIndex, not snapshot - Prompt instructs agent to use `multica issue get` / `comment list` - Meta skill (CLAUDE.md/AGENTS.md) documents all available CLI commands - Skills still injected as filesystem files (static agent config) - Simplify daemon types: remove TaskContext/IssueContext/RuntimeContext Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-27 15:31:22 +08:00
Naiyuan Qing	395814b16a	fix(test): update daemon tests after removing acceptance_criteria/context_refs Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 19:26:35 +08:00
Naiyuan Qing	a500001093	refactor: remove acceptance_criteria and context_refs from issues These fields were unused in practice. Removed from frontend types, issue detail UI, backend handlers, daemon prompt/context, protocol messages, SQL queries, and tests. DB columns retained with defaults. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 19:24:34 +08:00
yushen	e4a905c841	fix(daemon): improve error handling in auth and workspace loading Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 17:22:12 +08:00
yushen	7b4a73c989	refactor(daemon): remove global ReposRoot, use per-task RepoPath from server ReposRoot was a daemon-level config that locked all tasks to a single git repo. Replace with RepoPath in TaskContext so the server can specify the repo per task. When not provided, daemon falls back to directory mode. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 16:04:33 +08:00
Naiyuan Qing	8983a9fefa	feat(logging): add structured logging across server and SDK Replace raw fmt/log calls with structured slog logger (Go) and console-based logger (TypeScript). Add request logging middleware. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 10:57:11 +08:00
Jiayuan Zhang	02df33803a	feat: structured skills system with meta skill runtime injection Replace agent.skills TEXT field with structured skill/skill_file/agent_skill tables. Skills are workspace-level entities with supporting files, reusable across agents via many-to-many bindings. Backend: migration 008, sqlc queries, CRUD handler, agent-skill junction, structured skill loading in task context snapshot. Daemon: meta skill injection via runtime-native config (.claude/CLAUDE.md for Claude, AGENTS.md for Codex) so agents discover .agent_context/ skills through their native mechanism. Lean prompt without inlined skill content. Frontend: Skills management page, agent Skills tab picker, SDK methods, TypeScript types, workspace store integration. Also removes auto-creation of init issues when creating agents. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-25 15:17:59 +08:00
Jiayuan Zhang	678266ec87	feat(daemon): add per-task isolated execution environments Introduce the `execenv` package that creates isolated working directories for each agent task. Supports git worktree mode (code tasks) and plain directory mode (non-code tasks), with `.agent_context/issue_context.md` injected into the workdir for Claude Code to discover. Key changes: - New `server/internal/daemon/execenv/` package (Prepare/Cleanup) - `runTask()` now creates isolated env instead of using shared reposRoot - Prompt updated to reference `.agent_context/` files - Add `WorkspacesRoot` config (default ~/multica_workspaces) - Add `KeepEnvAfterTask` config for debugging - Default agent timeout increased from 20min to 2h - `CompleteTask` now forwards branch name to server Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-25 12:41:52 +08:00

24 Commits