multica

mirror of https://github.com/multica-ai/multica.git synced 2026-07-05 21:39:54 +02:00

Author	SHA1	Message	Date
J	18b338ba1a	fix: remove gemini cli runtime Co-authored-by: multica-agent <github@multica.ai>	2026-06-24 14:44:05 +08:00
Multica Eve	8ad673fdb7	MUL-3560: gate slim runtime brief behind `runtime_brief_slim` feature flag (#4449 ) The MUL-3560 slim runtime brief — kind-driven dispatcher, per-section gating, prose compression for ~7k chars saved on the typical comment-triggered task — now ships behind the `runtime_brief_slim` feature flag wired via the framework-level service from MUL-3615. Default: OFF in every environment (production stays on the legacy brief that has shipped for ~2 years). Staging opts in via the YAML rule set; ops can override per-process with `FF_RUNTIME_BRIEF_SLIM=true`. Production is held back until staging has burned in long enough that we are confident the slim brief does not regress agent behaviour. Architecture (one toggle point, two code paths, both fully tested): buildMetaSkillContent (runtime_config.go) │ └─ useSlimBrief() → false (default) │ → fall through to the legacy verbose body that ships on │ main today — byte-for-byte unchanged, no migration risk │ └─ useSlimBrief() → true → buildMetaSkillContentSlim (runtime_config_sections.go) → classifyTask → 5-way kind switch → per-section writers BuildCommentReplyInstructions takes the same gate, so the per-turn comment prompt and the runtime brief stay in sync on which template they emit. What's in this PR: - runtime_config_flag.go (new): package-scope `runtimeFlags` atomic pointer + `SetFeatureFlags` setter + `useSlimBrief` toggle point. Nil-safe: a daemon that forgets to wire the service falls back to legacy, no panic. - runtime_config_kind.go (new): `taskKind` enum + `classifyTask` + `hasIssueContext` predicate. Used only by the slim path. - runtime_config_sections.go (new): the slim brief itself — `buildMetaSkillContentSlim` + per-section `writeXxx` helpers + `writeAvailableCommandsQuickCreate` minimal variant + `writeBackgroundTaskSafetySlim` compressed safety section. 
The Section × Kind matrix is documented inline on `buildMetaSkillContentSlim` and the test below checks the dispatcher does not diverge from the spec. - reply_instructions.go: `BuildCommentReplyInstructions` gains a short slim-or-legacy prelude; new `buildCommentReplyInstructionsSlim` is the compressed cookbook (defers the shell-hazard rationale to `## Comment Formatting`). - runtime_config.go: `buildMetaSkillContent` gains a 2-line dispatcher at the top; the legacy body is otherwise untouched. - runtime_config_kind_test.go (new): canaries for both paths. - TestClassifyTask: 5 kinds + 3 tiebreak cases. - TestTaskKindHasIssueContext: predicate semantics. - TestSlimFlagOffUsesLegacy: nil flag service → legacy path (renders "Get full issue details.", a legacy-only substring). - TestSlimFlagOnUsesSlim: flag on → slim path (renders "full issue.", a slim-only one-liner) AND must NOT render legacy "Get full issue details.". - TestBuildMetaSkillContentSlimKindMatrix: locks the per-kind section set; heading match is line-anchored so inline references don't trip absence assertions. - TestSlimQuickCreateAvailableCommands: locks the minimal-variant content for quick-create (issue create present, every other Core command absent). - TestSlimBriefIsSubstantiallyShorter: ≥ 30% reduction guard so a future change can't accidentally re-bloat the slim path back to legacy levels. - cmd/server/main.go: now calls `execenv.SetFeatureFlags(flags)` immediately after constructing the feature flag service. Measured impact (slim vs legacy, claude provider, realistic fixture with 2 repos + 2 skills + member initiator): legacy = 19567 chars slim = 11868 chars Δ = -7699 (-39.3%) Verification: - go vet ./internal/daemon/... ./cmd/server/... ok - go test ./internal/daemon/... ok - go test ./pkg/featureflag/... ok - TestSlimBriefIsSubstantiallyShorter logs the 39.3% ratio - TestSlimFlagOffUsesLegacy + TestSlimFlagOnUsesSlim pass both directions, so the dispatcher is locked in code. 
The pre-existing `internal/handler` test failures (TestLeaveWorkspace_RevokesOwnRuntimes, TestDeleteMember_CancelsTasksFromAgentReassignment, TestDeleteMember_NoRuntimes_DeletesMember) reproduce on plain `origin/main` with the same `relation "channel_user_binding" does not exist` SQL error — they are a missing-migration bug from the recent channels foundation PR (`ce28d0aa0`), not anything this PR touched. Rollout plan: 1. Merge this PR. Production daemons keep emitting the legacy brief (flag default false). 2. Add a YAML rule to staging's `MULTICA_FEATURE_FLAGS_FILE`: runtime_brief_slim: default: true Staging daemons start emitting the slim brief on next restart. 3. Watch `agent prompt prepared` logs + agent behaviour for 7 days. 4. If staging is clean, flip the prod YAML to `default: true`. Legacy code path stays in the binary as a kill-switch (`FF_RUNTIME_BRIEF_SLIM=false` to revert without a deploy). 5. After ~30 days clean in prod, follow up with a PR that deletes the legacy body and the flag — same pattern as docs/feature-flags.md recommends ("plan the death of the flag at birth"). Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: multica-agent <github@multica.ai>	2026-06-24 14:23:17 +08:00
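The nil-safe toggle described in the commit above can be sketched roughly as follows. This is an illustration only, not the code in `runtime_config_flag.go`: the `FlagService` interface, its `Enabled` method, the `staticFlags` type, and `buildBrief` are hypothetical names invented for the example; only `runtimeFlags`, `SetFeatureFlags`, `useSlimBrief`, and the `runtime_brief_slim` flag name come from the commit message.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// FlagService is a hypothetical stand-in for the feature flag service.
type FlagService interface {
	Enabled(name string) bool
}

// staticFlags is a toy implementation used only in this example.
type staticFlags map[string]bool

func (s staticFlags) Enabled(name string) bool { return s[name] }

// runtimeFlags holds the wired service; it stays nil until SetFeatureFlags runs.
var runtimeFlags atomic.Pointer[FlagService]

// SetFeatureFlags wires (or, with nil, unwires) the flag service.
func SetFeatureFlags(svc FlagService) {
	if svc == nil {
		runtimeFlags.Store(nil)
		return
	}
	runtimeFlags.Store(&svc)
}

// useSlimBrief is the single toggle point. An unwired service means "legacy".
func useSlimBrief() bool {
	p := runtimeFlags.Load()
	if p == nil {
		return false
	}
	return (*p).Enabled("runtime_brief_slim")
}

// buildBrief is a placeholder for the two-line dispatcher.
func buildBrief() string {
	if useSlimBrief() {
		return "slim brief"
	}
	return "legacy brief"
}

func main() {
	fmt.Println(buildBrief()) // nothing wired yet: legacy path
	SetFeatureFlags(staticFlags{"runtime_brief_slim": true})
	fmt.Println(buildBrief()) // flag on: slim path
}
```

Keeping the nil check inside `useSlimBrief` means callers never have to know whether the service was wired, which is the property the commit calls out ("falls back to legacy, no panic").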
Naiyuan Qing	b79777caec	feat(comments): resolve-aware fold for agent comment reads (MUL-3555) (#4463 ) * feat(comments): resolve-aware fold for agent comment reads (MUL-3555) Agents reading a long issue paid tokens for settled discussion. The human timeline already folds resolved threads, but the agent read path (`comment list`) ignored resolved_at entirely — humans saw the conclusion, agents got the full raw discussion. Add an opt-in `fold=true` projection to ListComments that collapses each resolved thread to root + conclusion (reply-resolved) or root only (root-resolved), reusing the human timeline's deriveThreadResolution semantics. The resolved thread's root carries `thread_resolved` + `folded_count`; `--full` brings the dropped comments back. Fold is rejected on partial-thread reads (since/tail) and roots_only, where a resolution comment could be unfetched and silently dropped. CLI `comment list` folds by default on the complete-thread reads (default, --recent, untailed --thread) with a `--full` escape hatch; the agent prompts and runtime brief document the fold + escape. No new endpoint, no human UI change, no SQL/migration change — in-memory projection, same precedent as summary/roots_only. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * refactor(daemon): dedupe fold prompt restatements per review (MUL-3555) Howard's PR review flagged DRY redundancy: the resolve-fold rule was restated in full in the task prompt (prompt.go:41/:182) and the brief workflow steps (runtime_config.go:673/:692, reply_instructions cold hint) even though the canonical command catalog (runtime_config.go:477) — always present in the brief — already documents it in full, and the task prompt explicitly defers to it ("follow the rule in your runtime workflow file"). 
Keep the catalog entry full (the canonical reference); shrink the five inline restatements to a short "resolved threads come back folded — `--full` to expand" pointer. No loss of signal (the agent always has the full catalog in context), ~80-120 tokens/run saved on the worst-case assignment / cold paths. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai>	2026-06-24 09:52:18 +08:00
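A minimal sketch of the fold projection described in the commit above, under stated assumptions: the `Comment` struct, the `foldThread` function, and the boolean `Resolved` field (standing in for a non-null `resolved_at`) are hypothetical, and picking the latest resolved reply as the conclusion is a guess at what `deriveThreadResolution` does. Only `thread_resolved` and `folded_count` are names from the commit message.

```go
package main

import "fmt"

// Comment is a simplified, hypothetical comment record.
type Comment struct {
	ID             string
	Resolved       bool // stands in for a non-null resolved_at
	ThreadResolved bool // set on the root of a folded thread
	FoldedCount    int  // number of comments dropped by the fold
}

// foldThread takes one complete thread (root first, replies in order) and
// returns root only (root-resolved), root + conclusion (reply-resolved),
// or the thread unchanged when nothing is resolved.
func foldThread(thread []Comment) []Comment {
	if len(thread) == 0 {
		return nil
	}
	root := thread[0]
	if root.Resolved {
		root.ThreadResolved = true
		root.FoldedCount = len(thread) - 1
		return []Comment{root}
	}
	// Assumption: the most recent resolved reply is the conclusion.
	for i := len(thread) - 1; i >= 1; i-- {
		if thread[i].Resolved {
			root.ThreadResolved = true
			root.FoldedCount = len(thread) - 2
			return []Comment{root, thread[i]}
		}
	}
	return thread
}

func main() {
	thread := []Comment{
		{ID: "root"}, {ID: "r1"}, {ID: "r2"}, {ID: "r3", Resolved: true},
	}
	for _, c := range foldThread(thread) {
		fmt.Println(c.ID, c.ThreadResolved, c.FoldedCount)
	}
}
```

The sketch only works on a complete thread, which is consistent with the commit rejecting fold on partial-thread reads: if the resolving comment were not fetched, the function could not tell the thread was resolved.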
Naiyuan Qing	4ab335b8a5	MUL-3416: Issue pre-trigger preview + Handoff Note (#4383 ) * feat(issues): unify run-enqueue decision behind WillEnqueueRun + preview endpoint Collapse the issue update/batch enqueue copies into one service predicate service.IssueService.WillEnqueueRun, shared verbatim with a new dry-run endpoint POST /api/issues/preview-trigger so the four entry points stop drifting (squad/self-loop/batch omissions, MUL-3375). The private-agent gate stays at the HTTP boundary: write paths inject allow-all, preview injects the real gate so it never leaks a private agent's readiness. Add suppress_run to issue update/batch: the change applies but no run starts. Remove the now-dead handler mirrors shouldEnqueueSquadLeaderOnAssign / isSquadLeaderReady. service.Create and the comment trigger chain are untouched. Tests: preview behavior, preview<->write-path match, batch aggregation, member no-trigger, suppress_run skip, malformed-body 400. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * feat(issues): inject handoff note into assigned runs via first-class task field Add an optional handoff_note carried by issue assign/promote into the run's opening prompt and issue_context.md, via a dedicated agent_task_queue column (migration 122) and a daemon assignment-handoff render branch — never a fabricated comment, never trigger_comment_id (MUL-3375 §6.1). Thread the note through enqueueIssueTask/enqueueMentionTask + WithHandoff public variants and dispatchIssueRun; suppress_run or a parked write drops it (no run = nothing to inject). Soft version gate: MinHandoffCLIVersion + HandoffSupported, surfaced per-trigger as handoff_supported in the preview so the UI can gray the note box on old daemons; the assignment never hard-fails. Tests: daemon prompt + issue_context render via the assignment branch (not quick-create/comment), version helper matrix, note persists on the task, suppressed assign enqueues nothing. 
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * feat(issues): leave a display-only handoff record on the timeline When an assign/promote with a handoff note starts a run, write one type='handoff' timeline record via TaskService.RecordHandoff — a direct Queries.CreateComment + timeline event that bypasses Handler.CreateComment, so it never reaches triggerTasksForComment and cannot start a second run (MUL-3375 §6.2, the must-not-retrigger invariant). Author is the actor who handed off; body is the note. Migration 123 admits the 'handoff' comment type. Recorded only on a real run start: suppress_run or a parked write writes nothing. enqueueSquadLeaderTask now reports whether it enqueued so the trace is gated on an actual dispatch. Test: exactly one handoff record on assign-with-note, exactly one task (no re-trigger), and no record when suppressed. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * feat(issues): frontend plumbing for issue-trigger preview + handoff (core) Add api.previewIssueTrigger + IssueTriggerPreviewSchema (zod parseWithFallback), the use-issue-trigger-preview hook, issueKeys.issueTriggerPreview(+All) with WS queue-state invalidation, suppress_run/handoff_note on UpdateIssueRequest, the 'handoff' CommentType, and stripping of the control fields from optimistic update/batch cache patches (MUL-3375 §9). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * fix(issues): exclude handoff records from new-comment counting type='handoff' is a display-only timeline record, not conversation. Exclude it from CountNewCommentsSince so a handoff note never inflates the count of "new comments to catch up on" fed to a claiming agent (MUL-3375 §12). Analytics already excludes it (RecordHandoff is a direct write that emits no analytics event), and the comment-trigger path is already bypassed. 
Test: a handoff record does not bump the new-comment count; a real comment does. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * feat(issues): pre-trigger preview UI, handoff note, timeline card (web/desktop) Wire the §9 frontend onto the preview endpoint + handoff fields: - Delete the backlog blocking dialog (backlog-agent-hint) and its modal type; the over-eager nag is gone. Backlog awareness is now a passive label. - RunConfirmModal: single assign + batch assign/status route here. Shows the backend predicate's verdict ("将启动 @X" / "将启动 N 个" / parked), an optional handoff note (assign only, soft-gated by handoff_supported), and 暂不启动 — then applies via update/batch. No frontend guessing. - create modal: passive CreateRunHint ("将启动 @X" / backlog parked). - single status change stays a direct apply (unchanged). - timeline: render type='handoff' as a distinct, non-interactive handoff card. - i18n run_confirm + handoff_card across en/ja/ko/zh-Hans; drop backlog action keys; locale parity green. Tests: use-issue-actions (assign → run-confirm modal, member → direct), create-issue + comment-card suites updated/green; views typecheck + lint clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> test(issues): use a valid anchor in the handoff count-exclusion test CountNewCommentsSince filters id <> @anchor_id; SQL id <> NULL is NULL and excludes every row, so an empty anchor made the control assertion read 0. The production caller always passes a real anchor — mirror that with a non-matching sentinel uuid. 
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * test(issues): RunConfirmModal apply logic (start/suppress/note-gate/batch) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * test(core): preview schema malformed/missing/null fallback coverage Cover IssueTriggerPreviewSchema via parseWithFallback (MUL-3375): well-formed parse, top-level + item default fills (empty/older backend), and fallback to { triggers: [], total_count: 0 } for malformed shapes, a dropped required issue_id, a wrong-typed total_count, and null/non-object bodies — so the four entry points degrade to "nothing will start" instead of throwing. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * refactor(issues): remove display-only handoff timeline record (留痕) The handoff "留痕" timeline record (type='handoff' comment written on run start) was judged superfluous and dropped per product call. This removes only the display-only trace; the handoff NOTE injection into the run's opening prompt + issue_context.md is untouched. - backend: drop RecordHandoff + its call in dispatchIssueRun - db: drop the `type <> 'handoff'` exclusion in CountNewCommentsSince and migration 123 (comment_type_check reverts to the 4-type set from 001); no production data exists for this unreleased feature - frontend: drop the "handoff" CommentType, HandoffCard, and handoff_card i18n (all locales) - tests: drop handoff_count_test.go and the record-write assertions in issue_trigger_preview_test.go (note-injection tests retained) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * feat(issues): dismissable run-confirm modal + team-handoff copy Two fixes to the pre-trigger confirm modal (MUL-3375). 1. 
Dismissable: switch RunConfirmModal from AlertDialog to the standard shadcn Dialog so it has the close (X) button + Esc + click-outside. Previously the only choices were "start" / "don't start now" with no way to abort the action entirely; dismissing now cancels with no write. 2. Copy: rework the action-surface wording away from the backend term "run" toward team-handoff voice — 指派 / 开始 / 交接 (run stays only on record surfaces). Unifies the note's three names to "交接说明", and parallels the rewrite across en/ja/ko. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * chore(agent): bump handoff note min CLI version to 0.3.28 The daemon release that renders handoff notes ships in 0.3.28 (0.3.27 was the prior tag), so move the soft-gate threshold up. Below this the note is silently dropped and the frontend grays the note box — assignment is never blocked. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(issues): skip run-confirm when batch-moving issues to backlog A move into backlog never starts a run (service/issue_trigger.go), so the pre-trigger confirm modal degenerated to an empty "won't start" box with a single Apply button — pure friction. Apply directly instead, matching the single-issue status path. Other target statuses still route through the modal. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(issues): refine pre-trigger preview hint and copy - Move the create-issue run hint to a reveal band (grid 0fr→1fr) above the property toolbar. It was sharing the footer button row and, lacking a width constraint, reflowed the submit buttons whenever it appeared. Restyle to a borderless, comment-style avatar+caption that is purely a caption (non-interactive avatar). - Distinguish squad from agent in the pre-trigger copy: a squad's leader evaluates and delegates rather than "starting work" itself. 
Add will_start_named_squad / will_start_squad / create_will_start_squad across en/zh/ja/ko (reusing the squad_leader_* evaluate→arrange vocabulary) and branch run-confirm + the create hint on squad assignees. - Bold the assignee name in the run-confirm headline via a language-safe sentinel split (no per-language prefix/suffix keys). - Align zh "开始处理" → "开始工作" on the single-assign copy. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(issues): stub ActorAvatar in create-issue suite CreateRunHint now renders an ActorAvatar for agent/squad assignees, which pulls in getActorInitials/getActorAvatarUrl + the workspace/presence/navigation hook tree. This form-focused suite only stubbed getActorName, so the squad-forwarding test crashed with "getActorInitials is not a function". Stub the avatar inert — its own behavior is covered elsewhere. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Walt <walt@multica.ai> Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai>	2026-06-23 13:17:13 +08:00
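The "one predicate, injected gate" idea from the first commit in the series above might look roughly like this. It is a sketch, not `service.IssueService.WillEnqueueRun`: the `Change` struct, the `AgentGate` type, and the `allowAll` helper are hypothetical, and the real predicate certainly considers more inputs. The three early returns mirror rules stated in the commit messages (`suppress_run` starts nothing, member assignees do not trigger, a move into backlog never starts a run).

```go
package main

import "fmt"

// AgentGate reports whether the caller may see/run the given agent.
// Hypothetical type: write paths pass allowAll, preview passes the real gate.
type AgentGate func(agentID string) bool

func allowAll(string) bool { return true }

// Change is a simplified, hypothetical view of an issue update.
type Change struct {
	AssigneeAgentID string // empty when the assignee is a human member
	TargetStatus    string
	SuppressRun     bool
}

// willEnqueueRun is the single decision shared by write paths and the preview.
func willEnqueueRun(c Change, gate AgentGate) bool {
	if c.SuppressRun {
		return false // the change applies but no run starts
	}
	if c.AssigneeAgentID == "" {
		return false // member assignees never trigger a run
	}
	if c.TargetStatus == "backlog" {
		return false // a move into backlog parks the issue
	}
	return gate(c.AssigneeAgentID)
}

func main() {
	c := Change{AssigneeAgentID: "agent-1", TargetStatus: "todo"}
	privateGate := func(id string) bool { return id != "agent-1" }

	fmt.Println(willEnqueueRun(c, allowAll))    // write path
	fmt.Println(willEnqueueRun(c, privateGate)) // preview for a caller who cannot see agent-1
}
```

Injecting the gate rather than hard-coding it is what lets the preview avoid leaking a private agent's readiness while the write paths keep their existing behaviour.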
Bohan Jiang	48b8dbf439	feat(daemon): surface sub-issue stages in the always-on runtime brief (#4426) Agents creating sub-issues only saw the runtime brief's Sub-issue Creation section, which taught the manual todo/backlog serial chain and never mentioned stages — the `--stage` flow was documented only in the multica-working-on-issues skill, which an agent reads only if it opens it. So agents defaulted to hand-managed backlog chains and rarely reached for stages. - Add an "Ordering with stages" paragraph to the brief's Sub-issue Creation section nudging agents to group ordered/waiting sub-issues with --stage instead of hand-promoting a backlog chain. - List --stage on the brief's issue create / update command lines and add multica issue children to the Core command list for discoverability. - Extend the brief test with the new stage assertions. The Sub-issue Creation section stays gated to issue-bound runs (skipped for chat/quick-create/autopilot), unconditional on parent_issue_id, and free of parent-notification guidance — all existing canaries still pass. MUL-3508 Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-23 01:08:10 +08:00
Bohan Jiang	da72e2fa22	feat(daemon): inject project description into the agent brief (MUL-3465) (#4395 ) * feat(daemon): inject project description into the agent brief Issues bound to a project only surfaced the project title in the runtime brief; the project description (durable, project-wide context the owner sets) was loaded but dropped. Carry it end-to-end: - claim handler reads proj.Description onto the response (issue-bound and quick-create paths) - new ProjectDescription field on AgentTaskResponse, daemon Task, and TaskContextForEnv - rendered in the brief's `## Project Context` section and written to .multica/project/resources.json as project_description Empty descriptions render nothing (no extra heading). Updated the projects-and-resources built-in skill docs in the same change. MUL-3465 Co-authored-by: multica-agent <github@multica.ai> * feat(projects): clarify project description is injected as agent context The project description is now durable context injected into every task's brief, but the UI still presented it as a plain "Description" field, so existing descriptions could silently become agent input. Add a hint under the description editor on the project detail page and in the create-project modal, in all four locales, stating it is shared with agents as context for every task in the project. No data-semantics change. Addresses review feedback on PR #4395. MUL-3465 Co-authored-by: multica-agent <github@multica.ai> * test(handler): assert project description flows through task claim The execenv tests cover brief rendering, but nothing pinned the claim handler boundary where proj.Description is read onto the response. Add two tests — issue-bound and quick-create paths — so a regression in that assignment fails loudly instead of silently dropping the description. Addresses review feedback on PR #4395. 
MUL-3465 Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-22 23:39:27 +08:00
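The "empty descriptions render nothing" rule from the commit above can be illustrated with a small renderer. This is a sketch under assumptions: the function name `renderProjectContext`, the `Project:` label, and the exact layout are invented for the example; only the `## Project Context` heading and the rule itself come from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// renderProjectContext is a hypothetical renderer for the brief's project
// section. A blank description adds nothing to the output.
func renderProjectContext(title, description string) string {
	if title == "" {
		return "" // issue is not bound to a project
	}
	var b strings.Builder
	b.WriteString("## Project Context\n\n")
	b.WriteString("Project: " + title + "\n")
	if d := strings.TrimSpace(description); d != "" {
		b.WriteString("\n" + d + "\n")
	}
	return b.String()
}

func main() {
	fmt.Print(renderProjectContext("Apollo", "Ship the lander firmware; prefer small PRs."))
	fmt.Print(renderProjectContext("Apollo", "   "))
}
```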
DylanLi	78342a39ce	MUL-3305: feat(agent): add qoder CLI as a choice of agent provider. (#2461 ) * feat(agent): Qoder ACP runtime, chat reconnect recovery, and task linkage - Add Qoder CLI backend (ACP transport, model discovery, blocked-args policy) - Wire daemon/runtime config, docs, and UI provider assets - Retry terminal task reports; add backoff unit tests - Chat: SQL attach user message to task; handler + optimistic cache reconcile - Invalidate chat/task-messages caches on WS reconnect; extract helper + tests Co-authored-by: Orca <help@stably.ai> Co-authored-by: Cursor <cursoragent@cursor.com> * chore: drop non-Qoder changes (chat reconnect, task link, terminal report retries) Keep only Qoder runtime, docs, daemon config/execenv, and UI provider assets. Co-authored-by: Orca <help@stably.ai> Co-authored-by: Cursor <cursoragent@cursor.com> * fix(agent): harden Qoder ACP drain and wire project skills path - Stop streaming to msgCh after reader wait so grace timeout cannot race close - Resolve injected skills to .qoder/skills per Qoder CLI discovery - Update AGENTS.md skill copy and add execenv tests Co-authored-by: Orca <help@stably.ai> Co-authored-by: Cursor <cursoragent@cursor.com> * feat(qoder): add provider logo and wire MCP config into ACP sessions - Add inline SVG QoderLogo component to provider-logo.tsx, replacing the generic Monitor icon placeholder - Add convertMcpConfigForACP helper to convert Claude-style MCP server config (object map) into ACP array format for session/new and session/resume - Add unit tests for convertMcpConfigForACP covering stdio, SSE, empty/nil, and multi-server cases Co-authored-by: Orca <help@stably.ai> * fix(test): capture both return values from InjectRuntimeConfig in Qoder test Co-authored-by: Orca <help@stably.ai> * fix(qoder): preserve remote MCP headers and promote provider errors Addresses review feedback on #2461 (Bohan-J): two runtime-correctness issues in the Qoder ACP backend. 1. Remote MCP headers were dropped. 
The bespoke convertMcpConfigForACP only forwarded url/type, so an authenticated remote MCP server looked configured in Multica but failed inside the Qoder session. Replace it with the shared buildACPMcpServers helper (same path Hermes/Kimi/Kiro use), which preserves headers as [{name, value}], sorts for deterministic output, and handles remote transport aliases. Fail closed on malformed mcp_config instead of silently dropping servers. 2. Provider failures could report as completed tasks. stderr was wired via io.MultiWriter and the result was only promoted to failed when output was empty, so a terminal upstream error (HTTP 429 / expired token) racing a stopReason=end_turn with text still became "completed". Switch to StderrPipe + an explicit copier, drain it (bounded by the existing grace window, since qodercli can leave a child holding the inherited fds) before the decision, and run the shared promoteACPResultOnProviderError. Tests: replace the convertMcpConfigForACP unit tests with two end-to-end Qoder tests — one asserts the Authorization header reaches the session/new payload as {name, value}, the other asserts a terminal stderr error with non-empty output reports failed. Co-authored-by: Orca <help@stably.ai> * fix(qoder): align ACP session handling Co-authored-by: Orca <help@stably.ai> * fix(agent): guard qoder late output after drain Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Orca <help@stably.ai> Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-22 18:55:45 +08:00
Naiyuan Qing	4fe8b54e9b	MUL-3446: keep chat output in chat (#4387) * MUL-3446: keep chat output in chat Co-authored-by: multica-agent <github@multica.ai> * MUL-3446: simplify chat output guidance Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-06-22 15:51:03 +08:00
BeliyDym	5fd3d01d13	MUL-3502: OST-1161: Bound assignment comment catch-up Squashed PR #4392. Updates assignment/comment catch-up guidance to use recent 10 and aligns related examples.	2026-06-22 15:46:47 +08:00
Hzzzzzx	b4c9e4423c	test: enable -race detector in Go test pipeline (WOR-61) (#4274 ) * test: enable -race detector in Go test pipeline (WOR-61) Add the -race flag to all three Go test invocation sites so the existing concurrency regression harness (workdir_race_test.go for #3999, runtime_gone_test.go, runtime_profile_drift_test.go) actually exercises the race detector. The daemon package alone has 28+ goroutine launch points with no automated race coverage before this change. Sites updated: - Makefile:299 (make test, local) - .github/workflows/ci.yml:101 (CI backend job) - .github/workflows/release.yml:55 (release verify job) go test already runs a vet subset by default, so no separate -vet flag is added. No production code touched. Co-authored-by: multica-agent <github@multica.ai> * test(execenv): serialize runtimeGOOS-mutating test (WOR-61) TestInjectRuntimeConfigIssueMetadataCodexFormattingUnchanged called t.Parallel() while mutating the package-level runtimeGOOS to drive the windows/linux branches, racing with the other parallel tests that read runtimeGOOS in buildMetaSkillContent. The -race flag enabled in the prior commit surfaced it as 3 WARNING: DATA RACE reports and 11 "race detected" failures in CI (only the execenv package failed). Drop t.Parallel() and add the "// Not parallel: mutates the package-level runtimeGOOS." comment already used by the six sibling writer tests across execenv_test.go and reply_instructions_test.go. This is test-isolation only; no production code, no mutex/atomic, no signature change. Verified locally: go test -race -count=1 ./internal/daemon/execenv/ -> ok 2.276s go test -race -count=1 ./internal/daemon/... -> all 3 pkgs ok Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: hzz <331380069@qq.com> Co-authored-by: multica-agent <github@multica.ai>	2026-06-18 15:50:24 +08:00
Bohan Jiang	1279f22d1c	MUL-3325: add background task safety brief (#4257) * fix(daemon): add background task safety brief Co-authored-by: multica-agent <github@multica.ai> * fix(agent): force Claude background tools foreground Co-authored-by: multica-agent <github@multica.ai> * fix(agent): narrow Claude async launch detection Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-18 10:45:51 +08:00
Multica Eve	18a58e80c0	MUL-3316: fix(execenv): switch agent prompt to --content-file to prevent heredoc flag swallowing (#4182 ) (#4191 ) * fix(execenv): switch agent prompt to --content-file to prevent heredoc flag swallowing (#4182) The Linux/macOS reply template recommended --content-stdin with a quoted HEREDOC. That pattern is safe for the trivial single-flag comment-add case that BuildCommentReplyInstructions emits, but as soon as a model wraps extra flags around the heredoc on multica issue create / update — assignee, project — the bash heredoc/flag boundary is fragile in two ways the model cannot see: - A 'BODY \\' terminator with a trailing token is not recognised as the heredoc end, so flag lines after it are swallowed into the description (OXY-78: residual flag text leaked into the description, command exit 0). - A clean terminator turns the trailing '--assignee ...' line into a separate failing shell statement, while the create itself already exited 0 with no assignee (OXY-76: assignee silently dropped, no residual text). In both cases the CLI never receives the swallowed flags, the API request omits the fields, and the daemon has no visibility. The created issue lands with assignee_id: null / project_id: null. This commit: * Switches the Linux/macOS branch of BuildCommentReplyInstructions to --content-file with a 3-step recipe (write file, post, rm) so the body never reaches the shell and all flags live on one shell-token line. There is no heredoc boundary for flags to leak across. * Adds a parallel cleanup step (Remove-Item) to the Windows branch so the cross-platform template is one shape. * Rewrites the runtime_config.go ## Comment Formatting non-Windows section to mandate --content-file and explicitly ban --content-stdin HEREDOC for agent-authored comments, citing #4182. 
* Reorders the Available Commands menu lines for issue create / update / comment add to put --content-file / --description-file ahead of the stdin variant and add a per-line note pointing at #4182. * Updates and renames the affected tests (TestBuildCommentReplyInstructionsCodexLinux, TestBuildCommentReplyInstructionsNonCodexLinux, TestInjectRuntimeConfigLinuxCommentFormattingEmphasizesFile, TestInjectRuntimeConfigIssueMetadataCodexFormattingUnchanged) so the new file-first contract is pinned and the old HEREDOC mandate is in the banned-strings lists. This converges Linux/macOS with the long-standing Windows file-only path, so the cross-platform guidance is now one shape. It also strictly improves on the previous MUL-2904 guardrail by eliminating shell exposure of the body entirely (no body ever reaches the shell, so backtick / $() / $VAR substitution cannot corrupt it). Closes GitHub multica-ai/multica#4182. No CLI or backend changes — --content-file / --description-file already exist. Co-authored-by: multica-agent <github@multica.ai> * docs(prompt): correct stale BuildPrompt comment to file-first (#4182) --------- Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: multica-agent <github@multica.ai> Co-authored-by: CC-Girl <cc-girl@multica.ai>	2026-06-16 17:14:25 +08:00
YOMXXX	34d4cd3a28	feat(openclaw): support connecting to existing OpenClaw gateway (#3260 ) [MUL-3158] (#3664 ) * feat(openclaw): support connecting to existing OpenClaw gateway (#3260) When the daemon host is a lightweight dev machine or CI coordinator, the heavy agent work (LLM inference, code execution, tool use) often belongs on a more powerful remote server already running an OpenClaw gateway. Multica historically hard-coded `openclaw agent --local`, forcing every turn to execute in-process on the daemon host. This change adds an opt-in gateway routing mode controlled per-agent via `runtime_config`: { "mode": "gateway", "gateway": { "host": "...", "port": 18789, "token": "...", "tls": false } } - Backend: ExecOptions gains OpenclawMode + OpenclawGateway; buildOpenclawArgs drops `--local` when mode == "gateway". Per-task openclaw-config.json wrapper pins gateway.{host,port,auth.{mode,token},tls} so users do not need to edit the daemon host's `~/.openclaw/openclaw.json` to point at a different endpoint. - Daemon: AgentData carries the raw runtime_config; decoding is fail-soft (malformed JSON falls back to local mode rather than blocking dispatch). - API: gateway.token is masked to "**" on every GET; PATCH replays the sentinel back, and the update handler restores the persisted token so the round-trip never destroys the secret. Defense-in-depth masking on WS broadcasts, plus String/MarshalJSON masking on the in-memory struct to block stray `%+v` / json.Marshal leaks. - UI: openclaw-only "Routing" tab on the agent detail page with mode selector + structured endpoint form. Token uses a "saved — submit a new value to rotate" UX and matching backend preserve hook. Empty `runtime_config` keeps the historical embedded behaviour, so existing agents are unaffected. 
fix(openclaw): address #3664 review — drop dead gateway field, gate pin on mode Per Bohan-J's review: - Remove the dead ExecOptions.OpenclawGateway field (+ its String/MarshalJSON and the daemon.go construction block). It carried the plaintext bearer token but was never read — buildOpenclawArgs only consumes OpenclawMode and the live gateway path runs through execenv.OpenclawGatewayPin — so this narrows the secret's footprint. - Gate the gateway pin on mode=="gateway" in decodeOpenclawRuntimeConfig: a {"mode":"local","gateway":{...,"token"}} payload no longer writes the token into the 0o600 per-task wrapper that --local makes openclaw ignore. - Warn on an unrecognized non-empty mode (e.g. "gatway") instead of silently falling back to local. - Run preserveMaskedGatewayToken in CreateAgent too, so a literal "***" at create time can't persist as a real bearer token. - Document the gateway host:port trust boundary (SSRF note for shared daemon hosts). Adds regression tests for the local-mode pin drop and the unknown-mode warning.	2026-06-13 15:33:28 +08:00
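The decoding rules described in this entry (fail-soft on malformed JSON, gateway pin only when mode is "gateway", a warning on an unrecognized mode) can be sketched as follows. This is an illustration under assumptions, not the repository's `decodeOpenclawRuntimeConfig`: the type names, the `decodeRouting` function, and the choice to return the warning as a string are all hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// gatewayConfig mirrors the JSON shape quoted in the commit message.
type gatewayConfig struct {
	Host  string `json:"host"`
	Port  int    `json:"port"`
	Token string `json:"token"`
	TLS   bool   `json:"tls"`
}

// runtimeConfig is a hypothetical decode target for runtime_config.
type runtimeConfig struct {
	Mode    string         `json:"mode"`
	Gateway *gatewayConfig `json:"gateway"`
}

// decodeRouting returns the effective mode, the gateway pin (nil unless the
// mode is "gateway"), and a warning string that is empty when nothing is wrong.
func decodeRouting(raw []byte) (mode string, pin *gatewayConfig, warning string) {
	if len(raw) == 0 {
		// Empty runtime_config keeps the historical local behaviour.
		return "local", nil, ""
	}
	var cfg runtimeConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		// Fail-soft: fall back to local instead of blocking dispatch.
		return "local", nil, "malformed runtime_config: " + err.Error()
	}
	switch cfg.Mode {
	case "gateway":
		return "gateway", cfg.Gateway, ""
	case "", "local":
		// Local mode never carries the gateway block (and its token) forward.
		return "local", nil, ""
	default:
		return "local", nil, fmt.Sprintf("unrecognized mode %q, falling back to local", cfg.Mode)
	}
}

func main() {
	mode, pin, warn := decodeRouting([]byte(`{"mode":"gatway","gateway":{"host":"h","port":18789}}`))
	fmt.Println(mode, pin == nil, warn)
}
```

Returning `nil` for the pin in local mode is the point of the review fix: a token supplied alongside `"mode":"local"` never reaches the per-task wrapper.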
Bohan Jiang	f415099c4a	MUL-3263: support managed MCP config for Cursor (#4081) * feat: support managed MCP config for Cursor Co-authored-by: multica-agent <github@multica.ai> * fix: address Cursor MCP review feedback Co-authored-by: multica-agent <github@multica.ai> * docs: include Cursor in skills MCP support Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-13 02:07:00 +08:00
Liu Guanzhong	4594c776e1	feat(agent): add CodeBuddy as first-class CLI backend (#3186 ) * feat(agent): add codebuddyBackend struct and buildCodebuddyArgs Introduces the codebuddy agent backend skeleton with args builder that mirrors claudeBackend's protocol flags (stream-json, bypass permissions, blocked args filtering) for the codebuddy CLI fork. * feat(agent): implement codebuddyBackend.Execute with stream-json parsing * feat(agent): wire codebuddy into New() factory and launchHeaders * feat(agent): add codebuddy dynamic model discovery from --help * feat(agent): add codebuddy thinking/effort discovery and providerThinkingEnums * feat(daemon): add codebuddy CLI probe, env vars, and args support * fix(agent): use len(models)==0 for default model instead of loop index * fix(agent): increase codebuddy --help timeout to 35s for slow CLI startup * fix(agent): address codebuddy PR review feedback - Wire codebuddy into execenv: reuse claude's CLAUDE.md, .claude/skills, and ~/.claude/skills paths since CodeBuddy is a Claude Code fork - Replace hardcoded 20-min timeout with runContext for zero-timeout = no-deadline semantics matching all other backends - Restore runContext regression tests lost in rebase merge - Mirror claude.go execution model: concurrent stdin write to prevent pipe deadlock, sync.Once for stdin closure, keep stdin open for control_request auto-approval mid-run - Add control_request handling with auto-approve behavior - Add RequestID/Request fields to codebuddySDKMessage - Add codebuddy to metrics knownRuntimeProviders - Add codebuddy to provider-logo.tsx (reuses ClaudeLogo) - Consolidate --help discovery: shared codebuddyHelpOutput cache eliminates duplicate cold-start invocations --------- Co-authored-by: krislliu <krislliu@tencent.com>	2026-06-12 15:22:16 +08:00
Bohan Jiang	24b162cdbc	feat(daemon): surface the real task initiator to the agent runtime (MUL-2645) (#3899 ) * feat(daemon): surface the real task initiator to the agent runtime (MUL-2645) In a multi-person workspace the agent runtime only ever saw the runtime OWNER identity: the brief's `## Requesting User` is sourced from runtime.OwnerID and the task-scoped token is owner-bound, so every requester (whoever commented, @mentioned, or chatted) appeared to the agent as the owner. Agents that route by initiator for permission, privacy, or audit all misjudged. Resolve the real task initiator at claim time and surface it distinctly from the owner: - comment / mention trigger -> triggering comment's author (member or agent) - chat task -> chat session creator (sessions are creator-only) - on-assign / autopilot / quick-create -> no attributable initiator (omitted) Adds initiator_{type,id,name,email} to the claim response, the daemon Task, and TaskContextForEnv, rendered into the brief as a new `## Task Initiator` section. The section documents the privacy boundary: the agent's credentials stay owner-scoped, so this is an attested identity for the agent's own routing/privacy logic, not act-as. No DB migration — both paths are derivable from existing rows. Tests: brief rendering (member/agent/omit/sanitize) + email guard unit tests, and claim-handler tests for the comment and chat paths. Co-authored-by: multica-agent <github@multica.ai> * fix(chat): store real sender as task initiator, not chat_session creator (MUL-2645) Review fix (Niko, PR #3899). v1 resolved the chat task initiator from chat_session.creator_id at claim time. That is correct for web chat and Lark p2p (creator == sender), but WRONG for Lark group chats: the group session creator is deliberately the installer (stable identity across member churn), not the message sender. 
So in a Lark group, every member who triggered the agent showed up in the brief as the installer/owner — the exact bug this issue is about, still live at that entry point. Capture the real sender at enqueue time instead of deriving it from the session creator at claim time: - migration 117: agent_task_queue.initiator_user_id (FK user, ON DELETE SET NULL); NULL for non-chat and pre-migration rows. - EnqueueChatTask now takes an explicit initiatorUserID. Web chat passes the authenticated request user; the Lark dispatcher threads the inbound sender (binding.MulticaUserID) through scheduleRun -> flushChatRun. The debouncer keeps the latest scheduled flush per session, so in a multi- sender silence window the LATEST sender wins (documented + tested). - claim handler resolves the initiator from task.initiator_user_id and drops the creator_id fallback entirely. The Lark group session creator stays the installer (unchanged) — only the task initiator is corrected, keeping the two concepts cleanly separate. Tests: dispatcher group regression (initiator = sender, not installer), latest-sender-wins, p2p initiator assertion; the chat claim handler test now sets creator != initiator and asserts the stored sender wins. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-08 19:29:57 +08:00
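The initiator rules from this entry can be summarized in a short Go sketch. Everything here is hypothetical naming for illustration (the `task` struct, the `Kind` labels, `resolveInitiator`); the only parts taken from the commit message are the three outcomes: comment and mention triggers use the triggering comment's author, chat tasks use the stored sender, and other triggers have no attributable initiator.

```go
package main

import "fmt"

// task is a hypothetical, trimmed-down view of a claimed task.
type task struct {
	Kind            string // assumed labels: "comment", "mention", "chat", "assign", "autopilot", "quick-create"
	CommentAuthorID string // author of the triggering comment, if any
	InitiatorUserID string // sender stored at enqueue time for chat tasks; "" when NULL
}

// resolveInitiator returns the initiator ID and whether one is attributable.
func resolveInitiator(t task) (id string, ok bool) {
	switch t.Kind {
	case "comment", "mention":
		return t.CommentAuthorID, t.CommentAuthorID != ""
	case "chat":
		// No fallback to the chat session creator: in a Lark group the
		// creator is the installer, not the person who sent the message.
		return t.InitiatorUserID, t.InitiatorUserID != ""
	default:
		return "", false
	}
}

func main() {
	id, ok := resolveInitiator(task{Kind: "chat", InitiatorUserID: "user-42"})
	fmt.Println(id, ok)
}
```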
liujianqiang-niu	5be7d1bc17	MUL-3136 fix(openclaw): parse config path from last non-empty line of CLI output Fix OpenClaw config discovery when `openclaw config file` prints Doctor warning UI before the actual config path. The daemon now uses the last non-empty stdout line as the path while preserving the existing tilde expansion, absolute-path validation, stat checks, and fail-closed behavior. Tests: go test ./internal/daemon/execenv	2026-06-08 17:22:02 +08:00
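A minimal sketch of the "last non-empty stdout line" rule from this fix, assuming a hypothetical helper name `lastNonEmptyLine`. The tilde expansion, absolute-path validation, and stat checks mentioned in the message are left out.

```go
package main

import (
	"fmt"
	"strings"
)

// lastNonEmptyLine returns the last line of out that contains something other
// than whitespace, or "" when there is none. Hypothetical helper name.
func lastNonEmptyLine(out string) string {
	lines := strings.Split(out, "\n")
	for i := len(lines) - 1; i >= 0; i-- {
		if s := strings.TrimSpace(lines[i]); s != "" {
			return s
		}
	}
	return ""
}

func main() {
	// Sample output invented for illustration: warning text, then the path.
	out := "Doctor warning: something needs attention\n\n~/.openclaw/openclaw.json\n\n"
	fmt.Println(lastNonEmptyLine(out))
}
```

An empty result would be treated as a failure by the caller, which keeps the fail-closed behaviour the message describes.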
HMYDK	4190de3d64	fix(skills): quote description values in built-in SKILL.md YAML frontmatter (#3852) Built-in SKILL.md description values contained unquoted ': ' sequences, which strict YAML parsers (e.g. Codex) reject — silently dropping the skill at load. - Quote all eight built-in skill descriptions. - ensureSkillFrontmatter() re-synthesizes frontmatter that has a name but fails YAML validation, so malformed imports are repaired instead of dropped. - Unify frontmatter delimiter parsing into a single frontmatterParts helper. - Add strict-YAML regression tests over the built-in skills, plus unit tests for the recovery branch and delimiter variants. Closes #3851.	2026-06-08 13:10:24 +08:00
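To make the quoting fix concrete, here is a small Go sketch that wraps a single-line description in a YAML double-quoted scalar. The helper `quoteDescription` is hypothetical and is not `ensureSkillFrontmatter`; it assumes the description has no line breaks and only escapes backslashes and double quotes.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteDescription wraps a single-line value in YAML double quotes so that a
// ": " sequence inside it is no longer read as a mapping separator.
// Hypothetical helper; assumes s contains no newlines.
func quoteDescription(s string) string {
	escaper := strings.NewReplacer(`\`, `\\`, `"`, `\"`)
	return `"` + escaper.Replace(s) + `"`
}

func main() {
	// Description text invented for illustration.
	desc := "Use when: the agent needs to build a mention"
	fmt.Println("description: " + quoteDescription(desc))
}
```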
Multica Eve	63b847ee48	Honor agent identity in assignment workflow (#3802) Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: multica-agent <github@multica.ai>	2026-06-05 13:45:43 +08:00
Naiyuan Qing	b9334dd59f	fix: anchor comment triggers to thread roots (#3746) Co-authored-by: multica-agent <github@multica.ai>	2026-06-04 13:47:05 +08:00
Bohan Jiang	d6a556bdbf	fix(execenv): refresh skills in place on reuse instead of accumulating duplicate dirs (#3716) Re-dispatching the same agent on the same issue reuses the persistent workdir via execenv.Reuse(), where the standard-provider skill refresh re-wrote skills without clearing the prior dispatch's output, so allocateCollisionFreeSkillDir dodged Multica's own directories into issue-review-multica-N. On reuse, reclaim the platform-owned managed skill directories the prior manifest recorded (removeReusedManagedSkillDirs) and roll back the remaining sidecar files (CleanupSidecars) before refreshing, so each skill lands at its canonical slug every dispatch. Mirrors the Codex hydrateCodexSkills wipe; scoped to reuse, which never runs for local_directory tasks. Fixes #3684 (MUL-2963).	2026-06-03 19:30:42 +08:00
Naiyuan Qing	1544e3b68a	feat(skills): built-in agent skills (WIP) MUL-2759 (#3456 ) * feat(skills): introduce built-in agent skills (WIP) Inject platform-authored, version-bundled skills into every agent on top of its workspace-bound skills, so agents learn how to operate Multica correctly without users needing to know the internals or agents needing to read source. Mechanism: skills are embedded into the server binary and appended to the agent payload at task-claim time (handler/daemon.go), reusing the existing SkillData wire + daemon-side writeSkillFiles. The daemon needs no changes, and because it travels over an existing wire field, older daemons pick the skills up the moment the server ships. First skill: multica-mentioning — how to build a working @mention (look up the UUID, match type to id source, know what each mention type triggers). WIP: injection mechanism + first skill only; more skills to follow in dependency order (skill -> agent -> squad). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(skills): make multica-mentioning the standard template + add eval Add the contract-skill frontmatter the other built-in skills will copy: user-invocable:false (it triggers from context, not as a slash command) and allowed-tools fencing it to the multica CLI it teaches. These keys survive to agent machines untouched (ensureSkillFrontmatter only ever adds a missing name). Add a Go eval in builtin_skills_test.go (a _test.go so it never ships to agent machines via the skill-files walk): - Enforces the template invariants on every built-in skill, present and future: multica- prefix, name+description present, description within 1024 chars, body within the 500-line L2 budget, no eval file leaking into the shipped payload. - Couples the mentioning skill's documented contract to the real util.ParseMentions: its Incorrect examples must parse to nothing (a name where a UUID belongs fails silently) and its Correct example must fire. 
A drift in the mention regex now breaks CI instead of silently turning the skill into a lie agents act on. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * feat(skills): add working-on-issues built-in skill Co-authored-by: multica-agent <github@multica.ai> * feat(skills): verify linked PRs in issue workflow skill Co-authored-by: multica-agent <github@multica.ai> * feat(skills): add skill import and discovery built-ins Co-authored-by: multica-agent <github@multica.ai> * feat(skills): add skill authoring built-in Co-authored-by: multica-agent <github@multica.ai> * docs(skills): align builtin skill workflows Co-authored-by: multica-agent <github@multica.ai> * docs(skills): use structured skill search Co-authored-by: multica-agent <github@multica.ai> * fix(skills): make built-in skill bundle launch-ready Co-authored-by: multica-agent <github@multica.ai> * fix(skills): align built-ins with additive skill binding Co-authored-by: multica-agent <github@multica.ai> * feat(skills): add creating agents built-in skill Co-authored-by: multica-agent <github@multica.ai> * Add built-in squads skill Co-authored-by: multica-agent <github@multica.ai> * refactor(skills): rewrite built-in skills as source-traced contracts Rewrite the built-in agent skills to the inbuilt-skill-authoring standard: state source-traced product facts with the source-code link logic as the core, not prescriptive how-to coaching. - creating-agents: drop the Decision-flow / Do-don't-consequences methodology; replace with field/behavior contracts (validation, persisted shape, daemon claim-time consumption, env gating, skill binding). - skill-discovery: stop teaching repo/github_stars as selection signals — searchClawHubSkills never populates them (always null); rank by install_count + source/url + description. Add file:line citations. 
- mentioning: drop the unbacked "member mention sends a notification" claim (no such path in the comment handler); state that only agent/squad mentions enqueue work. Tighten the parser-failure wording. - working-on-issues: refresh citations drifted by the main merge; describe the PR response `state` enum accurately; trim status coaching. - skill-importing: correct response type to SkillWithFilesResponse; document the reserved SKILL.md supporting-file rule; add line-accurate citations. - squads: correct the "leader cannot be archived" overstatement (not rejected at create/update; fails closed later at routing/dispatch); refresh source-map attributions and test list. Each skill now ships references/<skill>-source-map.md as its evidence layer (line-accurate citations live there, not pinned in the test, so a future main merge cannot rot them into stale lies). builtin_skills_test.go: replace coaching/line-number pins with drift-resistant contract anchors, forbid the coaching phrasing, and require every skill to ship its source-map. The ParseMentions behavior coupling is preserved. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * docs(skills): close field-role and citation gaps found in review Independent review of the rewritten built-in skills surfaced two real gaps and some citation drift; this fixes them. - creating-agents: add the three missing field rows (visibility, max_concurrent_tasks, mcp_config) to the field-contract table — mcp_config is runtime-consumed (TaskAgentData, daemon.go), visibility is access-control (default private), max_concurrent_tasks is a scheduler cap (default 6). Mark custom_args/runtime_config JSON validation as CLI-side (the server marshals as-is). Correct the CLI body-builder note (description/instructions use a non-empty check, the rest use Changed). 
Source-map: fix the env query name (UpdateAgentCustomEnv), the conformance test name, and add the new field defaults + the McpConfig runtime-payload line. - mentioning: the @squad mention private gate is canAccessPrivateAgent, not canEnqueueSquadLeader (that wrapper is the assignment/child-done path). - working-on-issues: cite notifyParentOfChildDone at its func def (:51), not the doc comment (:15). - skill-importing: config.origin is set only when the source supplied an origin — note it may be absent; cite createSkillWithFiles at its definition (skill_create.go:72), not the call site. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * Add built-in skills for autopilots runtimes and resources Co-authored-by: multica-agent <github@multica.ai> * feat(runtime): list skill descriptions in the brief Skills index The brief's `## Skills` section emitted bare skill names only, discarding the one-line description that SkillContextForEnv already carries. For Claude-family providers the frontmatter description is loaded natively; for providers without native skill discovery (hermes/default) the brief's list is the only signal they ever see, so a bare name gave them nothing to decide when to load a skill. Emit `name — description` when a description is present, falling back to the bare name when it is empty. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * refactor(skills): drop CLI-only rule from working-on-issues The "Platform data goes through the CLI" section duplicated the runtime brief's `## Important: Always Use the multica CLI` section verbatim (and the attachment-via-CLI note duplicated the brief's `## Attachments`). The CLI-only rule is universal and must be known before any skill loads, so the brief is its single source of truth; the skill copy was pure redundancy and a drift risk. Remove it and the matching intro clause. 
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * refactor(skills): remove discovery guidance from built-ins * docs(skills): remove stale skill-necessity records The per-skill necessity records had drifted to 3 of 8 shipped skills plus a record for `multica-skill-authoring`, which is not a shipped built-in skill. Per-skill "why it exists / when to use it" already lives co-located with each skill (frontmatter `description` + `references/<skill>-source-map.md`) where it cannot drift from the skill, and the doc's methodology duplicated the workspace's inbuilt-skill-authoring protocol. Remove the file rather than keep a parallel listing that every new skill has to remember to update. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * feat(runtime): add source-authority escape hatch to the brief The brief already tells agents to run `--help` for command discovery, but nothing stated the trust precedence when a skill, the brief, or a doc seems to contradict actual behavior. Add one line to the Available Commands escape-hatch note: trust the live CLI (`--help`/`--output json`) and the checked-out source over source-traced prose that can lag the code, and verify on any conflict or confusion. Kept in the always-on brief (universal, needed before any skill loads) rather than duplicated into each skill; per-skill source-map pointers remain the specific layer. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * fix(runtime): scope the source-authority escape hatch to the CLI The previous version told agents the "checked-out source is the deeper authority" for verifying behavior. That over-claims: the repos in a task's brief come from GetWorkspaceRepos + project github_repo resources (per-workspace config, see daemon.registerTaskRepos), not the Multica platform source. 
A generic agent's checked-out source is its own app, not Multica's code, so it cannot verify a Multica skill/brief claim against it. The only universally available authority for Multica behavior is the live CLI (`--help` / `--output json` / observed command behavior). Re-scope the line accordingly and state plainly that the platform's source is not in the workdir. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * revert(runtime): drop the source-authority escape-hatch line Reverts the brief addition from `fdd5e82df` and its follow-up `cc67b2088`. The `--help` discovery fallback already in the Available Commands note is enough; the extra trust-precedence sentence was unnecessary. runtime_config.go is now identical to `6ca27ad74`. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * docs(claude): remind to update built-in skills on CLI/field/behavior changes Add a Coding Rule: when a change touches a CLI command/flag, API field, or product behavior that a built-in skill documents, update that skill's SKILL.md and source-map in the same PR. Lives in the repo dev-guide (read when working in this repo), not the runtime brief — the runtime brief is injected into every workspace, where most agents have no Multica skill to update. AGENTS.md is a pointer to CLAUDE.md, so no mirror needed. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai>	2026-06-03 18:51:03 +08:00
Bohan Jiang	44feb3d06d	fix(skill): canonicalize reserved SKILL.md path check across daemon + API (#3660 ) A skill_file row whose path is the skill's own SKILL.md (persisted by older builds or direct create/update API calls) collides with the primary content the daemon writes itself, failing task prep with errPathPreExists on every non-codex local runtime (#3489). #3526 guarded this with strings.EqualFold(path, "SKILL.md") at the daemon write site and the three API ingress points, but the stored path is not canonicalized: "./SKILL.md" or "sub/../SKILL.md" slip past the exact-match guard while filepath.Join still resolves them onto the same SKILL.md, so prep can still break. Extract one canonical helper, skill.IsReservedContentPath, that cleans the path before the case-insensitive compare, and use it at all four sites (execenv writeSkillFiles, skill create, update, single-file upsert). Add a daemon-side regression test for writeSkillFiles ignoring a bundled SKILL.md (exact + "./" spellings) — the load-bearing fix previously had only API-layer coverage — plus a unit test for the helper. Existing poisoned rows are intentionally left in place (skipped at prep) per the decision on MUL-2928. MUL-2928 Follow-up to #3526; supersedes #3560. Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-02 18:23:57 +08:00
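The canonical check described here (clean the path, then compare case-insensitively) can be sketched in a few lines of Go. This is not the repository's `skill.IsReservedContentPath`; the lowercase name `isReservedContentPath` and the use of `path.Clean` on slash-separated paths are assumptions for the illustration.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// isReservedContentPath reports whether p resolves to the skill's own SKILL.md
// once "./" and "x/.." segments are cleaned away. Sketch only; assumes
// slash-separated relative paths.
func isReservedContentPath(p string) bool {
	return strings.EqualFold(path.Clean(p), "SKILL.md")
}

func main() {
	for _, p := range []string{"SKILL.md", "./SKILL.md", "sub/../skill.md", "docs/SKILL.md"} {
		fmt.Printf("%-18s %v\n", p, isReservedContentPath(p))
	}
}
```

An exact-match guard such as `strings.EqualFold(p, "SKILL.md")` would let `./SKILL.md` through, which is the gap the commit closes.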
MSandro	996eb07dc5	fix(daemon): skip duplicate SKILL.md in supporting files to prevent task prep failures (#3526) Fixes #3489 MUL-2928	2026-06-02 17:53:20 +08:00
Bohan Jiang	888186b183	fix(daemon): make comment-posting guardrail provider-agnostic (MUL-2904) (#3654 ) * fix(daemon): make comment-posting guardrail provider-agnostic (MUL-2904) Agents inlining a backtick-wrapped token into `multica issue comment add --content "..."` had the shell run it as a command substitution, silently deleting the token; the stored comment never matched the model's intent, so it retried forever — spamming OKK-497 with duplicate comments. The corruption is shell-driven, not provider-driven, so extend the "never inline --content; use --content-file / quoted-HEREDOC --content-stdin" rule from Codex-only to ALL providers: - BuildCommentReplyInstructions: collapse the Linux/macOS non-Codex inline branch into the unified quoted-HEREDOC stdin template. - buildMetaSkillContent: rename "Codex-Specific Comment Formatting" -> "Comment Formatting" and emit it for every provider; strengthen the Available Commands entry and the assignment step-6 examples to steer away from inline --content. - Windows behavior unchanged (file-only; avoids PowerShell ASCII drop). Tests: flip the non-Codex Linux reply test into a MUL-2904 regression, broaden the stdin-emphasis test across providers, and pin the provider-agnostic guardrail. Co-authored-by: multica-agent <github@multica.ai> * fix(daemon): keep Windows assignment brief file-only (address review) Review catch on #3654: the previous commit added platform-agnostic prose recommending "--content-file or --content-stdin" in the Available Commands entry and the assignment-triggered step-6 example. The assignment path has no BuildCommentReplyInstructions OS override, so on Windows an agent following step 6 literally would pipe its final comment through PowerShell and drop non-ASCII bytes (#2198 / #2236 / #2376) — contradicting this PR's own Windows file-only rule in the ## Comment Formatting section. 
Make the platform-agnostic surfaces defer to the OS-aware ## Comment Formatting section (the single source of truth) instead of naming stdin. The flag synopsis still lists all three modes. Add TestInjectRuntimeConfigWindowsAssignmentBriefStaysFileOnly: a Windows assignment-triggered brief must not contain any prescriptive "... or --content-stdin" recommendation. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-02 17:47:09 +08:00
Naiyuan Qing	3c8645e546	feat(cli): add squad member set-role (#3583) Co-authored-by: multica-agent <github@multica.ai>	2026-06-01 12:51:15 +08:00
Naiyuan Qing	4ae4722ef0	fix(comments): preserve direct parent on replies (#3579) Co-authored-by: multica-agent <github@multica.ai>	2026-06-01 08:28:15 +08:00
Naiyuan Qing	973a43923f	fix(comments): revert since-delta to issue-wide, steer to parent thread first (#3535 ) #3509/#3523 scoped the comment-trigger since-delta count to the triggering thread, so an agent resuming a busy issue only saw "+N in this thread" and lost visibility of new comments in other threads. Revert the count to issue-wide (every thread), keeping the trigger-comment + agent-own exclusions, and reshape the warm-path hint to: - report the issue-wide new-comment volume, - steer the agent to read the triggering (parent) thread FIRST (`--thread <trigger> --since`, or `--tail 30` for full context), - demote the issue-wide `--since` catch-up to an only-if-needed fallback ("don't read them all blindly"). Also fixes the now-stale "scoped to the triggering thread" wording in the resumed-session no-delta hint (it's issue-wide zero now). Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai>	2026-05-29 20:13:23 +08:00
Multica Eve	d1c7d478e1	MUL-2785: clarify thread-scoped comment delta (#3523) Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: multica-agent <github@multica.ai>	2026-05-29 17:16:58 +08:00
Multica Eve	9616d78e47	MUL-2785: optimize resumed comment reads (#3509) * feat(comments): skip default thread read on resumed comment sessions Co-authored-by: multica-agent <github@multica.ai> * fix(comments): scope since delta to trigger thread Co-authored-by: multica-agent <github@multica.ai> * chore(comments): address thread delta review nits Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: multica-agent <github@multica.ai>	2026-05-29 14:57:14 +08:00
Naiyuan Qing	3187bbf90c	feat(comments): re-add since-delta + cold-start thread read + parent-root write normalization (#3494 ) * feat(comments): since-delta new-comment hint + default-on comment session resume (#3432) * feat(db): add unresolved comment count + list filter queries Add CountUnresolvedComments (excludes the agent's own comments) and ListUnresolvedCommentsForIssue. Both are additive — existing callers stay on the unfiltered queries — so old clients are unaffected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(handler): support unresolved-only comment listing Wire an additive `unresolved` query param into ListComments. Defaults off so an old CLI that never sends it gets unchanged behavior; only true/1 enable it. Rejects combining unresolved with thread/recent (whole-issue filter vs navigation models). Includes filter + count query tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(handler): plumb unresolved count + thread root into claim, gate comment resume Populate trigger_parent_id (thread root of the trigger comment) and unresolved_count (excludes the agent's own comments) on comment-triggered claim responses. Both fields are omitempty so old daemons ignore them. Gate comment-triggered session resume behind MULTICA_RESUME_COMMENT_SESSION (default off): resumed comment turns can inherit the prior turn's "Done." final message, so this stays an explicit rollout switch. The runtime-match and poisoned-session guards still apply regardless of the flag. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(daemon): inject unresolved-comments hint + resolve step into agent brief Add a shared BuildUnresolvedCommentsHint helper rendered on both the per-turn prompt and the CLAUDE.md workflow (kept in sync per PR #2816). It ships only the count and the relevant CLI call — never comment bodies — so the server stays cheap. 
Thread case points at --thread <root>; issue case points at --unresolved. Suppressed when the count is 0. Also add a workflow step telling the agent to `multica comment resolve <thread-root>` once a thread is fully handled, so the unresolved set converges. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(cli): add comment list --unresolved and comment resolve command Add an --unresolved filter to `issue comment list` (wired to the server's unresolved param, rejected when combined with --thread/--recent) and a top-level `comment resolve <id>` command that POSTs to the existing /api/comments/{id}/resolve endpoint, letting an agent close threads it has fully handled. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(comments): since-delta new-comment hint + default-on comment resume Simplifies the comment-triggered agent flow down to what's actually needed: - New-comment awareness is now a pure time delta: the claim response carries new_comment_count + new_comments_since (anchored on the prior run's started_at, never completed_at so a long run can't miss comments). The per-turn prompt and CLAUDE.md workflow render one line — "N new comment(s) since your last run, --since <ts>" — via a shared BuildNewCommentsHint so the two surfaces can't drift. Cold start (no prior run) falls back to a plain read. - Comment-triggered tasks resume the prior session by default (same runtime), dropping the MULTICA_RESUME_COMMENT_SESSION rollout gate. The "Focus on THIS comment" prompt guard defends against inheriting the prior turn's "Done." marker; GetLastTaskSession still excludes poisoned sessions. - Drops the resolved-based machinery from the first draft: CountUnresolvedComments / ListUnresolvedCommentsForIssue queries, the `comment list --unresolved` flag, the `multica comment resolve` command, and the resolve workflow step. 
- Removes the verbose cursor-pagination paragraph from the comment prompt; the --thread/--recent/--since flags stay in the CLI/API, just no longer explained inline every turn. Compatibility: new claim fields are omitempty (old daemons ignore them). Comment resume is default-on and affects even old daemons, which already consume prior_session_id. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(comments): collapse reply parent_id to thread root on write Comment threads are a 2-level model (root + flat replies, like Linear/Slack), enforced today only by the UI and the agent path — the CreateComment handler stored whatever parent_id it was handed, and the agent-side flatten walked just one level, so a reply-to-a-reply could land at depth 3+. Add GetThreadRoot (a recursive walk to the parent_id=NULL root) and run both write paths (handler.CreateComment, service.createAgentComment) through it, so every stored reply's parent_id IS its thread root. Readers can now treat parent_id as the thread root without re-walking. The agent-drift guard still compares the raw parent_id to the trigger comment before normalization. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(comments): cold-start reads triggering thread, warm keeps --thread pointer The since-delta rework dropped the thread-first read on the COLD path: a first-time agent fell back to the flat `comment list` dump (oldest-first, cap 2000), burying the trigger's context in ancient chatter. Point cold start at the triggering conversation instead via a shared BuildColdCommentsHint (`--thread <trigger> --tail 30` + a --recent pointer for cross-thread background). On the WARM path, --since is a pure time delta and can miss the triggering thread's pre-anchor history, so BuildNewCommentsHint now also emits a --thread pointer. 
Both surfaces (per-turn prompt + CLAUDE.md workflow) render via the shared helpers so they cannot drift (PR #2816 rule). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-29 10:38:37 +08:00
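The thread-root collapse described in this commit can be illustrated with a minimal in-memory sketch. It assumes a hypothetical `comment` type and a map lookup in place of the real database query; the names `threadRoot` and `normalizeParent` are illustrative and are not the repository's `GetThreadRoot`, which the message describes as a recursive walk.

```go
package main

import (
	"errors"
	"fmt"
)

// comment is a hypothetical stand-in for a stored comment row.
// An empty ParentID models parent_id = NULL, i.e. a thread root.
type comment struct {
	ID       string
	ParentID string
}

var errBrokenChain = errors.New("parent chain is broken or cyclic")

// threadRoot follows parent links until it reaches a comment without a parent.
func threadRoot(byID map[string]comment, id string) (string, error) {
	seen := map[string]bool{}
	for {
		c, ok := byID[id]
		if !ok || seen[id] {
			return "", errBrokenChain
		}
		if c.ParentID == "" {
			return c.ID, nil
		}
		seen[id] = true
		id = c.ParentID
	}
}

// normalizeParent returns the parent_id to store for a new comment, so that
// every stored reply points directly at its thread root.
func normalizeParent(byID map[string]comment, requestedParent string) (string, error) {
	if requestedParent == "" {
		return "", nil // a new root comment stays a root
	}
	return threadRoot(byID, requestedParent)
}

func main() {
	byID := map[string]comment{
		"root":  {ID: "root"},
		"reply": {ID: "reply", ParentID: "root"},
		"deep":  {ID: "deep", ParentID: "reply"}, // a legacy depth-3 row
	}
	parent, err := normalizeParent(byID, "deep")
	fmt.Println(parent, err)
}
```

Normalizing on write is what lets readers treat `parent_id` as the thread root without walking the chain again.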
Bohan Jiang	fa076d38f2	MUL-2778 feat(agent): wire mcp_config through OpenClaw runtime (#3450 ) * MUL-2778 feat(agent): wire mcp_config through OpenClaw runtime The MCP config tab (#3419) lets admins save mcp_config on an agent, and recent work (#3439) plumbed it through the three ACP runtimes. OpenClaw still ignored the field, leaving the Tab silently inert for any OpenClaw-backed agent. Translate the agent's Claude-style `{"mcpServers": {...}}` into the per-task OpenClaw wrapper's `mcp.servers` block — OpenClaw resolves MCP via its own config schema rather than ExecOptions, so the existing OPENCLAW_CONFIG_PATH preparer is the right seam. Fail closed on malformed JSON / entries missing `command` or `url`, matching the fail-closed posture the preparer already uses for the agents.list step. Null / absent mcp_config leaves the wrapper free of an `mcp` key so the user's global mcp.servers flows through untouched; an explicit empty managed set (`{}` / `{"mcpServers":{}}`) is honoured as "admin saved no servers" mirroring `hasManagedCodexMcpConfig`. Strict-mode replacement (drop user-only servers entirely) would require OpenClaw to do a per-key replace rather than a deep merge at `mcp.servers`; the comment documents that caveat rather than relying on undocumented behaviour. Also adds `openclaw` to `MCP_SUPPORTED_PROVIDERS` so the MCP Tab actually surfaces in the agent overview pane, and pins the new visibility case with a renderPane test. Co-authored-by: multica-agent <github@multica.ai> * MUL-2778 fix(agent): make openclaw mcp_config strict-replace via sanitized snapshot Elon flagged on #3450 that the previous wiring let user-only mcp.servers leak through the wrapper's `$include` of the live user config: deep-merge at `mcp.servers` keeps user-only names, and the strict-empty case (`{ "mcpServers": {} }`) silently inherited user globals. 
Switch the strict-replace path to write a sanitized snapshot of the user's fully resolved config (via `openclaw config get --json`) with the `mcp` block stripped, then have the wrapper `$include` the snapshot instead of the live user file. With the user's `mcp` gone from the $include resolution, the wrapper's `mcp.servers` is the only definition the embedded OpenClaw sees — managed only, including the explicit empty set. The snapshot lives in envRoot at 0o600 alongside the wrapper so the GC reaper sweeps it with the rest of the task scratch, and no extra OPENCLAW_INCLUDE_ROOTS entry is needed (same-dir $include). Fail-closed on `config get --json` errors so the daemon never silently falls back to the leaky $include path. The inherit branch (null mcp_config) still uses the live user file directly — no extra CLI roundtrip and no snapshot is written. New tests pin the contract Elon's review required: - TestPrepareOpenclawConfigStrictReplacesUserMcpServers: user has global_one + shared, managed has shared + managed_only → wrapper has exactly {shared (managed value), managed_only}; global_one does NOT leak; snapshot file has the user's `mcp` stripped while preserving gateway / providers / API keys. - TestPrepareOpenclawConfigStrictEmptyManagedSetDropsUserMcp: empty managed set drops user's global_one (both `{}` and `{"mcpServers":{}}` cases). - TestPrepareOpenclawConfigNullMcpConfigKeepsUserInclude: null path inherits the live user config, writes no snapshot, makes no extra CLI call. - TestPrepareOpenclawConfigFailsClosedOnResolvedConfigError: errors during `config get --json` surface; no stale wrapper or snapshot. - TestPrepareOpenclawConfigManagedSetFreshInstall: fresh install with managed mcp_config skips the snapshot dance entirely. Also tightens en + zh-Hans MCP Tab copy to mention OpenClaw goes via the per-task wrapper, and to use OpenClaw's own `transport` field rather than Claude's `type` for HTTP/SSE entries. 
Co-authored-by: multica-agent <github@multica.ai> * MUL-2778 fix(agent): narrow openclaw snapshot strip to mcp.servers only Elon's third-round must-fix: the previous strict-replace snapshot deleted the entire `mcp` block, which wiped out non-server settings under `mcp` like `sessionIdleTtlMs`. Those are documented OpenClaw config keys (https://docs.openclaw.ai/gateway/configuration-reference#mcp) outside the MCP Tab's scope — the agent's saved mcp_config only manages server definitions, so other `mcp.*` tuning the user set must survive. Replace the blanket `delete(resolved, "mcp")` with a stripUserMcpServers helper that: - deletes only `mcp.servers` when `mcp` is an object - drops the parent `mcp` key only when the object is empty after the strip (so we don't emit `mcp: {}` placeholders) - leaves non-object `mcp` values untouched (we only know how to strip servers from the documented shape) Pinned with TestPrepareOpenclawConfigStrictPreservesNonServerMcpKeys: user resolved has both `mcp.sessionIdleTtlMs: 300000` and `mcp.servers.global_one`; after the strict path runs the snapshot keeps the TTL and drops the servers map, and the wrapper's `mcp.servers` is exactly the managed set with no leak. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-05-28 18:43:02 +08:00
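The three rules listed for stripUserMcpServers can be sketched as follows. This is an illustration written from the commit message, not the repository's code: it assumes the resolved config is decoded into a `map[string]any`, and the sample JSON in `main` is made up.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// stripUserMcpServers removes only mcp.servers from a resolved config.
// Sketch of the behaviour described in the commit message above.
func stripUserMcpServers(resolved map[string]any) {
	mcp, ok := resolved["mcp"].(map[string]any)
	if !ok {
		return // absent or non-object mcp: leave it untouched
	}
	delete(mcp, "servers")
	if len(mcp) == 0 {
		delete(resolved, "mcp") // avoid emitting an empty mcp placeholder
	}
}

func main() {
	// Hypothetical resolved user config with one non-server mcp setting.
	raw := `{"gateway":{"port":1},"mcp":{"sessionIdleTtlMs":300000,"servers":{"global_one":{"command":"x"}}}}`
	var resolved map[string]any
	if err := json.Unmarshal([]byte(raw), &resolved); err != nil {
		panic(err)
	}
	stripUserMcpServers(resolved)
	out, err := json.Marshal(resolved)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // the TTL survives, the servers map is gone
}
```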
Naiyuan Qing	d90732750f	Revert "feat(comments): since-delta new-comment hint + default-on comment ses…" (#3455) This reverts commit `5e78e5100a`.	2026-05-28 17:52:59 +08:00
Bohan Jiang	ee4ec3b76d	MUL-2784 fix(daemon): cleanup sidecar tree (.agent_context / .multica / provider skills) after local_directory tasks (#3444 ) * fix(daemon): cleanup .agent_context / .multica / provider skill sidecars after local_directory tasks (MUL-2784) PR #3438 (MUL-2753) only restored CLAUDE.md / AGENTS.md / GEMINI.md to their pre-task bytes; the sidecar tree writeContextFiles seeds (.agent_context/, .multica/, .claude/skills/, .github/skills/, .opencode/skills/, skills/, .pi/skills/, .cursor/skills/, .kimi/skills/, .kiro/skills/, .agents/skills/, fallback .agent_context/skills/) was explicitly deferred to this follow-up. In local_directory mode the agent's workdir is the user's repo, so each task accumulates one more layer of those directories in the user's tree. Plan A: track every file/dir Prepare creates inside workDir in a sidecarManifest written to envRoot/.multica_sidecar_manifest.json (daemon scratch — never in the user's workdir). On local_directory teardown CleanupSidecars walks the manifest, removes the recorded files, then rmdir-iterates the recorded directories in reverse. Pre-existing files and directories are deliberately NOT recorded, so a user-installed .claude/skills/my-own-skill/ sibling — or any unrelated file the user keeps under .claude/, .github/, etc. — is preserved bit-for-bit. Non-empty rmdir fails ENOTEMPTY and is silently skipped, which is the signal that the user owns the directory. Daemon wiring lives next to the existing CleanupRuntimeConfig defer in runTask: runtime brief first, sidecars second. Cloud-mode runs still write a manifest for symmetry but never trigger the cleanup (the GC loop wipes envRoot wholesale). 
Tests (sidecar_manifest_test.go) cover the round-trip invariant per the issue's acceptance criteria: - empty workdir → Prepare → Cleanup → empty workdir, byte-exact, for every file-based provider (claude, codex, copilot, opencode, openclaw, hermes, pi, cursor, kimi, kiro, antigravity, gemini), - user's .claude/skills/my-own-skill/ (and equivalents per provider) survives Cleanup intact, - unrelated user files under .claude/, .github/, etc. survive, - three repeated cycles do not accumulate any orphan state, - project_resources branch (.multica/project/resources.json) is also reversible, - recordWriteFile refuses to record pre-existing files, - recordMkdirAll refuses to record pre-existing dirs, - Cleanup is a no-op when the manifest file is missing. Co-authored-by: multica-agent <github@multica.ai> * fix(daemon): refuse to overwrite pre-existing sidecar paths; pick collision-free skill slugs (MUL-2784 review) Addresses PR #3444 review (Elon): Must-fix #1: recordWriteFile used to overwrite pre-existing target files unconditionally and only skip the manifest record. That destroys user bytes at write time AND leaves the corrupted contents in place at cleanup time — the byte-exact contract the issue requires is violated on both halves. Fixed by making recordWriteFile detect any pre-existing entry (regular file, symlink, directory) via Lstat and return a sentinel errPathPreExists without touching the path. The user's bytes are preserved verbatim. For per-skill collisions (user's .claude/skills/issue-review/ vs Multica's "Issue Review"), writeSkillFiles now allocates a collision-free sibling slug via allocateCollisionFreeSkillDir: first attempt is the natural slug, then `<base>-multica`, `<base>-multica-2`, …, bounded at 64 attempts. Provider-native discovery still picks the skill up (every subdir under skillsParent is a distinct skill) and the user's path stays bit-for-bit intact. 
For Multica-only namespace files (.agent_context/issue_context.md, .multica/project/resources.json), the writer swallows errPathPreExists and continues — the runtime brief already carries every fact those files would, so a collision degrades to brief-only mode rather than destroying user content. Must-fix #2: Added byte-exact collision matrix tests covering every file-based provider (claude / codex / copilot / opencode / openclaw / hermes / pi / cursor / kimi / kiro / antigravity / gemini): - TestPrepareThenCleanupSidecarsSameSlugCollisionPerProvider: seeds user's `<provider>/skills/issue-review/SKILL.md` plus a private notes.md sibling, runs Prepare → Inject → Cleanup, asserts workdir snapshot is byte-identical to seed. - TestPrepareThenCleanupSidecarsIssueContextCollisionPerProvider: seeds user's `.agent_context/issue_context.md`, asserts round-trip preserves it. - TestPrepareThenCleanupSidecarsProjectResourcesCollisionPerProvider: same for `.multica/project/resources.json`. - TestPrepareThenCleanupSidecarsMultiSkillCollisionFreeAllocation: end-to-end check that the Multica skill lands at the collision-free sibling and Cleanup removes only the Multica side. - TestAllocateCollisionFreeSkillDir: directed unit test pinning the slug-bumping sequence. - TestRecordWriteFileRefusesToOverwritePreExistingFile (was TestRecordWriteFileSkipsPreExistingFile): flipped to assert the user's bytes survive and errPathPreExists is returned. - TestRecordWriteFileRefusesToOverwriteSymlinkOrDir: covers the Lstat path for non-file entries. Should-fix: CleanupSidecars used to swallow ANY non-ENOENT rmdir error as "user content present," silently dropping real I/O failures (EACCES, EPERM, EBUSY). Now it re-reads the directory after a failed rmdir via the new dirHasEntries helper — non-empty → silently skip (ENOTEMPTY, the intended branch); empty → genuine error, captured into firstErr and surfaced. 
Plus directed tests: - TestCleanupSidecarsSurfacesRealRmdirErrors - TestDirHasEntries Local verification: - go test ./internal/daemon/execenv/... — all green - go test ./internal/daemon/... — all green - go vet ./... — clean Co-authored-by: multica-agent <github@multica.ai> * fix(daemon): surface original rmdir error when post-rmdir ReadDir also fails (MUL-2784 review) Addresses remaining PR #3444 review blocker (Elon): dirHasEntries used to return true when ReadDir failed with anything other than ENOENT, which made CleanupSidecars treat every locked / faulted directory as ENOTEMPTY and silently drop the original rmdir error. The v1 fix from the previous round closed the EACCES-on-empty-dir branch but missed the case where the chmod also blocks ReadDir — exactly the failure mode the review called out. Helper change: dirHasEntries now returns (hasEntries, ok bool): - (false, true) — dir exists and is empty (or missing, race-safe) - (true, true) — dir has user content (the ENOTEMPTY branch) - (_, false) — ReadDir failed (EACCES, ENOTDIR, EIO, …); the caller cannot tell ENOTEMPTY from a real error and MUST surface the original rmdir error CleanupSidecars switches on (ok, hasEntries): - !ok → surface the ORIGINAL rmdir error (not the ReadDir failure — that's diagnostic plumbing and would distract from the root cause) - ok && hasEntries → swallow silently (intended ENOTEMPTY branch; preserve user content) - ok && !hasEntries → surface the rmdir error (empty dir + EACCES / EPERM / EBUSY → genuine cleanup failure) Tests: - TestDirHasEntries: extended with a regular-file sub-case (ReadDir returns ENOTDIR) asserting (false, false). The v1 helper returned (true) here, hiding the bug. - TestCleanupSidecarsSwallowsMissingAndNonEmptyDirs: renamed from TestCleanupSidecarsSurfacesRealRmdirErrors. The old name claimed to test the surfacing path but never actually exercised it. 
- TestCleanupSidecarsSurfacesEACCESOnEmptyRecordedDir: chmod parent to 0o555 so rmdir(recorded) fails EACCES while ReadDir(recorded) still succeeds (empty). Asserts firstErr is non-nil and references both the recorded path and the rmdir branch. Skipped when running as root (chmod is bypassed for uid 0). - TestCleanupSidecarsSurfacesEACCESWhenReadDirFailsToo: the must-fix case — chmod parent 0o555 AND chmod recorded 0o000 so BOTH rmdir and ReadDir fail. The surfaced error must be the ORIGINAL rmdir failure, not the ReadDir one. Skipped on uid 0. Local verification: - go test ./internal/daemon/execenv/... — all green - go test ./internal/daemon/... — all green - go vet ./... — clean Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-05-28 17:22:47 +08:00
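The (hasEntries, ok) contract and the three-way decision in CleanupSidecars can be sketched roughly as below. The helper name `removeRecordedDir` is hypothetical, and the real cleanup collects a firstErr across many recorded paths rather than returning per directory; this only shows the decision for one directory.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// dirHasEntries reports whether dir contains anything. ok is false when the
// directory could not be read for a reason other than it being missing.
func dirHasEntries(dir string) (hasEntries, ok bool) {
	entries, err := os.ReadDir(dir)
	if errors.Is(err, fs.ErrNotExist) {
		return false, true // missing counts as empty (race-safe)
	}
	if err != nil {
		return false, false // EACCES, ENOTDIR, EIO, ...: cannot classify
	}
	return len(entries) > 0, true
}

// removeRecordedDir (hypothetical name) removes one recorded directory and
// decides whether a failed rmdir is a real error or just user content.
func removeRecordedDir(dir string) error {
	err := os.Remove(dir)
	if err == nil || errors.Is(err, fs.ErrNotExist) {
		return nil
	}
	hasEntries, ok := dirHasEntries(dir)
	if ok && hasEntries {
		return nil // non-empty: the user owns this directory, keep it
	}
	// Either the directory is empty or it could not be read; in both cases
	// surface the ORIGINAL rmdir error rather than the ReadDir one.
	return fmt.Errorf("rmdir %s: %w", dir, err)
}

func main() {
	root, err := os.MkdirTemp("", "sidecar-sketch")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(root)

	empty := filepath.Join(root, "empty")
	kept := filepath.Join(root, "kept")
	for _, d := range []string{empty, kept} {
		if err := os.Mkdir(d, 0o755); err != nil {
			panic(err)
		}
	}
	if err := os.WriteFile(filepath.Join(kept, "notes.md"), []byte("user"), 0o644); err != nil {
		panic(err)
	}
	fmt.Println(removeRecordedDir(empty), removeRecordedDir(kept))
}
```

Returning a second `ok` value keeps "could not tell" distinct from "has user content", which is the distinction the review asked for.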
Bohan Jiang	03f70209c4	fix(daemon): preserve user CLAUDE.md / AGENTS.md / GEMINI.md in local_directory runs (#3438 ) * fix(daemon): preserve user CLAUDE.md / AGENTS.md / GEMINI.md in local_directory runs (MUL-2753) InjectRuntimeConfig previously called os.WriteFile unconditionally, which truncated whatever file lived at the same path. For the local_directory project_resource flow the workdir is the user's own repo, so the agent silently destroyed any repo-level CLAUDE.md / AGENTS.md / GEMINI.md the first time it ran in that directory, and the daemon's local-directory cleanup explicitly skips the user's path so the file was never restored. Write the brief inside a marker block instead: <!-- BEGIN MULTICA-RUNTIME (auto-managed; do not edit) --> ...brief... <!-- END MULTICA-RUNTIME --> writeRuntimeConfigFile handles three states: - file missing -> create with just the marker block, - file present, no marker block -> append the marker block at the end (preserves user-authored content above), and - file present, marker block already there -> replace the block body in place so repeated runs don't grow the file unboundedly. This is the short-term fix called out on MUL-2753. The sidecar question (.agent_context/, .claude/skills/, .multica/project/resources.json) is left for a follow-up — those files don't overwrite user content, just litter the workdir. Co-authored-by: multica-agent <github@multica.ai> * fix(daemon): cleanup runtime config marker block after local_directory tasks (MUL-2753) Address Elon's review on PR #3438: 1. Add `CleanupRuntimeConfig` and wire it into the daemon's task path so `local_directory` runs excise the marker block on the way out. Without it, a user's subsequent manual `claude` / `codex` / `gemini` run in the same directory picks up the previous task's stale brief (issue id, trigger comment id, reply rules) and acts on the wrong context. Cloud workspace runs skip the cleanup — their scratch workdir is wiped by the GC loop anyway. 2. 
If excising the block would leave the file empty / whitespace-only, the file is removed so we don't leave behind a stub the user has to delete by hand. Surviving user content is preserved byte-for-byte. 3. Harden the marker parser: search for the end marker strictly after the begin marker. The previous `strings.Index` pair mishandled two malformed cases — - a stray end marker before any begin (e.g. user pasted a documentation snippet showing the wire format) would cause every run to stack another block, growing the file unboundedly; - a half-block left by a previous crashed run would cause every subsequent run to append a fresh block beneath the half-block. The `locateMarkerBlock` helper now anchors the end search past the begin offset, and treats "begin found, no end after" as "block runs to EOF" so the next write replaces it cleanly. Centralised the provider→filename mapping in `runtimeConfigPath` so Inject and Cleanup can't drift past each other when a new provider is added. Tests cover: parser hardening (stray-end-before-begin idempotency, half-block recovery), Cleanup happy path / file removal / no-op cases / malformed half-block / per-provider mapping, and an end-to-end inject→cleanup round trip that locks in byte-identical restoration of the user's pre-injection file. Co-authored-by: multica-agent <github@multica.ai> * fix(daemon): byte-exact inject/cleanup round trip for runtime config (MUL-2753) Address Elon's second-round review on PR #3438. The previous cleanup relied on `TrimRight + "\n"` for trailing newlines and `TrimSpace == ""` for file removal — both compensated for the inject path's "normalise trailing newlines so there's always exactly `\n\n` before the block" step, but they did so by mutating the user's bytes. 
The result was a real diff on three boundary cases: - file ended without a newline (`rules`) → cleanup added one; - file ended with two or more newlines (`rules\n\n`) → cleanup collapsed to a single newline; - file pre-existed but was empty / whitespace-only → cleanup deleted it. Reshape the contract so the bytes inject adds are the exact bytes cleanup removes, with no user-byte mutation in between: - Define `runtimeManagedSeparator = "\n\n"` as a fixed managed separator that inject always inserts (unconditionally — including for files that already end in two or more newlines) between pre-existing user content and the marker block. - Inject's missing-file branch still writes the block alone (no separator); that absence is the marker Cleanup uses to identify "we created this file from scratch" and is the only condition under which Cleanup is allowed to `os.Remove` the file. - Cleanup detects `HasSuffix(pre, runtimeManagedSeparator)` and strips exactly those bytes; whatever remains is written back verbatim with no `TrimRight` / `TrimSpace`, so the pre-injection bytes survive exactly. The replace-in-place branch is untouched — the managed separator established by the first inject lives in pre and survives across subsequent runs, so byte-exactness is preserved through arbitrary inject→inject→cleanup chains. Tests: - `TestInjectThenCleanupRoundTripByteExactBoundaries` parameterises 9 seed shapes (missing file, empty, whitespace-only, no trailing newline, one trailing newline, two trailing newlines, many trailing newlines, CRLF line endings, no final newline with embedded blank lines) and asserts byte-identical round trip across two full cycles. - `TestInjectReplaceThenCleanupRestoresByteExact` covers the replace-in-place branch for the same boundary seeds. - `TestWriteRuntimeConfigFileAlwaysInsertsFixedManagedSeparator` pins the new invariant at the source: regardless of seed shape, inject emits `<seed><\n\n><marker block>` with no normalisation. 
Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-05-28 16:15:07 +08:00
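The byte-exact inject/cleanup contract can be sketched on plain strings. The marker text and the `"\n\n"` separator come from the commit message; everything else is an assumption for illustration: the function names, the `existed` flag standing in for a file-exists check, and the exact layout of the block body (the real block may, for example, end with a trailing newline).

```go
package main

import (
	"fmt"
	"strings"
)

const (
	beginMarker      = "<!-- BEGIN MULTICA-RUNTIME (auto-managed; do not edit) -->"
	endMarker        = "<!-- END MULTICA-RUNTIME -->"
	managedSeparator = "\n\n" // fixed bytes inject adds and cleanup removes
)

// locateBlock returns the byte range [start, end) of the managed block.
// The end marker is searched strictly after the begin marker; a begin marker
// without a following end marker is treated as a block that runs to EOF.
func locateBlock(s string) (start, end int, found bool) {
	start = strings.Index(s, beginMarker)
	if start < 0 {
		return 0, 0, false
	}
	bodyStart := start + len(beginMarker)
	i := strings.Index(s[bodyStart:], endMarker)
	if i < 0 {
		return start, len(s), true
	}
	return start, bodyStart + i + len(endMarker), true
}

func block(brief string) string {
	return beginMarker + "\n" + brief + "\n" + endMarker
}

// inject returns the new file content. existed is false for a missing file.
func inject(content string, existed bool, brief string) string {
	if !existed {
		return block(brief) // no separator: marks "created from scratch"
	}
	if start, end, ok := locateBlock(content); ok {
		return content[:start] + block(brief) + content[end:]
	}
	return content + managedSeparator + block(brief)
}

// cleanup returns the restored content and whether the file should be removed.
func cleanup(content string) (out string, remove bool) {
	start, end, ok := locateBlock(content)
	if !ok {
		return content, false
	}
	pre, post := content[:start], content[end:]
	if strings.HasSuffix(pre, managedSeparator) {
		return strings.TrimSuffix(pre, managedSeparator) + post, false
	}
	rest := pre + post
	return rest, rest == ""
}

func main() {
	for _, seed := range []string{"", "rules", "rules\n", "rules\n\n", "a\r\nb\r\n"} {
		s := inject(inject(seed, true, "brief v1"), true, "brief v2")
		out, remove := cleanup(s)
		fmt.Printf("%q restored=%v remove=%v\n", seed, out == seed, remove)
	}
}
```

Because cleanup strips only the separator that inject added, no `TrimRight` or `TrimSpace` ever touches the user's own trailing bytes.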
Naiyuan Qing	5e78e5100a	feat(comments): since-delta new-comment hint + default-on comment session resume (#3432 ) * feat(db): add unresolved comment count + list filter queries Add CountUnresolvedComments (excludes the agent's own comments) and ListUnresolvedCommentsForIssue. Both are additive — existing callers stay on the unfiltered queries — so old clients are unaffected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(handler): support unresolved-only comment listing Wire an additive `unresolved` query param into ListComments. Defaults off so an old CLI that never sends it gets unchanged behavior; only true/1 enable it. Rejects combining unresolved with thread/recent (whole-issue filter vs navigation models). Includes filter + count query tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(handler): plumb unresolved count + thread root into claim, gate comment resume Populate trigger_parent_id (thread root of the trigger comment) and unresolved_count (excludes the agent's own comments) on comment-triggered claim responses. Both fields are omitempty so old daemons ignore them. Gate comment-triggered session resume behind MULTICA_RESUME_COMMENT_SESSION (default off): resumed comment turns can inherit the prior turn's "Done." final message, so this stays an explicit rollout switch. The runtime-match and poisoned-session guards still apply regardless of the flag. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(daemon): inject unresolved-comments hint + resolve step into agent brief Add a shared BuildUnresolvedCommentsHint helper rendered on both the per-turn prompt and the CLAUDE.md workflow (kept in sync per PR #2816). It ships only the count and the relevant CLI call — never comment bodies — so the server stays cheap. Thread case points at --thread <root>; issue case points at --unresolved. Suppressed when the count is 0. 
Also add a workflow step telling the agent to `multica comment resolve <thread-root>` once a thread is fully handled, so the unresolved set converges. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(cli): add comment list --unresolved and comment resolve command Add an --unresolved filter to `issue comment list` (wired to the server's unresolved param, rejected when combined with --thread/--recent) and a top-level `comment resolve <id>` command that POSTs to the existing /api/comments/{id}/resolve endpoint, letting an agent close threads it has fully handled. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(comments): since-delta new-comment hint + default-on comment resume Simplifies the comment-triggered agent flow down to what's actually needed: - New-comment awareness is now a pure time delta: the claim response carries new_comment_count + new_comments_since (anchored on the prior run's started_at, never completed_at so a long run can't miss comments). The per-turn prompt and CLAUDE.md workflow render one line — "N new comment(s) since your last run, --since <ts>" — via a shared BuildNewCommentsHint so the two surfaces can't drift. Cold start (no prior run) falls back to a plain read. - Comment-triggered tasks resume the prior session by default (same runtime), dropping the MULTICA_RESUME_COMMENT_SESSION rollout gate. The "Focus on THIS comment" prompt guard defends against inheriting the prior turn's "Done." marker; GetLastTaskSession still excludes poisoned sessions. - Drops the resolved-based machinery from the first draft: CountUnresolvedComments / ListUnresolvedCommentsForIssue queries, the `comment list --unresolved` flag, the `multica comment resolve` command, and the resolve workflow step. - Removes the verbose cursor-pagination paragraph from the comment prompt; the --thread/--recent/--since flags stay in the CLI/API, just no longer explained inline every turn. 
Compatibility: new claim fields are omitempty (old daemons ignore them). Comment resume is default-on and affects even old daemons, which already consume prior_session_id. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 15:58:42 +08:00
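A rough sketch of the since-delta hint follows. Only the JSON field names and the quoted hint wording come from the commit message; the `claim` struct, the Go field types, the lowercase function name, and the choice to return an empty string on cold start or a zero count are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// claim holds the two additive fields; omitempty keeps them out of the
// payload when unset, so older daemons never see them.
type claim struct {
	NewCommentCount  int    `json:"new_comment_count,omitempty"`
	NewCommentsSince string `json:"new_comments_since,omitempty"`
}

// buildNewCommentsHint renders the one-line hint shared by both surfaces.
// It returns "" when there is no prior run or nothing new (an assumption),
// leaving the caller to fall back to a plain read.
func buildNewCommentsHint(c claim) string {
	if c.NewCommentsSince == "" || c.NewCommentCount <= 0 {
		return ""
	}
	return fmt.Sprintf("%d new comment(s) since your last run, --since %s",
		c.NewCommentCount, c.NewCommentsSince)
}

func main() {
	cold, err := json.Marshal(claim{})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(cold)) // both fields omitted on cold start

	warm := claim{NewCommentCount: 2, NewCommentsSince: "2026-05-28T07:00:00Z"}
	fmt.Println(buildNewCommentsHint(warm))
}
```

Rendering through one function is what keeps the per-turn prompt and the CLAUDE.md workflow from drifting apart.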
Bohan Jiang	bae8a84abd	MUL-2767 feat(agent): add Antigravity runtime backend (#3427 ) * feat(agent): add Antigravity runtime backend Adds Google's Antigravity CLI (`agy`) as the 12th supported coding-tool runtime, alongside Claude / Codex / Cursor / Copilot / Gemini / Hermes / Kimi / Kiro / OpenCode / OpenClaw / Pi. The CLI emits plain assistant text on stdout (no structured event stream), so the backend streams stdout line-by-line as `MessageText` events and accumulates the same text as the final `Result.Output`. Session resumption uses `--conversation <id>`; because the conversation UUID is not echoed on stdout, the daemon routes `--log-file` to a temp file and recovers the id from the glog-formatted log lines. MUL-2767 Co-authored-by: multica-agent <github@multica.ai> * fix(agent): correct Antigravity capability contract from Elon review - ModelSelectionSupported now returns false for antigravity. `agy` has no --model flag and antigravityBackend deliberately drops opts.Model, so the UI must render a disabled "Managed by runtime" picker instead of an empty dropdown plus a silently-ignored manual-entry field. Also stop seeding AgentEntry.Model from MULTICA_ANTIGRAVITY_MODEL — the backend would silently ignore it. - Antigravity skills now write to {workDir}/.agents/skills/, the CLI's native workspace path (inherits Gemini CLI's layout per https://antigravity.google/docs/gcli-migration). Previously they went to the .agent_context/skills/ fallback that the CLI doesn't scan. Runtime brief moves antigravity into the native-discovery branch and local_skills.go points the user-level skill root at ~/.gemini/antigravity-cli/skills for Runtime → local skill import. 
- Doc + UI comment sync: providers matrix / install-agent-runtime / cloud-quickstart / agents-create / tasks (session-resume support) / skills / README all now list Antigravity in the right buckets, and the model-picker / model-dropdown comments cite antigravity (not the stale hermes reference) as the supported=false example. New tests: TestAntigravityModelSelectionUnsupported, TestInjectRuntimeConfigAntigravity (native discovery wording), TestWriteContextFilesAntigravityNativeSkills (.agents/skills/ landing, .agent_context/skills/ NOT written). Co-authored-by: multica-agent <github@multica.ai> * feat(provider-logo): swap inline placeholder for real Antigravity PNG Replaces the hand-drawn planet+arc placeholder with the official asset shipped from Downloads. Stored next to the component; bundlers (Next.js / electron-vite) resolve the PNG import to a URL string at build time. Added a small assets.d.ts so packages/views' tsc accepts PNG / SVG module imports — there was no prior asset usage in this package to register the declaration. --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-05-28 15:40:05 +08:00
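The plain-text streaming described for the Antigravity backend can be sketched with a line scanner. This is a hypothetical illustration, not the repository's backend: `emit` stands in for publishing a `MessageText` event, the returned string stands in for `Result.Output`, and `collect` exists only to make the sketch easy to run.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// streamLines reads r line by line, calls emit once per line, and returns
// the accumulated text. Each line is re-terminated with "\n", so the
// accumulated output always ends with a newline when any line was read.
func streamLines(r io.Reader, emit func(line string)) (string, error) {
	var out strings.Builder
	sc := bufio.NewScanner(r)
	// Raise the default 64 KiB token limit so long lines do not abort the scan.
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024)
	for sc.Scan() {
		line := sc.Text()
		emit(line)
		out.WriteString(line)
		out.WriteByte('\n')
	}
	return out.String(), sc.Err()
}

// collect runs streamLines over an in-memory string and records the events.
func collect(input string) (events []string, output string, err error) {
	output, err = streamLines(strings.NewReader(input), func(line string) {
		events = append(events, line)
	})
	return events, output, err
}

func main() {
	events, output, err := collect("Reading the issue...\nDone.\n")
	fmt.Println(len(events), err)
	fmt.Print(output)
}
```

In a real backend the reader would be the child process's stdout pipe rather than a string.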
Bohan Jiang	341ce7bfa5	feat: support local working directory for projects (MUL-2618 v1) (#3283 ) * feat(project): add local_directory project_resource type (MUL-2662) Adds a second project_resource type alongside github_repo so a project can be pinned to an existing directory on a specific daemon (the v1 of the local-working-directory flow tracked in MUL-2618). The ref schema is { local_path, daemon_id, label? }; local_path must be absolute and daemon_id is required. The same (daemon_id, local_path) pair is allowed on multiple projects by design — no UNIQUE constraint is added. Implementation reuses the existing project_resource API surface: the new type is wired through the validator switch with no migration, no new events, and no daemon-handler changes (daemon already passes through arbitrary resource types via ProjectResources). The CLI gains --local-path / --daemon-id / --ref-label shortcuts so `multica project resource add --type local_directory` mirrors the existing `--type github_repo --url ...` ergonomics; the generic --ref flag still works for both types. Tests cover the full CRUD lifecycle, the same-path-across-projects allowance, the same-path-same-project conflict, the validator rejections (missing/blank/relative path, missing daemon_id, wrong payload type), and the cross-platform isAbsoluteLocalPath helper. Co-authored-by: multica-agent <github@multica.ai> * feat(project): add update endpoint + label-shadow guard for project_resource (MUL-2662) Addresses the Elon review on PR #3263: - Add PUT /api/projects/{id}/resources/{resourceId} with sqlc query, matching handler, CLI `project resource update`, and a new EventProjectResourceUpdated WS event. resource_type stays immutable; ref/label/position are all individually optional. - Catch same-project (daemon_id, local_path) collisions where only the embedded label differs — the row-level UNIQUE only matches the full ref JSON, so a label typo would otherwise let the same working directory bind twice. 
- Tests cover the update lifecycle (label-only / ref / clear / 404 / invalid path) and the label-shadow conflict on both create and update; the in-place rename still succeeds because the conflict scan ignores the row being edited. Incidental: regenerating sqlc picked up a missing skills_local scan in UpdateAgentCustomEnv that drifted in from #3200. Co-authored-by: multica-agent <github@multica.ai> * fix(project): close bundled-create label-shadow gap + merge resource_ref on CLI update (MUL-2662) Two follow-ups from MUL-2662 review round 2: - CreateProject inline resources path now dedupes local_directory entries on (daemon_id, local_path) before opening the transaction. The DB-level UNIQUE(project_id, resource_type, resource_ref) constraint only fires on a full JSON match, so two rows with the same target but different `label` would otherwise slip past. Standalone POST/PUT already cover this via findLocalDirectoryConflict; bundled create was the missing surface. - `multica project resource update` now seeds resource_ref from the existing row before applying per-type shortcut flags, so `--default-branch-hint x` on its own no longer constructs a payload missing `url` (which the server 400s on). Local_directory partial edits get the same merge behavior. Co-authored-by: multica-agent <github@multica.ai> * feat(desktop): local_directory project_resource UI (MUL-2665) (#3273) * feat(desktop): local_directory project_resource UI (MUL-2665) First UI surface for the local-working-directory flow tracked in MUL-2618. Lets users on the desktop pin a project to an existing folder on this machine; web stays read-only since the per-daemon check can't be done in the browser. What's new for the renderer: - ProjectResourcesSection grows a desktop-only "Add local directory" button next to the existing GitHub-repo popover. 
Clicking it opens Electron's native folder picker, validates the path through a new IPC pair (existence + r/w), and submits a project_resource of resource_type=local_directory with daemon_id pulled live from daemonAPI.getStatus. - LocalDirectoryRow renders the rename pencil + path tooltip, and greys out when ref.daemon_id != this machine's daemon_id (with a "only available on the machine that registered this directory" tooltip). Delete stays enabled so users can drop stale registrations from any device. - LocalDirectoryHint sits above the issue-detail comment composer and shows "Agent will work in-place at {label} ({path})" when the issue's project has a local_directory matching this daemon. Hidden on web. - TaskStatusPill picks up a new "waiting_for_directory_release" stage that the daemon will publish when it dequeues a task but can't acquire the path lock. The render is in place now so the daemon sibling subtask can wire the status string without an additional UI PR. Plumbing: - @multica/core/types gains LocalDirectoryResourceRef + UpdateProjectResourceRequest, and the api client gets the matching PUT method backed by the server endpoint that landed in `2ac3faebb` (MUL-2662). A useUpdateProjectResource hook drives the in-place label edit. - New Electron handlers under apps/desktop/src/main/local-directory.ts: local-directory:pick -> dialog.showOpenDialog (openDirectory) local-directory:validate -> stat + access(R_OK + W_OK) exposed through the preload as desktopAPI.pickDirectory / validateLocalDirectory. View code talks to them via a thin packages/views/platform helper that returns reason=unsupported on web instead of crashing. - useLocalDaemonStatus exposes the local daemon's id, device name, and running flag from daemonAPI.onStatusChange so the renderer can do the cross-device match without coupling to the desktop preload typings. 
Tests: - pickStageKeys gets a unit test covering the new stage and proving the directory-release status outranks availability hints. - LocalDirectoryHint tests cover the four render branches (no project, no daemon, foreign daemon, matching daemon). - i18n parity stays green; new keys added under projects.resources.* and chat.status_pill.stages.waiting_for_directory_release in both locales. Out of scope (will land separately): - The daemon-side waiting/lock signal that flips the pill into the new state. - Adding local_directory to the create-project modal's bulk attach flow. - Docs page refresh for project-resources.mdx — left for the MUL-2618 umbrella sweep. Co-authored-by: multica-agent <github@multica.ai> * fix(desktop): hide rename for foreign daemon local_directory rows (MUL-2618) Address review nit on #3273: the rename pencil was gated only by `canEdit`, so a foreign / unknown-daemon row still showed it even though the spec says cross-device rows are disabled. Gate rename on `!mismatch` so it disappears on those rows; delete stays available so a stale registration can still be dropped from any device. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai> * feat(daemon): local_directory execution + path mutex + GC exception (MUL-2663) (#3274) * feat(daemon): local_directory execution + path mutex + GC exception (MUL-2663) Wires up the daemon side of the local_directory project_resource introduced in MUL-2662. When a task is dispatched against a project whose resources include a local_directory pinned to this daemon's UUID, the daemon now: - Validates the path (absolute, exists, daemon process can read+write, not in the system-root / $HOME blacklist) and fails the task fast on any precondition violation, with a user-readable reason. - Serialises concurrent tasks on the same on-disk path via a daemon-local LocalPathLocker keyed by symlink-resolved realpath. 
The lock is held for the entire task lifetime (claim → context write → agent → result report). - When the lock is contended, the daemon flips the row to a new waiting_local_directory status on the server (carrying a wait_reason like "<path> (held by task <short id>)") so the UI can render "等待本地目录释放" instead of leaving the row silently in dispatched past the sweeper timeout. The status accepts being woken into running once the lock is acquired. - Sets execenv.WorkDir to the user's path (no copy, no mount). envRoot still lives under workspacesRoot/<wsID>/ and hosts output/, logs/, and .gc_meta.json — the daemon's logbook for the run. - Stamps GCMeta.LocalDirectory=true so the GC loop never RemoveAlls envRoot for these tasks (gcActionClean → gcActionCleanArtifacts, gcActionOrphan → gcActionSkip). The user's directory was never under envRoot to begin with, so this is defense in depth. - Skips execenv.Reuse for local_directory tasks because the prior WorkDir is the user's path and reusing it through that code path loses the envRoot association the GC loop needs. Prepare is cheap here (no clone, no copy), so always running it is fine. Server-side protocol changes: - New CHECK value 'waiting_local_directory' on agent_task_queue.status plus a wait_reason TEXT column (migration 109). - All cancel / active / counted-as-running / orphan-recovery queries expanded to include the new status; FailStaleTasks intentionally excludes it (the daemon owns the wait). - New SQL MarkAgentTaskWaitingLocalDirectory(id, reason) and a relaxed StartAgentTask that accepts both dispatched and waiting_local_directory as preconditions (and clears wait_reason on the way through). - New POST /api/daemon/tasks/{taskId}/wait-local-directory endpoint, TaskService.MarkTaskWaitingLocalDirectory broadcaster, and matching daemon Client.MarkTaskWaitingLocalDirectory. 
Tests cover: path blacklist + R/W enforcement, mutex serialisation + ctx-cancelled wait, lock handover between two tasks, GC never returns gcActionClean / gcActionOrphan for local_directory rows (with negative control for the standard path), and Prepare/Cleanup correctly substitute + protect the user's WorkDir. The desktop UI side (UI for adding a local_directory resource, surfacing the "等待本地目录" badge) is MUL-2665; the agent-task lifecycle changes (no branch switch, dirty-tree tolerant, auto-commit) are MUL-2664. This PR targets the shared MUL-2618 v1 feature branch agent/j/912b8cb1, not main; the whole v1 will be merged to main together when complete. Co-authored-by: multica-agent <github@multica.ai> * fix(daemon): tighten local_directory status, symlink, cancel handling (MUL-2618) Address the 3 must-fix items from Elon's review of PR #3274. 1. Status string unified. The server / daemon publish `waiting_local_directory`; align views, locales, and the pickStageKeys test (PR #3273 had used `waiting_for_directory_release` on a placeholder string). Without this, the daemon's wait state never reached the pill once the two siblings merged. 2. validateLocalPath now also runs the blacklist against the symlink-resolved realpath, with macOS's `/etc` -> `/private/etc` redirect handled via `isBlacklistedRealPath` which compares canonical forms. Without this, a symlink such as `/Users/me/proj/home -> /Users/me` slipped the literal $HOME check while every daemon write still landed in the user's home. Tests cover symlink-to-home, symlink-to-system-root, and the negative case (symlink to a regular subdirectory). 3. acquireLocalDirectoryLockIfNeeded now spins up a cancellation watcher inside `onWait` (lazy — the fast path stays free) so the gap between dispatch and StartTask responds to server-side cancel or row deletion. 
If the watcher fires while the daemon is parked on the path mutex, the lock-wait context is cancelled, Acquire returns promptly, and the helper exits silently the same way the run-phase poller does. New TestAcquireLocalDirectoryLock_CancelDuringWait exercises the path end-to-end with a fake server. Co-authored-by: multica-agent <github@multica.ai> * fix(daemon): unconditional canonical blacklist + Windows drive-root generalisation (MUL-2618) - validateLocalPath now always runs isBlacklistedRealPath on the symlink-resolved path, not only when it differs from absPath. The old guard let users type the canonical form of an OS-symlinked banned root (e.g. /private/tmp, /private/etc, /private/var on macOS) straight through, since EvalSymlinks is a no-op on already-canonical input. - Windows drive-root rejection moved off the static C/D/E/F enumeration onto filepath.VolumeName via a new isDriveRoot helper, so removable / network drives mounted at G:..Z: and UNC \\server\share roots are also blocked. systemRootBlacklist keeps the well-known C:\ trees only. - Tests: macOS-only case exercises direct /private/{tmp,etc,var}; a new TestIsDriveRoot covers the Windows generalisation (skipped on POSIX runners by runtime guard). Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai> * feat(views): wire waiting_local_directory end-to-end in issue UI + presence (MUL-2618) Connect the daemon-emitted `task:waiting_local_directory` and `task:running` events through to issue execution log, sticky agent banner, activity indicator, and agent presence so a parked task is no longer invisible on the issue page. - Add `waiting_local_directory` to `AgentTask.status` and the typed `task:running` / `task:waiting_local_directory` WS event payloads. - Chat realtime sync writes both new statuses into the pending-task cache so the chat StatusPill flips out of a stale `dispatched` frame. 
- ExecutionLogSection: count `waiting_local_directory` as active, add tone + status label, treat parked tasks the same as dispatched for time anchor / transcript visibility / terminate-confirm note. - AgentLiveCard: subscribe to both new events, rank the parked state between dispatched and queued, and surface a "is waiting for the local directory" banner with the muted "Clock" treatment used for queued. - IssueAgentActivityIndicator: route parked tasks into the queued bucket so the hover stack and chip stay visible. - derive-presence: parked tasks count toward `queuedCount` so the agent workload chip stays out of `idle` while the daemon waits on the path lock. - Locales: add `agent_live.is_waiting_local_directory` and `execution_log.status_waiting_local_directory` (en + zh-Hans). Co-authored-by: multica-agent <github@multica.ai> * feat(project): enforce one local_directory per (project, daemon) (MUL-2618) The daemon-side resolver picks the first matching local_directory by daemon_id, so allowing two rows on the same daemon — even at different paths — let the agent silently write into whichever sorted first. Tighten the invariant top to bottom: - server: `findLocalDirectoryConflict` rejects any second row sharing a daemon_id, regardless of `local_path` or label. Bundled-create surface in `CreateProject` runs the same daemon-scoped dedupe up front. - daemon: `findLocalDirectoryAssignment` fails fast when it finds more than one row pinned to the current daemon (older API client / direct DB writes can still produce that state — refuse to guess). - desktop UI: hide the "Add local directory" action once the current daemon owns a row on this project, with a hint and a defensive toast on the call path; foreign-daemon rows stay visible read-only as before. - Tests: * daemon: new `two local_directory rows on this daemon fail fast` / `local_directory rows on different daemons coexist` cases. 
* handler: rewrite the legacy `LabelShadow` cases as `DaemonScopedConflict` / `BundledLocalDirectoryDaemonConflict` — asserts 409 on same-daemon different-path, 201 on per-daemon bundles. - Locales: en + zh-Hans copy for the new hint + toast. Co-authored-by: multica-agent <github@multica.ai> * chore(sqlc): drop stale skills_local in UpdateAgentCustomEnv (MUL-2618) Follow-up to the main-merge in `0f8e8ca7`: the auto-merge preserved most of main's skills_local revert but kept the column reference inside the UpdateAgentCustomEnv scanner because that block hadn't been touched by either side. Re-running `sqlc generate` regenerates the file without skills_local in this query, matching the rest of the file and the post-revert schema. Co-authored-by: multica-agent <github@multica.ai> * feat(create-project): binary source picker — repos OR local directory Turn the create-project dialog's "Repos" pill into a binary Source picker. A project's source is mutually exclusive: either a set of GitHub repos (worktree mode, default) or a single local working directory (local mode, desktop-only). Mirrors the constraint the backend will enforce next. Behavior: - Pill shows the active mode's selection (GitHub icon + repo count, or folder icon + local label/path). - Popover has a 2-tab segmented control at the top; the Local tab is hidden entirely on web (local_directory needs a daemon_id). - Local tab requires the daemon online — amber notice + disabled picker when offline, re-renders automatically via useLocalDaemonStatus. - Switching tabs preserves the other side's stash, but handleSubmit only emits the resource matching the active sourceMode, so abandoned picks never leak into the created project. Backend mutual-exclusion validation + the resources-section conditional-add-button still to come — this PR just unblocks the dialog so it can be demoed. 
* fix(mobile): cover waiting_local_directory in run row status maps (MUL-2618) --------- Co-authored-by: multica-agent <github@multica.ai> Co-authored-by: Multica J <j@multica.ai>	2026-05-27 13:44:31 +08:00
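Illustrative note on the path mutex described in the MUL-2663 message above: a minimal sketch of a per-path lock keyed by the symlink-resolved path, with a cancellable wait and an `onWait` hook for reporting a waiting status. Every name here (`pathLocker`, `canonicalKey`, `tryAcquire`, `acquire`) is hypothetical; the daemon's actual `LocalPathLocker` is not shown in this log and may differ.

```go
package main

import (
	"context"
	"path/filepath"
	"sync"
)

// pathLocker serialises work per canonical path. It is a hypothetical
// stand-in for the LocalPathLocker named in the commit message.
type pathLocker struct {
	mu   sync.Mutex
	held map[string]chan struct{} // channel is closed when the holder releases
}

func newPathLocker() *pathLocker {
	return &pathLocker{held: make(map[string]chan struct{})}
}

// canonicalKey resolves symlinks so two spellings of one directory share a lock.
func canonicalKey(path string) (string, error) {
	abs, err := filepath.Abs(path)
	if err != nil {
		return "", err
	}
	return filepath.EvalSymlinks(abs)
}

// tryAcquire takes the lock without waiting; ok is false when it is already held.
func (l *pathLocker) tryAcquire(key string) (release func(), ok bool) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if _, busy := l.held[key]; busy {
		return nil, false
	}
	done := make(chan struct{})
	l.held[key] = done
	var once sync.Once
	return func() {
		once.Do(func() { // release is safe to call more than once
			l.mu.Lock()
			delete(l.held, key)
			l.mu.Unlock()
			close(done)
		})
	}, true
}

// acquire blocks until the key is free or ctx is cancelled. onWait (optional)
// runs once before the first block, e.g. to publish a waiting status.
func (l *pathLocker) acquire(ctx context.Context, key string, onWait func()) (func(), error) {
	waited := false
	for {
		if release, ok := l.tryAcquire(key); ok {
			return release, nil
		}
		l.mu.Lock()
		done, busy := l.held[key]
		l.mu.Unlock()
		if !busy {
			continue // released between the two checks; retry immediately
		}
		if !waited && onWait != nil {
			waited = true
			onWait()
		}
		select {
		case <-done: // holder released; loop and try again
		case <-ctx.Done():
			return nil, ctx.Err()
		}
	}
}
```

In this sketch a caller would compute `canonicalKey(userPath)` once, call `acquire` with the task's context, and defer the returned release function for the lifetime of the task.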
Multica Eve	744b474199	revert(agent): remove per-agent local skill toggle (MUL-2603) (#3286) * Revert "feat(agents): hide skills_local toggle for runtimes that don't honour it (MUL-2603) (#3276)" This reverts commit `0b50c5a209`. Co-authored-by: multica-agent <github@multica.ai> * Revert "fix(agent): surface host OAuth token via env var on macOS isolation (MUL-2603) (#3267)" This reverts commit `a67bf81225`. Co-authored-by: multica-agent <github@multica.ai> * Revert "fix(agents): tighten skills-tab intro and drop redundant import hint (#3265)" This reverts commit `d8075a5775`. Co-authored-by: multica-agent <github@multica.ai> * Revert "fix(agent): mirror $HOME/.claude.json into isolated config dir (MUL-2661) (#3261)" This reverts commit `40da88fc16`. Co-authored-by: multica-agent <github@multica.ai> * Revert "feat(agent): per-agent toggle to isolate host-machine skills (MUL-2603) (#3200)" This reverts commit `960befa56f`. Co-authored-by: multica-agent <github@multica.ai> * Add migration cleanup for reverted agent skills toggle Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: multica-agent <github@multica.ai>	2026-05-26 17:00:01 +08:00
Bohan Jiang	6bc3d14eb3	fix(daemon/execenv): refresh stale Codex config copies across env reuse (MUL-2646) (#3268 ) * fix(daemon/execenv): refresh stale Codex config copies across env reuse (MUL-2646) `copyFileIfExists` previously short-circuited whenever the per-task `codex-home/{config.toml,config.json,instructions.md}` already existed, so once the files were seeded at first Prepare they were never refreshed again — even though `Reuse()` calls `prepareCodexHomeWithOpts` on every resume. A user who rotated their Codex `~/.codex/config.toml` between runs (e.g. switching the active `[model_providers.X]` `base_url`, or pointing `env_key` at a freshly rotated API key) kept reading the stale per-task copy on session resume. Codex then issued requests to the new URL using the old key and the API rejected the token. Treat any existing `dst` as something to drop and re-copy from the current shared source, mirroring the symlink path that already refreshes `auth.json` (#2126). The daemon-managed sandbox / multi-agent / memory blocks are applied via marker-bracketed idempotent passes after the copy, so a re-copy + re-ensure cycle preserves them. Co-authored-by: multica-agent <github@multica.ai> * fix(daemon/execenv): drop per-task Codex copy when shared source removed (MUL-2646) Extend the MUL-2646 fix to the deletion arm of "sync the shared source": `syncCopiedFile` (renamed from `copyFileIfExists`) now also removes the per-task `dst` when the shared `src` is absent. The prior version short-circuited on missing src and left `config.toml` / `config.json` / `instructions.md` from the previous Prepare lingering in the per-task home — so a user who removed a provider by deleting `~/.codex/config.toml`, or pulled `config.json` / `instructions.md` out of the shared home, would keep replaying the stale copy on session resume. 
For `config.toml` the subsequent `ensureCodex{Sandbox,MultiAgent,Memory}Config` passes recreate the file with only the daemon-managed default blocks, so removing the shared file cleanly drops every user-managed `[model_providers.X]` / `model_provider` line. For `config.json` and `instructions.md` there is no daemon default, so they disappear in lockstep with the shared source. Adds `TestPrepareCodexHome_DropsCopiedConfigWhenSharedSourceRemoved` covering the new path, and extends the refresh-arm test to assert the multi-agent / memory marker blocks are still present after the copy is refreshed. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-26 14:52:53 +08:00
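Illustrative note on the MUL-2646 sync semantics above: a minimal sketch of a copy helper with a refresh arm (an existing per-task copy is dropped and re-copied) and a deletion arm (the copy is removed when the shared source is gone). `syncCopiedFileSketch` and `demoSync` are hypothetical names and the file mode is an assumption; the daemon's real `syncCopiedFile` is not shown in this log.

```go
package main

import (
	"errors"
	"io/fs"
	"os"
	"path/filepath"
)

// removeIfExists deletes path and treats "already absent" as success.
func removeIfExists(path string) error {
	if err := os.Remove(path); err != nil && !errors.Is(err, fs.ErrNotExist) {
		return err
	}
	return nil
}

// syncCopiedFileSketch makes dst mirror src: refreshed when src exists,
// removed when src is absent.
func syncCopiedFileSketch(src, dst string) error {
	data, err := os.ReadFile(src)
	if errors.Is(err, fs.ErrNotExist) {
		return removeIfExists(dst) // deletion arm
	}
	if err != nil {
		return err
	}
	// Refresh arm: drop any stale copy first, then write the current bytes.
	if err := removeIfExists(dst); err != nil {
		return err
	}
	return os.WriteFile(dst, data, 0o600) // 0o600 is an assumed mode
}

// demoSync exercises both arms in a throwaway temp directory.
func demoSync() (string, bool, error) {
	dir, err := os.MkdirTemp("", "sync-sketch-")
	if err != nil {
		return "", false, err
	}
	defer os.RemoveAll(dir)
	src := filepath.Join(dir, "shared.toml")
	dst := filepath.Join(dir, "task.toml")
	if err := os.WriteFile(dst, []byte("stale"), 0o600); err != nil {
		return "", false, err
	}
	if err := os.WriteFile(src, []byte("fresh"), 0o600); err != nil {
		return "", false, err
	}
	if err := syncCopiedFileSketch(src, dst); err != nil {
		return "", false, err
	}
	got, err := os.ReadFile(dst)
	if err != nil {
		return "", false, err
	}
	// Remove the shared source and sync again: the copy should disappear.
	if err := os.Remove(src); err != nil {
		return "", false, err
	}
	if err := syncCopiedFileSketch(src, dst); err != nil {
		return "", false, err
	}
	_, statErr := os.Stat(dst)
	return string(got), errors.Is(statErr, fs.ErrNotExist), nil
}
```

Per the commit message, the daemon-managed config blocks are re-applied after the copy, so a helper like this only needs to keep the copied bytes in step with the shared source.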
Bohan Jiang	960befa56f	feat(agent): per-agent toggle to isolate host-machine skills (MUL-2603) (#3200 ) * feat(agent): per-agent toggle to isolate host-machine skills (MUL-2603) Adds an agent-scoped `skills_local` switch ("ignore" default / "merge") so shared agents stop inheriting the operator's user-global Claude skill directory. A single broken local skill on one operator's machine was crashing the Claude CLI before it ever read stdin — the daemon saw a "broken pipe" with no recoverable signal (GitHub #3052). - DB: migration 108 adds `agent.skills_local` (NOT NULL DEFAULT 'ignore'), with sqlc CreateAgent/UpdateAgent updates and handler validation. - Claude runtime: when the agent is in "ignore" mode the backend points CLAUDE_CONFIG_DIR at an empty per-task scratch dir under the task cwd (fallback: OS temp), strips any inherited override, and cleans up after the run. Workspace skills under `{cwd}/.claude/skills/` still load. "merge" preserves the legacy inherit-from-machine behavior; Codex and other isolated backends are no-ops. - UI: new Skills toggle in the Create Agent dialog and the Agent → Skills tab, with EN/zh-Hans copy and SkillsLocalToggle shared between the two. - Tests: unit coverage for the new env helper, isolation dir lifecycle, full Claude execute paths (ignore + merge), and the handler tristate contract. Existing skills-tab test updated for the new copy. - Docs: updated `/skills` docs (EN + ZH) and added a 0.3.7 changelog entry in the landing-page i18n. Co-authored-by: multica-agent <github@multica.ai> * fix(agent): preserve claude login + validate skills_local input (MUL-2603) Address Elon's review on PR #3200: 1. Skill isolation no longer drops the operator's Claude login. The per-task scratch dir now mirrors every entry under `~/.claude/` as symlinks except `skills/`, so `.credentials.json`, settings, plugins, etc. reach the CLI exactly as on the host while the user-global skills directory stays hidden. 
Without this, default `ignore` would have broken every Claude agent on a non-API-key host the moment migration 108 landed. 2. Internal CreateAgent callers (agent_template, onboarding_shim) now set `SkillsLocal: "ignore"`. The Go zero value was about to trip the migration-108 CHECK constraint and 500 template / onboarding agent creation. 3. Create / update handler validation no longer normalizes garbage to "ignore". The strict 400 path is now reachable on bad client input; the drift-safe `normalizeSkillsLocal` stays on the read side only. UI copy + docs clarified that the toggle is Claude-only; other runtimes ignore the setting. Verification: - `go test ./...` green (full suite locally). - `pnpm --filter @multica/views exec vitest run agents/components/tabs/skills-tab.test.tsx` green. - Handler DB-backed tests still skip locally without docker (same as Elon's run) — CI will validate the create / update paths against migration 108. Co-authored-by: multica-agent <github@multica.ai> * fix(agent): mirror effective claude config dir with windows fallback (MUL-2603) Address Elon's second-round review on PR #3200: 1. The per-task scratch dir now mirrors the effective host Claude config dir, not unconditionally `~/.claude/`. Precedence: agent `custom_env` CLAUDE_CONFIG_DIR > parent process env > `~/.claude/`. Without this, an operator who pinned Claude at a managed install (custom env CLAUDE_CONFIG_DIR) would get the wrong credentials in the scratch dir, because `buildClaudeEnv` strips that env before handing it to the child. We resolve the source up front and feed it to the mirror, so the override env still points at the right bytes. 2. Mirror entries now go through platform-aware linkers. On Windows without Developer Mode / admin, `os.Symlink` is denied, which previously left the scratch dir empty and broke Claude Code auth on default `ignore`. 
The new helpers try symlink first, then fall back to a directory junction (`mklink /J`) for dirs or a hardlink (same-volume content share) / copy for files. Mirrors the execenv/codex_home_link_windows.go pattern. 3. Tests: - `TestResolveHostClaudeConfigDir` locks in the custom_env > parent_env > `~/.claude` precedence. - `TestNewIsolatedClaudeConfigDirMirrorsCustomHostDir` confirms the scratch dir picks up `.credentials.json` from a synthetic custom host dir, proving the source resolution actually propagates into the mirror. - `TestNewIsolatedClaudeConfigDirEmptyHostIsNoop` documents the env-var-auth-only case (no host source ⇒ empty scratch dir). - `TestMirrorHostClaudeExceptSkillsWith_FallbackWhenSymlinkFails` exercises the Windows-no-Developer-Mode path via the new `mirrorHostClaudeExceptSkillsWith` seam, asserting credentials and sub-dir children still reach the scratch dir after the symlink stand-in fails. - `TestMirrorHostClaudeExceptSkillsWith_PropagatesFirstLinkError` confirms callers see the per-entry error when even fallback fails (so the warn-log fires on broken Windows installs). - `TestCopyFileRoundTrip` covers the last-resort copy fallback and its EXCL no-overwrite contract. - `TestClaudeExecuteIsolatesUsesCustomEnvSource` is the end-to-end check: an agent with custom_env CLAUDE_CONFIG_DIR reads its credentials from the pinned dir, not `~/.claude/`. 4. Docs: `apps/docs/content/docs/skills.{mdx,zh.mdx}` updated to describe the effective-source resolution and the Windows fallback chain so the docs match the runtime behaviour. Verification: - `go test ./...` green (full server suite locally, including `pkg/agent` 23 cases covering the new + existing isolation paths). - `GOOS=windows GOARCH=amd64 go vet ./pkg/agent/...` and `go test -c -o /dev/null` both compile clean, confirming the Windows-tagged linker file builds. 
Co-authored-by: multica-agent <github@multica.ai> * fix(agent): default skills_local to merge to preserve legacy behavior (MUL-2603) Per Bohan's product decision on PR #3200, the per-agent host-skill toggle defaults to "merge" — the pre-MUL-2603 inherit-from-machine behavior — so existing personal workflows that rely on locally installed Claude Skills keep working unchanged. Agent owners explicitly opt into "ignore" when they need to harden a shared agent against a broken local skill on one operator's machine (GitHub #3052). Also audited all 11 runtimes for user-global skill discovery paths and documented the scope of the toggle. Only Claude reads a user-global `~/.claude/skills/`; Codex isolates via `CODEX_HOME`, the ACP backends (Hermes / Kimi / Kiro) and the JSON-stream backends (Copilot / Cursor / Gemini / Pi / OpenCode / OpenClaw) anchor discovery to the task workdir and never read a user-global skill directory. UI copy and docs now say "for runtimes that support it (currently Claude Code)" everywhere so the scope is explicit. Changes: - Migration 108: column default flipped to 'merge'. - Handler CreateAgent: missing field → "merge"; explicit "ignore" / "merge" still validated, garbage still 400. - normalizeSkillsLocal: drift-safe coercion now lands on "merge" for anything that isn't the exact literal "ignore". - agent_template.go / onboarding_shim.go: internal CreateAgent callers send "merge" instead of "ignore" to match the new default. - Claude runtime (`claude.go`): isolate-mode gate flipped from `SkillsLocal != "merge"` to `SkillsLocal == "ignore"`, so "" (legacy daemons / older clients) and "merge" both walk `~/.claude/` directly. - Create Agent dialog + Skills tab: toggle defaults to on (merge); only duplicate of an explicit "ignore" agent carries through. The isolation opt-in is now `skills_local: "ignore"` when the user flips off; "merge" is omitted from the request body. - i18n (EN + zh-Hans): copy reframed — "On (default) — merged"; "Off — ignored. 
Recommended for shared agents". - Docs (`/skills`, `/guides/agents.zh`): describe new default and enumerate which runtimes act on the toggle. - Landing changelog 0.3.7: retitled "Per-Agent Local-Skill Toggle"; note the on-by-default behavior + off-to-isolate framing. - Tests: - `TestClaudeExecuteIsolatesHostSkillsWhenIgnoreOptedIn` replaces the old by-default isolation case (now requires explicit "ignore"). - New `TestClaudeExecuteDefaultModeKeepsHostConfigDir` locks in that default ExecOptions preserve the host CLAUDE_CONFIG_DIR. - `TestClaudeExecuteIsolatesUsesCustomEnvSource` now explicitly opts into "ignore" mode. - Handler tests: omitted → "merge"; explicit "ignore" round-trips; preserve-existing test seeds "ignore" and asserts "merge" flip-back. - `TestNormalizeSkillsLocal_DriftStaysSafe`: only literal "ignore" maps to ignore; everything else → "merge". - `skills-tab.test.tsx`: toggle ON by default; flip OFF when agent opted into "ignore". Intro-text matcher anchored to a more specific phrase so it no longer collides with the toggle hint copy. Verification: - `go test ./...` green (full server suite locally). - `GOOS=windows GOARCH=amd64 go vet ./pkg/agent/...` and `go test -c -o /dev/null` both compile clean (windows-tagged linker file still builds). - `pnpm typecheck` green across all packages and apps. - `pnpm --filter @multica/views test` 88 files / 771 tests green. - `pnpm --filter @multica/core test` 43 files / 390 tests green. - Handler DB-backed tests still skip locally without docker; CI will validate the create / update paths against migration 108. Co-authored-by: multica-agent <github@multica.ai> * chore(landing): drop 0.3.7 changelog entry from this PR (MUL-2603) The landing-page release notes belong in a separate release-prep PR, not in the feature PR. 
Co-authored-by: multica-agent <github@multica.ai> * fix(agent): propagate skills_local=ignore to codex user-skill seed (MUL-2603) Make the per-agent skills_local toggle real for Codex too, not just Claude. Previously the toggle was only consumed by the Claude backend, while the daemon's execenv layer always seeded Codex's per-task CODEX_HOME with the host machine's user-installed skills from ~/.codex/skills/. A shared Codex agent with skills_local=ignore could still inherit a broken local skill from one operator's machine. Now: PrepareParams/ReuseParams carry SkillsLocal; hydrateCodexSkills skips seedUserCodexSkills when SkillsLocal == "ignore" so the per-task CODEX_HOME exposes only workspace skills to the codex CLI. Default ("merge", or empty from older servers/clients) preserves existing inherit-from-machine behavior. UI / docs are updated to reflect the contract honestly: Claude Code and Codex honor the toggle; other runtimes (Hermes / Kimi / Kiro / Copilot / Cursor / Gemini / Pi / OpenCode / OpenClaw) leave $HOME untouched and discover user-level skills natively, so the toggle is a no-op for them today. New tests: TestPrepareCodexSkillsLocalIgnoreSkipsUserSeed, TestPrepareCodexSkillsLocalMergeSeedsUserSkills, and TestReuseCodexSkillsLocalIgnoreSkipsUserSeed cover Prepare(ignore), Prepare(merge), and the toggle-flip-on-reuse path. Co-authored-by: multica-agent <github@multica.ai> * docs(skills): scope skills_local toggle copy to Claude Code + Codex (MUL-2603) Off-state hint and Skills tab intro now explicitly call out Claude Code + Codex as the only runtimes that honor the toggle, with "other runtimes ignore this setting" wired into both states (en + zh-Hans), so users on non-Claude/Codex agents don't read "Off" as runtime-wide isolation. 
Docs (skills.mdx, skills.zh.mdx, guides/agents.zh.mdx) stop describing Hermes / Kimi / Gemini / Copilot / Cursor / Pi / OpenCode / OpenClaw / Kiro as having native user-level skill discovery; the daemon simply does not manage user-level skill discovery for those runtimes today, and the toggle is a no-op regardless of where it is set. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-26 13:26:33 +08:00
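Illustrative note on the `skills_local` contract in its final form in this commit (default "merge"; strict validation on write; drift-safe coercion on read): a minimal sketch. The function names carry a `Sketch` suffix because the real handler code is not shown in this log; only the behaviour described in the message is modelled.

```go
package main

import "fmt"

const (
	skillsLocalIgnore = "ignore"
	skillsLocalMerge  = "merge"
)

// normalizeSkillsLocalSketch is the read-side coercion: only the exact
// literal "ignore" maps to ignore; anything else lands on "merge".
func normalizeSkillsLocalSketch(stored string) string {
	if stored == skillsLocalIgnore {
		return skillsLocalIgnore
	}
	return skillsLocalMerge
}

// validateSkillsLocalSketch is the write-side check: a missing field
// defaults to "merge", the two known values pass through, and anything
// else is an error (the handler would answer 400).
func validateSkillsLocalSketch(in *string) (string, error) {
	if in == nil {
		return skillsLocalMerge, nil
	}
	switch *in {
	case skillsLocalIgnore, skillsLocalMerge:
		return *in, nil
	default:
		return "", fmt.Errorf("invalid skills_local %q", *in)
	}
}
```

Keeping the lenient coercion on the read path and the strict check on the write path means bad client input is rejected, while rows that drifted out of range still resolve to the legacy behaviour.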
Bohan Jiang	cd71b0fe05	fix(daemon): disable Codex native auto-memory in per-task config.toml (#3202) Codex CLI's auto-memory subsystem writes summaries to `$CODEX_HOME/memories/raw_memories.md` and `state_*.sqlite`, then reads them back on the next turn. The daemon never cleared these files across Reuse(), and Codex CLI may also pull from user-level `~/.codex/memories/`, entirely outside the per-task isolation. Either path leaks unrelated context into new Multica tasks: multica#3130 saw `D:\Project\MoHaYu\ WowChat` Raw Memories injected into a brand-new issue's first turn. Write a daemon-managed block into the per-task `config.toml` that sets `features.memories = false`, `memories.generate_memories = false`, and `memories.use_memories = false`. Codex then neither writes nor reads its memory subsystem regardless of where the residual files live. The user's global `~/.codex/config.toml` is never touched. Pattern mirrors `ensureCodexMultiAgentConfig`: idempotent managed-block upsert, two TOML layout variants (root dotted-key vs. inside a `[features]` / `[memories]` table) to satisfy strict toml-rs parsing, and a `MULTICA_CODEX_MEMORY` env-var escape hatch. MUL-2598 Co-authored-by: multica-agent <github@multica.ai>	2026-05-25 15:17:38 +08:00
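Illustrative note on the idempotent managed-block upsert mentioned above: a minimal sketch that replaces a marker-bracketed block in place or appends it when absent. The marker strings and the `upsertManagedBlock` name are hypothetical, and only the root dotted-key layout is shown; the table-based variant and the `MULTICA_CODEX_MEMORY` escape hatch are left out.

```go
package main

import "strings"

// Hypothetical markers; the daemon's real marker text is not shown in this log.
const (
	beginMarker = "# BEGIN managed-memory-block (hypothetical marker)"
	endMarker   = "# END managed-memory-block (hypothetical marker)"
)

// memoryBody holds the three settings named in the commit message,
// in the root dotted-key layout.
const memoryBody = "features.memories = false\n" +
	"memories.generate_memories = false\n" +
	"memories.use_memories = false"

// upsertManagedBlock replaces an existing marker-bracketed block or appends
// a new one. Applying it twice with the same body yields the same content.
func upsertManagedBlock(content, body string) string {
	block := beginMarker + "\n" + body + "\n" + endMarker + "\n"
	if start := strings.Index(content, beginMarker); start >= 0 {
		if rel := strings.Index(content[start:], endMarker); rel >= 0 {
			end := start + rel + len(endMarker)
			if end < len(content) && content[end] == '\n' {
				end++ // swallow the newline that followed the old block
			}
			return content[:start] + block + content[end:]
		}
	}
	if content != "" && !strings.HasSuffix(content, "\n") {
		content += "\n"
	}
	return content + block
}
```

Bracketing the block with markers lets the upsert rewrite only its own lines, which is what allows user-managed settings elsewhere in the file to survive repeated passes.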
LinYushen	8e9df90d32	feat: include repo description in agent brief (#3203) Add Description field to RepoData structs so that workspace repo descriptions (set via the settings UI) are preserved through normalization and rendered in the agent brief as `- <url> — <description>`. When no description is set, the existing format is unchanged. Closes MUL-2610 Co-authored-by: multica-agent <github@multica.ai>	2026-05-25 15:16:22 +08:00
Bohan Jiang	a55c03a0b3	fix(agent): inject Workspace Context into agent brief (MUL-2542) (#3078) * fix(agent): inject Workspace Context into agent brief (MUL-2542) The per-workspace `workspace.context` field (Settings → General) was stored in the DB but never reached the agent prompt. Plumb it from the workspace row through the claim response, the daemon's Task struct and TaskContextForEnv, and render it as `## Workspace Context` in the meta brief above `## Available Commands`. The heading is skipped when the field is empty so workspaces that haven't set a context don't see a bare header. Applies to every task kind (issue, comment, chat, autopilot, quick-create), so the shared system prompt is consistent regardless of trigger source. Co-authored-by: multica-agent <github@multica.ai> * chore(server): gofmt files touched by workspace-context injection Run gofmt on the files that the workspace-context injection touched. Cleans up composite-literal alignment in execenv task context and struct-tag alignment in Task / AgentTaskResponse / RegisterRequest. No behavior change. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai> Co-authored-by: J <agent-j@multica.ai>	2026-05-22 17:23:27 +08:00
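Illustrative note on the conditional heading described above: a minimal sketch that emits `## Workspace Context` ahead of `## Available Commands` only when a context is set. `writeWorkspaceContext` and `buildBriefSketch` are hypothetical names, and treating whitespace-only input as empty is an assumption; the real `buildMetaSkillContent` is far larger and not shown in this log.

```go
package main

import "strings"

// writeWorkspaceContext appends the section only when there is content,
// so an unset field never produces a bare heading.
func writeWorkspaceContext(b *strings.Builder, workspaceContext string) {
	text := strings.TrimSpace(workspaceContext) // assumption: blank counts as empty
	if text == "" {
		return
	}
	b.WriteString("## Workspace Context\n\n")
	b.WriteString(text)
	b.WriteString("\n\n")
}

// buildBriefSketch shows only the ordering: context first, commands after.
func buildBriefSketch(workspaceContext string) string {
	var b strings.Builder
	writeWorkspaceContext(&b, workspaceContext)
	b.WriteString("## Available Commands\n")
	return b.String()
}
```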
Bohan Jiang	c967ae0e0e	feat(issues): platform-owned parent notify on child done (MUL-2538) (#3055 ) * feat(issues): platform-owned parent notify on child done (MUL-2538) When a child issue transitions from a non-done status into `done` and has an open parent, the server now posts a top-level platform-generated comment on the parent itself. Replaces the agent-prompt rule shipped in PR #2918, which produced self-mention loops, planner ping-pong, and accidental `MUL-` prefix hardcoding because the agent did not always know the workspace prefix. - Migration 107 widens `comment.author_type` to allow `system`; the zero UUID is used as the sentinel `author_id` (the column stays NOT NULL, callers branch on `author_type === 'system'`). - `Handler.notifyParentOfChildDone` fires from both `UpdateIssue` and `BatchUpdateIssues`. Guards: prev status != done, new status == done, parent set, parent not in `done`/`cancelled`. Bypasses the CreateComment HTTP path so the assignee on_comment trigger and the mention-trigger paths do not fire — the comment content carries only the safe issue mention for the child, no `mention://agent/...` / `mention://member/...` / `mention://squad/...` links. - `runtime_config.go` downgrades the Parent/Sub-issue Protocol rule 1 to an explicit "do NOT post one yourself" guardrail; rule 2 (sub-issue creation `--status todo` vs `backlog`) is unchanged. - New handler test exercises the happy path, idempotency, reopen+done, parent done/cancelled guards, and the no-parent case. Runtime-config tests reassert the new wording and the banned strings from the prior revision. Co-authored-by: multica-agent <github@multica.ai> * fix(issues): isolate system comments + wire GH merge path (MUL-2538) Addresses the two must-fix items from the PR #3055 second review: 1. 
The platform-generated `comment:created` event (author_type='system') was running through the generic comment listeners, which (a) tried to subscribe the zero-UUID author and (b) parsed @mentions from the body for inbox notifications. Both subscriber_listeners and notification_listeners now early-return on author_type='system' so the event becomes a pure WS broadcast for the timeline — no inbox rows, no transcluded-mention attack surface. 2. advanceIssueToDone (the GitHub merge auto-done path) only published issue:updated and skipped notifyParentOfChildDone, so a child closed via merged PR — the dominant completion path — left the parent silent. The helper is now invoked on the same prev/updated pair, with the existing guards (transition + parent state) protecting double-fire. Tests: - New cmd/server/notification_listeners_test: TestNotification_SystemCommentSkipsInboxAndMentions (parent subscribers and smuggled @mention targets stay quiet), TestSubscriberSystemCommentDoesNotSubscribe (zero-UUID never reaches AddIssueSubscriber). - New internal/handler/github_test: TestWebhook_MergedPR_ChildWithParent_NotifiesParent fires a real pull_request closed-merged webhook against a child and asserts the parent receives exactly one safe system comment with the workspace's real identifier (no `mention://agent\|member\|squad` links). Co-authored-by: multica-agent <github@multica.ai> * fix(runtime): drop parent-notification guidance from agent brief (MUL-2538) Per Bohan's product call on PR #3055: the platform now owns the child-done parent notification, so the runtime brief should not mention the parent-comment path at all — not as an instruction, not as a "do not do it" guardrail. The previous revision kept rule 1 of the Parent / Sub-issue Protocol as a "Do NOT post your own parent-notification comment." sentence; that still puts the concept in front of the agent every run, which is exactly what we are trying to avoid. 
What changes: - Delete the "Parent / Sub-issue Protocol" preamble and rule 1 from buildMetaSkillContent. The remaining content — the `--status todo` vs `--status backlog` rule for creating sub-issues — now lives in a dedicated `## Sub-issue Creation` section, since the parent/child framing it previously sat under is gone. - The system comment on the parent stays exactly as in `366f6e2`: the agent simply does not need to know about it. Tests: - runtime_config_test.go is rewritten around the new section name and the wider "no parent-notification guidance" canary; the banned list now covers both the original PR #2918 wording and the intermediate "do NOT post one" wording. System comment UI: the frontend already renders `author_type === "system"` with author name "Multica" (`useActorName`) and the MulticaIcon avatar (`ActorAvatar` via `isSystem`), matching Bohan's "looks like a normal comment, author is multica + multica logo" requirement — no frontend changes needed. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-22 14:51:43 +08:00
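The guards that commit c967ae0e0e lists for `notifyParentOfChildDone` (prev status != done, new status == done, parent set, parent not in `done`/`cancelled`) can be read as a single predicate. The following is only a sketch of that reading: the `Issue` struct, its fields, and `shouldNotifyParent` are hypothetical names for illustration, not the repository's actual types or handler code.

```go
package main

import "fmt"

// Issue is a hypothetical, pared-down stand-in for the server's issue model.
type Issue struct {
	Status   string
	ParentID string // empty when the issue has no parent
}

// shouldNotifyParent applies the four guards named in the commit message:
// the child was not done before, is done now, has a parent, and that parent
// is neither done nor cancelled.
func shouldNotifyParent(prev, updated Issue, parent *Issue) bool {
	if prev.Status == "done" || updated.Status != "done" {
		return false // not a non-done -> done transition
	}
	if updated.ParentID == "" || parent == nil {
		return false // no parent to notify
	}
	return parent.Status != "done" && parent.Status != "cancelled"
}

func main() {
	parent := &Issue{Status: "todo"}
	prev := Issue{Status: "in_review", ParentID: "p1"}
	updated := Issue{Status: "done", ParentID: "p1"}

	fmt.Println(shouldNotifyParent(prev, updated, parent))    // transition into done
	fmt.Println(shouldNotifyParent(updated, updated, parent)) // already done before
}
```

Because the predicate depends only on the prev/updated pair and the parent state, the same check can sit behind every completion path the message mentions (`UpdateIssue`, `BatchUpdateIssues`, and the GitHub merge path) without firing twice for one transition.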
Bohan Jiang	ae530ef057	docs(runtime): tighten issue-metadata write bar (MUL-2507) (#3004 ) The previous wording invited agents to pin too much: any opened PR, external link, or "fact future agents will want one-glance access to" was framed as worth writing, with no explicit upper bound. In practice this caused metadata bags to accumulate single-run details and description-summary noise instead of the small set of repeatedly-read values the feature was designed for. Rework the agent runtime brief and the CLI docs to lead with the bar: write a key only when it is materially important AND likely to be re-read by future runs on the same issue. "Most runs write zero new keys" is now stated as the expected case, and the workflow exit step is rewritten to mirror the same gate. Recommended-key list, safety boundaries, and stale-key cleanup are preserved so the locked-in test anchors still pass. Co-authored-by: multica-agent <github@multica.ai>	2026-05-21 17:20:43 +08:00
Bohan Jiang	0c767c0052	feat(issues): per-issue metadata KV (MUL-2017) (#2845 ) * feat(issues): per-issue metadata KV (MUL-2017) Adds a small JSONB KV map to every issue for agent pipeline state (attempts, PR number, pipeline status, ...). Keys match a narrow regex, values are primitives (string / number / bool), capped at 50 keys per issue and 8KB per blob. Defense-in-depth via two CHECK constraints (object shape + size). All mutations are single-key atomic (jsonb_set / `- key`). `UpdateIssue` intentionally does NOT touch metadata: a whole-blob overwrite would race with concurrent agent writes. GET /api/issues/:id/metadata PUT /api/issues/:id/metadata/:key body: { "value": <primitive> } DELETE /api/issues/:id/metadata/:key Containment filter on list: GET /api/issues?metadata=<json-object> uses PG `@>` against a `jsonb_path_ops` GIN index. Mirrored across ListIssues, CountIssues, ListOpenIssues, and the hand-rolled ListGroupedIssues SQL so CLI/API and UI grouped views stay consistent. CLI: multica issue metadata {list,get,set,delete} multica issue list --metadata key=value (repeatable, AND) set has --type to override the default value-sniffing Co-authored-by: multica-agent <github@multica.ai> * fix(issues): metadata test bugs + wire realtime + read-only display (MUL-2017) - Fix two failing handler tests blocking backend CI: - reset decode target after delete so map merge does not mask removal - url.PathEscape the key segment so spaces no longer panic NewRequest - Wire issue_metadata:changed end to end so the detail / list / my-issues caches stay in sync with set/delete events (other tabs, CLI writes). - Add a read-only Metadata strip to the issue detail sidebar; hidden when the issue has no keys so it stays quiet in the common case. 
Co-authored-by: multica-agent <github@multica.ai> * feat(runtime): teach agents to read/write issue metadata (MUL-2017) Add an `## Issue Metadata` section to the runtime brief plus a `metadata list` step on entry and a `metadata set`/`delete` step on exit. Section only emits when the task carries an issue id (comment- or assignment-triggered); chat / quick-create / run-only autopilot stay clean so they don't fire failing CLI calls. Co-authored-by: multica-agent <github@multica.ai> * fix(issues): bump metadata migration to 105 and drop attempts as example (MUL-2017) main is now at 104_drop_runtime_timezone; the migrator picks LatestVersion() by sorted filename, so a slot before the tail would let DBs that have already run 099–104 think they're up-to-date while the issue.metadata column is missing — runtime would then fail with column does not exist. Renumbering to 105 puts the migration at the tail and forces it to run. Also drop attempts as a positive example across docs/code comments and test fixtures — the runtime instruction prompt already lists it under "What NOT to pin" (runtime bookkeeping). Replace with pr_number, which is in the recommended-keys set, so docs/tests speak the same language as the prompt. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-21 16:35:45 +08:00
Bohan Jiang	7f9e4e829d	feat(comments): thread-internal --tail pagination + reply cursor (MUL-2421) (#2846 ) * feat(comments): thread-internal pagination via --tail + reply cursor (MUL-2421) Long threads inside a single issue still forced agents to read every reply once they used --thread, even after MUL-2387 fixed cross-thread noise. This adds reply-level paging so a 200-reply thread can be navigated tail-first without dragging the whole conversation into prompt context. - New SQL query ListThreadCommentsForIssuePaged: same recursive root walk as the legacy thread query, but caps reply count and supports an (created_at, id) composite cursor. Root is unconditional — even tail=0 emits it so the reader keeps the "what is this thread about" context. - Handler ListComments: parses `tail` (non-negative, ThreadTailSet flag preserves the tail=0 intent), threads it through to the paged query, and re-uses X-Multica-Next-Before / X-Multica-Next-Before-Id for the reply cursor. Cursor's meaning is now context-dependent: thread cursor under --recent, reply cursor under --thread + --tail. - CLI: new --tail flag (only valid with --thread; mutually exclusive with --recent), reply-cursor semantics for --before / --before-id when paired with --thread + --tail, stderr label flips to "Next reply cursor" so an operator copy-pasting the cursor knows which scope it scrolls. - Tests cover the new contract: tail=N keeps newest N + root, tail=0 is root-only, anchor on a nested reply still walks up, reply cursor scrolls older replies page-by-page, since combined with tail filters after the cut, and the negative-flag-combination matrix. Out of scope: prompt template update to hint at `--thread <id> --tail 30` on long threads — separate follow-up per the issue. 
Co-authored-by: multica-agent <github@multica.ai> * fix(comments): only emit reply cursor when older reply exists (MUL-2421) The thread-tail path emitted `X-Multica-Next-Before` whenever the page filled to exactly the requested reply count, even when there was nothing older to scroll to. So `--thread <root> --tail 3` on a thread with exactly 3 replies sent a cursor that, when followed, returned just the root — a wasted round-trip that surfaced as a phantom "older replies" affordance in the agent prompt. Switch to a `reply_limit + 1` probe: ask the SQL for one extra row, trim the oldest overflow before responding, and only emit the cursor when an older reply actually existed. The exact-boundary case (replyCount == tail with no overflow) now returns no cursor. Also documents `--thread/--tail/--recent/--before` and the cursor semantics in CLI_AND_DAEMON.md, which was the second must-fix in the MUL-2421 review. Co-authored-by: multica-agent <github@multica.ai> * fix(comments): suppress reply cursor when --since covers older replies (MUL-2421) In the thread + tail + since path the server still emitted a reply cursor whenever there was an older reply on disk, regardless of `since`. If the oldest retained reply on the page was already `<= since`, every older reply was guaranteed to be filtered out too, so the next page only ever returned the root — wasting round-trips until the agent walked the whole pre-`since` history. Mirror the recent + since suppression: when `replies[0].CreatedAt <= since`, drop the cursor. Test covers the exact case from Elon's review: tail=2 overflow, body keeps a fresher reply, but the cursor target (oldest retained reply) is already past `since` — header must be empty. 
Co-authored-by: multica-agent <github@multica.ai> * feat(prompt): default comment-trigger reads to --thread --tail 30 (MUL-2421) Comment-triggered agents previously defaulted the trigger-thread read to the unbounded `--thread <id> --output json`, which dumps the full thread into the prompt — exactly the kind of context bloat MUL-2387 fixed at the cross-thread layer but never bounded inside a single thread. Use the new `--tail` flag landed earlier in this PR (server + CLI) as the default for both the per-turn prompt and the runtime-config Workflow: - `--thread <trigger-id> --tail 30 --output json` is the new default. Root is always included so "what is this about" context survives. - If 30 replies aren't enough, the prompt now spells out the reply cursor: re-feed the stderr `Next reply cursor: --before <ts> --before-id <reply-id>` pair back to walk older replies. - `--recent 20` stays as the cross-thread background fallback, with an explicit callout that the same `--before` / `--before-id` flags walk threads (not replies) in that mode. - Available Commands core line now surfaces `--tail N` and both stderr cursor labels so non-workflow callers also discover the flag. - `--since` callouts reflect the post-MUL-2421 combinable mode names (`--thread --tail` / `--recent`). Tests (`prompt_test.go`, `execenv_test.go`) pin the new defaults and add a regression guard against the unbounded `--thread` recipe sneaking back in. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-21 13:43:15 +08:00
Bohan Jiang	aeb284cbeb	feat(runtime): teach agents the parent/sub-issue protocol (MUL-2338) (#2918 ) * feat(runtime): teach agents the parent/sub-issue protocol (MUL-2338) Adds a Parent / Sub-issue Protocol section to the runtime brief built by `buildMetaSkillContent`, emitted whenever the agent is running on a real Multica issue (assignment- or comment-triggered). Two behaviors are now documented for every issue-bound agent: - A. When wrapping up a child issue, post the final result and switch to `in_review` on this issue first, then post a single top-level comment on the parent. Mention the parent assignee only when it is another agent on a still-open parent — never self-mention, never @ member / squad, never re-trigger a `done` / `cancelled` parent. - B. When creating sub-issues, choose `--status backlog` for sub-issues that must wait and `--status todo` for the one to start immediately; promote with `multica issue status <id> todo` when its turn comes. The signal is explicitly framed as best-effort — no server-side state sync, no claim of a guaranteed handshake. The section is skipped for chat, quick-create, and run-only autopilot runs, which have no parent/child semantics. Tests in runtime_config_test.go assert that the section is present in both issue workflows, absent in the three non-issue modes, and that the wording does not introduce a non-existent `multica issue list --parent` command or promise a reliable handshake. Co-authored-by: multica-agent <github@multica.ai> * fix(runtime): split Step A of parent/sub-issue protocol by trigger type (MUL-2338) Comment-triggered runs were inheriting an unconditional `multica issue status <this-issue-id> in_review` from Step A, which conflicts with the comment-triggered workflow rule "Do NOT change the issue status unless the comment explicitly asks for it" (Elon's blocking review on PR #2918). Step A now branches on trigger type: - Assignment-triggered: keep "post final results + flip in_review". 
- Comment-triggered: complete the reply per the existing workflow rule, only flip status when the triggering comment asked for it, and gate the parent-notification steps on actually closing out child work. Tests lock the boundary: comment-triggered briefs must not contain the unconditional in_review command, must echo the existing status guardrail inside Step A, and must spell out the "closing out" gate. Assignment-triggered briefs still carry the unconditional flip. Co-authored-by: multica-agent <github@multica.ai> * fix(runtime): simplify parent/sub-issue mention rule to always @ parent assignee (MUL-2338) Per Bohan's directive on PR #2918: the per-case mention table (same agent / member / squad / closed parent) is overkill prompt complexity. Replace it with a single rule: always @mention the parent's assignee using the URL that matches assignee_type. The platform's existing run dedup handles re-triggers, and a single rule is easier for agents to follow predictably. Preserves the existing comment-triggered boundary (Step A still does NOT add an unconditional in_review flip on comment-triggered runs). Co-authored-by: multica-agent <github@multica.ai> * refactor(runtime): compress parent/sub-issue protocol to 3-rule convention (MUL-2338) Drop the spec-flavored A/B sub-headings and per-case mention table; keep three numbered rules (close out child, notify parent, pick backlog vs todo) plus a one-line best-effort preamble. The comment-triggered branch still re-asserts the "do not change status unless asked" guardrail and gates parent notification on actually closing out child work; the assignment-triggered branch still flips to `in_review`. Section is now 7 lines instead of 29. A new TestParentSubIssueProtocolIsCompact guards the ≤10-line ceiling so this stays a convention, not a spec. 
Co-authored-by: multica-agent <github@multica.ai> * fix(runtime): make sub-issue creation rule unconditional in parent/sub-issue protocol (MUL-2338) Elon's review on PR #2918: the preamble previously gated all three rules on the current issue having `parent_issue_id`, but rule 3 (creating sub-issues) needs to reach top-level parents that have no parent themselves — that is exactly where the `todo` vs `backlog` decision matters most. Move the gate from the preamble onto rules 1 and 2 per-rule; rule 3 now applies to any issue-bound run. Section stays at 7 newlines (≤10). Co-authored-by: multica-agent <github@multica.ai> * refactor(runtime): unify parent/sub-issue protocol as mechanism description (MUL-2338) Drop the if/else split between assignment- and comment-triggered runs in the Parent / Sub-issue Protocol section: both runs now read the same two-rule description of how the parent/child mechanism works. The comment-triggered workflow rule "Do NOT change the issue status unless the comment explicitly asks for it" naturally short-circuits the parent notification (no status flip → not closing out the child → skip), so the protocol no longer needs to branch on TriggerCommentID. Tests collapse the two trigger-specific cases into one parameterized test, and the assignment vs comment status-flip invariants are now anchored on the real workflow command (with substituted issue id) instead of the protocol's removed `<this-issue-id>` placeholder. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-20 16:20:33 +08:00
Jiayuan Zhang	2ad1cd8ff8	feat(profile): user profile description injected into agent brief (MUL-2406) ## Summary Adds per-user `profile_description` so coding agents have cheap, durable context about who is asking. v1 per the brief Xeon locked in on [MUL-2406](mention://issue/63a7247c-4f6a-42cf-90d1-7c746e77158a): - DB — `user.profile_description TEXT NOT NULL DEFAULT ''` (migration 096). 2000-rune cap enforced server-side. No nullable / privacy state to manage. - API — `PATCH /api/me` accepts the field; `UserResponse` always emits it. Client wraps `updateMe` in a lenient `UserSchema` + `EMPTY_USER` fallback per CLAUDE.md API Response Compatibility. - UI — Settings → Account gains an "About you" textarea with live `n/2000` counter, `maxLength` guard, and a localized too-long error (EN + zh-Hans). - CLI — `multica user profile get` / `multica user profile update` with `--description / --description-stdin / --description-file / --clear`, mirroring the existing `issue comment add` input-mode menu. - Daemon injection — claim handler resolves the runtime owner and stamps `requesting_user_name` + `requesting_user_profile_description` on the task. `buildMetaSkillContent` emits `## Requesting User` between `## Agent Identity` and `## Available Commands`, blockquoted and framed as background context. The block is omitted entirely when the description is empty (no token cost when unused). Brief is written once per task via `CLAUDE.md` / `AGENTS.md`, not the per-turn prompt — same path the agent already reads for identity, so no extra per-turn cost. 
## Test plan - [x] `go build ./...`, `go vet ./...`, `go test ./internal/cli/ ./internal/daemon/ ./internal/daemon/execenv/ ./cmd/multica/` - [x] New brief tests: `TestBuildMetaSkillContentEmitsRequestingUser`, `TestBuildMetaSkillContentOmitsRequestingUserWhenEmpty` - [x] `pnpm typecheck`, `pnpm lint`, `pnpm test` (74 files, 644 tests pass) - [ ] Handler DB tests (`TestUpdateMe*`) require a migrated test DB — not runnable in this sandbox - [ ] Manual: open Settings → Account, set a description, confirm the next daemon-run agent's `CLAUDE.md` shows `## Requesting User`	2026-05-19 19:51:28 +02:00

159 Commits