multica

mirror of https://github.com/multica-ai/multica.git synced 2026-07-05 13:29:44 +02:00

Author	SHA1	Message	Date
Multica Eve	f59cb2f494	MUL-3834: harden daemon websocket reconnect (#4699) * MUL-3834 harden daemon websocket reconnect Co-authored-by: multica-agent <github@multica.ai> * MUL-3834 stabilize daemon websocket liveness tests Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: multica-agent <github@multica.ai>	2026-06-29 16:46:57 +08:00
Multica Eve	7d0c73d11f	MUL-3417: tolerate OpenClaw config file CLI mismatch Closes MUL-3417 Fixes #4299	2026-06-25 16:58:07 +08:00
Bohan Jiang	dfa384ffa2	fix(daemon): resolve skill bundles per-skill with size-scaled timeout (#4505 ) (#4530 ) * fix(daemon): resolve skill bundles per-skill with size-scaled timeout (MUL-3650, #4505) Cold-start skill resolution downloaded the agent's entire bundle in one atomic request bounded by the shared 30s control-plane http.Client timeout. On a slow/jittery link a large bundle (15+ skills) could not finish the body read in 30s, and because the cache was only written after the whole batch succeeded, nothing was persisted on failure — so every dispatch re-downloaded the full bundle and timed out again, never converging. Resolve each missing bundle in its own request and cache it the moment it arrives: - daemon: per-skill resolve with a deadline scaled to the bundle's declared size (floor 30s, cap 5m, ~50KB/s floor throughput) instead of the fixed control-plane timeout; each success is persisted independently, so a dispatch that fails on one skill still caches the rest and the next dispatch only re-fetches what is missing. - client: dedicated bundleClient with no fixed Timeout (deadline comes from ctx), a singular ResolveSkillBundle, and a short transient-retry schedule. Tests cover the size-scaled timeout and the cross-dispatch incremental caching / convergence (a failed skill does not discard its siblings, and cached skills are not re-fetched). Co-authored-by: multica-agent <github@multica.ai> * fix(daemon): accept server-side skill updates in per-skill resolve (MUL-3650) Address review on #4530: resolveSkillBundle validated the returned bundle against the claim-time ref, which pinned it to the requested hash. The resolve endpoint intentionally serves the agent's current bundle and hash when the requested hash is stale (the skill can be edited between claim and prepare), so a legitimate updated bundle was rejected as invalid and the task failed. 
Confirm only that the server returned the requested skill (source/id), then validate self-consistency against a ref derived from the returned bundle and cache it under its own hash — matching the documented endpoint contract. Adds a regression test covering a stale-hash request answered with an updated bundle. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-24 19:00:13 +08:00
J	34bd115808	test(execenv): fix stale test name reference in comment (#3028) Co-authored-by: multica-agent <github@multica.ai>	2026-06-24 18:06:24 +08:00
jockibeard	3adfaf4285	fix(execenv): support OpenClaw 2026.6.x agents schema (#3028) (#4319) Adapts OpenClaw execenv prep to the 2026.6.x agents schema (agents.list config path removed; agents live in a sqlite registry). Case-insensitive key-missing guard + registry fallback on read, version-aware emission on write so per-task workspace pinning keeps working. Closes #3028 MUL-3643	2026-06-24 18:05:38 +08:00
Bohan Jiang	9db80a0940	fix(daemon): forbid mid-run progress comments in runtime brief (#4516 ) A run could post running progress/plan narration as issue comments, and a review run surfaced its in-progress narration as the result instead of a conclusion (MUL-3605). Add one rule to the Output section's issue-task branch, in both the legacy and slim briefs: post exactly one comment per run — the final result, before the turn exits — and keep plans/progress in the agent's own reasoning. The pre-existing "Final results MUST be delivered … a task that finishes without a result comment is invisible" line already makes the comment mandatory, and "state the outcome, not the process" already rules out progress dumps, so no second rule is added. Chat / quick-create / autopilot keep their own delivery channels. Adds a regression test across both brief paths. Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-24 17:20:19 +08:00
beast	20eecfb093	fix(projects): honor repo resource checkout refs (MUL-3593) (#4470)	2026-06-24 16:25:17 +08:00
Multica Eve	1ac3a03e5d	MUL-3618: dispatch daemon feature flag snapshots (#4509) * MUL-3618: dispatch daemon feature flag snapshots Co-authored-by: multica-agent <github@multica.ai> * MUL-3618: narrow daemon flag snapshots to process scope Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: multica-agent <github@multica.ai>	2026-06-24 16:19:30 +08:00
Bohan Jiang	76c58a4ee8	MUL-3617: remove Gemini CLI runtime (#4503) * fix: remove gemini cli runtime Co-authored-by: multica-agent <github@multica.ai> * fix: skip unsupported custom runtime profiles Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-24 15:15:42 +08:00
Multica Eve	8ad673fdb7	MUL-3560: gate slim runtime brief behind `runtime_brief_slim` feature flag (#4449 ) The MUL-3560 slim runtime brief — kind-driven dispatcher, per-section gating, prose compression for ~7k chars saved on the typical comment-triggered task — now ships behind the `runtime_brief_slim` feature flag wired via the framework-level service from MUL-3615. Default: OFF in every environment (production stays on the legacy brief that has shipped for ~2 years). Staging opts in via the YAML rule set; ops can override per-process with `FF_RUNTIME_BRIEF_SLIM=true`. Production is held back until staging has burned in long enough that we are confident the slim brief does not regress agent behaviour. Architecture (one toggle point, two code paths, both fully tested): buildMetaSkillContent (runtime_config.go) │ └─ useSlimBrief() → false (default) │ → fall through to the legacy verbose body that ships on │ main today — byte-for-byte unchanged, no migration risk │ └─ useSlimBrief() → true → buildMetaSkillContentSlim (runtime_config_sections.go) → classifyTask → 5-way kind switch → per-section writers BuildCommentReplyInstructions takes the same gate, so the per-turn comment prompt and the runtime brief stay in sync on which template they emit. What's in this PR: - runtime_config_flag.go (new): package-scope `runtimeFlags` atomic pointer + `SetFeatureFlags` setter + `useSlimBrief` toggle point. Nil-safe: a daemon that forgets to wire the service falls back to legacy, no panic. - runtime_config_kind.go (new): `taskKind` enum + `classifyTask` + `hasIssueContext` predicate. Used only by the slim path. - runtime_config_sections.go (new): the slim brief itself — `buildMetaSkillContentSlim` + per-section `writeXxx` helpers + `writeAvailableCommandsQuickCreate` minimal variant + `writeBackgroundTaskSafetySlim` compressed safety section. 
The Section × Kind matrix is documented inline on `buildMetaSkillContentSlim` and the test below checks the dispatcher does not diverge from the spec. - reply_instructions.go: `BuildCommentReplyInstructions` gains a short slim-or-legacy prelude; new `buildCommentReplyInstructionsSlim` is the compressed cookbook (defers the shell-hazard rationale to `## Comment Formatting`). - runtime_config.go: `buildMetaSkillContent` gains a 2-line dispatcher at the top; the legacy body is otherwise untouched. - runtime_config_kind_test.go (new): canaries for both paths. - TestClassifyTask: 5 kinds + 3 tiebreak cases. - TestTaskKindHasIssueContext: predicate semantics. - TestSlimFlagOffUsesLegacy: nil flag service → legacy path (renders "Get full issue details.", a legacy-only substring). - TestSlimFlagOnUsesSlim: flag on → slim path (renders "full issue.", a slim-only one-liner) AND must NOT render legacy "Get full issue details.". - TestBuildMetaSkillContentSlimKindMatrix: locks the per-kind section set; heading match is line-anchored so inline references don't trip absence assertions. - TestSlimQuickCreateAvailableCommands: locks the minimal-variant content for quick-create (issue create present, every other Core command absent). - TestSlimBriefIsSubstantiallyShorter: ≥ 30% reduction guard so a future change can't accidentally re-bloat the slim path back to legacy levels. - cmd/server/main.go: now calls `execenv.SetFeatureFlags(flags)` immediately after constructing the feature flag service. Measured impact (slim vs legacy, claude provider, realistic fixture with 2 repos + 2 skills + member initiator): legacy = 19567 chars slim = 11868 chars Δ = -7699 (-39.3%) Verification: - go vet ./internal/daemon/... ./cmd/server/... ok - go test ./internal/daemon/... ok - go test ./pkg/featureflag/... ok - TestSlimBriefIsSubstantiallyShorter logs the 39.3% ratio - TestSlimFlagOffUsesLegacy + TestSlimFlagOnUsesSlim pass both directions, so the dispatcher is locked in code. 
The pre-existing `internal/handler` test failures (TestLeaveWorkspace_RevokesOwnRuntimes, TestDeleteMember_CancelsTasksFromAgentReassignment, TestDeleteMember_NoRuntimes_DeletesMember) reproduce on plain `origin/main` with the same `relation "channel_user_binding" does not exist` SQL error — they are a missing-migration bug from the recent channels foundation PR (`ce28d0aa0`), not anything this PR touched. Rollout plan: 1. Merge this PR. Production daemons keep emitting the legacy brief (flag default false). 2. Add a YAML rule to staging's `MULTICA_FEATURE_FLAGS_FILE`: runtime_brief_slim: default: true Staging daemons start emitting the slim brief on next restart. 3. Watch `agent prompt prepared` logs + agent behaviour for 7 days. 4. If staging is clean, flip the prod YAML to `default: true`. Legacy code path stays in the binary as a kill-switch (`FF_RUNTIME_BRIEF_SLIM=false` to revert without a deploy). 5. After ~30 days clean in prod, follow up with a PR that deletes the legacy body and the flag — same pattern as docs/feature-flags.md recommends ("plan the death of the flag at birth"). Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: multica-agent <github@multica.ai>	2026-06-24 14:23:17 +08:00
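The nil-safe toggle point described in 8ad673fdb7 (a package-scope atomic pointer, a `SetFeatureFlags` setter, and a `useSlimBrief` check that falls back to legacy when nothing is wired) might look roughly like the sketch below. The `FlagService` interface, its `Enabled` method, and the `staticFlags` type are assumptions made for illustration; the real feature flag service's API is not shown in the commit message.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// FlagService is a hypothetical stand-in for the feature flag service.
type FlagService interface {
	Enabled(name string) bool
}

// runtimeFlags holds the currently wired service, or nil if none was set.
var runtimeFlags atomic.Pointer[FlagService]

// SetFeatureFlags wires (or clears, when svc is nil) the flag service.
func SetFeatureFlags(svc FlagService) {
	if svc == nil {
		runtimeFlags.Store(nil)
		return
	}
	runtimeFlags.Store(&svc)
}

// useSlimBrief reports whether the slim brief should be emitted.
// A missing service means "legacy", so a forgotten wiring never panics.
func useSlimBrief() bool {
	p := runtimeFlags.Load()
	if p == nil {
		return false
	}
	return (*p).Enabled("runtime_brief_slim")
}

// staticFlags is a hypothetical in-memory FlagService for the demo.
type staticFlags map[string]bool

func (s staticFlags) Enabled(name string) bool { return s[name] }

func main() {
	fmt.Println("unwired:", useSlimBrief())
	SetFeatureFlags(staticFlags{"runtime_brief_slim": true})
	fmt.Println("flag on:", useSlimBrief())
}
```

Keeping the check in one function means both the runtime brief and the comment-reply prompt can consult the same gate, which is the property the commit relies on to keep the two templates in sync.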
Naiyuan Qing	b79777caec	feat(comments): resolve-aware fold for agent comment reads (MUL-3555) (#4463 ) * feat(comments): resolve-aware fold for agent comment reads (MUL-3555) Agents reading a long issue paid tokens for settled discussion. The human timeline already folds resolved threads, but the agent read path (`comment list`) ignored resolved_at entirely — humans saw the conclusion, agents got the full raw discussion. Add an opt-in `fold=true` projection to ListComments that collapses each resolved thread to root + conclusion (reply-resolved) or root only (root-resolved), reusing the human timeline's deriveThreadResolution semantics. The resolved thread's root carries `thread_resolved` + `folded_count`; `--full` brings the dropped comments back. Fold is rejected on partial-thread reads (since/tail) and roots_only, where a resolution comment could be unfetched and silently dropped. CLI `comment list` folds by default on the complete-thread reads (default, --recent, untailed --thread) with a `--full` escape hatch; the agent prompts and runtime brief document the fold + escape. No new endpoint, no human UI change, no SQL/migration change — in-memory projection, same precedent as summary/roots_only. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * refactor(daemon): dedupe fold prompt restatements per review (MUL-3555) Howard's PR review flagged DRY redundancy: the resolve-fold rule was restated in full in the task prompt (prompt.go:41/:182) and the brief workflow steps (runtime_config.go:673/:692, reply_instructions cold hint) even though the canonical command catalog (runtime_config.go:477) — always present in the brief — already documents it in full, and the task prompt explicitly defers to it ("follow the rule in your runtime workflow file"). 
Keep the catalog entry full (the canonical reference); shrink the five inline restatements to a short "resolved threads come back folded — `--full` to expand" pointer. No loss of signal (the agent always has the full catalog in context), ~80-120 tokens/run saved on the worst-case assignment / cold paths. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai>	2026-06-24 09:52:18 +08:00
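The fold projection from b79777caec (a resolved thread collapses to root + conclusion when a reply resolved it, or to the root alone when the root itself is resolved, with `thread_resolved` and `folded_count` carried on the root) can be illustrated with the sketch below. The `Comment` struct, the boolean `Resolved` field standing in for a non-null `resolved_at`, and the `foldThread` function are all hypothetical; the real `deriveThreadResolution` semantics may differ, for example in how multiple resolved replies are handled.

```go
package main

import "fmt"

// Comment is a hypothetical, simplified comment record.
type Comment struct {
	ID             string
	Resolved       bool // stand-in for a non-null resolved_at
	ThreadResolved bool // set on the root of a folded thread
	FoldedCount    int  // number of comments dropped by the fold
}

// foldThread returns the comments an agent would read for one thread.
// Assumption: the first resolved reply is treated as the conclusion.
func foldThread(root Comment, replies []Comment) []Comment {
	if root.Resolved {
		// Root-resolved: keep only the root, drop every reply.
		root.ThreadResolved = true
		root.FoldedCount = len(replies)
		return []Comment{root}
	}
	for _, r := range replies {
		if r.Resolved {
			// Reply-resolved: keep root + conclusion, drop the rest.
			root.ThreadResolved = true
			root.FoldedCount = len(replies) - 1
			return []Comment{root, r}
		}
	}
	// Unresolved threads are returned in full.
	return append([]Comment{root}, replies...)
}

func main() {
	replies := []Comment{{ID: "a"}, {ID: "b", Resolved: true}, {ID: "c"}}
	for _, c := range foldThread(Comment{ID: "root"}, replies) {
		fmt.Printf("%s resolved=%v folded=%d\n", c.ID, c.ThreadResolved, c.FoldedCount)
	}
}
```

Because the fold is a pure in-memory projection over an already fetched thread, it fits the commit's claim that no SQL or migration change is needed; it also shows why partial-thread reads reject folding, since the resolving reply might not be in the fetched slice.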
Bohan Jiang	5038c983c0	MUL-3281: Add daemon skill bundle refs (#4445) * feat: add daemon skill bundle refs Co-authored-by: multica-agent <github@multica.ai> * fix: tighten skill bundle resolve safeguards Co-authored-by: multica-agent <github@multica.ai> * feat: add task prepare lease Co-authored-by: multica-agent <github@multica.ai> * fix: isolate prepare lease concurrent index migration Co-authored-by: multica-agent <github@multica.ai> * fix: keep prepare lease active through start Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-23 16:19:16 +08:00
Multica Eve	12ea1f6a8c	MUL-3495: support custom runtime args and registration errors (#4408) * feat: support custom runtime args Co-authored-by: multica-agent <github@multica.ai> * fix: address custom runtime review nits Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: multica-agent <github@multica.ai>	2026-06-23 14:20:18 +08:00
Naiyuan Qing	4ab335b8a5	MUL-3416: Issue pre-trigger preview + Handoff Note (#4383 ) * feat(issues): unify run-enqueue decision behind WillEnqueueRun + preview endpoint Collapse the issue update/batch enqueue copies into one service predicate service.IssueService.WillEnqueueRun, shared verbatim with a new dry-run endpoint POST /api/issues/preview-trigger so the four entry points stop drifting (squad/self-loop/batch omissions, MUL-3375). The private-agent gate stays at the HTTP boundary: write paths inject allow-all, preview injects the real gate so it never leaks a private agent's readiness. Add suppress_run to issue update/batch: the change applies but no run starts. Remove the now-dead handler mirrors shouldEnqueueSquadLeaderOnAssign / isSquadLeaderReady. service.Create and the comment trigger chain are untouched. Tests: preview behavior, preview<->write-path match, batch aggregation, member no-trigger, suppress_run skip, malformed-body 400. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * feat(issues): inject handoff note into assigned runs via first-class task field Add an optional handoff_note carried by issue assign/promote into the run's opening prompt and issue_context.md, via a dedicated agent_task_queue column (migration 122) and a daemon assignment-handoff render branch — never a fabricated comment, never trigger_comment_id (MUL-3375 §6.1). Thread the note through enqueueIssueTask/enqueueMentionTask + WithHandoff public variants and dispatchIssueRun; suppress_run or a parked write drops it (no run = nothing to inject). Soft version gate: MinHandoffCLIVersion + HandoffSupported, surfaced per-trigger as handoff_supported in the preview so the UI can gray the note box on old daemons; the assignment never hard-fails. Tests: daemon prompt + issue_context render via the assignment branch (not quick-create/comment), version helper matrix, note persists on the task, suppressed assign enqueues nothing. 
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * feat(issues): leave a display-only handoff record on the timeline When an assign/promote with a handoff note starts a run, write one type='handoff' timeline record via TaskService.RecordHandoff — a direct Queries.CreateComment + timeline event that bypasses Handler.CreateComment, so it never reaches triggerTasksForComment and cannot start a second run (MUL-3375 §6.2, the must-not-retrigger invariant). Author is the actor who handed off; body is the note. Migration 123 admits the 'handoff' comment type. Recorded only on a real run start: suppress_run or a parked write writes nothing. enqueueSquadLeaderTask now reports whether it enqueued so the trace is gated on an actual dispatch. Test: exactly one handoff record on assign-with-note, exactly one task (no re-trigger), and no record when suppressed. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * feat(issues): frontend plumbing for issue-trigger preview + handoff (core) Add api.previewIssueTrigger + IssueTriggerPreviewSchema (zod parseWithFallback), the use-issue-trigger-preview hook, issueKeys.issueTriggerPreview(+All) with WS queue-state invalidation, suppress_run/handoff_note on UpdateIssueRequest, the 'handoff' CommentType, and stripping of the control fields from optimistic update/batch cache patches (MUL-3375 §9). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * fix(issues): exclude handoff records from new-comment counting type='handoff' is a display-only timeline record, not conversation. Exclude it from CountNewCommentsSince so a handoff note never inflates the count of "new comments to catch up on" fed to a claiming agent (MUL-3375 §12). Analytics already excludes it (RecordHandoff is a direct write that emits no analytics event), and the comment-trigger path is already bypassed. 
Test: a handoff record does not bump the new-comment count; a real comment does. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * feat(issues): pre-trigger preview UI, handoff note, timeline card (web/desktop) Wire the §9 frontend onto the preview endpoint + handoff fields: - Delete the backlog blocking dialog (backlog-agent-hint) and its modal type; the over-eager nag is gone. Backlog awareness is now a passive label. - RunConfirmModal: single assign + batch assign/status route here. Shows the backend predicate's verdict ("将启动 @X" / "将启动 N 个" / parked), an optional handoff note (assign only, soft-gated by handoff_supported), and 暂不启动 — then applies via update/batch. No frontend guessing. - create modal: passive CreateRunHint ("将启动 @X" / backlog parked). - single status change stays a direct apply (unchanged). - timeline: render type='handoff' as a distinct, non-interactive handoff card. - i18n run_confirm + handoff_card across en/ja/ko/zh-Hans; drop backlog action keys; locale parity green. Tests: use-issue-actions (assign → run-confirm modal, member → direct), create-issue + comment-card suites updated/green; views typecheck + lint clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> test(issues): use a valid anchor in the handoff count-exclusion test CountNewCommentsSince filters id <> @anchor_id; SQL id <> NULL is NULL and excludes every row, so an empty anchor made the control assertion read 0. The production caller always passes a real anchor — mirror that with a non-matching sentinel uuid. 
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * test(issues): RunConfirmModal apply logic (start/suppress/note-gate/batch) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * test(core): preview schema malformed/missing/null fallback coverage Cover IssueTriggerPreviewSchema via parseWithFallback (MUL-3375): well-formed parse, top-level + item default fills (empty/older backend), and fallback to { triggers: [], total_count: 0 } for malformed shapes, a dropped required issue_id, a wrong-typed total_count, and null/non-object bodies — so the four entry points degrade to "nothing will start" instead of throwing. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * refactor(issues): remove display-only handoff timeline record (留痕) The handoff "留痕" timeline record (type='handoff' comment written on run start) was judged superfluous and dropped per product call. This removes only the display-only trace; the handoff NOTE injection into the run's opening prompt + issue_context.md is untouched. - backend: drop RecordHandoff + its call in dispatchIssueRun - db: drop the `type <> 'handoff'` exclusion in CountNewCommentsSince and migration 123 (comment_type_check reverts to the 4-type set from 001); no production data exists for this unreleased feature - frontend: drop the "handoff" CommentType, HandoffCard, and handoff_card i18n (all locales) - tests: drop handoff_count_test.go and the record-write assertions in issue_trigger_preview_test.go (note-injection tests retained) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * feat(issues): dismissable run-confirm modal + team-handoff copy Two fixes to the pre-trigger confirm modal (MUL-3375). 1. 
Dismissable: switch RunConfirmModal from AlertDialog to the standard shadcn Dialog so it has the close (X) button + Esc + click-outside. Previously the only choices were "start" / "don't start now" with no way to abort the action entirely; dismissing now cancels with no write. 2. Copy: rework the action-surface wording away from the backend term "run" toward team-handoff voice — 指派 / 开始 / 交接 (run stays only on record surfaces). Unifies the note's three names to "交接说明", and parallels the rewrite across en/ja/ko. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * chore(agent): bump handoff note min CLI version to 0.3.28 The daemon release that renders handoff notes ships in 0.3.28 (0.3.27 was the prior tag), so move the soft-gate threshold up. Below this the note is silently dropped and the frontend grays the note box — assignment is never blocked. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(issues): skip run-confirm when batch-moving issues to backlog A move into backlog never starts a run (service/issue_trigger.go), so the pre-trigger confirm modal degenerated to an empty "won't start" box with a single Apply button — pure friction. Apply directly instead, matching the single-issue status path. Other target statuses still route through the modal. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(issues): refine pre-trigger preview hint and copy - Move the create-issue run hint to a reveal band (grid 0fr→1fr) above the property toolbar. It was sharing the footer button row and, lacking a width constraint, reflowed the submit buttons whenever it appeared. Restyle to a borderless, comment-style avatar+caption that is purely a caption (non-interactive avatar). - Distinguish squad from agent in the pre-trigger copy: a squad's leader evaluates and delegates rather than "starting work" itself. 
Add will_start_named_squad / will_start_squad / create_will_start_squad across en/zh/ja/ko (reusing the squad_leader_* evaluate→arrange vocabulary) and branch run-confirm + the create hint on squad assignees. - Bold the assignee name in the run-confirm headline via a language-safe sentinel split (no per-language prefix/suffix keys). - Align zh "开始处理" → "开始工作" on the single-assign copy. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(issues): stub ActorAvatar in create-issue suite CreateRunHint now renders an ActorAvatar for agent/squad assignees, which pulls in getActorInitials/getActorAvatarUrl + the workspace/presence/navigation hook tree. This form-focused suite only stubbed getActorName, so the squad-forwarding test crashed with "getActorInitials is not a function". Stub the avatar inert — its own behavior is covered elsewhere. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Walt <walt@multica.ai> Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai>	2026-06-23 13:17:13 +08:00
Bohan Jiang	48b8dbf439	feat(daemon): surface sub-issue stages in the always-on runtime brief (#4426 ) Agents creating sub-issues only saw the runtime brief's Sub-issue Creation section, which taught the manual todo/backlog serial chain and never mentioned stages — the `--stage` flow was documented only in the multica-working-on-issues skill, which an agent reads only if it opens it. So agents defaulted to hand-managed backlog chains and rarely reached for stages. - Add an "Ordering with stages" paragraph to the brief's Sub-issue Creation section nudging agents to group ordered/waiting sub-issues with --stage instead of hand-promoting a backlog chain. - List --stage on the brief's issue create / update command lines and add multica issue children to the Core command list for discoverability. - Extend the brief test with the new stage assertions. The Sub-issue Creation section stays gated to issue-bound runs (skipped for chat/quick-create/autopilot), unconditional on parent_issue_id, and free of parent-notification guidance — all existing canaries still pass. MUL-3508 Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-23 01:08:10 +08:00
Bohan Jiang	da72e2fa22	feat(daemon): inject project description into the agent brief (MUL-3465) (#4395 ) * feat(daemon): inject project description into the agent brief Issues bound to a project only surfaced the project title in the runtime brief; the project description (durable, project-wide context the owner sets) was loaded but dropped. Carry it end-to-end: - claim handler reads proj.Description onto the response (issue-bound and quick-create paths) - new ProjectDescription field on AgentTaskResponse, daemon Task, and TaskContextForEnv - rendered in the brief's `## Project Context` section and written to .multica/project/resources.json as project_description Empty descriptions render nothing (no extra heading). Updated the projects-and-resources built-in skill docs in the same change. MUL-3465 Co-authored-by: multica-agent <github@multica.ai> * feat(projects): clarify project description is injected as agent context The project description is now durable context injected into every task's brief, but the UI still presented it as a plain "Description" field, so existing descriptions could silently become agent input. Add a hint under the description editor on the project detail page and in the create-project modal, in all four locales, stating it is shared with agents as context for every task in the project. No data-semantics change. Addresses review feedback on PR #4395. MUL-3465 Co-authored-by: multica-agent <github@multica.ai> * test(handler): assert project description flows through task claim The execenv tests cover brief rendering, but nothing pinned the claim handler boundary where proj.Description is read onto the response. Add two tests — issue-bound and quick-create paths — so a regression in that assignment fails loudly instead of silently dropping the description. Addresses review feedback on PR #4395. 
MUL-3465 Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-22 23:39:27 +08:00
DylanLi	78342a39ce	MUL-3305: feat(agent): add qoder CLI as a choice of agent provider. (#2461 ) * feat(agent): Qoder ACP runtime, chat reconnect recovery, and task linkage - Add Qoder CLI backend (ACP transport, model discovery, blocked-args policy) - Wire daemon/runtime config, docs, and UI provider assets - Retry terminal task reports; add backoff unit tests - Chat: SQL attach user message to task; handler + optimistic cache reconcile - Invalidate chat/task-messages caches on WS reconnect; extract helper + tests Co-authored-by: Orca <help@stably.ai> Co-authored-by: Cursor <cursoragent@cursor.com> * chore: drop non-Qoder changes (chat reconnect, task link, terminal report retries) Keep only Qoder runtime, docs, daemon config/execenv, and UI provider assets. Co-authored-by: Orca <help@stably.ai> Co-authored-by: Cursor <cursoragent@cursor.com> * fix(agent): harden Qoder ACP drain and wire project skills path - Stop streaming to msgCh after reader wait so grace timeout cannot race close - Resolve injected skills to .qoder/skills per Qoder CLI discovery - Update AGENTS.md skill copy and add execenv tests Co-authored-by: Orca <help@stably.ai> Co-authored-by: Cursor <cursoragent@cursor.com> * feat(qoder): add provider logo and wire MCP config into ACP sessions - Add inline SVG QoderLogo component to provider-logo.tsx, replacing the generic Monitor icon placeholder - Add convertMcpConfigForACP helper to convert Claude-style MCP server config (object map) into ACP array format for session/new and session/resume - Add unit tests for convertMcpConfigForACP covering stdio, SSE, empty/nil, and multi-server cases Co-authored-by: Orca <help@stably.ai> * fix(test): capture both return values from InjectRuntimeConfig in Qoder test Co-authored-by: Orca <help@stably.ai> * fix(qoder): preserve remote MCP headers and promote provider errors Addresses review feedback on #2461 (Bohan-J): two runtime-correctness issues in the Qoder ACP backend. 1. Remote MCP headers were dropped. 
The bespoke convertMcpConfigForACP only forwarded url/type, so an authenticated remote MCP server looked configured in Multica but failed inside the Qoder session. Replace it with the shared buildACPMcpServers helper (same path Hermes/Kimi/Kiro use), which preserves headers as [{name, value}], sorts for deterministic output, and handles remote transport aliases. Fail closed on malformed mcp_config instead of silently dropping servers. 2. Provider failures could report as completed tasks. stderr was wired via io.MultiWriter and the result was only promoted to failed when output was empty, so a terminal upstream error (HTTP 429 / expired token) racing a stopReason=end_turn with text still became "completed". Switch to StderrPipe + an explicit copier, drain it (bounded by the existing grace window, since qodercli can leave a child holding the inherited fds) before the decision, and run the shared promoteACPResultOnProviderError. Tests: replace the convertMcpConfigForACP unit tests with two end-to-end Qoder tests — one asserts the Authorization header reaches the session/new payload as {name, value}, the other asserts a terminal stderr error with non-empty output reports failed. Co-authored-by: Orca <help@stably.ai> * fix(qoder): align ACP session handling Co-authored-by: Orca <help@stably.ai> * fix(agent): guard qoder late output after drain Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Orca <help@stably.ai> Co-authored-by: Cursor <cursoragent@cursor.com> Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-22 18:55:45 +08:00
Naiyuan Qing	4fe8b54e9b	MUL-3446: keep chat output in chat (#4387) * MUL-3446: keep chat output in chat Co-authored-by: multica-agent <github@multica.ai> * MUL-3446: simplify chat output guidance Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-06-22 15:51:03 +08:00
BeliyDym	5fd3d01d13	MUL-3502: OST-1161: Bound assignment comment catch-up Squashed PR #4392. Updates assignment/comment catch-up guidance to use recent 10 and aligns related examples.	2026-06-22 15:46:47 +08:00
Hzzzzzx	b4c9e4423c	test: enable -race detector in Go test pipeline (WOR-61) (#4274 ) * test: enable -race detector in Go test pipeline (WOR-61) Add the -race flag to all three Go test invocation sites so the existing concurrency regression harness (workdir_race_test.go for #3999, runtime_gone_test.go, runtime_profile_drift_test.go) actually exercises the race detector. The daemon package alone has 28+ goroutine launch points with no automated race coverage before this change. Sites updated: - Makefile:299 (make test, local) - .github/workflows/ci.yml:101 (CI backend job) - .github/workflows/release.yml:55 (release verify job) go test already runs a vet subset by default, so no separate -vet flag is added. No production code touched. Co-authored-by: multica-agent <github@multica.ai> * test(execenv): serialize runtimeGOOS-mutating test (WOR-61) TestInjectRuntimeConfigIssueMetadataCodexFormattingUnchanged called t.Parallel() while mutating the package-level runtimeGOOS to drive the windows/linux branches, racing with the other parallel tests that read runtimeGOOS in buildMetaSkillContent. The -race flag enabled in the prior commit surfaced it as 3 WARNING: DATA RACE reports and 11 "race detected" failures in CI (only the execenv package failed). Drop t.Parallel() and add the "// Not parallel: mutates the package-level runtimeGOOS." comment already used by the six sibling writer tests across execenv_test.go and reply_instructions_test.go. This is test-isolation only; no production code, no mutex/atomic, no signature change. Verified locally: go test -race -count=1 ./internal/daemon/execenv/ -> ok 2.276s go test -race -count=1 ./internal/daemon/... -> all 3 pkgs ok Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: hzz <331380069@qq.com> Co-authored-by: multica-agent <github@multica.ai>	2026-06-18 15:50:24 +08:00
LinYushen	0d0edac32f	feat(daemon): discover local skills from ~/.agents/skills (MUL-3333) (#4244) * feat(daemon): discover local skills from ~/.agents/skills (MUL-3333) Upgrade local skill discovery and import from a single provider root to an ordered multi-root scan: the runtime's own skill directory (e.g. ~/.claude/skills) first, then the cross-tool universal root ~/.agents/skills. - Rename localSkillRootForProvider -> localSkillRootsForProvider, returning ordered roots [provider, universal] with a kind classifier. - listRuntimeLocalSkills iterates the roots, gives each root its OWN visited set (so a cross-root symlink alias is not collapsed), dedupes strictly by Key with the provider root winning, and sorts once after the merge. - loadRuntimeLocalSkillBundle walks the same priority order and only falls through to the next root on os.IsNotExist; any other stat error is returned so import never silently resolves a different same-key skill. - Add a Root ("provider" | "universal") field to the local skill summary (daemon + handler structs and the TS RuntimeLocalSkillSummary type) so the UI can label a skill's origin without a future schema break. Backward compatible: every skill visible today keeps its Key, SourcePath and FileCount; the universal root only surfaces additional, non-conflicting skills. Out of scope (follow-up issues): execution-time injection of ~/.agents/skills into runtimes (e.g. Codex seedUserCodexSkills) and workspace-relative .agents/skills discovery. Tests cover universal-root discovery + import, provider-wins conflict priority, both-roots merge, missing/both-missing roots, nested layouts, IsNotExist fallback, the no-fallthrough-on-read-error guarantee, and the per-root visited cross-root symlink alias. Docs updated in en/zh/ja/ko. 
Co-authored-by: multica-agent <github@multica.ai> * fix(daemon): fall through to next root when a same-key dir has no SKILL.md loadRuntimeLocalSkillBundle previously only fell through to the next root on os.IsNotExist for the skill DIRECTORY. A provider-root directory that shares a skill's key but contains no SKILL.md (so listRuntimeLocalSkills descends past it and surfaces the universal-root skill instead) made load stop on the invalid provider dir and error — list and load disagreed, and the import the user picked from the list could not be fetched. Make the validity predicate match list: a root "has" the skill at a key only when it is a directory containing a SKILL.md. A missing entry, a non-directory, or a directory without a SKILL.md all mean "this root doesn't have it" and we continue to the next root. Only a genuine non-IsNotExist stat error or an unreadable existing SKILL.md (permission/IO) is returned, so we still never silently substitute a different-content same-key skill from a lower-priority root (Eve review #1, preserved by the existing read-error guard test). Adds regression tests for the provider-dir-without-SKILL.md and provider-non-dir fall-through cases. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-06-18 15:24:41 +08:00
Bohan Jiang	e3a829f05e	feat(daemon): disk-usage cross-root aggregation + migration FK/readiness fixes (MUL-3404) (#4290) Bundles the MUL-3404 disk-usage feature with the two Preflight BLOCK fixes. - feat(daemon): `disk-usage --all-profiles` aggregates across every workspace root (default + each ~/.multica/profiles/* root, incl. the Desktop app's), with a per-root breakdown and combined grand total; the cross-root hint now also fires when the current root is non-empty. - fix(db): drop DB-level foreign keys/cascades from the new autopilot_subscriber and comment.source_task_id migrations (resolved in the app layer — autopilot delete now removes subscribers in a transaction); the autopilot_subscriber down-migration relabels reason='autopilot' to 'manual' instead of deleting. - fix(server): readiness verifies every required migration is applied, not just the lexically-last one, so an out-of-order migration can't be masked. MUL-3404.	2026-06-18 13:33:14 +08:00
Bohan Jiang	e7daf876bd	fix(daemon): reclaim autopilot_run workdir on terminal status (MUL-3403) (#4287) * fix(daemon): reclaim autopilot_run workdir on terminal status (MUL-3403) Autopilot run workdirs are never reused — there is no PriorWorkDir path that hands a later run the same directory, so every run gets a fresh one. Yet GC waited the full GCTTL (default 24h) before reclaiming a terminal run's dir. Combined with one fresh dir per run, high-frequency autopilots piled up hundreds of stale dirs (508 dirs / 22GB in the field report). Drop the TTL gate so a terminal run (completed/failed/skipped/issue_created) is reclaimed immediately, mirroring gcDecisionQuickCreate. Existing safety constraints are untouched: active-env-root short-circuit, 404 -> orphanByMTime, non-404 error -> skip, and the local_directory override all still apply. Co-authored-by: multica-agent <github@multica.ai> * docs(daemon): fix GetAutopilotRunGCCheck comment — completed_at is not a TTL anchor The endpoint comment still claimed the daemon uses completed_at as the TTL anchor for terminal runs. GC now decides purely on terminal status (the workdir is never reused, so a terminal run is reclaimed on sight); completed_at is returned for the API contract / diagnostics only. Addresses the review nit on #4287. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-18 11:17:59 +08:00
Bohan Jiang	1279f22d1c	MUL-3325: add background task safety brief (#4257) * fix(daemon): add background task safety brief Co-authored-by: multica-agent <github@multica.ai> * fix(agent): force Claude background tools foreground Co-authored-by: multica-agent <github@multica.ai> * fix(agent): narrow Claude async launch detection Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-18 10:45:51 +08:00
LinYushen	77ac17ef49	Make custom runtimes appear immediately (#4234) * Make custom runtimes appear immediately * Scope daemon profile refresh by authorized runtimes * Relay runtime profile refresh hints * Localize runtime profile close label	2026-06-17 16:00:22 +08:00
Multica Eve	6bb8cac9ea	MUL-3332: daemon picks up new custom runtime profiles without restart (#4225 ) * MUL-3332: daemon picks up new custom runtime profiles without restart The workspaceSyncLoop's already-tracked branch refreshed only settings and repos via refreshWorkspaceRepos and never re-fetched runtime profiles, so a custom runtime profile created via the web UI / CLI did not become a registered runtime row until the daemon restarted (or a runtimeGone recovery happened to fire). Detect server-side profile drift each sync tick by hashing the workspace's profile list with profileSetSignature(), caching the digest on workspaceState.profileSetSig, and triggering reregisterWorkspaceAfterRuntimeGone when the live signature differs from the cached one. Steady-state syncs cost exactly one extra GetRuntimeProfiles round trip; only real drift fans out to a Register call. The fetch is best-effort: a 404 / network blip preserves the cached signature so a transient failure cannot loop the daemon into spurious re-registrations. Tests in runtime_profile_drift_test.go cover digest stability under reorder, field-by-field drift detection (add / enable-flip / command_name / protocol_family / fixed_args / visibility), the no-drift hot path (no re-register), the new-profile drift path (single re-register + index update + sig converges), and best-effort fetch error handling. Co-authored-by: multica-agent <github@multica.ai> * MUL-3332: split orphan recovery from profile drift; converge to zero Addresses two blocking review concerns on #4225 (raised by GPT-Boy): 1. Profile drift must not kill running tasks on existing runtimes. The first cut reused reregisterWorkspaceAfterRuntimeGone, which after re-register calls /recover-orphans for every returned runtime ID. 
The server's RecoverOrphanedTasksForRuntime hard-fails every dispatched/running/waiting_local_directory row on that runtime — the correct response when a runtime row was actually deleted server-side, but a catastrophic false positive on profile drift: a built-in runtime still actively executing the user's tasks would have its work killed just because the user added an unrelated sibling custom profile. Fix: extract applyRegisterResponseInPlace as the shared in-place state converger between the two paths, and stop calling /recover-orphans from the drift path. reregisterWorkspaceAfterRuntimeGone keeps the /recover-orphans call because in that path the rows really were gone. 2. Disabling the only profile on a custom-only daemon must converge. The first cut hit registerRuntimesForWorkspace's len(runtimes)==0 guard and bailed out, so the disabled profile's runtime stayed alive in local tracking and on the server (still polling, still heartbeating, still online for the full 150 s stale-heartbeat window). Fix: introduce ErrNoRuntimesToRegister as a sentinel, have registerRuntimesForWorkspace return profileSig even on the empty case (so the drift path can cache the converged-empty signature), and have the drift refresh's error handler take a convergeWorkspaceRuntimesToZero branch that clears local runtimeIDs / runtimeIndex entries and Deregisters the orphaned IDs so the server marks them offline immediately. The same Deregister step also runs on partial drift (a built-in survives, the disabled profile's runtime drops) so the user sees the dropped runtime go offline within the next sync tick instead of after the 150 s sweep. Tests: - TestRefreshWorkspaceRuntimeProfiles_DriftWithRunningRuntimeSkipsOrphanRecovery (mixed built-in + custom, add another profile, asserts zero /recover-orphans calls). 
- TestRefreshWorkspaceRuntimeProfiles_DisableConvergesCustomOnlyDaemon (custom-only daemon, disable only profile, asserts local state cleared, signature converges to empty digest, Deregister called with the orphaned ID, no recover-orphans, follow-up tick is no-op). - TestRefreshWorkspaceRuntimeProfiles_DisableOneOfManyDeregistersDroppedID (partial drift: only the dropped ID is Deregistered, surviving built-in is left alone and not orphan-recovered). - TestRefreshWorkspaceRuntimeProfiles_NewProfileTriggersReregister extended to also assert no /recover-orphans calls. - TestRegisterRuntimes_SkipsProfileNotOnPath strengthened to assert the ErrNoRuntimesToRegister sentinel and that profileSig is still returned on the empty path. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: multica-agent <github@multica.ai>	2026-06-17 12:36:30 +08:00
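The drift detection described in this commit relies on a digest that stays stable when the profile list is reordered and changes when any tracked field changes. One way to get that property is sketched below. The struct, its JSON tags, and `sketchProfileSetSignature` are assumptions for illustration; the real profileSetSignature may hash different fields or use a different encoding.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"sort"
)

// profile is a hypothetical stand-in for the daemon's runtime profile type,
// limited to the fields the commit message lists as drift-relevant.
type profile struct {
	ID             string   `json:"id"`
	Enabled        bool     `json:"enabled"`
	CommandName    string   `json:"command_name"`
	ProtocolFamily string   `json:"protocol_family"`
	FixedArgs      []string `json:"fixed_args"`
	Visibility     string   `json:"visibility"`
}

// sketchProfileSetSignature hashes the profiles in ID order so that the
// digest does not depend on the order the server returned them in.
func sketchProfileSetSignature(profiles []profile) string {
	sorted := make([]profile, len(profiles))
	copy(sorted, profiles)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].ID < sorted[j].ID })

	h := sha256.New()
	enc := json.NewEncoder(h)
	for _, p := range sorted {
		// Writes to a hash never fail and the struct is always encodable,
		// so the error is ignored here.
		_ = enc.Encode(p)
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	cached := sketchProfileSetSignature([]profile{
		{ID: "p1", Enabled: true, CommandName: "codex-wrap", ProtocolFamily: "codex"},
	})
	live := sketchProfileSetSignature([]profile{
		{ID: "p1", Enabled: true, CommandName: "codex-wrap", ProtocolFamily: "codex"},
		{ID: "p2", Enabled: true, CommandName: "claude-wrap", ProtocolFamily: "claude"},
	})
	fmt.Println("drift detected:", cached != live)
}
```

An empty list hashes to a fixed value as well, which gives the "converged-empty signature" the second commit caches after the last profile is disabled. Note that in this sketch a nil and an empty FixedArgs encode differently, so a caller would want to normalize one to the other.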
LinYushen	1f5cb51d4e	MUL-3284: Web UI + CLI (custom runtime PR3) (#4177 ) * MUL-3284 PR3 (CLI): multica runtime profile subcommands + local path override - cmd_runtime_profile.go: `multica runtime profile` group — list / create / update / delete against /api/workspaces/{id}/runtime-profiles, plus set-path / unset-path for a per-machine command override. protocol-family validated client-side via agent.IsSupportedType / agent.SupportedTypes; visibility validated; update only sends changed flags (protocol_family immutable); delete surfaces the server 409 body when agents are still bound. - internal/cli/config.go: ProfileCommandOverrides map[string]string on CLIConfig (omitempty), through the existing marshal/unmarshal so set/unset round-trips without dropping other fields. - internal/daemon: Config.ProfileCommandOverrides, loaded from CLIConfig; appendProfileRuntimes now prefers an override path when set AND executable, else falls back to exec.LookPath(command_name), else skips+logs as before. - Tests: cmd_runtime_profile_test.go (registration, create/update/delete incl. bad-family + missing-flag + 409 surfacing, set/unset path round-trip, relative-path rejection, config preservation); cli/config round-trip; daemon prefers-override / falls-back-when-not-executable. Verified: go build ./..., go vet, go test ./cmd/multica/... ./internal/daemon/... ./internal/cli/... all pass. Co-authored-by: multica-agent <github@multica.ai> * MUL-3284 PR3 (Web): custom runtime profiles in the Runtime page Single-list integration — no new page, no tabs/grouping. Built-in protocol families and custom profiles render mixed in one catalog, each row badged built-in vs custom (progressive disclosure). 
- packages/core: RUNTIME_PROFILE_PROTOCOL_FAMILIES (single-source 13-family whitelist, matches server agent.SupportedTypes + migration 120 CHECK) and RuntimeProtocolFamily / RuntimeProfile types; api client list/get/create/update/deleteRuntimeProfile against /api/workspaces/{id}/runtime-profiles; runtimes/profiles.ts query + mutation hooks and a 409 "agents still bound" conflict parser. - packages/views/runtimes: runtime-profile-catalog (mixed built-in+custom rows), runtime-profiles-dialog (header "+ Add runtime" → step 1 pick protocol family → step 2 display_name/command_name/description; edit form for custom; admin-gated), delete-runtime-profile-dialog (confirm + graceful 409), runtimes-page / runtime-list integration. - i18n: new strings added to all four locales (en, zh-Hans, ja, ko). - a11y: dialogs are focus-trapped, Esc-closable, labelled; full create/edit/delete flow is keyboard + screen-reader operable. Iron rule honored: no generic per-agent args UI here (those stay on Agent config). fixed_args is not surfaced as a general args field. Verified: turbo typecheck + lint + test pass for @multica/core, @multica/views, @multica/web; the @multica/web production build succeeds. Co-authored-by: multica-agent <github@multica.ai> * MUL-3284 PR3: hide fixed_args from Web + CLI (not yet wired to launch) Review fix. fixed_args was surfaced as a working feature, but the daemon does not splice it into the agent launch command — exposing it promised admins a no-op. Per the call, remove it from every user-facing surface while keeping the underlying column/struct "carried but not exposed". - Web (runtime-profiles-dialog.tsx + runtime-profile-catalog.ts): drop the detail row, the create body field, the update patch field, and the form textarea; remove the parseFixedArgs/fixedArgsToText helpers and the fixedArgs form value. Left a NOTE pointing at the daemon TODO. - i18n: removed the fixed_args strings from all four locales (en/zh-Hans/ja/ko). 
- CLI (cmd_runtime_profile.go): removed the `--fixed-arg` flag from create and update and stopped sending `fixed_args`; updated the "no fields" message. Test now asserts the CLI never sends fixed_args. Untouched (the carried-but-not-exposed layer): the runtime_profile.fixed_args column, the server handler's accept/return, and the daemon's RuntimeProfile field — all keep the existing TODO(MUL-3284) to wire it into the launch path (with a test proving args reach the backend) before any UI/CLI re-exposes it. Verified: turbo typecheck+lint+test pass for @multica/core and @multica/views; go build/vet/test pass for ./cmd/multica/. Co-authored-by: multica-agent <github@multica.ai> * MUL-3284 PR3: stop exposing profile visibility=private (server forces workspace) Double-review (Eve) caught a fixed_args-shaped hole: visibility=private was a user-facing toggle (Web form + detail + CLI), but the three server read paths (ListRuntimeProfiles, daemon ListEnabledRuntimeProfilesForWorkspace, DaemonRegister) never enforce it — so a "private" profile's name/command would leak to other members and could be registered by other machines' daemons (lateral data leak). Same "don't paint a pie" fix as fixed_args: hide the control everywhere and force the stored value. - Server (runtime_profile.go): drop `visibility` from the create + update request structs; CreateRuntimeProfile always stores 'workspace' (runtimeProfileDefaultVisibility); UpdateRuntimeProfile no longer accepts it; removed validRuntimeProfileVisibility. The column + response field stay (always 'workspace') as the carried-but-not-exposed layer. - Web (runtime-profiles-dialog.tsx): removed the visibility form fieldset, the VisibilityOption component, the detail row, the visibility state, and the create/update submit fields. - i18n: removed the profile visibility strings from all four locales (profiles.detail.visibility, profiles.visibility., profiles.form.visibility_). Top-level runtime/agent visibility strings are untouched. 
- CLI (cmd_runtime_profile.go): removed `--visibility` from create/update and the VISIBILITY list column; removed validateVisibility; stopped sending the field. - Tests: new TestCreateRuntimeProfile_ForcesWorkspaceVisibility (POST visibility:"private" -> response and DB row are 'workspace'); CLI create test now asserts visibility is never sent. Follow-up MUL-3308 tracks implementing real creator-visibility (and wiring fixed_args to the launch path); TODOs left in server/Web/CLI point to it. Verified: turbo typecheck+lint+test pass (@multica/core, @multica/views); go build/vet pass; go test ./cmd/multica/... and the full ./internal/handler/ suite pass against a migrated Postgres 17. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-06-17 11:38:17 +08:00
LinYushen	52e76e7b23	MUL-3284: server API + daemon (custom runtime PR2) (#4149 ) * MUL-3284: add runtime_profile schema (custom runtime PR1) Schema-only foundation for custom runtimes. Additive migration 120: - New workspace-level `runtime_profile` table: the shared, team-visible definition of a custom runtime (e.g. an in-house Codex wrapper). protocol_family is CHECK-constrained to the exact backend list in agent.New() (server/pkg/agent/agent.go). The only args column is `fixed_args` (args every agent on the runtime must inherit); there is deliberately no generic per-agent args field — those stay on agent.custom_args. - `agent_runtime.profile_id` (nullable, FK -> runtime_profile ON DELETE CASCADE): NULL = built-in runtime, non-NULL = a registered instance of a custom profile. - Partial unique index agent_runtime_workspace_daemon_profile_key on (workspace_id, daemon_id, profile_id) WHERE profile_id IS NOT NULL. The legacy UNIQUE (workspace_id, daemon_id, provider) constraint is left INTACT so the existing registration upsert (ON CONFLICT (workspace_id, daemon_id, provider) in runtime.sql) keeps resolving its arbiter and the server stays green. Converting that key to a partial (WHERE profile_id IS NULL) index and making the upsert profile-aware is PR2's registration work, not this migration. Verified up + down against Postgres 17: full `migrate up` applies 120; schema shows the table, column, partial index and intact legacy constraint; functional checks pass (partial index blocks dup (ws,daemon,profile), allows same profile on another daemon; CHECK and display_name uniqueness reject bad input; legacy ON CONFLICT still resolves; profile delete cascades to instances); down/up round-trip is clean. 
Co-authored-by: multica-agent <github@multica.ai> * MUL-3284: drop DB FKs/cascade from runtime_profile migration (review fix) Per review (house rule: no new database foreign keys / cascades; relational integrity lives in the application layer): - runtime_profile.workspace_id: drop REFERENCES workspace ON DELETE CASCADE -> plain UUID NOT NULL. - runtime_profile.created_by: drop REFERENCES "user" ON DELETE SET NULL -> plain UUID. - agent_runtime.profile_id: drop REFERENCES runtime_profile ON DELETE CASCADE -> plain UUID. CHECK constraints, UNIQUE (workspace_id, display_name), the workspace index, and the partial unique index agent_runtime_workspace_daemon_profile_key are unchanged. The legacy UNIQUE (workspace_id, daemon_id, provider) constraint remains untouched. Behavioral consequence: the database no longer auto-removes a profile's agent_runtime instance rows on profile delete. That cleanup moves into PR2's profile-delete path. Up-migration comments document this; down-migration comment no longer references FKs/cascade. Re-verified on Postgres 17: migrate up applies 120; no FK constraints exist on the new columns; partial index still blocks dup (ws,daemon,profile_id); CHECK and display_name uniqueness still reject bad input; deleting a profile now leaves the runtime row orphaned (proving cascade is gone); down/up round-trip clean with the legacy constraint intact. Co-authored-by: multica-agent <github@multica.ai> * MUL-3284 PR2 (server): runtime_profile CRUD + profile-aware registration Server/DB half of the custom-runtime feature. - Migration 121: convert the legacy UNIQUE (workspace_id, daemon_id, provider) constraint on agent_runtime into a partial unique index scoped to built-in rows (WHERE profile_id IS NULL). With 120's partial index on profile_id this lets one daemon host the built-in provider AND custom profiles of the same protocol family without collision. 
- Queries: runtime_profile CRUD; ListEnabledRuntimeProfilesForWorkspace (daemon-facing); CountAgentsByProfile + DeleteAgentRuntimesByProfile for the app-layer cascade; profile-aware UpsertAgentRuntimeWithProfile; the built-in UpsertAgentRuntime ON CONFLICT now spells out WHERE profile_id IS NULL so it targets the right partial index. sqlc regenerated. - agent.SupportedTypes / IsSupportedType: single-source protocol_family whitelist, in lockstep with agent.New and the migration 120 CHECK. - Handlers + routes: runtime_profile CRUD (member-read, admin-write) with protocol_family whitelist validation, display_name uniqueness (409), and fixed_args validation (no generic per-agent args — iron rule); a daemon-token endpoint GET /api/daemon/workspaces/{id}/runtime-profiles; DeleteRuntimeProfile does the app-layer cascade (delete instance rows then profile, in one tx) and refuses (409) while active agents are bound. - DaemonRegister accepts an optional per-runtime profile_id: validates the profile belongs to the workspace and is enabled, registers via the profile-aware upsert, and skips legacy hostname merge for custom rows. AgentRuntimeResponse now carries profile_id. Verified on Postgres 17: migrate up through 121; built-in + custom codex coexist on one daemon; both upsert arbiters are idempotent; delete-by-profile cascade removes only the custom instance; migrate down reverses 121 then 120 and replays clean. go build ./... and go vet pass; handler test package compiles. Daemon-side wiring (fetch profiles, PATH-resolve command_name, register with profile_id, exec uses command_name) lands in a follow-up commit on this branch. Co-authored-by: multica-agent <github@multica.ai> * MUL-3284 PR2 (daemon): pull profiles, PATH-resolve, register, exec command Daemon-side half of custom runtime profiles, against the server contract on this branch. 
- client.go: GetRuntimeProfiles(workspaceID) -> GET /api/daemon/workspaces/{id}/runtime-profiles (mirrors GetWorkspaceRepos); RuntimeProfile / RuntimeProfilesResponse types. - types.go: Runtime gains profile_id (parsed from the register response so runtimeIndex carries it). - daemon.go: * appendProfileRuntimes — called inside registerRuntimesForWorkspace before the empty-runtimes guard. Best-effort fetch (older server 404s are logged and swallowed; never fails registration). Per enabled profile: resolve command_name via PATH (exec.LookPath, behind a `lookPath` test hook), skip+log when absent, best-effort version probe, record the resolved absolute path keyed by profile_id, and append a registration entry {name, type=protocol_family, version, status:online, profile_id}. A custom-only host (no built-in agents) still registers. * profileCommandPaths map (guarded by d.mu) + recordProfileCommandPath / customCommandPathForRuntime helpers. * runTask: looks up the claimed task's RuntimeID -> profile command path and overrides the executable path, synthesizing an AgentEntry so a custom runtime runs even when the host has no built-in agent of the same provider. provider (=protocol_family) is unchanged so agent.New still selects the right backend. - Tests: GetRuntimeProfiles request shape; profile runtime appended + path recorded (custom-only host); profile skipped when command not on PATH; profiles-fetch-404 is best-effort; customCommandPathForRuntime bookkeeping. - agent: lockstep test pinning SupportedTypes to agent.New and the migration 120 protocol_family CHECK. Iron rule honored: profile carries no generic per-agent args. fixed_args are parsed and carried but intentionally NOT wired into the launch command yet (optional/best-effort; explicit TODO(MUL-3284) in appendProfileRuntimes). Verified: go build ./... clean; go vet ./internal/daemon/... clean; go test ./internal/daemon/... 
pass (existing + 5 new); full go test ./internal/handler/ suite passes against a migrated Postgres 17; agent lockstep test passes. Co-authored-by: multica-agent <github@multica.ai> * MUL-3284 PR2: profile delete runs full archived-agent cascade (fix 500) Review fix. DeleteRuntimeProfile previously guarded only on ACTIVE agents, but agent.runtime_id is ON DELETE RESTRICT — a profile whose runtimes had only ARCHIVED agents passed the guard, then DeleteAgentRuntimesByProfile hit the FK and the handler 500'd. Now it mirrors the mature runtime-delete cascade (DeleteAgentRuntime): in one transaction it enumerates the profile's runtime rows, refuses (409) any with active agents or active squads led by archived agents, then for each runtime pauses autopilots pinned to its archived agents, drops archived squads led by them, and hard-deletes the archived agents before removing the runtime rows and the profile. No code path can now fall through to a raw FK error. - queries: ListAgentRuntimeIDsByProfile (sqlc regen). Reuses the existing per-runtime teardown queries (CountActiveSquadsWithArchivedLeadersByRuntime, ListArchivedAgentIDsByRuntime, PauseAutopilotsByAgentAssignees, DeleteSquadsByArchivedAgentsOnRuntime, DeleteArchivedAgentsByRuntime). - tests: TestDeleteRuntimeProfile_ArchivedAgentCascade (archived-only profile deletes cleanly: 204, runtime + archived agent + profile gone) and TestDeleteRuntimeProfile_ActiveAgentBlocks (active agent → 409, survives). Verified against Postgres 17: both new tests pass; full handler suite, daemon tests, and agent lockstep test pass; go vet clean. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-06-17 11:33:09 +08:00
Multica Eve	18a58e80c0	MUL-3316: fix(execenv): switch agent prompt to --content-file to prevent heredoc flag swallowing (#4182 ) (#4191 ) * fix(execenv): switch agent prompt to --content-file to prevent heredoc flag swallowing (#4182) The Linux/macOS reply template recommended --content-stdin with a quoted HEREDOC. That pattern is safe for the trivial single-flag comment-add case that BuildCommentReplyInstructions emits, but as soon as a model wraps extra flags around the heredoc on multica issue create / update — assignee, project — the bash heredoc/flag boundary is fragile in two ways the model cannot see: - A 'BODY \\' terminator with a trailing token is not recognised as the heredoc end, so flag lines after it are swallowed into the description (OXY-78: residual flag text leaked into the description, command exit 0). - A clean terminator turns the trailing '--assignee ...' line into a separate failing shell statement, while the create itself already exited 0 with no assignee (OXY-76: assignee silently dropped, no residual text). In both cases the CLI never receives the swallowed flags, the API request omits the fields, and the daemon has no visibility. The created issue lands with assignee_id: null / project_id: null. This commit: * Switches the Linux/macOS branch of BuildCommentReplyInstructions to --content-file with a 3-step recipe (write file, post, rm) so the body never reaches the shell and all flags live on one shell-token line. There is no heredoc boundary for flags to leak across. * Adds a parallel cleanup step (Remove-Item) to the Windows branch so the cross-platform template is one shape. * Rewrites the runtime_config.go ## Comment Formatting non-Windows section to mandate --content-file and explicitly ban --content-stdin HEREDOC for agent-authored comments, citing #4182. 
* Reorders the Available Commands menu lines for issue create / update / comment add to put --content-file / --description-file ahead of the stdin variant and add a per-line note pointing at #4182. * Updates and renames the affected tests (TestBuildCommentReplyInstructionsCodexLinux, TestBuildCommentReplyInstructionsNonCodexLinux, TestInjectRuntimeConfigLinuxCommentFormattingEmphasizesFile, TestInjectRuntimeConfigIssueMetadataCodexFormattingUnchanged) so the new file-first contract is pinned and the old HEREDOC mandate is in the banned-strings lists. This converges Linux/macOS with the long-standing Windows file-only path, so the cross-platform guidance is now one shape. It also strictly improves on the previous MUL-2904 guardrail by eliminating shell exposure of the body entirely (no body ever reaches the shell, so backtick / $() / $VAR substitution cannot corrupt it). Closes GitHub multica-ai/multica#4182. No CLI or backend changes — --content-file / --description-file already exist. Co-authored-by: multica-agent <github@multica.ai> * docs(prompt): correct stale BuildPrompt comment to file-first (#4182) --------- Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: multica-agent <github@multica.ai> Co-authored-by: CC-Girl <cc-girl@multica.ai>	2026-06-16 17:14:25 +08:00
Bohan Jiang	f9c193e06b	fix: fail closed on agent task auth tokens (#4142) Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-15 16:34:35 +08:00
YOMXXX	34d4cd3a28	feat(openclaw): support connecting to existing OpenClaw gateway (#3260 ) [MUL-3158] (#3664 ) * feat(openclaw): support connecting to existing OpenClaw gateway (#3260) When the daemon host is a lightweight dev machine or CI coordinator, the heavy agent work (LLM inference, code execution, tool use) often belongs on a more powerful remote server already running an OpenClaw gateway. Multica historically hard-coded `openclaw agent --local`, forcing every turn to execute in-process on the daemon host. This change adds an opt-in gateway routing mode controlled per-agent via `runtime_config`: { "mode": "gateway", "gateway": { "host": "...", "port": 18789, "token": "...", "tls": false } } - Backend: ExecOptions gains OpenclawMode + OpenclawGateway; buildOpenclawArgs drops `--local` when mode == "gateway". Per-task openclaw-config.json wrapper pins gateway.{host,port,auth.{mode,token},tls} so users do not need to edit the daemon host's `~/.openclaw/openclaw.json` to point at a different endpoint. - Daemon: AgentData carries the raw runtime_config; decoding is fail-soft (malformed JSON falls back to local mode rather than blocking dispatch). - API: gateway.token is masked to "**" on every GET; PATCH replays the sentinel back, and the update handler restores the persisted token so the round-trip never destroys the secret. Defense-in-depth masking on WS broadcasts, plus String/MarshalJSON masking on the in-memory struct to block stray `%+v` / json.Marshal leaks. - UI: openclaw-only "Routing" tab on the agent detail page with mode selector + structured endpoint form. Token uses a "saved — submit a new value to rotate" UX and matching backend preserve hook. Empty `runtime_config` keeps the historical embedded behaviour, so existing agents are unaffected. 
fix(openclaw): address #3664 review — drop dead gateway field, gate pin on mode Per Bohan-J's review: - Remove the dead ExecOptions.OpenclawGateway field (+ its String/MarshalJSON and the daemon.go construction block). It carried the plaintext bearer token but was never read — buildOpenclawArgs only consumes OpenclawMode and the live gateway path runs through execenv.OpenclawGatewayPin — so this narrows the secret's footprint. - Gate the gateway pin on mode=="gateway" in decodeOpenclawRuntimeConfig: a {"mode":"local","gateway":{...,"token"}} payload no longer writes the token into the 0o600 per-task wrapper that --local makes openclaw ignore. - Warn on an unrecognized non-empty mode (e.g. "gatway") instead of silently falling back to local. - Run preserveMaskedGatewayToken in CreateAgent too, so a literal "***" at create time can't persist as a real bearer token. - Document the gateway host:port trust boundary (SSRF note for shared daemon hosts). Adds regression tests for the local-mode pin drop and the unknown-mode warning.	2026-06-13 15:33:28 +08:00
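The mask-and-restore round trip for gateway.token can be sketched as below. This is an illustration under stated assumptions: the `gatewayConfig` type and both function names are hypothetical, and the sentinel value `"***"` is taken from the review-fix text above rather than confirmed against the handler.

```go
package main

import "fmt"

// maskedToken is the sentinel shown to clients in place of the real token.
// The exact value is an assumption based on the commit message.
const maskedToken = "***"

// gatewayConfig is a hypothetical stand-in for the stored gateway settings.
type gatewayConfig struct {
	Host  string
	Port  int
	Token string
	TLS   bool
}

// maskForResponse returns a copy that is safe to send on GET responses.
func maskForResponse(cfg gatewayConfig) gatewayConfig {
	if cfg.Token != "" {
		cfg.Token = maskedToken
	}
	return cfg
}

// preserveMaskedToken handles a create or update payload. If the client
// replayed the sentinel, the persisted token is restored; with nothing
// persisted (create), the sentinel is dropped so it can never be stored as a
// real bearer token. Any other value is treated as a rotation.
func preserveMaskedToken(incoming gatewayConfig, persisted *gatewayConfig) gatewayConfig {
	if incoming.Token != maskedToken {
		return incoming
	}
	if persisted != nil {
		incoming.Token = persisted.Token
	} else {
		incoming.Token = ""
	}
	return incoming
}

func main() {
	stored := gatewayConfig{Host: "gw.internal", Port: 18789, Token: "s3cret"}
	sent := maskForResponse(stored)               // what a GET returns
	saved := preserveMaskedToken(sent, &stored)   // client PATCHes it back unchanged
	fmt.Println(sent.Token, saved.Token == stored.Token)
}
```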
Bohan Jiang	f415099c4a	MUL-3263: support managed MCP config for Cursor (#4081) * feat: support managed MCP config for Cursor Co-authored-by: multica-agent <github@multica.ai> * fix: address Cursor MCP review feedback Co-authored-by: multica-agent <github@multica.ai> * docs: include Cursor in skills MCP support Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-13 02:07:00 +08:00
Bohan Jiang	c8ab73d38d	MUL-3244: Bind quick-create attachments to created issues (#4062) * fix: bind quick-create attachments to created issues Co-authored-by: multica-agent <github@multica.ai> * test: use real image markdown in quick-create attachment test Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-12 16:45:38 +08:00
Liu Guanzhong	4594c776e1	feat(agent): add CodeBuddy as first-class CLI backend (#3186 ) * feat(agent): add codebuddyBackend struct and buildCodebuddyArgs Introduces the codebuddy agent backend skeleton with args builder that mirrors claudeBackend's protocol flags (stream-json, bypass permissions, blocked args filtering) for the codebuddy CLI fork. * feat(agent): implement codebuddyBackend.Execute with stream-json parsing * feat(agent): wire codebuddy into New() factory and launchHeaders * feat(agent): add codebuddy dynamic model discovery from --help * feat(agent): add codebuddy thinking/effort discovery and providerThinkingEnums * feat(daemon): add codebuddy CLI probe, env vars, and args support * fix(agent): use len(models)==0 for default model instead of loop index * fix(agent): increase codebuddy --help timeout to 35s for slow CLI startup * fix(agent): address codebuddy PR review feedback - Wire codebuddy into execenv: reuse claude's CLAUDE.md, .claude/skills, and ~/.claude/skills paths since CodeBuddy is a Claude Code fork - Replace hardcoded 20-min timeout with runContext for zero-timeout = no-deadline semantics matching all other backends - Restore runContext regression tests lost in rebase merge - Mirror claude.go execution model: concurrent stdin write to prevent pipe deadlock, sync.Once for stdin closure, keep stdin open for control_request auto-approval mid-run - Add control_request handling with auto-approve behavior - Add RequestID/Request fields to codebuddySDKMessage - Add codebuddy to metrics knownRuntimeProviders - Add codebuddy to provider-logo.tsx (reuses ClaudeLogo) - Consolidate --help discovery: shared codebuddyHelpOutput cache eliminates duplicate cold-start invocations --------- Co-authored-by: krislliu <krislliu@tencent.com>	2026-06-12 15:22:16 +08:00
Multica Eve	9439a85aa6	MUL-3242: fix daemon workdir provisioning race Fixes GitHub issue #3999 by moving the daemon StartTask transition behind workdir provisioning and extending the active env-root guard through completion metadata writes.	2026-06-12 15:14:27 +08:00
Bohan Jiang	8151f60c6c	fix(daemon): drop stale resume session when workdir is not reused (#4027) CLI backends key their session stores to the cwd (Claude Code looks sessions up under ~/.claude/projects/<encoded-cwd>/), so a prior session id can only resolve when the task runs in the exact workdir the session was recorded against. When the prior workdir no longer exists (GC'd after the issue went done, daemon reinstall, manual cleanup), execenv.Reuse falls back to a fresh Prepare but the stale session id was still passed to the backend: claude exited within a second and the run failed before doing any work — permanently, because the failed run records no session_id and the next claim serves the same stale pointer again. Gate ResumeSessionID on the workdir actually being reused, and correct PriorSessionResumed so the runtime brief uses the cold-path wording when the session is dropped. Fixes multica-ai/multica#3854 (MUL-3221) Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-11 13:07:44 +08:00
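The gating described in 8151f60c6c can be illustrated with a minimal sketch. It assumes the preparation step reports whether the prior workdir was reused; the `prepareResult` type and the `resumeSessionID` helper are hypothetical names for illustration, not the daemon's actual code.

```go
package main

import "fmt"

// prepareResult is a hypothetical summary of workdir preparation.
type prepareResult struct {
	Workdir string
	Reused  bool // true only when the prior workdir still existed and was reused
}

// resumeSessionID returns the session id to pass to the backend.
// A prior session can only resolve from the workdir it was recorded
// against, so it is dropped whenever the workdir was freshly prepared.
func resumeSessionID(prior string, res prepareResult) string {
	if prior == "" || !res.Reused {
		return ""
	}
	return prior
}

func main() {
	reused := prepareResult{Workdir: "/work/task-a", Reused: true}
	fresh := prepareResult{Workdir: "/work/task-b", Reused: false}

	fmt.Printf("%q\n", resumeSessionID("sess-1", reused)) // prints "sess-1"
	fmt.Printf("%q\n", resumeSessionID("sess-1", fresh))  // prints ""
}
```

Returning an empty id on the fresh path lets the caller fall through to a cold start instead of failing on a session the backend cannot find.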
Bohan Jiang	ac75c97797	fix(desktop): disable auto-start/stop toggles for a daemon the app can't control (WSL2) (#3940 ) * feat(daemon): report OS in /health response The desktop app reads daemon liveness over HTTP but starts/stops it via the native CLI, which acts on the host process namespace. On Windows with the daemon in WSL2, /health is reachable via localhost forwarding yet the daemon's process is unreachable — so the app needs a signal to tell a daemon it manages from one it merely sees. Expose runtime.GOOS as `os` so the desktop can compare it against its own host OS. MUL-3154, #3916 Co-authored-by: multica-agent <github@multica.ai> * fix(desktop): disable auto-start/stop for an unmanageable daemon When the daemon runs in an environment the app can't drive — e.g. Linux in WSL2 behind a Windows desktop, reachable only via localhost forwarding — the Auto-start/Auto-stop toggles silently did nothing: the lifecycle CLI acts on the host process namespace and never reaches the daemon's PID. Detect it by comparing the daemon's reported OS (new /health `os` field) against the host OS, and only when a daemon is actually running. When they differ: disable both toggles with an explanatory note, skip the version-match restart on auto-start, and skip the no-op stop on quit. Fails safe — a missing `os` (older daemon) or a matching OS keeps the toggles live, so native Mac/Windows/Linux daemons are unaffected. MUL-3154, #3916 Co-authored-by: multica-agent <github@multica.ai> * fix(desktop): centralize externally-managed guard at the lifecycle boundary Review follow-up. The first cut only disabled the Settings toggles, but the same unmanageable daemon (WSL2 etc.) could still be Stop/Restart-ed from the Runtime card and from automatic lifecycle entries (logout, user switch, reauth, first-workspace restart) — each of which would shell out to a native CLI that can't reach the daemon's process. 
Move the guard into the main-process lifecycle functions so every entry point is covered by construction: stopDaemon() and restartDaemon() no-op for an externally-managed daemon, and ensureRunningDaemonVersionMatches() treats it as up-to-date (no misleading restart). The per-branch checks in the auto-start handler and before-quit are removed — the boundary now covers them. The Runtime card hides Stop/Restart and shows a 'Managed outside the app' hint, mirroring the Settings tab. Adds a component test for the card's two states. MUL-3154, #3916 Co-authored-by: multica-agent <github@multica.ai> * fix(desktop): preflight the lifecycle guard against live /health Review follow-up. The guard read a cached lastExternallyManaged, which only fetchHealth() updates — but not every lifecycle entry polls before calling stop/restart. syncToken()'s user-switch branch calls restartDaemon() directly after its own fetchHealthAtPort(), without refreshing the cache; on a fresh launch / account switch (no poll yet) the cache is still the initial false, so restartDaemon() would shell out to the native CLI and hit the very WSL/native PID-namespace problem this PR avoids. Make stopDaemon()/restartDaemon() preflight against a live /health read each call instead of trusting the poll cache. The decision is extracted to a pure daemonLifecycleUnreachable(readDaemonOS, hostOS) so a unit test can prove the live value (not a cache) drives it. lastExternallyManaged is removed — the UI already reads the per-status externallyManaged field, so it had no other consumer. MUL-3154, #3916 Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-10 12:27:41 +08:00
Chenyu-24601	8b94764c47	feat(daemon): configurable OpenClaw binary path / state dir via CLIConfig.Backends (MUL-3157) Summary: - Add CLI config schema for OpenClaw backend binary path and state dir overrides. - Apply those overrides during daemon LoadConfig using the existing env-var based probe/spawn path. - Cover backward compatibility, precedence, partial overrides, and fail-soft config loading. Verification: - go test ./internal/cli ./internal/daemon - go vet ./internal/cli ./internal/daemon - GitHub CI passed	2026-06-09 14:05:37 +08:00
Bohan Jiang	24b162cdbc	feat(daemon): surface the real task initiator to the agent runtime (MUL-2645) (#3899 ) * feat(daemon): surface the real task initiator to the agent runtime (MUL-2645) In a multi-person workspace the agent runtime only ever saw the runtime OWNER identity: the brief's `## Requesting User` is sourced from runtime.OwnerID and the task-scoped token is owner-bound, so every requester (whoever commented, @mentioned, or chatted) appeared to the agent as the owner. Agents that route by initiator for permission, privacy, or audit all misjudged. Resolve the real task initiator at claim time and surface it distinctly from the owner: - comment / mention trigger -> triggering comment's author (member or agent) - chat task -> chat session creator (sessions are creator-only) - on-assign / autopilot / quick-create -> no attributable initiator (omitted) Adds initiator_{type,id,name,email} to the claim response, the daemon Task, and TaskContextForEnv, rendered into the brief as a new `## Task Initiator` section. The section documents the privacy boundary: the agent's credentials stay owner-scoped, so this is an attested identity for the agent's own routing/privacy logic, not act-as. No DB migration — both paths are derivable from existing rows. Tests: brief rendering (member/agent/omit/sanitize) + email guard unit tests, and claim-handler tests for the comment and chat paths. Co-authored-by: multica-agent <github@multica.ai> * fix(chat): store real sender as task initiator, not chat_session creator (MUL-2645) Review fix (Niko, PR #3899). v1 resolved the chat task initiator from chat_session.creator_id at claim time. That is correct for web chat and Lark p2p (creator == sender), but WRONG for Lark group chats: the group session creator is deliberately the installer (stable identity across member churn), not the message sender. 
So in a Lark group, every member who triggered the agent showed up in the brief as the installer/owner — the exact bug this issue is about, still live at that entry point. Capture the real sender at enqueue time instead of deriving it from the session creator at claim time: - migration 117: agent_task_queue.initiator_user_id (FK user, ON DELETE SET NULL); NULL for non-chat and pre-migration rows. - EnqueueChatTask now takes an explicit initiatorUserID. Web chat passes the authenticated request user; the Lark dispatcher threads the inbound sender (binding.MulticaUserID) through scheduleRun -> flushChatRun. The debouncer keeps the latest scheduled flush per session, so in a multi- sender silence window the LATEST sender wins (documented + tested). - claim handler resolves the initiator from task.initiator_user_id and drops the creator_id fallback entirely. The Lark group session creator stays the installer (unchanged) — only the task initiator is corrected, keeping the two concepts cleanly separate. Tests: dispatcher group regression (initiator = sender, not installer), latest-sender-wins, p2p initiator assertion; the chat claim handler test now sets creator != initiator and asserts the stored sender wins. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-08 19:29:57 +08:00
liujianqiang-niu	5be7d1bc17	MUL-3136 fix(openclaw): parse config path from last non-empty line of CLI output Fix OpenClaw config discovery when `openclaw config file` prints Doctor warning UI before the actual config path. The daemon now uses the last non-empty stdout line as the path while preserving the existing tilde expansion, absolute-path validation, stat checks, and fail-closed behavior. Tests: go test ./internal/daemon/execenv	2026-06-08 17:22:02 +08:00
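The parsing rule in 5be7d1bc17 (take the last non-empty stdout line as the path) might look roughly like the sketch below. The helper name `lastNonEmptyLine` and the sample output are assumptions for illustration; the tilde expansion, absolute-path validation, and stat checks mentioned in the commit are not shown.

```go
package main

import (
	"fmt"
	"strings"
)

// lastNonEmptyLine returns the last line of out that is not blank after
// trimming whitespace, or "" when every line is blank.
func lastNonEmptyLine(out string) string {
	lines := strings.Split(out, "\n")
	for i := len(lines) - 1; i >= 0; i-- {
		if s := strings.TrimSpace(lines[i]); s != "" {
			return s
		}
	}
	return ""
}

func main() {
	// Hypothetical CLI output: warning text first, then the path.
	out := "Warning: something needs attention\n\n~/.openclaw/openclaw.json\n\n"
	fmt.Println(lastNonEmptyLine(out)) // prints ~/.openclaw/openclaw.json
}
```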
Bohan Jiang	1ddf89a8f2	feat(daemon): enable Antigravity (agy) per-agent model selection (MUL-3125) (#3894 ) * feat(daemon): wire agy --model and model discovery for Antigravity agy 1.0.6 added a --model flag and an `agy models` catalog command, which were the #1 blocker in the earlier agy-backend review (MUL-3125). The antigravity backend already shipped but deliberately dropped opts.Model because agy 1.0.1 had no way to select a model. - buildAntigravityArgs now passes --model <display name> when opts.Model is set; the value is the exact `agy models` display string (spaces + parens), passed as a single exec arg so no shell quoting is needed. - Block --model in custom_args so it can't override the managed value. - ListModels("antigravity") enumerates via `agy models` (no static fallback: agy silently no-ops on unrecognised models, so a stale guess would turn a typo into a successful empty run). - ModelSelectionSupported now returns true for every built-in provider; the hook stays for any future model-less runtime. - Daemon probe reads MULTICA_ANTIGRAVITY_MODEL for the daemon-wide default. Co-authored-by: multica-agent <github@multica.ai> * docs(providers): mark Antigravity model selection as supported Antigravity gained --model in agy 1.0.6 (MUL-3125). Update the provider matrix + prose (en/zh/ja/ko) from "managed internally / no --model" to dynamic discovery via `agy models`, and refresh the now-stale picker comments. Flag the display-string (not slug) shape and agy's silent no-op on unrecognised values. Co-authored-by: multica-agent <github@multica.ai> * fix(daemon): reject unknown Antigravity model at spawn (MUL-3125) agy exits 0 with empty output on an unrecognised --model, so a stale/typo'd value would surface as a 'completed' but empty task. Validate opts.Model against the `agy models` catalog in Execute before spawning: a non-empty model the CLI does not advertise fails fast with an actionable error listing the real choices. 
opts.Model is the single funnel for agent.model and the MULTICA_ANTIGRAVITY_MODEL default, so this one check covers every source (UI free-text, API, persisted value, env) — addressing Elon's review that a UI-only guard is bypassable. Validation is fail-OPEN: if the catalog can't be discovered we pass the value through and let agy resolve it, so a discovery hiccup never blocks a run. Pure antigravityModelError() is unit-tested (valid / unknown / near-miss / empty-model / empty-catalog); verified live against real agy 1.0.6. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-08 15:32:53 +08:00
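A fail-open catalog check of the kind described in 1ddf89a8f2 could be sketched as follows. This is an assumption-laden illustration: `modelError` is a hypothetical stand-in for the commit's antigravityModelError, and the model names are placeholders rather than real `agy models` output.

```go
package main

import (
	"fmt"
	"strings"
)

// modelError returns nil when the model is acceptable and an actionable
// error when a non-empty model is missing from a non-empty catalog.
// It fails open: an empty catalog (discovery failed) accepts any value.
func modelError(model string, catalog []string) error {
	if model == "" || len(catalog) == 0 {
		return nil
	}
	for _, m := range catalog {
		if m == model {
			return nil
		}
	}
	return fmt.Errorf("unknown model %q; available: %s",
		model, strings.Join(catalog, ", "))
}

func main() {
	catalog := []string{"Model A (Fast)", "Model B (Thinking)"} // placeholder names

	fmt.Println(modelError("Model A (Fast)", catalog)) // <nil>
	fmt.Println(modelError("Model A", catalog))        // near miss is rejected
	fmt.Println(modelError("anything", nil))           // <nil>, fail-open
}
```

The match is an exact string comparison because the commit notes the value is the full display string, spaces and parentheses included.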
0xMomo	ef75f80d9d	fix(daemon): clean stale agent branches during repo gc (MUL-2550) (#3039) * fix(daemon): clean up stale agent branches Co-authored-by: multica-agent <github@multica.ai> * fix(daemon): serialize bare repo gc Co-authored-by: multica-agent <github@multica.ai> * test(daemon): adapt health repo cache mock Co-authored-by: multica-agent <github@multica.ai> * fix(daemon): gate gc maintenance on stale-branch deletion Address review feedback on the bare-repo GC change: - Only run `reflog expire` + `git gc --prune=30.days` when we actually deleted a stale agent branch this cycle. Previously the heavy step ran every GC tick on every cached repo even when there was nothing to reclaim, turning a stale-ref cleanup into a periodic full-repo maintenance job under the per-repo lock. - Split git command timeouts: `gc --prune=30.days` now gets a 10-minute budget instead of sharing the 30s ceiling that was scoped for the original `worktree prune` call. Light commands stay at 30s. - Drop the redundant `gc --auto` — `gc --prune=30.days` already performs the maintenance `gc --auto` would have triggered. - Narrow the agent-namespace ref query from `refs/heads/agent` to `refs/heads/agent/` so the pattern can't surface a literal `agent` branch outside the daemon namespace. Tests: - New TestPruneWorktree_IgnoresLiteralAgentBranch pins the trailing-slash narrowing. - New TestPruneWorktree_SkipsMaintenanceWhenNothingDeleted uses an unreachable, backdated loose object as a sentinel to verify that `gc --prune` runs only when a stale agent branch was reaped. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: 0xNini Code Dev <agent@multica.local> Co-authored-by: multica-agent <github@multica.ai> Co-authored-by: 0xNini <0xnini@iMac-Pro.local> Co-authored-by: J <j@multica.ai>	2026-06-08 15:25:14 +08:00
Bohan Jiang	3808049361	fix(codex): set semantic thread names (#3887) Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-08 14:53:31 +08:00
NanamiKite	2e34016f1f	fix(daemon): interrupt local agent on server-side terminal task states (#3878) shouldInterruptAgent now treats every terminal task status (completed/failed/cancelled, via isAgentTaskTerminal) plus a 404 task-not-found as an interruption signal, so the daemon stops a local agent once the backend has finalized the task — e.g. the runtime offline sweeper flipping running -> failed during a disconnect/reconnect. Previously only `cancelled`/404 interrupted, so the agent ran to completion and its CompleteTask call failed against a non-running row, wasting compute and adding log noise. Closes #3877	2026-06-08 14:00:30 +08:00
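The interruption rule from 2e34016f1f can be sketched as a pair of pure functions. The names `isTerminal` and `shouldInterrupt` and their signatures are hypothetical simplifications of the commit's isAgentTaskTerminal and shouldInterruptAgent; only the three terminal statuses and the 404 case come from the commit message.

```go
package main

import "fmt"

// isTerminal reports whether the backend has finalized the task.
func isTerminal(status string) bool {
	switch status {
	case "completed", "failed", "cancelled":
		return true
	}
	return false
}

// shouldInterrupt reports whether the local agent should be stopped:
// either the task no longer exists (404) or it reached a terminal state.
func shouldInterrupt(status string, notFound bool) bool {
	return notFound || isTerminal(status)
}

func main() {
	fmt.Println(shouldInterrupt("running", false))   // false
	fmt.Println(shouldInterrupt("failed", false))    // true
	fmt.Println(shouldInterrupt("completed", false)) // true
	fmt.Println(shouldInterrupt("", true))           // true
}
```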
HMYDK	4190de3d64	fix(skills): quote description values in built-in SKILL.md YAML frontmatter (#3852) Built-in SKILL.md description values contained unquoted ': ' sequences, which strict YAML parsers (e.g. Codex) reject — silently dropping the skill at load. - Quote all eight built-in skill descriptions. - ensureSkillFrontmatter() re-synthesizes frontmatter that has a name but fails YAML validation, so malformed imports are repaired instead of dropped. - Unify frontmatter delimiter parsing into a single frontmatterParts helper. - Add strict-YAML regression tests over the built-in skills, plus unit tests for the recovery branch and delimiter variants. Closes #3851.	2026-06-08 13:10:24 +08:00
Wes	5e7587ad07	Optimize daemon runtime wakeups (#3859)	2026-06-08 12:51:13 +08:00
Xinmin Zeng	270d177475	fix: broken "Add a computer" command on Multica Cloud + two CLI amplifiers (MUL-3087) (#3817 ) * fix(server): recognize official cloud by frontend host in daemon setup config The 'Add a computer' dialog builds its command from /api/config's daemon_server_url/daemon_app_url, falling back to 'multica setup' when both are empty. The official cloud is meant to omit them, but the omission only fired when MULTICA_PUBLIC_URL=https://api.multica.ai. When that env is unset the server URL defaults to the frontend origin and the old guard (which required serverURL host == api.multica.ai) didn't match, so the dialog emitted 'multica setup self-host --server-url https://multica.ai' — pointing the daemon backend at the frontend (no /health, no WebSocket proxy). Identify the official cloud by its frontend host alone (multica.ai / app.multica.ai) so a missing or misconfigured MULTICA_PUBLIC_URL can no longer leak the broken self-host command. Regression from #3474. * fix(cli): probe before persisting self-host config to preserve auth on failure setup self-host wrote a fresh CLIConfig{ServerURL, AppURL} (a full overwrite that drops the saved token) and only then probed the server, returning early on failure. A failed probe therefore logged the user out and left them unconnected, with no recovery in the same command. Probe first via persistSelfHostConfigIfReachable: an unreachable server leaves the existing config — and its token — untouched (failed setup = no-op). The prober is injected so both branches are unit-tested. * fix(daemon): serve health before preflight so daemon start readiness is accurate The CLI's 'daemon start' polls the health endpoint for 15s expecting status=running, but the daemon only began serving health after preflightAuth, whose initial workspace sync detects every configured agent's version by exec'ing it (~20s cold with 8 agents). Health served too late, so a perfectly healthy daemon printed 'may not have started successfully'. 
Start the health server right after resolveAuth (which still fails fast on a missing token) and before the slow preflight, so readiness reflects the daemon core being up rather than agent-version detection finishing. * fix(daemon): gate /health readiness so daemon start can't report a false start Serving health before preflightAuth fixed the false-negative (a healthy daemon printed "may not have started"), but health still returned status:"running" unconditionally — before preflight (PAT renew + workspace sync + runtime registration) had completed. `daemon start` and the desktop treat "running" as ready, so a slow or failing preflight could be misreported as a started daemon: setup prints "connected", then the process exits or hangs in agent-version detection with no runtime registered. That is harder to diagnose than the original false-negative. Split liveness from readiness: bind/serve the health port early (so callers see a live "starting" daemon instead of connection-refused), but report status:"starting" until d.ready is set after preflight, then "running". - daemon.go: add d.ready (atomic.Bool); set it true after the background loops launch, before pollLoop. - health.go: healthHandler reports "starting" until ready, else "running". - cmd_daemon.go: `daemon start` waits for "running" with a deadline raised to 45s (covers cold-start agent detection) and a clearer "still starting" message; new daemonAlive() helper treats both "running" and "starting" as a live daemon, so the already-running guard, restart, and stop act on a starting daemon and don't double-spawn or race its listener; `daemon status` shows "starting" distinctly. Older CLIs/desktop that only know "running" safely treat "starting" as not-ready (status != "running"), so no boundary break. Tests: health reports starting-then-running; daemonAlive truth table. 
Co-authored-by: multica-agent <github@multica.ai> * fix(desktop): handle daemon "starting" health status in lifecycle The daemon now reports /health status:"starting" until preflight completes (liveness/readiness split). That made "starting" a new external contract of /health, but the Desktop daemon-manager only knew "running", so the readiness fix would have moved the CLI's false-negative into a Desktop start regression: - `daemon start` now blocks up to 45s waiting for readiness, but the Desktop spawned it via execFile({ timeout: 20_000 }). On a cold start (the ~20s agent detection this PR targets) Electron killed the CLI supervisor at 20s and reported a start failure, even though the detached daemon child kept booting — the UI flashed "stopped" then "running". Raise the timeout to 60s (must exceed the CLI's 45s startupTimeout). - The Desktop treated only raw status === "running" as a live daemon, so a daemon that was still "starting" (booting on its own or started via the CLI) showed as "stopped", and startDaemon() would spawn a second one — which the new CLI rejects as "already running", surfacing as a start error. Add daemonStatusAlive() (shared, pure, unit-tested) mirroring the Go daemonAlive() and use it for liveness: fetchHealth() surfaces a daemon-reported "starting" as state "starting" regardless of our own currentState; startDaemon()'s already-running guard and the restart-on-user-switch guard treat "starting" as an existing daemon. version-decision stays gated on "running" (readiness, not liveness) — unchanged. Verified: desktop typecheck, eslint, full vitest suite (193 tests) all pass. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-05 17:01:23 +08:00
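The liveness/readiness split in 270d177475 can be sketched in a few lines. The `ready` field as an atomic.Bool and the "starting"/"running" statuses follow the commit message; the `daemon` struct shape, `healthStatus`, and `daemonReady` are hypothetical simplifications, and the HTTP handler itself is omitted. `atomic.Bool` requires Go 1.19 or later.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// daemon holds only the readiness flag for this sketch.
type daemon struct {
	ready atomic.Bool // set to true once preflight has completed
}

// healthStatus is what a health handler would report.
func (d *daemon) healthStatus() string {
	if d.ready.Load() {
		return "running"
	}
	return "starting"
}

// daemonAlive treats both statuses as a live process (liveness).
func daemonAlive(status string) bool {
	return status == "running" || status == "starting"
}

// daemonReady is true only once preflight has finished (readiness).
func daemonReady(status string) bool {
	return status == "running"
}

func main() {
	d := &daemon{}
	s := d.healthStatus()
	fmt.Println(s, daemonAlive(s), daemonReady(s)) // starting true false

	d.ready.Store(true)
	s = d.healthStatus()
	fmt.Println(s, daemonAlive(s), daemonReady(s)) // running true true
}
```

Keeping the two predicates separate matches the commit's point that already-running guards should use liveness while start-up waits and version decisions should use readiness.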
Bohan Jiang	3708fb0f07	fix(daemon): inactivity-based agent run timeout, no wall-clock guillotine (MUL-3064) Active long-running sessions are no longer killed by a fixed wall-clock deadline. Liveness is delegated to the idle watchdog (MULTICA_AGENT_IDLE_WATCHDOG, default 30m) with a larger in-flight-tool budget (MULTICA_AGENT_TOOL_WATCHDOG, default 2h). MULTICA_AGENT_TIMEOUT is an opt-in absolute cap (default 0 = no cap). The server-side 2.5h sweeper is unchanged as a coarse backstop. Fixes #3745.	2026-06-05 15:06:07 +08:00
Multica Eve	63b847ee48	Honor agent identity in assignment workflow (#3802) Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: multica-agent <github@multica.ai>	2026-06-05 13:45:43 +08:00
Naiyuan Qing	b9334dd59f	fix: anchor comment triggers to thread roots (#3746) Co-authored-by: multica-agent <github@multica.ai>	2026-06-04 13:47:05 +08:00

345 Commits