Commit Graph

159 Commits

Author SHA1 Message Date
J
18b338ba1a fix: remove gemini cli runtime
Co-authored-by: multica-agent <github@multica.ai>
2026-06-24 14:44:05 +08:00
Multica Eve
8ad673fdb7 MUL-3560: gate slim runtime brief behind runtime_brief_slim feature flag (#4449)
The MUL-3560 slim runtime brief — kind-driven dispatcher, per-section
gating, prose compression for ~7k chars saved on the typical
comment-triggered task — now ships behind the `runtime_brief_slim`
feature flag wired via the framework-level service from MUL-3615.

Default: OFF in every environment (production stays on the legacy
brief that has shipped for ~2 years). Staging opts in via the YAML
rule set; ops can override per-process with `FF_RUNTIME_BRIEF_SLIM=true`.
Production is held back until staging has burned in long enough that
we are confident the slim brief does not regress agent behaviour.

Architecture (one toggle point, two code paths, both fully tested):

  buildMetaSkillContent (runtime_config.go)
      │
      └─ useSlimBrief() → false (default)
      │     → fall through to the legacy verbose body that ships on
      │       main today — byte-for-byte unchanged, no migration risk
      │
      └─ useSlimBrief() → true
            → buildMetaSkillContentSlim (runtime_config_sections.go)
              → classifyTask → 5-way kind switch → per-section writers

BuildCommentReplyInstructions takes the same gate, so the per-turn
comment prompt and the runtime brief stay in sync on which template
they emit.

What's in this PR:

- runtime_config_flag.go (new): package-scope `runtimeFlags` atomic
  pointer + `SetFeatureFlags` setter + `useSlimBrief` toggle point.
  Nil-safe: a daemon that forgets to wire the service falls back to
  legacy, no panic.
- runtime_config_kind.go (new): `taskKind` enum + `classifyTask` +
  `hasIssueContext` predicate. Used only by the slim path.
- runtime_config_sections.go (new): the slim brief itself —
  `buildMetaSkillContentSlim` + per-section `writeXxx` helpers
  + `writeAvailableCommandsQuickCreate` minimal variant +
  `writeBackgroundTaskSafetySlim` compressed safety section. The
  Section × Kind matrix is documented inline on
  `buildMetaSkillContentSlim` and the test below checks the
  dispatcher does not diverge from the spec.
- reply_instructions.go: `BuildCommentReplyInstructions` gains a
  short slim-or-legacy prelude; new `buildCommentReplyInstructionsSlim`
  is the compressed cookbook (defers the shell-hazard rationale to
  `## Comment Formatting`).
- runtime_config.go: `buildMetaSkillContent` gains a 2-line
  dispatcher at the top; the legacy body is otherwise untouched.
- runtime_config_kind_test.go (new): canaries for both paths.
  - TestClassifyTask: 5 kinds + 3 tiebreak cases.
  - TestTaskKindHasIssueContext: predicate semantics.
  - TestSlimFlagOffUsesLegacy: nil flag service → legacy path
    (renders "Get full issue details.", a legacy-only substring).
  - TestSlimFlagOnUsesSlim: flag on → slim path (renders "full
    issue.", a slim-only one-liner) AND must NOT render legacy
    "Get full issue details.".
  - TestBuildMetaSkillContentSlimKindMatrix: locks the per-kind
    section set; heading match is line-anchored so inline references
    don't trip absence assertions.
  - TestSlimQuickCreateAvailableCommands: locks the minimal-variant
    content for quick-create (issue create present, every other
    Core command absent).
  - TestSlimBriefIsSubstantiallyShorter: ≥ 30% reduction guard so
    a future change can't accidentally re-bloat the slim path back
    to legacy levels.
- cmd/server/main.go: now calls `execenv.SetFeatureFlags(flags)`
  immediately after constructing the feature flag service.

Measured impact (slim vs legacy, claude provider, realistic fixture
with 2 repos + 2 skills + member initiator):

    legacy = 19567 chars
    slim   = 11868 chars    Δ = -7699  (-39.3%)

Verification:

- go vet ./internal/daemon/... ./cmd/server/...                  ok
- go test ./internal/daemon/...                                  ok
- go test ./pkg/featureflag/...                                  ok
- TestSlimBriefIsSubstantiallyShorter logs the 39.3% ratio
- TestSlimFlagOffUsesLegacy + TestSlimFlagOnUsesSlim pass both
  directions, so the dispatcher is locked in code.

The pre-existing `internal/handler` test failures
(TestLeaveWorkspace_RevokesOwnRuntimes,
TestDeleteMember_CancelsTasksFromAgentReassignment,
TestDeleteMember_NoRuntimes_DeletesMember) reproduce on plain
`origin/main` with the same `relation "channel_user_binding" does
not exist` SQL error — they are a missing-migration bug from the
recent channels foundation PR (ce28d0aa0), not anything this PR
touched.

Rollout plan:

1. Merge this PR. Production daemons keep emitting the legacy brief
   (flag default false).
2. Add a YAML rule to staging's
   `MULTICA_FEATURE_FLAGS_FILE`:

       runtime_brief_slim:
         default: true

   Staging daemons start emitting the slim brief on next restart.
3. Watch `agent prompt prepared` logs + agent behaviour for 7 days.
4. If staging is clean, flip the prod YAML to `default: true`.
   Legacy code path stays in the binary as a kill-switch
   (`FF_RUNTIME_BRIEF_SLIM=false` to revert without a deploy).
5. After ~30 days clean in prod, follow up with a PR that deletes
   the legacy body and the flag — same pattern as docs/feature-flags.md
   recommends ("plan the death of the flag at birth").

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-24 14:23:17 +08:00
Naiyuan Qing
b79777caec feat(comments): resolve-aware fold for agent comment reads (MUL-3555) (#4463)
* feat(comments): resolve-aware fold for agent comment reads (MUL-3555)

Agents reading a long issue paid tokens for settled discussion. The human
timeline already folds resolved threads, but the agent read path
(`comment list`) ignored resolved_at entirely — humans saw the conclusion,
agents got the full raw discussion.

Add an opt-in `fold=true` projection to ListComments that collapses each
resolved thread to root + conclusion (reply-resolved) or root only
(root-resolved), reusing the human timeline's deriveThreadResolution
semantics. The resolved thread's root carries `thread_resolved` +
`folded_count`; `--full` brings the dropped comments back. Fold is rejected
on partial-thread reads (since/tail) and roots_only, where a resolution
comment could be unfetched and silently dropped.

CLI `comment list` folds by default on the complete-thread reads (default,
--recent, untailed --thread) with a `--full` escape hatch; the agent
prompts and runtime brief document the fold + escape. No new endpoint, no
human UI change, no SQL/migration change — in-memory projection, same
precedent as summary/roots_only.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* refactor(daemon): dedupe fold prompt restatements per review (MUL-3555)

Howard's PR review flagged DRY redundancy: the resolve-fold rule was
restated in full in the task prompt (prompt.go:41/:182) and the brief
workflow steps (runtime_config.go:673/:692, reply_instructions cold
hint) even though the canonical command catalog (runtime_config.go:477)
— always present in the brief — already documents it in full, and the
task prompt explicitly defers to it ("follow the rule in your runtime
workflow file").

Keep the catalog entry full (the canonical reference); shrink the five
inline restatements to a short "resolved threads come back folded —
`--full` to expand" pointer. No loss of signal (the agent always has the
full catalog in context), ~80-120 tokens/run saved on the worst-case
assignment / cold paths.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-24 09:52:18 +08:00
Naiyuan Qing
4ab335b8a5 MUL-3416: Issue pre-trigger preview + Handoff Note (#4383)
* feat(issues): unify run-enqueue decision behind WillEnqueueRun + preview endpoint

Collapse the issue update/batch enqueue copies into one service predicate
service.IssueService.WillEnqueueRun, shared verbatim with a new dry-run
endpoint POST /api/issues/preview-trigger so the four entry points stop
drifting (squad/self-loop/batch omissions, MUL-3375). The private-agent gate
stays at the HTTP boundary: write paths inject allow-all, preview injects the
real gate so it never leaks a private agent's readiness.

Add suppress_run to issue update/batch: the change applies but no run starts.
Remove the now-dead handler mirrors shouldEnqueueSquadLeaderOnAssign /
isSquadLeaderReady. service.Create and the comment trigger chain are untouched.

Tests: preview behavior, preview<->write-path match, batch aggregation,
member no-trigger, suppress_run skip, malformed-body 400.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* feat(issues): inject handoff note into assigned runs via first-class task field

Add an optional handoff_note carried by issue assign/promote into the run's
opening prompt and issue_context.md, via a dedicated agent_task_queue column
(migration 122) and a daemon assignment-handoff render branch — never a
fabricated comment, never trigger_comment_id (MUL-3375 §6.1).

Thread the note through enqueueIssueTask/enqueueMentionTask + WithHandoff
public variants and dispatchIssueRun; suppress_run or a parked write drops it
(no run = nothing to inject). Soft version gate: MinHandoffCLIVersion +
HandoffSupported, surfaced per-trigger as handoff_supported in the preview so
the UI can gray the note box on old daemons; the assignment never hard-fails.

Tests: daemon prompt + issue_context render via the assignment branch (not
quick-create/comment), version helper matrix, note persists on the task,
suppressed assign enqueues nothing.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* feat(issues): leave a display-only handoff record on the timeline

When an assign/promote with a handoff note starts a run, write one
type='handoff' timeline record via TaskService.RecordHandoff — a direct
Queries.CreateComment + timeline event that bypasses Handler.CreateComment, so
it never reaches triggerTasksForComment and cannot start a second run
(MUL-3375 §6.2, the must-not-retrigger invariant). Author is the actor who
handed off; body is the note. Migration 123 admits the 'handoff' comment type.
Recorded only on a real run start: suppress_run or a parked write writes
nothing. enqueueSquadLeaderTask now reports whether it enqueued so the trace
is gated on an actual dispatch.

Test: exactly one handoff record on assign-with-note, exactly one task (no
re-trigger), and no record when suppressed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* feat(issues): frontend plumbing for issue-trigger preview + handoff (core)

Add api.previewIssueTrigger + IssueTriggerPreviewSchema (zod parseWithFallback),
the use-issue-trigger-preview hook, issueKeys.issueTriggerPreview(+All) with WS
queue-state invalidation, suppress_run/handoff_note on UpdateIssueRequest, the
'handoff' CommentType, and stripping of the control fields from optimistic
update/batch cache patches (MUL-3375 §9).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* fix(issues): exclude handoff records from new-comment counting

type='handoff' is a display-only timeline record, not conversation. Exclude it
from CountNewCommentsSince so a handoff note never inflates the count of
"new comments to catch up on" fed to a claiming agent (MUL-3375 §12). Analytics
already excludes it (RecordHandoff is a direct write that emits no analytics
event), and the comment-trigger path is already bypassed.

Test: a handoff record does not bump the new-comment count; a real comment does.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* feat(issues): pre-trigger preview UI, handoff note, timeline card (web/desktop)

Wire the §9 frontend onto the preview endpoint + handoff fields:
- Delete the backlog blocking dialog (backlog-agent-hint*) and its modal type;
  the over-eager nag is gone. Backlog awareness is now a passive label.
- RunConfirmModal: single assign + batch assign/status route here. Shows the
  backend predicate's verdict ("将启动 @X" / "将启动 N 个" / parked), an optional
  handoff note (assign only, soft-gated by handoff_supported), and 暂不启动 —
  then applies via update/batch. No frontend guessing.
- create modal: passive CreateRunHint ("将启动 @X" / backlog parked).
- single status change stays a direct apply (unchanged).
- timeline: render type='handoff' as a distinct, non-interactive handoff card.
- i18n run_confirm + handoff_card across en/ja/ko/zh-Hans; drop backlog action
  keys; locale parity green.

Tests: use-issue-actions (assign → run-confirm modal, member → direct),
create-issue + comment-card suites updated/green; views typecheck + lint clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* test(issues): use a valid anchor in the handoff count-exclusion test

CountNewCommentsSince filters id <> @anchor_id; SQL id <> NULL is NULL and
excludes every row, so an empty anchor made the control assertion read 0. The
production caller always passes a real anchor — mirror that with a non-matching
sentinel uuid.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* test(issues): RunConfirmModal apply logic (start/suppress/note-gate/batch)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* test(core): preview schema malformed/missing/null fallback coverage

Cover IssueTriggerPreviewSchema via parseWithFallback (MUL-3375): well-formed
parse, top-level + item default fills (empty/older backend), and fallback to
{ triggers: [], total_count: 0 } for malformed shapes, a dropped required
issue_id, a wrong-typed total_count, and null/non-object bodies — so the four
entry points degrade to "nothing will start" instead of throwing.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* refactor(issues): remove display-only handoff timeline record (留痕)

The handoff "留痕" timeline record (type='handoff' comment written on run
start) was judged superfluous and dropped per product call. This removes
only the display-only trace; the handoff NOTE injection into the run's
opening prompt + issue_context.md is untouched.

- backend: drop RecordHandoff + its call in dispatchIssueRun
- db: drop the `type <> 'handoff'` exclusion in CountNewCommentsSince and
  migration 123 (comment_type_check reverts to the 4-type set from 001);
  no production data exists for this unreleased feature
- frontend: drop the "handoff" CommentType, HandoffCard, and handoff_card
  i18n (all locales)
- tests: drop handoff_count_test.go and the record-write assertions in
  issue_trigger_preview_test.go (note-injection tests retained)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* feat(issues): dismissable run-confirm modal + team-handoff copy

Two fixes to the pre-trigger confirm modal (MUL-3375).

1. Dismissable: switch RunConfirmModal from AlertDialog to the standard
   shadcn Dialog so it has the close (X) button + Esc + click-outside.
   Previously the only choices were "start" / "don't start now" with no
   way to abort the action entirely; dismissing now cancels with no write.

2. Copy: rework the action-surface wording away from the backend term
   "run" toward team-handoff voice — 指派 / 开始 / 交接 (run stays only on
   record surfaces). Unifies the note's three names to "交接说明", and
   parallels the rewrite across en/ja/ko.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* chore(agent): bump handoff note min CLI version to 0.3.28

The daemon release that renders handoff notes ships in 0.3.28 (0.3.27
was the prior tag), so move the soft-gate threshold up. Below this the
note is silently dropped and the frontend grays the note box — assignment
is never blocked.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(issues): skip run-confirm when batch-moving issues to backlog

A move into backlog never starts a run (service/issue_trigger.go), so the
pre-trigger confirm modal degenerated to an empty "won't start" box with a
single Apply button — pure friction. Apply directly instead, matching the
single-issue status path. Other target statuses still route through the
modal.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(issues): refine pre-trigger preview hint and copy

- Move the create-issue run hint to a reveal band (grid 0fr→1fr) above the
  property toolbar. It was sharing the footer button row and, lacking a
  width constraint, reflowed the submit buttons whenever it appeared.
  Restyle to a borderless, comment-style avatar+caption that is purely a
  caption (non-interactive avatar).
- Distinguish squad from agent in the pre-trigger copy: a squad's leader
  evaluates and delegates rather than "starting work" itself. Add
  will_start_named_squad / will_start_squad / create_will_start_squad across
  en/zh/ja/ko (reusing the squad_leader_* evaluate→arrange vocabulary) and
  branch run-confirm + the create hint on squad assignees.
- Bold the assignee name in the run-confirm headline via a language-safe
  sentinel split (no per-language prefix/suffix keys).
- Align zh "开始处理" → "开始工作" on the single-assign copy.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(issues): stub ActorAvatar in create-issue suite

CreateRunHint now renders an ActorAvatar for agent/squad assignees, which
pulls in getActorInitials/getActorAvatarUrl + the workspace/presence/navigation
hook tree. This form-focused suite only stubbed getActorName, so the
squad-forwarding test crashed with "getActorInitials is not a function". Stub
the avatar inert — its own behavior is covered elsewhere.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Walt <walt@multica.ai>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-23 13:17:13 +08:00
Bohan Jiang
48b8dbf439 feat(daemon): surface sub-issue stages in the always-on runtime brief (#4426)
Agents creating sub-issues only saw the runtime brief's Sub-issue
Creation section, which taught the manual todo/backlog serial chain and
never mentioned stages — the `--stage` flow was documented only in the
multica-working-on-issues skill, which an agent reads only if it opens
it. So agents defaulted to hand-managed backlog chains and rarely reached
for stages.

- Add an "Ordering with stages" paragraph to the brief's Sub-issue
  Creation section nudging agents to group ordered/waiting sub-issues
  with --stage instead of hand-promoting a backlog chain.
- List --stage on the brief's issue create / update command lines and
  add multica issue children to the Core command list for discoverability.
- Extend the brief test with the new stage assertions.

The Sub-issue Creation section stays gated to issue-bound runs (skipped
for chat/quick-create/autopilot), unconditional on parent_issue_id, and
free of parent-notification guidance — all existing canaries still pass.

MUL-3508

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-23 01:08:10 +08:00
Bohan Jiang
da72e2fa22 feat(daemon): inject project description into the agent brief (MUL-3465) (#4395)
* feat(daemon): inject project description into the agent brief

Issues bound to a project only surfaced the project title in the runtime
brief; the project description (durable, project-wide context the owner
sets) was loaded but dropped. Carry it end-to-end:

- claim handler reads proj.Description onto the response (issue-bound and
  quick-create paths)
- new ProjectDescription field on AgentTaskResponse, daemon Task, and
  TaskContextForEnv
- rendered in the brief's `## Project Context` section and written to
  .multica/project/resources.json as project_description

Empty descriptions render nothing (no extra heading). Updated the
projects-and-resources built-in skill docs in the same change.

MUL-3465

Co-authored-by: multica-agent <github@multica.ai>

* feat(projects): clarify project description is injected as agent context

The project description is now durable context injected into every task's
brief, but the UI still presented it as a plain "Description" field, so
existing descriptions could silently become agent input. Add a hint under
the description editor on the project detail page and in the create-project
modal, in all four locales, stating it is shared with agents as context for
every task in the project. No data-semantics change.

Addresses review feedback on PR #4395. MUL-3465

Co-authored-by: multica-agent <github@multica.ai>

* test(handler): assert project description flows through task claim

The execenv tests cover brief rendering, but nothing pinned the claim
handler boundary where proj.Description is read onto the response. Add
two tests — issue-bound and quick-create paths — so a regression in that
assignment fails loudly instead of silently dropping the description.

Addresses review feedback on PR #4395. MUL-3465

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-22 23:39:27 +08:00
DylanLi
78342a39ce MUL-3305: feat(agent): add qoder CLI as a choice of agent provider. (#2461)
* feat(agent): Qoder ACP runtime, chat reconnect recovery, and task linkage

- Add Qoder CLI backend (ACP transport, model discovery, blocked-args policy)
- Wire daemon/runtime config, docs, and UI provider assets
- Retry terminal task reports; add backoff unit tests
- Chat: SQL attach user message to task; handler + optimistic cache reconcile
- Invalidate chat/task-messages caches on WS reconnect; extract helper + tests

Co-authored-by: Orca <help@stably.ai>
Co-authored-by: Cursor <cursoragent@cursor.com>

* chore: drop non-Qoder changes (chat reconnect, task link, terminal report retries)

Keep only Qoder runtime, docs, daemon config/execenv, and UI provider assets.

Co-authored-by: Orca <help@stably.ai>
Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(agent): harden Qoder ACP drain and wire project skills path

- Stop streaming to msgCh after reader wait so grace timeout cannot race close
- Resolve injected skills to .qoder/skills per Qoder CLI discovery
- Update AGENTS.md skill copy and add execenv tests

Co-authored-by: Orca <help@stably.ai>
Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(qoder): add provider logo and wire MCP config into ACP sessions

- Add inline SVG QoderLogo component to provider-logo.tsx, replacing
  the generic Monitor icon placeholder
- Add convertMcpConfigForACP helper to convert Claude-style MCP server
  config (object map) into ACP array format for session/new and
  session/resume
- Add unit tests for convertMcpConfigForACP covering stdio, SSE,
  empty/nil, and multi-server cases

Co-authored-by: Orca <help@stably.ai>

* fix(test): capture both return values from InjectRuntimeConfig in Qoder test

Co-authored-by: Orca <help@stably.ai>

* fix(qoder): preserve remote MCP headers and promote provider errors

Addresses review feedback on #2461 (Bohan-J): two runtime-correctness
issues in the Qoder ACP backend.

1. Remote MCP headers were dropped. The bespoke convertMcpConfigForACP
   only forwarded url/type, so an authenticated remote MCP server looked
   configured in Multica but failed inside the Qoder session. Replace it
   with the shared buildACPMcpServers helper (same path Hermes/Kimi/Kiro
   use), which preserves headers as [{name, value}], sorts for
   deterministic output, and handles remote transport aliases. Fail
   closed on malformed mcp_config instead of silently dropping servers.

2. Provider failures could report as completed tasks. stderr was wired
   via io.MultiWriter and the result was only promoted to failed when
   output was empty, so a terminal upstream error (HTTP 429 / expired
   token) racing a stopReason=end_turn with text still became
   "completed". Switch to StderrPipe + an explicit copier, drain it
   (bounded by the existing grace window, since qodercli can leave a
   child holding the inherited fds) before the decision, and run the
   shared promoteACPResultOnProviderError.

Tests: replace the convertMcpConfigForACP unit tests with two
end-to-end Qoder tests — one asserts the Authorization header reaches
the session/new payload as {name, value}, the other asserts a terminal
stderr error with non-empty output reports failed.

Co-authored-by: Orca <help@stably.ai>

* fix(qoder): align ACP session handling

Co-authored-by: Orca <help@stably.ai>

* fix(agent): guard qoder late output after drain

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Orca <help@stably.ai>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-22 18:55:45 +08:00
Naiyuan Qing
4fe8b54e9b MUL-3446: keep chat output in chat (#4387)
* MUL-3446: keep chat output in chat

Co-authored-by: multica-agent <github@multica.ai>

* MUL-3446: simplify chat output guidance

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-06-22 15:51:03 +08:00
BeliyDym
5fd3d01d13 MUL-3502: OST-1161: Bound assignment comment catch-up
Squashed PR #4392. Updates assignment/comment catch-up guidance to use recent 10 and aligns related examples.
2026-06-22 15:46:47 +08:00
Hzzzzzx
b4c9e4423c test: enable -race detector in Go test pipeline (WOR-61) (#4274)
* test: enable -race detector in Go test pipeline (WOR-61)

Add the -race flag to all three Go test invocation sites so the existing
concurrency regression harness (workdir_race_test.go for #3999,
runtime_gone_test.go, runtime_profile_drift_test.go) actually exercises
the race detector. The daemon package alone has 28+ goroutine launch
points with no automated race coverage before this change.

Sites updated:
- Makefile:299        (make test, local)
- .github/workflows/ci.yml:101      (CI backend job)
- .github/workflows/release.yml:55  (release verify job)

go test already runs a vet subset by default, so no separate -vet flag
is added. No production code touched.

Co-authored-by: multica-agent <github@multica.ai>

* test(execenv): serialize runtimeGOOS-mutating test (WOR-61)

TestInjectRuntimeConfigIssueMetadataCodexFormattingUnchanged called
t.Parallel() while mutating the package-level runtimeGOOS to drive the
windows/linux branches, racing with the other parallel tests that read
runtimeGOOS in buildMetaSkillContent. The -race flag enabled in the
prior commit surfaced it as 3 WARNING: DATA RACE reports and 11
"race detected" failures in CI (only the execenv package failed).

Drop t.Parallel() and add the "// Not parallel: mutates the package-level
runtimeGOOS." comment already used by the six sibling writer tests across
execenv_test.go and reply_instructions_test.go. This is test-isolation
only; no production code, no mutex/atomic, no signature change.

Verified locally:
  go test -race -count=1 ./internal/daemon/execenv/   -> ok 2.276s
  go test -race -count=1 ./internal/daemon/...        -> all 3 pkgs ok

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: hzz <331380069@qq.com>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-18 15:50:24 +08:00
Bohan Jiang
1279f22d1c MUL-3325: add background task safety brief (#4257)
* fix(daemon): add background task safety brief

Co-authored-by: multica-agent <github@multica.ai>

* fix(agent): force Claude background tools foreground

Co-authored-by: multica-agent <github@multica.ai>

* fix(agent): narrow Claude async launch detection

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-18 10:45:51 +08:00
Multica Eve
18a58e80c0 MUL-3316: fix(execenv): switch agent prompt to --content-file to prevent heredoc flag swallowing (#4182) (#4191)
* fix(execenv): switch agent prompt to --content-file to prevent heredoc flag swallowing (#4182)

The Linux/macOS reply template recommended --content-stdin with a quoted
HEREDOC. That pattern is safe for the trivial single-flag comment-add case
that BuildCommentReplyInstructions emits, but as soon as a model wraps
extra flags around the heredoc on multica issue create / update — assignee,
project — the bash heredoc/flag boundary is fragile in two ways the model
cannot see:

  - A 'BODY \\' terminator with a trailing token is not recognised as the
    heredoc end, so flag lines after it are swallowed into the description
    (OXY-78: residual flag text leaked into the description, command exit 0).
  - A clean terminator turns the trailing '--assignee ...' line into a
    separate failing shell statement, while the create itself already exited
    0 with no assignee (OXY-76: assignee silently dropped, no residual text).

In both cases the CLI never receives the swallowed flags, the API request
omits the fields, and the daemon has no visibility. The created issue lands
with assignee_id: null / project_id: null.

This commit:

  * Switches the Linux/macOS branch of BuildCommentReplyInstructions to
    --content-file with a 3-step recipe (write file, post, rm) so the body
    never reaches the shell and all flags live on one shell-token line.
    There is no heredoc boundary for flags to leak across.
  * Adds a parallel cleanup step (Remove-Item) to the Windows branch so the
    cross-platform template is one shape.
  * Rewrites the runtime_config.go ## Comment Formatting non-Windows section
    to mandate --content-file and explicitly ban --content-stdin HEREDOC for
    agent-authored comments, citing #4182.
  * Reorders the Available Commands menu lines for issue create / update /
    comment add to put --content-file / --description-file ahead of the
    stdin variant and add a per-line note pointing at #4182.
  * Updates and renames the affected tests
    (TestBuildCommentReplyInstructionsCodexLinux,
    TestBuildCommentReplyInstructionsNonCodexLinux,
    TestInjectRuntimeConfigLinuxCommentFormattingEmphasizesFile,
    TestInjectRuntimeConfigIssueMetadataCodexFormattingUnchanged) so the
    new file-first contract is pinned and the old HEREDOC mandate is in the
    banned-strings lists.

This converges Linux/macOS with the long-standing Windows file-only path,
so the cross-platform guidance is now one shape. It also strictly improves
on the previous MUL-2904 guardrail by eliminating shell exposure of the
body entirely (no body ever reaches the shell, so backtick / $() / $VAR
substitution cannot corrupt it).

Closes GitHub multica-ai/multica#4182.

No CLI or backend changes — --content-file / --description-file already
exist.

Co-authored-by: multica-agent <github@multica.ai>

* docs(prompt): correct stale BuildPrompt comment to file-first (#4182)

---------

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
Co-authored-by: CC-Girl <cc-girl@multica.ai>
2026-06-16 17:14:25 +08:00
YOMXXX
34d4cd3a28 feat(openclaw): support connecting to existing OpenClaw gateway (#3260) [MUL-3158] (#3664)
* feat(openclaw): support connecting to existing OpenClaw gateway (#3260)

When the daemon host is a lightweight dev machine or CI coordinator, the
heavy agent work (LLM inference, code execution, tool use) often belongs
on a more powerful remote server already running an OpenClaw gateway.
Multica historically hard-coded `openclaw agent --local`, forcing every
turn to execute in-process on the daemon host.

This change adds an opt-in gateway routing mode controlled per-agent via
`runtime_config`:

  {
    "mode": "gateway",
    "gateway": { "host": "...", "port": 18789, "token": "...", "tls": false }
  }

- Backend: ExecOptions gains OpenclawMode + OpenclawGateway; buildOpenclawArgs
  drops `--local` when mode == "gateway". Per-task openclaw-config.json
  wrapper pins gateway.{host,port,auth.{mode,token},tls} so users do not
  need to edit the daemon host's `~/.openclaw/openclaw.json` to point at
  a different endpoint.
- Daemon: AgentData carries the raw runtime_config; decoding is fail-soft
  (malformed JSON falls back to local mode rather than blocking dispatch).
- API: gateway.token is masked to "***" on every GET; PATCH replays the
  sentinel back, and the update handler restores the persisted token so
  the round-trip never destroys the secret. Defense-in-depth masking on
  WS broadcasts, plus String/MarshalJSON masking on the in-memory struct
  to block stray `%+v` / json.Marshal leaks.
- UI: openclaw-only "Routing" tab on the agent detail page with mode
  selector + structured endpoint form. Token uses a "saved — submit a
  new value to rotate" UX and matching backend preserve hook.

Empty `runtime_config` keeps the historical embedded behaviour, so
existing agents are unaffected.

* fix(openclaw): address #3664 review — drop dead gateway field, gate pin on mode

Per Bohan-J's review:

- Remove the dead ExecOptions.OpenclawGateway field (+ its String/MarshalJSON and
  the daemon.go construction block). It carried the plaintext bearer token but was
  never read — buildOpenclawArgs only consumes OpenclawMode and the live gateway
  path runs through execenv.OpenclawGatewayPin — so this narrows the secret's
  footprint.
- Gate the gateway pin on mode=="gateway" in decodeOpenclawRuntimeConfig: a
  {"mode":"local","gateway":{...,"token"}} payload no longer writes the token into
  the 0o600 per-task wrapper that --local makes openclaw ignore.
- Warn on an unrecognized non-empty mode (e.g. "gatway") instead of silently
  falling back to local.
- Run preserveMaskedGatewayToken in CreateAgent too, so a literal "***" at create
  time can't persist as a real bearer token.
- Document the gateway host:port trust boundary (SSRF note for shared daemon hosts).

Adds regression tests for the local-mode pin drop and the unknown-mode warning.
2026-06-13 15:33:28 +08:00
Bohan Jiang
f415099c4a MUL-3263: support managed MCP config for Cursor (#4081)
* feat: support managed MCP config for Cursor

Co-authored-by: multica-agent <github@multica.ai>

* fix: address Cursor MCP review feedback

Co-authored-by: multica-agent <github@multica.ai>

* docs: include Cursor in skills MCP support

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-13 02:07:00 +08:00
Liu Guanzhong
4594c776e1 feat(agent): add CodeBuddy as first-class CLI backend (#3186)
* feat(agent): add codebuddyBackend struct and buildCodebuddyArgs

Introduces the codebuddy agent backend skeleton with args builder
that mirrors claudeBackend's protocol flags (stream-json, bypass
permissions, blocked args filtering) for the codebuddy CLI fork.

* feat(agent): implement codebuddyBackend.Execute with stream-json parsing

* feat(agent): wire codebuddy into New() factory and launchHeaders

* feat(agent): add codebuddy dynamic model discovery from --help

* feat(agent): add codebuddy thinking/effort discovery and providerThinkingEnums

* feat(daemon): add codebuddy CLI probe, env vars, and args support

* fix(agent): use len(models)==0 for default model instead of loop index

* fix(agent): increase codebuddy --help timeout to 35s for slow CLI startup

* fix(agent): address codebuddy PR review feedback

- Wire codebuddy into execenv: reuse claude's CLAUDE.md, .claude/skills,
  and ~/.claude/skills paths since CodeBuddy is a Claude Code fork
- Replace hardcoded 20-min timeout with runContext for zero-timeout =
  no-deadline semantics matching all other backends
- Restore runContext regression tests lost in rebase merge
- Mirror claude.go execution model: concurrent stdin write to prevent
  pipe deadlock, sync.Once for stdin closure, keep stdin open for
  control_request auto-approval mid-run
- Add control_request handling with auto-approve behavior
- Add RequestID/Request fields to codebuddySDKMessage
- Add codebuddy to metrics knownRuntimeProviders
- Add codebuddy to provider-logo.tsx (reuses ClaudeLogo)
- Consolidate --help discovery: shared codebuddyHelpOutput cache
  eliminates duplicate cold-start invocations

---------

Co-authored-by: krislliu <krislliu@tencent.com>
2026-06-12 15:22:16 +08:00
Bohan Jiang
24b162cdbc feat(daemon): surface the real task initiator to the agent runtime (MUL-2645) (#3899)
* feat(daemon): surface the real task initiator to the agent runtime (MUL-2645)

In a multi-person workspace the agent runtime only ever saw the runtime
OWNER identity: the brief's `## Requesting User` is sourced from
runtime.OwnerID and the task-scoped token is owner-bound, so every
requester (whoever commented, @mentioned, or chatted) appeared to the
agent as the owner. Agents that route by initiator for permission,
privacy, or audit all misjudged.

Resolve the real task initiator at claim time and surface it distinctly
from the owner:
- comment / mention trigger -> triggering comment's author (member or agent)
- chat task -> chat session creator (sessions are creator-only)
- on-assign / autopilot / quick-create -> no attributable initiator (omitted)

Adds initiator_{type,id,name,email} to the claim response, the daemon
Task, and TaskContextForEnv, rendered into the brief as a new
`## Task Initiator` section. The section documents the privacy boundary:
the agent's credentials stay owner-scoped, so this is an attested
identity for the agent's own routing/privacy logic, not act-as. No DB
migration — both paths are derivable from existing rows.

Tests: brief rendering (member/agent/omit/sanitize) + email guard unit
tests, and claim-handler tests for the comment and chat paths.

Co-authored-by: multica-agent <github@multica.ai>

* fix(chat): store real sender as task initiator, not chat_session creator (MUL-2645)

Review fix (Niko, PR #3899). v1 resolved the chat task initiator from
chat_session.creator_id at claim time. That is correct for web chat and
Lark p2p (creator == sender), but WRONG for Lark group chats: the group
session creator is deliberately the installer (stable identity across
member churn), not the message sender. So in a Lark group, every member
who triggered the agent showed up in the brief as the installer/owner —
the exact bug this issue is about, still live at that entry point.

Capture the real sender at enqueue time instead of deriving it from the
session creator at claim time:

- migration 117: agent_task_queue.initiator_user_id (FK user, ON DELETE
  SET NULL); NULL for non-chat and pre-migration rows.
- EnqueueChatTask now takes an explicit initiatorUserID. Web chat passes
  the authenticated request user; the Lark dispatcher threads the inbound
  sender (binding.MulticaUserID) through scheduleRun -> flushChatRun. The
  debouncer keeps the latest scheduled flush per session, so in a multi-
  sender silence window the LATEST sender wins (documented + tested).
- claim handler resolves the initiator from task.initiator_user_id and
  drops the creator_id fallback entirely.

The Lark group session creator stays the installer (unchanged) — only the
task initiator is corrected, keeping the two concepts cleanly separate.

Tests: dispatcher group regression (initiator = sender, not installer),
latest-sender-wins, p2p initiator assertion; the chat claim handler test
now sets creator != initiator and asserts the stored sender wins.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-08 19:29:57 +08:00
liujianqiang-niu
5be7d1bc17 MUL-3136 fix(openclaw): parse config path from last non-empty line of CLI output
Fix OpenClaw config discovery when `openclaw config file` prints Doctor warning UI before the actual config path. The daemon now uses the last non-empty stdout line as the path while preserving the existing tilde expansion, absolute-path validation, stat checks, and fail-closed behavior.

Tests: go test ./internal/daemon/execenv
2026-06-08 17:22:02 +08:00
HMYDK
4190de3d64 fix(skills): quote description values in built-in SKILL.md YAML frontmatter (#3852)
Built-in SKILL.md description values contained unquoted ': ' sequences, which strict YAML parsers (e.g. Codex) reject — silently dropping the skill at load.

- Quote all eight built-in skill descriptions.
- ensureSkillFrontmatter() re-synthesizes frontmatter that has a name but fails YAML validation, so malformed imports are repaired instead of dropped.
- Unify frontmatter delimiter parsing into a single frontmatterParts helper.
- Add strict-YAML regression tests over the built-in skills, plus unit tests for the recovery branch and delimiter variants.

Closes #3851.
2026-06-08 13:10:24 +08:00
Multica Eve
63b847ee48 Honor agent identity in assignment workflow (#3802)
Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-05 13:45:43 +08:00
Naiyuan Qing
b9334dd59f fix: anchor comment triggers to thread roots (#3746)
Co-authored-by: multica-agent <github@multica.ai>
2026-06-04 13:47:05 +08:00
Bohan Jiang
d6a556bdbf fix(execenv): refresh skills in place on reuse instead of accumulating duplicate dirs (#3716)
Re-dispatching the same agent on the same issue reuses the persistent workdir
via execenv.Reuse(), where the standard-provider skill refresh re-wrote skills
without clearing the prior dispatch's output, so allocateCollisionFreeSkillDir
dodged Multica's own directories into issue-review-multica-N.

On reuse, reclaim the platform-owned managed skill directories the prior
manifest recorded (removeReusedManagedSkillDirs) and roll back the remaining
sidecar files (CleanupSidecars) before refreshing, so each skill lands at its
canonical slug every dispatch. Mirrors the Codex hydrateCodexSkills wipe;
scoped to reuse, which never runs for local_directory tasks.

Fixes #3684 (MUL-2963).
2026-06-03 19:30:42 +08:00
Naiyuan Qing
1544e3b68a feat(skills): built-in agent skills (WIP) MUL-2759 (#3456)
* feat(skills): introduce built-in agent skills (WIP)

Inject platform-authored, version-bundled skills into every agent on top of
its workspace-bound skills, so agents learn how to operate Multica correctly
without users needing to know the internals or agents needing to read source.

Mechanism: skills are embedded into the server binary and appended to the
agent payload at task-claim time (handler/daemon.go), reusing the existing
SkillData wire + daemon-side writeSkillFiles. The daemon needs no changes,
and because it travels over an existing wire field, older daemons pick the
skills up the moment the server ships.

First skill: multica-mentioning — how to build a working @mention (look up
the UUID, match type to id source, know what each mention type triggers).

WIP: injection mechanism + first skill only; more skills to follow in
dependency order (skill -> agent -> squad).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(skills): make multica-mentioning the standard template + add eval

Add the contract-skill frontmatter the other built-in skills will copy:
user-invocable:false (it triggers from context, not as a slash command)
and allowed-tools fencing it to the multica CLI it teaches. These keys
survive to agent machines untouched (ensureSkillFrontmatter only ever
adds a missing name).

Add a Go eval in builtin_skills_test.go (a _test.go so it never ships to
agent machines via the skill-files walk):
- Enforces the template invariants on every built-in skill, present and
  future: multica- prefix, name+description present, description within
  1024 chars, body within the 500-line L2 budget, no eval file leaking
  into the shipped payload.
- Couples the mentioning skill's documented contract to the real
  util.ParseMentions: its Incorrect examples must parse to nothing (a
  name where a UUID belongs fails silently) and its Correct example must
  fire. A drift in the mention regex now breaks CI instead of silently
  turning the skill into a lie agents act on.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* feat(skills): add working-on-issues built-in skill

Co-authored-by: multica-agent <github@multica.ai>

* feat(skills): verify linked PRs in issue workflow skill

Co-authored-by: multica-agent <github@multica.ai>

* feat(skills): add skill import and discovery built-ins

Co-authored-by: multica-agent <github@multica.ai>

* feat(skills): add skill authoring built-in

Co-authored-by: multica-agent <github@multica.ai>

* docs(skills): align builtin skill workflows

Co-authored-by: multica-agent <github@multica.ai>

* docs(skills): use structured skill search

Co-authored-by: multica-agent <github@multica.ai>

* fix(skills): make built-in skill bundle launch-ready

Co-authored-by: multica-agent <github@multica.ai>

* fix(skills): align built-ins with additive skill binding

Co-authored-by: multica-agent <github@multica.ai>

* feat(skills): add creating agents built-in skill

Co-authored-by: multica-agent <github@multica.ai>

* Add built-in squads skill

Co-authored-by: multica-agent <github@multica.ai>

* refactor(skills): rewrite built-in skills as source-traced contracts

Rewrite the built-in agent skills to the inbuilt-skill-authoring standard:
state source-traced product facts with the source-code link logic as the
core, not prescriptive how-to coaching.

- creating-agents: drop the Decision-flow / Do-don't-consequences
  methodology; replace with field/behavior contracts (validation, persisted
  shape, daemon claim-time consumption, env gating, skill binding).
- skill-discovery: stop teaching repo/github_stars as selection signals —
  searchClawHubSkills never populates them (always null); rank by
  install_count + source/url + description. Add file:line citations.
- mentioning: drop the unbacked "member mention sends a notification" claim
  (no such path in the comment handler); state that only agent/squad
  mentions enqueue work. Tighten the parser-failure wording.
- working-on-issues: refresh citations drifted by the main merge; describe
  the PR response `state` enum accurately; trim status coaching.
- skill-importing: correct response type to SkillWithFilesResponse; document
  the reserved SKILL.md supporting-file rule; add line-accurate citations.
- squads: correct the "leader cannot be archived" overstatement (not
  rejected at create/update; fails closed later at routing/dispatch);
  refresh source-map attributions and test list.

Each skill now ships references/<skill>-source-map.md as its evidence layer
(line-accurate citations live there, not pinned in the test, so a future
main merge cannot rot them into stale lies).

builtin_skills_test.go: replace coaching/line-number pins with drift-resistant
contract anchors, forbid the coaching phrasing, and require every skill to
ship its source-map. The ParseMentions behavior coupling is preserved.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* docs(skills): close field-role and citation gaps found in review

Independent review of the rewritten built-in skills surfaced two real gaps and
some citation drift; this fixes them.

- creating-agents: add the three missing field rows (visibility,
  max_concurrent_tasks, mcp_config) to the field-contract table — mcp_config is
  runtime-consumed (TaskAgentData, daemon.go), visibility is access-control
  (default private), max_concurrent_tasks is a scheduler cap (default 6).
  Mark custom_args/runtime_config JSON validation as CLI-side (the server
  marshals as-is). Correct the CLI body-builder note (description/instructions
  use a non-empty check, the rest use Changed). Source-map: fix the env query
  name (UpdateAgentCustomEnv), the conformance test name, and add the new field
  defaults + the McpConfig runtime-payload line.
- mentioning: the @squad mention private gate is canAccessPrivateAgent, not
  canEnqueueSquadLeader (that wrapper is the assignment/child-done path).
- working-on-issues: cite notifyParentOfChildDone at its func def (:51), not the
  doc comment (:15).
- skill-importing: config.origin is set only when the source supplied an origin
  — note it may be absent; cite createSkillWithFiles at its definition
  (skill_create.go:72), not the call site.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* Add built-in skills for autopilots runtimes and resources

Co-authored-by: multica-agent <github@multica.ai>

* feat(runtime): list skill descriptions in the brief Skills index

The brief's `## Skills` section emitted bare skill names only, discarding
the one-line description that SkillContextForEnv already carries. For
Claude-family providers the frontmatter description is loaded natively;
for providers without native skill discovery (hermes/default) the brief's
list is the only signal they ever see, so a bare name gave them nothing
to decide when to load a skill. Emit `name — description` when a
description is present, falling back to the bare name when it is empty.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* refactor(skills): drop CLI-only rule from working-on-issues

The "Platform data goes through the CLI" section duplicated the runtime
brief's `## Important: Always Use the multica CLI` section verbatim (and
the attachment-via-CLI note duplicated the brief's `## Attachments`). The
CLI-only rule is universal and must be known before any skill loads, so
the brief is its single source of truth; the skill copy was pure
redundancy and a drift risk. Remove it and the matching intro clause.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* refactor(skills): remove discovery guidance from built-ins

* docs(skills): remove stale skill-necessity records

The per-skill necessity records had drifted to 3 of 8 shipped skills plus a
record for `multica-skill-authoring`, which is not a shipped built-in skill.
Per-skill "why it exists / when to use it" already lives co-located with each
skill (frontmatter `description` + `references/<skill>-source-map.md`) where it
cannot drift from the skill, and the doc's methodology duplicated the
workspace's inbuilt-skill-authoring protocol. Remove the file rather than keep a
parallel listing that every new skill has to remember to update.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* feat(runtime): add source-authority escape hatch to the brief

The brief already tells agents to run `--help` for command discovery, but
nothing stated the trust precedence when a skill, the brief, or a doc seems to
contradict actual behavior. Add one line to the Available Commands escape-hatch
note: trust the live CLI (`--help`/`--output json`) and the checked-out source
over source-traced prose that can lag the code, and verify on any conflict or
confusion. Kept in the always-on brief (universal, needed before any skill
loads) rather than duplicated into each skill; per-skill source-map pointers
remain the specific layer.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* fix(runtime): scope the source-authority escape hatch to the CLI

The previous version told agents the "checked-out source is the deeper
authority" for verifying behavior. That over-claims: the repos in a task's
brief come from GetWorkspaceRepos + project github_repo resources (per-workspace
config, see daemon.registerTaskRepos), not the Multica platform source. A
generic agent's checked-out source is its own app, not Multica's code, so it
cannot verify a Multica skill/brief claim against it. The only universally
available authority for Multica behavior is the live CLI (`--help` /
`--output json` / observed command behavior). Re-scope the line accordingly and
state plainly that the platform's source is not in the workdir.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* revert(runtime): drop the source-authority escape-hatch line

Reverts the brief addition from fdd5e82df and its follow-up cc67b2088. The
`--help` discovery fallback already in the Available Commands note is enough;
the extra trust-precedence sentence was unnecessary. runtime_config.go is now
identical to 6ca27ad74.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* docs(claude): remind to update built-in skills on CLI/field/behavior changes

Add a Coding Rule: when a change touches a CLI command/flag, API field, or
product behavior that a built-in skill documents, update that skill's SKILL.md
and source-map in the same PR. Lives in the repo dev-guide (read when working in
this repo), not the runtime brief — the runtime brief is injected into every
workspace, where most agents have no Multica skill to update. AGENTS.md is a
pointer to CLAUDE.md, so no mirror needed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-03 18:51:03 +08:00
Bohan Jiang
44feb3d06d fix(skill): canonicalize reserved SKILL.md path check across daemon + API (#3660)
A skill_file row whose path is the skill's own SKILL.md (persisted by
older builds or direct create/update API calls) collides with the
primary content the daemon writes itself, failing task prep with
errPathPreExists on every non-codex local runtime (#3489).

#3526 guarded this with strings.EqualFold(path, "SKILL.md") at the
daemon write site and the three API ingress points, but the stored path
is not canonicalized: "./SKILL.md" or "sub/../SKILL.md" slip past the
exact-match guard while filepath.Join still resolves them onto the same
SKILL.md, so prep can still break.

Extract one canonical helper, skill.IsReservedContentPath, that cleans
the path before the case-insensitive compare, and use it at all four
sites (execenv writeSkillFiles, skill create, update, single-file
upsert). Add a daemon-side regression test for writeSkillFiles ignoring
a bundled SKILL.md (exact + "./" spellings) — the load-bearing fix
previously had only API-layer coverage — plus a unit test for the helper.

Existing poisoned rows are intentionally left in place (skipped at prep)
per the decision on MUL-2928.

MUL-2928
Follow-up to #3526; supersedes #3560.

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-02 18:23:57 +08:00
MSandro
996eb07dc5 fix(daemon): skip duplicate SKILL.md in supporting files to prevent task prep failures (#3526)
Fixes #3489

MUL-2928
2026-06-02 17:53:20 +08:00
Bohan Jiang
888186b183 fix(daemon): make comment-posting guardrail provider-agnostic (MUL-2904) (#3654)
* fix(daemon): make comment-posting guardrail provider-agnostic (MUL-2904)

Agents inlining a backtick-wrapped token into `multica issue comment add
--content "..."` had the shell run it as a command substitution, silently
deleting the token; the stored comment never matched the model's intent, so
it retried forever — spamming OKK-497 with duplicate comments.

The corruption is shell-driven, not provider-driven, so extend the
"never inline --content; use --content-file / quoted-HEREDOC --content-stdin"
rule from Codex-only to ALL providers:

- BuildCommentReplyInstructions: collapse the Linux/macOS non-Codex inline
  branch into the unified quoted-HEREDOC stdin template.
- buildMetaSkillContent: rename "Codex-Specific Comment Formatting" ->
  "Comment Formatting" and emit it for every provider; strengthen the
  Available Commands entry and the assignment step-6 examples to steer away
  from inline --content.
- Windows behavior unchanged (file-only; avoids PowerShell ASCII drop).

Tests: flip the non-Codex Linux reply test into a MUL-2904 regression,
broaden the stdin-emphasis test across providers, and pin the
provider-agnostic guardrail.

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): keep Windows assignment brief file-only (address review)

Review catch on #3654: the previous commit added platform-agnostic prose
recommending "--content-file or --content-stdin" in the Available Commands
entry and the assignment-triggered step-6 example. The assignment path has
no BuildCommentReplyInstructions OS override, so on Windows an agent following
step 6 literally would pipe its final comment through PowerShell and drop
non-ASCII bytes (#2198 / #2236 / #2376) — contradicting this PR's own
Windows file-only rule in the ## Comment Formatting section.

Make the platform-agnostic surfaces defer to the OS-aware ## Comment
Formatting section (the single source of truth) instead of naming stdin.
The flag synopsis still lists all three modes.

Add TestInjectRuntimeConfigWindowsAssignmentBriefStaysFileOnly: a Windows
assignment-triggered brief must not contain any prescriptive "... or
--content-stdin" recommendation.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-02 17:47:09 +08:00
Naiyuan Qing
3c8645e546 feat(cli): add squad member set-role (#3583)
Co-authored-by: multica-agent <github@multica.ai>
2026-06-01 12:51:15 +08:00
Naiyuan Qing
4ae4722ef0 fix(comments): preserve direct parent on replies (#3579)
Co-authored-by: multica-agent <github@multica.ai>
2026-06-01 08:28:15 +08:00
Naiyuan Qing
973a43923f fix(comments): revert since-delta to issue-wide, steer to parent thread first (#3535)
#3509/#3523 scoped the comment-trigger since-delta count to the triggering
thread, so an agent resuming a busy issue only saw "+N in this thread" and
lost visibility of new comments in other threads. Revert the count to
issue-wide (every thread), keeping the trigger-comment + agent-own
exclusions, and reshape the warm-path hint to:

  - report the issue-wide new-comment volume,
  - steer the agent to read the triggering (parent) thread FIRST
    (`--thread <trigger> --since`, or `--tail 30` for full context),
  - demote the issue-wide `--since` catch-up to an only-if-needed fallback
    ("don't read them all blindly").

Also fixes the now-stale "scoped to the triggering thread" wording in the
resumed-session no-delta hint (it's issue-wide zero now).

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-29 20:13:23 +08:00
Multica Eve
d1c7d478e1 MUL-2785: clarify thread-scoped comment delta (#3523)
Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-29 17:16:58 +08:00
Multica Eve
9616d78e47 MUL-2785: optimize resumed comment reads (#3509)
* feat(comments): skip default thread read on resumed comment sessions

Co-authored-by: multica-agent <github@multica.ai>

* fix(comments): scope since delta to trigger thread

Co-authored-by: multica-agent <github@multica.ai>

* chore(comments): address thread delta review nits

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-29 14:57:14 +08:00
Naiyuan Qing
3187bbf90c feat(comments): re-add since-delta + cold-start thread read + parent-root write normalization (#3494)
* feat(comments): since-delta new-comment hint + default-on comment session resume (#3432)

* feat(db): add unresolved comment count + list filter queries

Add CountUnresolvedComments (excludes the agent's own comments) and
ListUnresolvedCommentsForIssue. Both are additive — existing callers stay
on the unfiltered queries — so old clients are unaffected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(handler): support unresolved-only comment listing

Wire an additive `unresolved` query param into ListComments. Defaults off
so an old CLI that never sends it gets unchanged behavior; only true/1
enable it. Rejects combining unresolved with thread/recent (whole-issue
filter vs navigation models). Includes filter + count query tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(handler): plumb unresolved count + thread root into claim, gate comment resume

Populate trigger_parent_id (thread root of the trigger comment) and
unresolved_count (excludes the agent's own comments) on comment-triggered
claim responses. Both fields are omitempty so old daemons ignore them.

Gate comment-triggered session resume behind MULTICA_RESUME_COMMENT_SESSION
(default off): resumed comment turns can inherit the prior turn's "Done."
final message, so this stays an explicit rollout switch. The runtime-match
and poisoned-session guards still apply regardless of the flag.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(daemon): inject unresolved-comments hint + resolve step into agent brief

Add a shared BuildUnresolvedCommentsHint helper rendered on both the
per-turn prompt and the CLAUDE.md workflow (kept in sync per PR #2816). It
ships only the count and the relevant CLI call — never comment bodies — so
the server stays cheap. Thread case points at --thread <root>; issue case
points at --unresolved. Suppressed when the count is 0.

Also add a workflow step telling the agent to `multica comment resolve
<thread-root>` once a thread is fully handled, so the unresolved set
converges.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cli): add comment list --unresolved and comment resolve command

Add an --unresolved filter to `issue comment list` (wired to the server's
unresolved param, rejected when combined with --thread/--recent) and a
top-level `comment resolve <id>` command that POSTs to the existing
/api/comments/{id}/resolve endpoint, letting an agent close threads it has
fully handled.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(comments): since-delta new-comment hint + default-on comment resume

Simplifies the comment-triggered agent flow down to what's actually needed:

- New-comment awareness is now a pure time delta: the claim response carries
  new_comment_count + new_comments_since (anchored on the prior run's
  started_at, never completed_at so a long run can't miss comments). The
  per-turn prompt and CLAUDE.md workflow render one line — "N new comment(s)
  since your last run, --since <ts>" — via a shared BuildNewCommentsHint so the
  two surfaces can't drift. Cold start (no prior run) falls back to a plain read.
- Comment-triggered tasks resume the prior session by default (same runtime),
  dropping the MULTICA_RESUME_COMMENT_SESSION rollout gate. The "Focus on THIS
  comment" prompt guard defends against inheriting the prior turn's "Done."
  marker; GetLastTaskSession still excludes poisoned sessions.
- Drops the resolved-based machinery from the first draft: CountUnresolvedComments
  / ListUnresolvedCommentsForIssue queries, the `comment list --unresolved`
  flag, the `multica comment resolve` command, and the resolve workflow step.
- Removes the verbose cursor-pagination paragraph from the comment prompt; the
  --thread/--recent/--since flags stay in the CLI/API, just no longer explained
  inline every turn.

Compatibility: new claim fields are omitempty (old daemons ignore them).
Comment resume is default-on and affects even old daemons, which already
consume prior_session_id.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(comments): collapse reply parent_id to thread root on write

Comment threads are a 2-level model (root + flat replies, like Linear/Slack),
enforced today only by the UI and the agent path — the CreateComment handler
stored whatever parent_id it was handed, and the agent-side flatten walked just
one level, so a reply-to-a-reply could land at depth 3+. Add GetThreadRoot (a
recursive walk to the parent_id=NULL root) and run both write paths
(handler.CreateComment, service.createAgentComment) through it, so every stored
reply's parent_id IS its thread root. Readers can now treat parent_id as the
thread root without re-walking. The agent-drift guard still compares the raw
parent_id to the trigger comment before normalization.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(comments): cold-start reads triggering thread, warm keeps --thread pointer

The since-delta rework dropped the thread-first read on the COLD path: a
first-time agent fell back to the flat `comment list` dump (oldest-first, cap
2000), burying the trigger's context in ancient chatter. Point cold start at the
triggering conversation instead via a shared BuildColdCommentsHint
(`--thread <trigger> --tail 30` + a --recent pointer for cross-thread
background). On the WARM path, --since is a pure time delta and can miss the
triggering thread's pre-anchor history, so BuildNewCommentsHint now also emits a
--thread pointer. Both surfaces (per-turn prompt + CLAUDE.md workflow) render via
the shared helpers so they cannot drift (PR #2816 rule).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-29 10:38:37 +08:00
Bohan Jiang
fa076d38f2 MUL-2778 feat(agent): wire mcp_config through OpenClaw runtime (#3450)
* MUL-2778 feat(agent): wire mcp_config through OpenClaw runtime

The MCP config tab (#3419) lets admins save mcp_config on an agent, and
recent work (#3439) plumbed it through the three ACP runtimes. OpenClaw
still ignored the field, leaving the Tab silently inert for any
OpenClaw-backed agent.

Translate the agent's Claude-style `{"mcpServers": {...}}` into the
per-task OpenClaw wrapper's `mcp.servers` block — OpenClaw resolves MCP
via its own config schema rather than ExecOptions, so the existing
OPENCLAW_CONFIG_PATH preparer is the right seam. Fail closed on
malformed JSON / entries missing `command` or `url`, matching the
fail-closed posture the preparer already uses for the agents.list step.
Null / absent mcp_config leaves the wrapper free of an `mcp` key so the
user's global mcp.servers flows through untouched; an explicit empty
managed set (`{}` / `{"mcpServers":{}}`) is honoured as "admin saved no
servers" mirroring `hasManagedCodexMcpConfig`.

Strict-mode replacement (drop user-only servers entirely) would require
OpenClaw to do a per-key replace rather than a deep merge at
`mcp.servers`; the comment documents that caveat rather than relying on
undocumented behaviour.

Also adds `openclaw` to `MCP_SUPPORTED_PROVIDERS` so the MCP Tab
actually surfaces in the agent overview pane, and pins the new
visibility case with a renderPane test.

Co-authored-by: multica-agent <github@multica.ai>

* MUL-2778 fix(agent): make openclaw mcp_config strict-replace via sanitized snapshot

Elon flagged on #3450 that the previous wiring let user-only mcp.servers
leak through the wrapper's `$include` of the live user config: deep-merge
at `mcp.servers` keeps user-only names, and the strict-empty case
(`{ "mcpServers": {} }`) silently inherited user globals.

Switch the strict-replace path to write a sanitized snapshot of the
user's fully resolved config (via `openclaw config get --json`) with the
`mcp` block stripped, then have the wrapper `$include` the snapshot
instead of the live user file. With the user's `mcp` gone from the
$include resolution, the wrapper's `mcp.servers` is the only definition
the embedded OpenClaw sees — managed only, including the explicit empty
set.

The snapshot lives in envRoot at 0o600 alongside the wrapper so the GC
reaper sweeps it with the rest of the task scratch, and no extra
OPENCLAW_INCLUDE_ROOTS entry is needed (same-dir $include).

Fail-closed on `config get --json` errors so the daemon never silently
falls back to the leaky $include path. The inherit branch (null
mcp_config) still uses the live user file directly — no extra CLI
roundtrip and no snapshot is written.

New tests pin the contract Elon's review required:

- TestPrepareOpenclawConfigStrictReplacesUserMcpServers: user has
  global_one + shared, managed has shared + managed_only → wrapper has
  exactly {shared (managed value), managed_only}; global_one does NOT
  leak; snapshot file has the user's `mcp` stripped while preserving
  gateway / providers / API keys.
- TestPrepareOpenclawConfigStrictEmptyManagedSetDropsUserMcp: empty
  managed set drops user's global_one (both `{}` and
  `{"mcpServers":{}}` cases).
- TestPrepareOpenclawConfigNullMcpConfigKeepsUserInclude: null path
  inherits the live user config, writes no snapshot, makes no extra CLI
  call.
- TestPrepareOpenclawConfigFailsClosedOnResolvedConfigError: errors
  during `config get --json` surface; no stale wrapper or snapshot.
- TestPrepareOpenclawConfigManagedSetFreshInstall: fresh install with
  managed mcp_config skips the snapshot dance entirely.

Also tightens en + zh-Hans MCP Tab copy to mention OpenClaw goes via the
per-task wrapper, and to use OpenClaw's own `transport` field rather
than Claude's `type` for HTTP/SSE entries.

Co-authored-by: multica-agent <github@multica.ai>

* MUL-2778 fix(agent): narrow openclaw snapshot strip to mcp.servers only

Elon's third-round must-fix: the previous strict-replace snapshot deleted
the entire `mcp` block, which wiped out non-server settings under `mcp`
like `sessionIdleTtlMs`. Those are documented OpenClaw config keys
(https://docs.openclaw.ai/gateway/configuration-reference#mcp) outside
the MCP Tab's scope — the agent's saved mcp_config only manages server
definitions, so other `mcp.*` tuning the user set must survive.

Replace the blanket `delete(resolved, "mcp")` with a stripUserMcpServers
helper that:

- deletes only `mcp.servers` when `mcp` is an object
- drops the parent `mcp` key only when the object is empty after the
  strip (so we don't emit `mcp: {}` placeholders)
- leaves non-object `mcp` values untouched (we only know how to strip
  servers from the documented shape)

Pinned with TestPrepareOpenclawConfigStrictPreservesNonServerMcpKeys:
user resolved has both `mcp.sessionIdleTtlMs: 300000` and
`mcp.servers.global_one`; after the strict path runs the snapshot
keeps the TTL and drops the servers map, and the wrapper's
`mcp.servers` is exactly the managed set with no leak.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-28 18:43:02 +08:00
Naiyuan Qing
d90732750f Revert "feat(comments): since-delta new-comment hint + default-on comment ses…" (#3455)
This reverts commit 5e78e5100a.
2026-05-28 17:52:59 +08:00
Bohan Jiang
ee4ec3b76d MUL-2784 fix(daemon): cleanup sidecar tree (.agent_context / .multica / provider skills) after local_directory tasks (#3444)
* fix(daemon): cleanup .agent_context / .multica / provider skill sidecars after local_directory tasks (MUL-2784)

PR #3438 (MUL-2753) only restored CLAUDE.md / AGENTS.md / GEMINI.md to
their pre-task bytes; the sidecar tree writeContextFiles seeds
(.agent_context/, .multica/, .claude/skills/, .github/skills/,
.opencode/skills/, skills/, .pi/skills/, .cursor/skills/,
.kimi/skills/, .kiro/skills/, .agents/skills/, fallback
.agent_context/skills/) was explicitly deferred to this follow-up. In
local_directory mode the agent's workdir is the user's repo, so each
task accumulates one more layer of those directories in the user's
tree.

Plan A: track every file/dir Prepare creates inside workDir in a
sidecarManifest written to envRoot/.multica_sidecar_manifest.json
(daemon scratch — never in the user's workdir). On local_directory
teardown CleanupSidecars walks the manifest, removes the recorded
files, then rmdir-iterates the recorded directories in reverse.
Pre-existing files and directories are deliberately NOT recorded, so
a user-installed .claude/skills/my-own-skill/ sibling — or any
unrelated file the user keeps under .claude/, .github/, etc. — is
preserved bit-for-bit. Non-empty rmdir fails ENOTEMPTY and is
silently skipped, which is the signal that the user owns the
directory.

Daemon wiring lives next to the existing CleanupRuntimeConfig defer
in runTask: runtime brief first, sidecars second. Cloud-mode runs
still write a manifest for symmetry but never trigger the cleanup
(the GC loop wipes envRoot wholesale).

Tests (sidecar_manifest_test.go) cover the round-trip invariant per
the issue's acceptance criteria:

- empty workdir → Prepare → Cleanup → empty workdir, byte-exact, for
  every file-based provider (claude, codex, copilot, opencode,
  openclaw, hermes, pi, cursor, kimi, kiro, antigravity, gemini),
- user's .claude/skills/my-own-skill/ (and equivalents per
  provider) survives Cleanup intact,
- unrelated user files under .claude/, .github/, etc. survive,
- three repeated cycles do not accumulate any orphan state,
- project_resources branch (.multica/project/resources.json) is
  also reversible,
- recordWriteFile refuses to record pre-existing files,
- recordMkdirAll refuses to record pre-existing dirs,
- Cleanup is a no-op when the manifest file is missing.

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): refuse to overwrite pre-existing sidecar paths; pick collision-free skill slugs (MUL-2784 review)

Addresses PR #3444 review (Elon):

**Must-fix #1**: recordWriteFile used to overwrite pre-existing target
files unconditionally and only skip the manifest record. That destroys
user bytes at write time AND leaves the corrupted contents in place at
cleanup time — the byte-exact contract the issue requires is violated
on both halves. Fixed by making recordWriteFile detect any pre-existing
entry (regular file, symlink, directory) via Lstat and return a
sentinel errPathPreExists without touching the path. The user's bytes
are preserved verbatim.

For per-skill collisions (user's .claude/skills/issue-review/ vs
Multica's "Issue Review"), writeSkillFiles now allocates a
collision-free sibling slug via allocateCollisionFreeSkillDir: first
attempt is the natural slug, then `<base>-multica`,
`<base>-multica-2`, …, bounded at 64 attempts. Provider-native
discovery still picks the skill up (every subdir under skillsParent is
a distinct skill) and the user's path stays bit-for-bit intact.

For Multica-only namespace files (.agent_context/issue_context.md,
.multica/project/resources.json), the writer swallows errPathPreExists
and continues — the runtime brief already carries every fact those
files would, so a collision degrades to brief-only mode rather than
destroying user content.

**Must-fix #2**: Added byte-exact collision matrix tests covering
every file-based provider (claude / codex / copilot / opencode /
openclaw / hermes / pi / cursor / kimi / kiro / antigravity / gemini):

- TestPrepareThenCleanupSidecarsSameSlugCollisionPerProvider: seeds
  user's `<provider>/skills/issue-review/SKILL.md` plus a private
  notes.md sibling, runs Prepare → Inject → Cleanup, asserts
  workdir snapshot is byte-identical to seed.
- TestPrepareThenCleanupSidecarsIssueContextCollisionPerProvider:
  seeds user's `.agent_context/issue_context.md`, asserts round-trip
  preserves it.
- TestPrepareThenCleanupSidecarsProjectResourcesCollisionPerProvider:
  same for `.multica/project/resources.json`.
- TestPrepareThenCleanupSidecarsMultiSkillCollisionFreeAllocation:
  end-to-end check that the Multica skill lands at the
  collision-free sibling and Cleanup removes only the Multica side.
- TestAllocateCollisionFreeSkillDir: directed unit test pinning the
  slug-bumping sequence.
- TestRecordWriteFileRefusesToOverwritePreExistingFile (was
  TestRecordWriteFileSkipsPreExistingFile): flipped to assert the
  user's bytes survive and errPathPreExists is returned.
- TestRecordWriteFileRefusesToOverwriteSymlinkOrDir: covers the
  Lstat path for non-file entries.

**Should-fix**: CleanupSidecars used to swallow ANY non-ENOENT rmdir
error as "user content present," silently dropping real I/O failures
(EACCES, EPERM, EBUSY). Now it re-reads the directory after a failed
rmdir via the new dirHasEntries helper — non-empty → silently skip
(ENOTEMPTY, the intended branch); empty → genuine error, captured
into firstErr and surfaced. Plus directed tests:

- TestCleanupSidecarsSurfacesRealRmdirErrors
- TestDirHasEntries

Local verification:
- go test ./internal/daemon/execenv/... — all green
- go test ./internal/daemon/... — all green
- go vet ./... — clean

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): surface original rmdir error when post-rmdir ReadDir also fails (MUL-2784 review)

Addresses remaining PR #3444 review blocker (Elon): dirHasEntries used
to return true when ReadDir failed with anything other than ENOENT,
which made CleanupSidecars treat every locked / faulted directory as
ENOTEMPTY and silently drop the original rmdir error. The v1 fix from
the previous round closed the EACCES-on-empty-dir branch but missed
the case where the chmod also blocks ReadDir — exactly the failure
mode the review called out.

Helper change: dirHasEntries now returns (hasEntries, ok bool):

  - (false, true)  — dir exists and is empty (or missing, race-safe)
  - (true,  true)  — dir has user content (the ENOTEMPTY branch)
  - (_,     false) — ReadDir failed (EACCES, ENOTDIR, EIO, …); the
                     caller cannot tell ENOTEMPTY from a real error
                     and MUST surface the original rmdir error

CleanupSidecars switches on (ok, hasEntries):

  - !ok               → surface the ORIGINAL rmdir error (not the
                        ReadDir failure — that's diagnostic plumbing
                        and would distract from the root cause)
  - ok && hasEntries  → swallow silently (intended ENOTEMPTY branch;
                        preserve user content)
  - ok && !hasEntries → surface the rmdir error (empty dir + EACCES /
                        EPERM / EBUSY → genuine cleanup failure)

Tests:

  - TestDirHasEntries: extended with a regular-file sub-case (ReadDir
    returns ENOTDIR) asserting (false, false). The v1 helper returned
    (true) here, hiding the bug.
  - TestCleanupSidecarsSwallowsMissingAndNonEmptyDirs: renamed from
    TestCleanupSidecarsSurfacesRealRmdirErrors. The old name claimed
    to test the surfacing path but never actually exercised it.
  - TestCleanupSidecarsSurfacesEACCESOnEmptyRecordedDir: chmod parent
    to 0o555 so rmdir(recorded) fails EACCES while ReadDir(recorded)
    still succeeds (empty). Asserts firstErr is non-nil and references
    both the recorded path and the rmdir branch. Skipped when running
    as root (chmod is bypassed for uid 0).
  - TestCleanupSidecarsSurfacesEACCESWhenReadDirFailsToo: the must-fix
    case — chmod parent 0o555 AND chmod recorded 0o000 so BOTH rmdir
    and ReadDir fail. The surfaced error must be the ORIGINAL rmdir
    failure, not the ReadDir one. Skipped on uid 0.

Local verification:
- go test ./internal/daemon/execenv/... — all green
- go test ./internal/daemon/... — all green
- go vet ./... — clean

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-28 17:22:47 +08:00
Bohan Jiang
03f70209c4 fix(daemon): preserve user CLAUDE.md / AGENTS.md / GEMINI.md in local_directory runs (#3438)
* fix(daemon): preserve user CLAUDE.md / AGENTS.md / GEMINI.md in local_directory runs (MUL-2753)

InjectRuntimeConfig previously called os.WriteFile unconditionally, which
truncated whatever file lived at the same path. For the local_directory
project_resource flow the workdir is the user's own repo, so the agent
silently destroyed any repo-level CLAUDE.md / AGENTS.md / GEMINI.md the
first time it ran in that directory, and the daemon's local-directory
cleanup explicitly skips the user's path so the file was never restored.

Write the brief inside a marker block instead:

  <!-- BEGIN MULTICA-RUNTIME (auto-managed; do not edit) -->
  ...brief...
  <!-- END MULTICA-RUNTIME -->

writeRuntimeConfigFile handles three states:

- file missing  -> create with just the marker block,
- file present, no marker block -> append the marker block at the end
  (preserves user-authored content above), and
- file present, marker block already there -> replace the block body in
  place so repeated runs don't grow the file unboundedly.

This is the short-term fix called out on MUL-2753. The sidecar question
(.agent_context/, .claude/skills/, .multica/project/resources.json) is
left for a follow-up — those files don't overwrite user content, just
litter the workdir.

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): cleanup runtime config marker block after local_directory tasks (MUL-2753)

Address Elon's review on PR #3438:

1. Add `CleanupRuntimeConfig` and wire it into the daemon's task path so
   `local_directory` runs excise the marker block on the way out. Without
   it, a user's subsequent manual `claude` / `codex` / `gemini` run in
   the same directory picks up the previous task's stale brief (issue
   id, trigger comment id, reply rules) and acts on the wrong context.
   Cloud workspace runs skip the cleanup — their scratch workdir is
   wiped by the GC loop anyway.

2. If excising the block would leave the file empty / whitespace-only,
   the file is removed so we don't leave behind a stub the user has to
   delete by hand. Surviving user content is preserved byte-for-byte.

3. Harden the marker parser: search for the end marker strictly after
   the begin marker. The previous `strings.Index` pair mishandled two
   malformed cases —
     - a stray end marker before any begin (e.g. user pasted a
       documentation snippet showing the wire format) would cause
       every run to stack another block, growing the file unboundedly;
     - a half-block left by a previous crashed run would cause every
       subsequent run to append a fresh block beneath the half-block.
   The `locateMarkerBlock` helper now anchors the end search past the
   begin offset, and treats "begin found, no end after" as "block runs
   to EOF" so the next write replaces it cleanly.

Centralised the provider→filename mapping in `runtimeConfigPath` so
Inject and Cleanup can't drift past each other when a new provider is
added.

Tests cover: parser hardening (stray-end-before-begin idempotency,
half-block recovery), Cleanup happy path / file removal / no-op cases /
malformed half-block / per-provider mapping, and an end-to-end
inject→cleanup round trip that locks in byte-identical restoration of
the user's pre-injection file.

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): byte-exact inject/cleanup round trip for runtime config (MUL-2753)

Address Elon's second-round review on PR #3438. The previous cleanup
relied on `TrimRight + "\n"` for trailing newlines and `TrimSpace == ""`
for file removal — both compensated for the inject path's "normalise
trailing newlines so there's always exactly `\n\n` before the block"
step, but they did so by mutating the user's bytes. The result was a
real diff on three boundary cases:

  - file ended without a newline (`rules`) → cleanup added one;
  - file ended with two or more newlines (`rules\n\n`) → cleanup
    collapsed to a single newline;
  - file pre-existed but was empty / whitespace-only → cleanup
    deleted it.

Reshape the contract so the bytes inject adds are the exact bytes
cleanup removes, with no user-byte mutation in between:

  - Define `runtimeManagedSeparator = "\n\n"` as a fixed managed
    separator that inject always inserts (unconditionally — including
    for files that already end in two or more newlines) between
    pre-existing user content and the marker block.
  - Inject's missing-file branch still writes the block alone (no
    separator); that absence is the marker Cleanup uses to identify
    "we created this file from scratch" and is the only condition
    under which Cleanup is allowed to `os.Remove` the file.
  - Cleanup detects `HasSuffix(pre, runtimeManagedSeparator)` and
    strips exactly those bytes; whatever remains is written back
    verbatim with no `TrimRight` / `TrimSpace`, so the pre-injection
    bytes survive exactly.

The replace-in-place branch is untouched — the managed separator
established by the first inject lives in pre and survives across
subsequent runs, so byte-exactness is preserved through arbitrary
inject→inject→cleanup chains.

Tests:

  - `TestInjectThenCleanupRoundTripByteExactBoundaries` parameterises
    9 seed shapes (missing file, empty, whitespace-only, no trailing
    newline, one trailing newline, two trailing newlines, many
    trailing newlines, CRLF line endings, no final newline with
    embedded blank lines) and asserts byte-identical round trip
    across two full cycles.
  - `TestInjectReplaceThenCleanupRestoresByteExact` covers the
    replace-in-place branch for the same boundary seeds.
  - `TestWriteRuntimeConfigFileAlwaysInsertsFixedManagedSeparator`
    pins the new invariant at the source: regardless of seed shape,
    inject emits `<seed><\n\n><marker block>` with no normalisation.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-28 16:15:07 +08:00
Naiyuan Qing
5e78e5100a feat(comments): since-delta new-comment hint + default-on comment session resume (#3432)
* feat(db): add unresolved comment count + list filter queries

Add CountUnresolvedComments (excludes the agent's own comments) and
ListUnresolvedCommentsForIssue. Both are additive — existing callers stay
on the unfiltered queries — so old clients are unaffected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(handler): support unresolved-only comment listing

Wire an additive `unresolved` query param into ListComments. Defaults off
so an old CLI that never sends it gets unchanged behavior; only true/1
enable it. Rejects combining unresolved with thread/recent (whole-issue
filter vs navigation models). Includes filter + count query tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(handler): plumb unresolved count + thread root into claim, gate comment resume

Populate trigger_parent_id (thread root of the trigger comment) and
unresolved_count (excludes the agent's own comments) on comment-triggered
claim responses. Both fields are omitempty so old daemons ignore them.

Gate comment-triggered session resume behind MULTICA_RESUME_COMMENT_SESSION
(default off): resumed comment turns can inherit the prior turn's "Done."
final message, so this stays an explicit rollout switch. The runtime-match
and poisoned-session guards still apply regardless of the flag.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(daemon): inject unresolved-comments hint + resolve step into agent brief

Add a shared BuildUnresolvedCommentsHint helper rendered on both the
per-turn prompt and the CLAUDE.md workflow (kept in sync per PR #2816). It
ships only the count and the relevant CLI call — never comment bodies — so
the server stays cheap. Thread case points at --thread <root>; issue case
points at --unresolved. Suppressed when the count is 0.

Also add a workflow step telling the agent to `multica comment resolve
<thread-root>` once a thread is fully handled, so the unresolved set
converges.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cli): add comment list --unresolved and comment resolve command

Add an --unresolved filter to `issue comment list` (wired to the server's
unresolved param, rejected when combined with --thread/--recent) and a
top-level `comment resolve <id>` command that POSTs to the existing
/api/comments/{id}/resolve endpoint, letting an agent close threads it has
fully handled.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(comments): since-delta new-comment hint + default-on comment resume

Simplifies the comment-triggered agent flow down to what's actually needed:

- New-comment awareness is now a pure time delta: the claim response carries
  new_comment_count + new_comments_since (anchored on the prior run's
  started_at, never completed_at so a long run can't miss comments). The
  per-turn prompt and CLAUDE.md workflow render one line — "N new comment(s)
  since your last run, --since <ts>" — via a shared BuildNewCommentsHint so the
  two surfaces can't drift. Cold start (no prior run) falls back to a plain read.
- Comment-triggered tasks resume the prior session by default (same runtime),
  dropping the MULTICA_RESUME_COMMENT_SESSION rollout gate. The "Focus on THIS
  comment" prompt guard defends against inheriting the prior turn's "Done."
  marker; GetLastTaskSession still excludes poisoned sessions.
- Drops the resolved-based machinery from the first draft: CountUnresolvedComments
  / ListUnresolvedCommentsForIssue queries, the `comment list --unresolved`
  flag, the `multica comment resolve` command, and the resolve workflow step.
- Removes the verbose cursor-pagination paragraph from the comment prompt; the
  --thread/--recent/--since flags stay in the CLI/API, just no longer explained
  inline every turn.

Compatibility: new claim fields are omitempty (old daemons ignore them).
Comment resume is default-on and affects even old daemons, which already
consume prior_session_id.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 15:58:42 +08:00
Bohan Jiang
bae8a84abd MUL-2767 feat(agent): add Antigravity runtime backend (#3427)
* feat(agent): add Antigravity runtime backend

Adds Google's Antigravity CLI (`agy`) as the 12th supported coding-tool
runtime, alongside Claude / Codex / Cursor / Copilot / Gemini / Hermes /
Kimi / Kiro / OpenCode / OpenClaw / Pi.

The CLI emits plain assistant text on stdout (no structured event
stream), so the backend streams stdout line-by-line as `MessageText`
events and accumulates the same text as the final `Result.Output`.
Session resumption uses `--conversation <id>`; because the conversation
UUID is not echoed on stdout, the daemon routes `--log-file` to a temp
file and recovers the id from the glog-formatted log lines.

MUL-2767

Co-authored-by: multica-agent <github@multica.ai>

* fix(agent): correct Antigravity capability contract from Elon review

- ModelSelectionSupported now returns false for antigravity. `agy` has no
  --model flag and antigravityBackend deliberately drops opts.Model, so
  the UI must render a disabled "Managed by runtime" picker instead of
  an empty dropdown plus a silently-ignored manual-entry field. Also
  stop seeding AgentEntry.Model from MULTICA_ANTIGRAVITY_MODEL — the
  backend would silently ignore it.

- Antigravity skills now write to {workDir}/.agents/skills/, the CLI's
  native workspace path (inherits Gemini CLI's layout per
  https://antigravity.google/docs/gcli-migration). Previously they went
  to the .agent_context/skills/ fallback that the CLI doesn't scan.
  Runtime brief moves antigravity into the native-discovery branch and
  local_skills.go points the user-level skill root at
  ~/.gemini/antigravity-cli/skills for Runtime → local skill import.

- Doc + UI comment sync: providers matrix / install-agent-runtime /
  cloud-quickstart / agents-create / tasks (session-resume support) /
  skills / README all now list Antigravity in the right buckets, and
  the model-picker / model-dropdown comments cite antigravity (not the
  stale hermes reference) as the supported=false example.

New tests: TestAntigravityModelSelectionUnsupported,
TestInjectRuntimeConfigAntigravity (native discovery wording),
TestWriteContextFilesAntigravityNativeSkills (.agents/skills/ landing,
.agent_context/skills/ NOT written).

Co-authored-by: multica-agent <github@multica.ai>

* feat(provider-logo): swap inline placeholder for real Antigravity PNG

Replaces the hand-drawn planet+arc placeholder with the official asset
shipped from Downloads. Stored next to the component; bundlers
(Next.js / electron-vite) resolve the PNG import to a URL string at
build time. Added a small assets.d.ts so packages/views' tsc accepts
PNG / SVG module imports — there was no prior asset usage in this
package to register the declaration.

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-28 15:40:05 +08:00
Bohan Jiang
341ce7bfa5 feat: support local working directory for projects (MUL-2618 v1) (#3283)
* feat(project): add local_directory project_resource type (MUL-2662)

Adds a second project_resource type alongside github_repo so a project
can be pinned to an existing directory on a specific daemon (the v1 of
the local-working-directory flow tracked in MUL-2618). The ref schema is
{ local_path, daemon_id, label? }; local_path must be absolute and
daemon_id is required. The same (daemon_id, local_path) pair is allowed
on multiple projects by design — no UNIQUE constraint is added.

Implementation reuses the existing project_resource API surface: the new
type is wired through the validator switch with no migration, no new
events, and no daemon-handler changes (daemon already passes through
arbitrary resource types via ProjectResources). The CLI gains
--local-path / --daemon-id / --ref-label shortcuts so
`multica project resource add --type local_directory` mirrors the
existing `--type github_repo --url ...` ergonomics; the generic --ref
flag still works for both types.

Tests cover the full CRUD lifecycle, the same-path-across-projects
allowance, the same-path-same-project conflict, the validator rejections
(missing/blank/relative path, missing daemon_id, wrong payload type),
and the cross-platform isAbsoluteLocalPath helper.

Co-authored-by: multica-agent <github@multica.ai>

* feat(project): add update endpoint + label-shadow guard for project_resource (MUL-2662)

Addresses the Elon review on PR #3263:

- Add PUT /api/projects/{id}/resources/{resourceId} with sqlc query,
  matching handler, CLI `project resource update`, and a new
  EventProjectResourceUpdated WS event. resource_type stays immutable;
  ref/label/position are all individually optional.
- Catch same-project (daemon_id, local_path) collisions where only the
  embedded label differs — the row-level UNIQUE only matches the full
  ref JSON, so a label typo would otherwise let the same working
  directory bind twice.
- Tests cover the update lifecycle (label-only / ref / clear / 404 /
  invalid path) and the label-shadow conflict on both create and
  update; the in-place rename still succeeds because the conflict
  scan ignores the row being edited.

Incidental: regenerating sqlc picked up a missing skills_local scan in
UpdateAgentCustomEnv that drifted in from #3200.

Co-authored-by: multica-agent <github@multica.ai>

* fix(project): close bundled-create label-shadow gap + merge resource_ref on CLI update (MUL-2662)

Two follow-ups from MUL-2662 review round 2:

- CreateProject inline resources path now dedupes local_directory entries on
  (daemon_id, local_path) before opening the transaction. The DB-level
  UNIQUE(project_id, resource_type, resource_ref) constraint only fires on a
  full JSON match, so two rows with the same target but different `label`
  would otherwise slip past. Standalone POST/PUT already cover this via
  findLocalDirectoryConflict; bundled create was the missing surface.
- `multica project resource update` now seeds resource_ref from the existing
  row before applying per-type shortcut flags, so `--default-branch-hint x`
  on its own no longer constructs a payload missing `url` (which the server
  400s on). Local_directory partial edits get the same merge behavior.

Co-authored-by: multica-agent <github@multica.ai>

* feat(desktop): local_directory project_resource UI (MUL-2665) (#3273)

* feat(desktop): local_directory project_resource UI (MUL-2665)

First UI surface for the local-working-directory flow tracked in MUL-2618.
Lets users on the desktop pin a project to an existing folder on this
machine; web stays read-only since the per-daemon check can't be done in
the browser.

What's new for the renderer:

- ProjectResourcesSection grows a desktop-only "Add local directory"
  button next to the existing GitHub-repo popover. Clicking it opens
  Electron's native folder picker, validates the path through a new
  IPC pair (existence + r/w), and submits a project_resource of
  resource_type=local_directory with daemon_id pulled live from
  daemonAPI.getStatus.
- LocalDirectoryRow renders the rename pencil + path tooltip, and
  greys out when ref.daemon_id != this machine's daemon_id (with a
  "only available on the machine that registered this directory"
  tooltip). Delete stays enabled so users can drop stale registrations
  from any device.
- LocalDirectoryHint sits above the issue-detail comment composer and
  shows "Agent will work in-place at {label} ({path})" when the issue's
  project has a local_directory matching this daemon. Hidden on web.
- TaskStatusPill picks up a new "waiting_for_directory_release" stage
  that the daemon will publish when it dequeues a task but can't
  acquire the path lock. The render is in place now so the daemon
  sibling subtask can wire the status string without an additional UI
  PR.

Plumbing:

- @multica/core/types gains LocalDirectoryResourceRef +
  UpdateProjectResourceRequest, and the api client gets the matching
  PUT method backed by the server endpoint that landed in
  2ac3faebb (MUL-2662). A useUpdateProjectResource hook drives the
  in-place label edit.
- New Electron handlers under apps/desktop/src/main/local-directory.ts:
    local-directory:pick     -> dialog.showOpenDialog (openDirectory)
    local-directory:validate -> stat + access(R_OK + W_OK)
  exposed through the preload as desktopAPI.pickDirectory /
  validateLocalDirectory. View code talks to them via a thin
  packages/views/platform helper that returns reason=unsupported on
  web instead of crashing.
- useLocalDaemonStatus exposes the local daemon's id, device name, and
  running flag from daemonAPI.onStatusChange so the renderer can do the
  cross-device match without coupling to the desktop preload typings.

Tests:

- pickStageKeys gets a unit test covering the new stage and proving
  the directory-release status outranks availability hints.
- LocalDirectoryHint tests cover the four render branches (no project,
  no daemon, foreign daemon, matching daemon).
- i18n parity stays green; new keys added under projects.resources.*
  and chat.status_pill.stages.waiting_for_directory_release in both
  locales.

Out of scope (will land separately):
- The daemon-side waiting/lock signal that flips the pill into the
  new state.
- Adding local_directory to the create-project modal's bulk
  attach flow.
- Docs page refresh for project-resources.mdx — left for the
  MUL-2618 umbrella sweep.

Co-authored-by: multica-agent <github@multica.ai>

* fix(desktop): hide rename for foreign daemon local_directory rows (MUL-2618)

Address review nit on #3273: the rename pencil was gated only by
`canEdit`, so a foreign / unknown-daemon row still showed it even
though the spec says cross-device rows are disabled. Gate rename on
`!mismatch` so it disappears on those rows; delete stays available
so a stale registration can still be dropped from any device.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>

* feat(daemon): local_directory execution + path mutex + GC exception (MUL-2663) (#3274)

* feat(daemon): local_directory execution + path mutex + GC exception (MUL-2663)

Wires up the daemon side of the local_directory project_resource introduced
in MUL-2662. When a task is dispatched against a project whose resources
include a local_directory pinned to this daemon's UUID, the daemon now:

  - Validates the path (absolute, exists, daemon process can read+write,
    not in the system-root / $HOME blacklist) and fails the task fast on
    any precondition violation, with a user-readable reason.
  - Serialises concurrent tasks on the same on-disk path via a
    daemon-local LocalPathLocker keyed by symlink-resolved realpath. The
    lock is held for the entire task lifetime (claim → context write →
    agent → result report).
  - When the lock is contended, the daemon flips the row to a new
    waiting_local_directory status on the server (carrying a wait_reason
    like "<path> (held by task <short id>)") so the UI can render
    "等待本地目录释放" instead of leaving the row silently in dispatched
    past the sweeper timeout. The status accepts being woken into running
    once the lock is acquired.
  - Sets execenv.WorkDir to the user's path (no copy, no mount). envRoot
    still lives under workspacesRoot/<wsID>/ and hosts output/, logs/, and
    .gc_meta.json — the daemon's logbook for the run.
  - Stamps GCMeta.LocalDirectory=true so the GC loop never RemoveAlls
    envRoot for these tasks (gcActionClean → gcActionCleanArtifacts,
    gcActionOrphan → gcActionSkip). The user's directory was never under
    envRoot to begin with, so this is defense in depth.
  - Skips execenv.Reuse for local_directory tasks because the prior
    WorkDir is the user's path and reusing it through that code path
    loses the envRoot association the GC loop needs. Prepare is cheap
    here (no clone, no copy), so always running it is fine.

Server-side protocol changes:

  - New CHECK value 'waiting_local_directory' on agent_task_queue.status
    plus a wait_reason TEXT column (migration 109).
  - All cancel / active / counted-as-running / orphan-recovery queries
    expanded to include the new status; FailStaleTasks intentionally
    excludes it (the daemon owns the wait).
  - New SQL MarkAgentTaskWaitingLocalDirectory(id, reason) and a relaxed
    StartAgentTask that accepts both dispatched and
    waiting_local_directory as preconditions (and clears wait_reason on
    the way through).
  - New POST /api/daemon/tasks/{taskId}/wait-local-directory endpoint,
    TaskService.MarkTaskWaitingLocalDirectory broadcaster, and matching
    daemon Client.MarkTaskWaitingLocalDirectory.

Tests cover: path blacklist + R/W enforcement, mutex serialisation +
ctx-cancelled wait, lock handover between two tasks, GC never returns
gcActionClean / gcActionOrphan for local_directory rows (with negative
control for the standard path), and Prepare/Cleanup correctly substitute
+ protect the user's WorkDir.

The desktop UI side (UI for adding a local_directory resource, surfacing
the "等待本地目录" badge) is MUL-2665; the agent-task lifecycle changes
(no branch switch, dirty-tree tolerant, auto-commit) are MUL-2664.

This PR targets the shared MUL-2618 v1 feature branch agent/j/912b8cb1,
not main; the whole v1 will be merged to main together when complete.

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): tighten local_directory status, symlink, cancel handling (MUL-2618)

Address the 3 must-fix items from Elon's review of PR #3274.

1. Status string unified. The server / daemon publish
   `waiting_local_directory`; align views, locales, and the
   pickStageKeys test (PR #3273 had used `waiting_for_directory_release`
   on a placeholder string). Without this, the daemon's wait state
   never reached the pill once the two siblings merged.

2. validateLocalPath now also runs the blacklist against the
   symlink-resolved realpath, with macOS's `/etc` -> `/private/etc`
   redirect handled via `isBlacklistedRealPath` which compares
   canonical forms. Without this, a symlink such as
   `/Users/me/proj/home -> /Users/me` slipped the literal $HOME check
   while every daemon write still landed in the user's home. Tests
   cover symlink-to-home, symlink-to-system-root, and the negative
   case (symlink to a regular subdirectory).

3. acquireLocalDirectoryLockIfNeeded now spins up a cancellation
   watcher inside `onWait` (lazy — the fast path stays free) so the
   gap between dispatch and StartTask responds to server-side cancel
   or row deletion. If the watcher fires while the daemon is parked
   on the path mutex, the lock-wait context is cancelled, Acquire
   returns promptly, and the helper exits silently the same way the
   run-phase poller does. New TestAcquireLocalDirectoryLock_CancelDuringWait
   exercises the path end-to-end with a fake server.

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): unconditional canonical blacklist + Windows drive-root generalisation (MUL-2618)

- validateLocalPath now always runs isBlacklistedRealPath on the
  symlink-resolved path, not only when it differs from absPath. The old
  guard let users type the canonical form of an OS-symlinked banned root
  (e.g. /private/tmp, /private/etc, /private/var on macOS) straight
  through, since EvalSymlinks is a no-op on already-canonical input.
- Windows drive-root rejection moved off the static C/D/E/F enumeration
  onto filepath.VolumeName via a new isDriveRoot helper, so removable /
  network drives mounted at G:..Z: and UNC \\server\share roots are also
  blocked. systemRootBlacklist keeps the well-known C:\ trees only.
- Tests: macOS-only case exercises direct /private/{tmp,etc,var}; a
  new TestIsDriveRoot covers the Windows generalisation (skipped on
  POSIX runners by runtime guard).

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>

* feat(views): wire waiting_local_directory end-to-end in issue UI + presence (MUL-2618)

Connect the daemon-emitted `task:waiting_local_directory` and `task:running`
events through to issue execution log, sticky agent banner, activity indicator,
and agent presence so a parked task is no longer invisible on the issue page.

- Add `waiting_local_directory` to `AgentTask.status` and the typed
  `task:running` / `task:waiting_local_directory` WS event payloads.
- Chat realtime sync writes both new statuses into the pending-task cache so
  the chat StatusPill flips out of a stale `dispatched` frame.
- ExecutionLogSection: count `waiting_local_directory` as active, add tone +
  status label, treat parked tasks the same as dispatched for time anchor /
  transcript visibility / terminate-confirm note.
- AgentLiveCard: subscribe to both new events, rank the parked state between
  dispatched and queued, and surface a "is waiting for the local directory"
  banner with the muted "Clock" treatment used for queued.
- IssueAgentActivityIndicator: route parked tasks into the queued bucket so
  the hover stack and chip stay visible.
- derive-presence: parked tasks count toward `queuedCount` so the agent
  workload chip stays out of `idle` while the daemon waits on the path lock.
- Locales: add `agent_live.is_waiting_local_directory` and
  `execution_log.status_waiting_local_directory` (en + zh-Hans).

Co-authored-by: multica-agent <github@multica.ai>

* feat(project): enforce one local_directory per (project, daemon) (MUL-2618)

The daemon-side resolver picks the first matching local_directory by
daemon_id, so allowing two rows on the same daemon — even at different
paths — let the agent silently write into whichever sorted first. Tighten
the invariant top to bottom:

- server: `findLocalDirectoryConflict` rejects any second row sharing a
  daemon_id, regardless of `local_path` or label. Bundled-create surface in
  `CreateProject` runs the same daemon-scoped dedupe up front.
- daemon: `findLocalDirectoryAssignment` fails fast when it finds more than
  one row pinned to the current daemon (older API client / direct DB
  writes can still produce that state — refuse to guess).
- desktop UI: hide the "Add local directory" action once the current
  daemon owns a row on this project, with a hint and a defensive toast on
  the call path; foreign-daemon rows stay visible read-only as before.
- Tests:
  * daemon: new `two local_directory rows on this daemon fail fast` /
    `local_directory rows on different daemons coexist` cases.
  * handler: rewrite the legacy `LabelShadow` cases as
    `DaemonScopedConflict` / `BundledLocalDirectoryDaemonConflict` —
    asserts 409 on same-daemon different-path, 201 on per-daemon bundles.
- Locales: en + zh-Hans copy for the new hint + toast.

Co-authored-by: multica-agent <github@multica.ai>

* chore(sqlc): drop stale skills_local in UpdateAgentCustomEnv (MUL-2618)

Follow-up to the main-merge in 0f8e8ca7: the auto-merge preserved most
of main's skills_local revert but kept the column reference inside the
UpdateAgentCustomEnv scanner because that block hadn't been touched by
either side. Re-running `sqlc generate` regenerates the file without
skills_local in this query, matching the rest of the file and the
post-revert schema.

Co-authored-by: multica-agent <github@multica.ai>

* feat(create-project): binary source picker — repos OR local directory

Turn the create-project dialog's "Repos" pill into a binary Source
picker. A project's source is mutually exclusive: either a set of
GitHub repos (worktree mode, default) or a single local working
directory (local mode, desktop-only). Mirrors the constraint the
backend will enforce next.

Behavior:
- Pill shows the active mode's selection (GitHub icon + repo count, or
  folder icon + local label/path).
- Popover has a 2-tab segmented control at the top; the Local tab is
  hidden entirely on web (local_directory needs a daemon_id).
- Local tab requires the daemon online — amber notice + disabled picker
  when offline, re-renders automatically via useLocalDaemonStatus.
- Switching tabs preserves the other side's stash, but handleSubmit
  only emits the resource matching the active sourceMode, so abandoned
  picks never leak into the created project.

Backend mutual-exclusion validation + the resources-section
conditional-add-button still to come — this PR just unblocks the
dialog so it can be demoed.

* fix(mobile): cover waiting_local_directory in run row status maps (MUL-2618)

---------

Co-authored-by: multica-agent <github@multica.ai>
Co-authored-by: Multica J <j@multica.ai>
2026-05-27 13:44:31 +08:00
Multica Eve
744b474199 revert(agent): remove per-agent local skill toggle (MUL-2603) (#3286)
* Revert "feat(agents): hide skills_local toggle for runtimes that don't honour it (MUL-2603) (#3276)"

This reverts commit 0b50c5a209.

Co-authored-by: multica-agent <github@multica.ai>

* Revert "fix(agent): surface host OAuth token via env var on macOS isolation (MUL-2603) (#3267)"

This reverts commit a67bf81225.

Co-authored-by: multica-agent <github@multica.ai>

* Revert "fix(agents): tighten skills-tab intro and drop redundant import hint (#3265)"

This reverts commit d8075a5775.

Co-authored-by: multica-agent <github@multica.ai>

* Revert "fix(agent): mirror $HOME/.claude.json into isolated config dir (MUL-2661) (#3261)"

This reverts commit 40da88fc16.

Co-authored-by: multica-agent <github@multica.ai>

* Revert "feat(agent): per-agent toggle to isolate host-machine skills (MUL-2603) (#3200)"

This reverts commit 960befa56f.

Co-authored-by: multica-agent <github@multica.ai>

* Add migration cleanup for reverted agent skills toggle

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-26 17:00:01 +08:00
Bohan Jiang
6bc3d14eb3 fix(daemon/execenv): refresh stale Codex config copies across env reuse (MUL-2646) (#3268)
* fix(daemon/execenv): refresh stale Codex config copies across env reuse (MUL-2646)

`copyFileIfExists` previously short-circuited whenever the per-task
`codex-home/{config.toml,config.json,instructions.md}` already existed,
so once the files were seeded at first Prepare they were never refreshed
again — even though `Reuse()` calls `prepareCodexHomeWithOpts` on every
resume. A user who rotated their Codex `~/.codex/config.toml` between
runs (e.g. switching the active `[model_providers.X]` `base_url`, or
pointing `env_key` at a freshly rotated API key) kept reading the stale
per-task copy on session resume. Codex then issued requests to the new
URL using the old key and the API rejected the token.

Treat any existing `dst` as something to drop and re-copy from the
current shared source, mirroring the symlink path that already refreshes
`auth.json` (#2126). The daemon-managed sandbox / multi-agent / memory
blocks are applied via marker-bracketed idempotent passes after the
copy, so a re-copy + re-ensure cycle preserves them.

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon/execenv): drop per-task Codex copy when shared source removed (MUL-2646)

Extend the MUL-2646 fix to the deletion arm of "sync the shared source":
`syncCopiedFile` (renamed from `copyFileIfExists`) now also removes the
per-task `dst` when the shared `src` is absent. The prior version
short-circuited on missing src and left `config.toml` / `config.json` /
`instructions.md` from the previous Prepare lingering in the per-task
home — so a user who removed a provider by deleting `~/.codex/config.toml`,
or pulled `config.json` / `instructions.md` out of the shared home, would
keep replaying the stale copy on session resume.

For `config.toml` the subsequent `ensureCodex{Sandbox,MultiAgent,Memory}Config`
passes recreate the file with only the daemon-managed default blocks, so
removing the shared file cleanly drops every user-managed
`[model_providers.X]` / `model_provider` line. For `config.json` and
`instructions.md` there is no daemon default, so they disappear in
lockstep with the shared source.

Adds `TestPrepareCodexHome_DropsCopiedConfigWhenSharedSourceRemoved`
covering the new path, and extends the refresh-arm test to assert the
multi-agent / memory marker blocks are still present after the copy is
refreshed.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-26 14:52:53 +08:00
Bohan Jiang
960befa56f feat(agent): per-agent toggle to isolate host-machine skills (MUL-2603) (#3200)
* feat(agent): per-agent toggle to isolate host-machine skills (MUL-2603)

Adds an agent-scoped `skills_local` switch ("ignore" default / "merge") so
shared agents stop inheriting the operator's user-global Claude skill
directory. A single broken local skill on one operator's machine was
crashing the Claude CLI before it ever read stdin — the daemon saw a
"broken pipe" with no recoverable signal (GitHub #3052).

- DB: migration 108 adds `agent.skills_local` (NOT NULL DEFAULT 'ignore'),
  with sqlc CreateAgent/UpdateAgent updates and handler validation.
- Claude runtime: when the agent is in "ignore" mode the backend points
  CLAUDE_CONFIG_DIR at an empty per-task scratch dir under the task cwd
  (fallback: OS temp), strips any inherited override, and cleans up after
  the run. Workspace skills under `{cwd}/.claude/skills/` still load.
  "merge" preserves the legacy inherit-from-machine behavior; Codex and
  other isolated backends are no-ops.
- UI: new Skills toggle in the Create Agent dialog and the Agent → Skills
  tab, with EN/zh-Hans copy and SkillsLocalToggle shared between the two.
- Tests: unit coverage for the new env helper, isolation dir lifecycle,
  full Claude execute paths (ignore + merge), and the handler tristate
  contract. Existing skills-tab test updated for the new copy.
- Docs: updated `/skills` docs (EN + ZH) and added a 0.3.7 changelog entry
  in the landing-page i18n.

Co-authored-by: multica-agent <github@multica.ai>

* fix(agent): preserve claude login + validate skills_local input (MUL-2603)

Address Elon's review on PR #3200:

1. Skill isolation no longer drops the operator's Claude login. The
   per-task scratch dir now mirrors every entry under `~/.claude/`
   as symlinks except `skills/`, so `.credentials.json`, settings,
   plugins, etc. reach the CLI exactly as on the host while the
   user-global skills directory stays hidden. Without this, default
   `ignore` would have broken every Claude agent on a non-API-key
   host the moment migration 108 landed.

2. Internal CreateAgent callers (agent_template, onboarding_shim)
   now set `SkillsLocal: "ignore"`. The Go zero value was about to
   trip the migration-108 CHECK constraint and 500 template /
   onboarding agent creation.

3. Create / update handler validation no longer normalizes garbage
   to "ignore". The strict 400 path is now reachable on bad client
   input; the drift-safe `normalizeSkillsLocal` stays on the read
   side only.

UI copy + docs clarified that the toggle is Claude-only; other
runtimes ignore the setting.

Verification:
- `go test ./...` green (full suite locally).
- `pnpm --filter @multica/views exec vitest run agents/components/tabs/skills-tab.test.tsx` green.
- Handler DB-backed tests still skip locally without docker (same
  as Elon's run) — CI will validate the create / update paths
  against migration 108.

Co-authored-by: multica-agent <github@multica.ai>

* fix(agent): mirror effective claude config dir with windows fallback (MUL-2603)

Address Elon's second-round review on PR #3200:

1. The per-task scratch dir now mirrors the *effective* host Claude
   config dir, not unconditionally `~/.claude/`. Precedence: agent
   `custom_env` CLAUDE_CONFIG_DIR > parent process env > `~/.claude/`.
   Without this, an operator who pinned Claude at a managed install
   (custom env CLAUDE_CONFIG_DIR) would get the wrong credentials in
   the scratch dir, because `buildClaudeEnv` strips that env before
   handing it to the child. We resolve the source up front and feed
   it to the mirror, so the override env still points at the right
   bytes.

2. Mirror entries now go through platform-aware linkers. On Windows
   without Developer Mode / admin, `os.Symlink` is denied, which
   previously left the scratch dir empty and broke Claude Code auth
   on default `ignore`. The new helpers try symlink first, then fall
   back to a directory junction (`mklink /J`) for dirs or a hardlink
   (same-volume content share) / copy for files. Mirrors the
   execenv/codex_home_link_windows.go pattern.

3. Tests:
   - `TestResolveHostClaudeConfigDir` locks in the custom_env >
     parent_env > `~/.claude` precedence.
   - `TestNewIsolatedClaudeConfigDirMirrorsCustomHostDir` confirms
     the scratch dir picks up `.credentials.json` from a synthetic
     custom host dir, proving the source resolution actually
     propagates into the mirror.
   - `TestNewIsolatedClaudeConfigDirEmptyHostIsNoop` documents the
     env-var-auth-only case (no host source ⇒ empty scratch dir).
   - `TestMirrorHostClaudeExceptSkillsWith_FallbackWhenSymlinkFails`
     exercises the Windows-no-Developer-Mode path via the new
     `mirrorHostClaudeExceptSkillsWith` seam, asserting credentials
     and sub-dir children still reach the scratch dir after the
     symlink stand-in fails.
   - `TestMirrorHostClaudeExceptSkillsWith_PropagatesFirstLinkError`
     confirms callers see the per-entry error when even fallback
     fails (so the warn-log fires on broken Windows installs).
   - `TestCopyFileRoundTrip` covers the last-resort copy fallback
     and its EXCL no-overwrite contract.
   - `TestClaudeExecuteIsolatesUsesCustomEnvSource` is the
     end-to-end check: an agent with custom_env CLAUDE_CONFIG_DIR
     reads its credentials from the pinned dir, not `~/.claude/`.

4. Docs: `apps/docs/content/docs/skills.{mdx,zh.mdx}` updated to
   describe the effective-source resolution and the Windows
   fallback chain so the docs match the runtime behaviour.

Verification:
- `go test ./...` green (full server suite locally, including
  `pkg/agent` 23 cases covering the new + existing isolation
  paths).
- `GOOS=windows GOARCH=amd64 go vet ./pkg/agent/...` and
  `go test -c -o /dev/null` both compile clean, confirming the
  Windows-tagged linker file builds.

Co-authored-by: multica-agent <github@multica.ai>

* fix(agent): default skills_local to merge to preserve legacy behavior (MUL-2603)

Per Bohan's product decision on PR #3200, the per-agent host-skill toggle
defaults to "merge" — the pre-MUL-2603 inherit-from-machine behavior —
so existing personal workflows that rely on locally installed Claude
Skills keep working unchanged. Agent owners explicitly opt into "ignore"
when they need to harden a shared agent against a broken local skill on
one operator's machine (GitHub #3052).

Also audited all 11 runtimes for user-global skill discovery paths and
documented the scope of the toggle. Only Claude reads a user-global
`~/.claude/skills/`; Codex isolates via `CODEX_HOME`, the ACP backends
(Hermes / Kimi / Kiro) and the JSON-stream backends (Copilot / Cursor /
Gemini / Pi / OpenCode / OpenClaw) anchor discovery to the task workdir
and never read a user-global skill directory. UI copy and docs now say
"for runtimes that support it (currently Claude Code)" everywhere so
the scope is explicit.

Changes:

- Migration 108: column default flipped to 'merge'.
- Handler CreateAgent: missing field → "merge"; explicit "ignore" /
  "merge" still validated, garbage still 400.
- normalizeSkillsLocal: drift-safe coercion now lands on "merge" for
  anything that isn't the exact literal "ignore".
- agent_template.go / onboarding_shim.go: internal CreateAgent callers
  send "merge" instead of "ignore" to match the new default.
- Claude runtime (`claude.go`): isolate-mode gate flipped from
  `SkillsLocal != "merge"` to `SkillsLocal == "ignore"`, so "" (legacy
  daemons / older clients) and "merge" both walk `~/.claude/` directly.
- Create Agent dialog + Skills tab: toggle defaults to on (merge); only
  duplicate of an explicit "ignore" agent carries through. The
  isolation opt-in is now `skills_local: "ignore"` when the user flips
  off; "merge" is omitted from the request body.
- i18n (EN + zh-Hans): copy reframed — "On (default) — merged"; "Off —
  ignored. Recommended for shared agents".
- Docs (`/skills`, `/guides/agents.zh`): describe new default and
  enumerate which runtimes act on the toggle.
- Landing changelog 0.3.7: retitled "Per-Agent Local-Skill Toggle"; note
  the on-by-default behavior + off-to-isolate framing.
- Tests:
  - `TestClaudeExecuteIsolatesHostSkillsWhenIgnoreOptedIn` replaces the
    old by-default isolation case (now requires explicit "ignore").
  - New `TestClaudeExecuteDefaultModeKeepsHostConfigDir` locks in that
    default ExecOptions preserve the host CLAUDE_CONFIG_DIR.
  - `TestClaudeExecuteIsolatesUsesCustomEnvSource` now explicitly opts
    into "ignore" mode.
  - Handler tests: omitted → "merge"; explicit "ignore" round-trips;
    preserve-existing test seeds "ignore" and asserts "merge" flip-back.
  - `TestNormalizeSkillsLocal_DriftStaysSafe`: only literal "ignore"
    maps to ignore; everything else → "merge".
  - `skills-tab.test.tsx`: toggle ON by default; flip OFF when agent
    opted into "ignore". Intro-text matcher anchored to a more specific
    phrase so it no longer collides with the toggle hint copy.

Verification:
- `go test ./...` green (full server suite locally).
- `GOOS=windows GOARCH=amd64 go vet ./pkg/agent/...` and
  `go test -c -o /dev/null` both compile clean (windows-tagged linker
  file still builds).
- `pnpm typecheck` green across all packages and apps.
- `pnpm --filter @multica/views test` 88 files / 771 tests green.
- `pnpm --filter @multica/core test` 43 files / 390 tests green.
- Handler DB-backed tests still skip locally without docker; CI will
  validate the create / update paths against migration 108.

Co-authored-by: multica-agent <github@multica.ai>

* chore(landing): drop 0.3.7 changelog entry from this PR (MUL-2603)

The landing-page release notes belong in a separate release-prep PR, not in the feature PR.

Co-authored-by: multica-agent <github@multica.ai>

* fix(agent): propagate skills_local=ignore to codex user-skill seed (MUL-2603)

Make the per-agent skills_local toggle real for Codex too, not just Claude.
Previously the toggle was only consumed by the Claude backend, while the
daemon's execenv layer always seeded Codex's per-task CODEX_HOME with the
host machine's user-installed skills from ~/.codex/skills/. A shared Codex
agent with skills_local=ignore could still inherit a broken local skill
from one operator's machine.

Now: PrepareParams/ReuseParams carry SkillsLocal; hydrateCodexSkills
skips seedUserCodexSkills when SkillsLocal == "ignore" so the per-task
CODEX_HOME exposes only workspace skills to the codex CLI. Default
("merge", or empty from older servers/clients) preserves existing
inherit-from-machine behavior. UI / docs are updated to reflect the
contract honestly: Claude Code and Codex honor the toggle; other
runtimes (Hermes / Kimi / Kiro / Copilot / Cursor / Gemini / Pi /
OpenCode / OpenClaw) leave $HOME untouched and discover user-level
skills natively, so the toggle is a no-op for them today.

New tests: TestPrepareCodexSkillsLocalIgnoreSkipsUserSeed,
TestPrepareCodexSkillsLocalMergeSeedsUserSkills, and
TestReuseCodexSkillsLocalIgnoreSkipsUserSeed cover Prepare(ignore),
Prepare(merge), and the toggle-flip-on-reuse path.

Co-authored-by: multica-agent <github@multica.ai>

* docs(skills): scope skills_local toggle copy to Claude Code + Codex (MUL-2603)

Off-state hint and Skills tab intro now explicitly call out Claude Code +
Codex as the only runtimes that honor the toggle, with "other runtimes
ignore this setting" wired into both states (en + zh-Hans), so users on
non-Claude/Codex agents don't read "Off" as runtime-wide isolation.

Docs (skills.mdx, skills.zh.mdx, guides/agents.zh.mdx) stop describing
Hermes / Kimi / Gemini / Copilot / Cursor / Pi / OpenCode / OpenClaw / Kiro
as having native user-level skill discovery; the daemon simply does not
manage user-level skill discovery for those runtimes today, and the toggle
is a no-op regardless of where it is set.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-26 13:26:33 +08:00
Bohan Jiang
cd71b0fe05 fix(daemon): disable Codex native auto-memory in per-task config.toml (#3202)
Codex CLI's auto-memory subsystem writes summaries to
`$CODEX_HOME/memories/raw_memories.md` and `state_*.sqlite`, then reads
them back on the next turn. The daemon never cleared these files across
Reuse(), and Codex CLI may also pull from user-level `~/.codex/memories/`
entirely outside the per-task isolation. Either path leaks unrelated
context into new Multica tasks — multica#3130 saw `D:\Project\MoHaYu\
WowChat` Raw Memories injected into a brand-new issue's first turn.

Write a daemon-managed block into the per-task `config.toml` that sets
`features.memories = false`, `memories.generate_memories = false`, and
`memories.use_memories = false`. Codex then neither writes nor reads
its memory subsystem regardless of where the residual files live. The
user's global `~/.codex/config.toml` is never touched.

Pattern mirrors `ensureCodexMultiAgentConfig`: idempotent managed-block
upsert, two TOML layout variants (root dotted-key vs. inside a `[features]`
/ `[memories]` table) to satisfy strict toml-rs parsing, and a
`MULTICA_CODEX_MEMORY` env-var escape hatch.

MUL-2598

Co-authored-by: multica-agent <github@multica.ai>
2026-05-25 15:17:38 +08:00
LinYushen
8e9df90d32 feat: include repo description in agent brief (#3203)
Add Description field to RepoData structs so that workspace repo
descriptions (set via the settings UI) are preserved through
normalization and rendered in the agent brief as:
  - <url> — <description>

When no description is set, the existing format is unchanged.

Closes MUL-2610

Co-authored-by: multica-agent <github@multica.ai>
2026-05-25 15:16:22 +08:00
Bohan Jiang
a55c03a0b3 fix(agent): inject Workspace Context into agent brief (MUL-2542) (#3078)
* fix(agent): inject Workspace Context into agent brief (MUL-2542)

The per-workspace `workspace.context` field (Settings → General) was
stored in the DB but never reached the agent prompt. Plumb it from the
workspace row through the claim response, the daemon's Task struct and
TaskContextForEnv, and render it as `## Workspace Context` in the meta
brief above `## Available Commands`. Heading is skipped when the field
is empty so workspaces that haven't set a context don't see a bare
header. Applies to every task kind — issue, comment, chat, autopilot,
quick-create — so the shared system prompt is consistent regardless of
trigger source.

Co-authored-by: multica-agent <github@multica.ai>

* chore(server): gofmt files touched by workspace-context injection

Run gofmt on the files that buildWorkspaceContext injection touched.
Cleans up composite-literal alignment in execenv task context and
struct-tag alignment in Task / AgentTaskResponse / RegisterRequest.
No behavior change.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
Co-authored-by: J <agent-j@multica.ai>
2026-05-22 17:23:27 +08:00
Bohan Jiang
c967ae0e0e feat(issues): platform-owned parent notify on child done (MUL-2538) (#3055)
* feat(issues): platform-owned parent notify on child done (MUL-2538)

When a child issue transitions from a non-done status into `done` and has
an open parent, the server now posts a top-level platform-generated
comment on the parent itself. Replaces the agent-prompt rule shipped in
PR #2918, which produced self-mention loops, planner ping-pong, and
accidental `MUL-` prefix hardcoding because the agent did not always know
the workspace prefix.

- Migration 107 widens `comment.author_type` to allow `system`; the
  zero UUID is used as the sentinel `author_id` (the column stays NOT
  NULL, callers branch on `author_type === 'system'`).
- `Handler.notifyParentOfChildDone` fires from both `UpdateIssue` and
  `BatchUpdateIssues`. Guards: prev status != done, new status == done,
  parent set, parent not in `done`/`cancelled`. Bypasses the
  CreateComment HTTP path so the assignee on_comment trigger and the
  mention-trigger paths do not fire — the comment content carries only
  the safe issue mention for the child, no `mention://agent/...` /
  `mention://member/...` / `mention://squad/...` links.
- `runtime_config.go` downgrades the Parent/Sub-issue Protocol rule 1
  to an explicit "do NOT post one yourself" guardrail; rule 2 (sub-issue
  creation `--status todo` vs `backlog`) is unchanged.
- New handler test exercises the happy path, idempotency, reopen+done,
  parent done/cancelled guards, and the no-parent case. Runtime-config
  tests reassert the new wording and the banned strings from the prior
  revision.

Co-authored-by: multica-agent <github@multica.ai>

* fix(issues): isolate system comments + wire GH merge path (MUL-2538)

Addresses the two must-fix items from the PR #3055 second review:

1. The platform-generated `comment:created` event (author_type='system')
   was running through the generic comment listeners, which (a) tried to
   subscribe the zero-UUID author and (b) parsed @mentions from the body
   for inbox notifications. Both subscriber_listeners and
   notification_listeners now early-return on author_type='system' so the
   event becomes a pure WS broadcast for the timeline — no inbox rows,
   no transcluded-mention attack surface.

2. advanceIssueToDone (the GitHub merge auto-done path) only published
   issue:updated and skipped notifyParentOfChildDone, so a child closed
   via merged PR — the dominant completion path — left the parent
   silent. The helper is now invoked on the same prev/updated pair, with
   the existing guards (transition + parent state) protecting double-fire.

Tests:
- New cmd/server/notification_listeners_test:
  TestNotification_SystemCommentSkipsInboxAndMentions (parent subscribers
  and smuggled @mention targets stay quiet),
  TestSubscriberSystemCommentDoesNotSubscribe (zero-UUID never reaches
  AddIssueSubscriber).
- New internal/handler/github_test:
  TestWebhook_MergedPR_ChildWithParent_NotifiesParent fires a real
  pull_request closed-merged webhook against a child and asserts the
  parent receives exactly one safe system comment with the workspace's
  real identifier (no `mention://agent|member|squad` links).

Co-authored-by: multica-agent <github@multica.ai>

* fix(runtime): drop parent-notification guidance from agent brief (MUL-2538)

Per Bohan's product call on PR #3055: the platform now owns the
child-done parent notification, so the runtime brief should not mention
the parent-comment path at all — not as an instruction, not as a "do
not do it" guardrail. The previous revision kept rule 1 of the Parent /
Sub-issue Protocol as a "Do NOT post your own parent-notification
comment." sentence; that still puts the concept in front of the agent
every run, which is exactly what we are trying to avoid.

What changes:
- Delete the "Parent / Sub-issue Protocol" preamble and rule 1 from
  buildMetaSkillContent. The remaining content — the `--status todo`
  vs `--status backlog` rule for creating sub-issues — now lives in a
  dedicated `## Sub-issue Creation` section, since the parent/child
  framing it previously sat under is gone.
- The system comment on the parent stays exactly as in 366f6e2: the
  agent simply does not need to know about it.

Tests:
- runtime_config_test.go is rewritten around the new section name and
  the wider "no parent-notification guidance" canary; the banned list
  now covers both the original PR #2918 wording and the intermediate
  "do NOT post one" wording.

System comment UI: the frontend already renders `author_type === "system"`
with author name "Multica" (`useActorName`) and the MulticaIcon avatar
(`ActorAvatar` via `isSystem`), matching Bohan's "looks like a normal
comment, author is multica + multica logo" requirement — no frontend
changes needed.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-22 14:51:43 +08:00
Bohan Jiang
ae530ef057 docs(runtime): tighten issue-metadata write bar (MUL-2507) (#3004)
The previous wording invited agents to pin too much: any opened PR,
external link, or "fact future agents will want one-glance access to"
was framed as worth writing, with no explicit upper bound. In practice
this caused metadata bags to accumulate single-run details and
description-summary noise instead of the small set of repeatedly-read
values the feature was designed for.

Rework the agent runtime brief and the CLI docs to lead with the bar:
write a key only when it is materially important AND likely to be
re-read by future runs on the same issue. "Most runs write zero new
keys" is now stated as the expected case, and the workflow exit step
is rewritten to mirror the same gate. Recommended-key list, safety
boundaries, and stale-key cleanup are preserved so the locked-in test
anchors still pass.

Co-authored-by: multica-agent <github@multica.ai>
2026-05-21 17:20:43 +08:00
Bohan Jiang
0c767c0052 feat(issues): per-issue metadata KV (MUL-2017) (#2845)
* feat(issues): per-issue metadata KV (MUL-2017)

Adds a small JSONB KV map to every issue for agent pipeline state (attempts,
PR number, pipeline status, ...). Keys match a narrow regex, values are
primitives (string / number / bool), capped at 50 keys per issue and 8KB
per blob. Defense-in-depth via two CHECK constraints (object shape + size).

All mutations are single-key atomic (jsonb_set / `- key`). `UpdateIssue`
intentionally does NOT touch metadata: a whole-blob overwrite would race
with concurrent agent writes.

  GET    /api/issues/:id/metadata
  PUT    /api/issues/:id/metadata/:key   body: { "value": <primitive> }
  DELETE /api/issues/:id/metadata/:key

Containment filter on list: GET /api/issues?metadata=<json-object> uses
PG `@>` against a `jsonb_path_ops` GIN index. Mirrored across ListIssues,
CountIssues, ListOpenIssues, and the hand-rolled ListGroupedIssues SQL so
CLI/API and UI grouped views stay consistent.

CLI: multica issue metadata {list,get,set,delete}
  multica issue list --metadata key=value (repeatable, AND)
  set has --type to override the default value-sniffing
Co-authored-by: multica-agent <github@multica.ai>

* fix(issues): metadata test bugs + wire realtime + read-only display (MUL-2017)

- Fix two failing handler tests blocking backend CI:
  - reset decode target after delete so map merge does not mask removal
  - url.PathEscape the key segment so spaces no longer panic NewRequest
- Wire issue_metadata:changed end to end so the detail / list / my-issues
  caches stay in sync with set/delete events (other tabs, CLI writes).
- Add a read-only Metadata strip to the issue detail sidebar; hidden when
  the issue has no keys so it stays quiet in the common case.

Co-authored-by: multica-agent <github@multica.ai>

* feat(runtime): teach agents to read/write issue metadata (MUL-2017)

Add an `## Issue Metadata` section to the runtime brief plus a
`metadata list` step on entry and a `metadata set`/`delete` step on
exit. Section only emits when the task carries an issue id (comment- or
assignment-triggered); chat / quick-create / run-only autopilot stay
clean so they don't fire failing CLI calls.

Co-authored-by: multica-agent <github@multica.ai>

* fix(issues): bump metadata migration to 105 and drop attempts as example (MUL-2017)

main is now at 104_drop_runtime_timezone; the migrator picks
LatestVersion() by sorted filename, so a slot before the tail would
let DBs that have already run 099–104 think they're up-to-date while
the issue.metadata column is missing — runtime would then fail with
column does not exist. Renumbering to 105 puts the migration at the
tail and forces it to run.

Also drop attempts as a positive example across docs/code comments and
test fixtures — the runtime instruction prompt already lists it under
"What NOT to pin" (runtime bookkeeping). Replace with pr_number, which
is in the recommended-keys set, so docs/tests speak the same language
as the prompt.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-21 16:35:45 +08:00
Bohan Jiang
7f9e4e829d feat(comments): thread-internal --tail pagination + reply cursor (MUL-2421) (#2846)
* feat(comments): thread-internal pagination via --tail + reply cursor (MUL-2421)

Long threads inside a single issue still forced agents to read every reply
once they used --thread, even after MUL-2387 fixed cross-thread noise. This
adds reply-level paging so a 200-reply thread can be navigated tail-first
without dragging the whole conversation into prompt context.

- New SQL query ListThreadCommentsForIssuePaged: same recursive root walk
  as the legacy thread query, but caps reply count and supports an
  (created_at, id) composite cursor. Root is unconditional — even tail=0
  emits it so the reader keeps the "what is this thread about" context.
- Handler ListComments: parses `tail` (non-negative, ThreadTailSet flag
  preserves the tail=0 intent), threads it through to the paged query,
  and re-uses X-Multica-Next-Before / X-Multica-Next-Before-Id for the
  reply cursor. Cursor's meaning is now context-dependent: thread cursor
  under --recent, reply cursor under --thread + --tail.
- CLI: new --tail flag (only valid with --thread; mutually exclusive
  with --recent), reply-cursor semantics for --before / --before-id when
  paired with --thread + --tail, stderr label flips to "Next reply cursor"
  so an operator copy-pasting the cursor knows which scope it scrolls.
- Tests cover the new contract: tail=N keeps newest N + root, tail=0 is
  root-only, anchor on a nested reply still walks up, reply cursor
  scrolls older replies page-by-page, since combined with tail filters
  after the cut, and the negative-flag-combination matrix.

Out of scope: prompt template update to hint at `--thread <id> --tail 30`
on long threads — separate follow-up per the issue.

Co-authored-by: multica-agent <github@multica.ai>

* fix(comments): only emit reply cursor when older reply exists (MUL-2421)

The thread-tail path emitted `X-Multica-Next-Before` whenever the page
filled to exactly the requested reply count, even when there was nothing
older to scroll to. So `--thread <root> --tail 3` on a thread with
exactly 3 replies sent a cursor that, when followed, returned just the
root — a wasted round-trip that surfaced as a phantom "older replies"
affordance in the agent prompt.

Switch to a `reply_limit + 1` probe: ask the SQL for one extra row, trim
the oldest overflow before responding, and only emit the cursor when an
older reply actually existed. The exact-boundary case (replyCount ==
tail with no overflow) now returns no cursor.

Also documents `--thread/--tail/--recent/--before` and the cursor
semantics in CLI_AND_DAEMON.md, which was the second must-fix in the
MUL-2421 review.

Co-authored-by: multica-agent <github@multica.ai>

* fix(comments): suppress reply cursor when --since covers older replies (MUL-2421)

In the thread + tail + since path the server still emitted a reply cursor
whenever there was an older reply on disk, regardless of `since`. If the
oldest retained reply on the page was already `<= since`, every older
reply was guaranteed to be filtered out too, so the next page only ever
returned the root — wasting round-trips until the agent walked the whole
pre-`since` history. Mirror the recent + since suppression: when
`replies[0].CreatedAt <= since`, drop the cursor.

Test covers the exact case from Elon's review: tail=2 overflow, body
keeps a fresher reply, but the cursor target (oldest retained reply) is
already past `since` — header must be empty.

Co-authored-by: multica-agent <github@multica.ai>

* feat(prompt): default comment-trigger reads to --thread --tail 30 (MUL-2421)

Comment-triggered agents previously defaulted the trigger-thread read to
the unbounded `--thread <id> --output json`, which dumps the full thread
into the prompt — exactly the kind of context bloat MUL-2387 fixed at the
cross-thread layer but never bounded inside a single thread.

Use the new `--tail` flag landed earlier in this PR (server + CLI) as the
default for both the per-turn prompt and the runtime-config Workflow:

- `--thread <trigger-id> --tail 30 --output json` is the new default.
  Root is always included so "what is this about" context survives.
- If 30 replies aren't enough, the prompt now spells out the reply
  cursor: re-feed the stderr `Next reply cursor: --before <ts>
  --before-id <reply-id>` pair back to walk older replies.
- `--recent 20` stays as the cross-thread background fallback, with an
  explicit callout that the same `--before` / `--before-id` flags walk
  *threads* (not replies) in that mode.
- Available Commands core line now surfaces `--tail N` and both stderr
  cursor labels so non-workflow callers also discover the flag.
- `--since` callouts reflect the post-MUL-2421 combinable mode names
  (`--thread --tail` / `--recent`).

Tests (`prompt_test.go`, `execenv_test.go`) pin the new defaults and add
a regression guard against the unbounded `--thread` recipe sneaking back
in.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-21 13:43:15 +08:00
Bohan Jiang
aeb284cbeb feat(runtime): teach agents the parent/sub-issue protocol (MUL-2338) (#2918)
* feat(runtime): teach agents the parent/sub-issue protocol (MUL-2338)

Adds a Parent / Sub-issue Protocol section to the runtime brief built by
`buildMetaSkillContent`, emitted whenever the agent is running on a real
Multica issue (assignment- or comment-triggered). Two behaviors are now
documented for every issue-bound agent:

- A. When wrapping up a child issue, post the final result and switch to
  `in_review` on this issue first, then post a single top-level comment
  on the parent. Mention the parent assignee only when it is another
  agent on a still-open parent — never self-mention, never @ member /
  squad, never re-trigger a `done` / `cancelled` parent.
- B. When creating sub-issues, choose `--status backlog` for sub-issues
  that must wait and `--status todo` for the one to start immediately;
  promote with `multica issue status <id> todo` when its turn comes.

The signal is explicitly framed as best-effort — no server-side state
sync, no claim of a guaranteed handshake. The section is skipped for
chat, quick-create, and run-only autopilot runs, which have no
parent/child semantics.

Tests in runtime_config_test.go assert that the section is present in
both issue workflows, absent in the three non-issue modes, and that the
wording does not introduce a non-existent `multica issue list --parent`
command or promise a reliable handshake.

Co-authored-by: multica-agent <github@multica.ai>

* fix(runtime): split Step A of parent/sub-issue protocol by trigger type (MUL-2338)

Comment-triggered runs were inheriting an unconditional
`multica issue status <this-issue-id> in_review` from Step A, which
conflicts with the comment-triggered workflow rule "Do NOT change the
issue status unless the comment explicitly asks for it" (Elon's blocking
review on PR #2918). Step A now branches on trigger type:

- Assignment-triggered: keep "post final results + flip in_review".
- Comment-triggered: complete the reply per the existing workflow rule,
  only flip status when the triggering comment asked for it, and gate
  the parent-notification steps on actually closing out child work.

Tests lock the boundary: comment-triggered briefs must not contain the
unconditional in_review command, must echo the existing status
guardrail inside Step A, and must spell out the "closing out" gate.
Assignment-triggered briefs still carry the unconditional flip.

Co-authored-by: multica-agent <github@multica.ai>

* fix(runtime): simplify parent/sub-issue mention rule to always @ parent assignee (MUL-2338)

Per Bohan's directive on PR #2918: the per-case mention table (same agent /
member / squad / closed parent) is overkill prompt complexity. Replace it
with a single rule: always @mention the parent's assignee using the URL
that matches assignee_type. The platform's existing run dedup handles
re-triggers, and a single rule is easier for agents to follow predictably.

Preserves the existing comment-triggered boundary (Step A still does NOT
add an unconditional in_review flip on comment-triggered runs).

Co-authored-by: multica-agent <github@multica.ai>

* refactor(runtime): compress parent/sub-issue protocol to 3-rule convention (MUL-2338)

Drop the spec-flavored A/B sub-headings and per-case mention table; keep
three numbered rules (close out child, notify parent, pick backlog vs
todo) plus a one-line best-effort preamble. The comment-triggered
branch still re-asserts the "do not change status unless asked"
guardrail and gates parent notification on actually closing out child
work; the assignment-triggered branch still flips to `in_review`.

Section is now 7 lines instead of 29. A new TestParentSubIssueProtocolIsCompact
guards the ≤10-line ceiling so this stays a convention, not a spec.

Co-authored-by: multica-agent <github@multica.ai>

* fix(runtime): make sub-issue creation rule unconditional in parent/sub-issue protocol (MUL-2338)

Elon's review on PR #2918: the preamble previously gated all three
rules on the current issue having `parent_issue_id`, but rule 3
(creating sub-issues) needs to reach top-level parents that have no
parent themselves — that is exactly where the `todo` vs `backlog`
decision matters most. Move the gate from the preamble onto rules 1
and 2 per-rule; rule 3 now applies to any issue-bound run. Section
stays at 7 newlines (≤10).

Co-authored-by: multica-agent <github@multica.ai>

* refactor(runtime): unify parent/sub-issue protocol as mechanism description (MUL-2338)

Drop the if/else split between assignment- and comment-triggered runs in
the Parent / Sub-issue Protocol section: both runs now read the same
two-rule description of how the parent/child mechanism works. The
comment-triggered workflow rule "Do NOT change the issue status unless
the comment explicitly asks for it" naturally short-circuits the parent
notification (no status flip → not closing out the child → skip), so the
protocol no longer needs to branch on TriggerCommentID.

Tests collapse the two trigger-specific cases into one parameterized
test, and the assignment vs comment status-flip invariants are now
anchored on the real workflow command (with substituted issue id)
instead of the protocol's removed `<this-issue-id>` placeholder.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-20 16:20:33 +08:00
Jiayuan Zhang
2ad1cd8ff8 feat(profile): user profile description injected into agent brief (MUL-2406)
## Summary

Adds per-user `profile_description` so coding agents have cheap, durable context about who is asking. v1 per the brief Xeon locked in on [MUL-2406](mention://issue/63a7247c-4f6a-42cf-90d1-7c746e77158a):

- **DB** — `user.profile_description TEXT NOT NULL DEFAULT ''` (migration 096). 2000-rune cap enforced server-side. No nullable / privacy state to manage.
- **API** — `PATCH /api/me` accepts the field; `UserResponse` always emits it. Client wraps `updateMe` in a lenient `UserSchema` + `EMPTY_USER` fallback per CLAUDE.md API Response Compatibility.
- **UI** — Settings → Account gains an "About you" textarea with live `n/2000` counter, `maxLength` guard, and a localized too-long error (EN + zh-Hans).
- **CLI** — `multica user profile get` / `multica user profile update` with `--description / --description-stdin / --description-file / --clear`, mirroring the existing `issue comment add` input-mode menu.
- **Daemon injection** — claim handler resolves the runtime owner and stamps `requesting_user_name` + `requesting_user_profile_description` on the task. `buildMetaSkillContent` emits `## Requesting User` between `## Agent Identity` and `## Available Commands`, blockquoted and framed as background context. The block is omitted entirely when the description is empty (no token cost when unused).

Brief is written **once per task** via `CLAUDE.md` / `AGENTS.md`, not the per-turn prompt — same path the agent already reads for identity, so no extra per-turn cost.

## Test plan

- [x] `go build ./...`, `go vet ./...`, `go test ./internal/cli/ ./internal/daemon/ ./internal/daemon/execenv/ ./cmd/multica/`
- [x] New brief tests: `TestBuildMetaSkillContentEmitsRequestingUser`, `TestBuildMetaSkillContentOmitsRequestingUserWhenEmpty`
- [x] `pnpm typecheck`, `pnpm lint`, `pnpm test` (74 files, 644 tests pass)
- [ ] Handler DB tests (`TestUpdateMe*`) require a migrated test DB — not runnable in this sandbox
- [ ] Manual: open Settings → Account, set a description, confirm the next daemon-run agent's `CLAUDE.md` shows `## Requesting User`
2026-05-19 19:51:28 +02:00