Commit Graph

309 Commits

Author SHA1 Message Date
Bohan Jiang
ac75c97797 fix(desktop): disable auto-start/stop toggles for a daemon the app can't control (WSL2) (#3940)
* feat(daemon): report OS in /health response

The desktop app reads daemon liveness over HTTP but starts/stops it via the
native CLI, which acts on the host process namespace. On Windows with the
daemon in WSL2, /health is reachable via localhost forwarding yet the daemon's
process is unreachable — so the app needs a signal to tell a daemon it manages
from one it merely sees. Expose runtime.GOOS as `os` so the desktop can
compare it against its own host OS. MUL-3154, #3916

Co-authored-by: multica-agent <github@multica.ai>

* fix(desktop): disable auto-start/stop for an unmanageable daemon

When the daemon runs in an environment the app can't drive — e.g. Linux in
WSL2 behind a Windows desktop, reachable only via localhost forwarding — the
Auto-start/Auto-stop toggles silently did nothing: the lifecycle CLI acts on
the host process namespace and never reaches the daemon's PID.

Detect it by comparing the daemon's reported OS (new /health `os` field)
against the host OS, and only when a daemon is actually running. When they
differ: disable both toggles with an explanatory note, skip the version-match
restart on auto-start, and skip the no-op stop on quit. Fails safe — a missing
`os` (older daemon) or a matching OS keeps the toggles live, so native
Mac/Windows/Linux daemons are unaffected.

MUL-3154, #3916

Co-authored-by: multica-agent <github@multica.ai>

* fix(desktop): centralize externally-managed guard at the lifecycle boundary

Review follow-up. The first cut only disabled the Settings toggles, but the
same unmanageable daemon (WSL2 etc.) could still be Stop/Restart-ed from the
Runtime card and from automatic lifecycle entries (logout, user switch,
reauth, first-workspace restart) — each of which would shell out to a native
CLI that can't reach the daemon's process.

Move the guard into the main-process lifecycle functions so every entry point
is covered by construction: stopDaemon() and restartDaemon() no-op for an
externally-managed daemon, and ensureRunningDaemonVersionMatches() treats it
as up-to-date (no misleading restart). The per-branch checks in the auto-start
handler and before-quit are removed — the boundary now covers them. The
Runtime card hides Stop/Restart and shows a 'Managed outside the app' hint,
mirroring the Settings tab. Adds a component test for the card's two states.

MUL-3154, #3916

Co-authored-by: multica-agent <github@multica.ai>

* fix(desktop): preflight the lifecycle guard against live /health

Review follow-up. The guard read a cached lastExternallyManaged, which only
fetchHealth() updates — but not every lifecycle entry polls before calling
stop/restart. syncToken()'s user-switch branch calls restartDaemon() directly
after its own fetchHealthAtPort(), without refreshing the cache; on a fresh
launch / account switch (no poll yet) the cache is still the initial false, so
restartDaemon() would shell out to the native CLI and hit the very WSL/native
PID-namespace problem this PR avoids.

Make stopDaemon()/restartDaemon() preflight against a live /health read each
call instead of trusting the poll cache. The decision is extracted to a pure
daemonLifecycleUnreachable(readDaemonOS, hostOS) so a unit test can prove the
*live* value (not a cache) drives it. lastExternallyManaged is removed — the UI
already reads the per-status externallyManaged field, so it had no other
consumer.

MUL-3154, #3916

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-10 12:27:41 +08:00
Chenyu-24601
8b94764c47 feat(daemon): configurable OpenClaw binary path / state dir via CLIConfig.Backends (MUL-3157)
Summary:
- Add CLI config schema for OpenClaw backend binary path and state dir overrides.
- Apply those overrides during daemon LoadConfig using the existing env-var based probe/spawn path.
- Cover backward compatibility, precedence, partial overrides, and fail-soft config loading.

Verification:
- go test ./internal/cli ./internal/daemon
- go vet ./internal/cli ./internal/daemon
- GitHub CI passed
2026-06-09 14:05:37 +08:00
Bohan Jiang
24b162cdbc feat(daemon): surface the real task initiator to the agent runtime (MUL-2645) (#3899)
* feat(daemon): surface the real task initiator to the agent runtime (MUL-2645)

In a multi-person workspace the agent runtime only ever saw the runtime
OWNER identity: the brief's `## Requesting User` is sourced from
runtime.OwnerID and the task-scoped token is owner-bound, so every
requester (whoever commented, @mentioned, or chatted) appeared to the
agent as the owner. Agents that route by initiator for permission,
privacy, or audit all misjudged.

Resolve the real task initiator at claim time and surface it distinctly
from the owner:
- comment / mention trigger -> triggering comment's author (member or agent)
- chat task -> chat session creator (sessions are creator-only)
- on-assign / autopilot / quick-create -> no attributable initiator (omitted)

Adds initiator_{type,id,name,email} to the claim response, the daemon
Task, and TaskContextForEnv, rendered into the brief as a new
`## Task Initiator` section. The section documents the privacy boundary:
the agent's credentials stay owner-scoped, so this is an attested
identity for the agent's own routing/privacy logic, not act-as. No DB
migration — both paths are derivable from existing rows.

Tests: brief rendering (member/agent/omit/sanitize) + email guard unit
tests, and claim-handler tests for the comment and chat paths.

Co-authored-by: multica-agent <github@multica.ai>

* fix(chat): store real sender as task initiator, not chat_session creator (MUL-2645)

Review fix (Niko, PR #3899). v1 resolved the chat task initiator from
chat_session.creator_id at claim time. That is correct for web chat and
Lark p2p (creator == sender), but WRONG for Lark group chats: the group
session creator is deliberately the installer (stable identity across
member churn), not the message sender. So in a Lark group, every member
who triggered the agent showed up in the brief as the installer/owner —
the exact bug this issue is about, still live at that entry point.

Capture the real sender at enqueue time instead of deriving it from the
session creator at claim time:

- migration 117: agent_task_queue.initiator_user_id (FK user, ON DELETE
  SET NULL); NULL for non-chat and pre-migration rows.
- EnqueueChatTask now takes an explicit initiatorUserID. Web chat passes
  the authenticated request user; the Lark dispatcher threads the inbound
  sender (binding.MulticaUserID) through scheduleRun -> flushChatRun. The
  debouncer keeps the latest scheduled flush per session, so in a multi-
  sender silence window the LATEST sender wins (documented + tested).
- claim handler resolves the initiator from task.initiator_user_id and
  drops the creator_id fallback entirely.

The Lark group session creator stays the installer (unchanged) — only the
task initiator is corrected, keeping the two concepts cleanly separate.

Tests: dispatcher group regression (initiator = sender, not installer),
latest-sender-wins, p2p initiator assertion; the chat claim handler test
now sets creator != initiator and asserts the stored sender wins.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-08 19:29:57 +08:00
liujianqiang-niu
5be7d1bc17 MUL-3136 fix(openclaw): parse config path from last non-empty line of CLI output
Fix OpenClaw config discovery when `openclaw config file` prints Doctor warning UI before the actual config path. The daemon now uses the last non-empty stdout line as the path while preserving the existing tilde expansion, absolute-path validation, stat checks, and fail-closed behavior.

Tests: go test ./internal/daemon/execenv
2026-06-08 17:22:02 +08:00
Bohan Jiang
1ddf89a8f2 feat(daemon): enable Antigravity (agy) per-agent model selection (MUL-3125) (#3894)
* feat(daemon): wire agy --model and model discovery for Antigravity

agy 1.0.6 added a --model flag and an `agy models` catalog command, which
were the #1 blocker in the earlier agy-backend review (MUL-3125). The
antigravity backend already shipped but deliberately dropped opts.Model
because agy 1.0.1 had no way to select a model.

- buildAntigravityArgs now passes --model <display name> when opts.Model is
  set; the value is the exact `agy models` display string (spaces + parens),
  passed as a single exec arg so no shell quoting is needed.
- Block --model in custom_args so it can't override the managed value.
- ListModels("antigravity") enumerates via `agy models` (no static fallback:
  agy silently no-ops on unrecognised models, so a stale guess would turn a
  typo into a successful empty run).
- ModelSelectionSupported now returns true for every built-in provider; the
  hook stays for any future model-less runtime.
- Daemon probe reads MULTICA_ANTIGRAVITY_MODEL for the daemon-wide default.

Co-authored-by: multica-agent <github@multica.ai>

* docs(providers): mark Antigravity model selection as supported

Antigravity gained --model in agy 1.0.6 (MUL-3125). Update the provider
matrix + prose (en/zh/ja/ko) from "managed internally / no --model" to
dynamic discovery via `agy models`, and refresh the now-stale picker
comments. Flag the display-string (not slug) shape and agy's silent no-op
on unrecognised values.

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): reject unknown Antigravity model at spawn (MUL-3125)

agy exits 0 with empty output on an unrecognised --model, so a stale/typo'd
value would surface as a 'completed' but empty task. Validate opts.Model
against the `agy models` catalog in Execute before spawning: a non-empty
model the CLI does not advertise fails fast with an actionable error listing
the real choices. opts.Model is the single funnel for agent.model and the
MULTICA_ANTIGRAVITY_MODEL default, so this one check covers every source
(UI free-text, API, persisted value, env) — addressing Elon's review that a
UI-only guard is bypassable.

Validation is fail-OPEN: if the catalog can't be discovered we pass the
value through and let agy resolve it, so a discovery hiccup never blocks a
run. Pure antigravityModelError() is unit-tested (valid / unknown / near-miss
/ empty-model / empty-catalog); verified live against real agy 1.0.6.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-08 15:32:53 +08:00
0xMomo
ef75f80d9d fix(daemon): clean stale agent branches during repo gc (MUL-2550) (#3039)
* fix(daemon): 清理陈旧 agent 分支

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): 串行化 bare repo gc

Co-authored-by: multica-agent <github@multica.ai>

* test(daemon): adapt health repo cache mock

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): gate gc maintenance on stale-branch deletion

Address review feedback on the bare-repo GC change:

- Only run `reflog expire` + `git gc --prune=30.days` when we actually
  deleted a stale agent branch this cycle. Previously the heavy step
  ran every GC tick on every cached repo even when there was nothing
  to reclaim, turning a stale-ref cleanup into a periodic full-repo
  maintenance job under the per-repo lock.
- Split git command timeouts: `gc --prune=30.days` now gets a
  10-minute budget instead of sharing the 30s ceiling that was scoped
  for the original `worktree prune` call. Light commands stay at 30s.
- Drop the redundant `gc --auto` — `gc --prune=30.days` already
  performs the maintenance `gc --auto` would have triggered.
- Narrow the agent-namespace ref query from `refs/heads/agent` to
  `refs/heads/agent/` so the pattern can't surface a literal
  `agent` branch outside the daemon namespace.

Tests:
- New TestPruneWorktree_IgnoresLiteralAgentBranch pins the trailing-
  slash narrowing.
- New TestPruneWorktree_SkipsMaintenanceWhenNothingDeleted uses an
  unreachable, backdated loose object as a sentinel to verify that
  `gc --prune` runs only when a stale agent branch was reaped.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: 0xNini Code Dev <agent@multica.local>
Co-authored-by: multica-agent <github@multica.ai>
Co-authored-by: 0xNini <0xnini@iMac-Pro.local>
Co-authored-by: J <j@multica.ai>
2026-06-08 15:25:14 +08:00
Bohan Jiang
3808049361 fix(codex): set semantic thread names (#3887)
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-08 14:53:31 +08:00
NanamiKite
2e34016f1f fix(daemon): interrupt local agent on server-side terminal task states (#3878)
shouldInterruptAgent now treats every terminal task status (completed/failed/cancelled, via isAgentTaskTerminal) plus a 404 task-not-found as an interruption signal, so the daemon stops a local agent once the backend has finalized the task — e.g. the runtime offline sweeper flipping running -> failed during a disconnect/reconnect. Previously only `cancelled`/404 interrupted, so the agent ran to completion and its CompleteTask call failed against a non-running row, wasting compute and adding log noise.

Closes #3877
2026-06-08 14:00:30 +08:00
HMYDK
4190de3d64 fix(skills): quote description values in built-in SKILL.md YAML frontmatter (#3852)
Built-in SKILL.md description values contained unquoted ': ' sequences, which strict YAML parsers (e.g. Codex) reject — silently dropping the skill at load.

- Quote all eight built-in skill descriptions.
- ensureSkillFrontmatter() re-synthesizes frontmatter that has a name but fails YAML validation, so malformed imports are repaired instead of dropped.
- Unify frontmatter delimiter parsing into a single frontmatterParts helper.
- Add strict-YAML regression tests over the built-in skills, plus unit tests for the recovery branch and delimiter variants.

Closes #3851.
2026-06-08 13:10:24 +08:00
Wes
5e7587ad07 Optimize daemon runtime wakeups (#3859) 2026-06-08 12:51:13 +08:00
Xinmin Zeng
270d177475 fix: broken "Add a computer" command on Multica Cloud + two CLI amplifiers (MUL-3087) (#3817)
* fix(server): recognize official cloud by frontend host in daemon setup config

The 'Add a computer' dialog builds its command from /api/config's
daemon_server_url/daemon_app_url, falling back to 'multica setup' when
both are empty. The official cloud is meant to omit them, but the
omission only fired when MULTICA_PUBLIC_URL=https://api.multica.ai. When
that env is unset the server URL defaults to the frontend origin and the
old guard (which required serverURL host == api.multica.ai) didn't match,
so the dialog emitted 'multica setup self-host --server-url
https://multica.ai' — pointing the daemon backend at the frontend (no
/health, no WebSocket proxy).

Identify the official cloud by its frontend host alone (multica.ai /
app.multica.ai) so a missing or misconfigured MULTICA_PUBLIC_URL can no
longer leak the broken self-host command. Regression from #3474.

* fix(cli): probe before persisting self-host config to preserve auth on failure

setup self-host wrote a fresh CLIConfig{ServerURL, AppURL} (a full
overwrite that drops the saved token) and only then probed the server,
returning early on failure. A failed probe therefore logged the user out
and left them unconnected, with no recovery in the same command.

Probe first via persistSelfHostConfigIfReachable: an unreachable server
leaves the existing config — and its token — untouched (failed setup =
no-op). The prober is injected so both branches are unit-tested.

* fix(daemon): serve health before preflight so daemon start readiness is accurate

The CLI's 'daemon start' polls the health endpoint for 15s expecting
status=running, but the daemon only began serving health after
preflightAuth, whose initial workspace sync detects every configured
agent's version by exec'ing it (~20s cold with 8 agents). Health served
too late, so a perfectly healthy daemon printed 'may not have started
successfully'.

Start the health server right after resolveAuth (which still fails fast
on a missing token) and before the slow preflight, so readiness reflects
the daemon core being up rather than agent-version detection finishing.

* fix(daemon): gate /health readiness so daemon start can't report a false start

Serving health before preflightAuth fixed the false-negative (a healthy
daemon printed "may not have started"), but health still returned
status:"running" unconditionally — before preflight (PAT renew + workspace
sync + runtime registration) had completed. `daemon start` and the desktop
treat "running" as ready, so a slow or *failing* preflight could be
misreported as a started daemon: setup prints "connected", then the process
exits or hangs in agent-version detection with no runtime registered. That
is harder to diagnose than the original false-negative.

Split liveness from readiness: bind/serve the health port early (so callers
see a live "starting" daemon instead of connection-refused), but report
status:"starting" until d.ready is set after preflight, then "running".

- daemon.go: add d.ready (atomic.Bool); set it true after the background
  loops launch, before pollLoop.
- health.go: healthHandler reports "starting" until ready, else "running".
- cmd_daemon.go: `daemon start` waits for "running" with a deadline raised
  to 45s (covers cold-start agent detection) and a clearer "still starting"
  message; new daemonAlive() helper treats both "running" and "starting" as
  a live daemon, so the already-running guard, restart, and stop act on a
  starting daemon and don't double-spawn or race its listener; `daemon
  status` shows "starting" distinctly.

Older CLIs/desktop that only know "running" safely treat "starting" as
not-ready (status != "running"), so no boundary break.

Tests: health reports starting-then-running; daemonAlive truth table.
Co-authored-by: multica-agent <github@multica.ai>

* fix(desktop): handle daemon "starting" health status in lifecycle

The daemon now reports /health status:"starting" until preflight completes
(liveness/readiness split). That made "starting" a new external contract of
/health, but the Desktop daemon-manager only knew "running", so the readiness
fix would have moved the CLI's false-negative into a Desktop start regression:

- `daemon start` now blocks up to 45s waiting for readiness, but the Desktop
  spawned it via execFile({ timeout: 20_000 }). On a cold start (the ~20s agent
  detection this PR targets) Electron killed the CLI supervisor at 20s and
  reported a start failure, even though the detached daemon child kept booting —
  the UI flashed "stopped" then "running". Raise the timeout to 60s (must exceed
  the CLI's 45s startupTimeout).
- The Desktop treated only raw status === "running" as a live daemon, so a
  daemon that was still "starting" (booting on its own or started via the CLI)
  showed as "stopped", and startDaemon() would spawn a second one — which the new
  CLI rejects as "already running", surfacing as a start error.

Add daemonStatusAlive() (shared, pure, unit-tested) mirroring the Go daemonAlive()
and use it for liveness: fetchHealth() surfaces a daemon-reported "starting" as
state "starting" regardless of our own currentState; startDaemon()'s
already-running guard and the restart-on-user-switch guard treat "starting" as an
existing daemon. version-decision stays gated on "running" (readiness, not
liveness) — unchanged.

Verified: desktop typecheck, eslint, full vitest suite (193 tests) all pass.
Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-05 17:01:23 +08:00
Bohan Jiang
3708fb0f07 fix(daemon): inactivity-based agent run timeout, no wall-clock guillotine (MUL-3064)
Active long-running sessions are no longer killed by a fixed wall-clock deadline. Liveness is delegated to the idle watchdog (MULTICA_AGENT_IDLE_WATCHDOG, default 30m) with a larger in-flight-tool budget (MULTICA_AGENT_TOOL_WATCHDOG, default 2h). MULTICA_AGENT_TIMEOUT is an opt-in absolute cap (default 0 = no cap). The server-side 2.5h sweeper is unchanged as a coarse backstop.

Fixes #3745.
2026-06-05 15:06:07 +08:00
Multica Eve
63b847ee48 Honor agent identity in assignment workflow (#3802)
Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-05 13:45:43 +08:00
Naiyuan Qing
b9334dd59f fix: anchor comment triggers to thread roots (#3746)
Co-authored-by: multica-agent <github@multica.ai>
2026-06-04 13:47:05 +08:00
Bohan Jiang
d6a556bdbf fix(execenv): refresh skills in place on reuse instead of accumulating duplicate dirs (#3716)
Re-dispatching the same agent on the same issue reuses the persistent workdir
via execenv.Reuse(), where the standard-provider skill refresh re-wrote skills
without clearing the prior dispatch's output, so allocateCollisionFreeSkillDir
dodged Multica's own directories into issue-review-multica-N.

On reuse, reclaim the platform-owned managed skill directories the prior
manifest recorded (removeReusedManagedSkillDirs) and roll back the remaining
sidecar files (CleanupSidecars) before refreshing, so each skill lands at its
canonical slug every dispatch. Mirrors the Codex hydrateCodexSkills wipe;
scoped to reuse, which never runs for local_directory tasks.

Fixes #3684 (MUL-2963).
2026-06-03 19:30:42 +08:00
Naiyuan Qing
1544e3b68a feat(skills): built-in agent skills (WIP) MUL-2759 (#3456)
* feat(skills): introduce built-in agent skills (WIP)

Inject platform-authored, version-bundled skills into every agent on top of
its workspace-bound skills, so agents learn how to operate Multica correctly
without users needing to know the internals or agents needing to read source.

Mechanism: skills are embedded into the server binary and appended to the
agent payload at task-claim time (handler/daemon.go), reusing the existing
SkillData wire + daemon-side writeSkillFiles. The daemon needs no changes,
and because it travels over an existing wire field, older daemons pick the
skills up the moment the server ships.

First skill: multica-mentioning — how to build a working @mention (look up
the UUID, match type to id source, know what each mention type triggers).

WIP: injection mechanism + first skill only; more skills to follow in
dependency order (skill -> agent -> squad).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(skills): make multica-mentioning the standard template + add eval

Add the contract-skill frontmatter the other built-in skills will copy:
user-invocable:false (it triggers from context, not as a slash command)
and allowed-tools fencing it to the multica CLI it teaches. These keys
survive to agent machines untouched (ensureSkillFrontmatter only ever
adds a missing name).

Add a Go eval in builtin_skills_test.go (a _test.go so it never ships to
agent machines via the skill-files walk):
- Enforces the template invariants on every built-in skill, present and
  future: multica- prefix, name+description present, description within
  1024 chars, body within the 500-line L2 budget, no eval file leaking
  into the shipped payload.
- Couples the mentioning skill's documented contract to the real
  util.ParseMentions: its Incorrect examples must parse to nothing (a
  name where a UUID belongs fails silently) and its Correct example must
  fire. A drift in the mention regex now breaks CI instead of silently
  turning the skill into a lie agents act on.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* feat(skills): add working-on-issues built-in skill

Co-authored-by: multica-agent <github@multica.ai>

* feat(skills): verify linked PRs in issue workflow skill

Co-authored-by: multica-agent <github@multica.ai>

* feat(skills): add skill import and discovery built-ins

Co-authored-by: multica-agent <github@multica.ai>

* feat(skills): add skill authoring built-in

Co-authored-by: multica-agent <github@multica.ai>

* docs(skills): align builtin skill workflows

Co-authored-by: multica-agent <github@multica.ai>

* docs(skills): use structured skill search

Co-authored-by: multica-agent <github@multica.ai>

* fix(skills): make built-in skill bundle launch-ready

Co-authored-by: multica-agent <github@multica.ai>

* fix(skills): align built-ins with additive skill binding

Co-authored-by: multica-agent <github@multica.ai>

* feat(skills): add creating agents built-in skill

Co-authored-by: multica-agent <github@multica.ai>

* Add built-in squads skill

Co-authored-by: multica-agent <github@multica.ai>

* refactor(skills): rewrite built-in skills as source-traced contracts

Rewrite the built-in agent skills to the inbuilt-skill-authoring standard:
state source-traced product facts with the source-code link logic as the
core, not prescriptive how-to coaching.

- creating-agents: drop the Decision-flow / Do-don't-consequences
  methodology; replace with field/behavior contracts (validation, persisted
  shape, daemon claim-time consumption, env gating, skill binding).
- skill-discovery: stop teaching repo/github_stars as selection signals —
  searchClawHubSkills never populates them (always null); rank by
  install_count + source/url + description. Add file:line citations.
- mentioning: drop the unbacked "member mention sends a notification" claim
  (no such path in the comment handler); state that only agent/squad
  mentions enqueue work. Tighten the parser-failure wording.
- working-on-issues: refresh citations drifted by the main merge; describe
  the PR response `state` enum accurately; trim status coaching.
- skill-importing: correct response type to SkillWithFilesResponse; document
  the reserved SKILL.md supporting-file rule; add line-accurate citations.
- squads: correct the "leader cannot be archived" overstatement (not
  rejected at create/update; fails closed later at routing/dispatch);
  refresh source-map attributions and test list.

Each skill now ships references/<skill>-source-map.md as its evidence layer
(line-accurate citations live there, not pinned in the test, so a future
main merge cannot rot them into stale lies).

builtin_skills_test.go: replace coaching/line-number pins with drift-resistant
contract anchors, forbid the coaching phrasing, and require every skill to
ship its source-map. The ParseMentions behavior coupling is preserved.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* docs(skills): close field-role and citation gaps found in review

Independent review of the rewritten built-in skills surfaced two real gaps and
some citation drift; this fixes them.

- creating-agents: add the three missing field rows (visibility,
  max_concurrent_tasks, mcp_config) to the field-contract table — mcp_config is
  runtime-consumed (TaskAgentData, daemon.go), visibility is access-control
  (default private), max_concurrent_tasks is a scheduler cap (default 6).
  Mark custom_args/runtime_config JSON validation as CLI-side (the server
  marshals as-is). Correct the CLI body-builder note (description/instructions
  use a non-empty check, the rest use Changed). Source-map: fix the env query
  name (UpdateAgentCustomEnv), the conformance test name, and add the new field
  defaults + the McpConfig runtime-payload line.
- mentioning: the @squad mention private gate is canAccessPrivateAgent, not
  canEnqueueSquadLeader (that wrapper is the assignment/child-done path).
- working-on-issues: cite notifyParentOfChildDone at its func def (:51), not the
  doc comment (:15).
- skill-importing: config.origin is set only when the source supplied an origin
  — note it may be absent; cite createSkillWithFiles at its definition
  (skill_create.go:72), not the call site.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* Add built-in skills for autopilots runtimes and resources

Co-authored-by: multica-agent <github@multica.ai>

* feat(runtime): list skill descriptions in the brief Skills index

The brief's `## Skills` section emitted bare skill names only, discarding
the one-line description that SkillContextForEnv already carries. For
Claude-family providers the frontmatter description is loaded natively;
for providers without native skill discovery (hermes/default) the brief's
list is the only signal they ever see, so a bare name gave them nothing
to decide when to load a skill. Emit `name — description` when a
description is present, falling back to the bare name when it is empty.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* refactor(skills): drop CLI-only rule from working-on-issues

The "Platform data goes through the CLI" section duplicated the runtime
brief's `## Important: Always Use the multica CLI` section verbatim (and
the attachment-via-CLI note duplicated the brief's `## Attachments`). The
CLI-only rule is universal and must be known before any skill loads, so
the brief is its single source of truth; the skill copy was pure
redundancy and a drift risk. Remove it and the matching intro clause.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* refactor(skills): remove discovery guidance from built-ins

* docs(skills): remove stale skill-necessity records

The per-skill necessity records had drifted to 3 of 8 shipped skills plus a
record for `multica-skill-authoring`, which is not a shipped built-in skill.
Per-skill "why it exists / when to use it" already lives co-located with each
skill (frontmatter `description` + `references/<skill>-source-map.md`) where it
cannot drift from the skill, and the doc's methodology duplicated the
workspace's inbuilt-skill-authoring protocol. Remove the file rather than keep a
parallel listing that every new skill has to remember to update.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* feat(runtime): add source-authority escape hatch to the brief

The brief already tells agents to run `--help` for command discovery, but
nothing stated the trust precedence when a skill, the brief, or a doc seems to
contradict actual behavior. Add one line to the Available Commands escape-hatch
note: trust the live CLI (`--help`/`--output json`) and the checked-out source
over source-traced prose that can lag the code, and verify on any conflict or
confusion. Kept in the always-on brief (universal, needed before any skill
loads) rather than duplicated into each skill; per-skill source-map pointers
remain the specific layer.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* fix(runtime): scope the source-authority escape hatch to the CLI

The previous version told agents the "checked-out source is the deeper
authority" for verifying behavior. That over-claims: the repos in a task's
brief come from GetWorkspaceRepos + project github_repo resources (per-workspace
config, see daemon.registerTaskRepos), not the Multica platform source. A
generic agent's checked-out source is its own app, not Multica's code, so it
cannot verify a Multica skill/brief claim against it. The only universally
available authority for Multica behavior is the live CLI (`--help` /
`--output json` / observed command behavior). Re-scope the line accordingly and
state plainly that the platform's source is not in the workdir.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* revert(runtime): drop the source-authority escape-hatch line

Reverts the brief addition from fdd5e82df and its follow-up cc67b2088. The
`--help` discovery fallback already in the Available Commands note is enough;
the extra trust-precedence sentence was unnecessary. runtime_config.go is now
identical to 6ca27ad74.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* docs(claude): remind to update built-in skills on CLI/field/behavior changes

Add a Coding Rule: when a change touches a CLI command/flag, API field, or
product behavior that a built-in skill documents, update that skill's SKILL.md
and source-map in the same PR. Lives in the repo dev-guide (read when working in
this repo), not the runtime brief — the runtime brief is injected into every
workspace, where most agents have no Multica skill to update. AGENTS.md is a
pointer to CLAUDE.md, so no mirror needed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-03 18:51:03 +08:00
Multica Eve
10afd1af1b feat(server): introduce pkg/taskfailure classifier and switch in-flight failure_reason writes (MUL-2946) (#3693)
Lift MUL-1949's offline backfill failure_reason taxonomy into a shared
in-flight classifier so the agent_task_queue.failure_reason column is
written with refined values (provider_auth_or_access, context_overflow,
provider_capacity_or_rate_limit, …) at write time rather than waiting on
SQL backfill to re-classify after the fact. PR1 of the Grafana board
plan in MUL-2328 — the upcoming PR2 reuses pkg/taskfailure.AllReasons()
to pre-warm the Prometheus failure_reason label set.

* server/pkg/taskfailure: new package with the canonical 21 Reason
  constants (7 platform-side + 14 agent_error.* sub-reasons),
  AllReasons() returning a defensive copy, IsAgentError() prefix check,
  and Classify(rawError) Reason mirroring the SQL CASE rules from
  MUL-1949 (db-boy's analysis). 100% statement coverage.
* server/internal/daemon/daemon.go: route the 'agent_error' coarse
  fallback paths (StartTask error, runTask early-return error, CompleteTask
  permanent rejection, reportTaskResult default branch) and the
  executeAndDrain default error case (chained after classifyPoisonedError)
  through taskfailure.Classify so blocked / timeout / unknown-status
  results all carry a refined reason on the wire.
* server/internal/service/task.go: FailTask classifies errMsg when the
  daemon-supplied failureReason is empty, eliminating the legacy
  COALESCE(.., 'agent_error') landing.
* server/internal/daemon/poisoned.go: alias FailureReasonIterationLimit
  and FailureReasonAPIInvalidRequest to the canonical taskfailure
  constants. agent_fallback_message and codex_semantic_inactivity are
  pre-existing operational reasons not in the canonical 21 — kept as
  literals for now and revisited in a follow-up PR.

Backfill SQL from MUL-1949 stays as the authoritative offline source of
truth; this PR keeps the in-flight classifier in lock-step with the SQL
CASE expression so historical and future rows share the same taxonomy.
No behavior change for the platform-side reasons (queued_expired,
runtime_offline, runtime_recovery, timeout, etc.) which already align
with the canonical set.

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-03 13:52:56 +08:00
Bohan Jiang
44feb3d06d fix(skill): canonicalize reserved SKILL.md path check across daemon + API (#3660)
A skill_file row whose path is the skill's own SKILL.md (persisted by
older builds or direct create/update API calls) collides with the
primary content the daemon writes itself, failing task prep with
errPathPreExists on every non-codex local runtime (#3489).

#3526 guarded this with strings.EqualFold(path, "SKILL.md") at the
daemon write site and the three API ingress points, but the stored path
is not canonicalized: "./SKILL.md" or "sub/../SKILL.md" slip past the
exact-match guard while filepath.Join still resolves them onto the same
SKILL.md, so prep can still break.

Extract one canonical helper, skill.IsReservedContentPath, that cleans
the path before the case-insensitive compare, and use it at all four
sites (execenv writeSkillFiles, skill create, update, single-file
upsert). Add a daemon-side regression test for writeSkillFiles ignoring
a bundled SKILL.md (exact + "./" spellings) — the load-bearing fix
previously had only API-layer coverage — plus a unit test for the helper.

Existing poisoned rows are intentionally left in place (skipped at prep)
per the decision on MUL-2928.

MUL-2928
Follow-up to #3526; supersedes #3560.

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-02 18:23:57 +08:00
MSandro
996eb07dc5 fix(daemon): skip duplicate SKILL.md in supporting files to prevent task prep failures (#3526)
Fixes #3489

MUL-2928
2026-06-02 17:53:20 +08:00
Bohan Jiang
888186b183 fix(daemon): make comment-posting guardrail provider-agnostic (MUL-2904) (#3654)
* fix(daemon): make comment-posting guardrail provider-agnostic (MUL-2904)

Agents inlining a backtick-wrapped token into `multica issue comment add
--content "..."` had the shell run it as a command substitution, silently
deleting the token; the stored comment never matched the model's intent, so
it retried forever — spamming OKK-497 with duplicate comments.

The corruption is shell-driven, not provider-driven, so extend the
"never inline --content; use --content-file / quoted-HEREDOC --content-stdin"
rule from Codex-only to ALL providers:

- BuildCommentReplyInstructions: collapse the Linux/macOS non-Codex inline
  branch into the unified quoted-HEREDOC stdin template.
- buildMetaSkillContent: rename "Codex-Specific Comment Formatting" ->
  "Comment Formatting" and emit it for every provider; strengthen the
  Available Commands entry and the assignment step-6 examples to steer away
  from inline --content.
- Windows behavior unchanged (file-only; avoids PowerShell ASCII drop).

Tests: flip the non-Codex Linux reply test into a MUL-2904 regression,
broaden the stdin-emphasis test across providers, and pin the
provider-agnostic guardrail.

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): keep Windows assignment brief file-only (address review)

Review catch on #3654: the previous commit added platform-agnostic prose
recommending "--content-file or --content-stdin" in the Available Commands
entry and the assignment-triggered step-6 example. The assignment path has
no BuildCommentReplyInstructions OS override, so on Windows an agent following
step 6 literally would pipe its final comment through PowerShell and drop
non-ASCII bytes (#2198 / #2236 / #2376) — contradicting this PR's own
Windows file-only rule in the ## Comment Formatting section.

Make the platform-agnostic surfaces defer to the OS-aware ## Comment
Formatting section (the single source of truth) instead of naming stdin.
The flag synopsis still lists all three modes.

Add TestInjectRuntimeConfigWindowsAssignmentBriefStaysFileOnly: a Windows
assignment-triggered brief must not contain any prescriptive "... or
--content-stdin" recommendation.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-02 17:47:09 +08:00
Qi Yijiazhen
91c1e51411 feat(editor): add / slash-command palette for invoking agent skills (#3159)
* feat(editor): add / slash-command palette for invoking agent skills

Adds a `/` trigger in the chat box that opens a popover listing the active
agent's skills. Selecting an item inserts a `[/label](slash://skill/<id>)`
token; the daemon extracts those IDs in `buildChatPrompt` and emits an
"Explicitly selected skills:" block using the canonical names from the
agent's skill registry — labels are display-only and never trusted.

Built on Tiptap's `Mention` extension so the suggestion lifecycle,
keyboard routing, and IME handling mirror the existing `@` mention UX.
Item list is sourced from the React Query workspace cache (no per-keystroke
fetch). Gated behind a new `enableSlashCommands` prop so only `chat-input`
opts in; other `ContentEditor` consumers (issue editor, comments) are
unaffected. Read-only markdown surfaces render the token as a `.slash-command`
pill via a custom link renderer + sanitize-schema/url-transform allowlists.

Closes #3108

* fix(i18n): add slash_command editor copy for ko/ja

The PR added slash_command popover empty-state keys to en + zh-Hans only;
locales/parity.test.ts requires every locale to cover every EN key, so ko
and ja failed CI. Add the two keys (no_skills_configured, no_results)
matching existing skill terminology (스킬 / スキル).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Naiyuan Qing <145280634+NevilleQingNY@users.noreply.github.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 15:57:42 +08:00
Bohan Jiang
0f9d9d1494 fix(skills): align Go/TS frontmatter coercion for non-scalar values (#3614)
The Go SKILL.md frontmatter parser unmarshalled into a {Name,Description}
string struct, so a non-scalar value (a list/map written where a scalar
belongs) made the whole decode fail and dropped even a valid sibling
`name`. The TS parser instead kept the name and JSON-encoded the value,
so the file-viewer (TS) and the import path (Go) could disagree about
the same SKILL.md.

Decode into a generic map and coerce per key on the Go side, mirroring
the TS coercion (scalars -> literal form, sequences/mappings -> JSON), so
both sides produce identical results and a structured value never
discards a sibling key. Rename ParseFrontmatter -> ParseSkillFrontmatter
to remove the cross-language name clash with the TS parseFrontmatter
(which returns {frontmatter, body}), and drop the unused TS
parseSkillFrontmatter export.

Add parity tests for sequence/mapping values plus name-only,
description-only, leading-blank-line and triple-dash-in-body edge cases
on both sides.

Follow-up to #3543 / MUL-2842.

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-01 19:42:20 +08:00
YOMXXX
801c201d4c fix(skills): parse multi-line YAML frontmatter in SKILL.md (#3495) (#3543)
Three independent line-based frontmatter parsers only handled
single-line `description: value`, so a YAML block scalar
(`description: |`) collapsed to the literal "|" and the rest of the
description was dropped before it ever reached the database.

Replace all three with real YAML decoders that understand block
scalars, folded scalars and quoted values:

- server/internal/skill: shared ParseFrontmatter via gopkg.in/yaml.v3,
  used by both the handler import path and daemon local-skill discovery
- packages/core/skills: shared parseFrontmatter via the yaml package
- file-viewer renders multi-line frontmatter values (whitespace-pre-wrap)

Both parsers fall back to empty values on malformed YAML, preserving the
previous non-fatal behaviour.
2026-06-01 19:35:01 +08:00
Naiyuan Qing
3c8645e546 feat(cli): add squad member set-role (#3583)
Co-authored-by: multica-agent <github@multica.ai>
2026-06-01 12:51:15 +08:00
Naiyuan Qing
4ae4722ef0 fix(comments): preserve direct parent on replies (#3579)
Co-authored-by: multica-agent <github@multica.ai>
2026-06-01 08:28:15 +08:00
Naiyuan Qing
973a43923f fix(comments): revert since-delta to issue-wide, steer to parent thread first (#3535)
#3509/#3523 scoped the comment-trigger since-delta count to the triggering
thread, so an agent resuming a busy issue only saw "+N in this thread" and
lost visibility of new comments in other threads. Revert the count to
issue-wide (every thread), keeping the trigger-comment + agent-own
exclusions, and reshape the warm-path hint to:

  - report the issue-wide new-comment volume,
  - steer the agent to read the triggering (parent) thread FIRST
    (`--thread <trigger> --since`, or `--tail 30` for full context),
  - demote the issue-wide `--since` catch-up to an only-if-needed fallback
    ("don't read them all blindly").

Also fixes the now-stale "scoped to the triggering thread" wording in the
resumed-session no-delta hint (it's issue-wide zero now).

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-29 20:13:23 +08:00
Multica Eve
d1c7d478e1 MUL-2785: clarify thread-scoped comment delta (#3523)
Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-29 17:16:58 +08:00
Multica Eve
9616d78e47 MUL-2785: optimize resumed comment reads (#3509)
* feat(comments): skip default thread read on resumed comment sessions

Co-authored-by: multica-agent <github@multica.ai>

* fix(comments): scope since delta to trigger thread

Co-authored-by: multica-agent <github@multica.ai>

* chore(comments): address thread delta review nits

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-29 14:57:14 +08:00
Naiyuan Qing
3187bbf90c feat(comments): re-add since-delta + cold-start thread read + parent-root write normalization (#3494)
* feat(comments): since-delta new-comment hint + default-on comment session resume (#3432)

* feat(db): add unresolved comment count + list filter queries

Add CountUnresolvedComments (excludes the agent's own comments) and
ListUnresolvedCommentsForIssue. Both are additive — existing callers stay
on the unfiltered queries — so old clients are unaffected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(handler): support unresolved-only comment listing

Wire an additive `unresolved` query param into ListComments. Defaults off
so an old CLI that never sends it gets unchanged behavior; only true/1
enable it. Rejects combining unresolved with thread/recent (whole-issue
filter vs navigation models). Includes filter + count query tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(handler): plumb unresolved count + thread root into claim, gate comment resume

Populate trigger_parent_id (thread root of the trigger comment) and
unresolved_count (excludes the agent's own comments) on comment-triggered
claim responses. Both fields are omitempty so old daemons ignore them.

Gate comment-triggered session resume behind MULTICA_RESUME_COMMENT_SESSION
(default off): resumed comment turns can inherit the prior turn's "Done."
final message, so this stays an explicit rollout switch. The runtime-match
and poisoned-session guards still apply regardless of the flag.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(daemon): inject unresolved-comments hint + resolve step into agent brief

Add a shared BuildUnresolvedCommentsHint helper rendered on both the
per-turn prompt and the CLAUDE.md workflow (kept in sync per PR #2816). It
ships only the count and the relevant CLI call — never comment bodies — so
the server stays cheap. Thread case points at --thread <root>; issue case
points at --unresolved. Suppressed when the count is 0.

Also add a workflow step telling the agent to `multica comment resolve
<thread-root>` once a thread is fully handled, so the unresolved set
converges.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cli): add comment list --unresolved and comment resolve command

Add an --unresolved filter to `issue comment list` (wired to the server's
unresolved param, rejected when combined with --thread/--recent) and a
top-level `comment resolve <id>` command that POSTs to the existing
/api/comments/{id}/resolve endpoint, letting an agent close threads it has
fully handled.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(comments): since-delta new-comment hint + default-on comment resume

Simplifies the comment-triggered agent flow down to what's actually needed:

- New-comment awareness is now a pure time delta: the claim response carries
  new_comment_count + new_comments_since (anchored on the prior run's
  started_at, never completed_at so a long run can't miss comments). The
  per-turn prompt and CLAUDE.md workflow render one line — "N new comment(s)
  since your last run, --since <ts>" — via a shared BuildNewCommentsHint so the
  two surfaces can't drift. Cold start (no prior run) falls back to a plain read.
- Comment-triggered tasks resume the prior session by default (same runtime),
  dropping the MULTICA_RESUME_COMMENT_SESSION rollout gate. The "Focus on THIS
  comment" prompt guard defends against inheriting the prior turn's "Done."
  marker; GetLastTaskSession still excludes poisoned sessions.
- Drops the resolved-based machinery from the first draft: CountUnresolvedComments
  / ListUnresolvedCommentsForIssue queries, the `comment list --unresolved`
  flag, the `multica comment resolve` command, and the resolve workflow step.
- Removes the verbose cursor-pagination paragraph from the comment prompt; the
  --thread/--recent/--since flags stay in the CLI/API, just no longer explained
  inline every turn.

Compatibility: new claim fields are omitempty (old daemons ignore them).
Comment resume is default-on and affects even old daemons, which already
consume prior_session_id.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(comments): collapse reply parent_id to thread root on write

Comment threads are a 2-level model (root + flat replies, like Linear/Slack),
enforced today only by the UI and the agent path — the CreateComment handler
stored whatever parent_id it was handed, and the agent-side flatten walked just
one level, so a reply-to-a-reply could land at depth 3+. Add GetThreadRoot (a
recursive walk to the parent_id=NULL root) and run both write paths
(handler.CreateComment, service.createAgentComment) through it, so every stored
reply's parent_id IS its thread root. Readers can now treat parent_id as the
thread root without re-walking. The agent-drift guard still compares the raw
parent_id to the trigger comment before normalization.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(comments): cold-start reads triggering thread, warm keeps --thread pointer

The since-delta rework dropped the thread-first read on the COLD path: a
first-time agent fell back to the flat `comment list` dump (oldest-first, cap
2000), burying the trigger's context in ancient chatter. Point cold start at the
triggering conversation instead via a shared BuildColdCommentsHint
(`--thread <trigger> --tail 30` + a --recent pointer for cross-thread
background). On the WARM path, --since is a pure time delta and can miss the
triggering thread's pre-anchor history, so BuildNewCommentsHint now also emits a
--thread pointer. Both surfaces (per-turn prompt + CLAUDE.md workflow) render via
the shared helpers so they cannot drift (PR #2816 rule).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-29 10:38:37 +08:00
Bohan Jiang
fa076d38f2 MUL-2778 feat(agent): wire mcp_config through OpenClaw runtime (#3450)
* MUL-2778 feat(agent): wire mcp_config through OpenClaw runtime

The MCP config tab (#3419) lets admins save mcp_config on an agent, and
recent work (#3439) plumbed it through the three ACP runtimes. OpenClaw
still ignored the field, leaving the Tab silently inert for any
OpenClaw-backed agent.

Translate the agent's Claude-style `{"mcpServers": {...}}` into the
per-task OpenClaw wrapper's `mcp.servers` block — OpenClaw resolves MCP
via its own config schema rather than ExecOptions, so the existing
OPENCLAW_CONFIG_PATH preparer is the right seam. Fail closed on
malformed JSON / entries missing `command` or `url`, matching the
fail-closed posture the preparer already uses for the agents.list step.
Null / absent mcp_config leaves the wrapper free of an `mcp` key so the
user's global mcp.servers flows through untouched; an explicit empty
managed set (`{}` / `{"mcpServers":{}}`) is honoured as "admin saved no
servers" mirroring `hasManagedCodexMcpConfig`.

Strict-mode replacement (drop user-only servers entirely) would require
OpenClaw to do a per-key replace rather than a deep merge at
`mcp.servers`; the comment documents that caveat rather than relying on
undocumented behaviour.

Also adds `openclaw` to `MCP_SUPPORTED_PROVIDERS` so the MCP Tab
actually surfaces in the agent overview pane, and pins the new
visibility case with a renderPane test.

Co-authored-by: multica-agent <github@multica.ai>

* MUL-2778 fix(agent): make openclaw mcp_config strict-replace via sanitized snapshot

Elon flagged on #3450 that the previous wiring let user-only mcp.servers
leak through the wrapper's `$include` of the live user config: deep-merge
at `mcp.servers` keeps user-only names, and the strict-empty case
(`{ "mcpServers": {} }`) silently inherited user globals.

Switch the strict-replace path to write a sanitized snapshot of the
user's fully resolved config (via `openclaw config get --json`) with the
`mcp` block stripped, then have the wrapper `$include` the snapshot
instead of the live user file. With the user's `mcp` gone from the
$include resolution, the wrapper's `mcp.servers` is the only definition
the embedded OpenClaw sees — managed only, including the explicit empty
set.

The snapshot lives in envRoot at 0o600 alongside the wrapper so the GC
reaper sweeps it with the rest of the task scratch, and no extra
OPENCLAW_INCLUDE_ROOTS entry is needed (same-dir $include).

Fail-closed on `config get --json` errors so the daemon never silently
falls back to the leaky $include path. The inherit branch (null
mcp_config) still uses the live user file directly — no extra CLI
roundtrip and no snapshot is written.

New tests pin the contract Elon's review required:

- TestPrepareOpenclawConfigStrictReplacesUserMcpServers: user has
  global_one + shared, managed has shared + managed_only → wrapper has
  exactly {shared (managed value), managed_only}; global_one does NOT
  leak; snapshot file has the user's `mcp` stripped while preserving
  gateway / providers / API keys.
- TestPrepareOpenclawConfigStrictEmptyManagedSetDropsUserMcp: empty
  managed set drops user's global_one (both `{}` and
  `{"mcpServers":{}}` cases).
- TestPrepareOpenclawConfigNullMcpConfigKeepsUserInclude: null path
  inherits the live user config, writes no snapshot, makes no extra CLI
  call.
- TestPrepareOpenclawConfigFailsClosedOnResolvedConfigError: errors
  during `config get --json` surface; no stale wrapper or snapshot.
- TestPrepareOpenclawConfigManagedSetFreshInstall: fresh install with
  managed mcp_config skips the snapshot dance entirely.

Also tightens en + zh-Hans MCP Tab copy to mention OpenClaw goes via the
per-task wrapper, and to use OpenClaw's own `transport` field rather
than Claude's `type` for HTTP/SSE entries.

Co-authored-by: multica-agent <github@multica.ai>

* MUL-2778 fix(agent): narrow openclaw snapshot strip to mcp.servers only

Elon's third-round must-fix: the previous strict-replace snapshot deleted
the entire `mcp` block, which wiped out non-server settings under `mcp`
like `sessionIdleTtlMs`. Those are documented OpenClaw config keys
(https://docs.openclaw.ai/gateway/configuration-reference#mcp) outside
the MCP Tab's scope — the agent's saved mcp_config only manages server
definitions, so other `mcp.*` tuning the user set must survive.

Replace the blanket `delete(resolved, "mcp")` with a stripUserMcpServers
helper that:

- deletes only `mcp.servers` when `mcp` is an object
- drops the parent `mcp` key only when the object is empty after the
  strip (so we don't emit `mcp: {}` placeholders)
- leaves non-object `mcp` values untouched (we only know how to strip
  servers from the documented shape)

Pinned with TestPrepareOpenclawConfigStrictPreservesNonServerMcpKeys:
user resolved has both `mcp.sessionIdleTtlMs: 300000` and
`mcp.servers.global_one`; after the strict path runs the snapshot
keeps the TTL and drops the servers map, and the wrapper's
`mcp.servers` is exactly the managed set with no leak.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-28 18:43:02 +08:00
Naiyuan Qing
d90732750f Revert "feat(comments): since-delta new-comment hint + default-on comment ses…" (#3455)
This reverts commit 5e78e5100a.
2026-05-28 17:52:59 +08:00
Bohan Jiang
ee4ec3b76d MUL-2784 fix(daemon): cleanup sidecar tree (.agent_context / .multica / provider skills) after local_directory tasks (#3444)
* fix(daemon): cleanup .agent_context / .multica / provider skill sidecars after local_directory tasks (MUL-2784)

PR #3438 (MUL-2753) only restored CLAUDE.md / AGENTS.md / GEMINI.md to
their pre-task bytes; the sidecar tree writeContextFiles seeds
(.agent_context/, .multica/, .claude/skills/, .github/skills/,
.opencode/skills/, skills/, .pi/skills/, .cursor/skills/,
.kimi/skills/, .kiro/skills/, .agents/skills/, fallback
.agent_context/skills/) was explicitly deferred to this follow-up. In
local_directory mode the agent's workdir is the user's repo, so each
task accumulates one more layer of those directories in the user's
tree.

Plan A: track every file/dir Prepare creates inside workDir in a
sidecarManifest written to envRoot/.multica_sidecar_manifest.json
(daemon scratch — never in the user's workdir). On local_directory
teardown CleanupSidecars walks the manifest, removes the recorded
files, then rmdir-iterates the recorded directories in reverse.
Pre-existing files and directories are deliberately NOT recorded, so
a user-installed .claude/skills/my-own-skill/ sibling — or any
unrelated file the user keeps under .claude/, .github/, etc. — is
preserved bit-for-bit. Non-empty rmdir fails ENOTEMPTY and is
silently skipped, which is the signal that the user owns the
directory.

Daemon wiring lives next to the existing CleanupRuntimeConfig defer
in runTask: runtime brief first, sidecars second. Cloud-mode runs
still write a manifest for symmetry but never trigger the cleanup
(the GC loop wipes envRoot wholesale).

Tests (sidecar_manifest_test.go) cover the round-trip invariant per
the issue's acceptance criteria:

- empty workdir → Prepare → Cleanup → empty workdir, byte-exact, for
  every file-based provider (claude, codex, copilot, opencode,
  openclaw, hermes, pi, cursor, kimi, kiro, antigravity, gemini),
- user's .claude/skills/my-own-skill/ (and equivalents per
  provider) survives Cleanup intact,
- unrelated user files under .claude/, .github/, etc. survive,
- three repeated cycles do not accumulate any orphan state,
- project_resources branch (.multica/project/resources.json) is
  also reversible,
- recordWriteFile refuses to record pre-existing files,
- recordMkdirAll refuses to record pre-existing dirs,
- Cleanup is a no-op when the manifest file is missing.

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): refuse to overwrite pre-existing sidecar paths; pick collision-free skill slugs (MUL-2784 review)

Addresses PR #3444 review (Elon):

**Must-fix #1**: recordWriteFile used to overwrite pre-existing target
files unconditionally and only skip the manifest record. That destroys
user bytes at write time AND leaves the corrupted contents in place at
cleanup time — the byte-exact contract the issue requires is violated
on both halves. Fixed by making recordWriteFile detect any pre-existing
entry (regular file, symlink, directory) via Lstat and return a
sentinel errPathPreExists without touching the path. The user's bytes
are preserved verbatim.

For per-skill collisions (user's .claude/skills/issue-review/ vs
Multica's "Issue Review"), writeSkillFiles now allocates a
collision-free sibling slug via allocateCollisionFreeSkillDir: first
attempt is the natural slug, then `<base>-multica`,
`<base>-multica-2`, …, bounded at 64 attempts. Provider-native
discovery still picks the skill up (every subdir under skillsParent is
a distinct skill) and the user's path stays bit-for-bit intact.

For Multica-only namespace files (.agent_context/issue_context.md,
.multica/project/resources.json), the writer swallows errPathPreExists
and continues — the runtime brief already carries every fact those
files would, so a collision degrades to brief-only mode rather than
destroying user content.

**Must-fix #2**: Added byte-exact collision matrix tests covering
every file-based provider (claude / codex / copilot / opencode /
openclaw / hermes / pi / cursor / kimi / kiro / antigravity / gemini):

- TestPrepareThenCleanupSidecarsSameSlugCollisionPerProvider: seeds
  user's `<provider>/skills/issue-review/SKILL.md` plus a private
  notes.md sibling, runs Prepare → Inject → Cleanup, asserts
  workdir snapshot is byte-identical to seed.
- TestPrepareThenCleanupSidecarsIssueContextCollisionPerProvider:
  seeds user's `.agent_context/issue_context.md`, asserts round-trip
  preserves it.
- TestPrepareThenCleanupSidecarsProjectResourcesCollisionPerProvider:
  same for `.multica/project/resources.json`.
- TestPrepareThenCleanupSidecarsMultiSkillCollisionFreeAllocation:
  end-to-end check that the Multica skill lands at the
  collision-free sibling and Cleanup removes only the Multica side.
- TestAllocateCollisionFreeSkillDir: directed unit test pinning the
  slug-bumping sequence.
- TestRecordWriteFileRefusesToOverwritePreExistingFile (was
  TestRecordWriteFileSkipsPreExistingFile): flipped to assert the
  user's bytes survive and errPathPreExists is returned.
- TestRecordWriteFileRefusesToOverwriteSymlinkOrDir: covers the
  Lstat path for non-file entries.

**Should-fix**: CleanupSidecars used to swallow ANY non-ENOENT rmdir
error as "user content present," silently dropping real I/O failures
(EACCES, EPERM, EBUSY). Now it re-reads the directory after a failed
rmdir via the new dirHasEntries helper — non-empty → silently skip
(ENOTEMPTY, the intended branch); empty → genuine error, captured
into firstErr and surfaced. Plus directed tests:

- TestCleanupSidecarsSurfacesRealRmdirErrors
- TestDirHasEntries

Local verification:
- go test ./internal/daemon/execenv/... — all green
- go test ./internal/daemon/... — all green
- go vet ./... — clean

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): surface original rmdir error when post-rmdir ReadDir also fails (MUL-2784 review)

Addresses remaining PR #3444 review blocker (Elon): dirHasEntries used
to return true when ReadDir failed with anything other than ENOENT,
which made CleanupSidecars treat every locked / faulted directory as
ENOTEMPTY and silently drop the original rmdir error. The v1 fix from
the previous round closed the EACCES-on-empty-dir branch but missed
the case where the chmod also blocks ReadDir — exactly the failure
mode the review called out.

Helper change: dirHasEntries now returns (hasEntries, ok bool):

  - (false, true)  — dir exists and is empty (or missing, race-safe)
  - (true,  true)  — dir has user content (the ENOTEMPTY branch)
  - (_,     false) — ReadDir failed (EACCES, ENOTDIR, EIO, …); the
                     caller cannot tell ENOTEMPTY from a real error
                     and MUST surface the original rmdir error

CleanupSidecars switches on (ok, hasEntries):

  - !ok               → surface the ORIGINAL rmdir error (not the
                        ReadDir failure — that's diagnostic plumbing
                        and would distract from the root cause)
  - ok && hasEntries  → swallow silently (intended ENOTEMPTY branch;
                        preserve user content)
  - ok && !hasEntries → surface the rmdir error (empty dir + EACCES /
                        EPERM / EBUSY → genuine cleanup failure)

Tests:

  - TestDirHasEntries: extended with a regular-file sub-case (ReadDir
    returns ENOTDIR) asserting (false, false). The v1 helper returned
    (true) here, hiding the bug.
  - TestCleanupSidecarsSwallowsMissingAndNonEmptyDirs: renamed from
    TestCleanupSidecarsSurfacesRealRmdirErrors. The old name claimed
    to test the surfacing path but never actually exercised it.
  - TestCleanupSidecarsSurfacesEACCESOnEmptyRecordedDir: chmod parent
    to 0o555 so rmdir(recorded) fails EACCES while ReadDir(recorded)
    still succeeds (empty). Asserts firstErr is non-nil and references
    both the recorded path and the rmdir branch. Skipped when running
    as root (chmod is bypassed for uid 0).
  - TestCleanupSidecarsSurfacesEACCESWhenReadDirFailsToo: the must-fix
    case — chmod parent 0o555 AND chmod recorded 0o000 so BOTH rmdir
    and ReadDir fail. The surfaced error must be the ORIGINAL rmdir
    failure, not the ReadDir one. Skipped on uid 0.

Local verification:
- go test ./internal/daemon/execenv/... — all green
- go test ./internal/daemon/... — all green
- go vet ./... — clean

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-28 17:22:47 +08:00
Bohan Jiang
90a737fc7e fix(daemon): retry terminal task callbacks on transient errors (MUL-2780) (#3443)
CompleteTask / FailTask used to be fire-once. A 1-second upstream 502
burst would drop the call, then the immediate fail-fallback also 502'd,
leaving the task stuck in `running` forever and showing the agent as
"still working" in the UI.

Add a bounded retry around the two terminal callbacks: 4s, 8s, 16s,
32s, 64s backoff schedule (5 retries, ~124s ceiling), retrying only
on transient errors (5xx, 408, 429, transport-level) and bailing
immediately on permanent 4xx. Also fix a latent bug where a transient
complete failure would silently downgrade a successful run to a fail:
the fallback now triggers only on permanent errors. Server-side
CompleteTask / FailTask are already idempotent on "already terminal",
so replays from a retry are safe even if the prior 502'd response was
actually persisted.

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-28 17:06:41 +08:00
Bohan Jiang
03f70209c4 fix(daemon): preserve user CLAUDE.md / AGENTS.md / GEMINI.md in local_directory runs (#3438)
* fix(daemon): preserve user CLAUDE.md / AGENTS.md / GEMINI.md in local_directory runs (MUL-2753)

InjectRuntimeConfig previously called os.WriteFile unconditionally, which
truncated whatever file lived at the same path. For the local_directory
project_resource flow the workdir is the user's own repo, so the agent
silently destroyed any repo-level CLAUDE.md / AGENTS.md / GEMINI.md the
first time it ran in that directory, and the daemon's local-directory
cleanup explicitly skips the user's path so the file was never restored.

Write the brief inside a marker block instead:

  <!-- BEGIN MULTICA-RUNTIME (auto-managed; do not edit) -->
  ...brief...
  <!-- END MULTICA-RUNTIME -->

writeRuntimeConfigFile handles three states:

- file missing  -> create with just the marker block,
- file present, no marker block -> append the marker block at the end
  (preserves user-authored content above), and
- file present, marker block already there -> replace the block body in
  place so repeated runs don't grow the file unboundedly.

This is the short-term fix called out on MUL-2753. The sidecar question
(.agent_context/, .claude/skills/, .multica/project/resources.json) is
left for a follow-up — those files don't overwrite user content, just
litter the workdir.

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): cleanup runtime config marker block after local_directory tasks (MUL-2753)

Address Elon's review on PR #3438:

1. Add `CleanupRuntimeConfig` and wire it into the daemon's task path so
   `local_directory` runs excise the marker block on the way out. Without
   it, a user's subsequent manual `claude` / `codex` / `gemini` run in
   the same directory picks up the previous task's stale brief (issue
   id, trigger comment id, reply rules) and acts on the wrong context.
   Cloud workspace runs skip the cleanup — their scratch workdir is
   wiped by the GC loop anyway.

2. If excising the block would leave the file empty / whitespace-only,
   the file is removed so we don't leave behind a stub the user has to
   delete by hand. Surviving user content is preserved byte-for-byte.

3. Harden the marker parser: search for the end marker strictly after
   the begin marker. The previous `strings.Index` pair mishandled two
   malformed cases —
     - a stray end marker before any begin (e.g. user pasted a
       documentation snippet showing the wire format) would cause
       every run to stack another block, growing the file unboundedly;
     - a half-block left by a previous crashed run would cause every
       subsequent run to append a fresh block beneath the half-block.
   The `locateMarkerBlock` helper now anchors the end search past the
   begin offset, and treats "begin found, no end after" as "block runs
   to EOF" so the next write replaces it cleanly.

Centralised the provider→filename mapping in `runtimeConfigPath` so
Inject and Cleanup can't drift past each other when a new provider is
added.

Tests cover: parser hardening (stray-end-before-begin idempotency,
half-block recovery), Cleanup happy path / file removal / no-op cases /
malformed half-block / per-provider mapping, and an end-to-end
inject→cleanup round trip that locks in byte-identical restoration of
the user's pre-injection file.

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): byte-exact inject/cleanup round trip for runtime config (MUL-2753)

Address Elon's second-round review on PR #3438. The previous cleanup
relied on `TrimRight + "\n"` for trailing newlines and `TrimSpace == ""`
for file removal — both compensated for the inject path's "normalise
trailing newlines so there's always exactly `\n\n` before the block"
step, but they did so by mutating the user's bytes. The result was a
real diff on three boundary cases:

  - file ended without a newline (`rules`) → cleanup added one;
  - file ended with two or more newlines (`rules\n\n`) → cleanup
    collapsed to a single newline;
  - file pre-existed but was empty / whitespace-only → cleanup
    deleted it.

Reshape the contract so the bytes inject adds are the exact bytes
cleanup removes, with no user-byte mutation in between:

  - Define `runtimeManagedSeparator = "\n\n"` as a fixed managed
    separator that inject always inserts (unconditionally — including
    for files that already end in two or more newlines) between
    pre-existing user content and the marker block.
  - Inject's missing-file branch still writes the block alone (no
    separator); that absence is the marker Cleanup uses to identify
    "we created this file from scratch" and is the only condition
    under which Cleanup is allowed to `os.Remove` the file.
  - Cleanup detects `HasSuffix(pre, runtimeManagedSeparator)` and
    strips exactly those bytes; whatever remains is written back
    verbatim with no `TrimRight` / `TrimSpace`, so the pre-injection
    bytes survive exactly.

The replace-in-place branch is untouched — the managed separator
established by the first inject lives in pre and survives across
subsequent runs, so byte-exactness is preserved through arbitrary
inject→inject→cleanup chains.

Tests:

  - `TestInjectThenCleanupRoundTripByteExactBoundaries` parameterises
    9 seed shapes (missing file, empty, whitespace-only, no trailing
    newline, one trailing newline, two trailing newlines, many
    trailing newlines, CRLF line endings, no final newline with
    embedded blank lines) and asserts byte-identical round trip
    across two full cycles.
  - `TestInjectReplaceThenCleanupRestoresByteExact` covers the
    replace-in-place branch for the same boundary seeds.
  - `TestWriteRuntimeConfigFileAlwaysInsertsFixedManagedSeparator`
    pins the new invariant at the source: regardless of seed shape,
    inject emits `<seed><\n\n><marker block>` with no normalisation.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-28 16:15:07 +08:00
Naiyuan Qing
5e78e5100a feat(comments): since-delta new-comment hint + default-on comment session resume (#3432)
* feat(db): add unresolved comment count + list filter queries

Add CountUnresolvedComments (excludes the agent's own comments) and
ListUnresolvedCommentsForIssue. Both are additive — existing callers stay
on the unfiltered queries — so old clients are unaffected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(handler): support unresolved-only comment listing

Wire an additive `unresolved` query param into ListComments. Defaults off
so an old CLI that never sends it gets unchanged behavior; only true/1
enable it. Rejects combining unresolved with thread/recent (whole-issue
filter vs navigation models). Includes filter + count query tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(handler): plumb unresolved count + thread root into claim, gate comment resume

Populate trigger_parent_id (thread root of the trigger comment) and
unresolved_count (excludes the agent's own comments) on comment-triggered
claim responses. Both fields are omitempty so old daemons ignore them.

Gate comment-triggered session resume behind MULTICA_RESUME_COMMENT_SESSION
(default off): resumed comment turns can inherit the prior turn's "Done."
final message, so this stays an explicit rollout switch. The runtime-match
and poisoned-session guards still apply regardless of the flag.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(daemon): inject unresolved-comments hint + resolve step into agent brief

Add a shared BuildUnresolvedCommentsHint helper rendered on both the
per-turn prompt and the CLAUDE.md workflow (kept in sync per PR #2816). It
ships only the count and the relevant CLI call — never comment bodies — so
the server stays cheap. Thread case points at --thread <root>; issue case
points at --unresolved. Suppressed when the count is 0.

Also add a workflow step telling the agent to `multica comment resolve
<thread-root>` once a thread is fully handled, so the unresolved set
converges.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cli): add comment list --unresolved and comment resolve command

Add an --unresolved filter to `issue comment list` (wired to the server's
unresolved param, rejected when combined with --thread/--recent) and a
top-level `comment resolve <id>` command that POSTs to the existing
/api/comments/{id}/resolve endpoint, letting an agent close threads it has
fully handled.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(comments): since-delta new-comment hint + default-on comment resume

Simplifies the comment-triggered agent flow down to what's actually needed:

- New-comment awareness is now a pure time delta: the claim response carries
  new_comment_count + new_comments_since (anchored on the prior run's
  started_at, never completed_at so a long run can't miss comments). The
  per-turn prompt and CLAUDE.md workflow render one line — "N new comment(s)
  since your last run, --since <ts>" — via a shared BuildNewCommentsHint so the
  two surfaces can't drift. Cold start (no prior run) falls back to a plain read.
- Comment-triggered tasks resume the prior session by default (same runtime),
  dropping the MULTICA_RESUME_COMMENT_SESSION rollout gate. The "Focus on THIS
  comment" prompt guard defends against inheriting the prior turn's "Done."
  marker; GetLastTaskSession still excludes poisoned sessions.
- Drops the resolved-based machinery from the first draft: CountUnresolvedComments
  / ListUnresolvedCommentsForIssue queries, the `comment list --unresolved`
  flag, the `multica comment resolve` command, and the resolve workflow step.
- Removes the verbose cursor-pagination paragraph from the comment prompt; the
  --thread/--recent/--since flags stay in the CLI/API, just no longer explained
  inline every turn.

Compatibility: new claim fields are omitempty (old daemons ignore them).
Comment resume is default-on and affects even old daemons, which already
consume prior_session_id.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 15:58:42 +08:00
Bohan Jiang
bae8a84abd MUL-2767 feat(agent): add Antigravity runtime backend (#3427)
* feat(agent): add Antigravity runtime backend

Adds Google's Antigravity CLI (`agy`) as the 12th supported coding-tool
runtime, alongside Claude / Codex / Cursor / Copilot / Gemini / Hermes /
Kimi / Kiro / OpenCode / OpenClaw / Pi.

The CLI emits plain assistant text on stdout (no structured event
stream), so the backend streams stdout line-by-line as `MessageText`
events and accumulates the same text as the final `Result.Output`.
Session resumption uses `--conversation <id>`; because the conversation
UUID is not echoed on stdout, the daemon routes `--log-file` to a temp
file and recovers the id from the glog-formatted log lines.

MUL-2767

Co-authored-by: multica-agent <github@multica.ai>

* fix(agent): correct Antigravity capability contract from Elon review

- ModelSelectionSupported now returns false for antigravity. `agy` has no
  --model flag and antigravityBackend deliberately drops opts.Model, so
  the UI must render a disabled "Managed by runtime" picker instead of
  an empty dropdown plus a silently-ignored manual-entry field. Also
  stop seeding AgentEntry.Model from MULTICA_ANTIGRAVITY_MODEL — the
  backend would silently ignore it.

- Antigravity skills now write to {workDir}/.agents/skills/, the CLI's
  native workspace path (inherits Gemini CLI's layout per
  https://antigravity.google/docs/gcli-migration). Previously they went
  to the .agent_context/skills/ fallback that the CLI doesn't scan.
  Runtime brief moves antigravity into the native-discovery branch and
  local_skills.go points the user-level skill root at
  ~/.gemini/antigravity-cli/skills for Runtime → local skill import.

- Doc + UI comment sync: providers matrix / install-agent-runtime /
  cloud-quickstart / agents-create / tasks (session-resume support) /
  skills / README all now list Antigravity in the right buckets, and
  the model-picker / model-dropdown comments cite antigravity (not the
  stale hermes reference) as the supported=false example.

New tests: TestAntigravityModelSelectionUnsupported,
TestInjectRuntimeConfigAntigravity (native discovery wording),
TestWriteContextFilesAntigravityNativeSkills (.agents/skills/ landing,
.agent_context/skills/ NOT written).

Co-authored-by: multica-agent <github@multica.ai>

* feat(provider-logo): swap inline placeholder for real Antigravity PNG

Replaces the hand-drawn planet+arc placeholder with the official asset
shipped from Downloads. Stored next to the component; bundlers
(Next.js / electron-vite) resolve the PNG import to a URL string at
build time. Added a small assets.d.ts so packages/views' tsc accepts
PNG / SVG module imports — there was no prior asset usage in this
package to register the declaration.

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-28 15:40:05 +08:00
DimaS
ccbd62c7ad fix(daemon): ignore gc meta with empty parent ids (#3407)
Co-authored-by: “646826” <“646826@gmail.com”>
2026-05-28 12:33:34 +08:00
Bohan Jiang
4864831721 MUL-2744: feat(auth): auto-renew daemon PAT in-place within 7-day window (#3360)
* MUL-2744: feat(auth): auto-renew daemon PAT in-place within 7-day window

Daemons currently hold a 90-day PAT and have no renewal path: once the
token's expires_at passes, every request 401s and the user has to find
the silent failure in the daemon log and re-run `multica login`.

This adds an in-place renewal:

- New `POST /api/tokens/current/renew` (Auth-protected, mul_ only). The
  server checks remaining lifetime: ≥ 7 days is a no-op; < 7 days bumps
  expires_at to now + 90 days via a guarded UPDATE that makes concurrent
  renews idempotent (the WHERE expires_at < $2 clause means only one
  writer wins; the loser sees pgx.ErrNoRows and reports the already-
  extended value). No raw token rotation — the same secret stays in
  every CLI/daemon process sharing the config.

- Daemon-side `tokenRenewalLoop`: fires once on startup (covers
  machine-was-off cases) and then every 3 days. With a 7-day server
  threshold this gives at least two renewal attempts before the window
  closes, so a single network blip can't push the token out.

- 401 fallback: when the renew call comes back 401 (token already
  revoked/expired), the daemon logs a user-actionable WARN telling the
  operator to run `multica login` — instead of the current silent
  failure mode. Loop keeps running so the warning repeats until fixed.

PAT cache (auth.AuthCacheTTL = 10m) doesn't need invalidation: the next
miss after the UPDATE re-reads the row and re-caches with the bumped
TTL automatically.

Co-authored-by: multica-agent <github@multica.ai>

* MUL-2744: fix(auth): renew PAT before first sync; CAS against renewal threshold

Addresses the two issues Elon raised on #3360.

Must-fix: if the PAT is already revoked/expired when the daemon starts,
syncWorkspacesFromAPI 401s and Run returns before the background
tokenRenewalLoop ever fires its initial renewal. The operator only sees
a generic auth failure in the workspace-sync log with no hint that
'multica login' is the fix. Now the startup path runs an inline
tryRenewToken first, surfacing the existing 401 WARN before anything
else gets a chance to fail. Pulled the renew + first-sync pair into
preflightAuth so the ordering invariant is enforced at one site and
tests can exercise the failure modes without spinning up the full Run
setup. Removed the redundant initial tryRenewToken from
tokenRenewalLoop — startup now owns the first call.

Nit: the previous WHERE clause on ExtendPersonalAccessTokenExpiry
(expires_at < $2) did not actually make concurrent renews idempotent
the way the comment claimed. Two callers race-computing
$2 = now + 90d produce strictly-different values, and the second
writer's $2 always exceeds the row the first writer just wrote, so the
UPDATE re-matches and bumps again. Switched to a CAS against the
renewal threshold (expires_at <= $renew_threshold_at, i.e. now + 7d):
once writer A pushes expires_at past the threshold, writer B's UPDATE
matches zero rows and the loser falls back to reporting the
already-extended value as a no-op.

Tests:
- TestPreflightAuth_RenewsBeforeWorkspaceSyncOnExpiredToken locks in
  the call ordering — renew endpoint is hit before workspaces, and the
  re-login WARN appears even though both endpoints 401.
- TestPreflightAuth_SyncProceedsWhenRenewIsNoOp covers steady-state
  startup: a renew=false no-op must still progress to workspace sync.
- TestPreflightAuth_TransientRenewFailureDoesNotBlockStartup covers a
  500 from the renew endpoint — startup must continue, no WARN.
- TestRenewPAT_ParallelRenewExtendsExactlyOnce fires N=8 concurrent
  renews at one row and asserts exactly one returns renewed=true with
  the others reporting the same already-extended expires_at, plus the
  DB carries only that single bumped value.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-27 22:22:26 +08:00
Bohan Jiang
17714c3ad1 fix(create-issue): preserve parent_issue_id through Create with agent flow (MUL-2534) (#3083)
* fix(create-issue): preserve parent_issue_id through Create with agent flow (MUL-2534)

When the create-issue modal was opened from the "Add sub issue" entry on
an existing issue and the user switched to "Create with agent", the
parent_issue_id was silently dropped: switchToAgent only forwarded
prompt + actor + project_id, the AgentCreatePanel had no notion of
parent context, and the daemon prompt never instructed the agent to
pass --parent <uuid>. The sub-issue intent was lost and the new issue
landed as a standalone.

This fix threads parent_issue_id through the whole pipeline silently —
no new editable form field, the existing carry channel handles it:

- Frontend: ManualCreatePanel.switchToAgent + AgentCreatePanel.switchToManual
  now carry parent_issue_id (and identifier, for display) so the sub-issue
  intent survives mode flips in either direction. AgentCreatePanel reads
  parent from `data`, forwards to api.quickCreateIssue, and renders a
  read-only "Sub-issue of MUL-XX" chip so the user can see the relationship.
- API: quickCreateIssue accepts optional parent_issue_id.
- Backend: QuickCreateIssueRequest validates parent_issue_id belongs to the
  same workspace (same path as CreateIssue), persists it in
  QuickCreateContext, and the daemon claim handler resolves the parent's
  identifier for prompt context.
- Daemon prompt: when ParentIssueID is set, buildQuickCreatePrompt instructs
  the agent to pass `--parent <uuid>` and treat the modal entry point as
  authoritative.

Tests cover all three hops: switchToAgent carry payload, AgentCreatePanel →
api.quickCreateIssue, and the daemon prompt's --parent injection (with both
identifier-present and UUID-only fallback branches).

Co-authored-by: multica-agent <github@multica.ai>

* test(create-issue): cover quick-create parent trust boundary + identifier fallback (MUL-2534)

Address review on PR #3083:

- Add server-side test for POST /api/issues/quick-create parent_issue_id:
  same-workspace parent threads through QuickCreateContext.ParentIssueID,
  foreign-workspace and bogus UUIDs return 400 and never enqueue a task.
- Fall back to `data.parent_issue_identifier` in ManualCreatePanel's
  switchToAgent when the parent detail query hasn't hydrated yet, so the
  agent chip never renders "Sub-issue of " with an empty tail.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-27 14:18:48 +08:00
Bohan Jiang
341ce7bfa5 feat: support local working directory for projects (MUL-2618 v1) (#3283)
* feat(project): add local_directory project_resource type (MUL-2662)

Adds a second project_resource type alongside github_repo so a project
can be pinned to an existing directory on a specific daemon (the v1 of
the local-working-directory flow tracked in MUL-2618). The ref schema is
{ local_path, daemon_id, label? }; local_path must be absolute and
daemon_id is required. The same (daemon_id, local_path) pair is allowed
on multiple projects by design — no UNIQUE constraint is added.

Implementation reuses the existing project_resource API surface: the new
type is wired through the validator switch with no migration, no new
events, and no daemon-handler changes (daemon already passes through
arbitrary resource types via ProjectResources). The CLI gains
--local-path / --daemon-id / --ref-label shortcuts so
`multica project resource add --type local_directory` mirrors the
existing `--type github_repo --url ...` ergonomics; the generic --ref
flag still works for both types.

Tests cover the full CRUD lifecycle, the same-path-across-projects
allowance, the same-path-same-project conflict, the validator rejections
(missing/blank/relative path, missing daemon_id, wrong payload type),
and the cross-platform isAbsoluteLocalPath helper.

Co-authored-by: multica-agent <github@multica.ai>

* feat(project): add update endpoint + label-shadow guard for project_resource (MUL-2662)

Addresses the Elon review on PR #3263:

- Add PUT /api/projects/{id}/resources/{resourceId} with sqlc query,
  matching handler, CLI `project resource update`, and a new
  EventProjectResourceUpdated WS event. resource_type stays immutable;
  ref/label/position are all individually optional.
- Catch same-project (daemon_id, local_path) collisions where only the
  embedded label differs — the row-level UNIQUE only matches the full
  ref JSON, so a label typo would otherwise let the same working
  directory bind twice.
- Tests cover the update lifecycle (label-only / ref / clear / 404 /
  invalid path) and the label-shadow conflict on both create and
  update; the in-place rename still succeeds because the conflict
  scan ignores the row being edited.

Incidental: regenerating sqlc picked up a missing skills_local scan in
UpdateAgentCustomEnv that drifted in from #3200.

Co-authored-by: multica-agent <github@multica.ai>

* fix(project): close bundled-create label-shadow gap + merge resource_ref on CLI update (MUL-2662)

Two follow-ups from MUL-2662 review round 2:

- CreateProject inline resources path now dedupes local_directory entries on
  (daemon_id, local_path) before opening the transaction. The DB-level
  UNIQUE(project_id, resource_type, resource_ref) constraint only fires on a
  full JSON match, so two rows with the same target but different `label`
  would otherwise slip past. Standalone POST/PUT already cover this via
  findLocalDirectoryConflict; bundled create was the missing surface.
- `multica project resource update` now seeds resource_ref from the existing
  row before applying per-type shortcut flags, so `--default-branch-hint x`
  on its own no longer constructs a payload missing `url` (which the server
  400s on). Local_directory partial edits get the same merge behavior.

Co-authored-by: multica-agent <github@multica.ai>

* feat(desktop): local_directory project_resource UI (MUL-2665) (#3273)

* feat(desktop): local_directory project_resource UI (MUL-2665)

First UI surface for the local-working-directory flow tracked in MUL-2618.
Lets users on the desktop pin a project to an existing folder on this
machine; web stays read-only since the per-daemon check can't be done in
the browser.

What's new for the renderer:

- ProjectResourcesSection grows a desktop-only "Add local directory"
  button next to the existing GitHub-repo popover. Clicking it opens
  Electron's native folder picker, validates the path through a new
  IPC pair (existence + r/w), and submits a project_resource of
  resource_type=local_directory with daemon_id pulled live from
  daemonAPI.getStatus.
- LocalDirectoryRow renders the rename pencil + path tooltip, and
  greys out when ref.daemon_id != this machine's daemon_id (with a
  "only available on the machine that registered this directory"
  tooltip). Delete stays enabled so users can drop stale registrations
  from any device.
- LocalDirectoryHint sits above the issue-detail comment composer and
  shows "Agent will work in-place at {label} ({path})" when the issue's
  project has a local_directory matching this daemon. Hidden on web.
- TaskStatusPill picks up a new "waiting_for_directory_release" stage
  that the daemon will publish when it dequeues a task but can't
  acquire the path lock. The render is in place now so the daemon
  sibling subtask can wire the status string without an additional UI
  PR.

Plumbing:

- @multica/core/types gains LocalDirectoryResourceRef +
  UpdateProjectResourceRequest, and the api client gets the matching
  PUT method backed by the server endpoint that landed in
  2ac3faebb (MUL-2662). A useUpdateProjectResource hook drives the
  in-place label edit.
- New Electron handlers under apps/desktop/src/main/local-directory.ts:
    local-directory:pick     -> dialog.showOpenDialog (openDirectory)
    local-directory:validate -> stat + access(R_OK + W_OK)
  exposed through the preload as desktopAPI.pickDirectory /
  validateLocalDirectory. View code talks to them via a thin
  packages/views/platform helper that returns reason=unsupported on
  web instead of crashing.
- useLocalDaemonStatus exposes the local daemon's id, device name, and
  running flag from daemonAPI.onStatusChange so the renderer can do the
  cross-device match without coupling to the desktop preload typings.

Tests:

- pickStageKeys gets a unit test covering the new stage and proving
  the directory-release status outranks availability hints.
- LocalDirectoryHint tests cover the four render branches (no project,
  no daemon, foreign daemon, matching daemon).
- i18n parity stays green; new keys added under projects.resources.*
  and chat.status_pill.stages.waiting_for_directory_release in both
  locales.

Out of scope (will land separately):
- The daemon-side waiting/lock signal that flips the pill into the
  new state.
- Adding local_directory to the create-project modal's bulk
  attach flow.
- Docs page refresh for project-resources.mdx — left for the
  MUL-2618 umbrella sweep.

Co-authored-by: multica-agent <github@multica.ai>

* fix(desktop): hide rename for foreign daemon local_directory rows (MUL-2618)

Address review nit on #3273: the rename pencil was gated only by
`canEdit`, so a foreign / unknown-daemon row still showed it even
though the spec says cross-device rows are disabled. Gate rename on
`!mismatch` so it disappears on those rows; delete stays available
so a stale registration can still be dropped from any device.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>

* feat(daemon): local_directory execution + path mutex + GC exception (MUL-2663) (#3274)

* feat(daemon): local_directory execution + path mutex + GC exception (MUL-2663)

Wires up the daemon side of the local_directory project_resource introduced
in MUL-2662. When a task is dispatched against a project whose resources
include a local_directory pinned to this daemon's UUID, the daemon now:

  - Validates the path (absolute, exists, daemon process can read+write,
    not in the system-root / $HOME blacklist) and fails the task fast on
    any precondition violation, with a user-readable reason.
  - Serialises concurrent tasks on the same on-disk path via a
    daemon-local LocalPathLocker keyed by symlink-resolved realpath. The
    lock is held for the entire task lifetime (claim → context write →
    agent → result report).
  - When the lock is contended, the daemon flips the row to a new
    waiting_local_directory status on the server (carrying a wait_reason
    like "<path> (held by task <short id>)") so the UI can render
    "等待本地目录释放" instead of leaving the row silently in dispatched
    past the sweeper timeout. The status accepts being woken into running
    once the lock is acquired.
  - Sets execenv.WorkDir to the user's path (no copy, no mount). envRoot
    still lives under workspacesRoot/<wsID>/ and hosts output/, logs/, and
    .gc_meta.json — the daemon's logbook for the run.
  - Stamps GCMeta.LocalDirectory=true so the GC loop never RemoveAlls
    envRoot for these tasks (gcActionClean → gcActionCleanArtifacts,
    gcActionOrphan → gcActionSkip). The user's directory was never under
    envRoot to begin with, so this is defense in depth.
  - Skips execenv.Reuse for local_directory tasks because the prior
    WorkDir is the user's path and reusing it through that code path
    loses the envRoot association the GC loop needs. Prepare is cheap
    here (no clone, no copy), so always running it is fine.

Server-side protocol changes:

  - New CHECK value 'waiting_local_directory' on agent_task_queue.status
    plus a wait_reason TEXT column (migration 109).
  - All cancel / active / counted-as-running / orphan-recovery queries
    expanded to include the new status; FailStaleTasks intentionally
    excludes it (the daemon owns the wait).
  - New SQL MarkAgentTaskWaitingLocalDirectory(id, reason) and a relaxed
    StartAgentTask that accepts both dispatched and
    waiting_local_directory as preconditions (and clears wait_reason on
    the way through).
  - New POST /api/daemon/tasks/{taskId}/wait-local-directory endpoint,
    TaskService.MarkTaskWaitingLocalDirectory broadcaster, and matching
    daemon Client.MarkTaskWaitingLocalDirectory.

Tests cover: path blacklist + R/W enforcement, mutex serialisation +
ctx-cancelled wait, lock handover between two tasks, GC never returns
gcActionClean / gcActionOrphan for local_directory rows (with negative
control for the standard path), and Prepare/Cleanup correctly substitute
+ protect the user's WorkDir.

The desktop UI side (UI for adding a local_directory resource, surfacing
the "等待本地目录" badge) is MUL-2665; the agent-task lifecycle changes
(no branch switch, dirty-tree tolerant, auto-commit) are MUL-2664.

This PR targets the shared MUL-2618 v1 feature branch agent/j/912b8cb1,
not main; the whole v1 will be merged to main together when complete.

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): tighten local_directory status, symlink, cancel handling (MUL-2618)

Address the 3 must-fix items from Elon's review of PR #3274.

1. Status string unified. The server / daemon publish
   `waiting_local_directory`; align views, locales, and the
   pickStageKeys test (PR #3273 had used `waiting_for_directory_release`
   on a placeholder string). Without this, the daemon's wait state
   never reached the pill once the two siblings merged.

2. validateLocalPath now also runs the blacklist against the
   symlink-resolved realpath, with macOS's `/etc` -> `/private/etc`
   redirect handled via `isBlacklistedRealPath` which compares
   canonical forms. Without this, a symlink such as
   `/Users/me/proj/home -> /Users/me` slipped the literal $HOME check
   while every daemon write still landed in the user's home. Tests
   cover symlink-to-home, symlink-to-system-root, and the negative
   case (symlink to a regular subdirectory).

3. acquireLocalDirectoryLockIfNeeded now spins up a cancellation
   watcher inside `onWait` (lazy — the fast path stays free) so the
   gap between dispatch and StartTask responds to server-side cancel
   or row deletion. If the watcher fires while the daemon is parked
   on the path mutex, the lock-wait context is cancelled, Acquire
   returns promptly, and the helper exits silently the same way the
   run-phase poller does. New TestAcquireLocalDirectoryLock_CancelDuringWait
   exercises the path end-to-end with a fake server.

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): unconditional canonical blacklist + Windows drive-root generalisation (MUL-2618)

- validateLocalPath now always runs isBlacklistedRealPath on the
  symlink-resolved path, not only when it differs from absPath. The old
  guard let users type the canonical form of an OS-symlinked banned root
  (e.g. /private/tmp, /private/etc, /private/var on macOS) straight
  through, since EvalSymlinks is a no-op on already-canonical input.
- Windows drive-root rejection moved off the static C/D/E/F enumeration
  onto filepath.VolumeName via a new isDriveRoot helper, so removable /
  network drives mounted at G:..Z: and UNC \\server\share roots are also
  blocked. systemRootBlacklist keeps the well-known C:\ trees only.
- Tests: macOS-only case exercises direct /private/{tmp,etc,var}; a
  new TestIsDriveRoot covers the Windows generalisation (skipped on
  POSIX runners by runtime guard).

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>

* feat(views): wire waiting_local_directory end-to-end in issue UI + presence (MUL-2618)

Connect the daemon-emitted `task:waiting_local_directory` and `task:running`
events through to issue execution log, sticky agent banner, activity indicator,
and agent presence so a parked task is no longer invisible on the issue page.

- Add `waiting_local_directory` to `AgentTask.status` and the typed
  `task:running` / `task:waiting_local_directory` WS event payloads.
- Chat realtime sync writes both new statuses into the pending-task cache so
  the chat StatusPill flips out of a stale `dispatched` frame.
- ExecutionLogSection: count `waiting_local_directory` as active, add tone +
  status label, treat parked tasks the same as dispatched for time anchor /
  transcript visibility / terminate-confirm note.
- AgentLiveCard: subscribe to both new events, rank the parked state between
  dispatched and queued, and surface a "is waiting for the local directory"
  banner with the muted "Clock" treatment used for queued.
- IssueAgentActivityIndicator: route parked tasks into the queued bucket so
  the hover stack and chip stay visible.
- derive-presence: parked tasks count toward `queuedCount` so the agent
  workload chip stays out of `idle` while the daemon waits on the path lock.
- Locales: add `agent_live.is_waiting_local_directory` and
  `execution_log.status_waiting_local_directory` (en + zh-Hans).

Co-authored-by: multica-agent <github@multica.ai>

* feat(project): enforce one local_directory per (project, daemon) (MUL-2618)

The daemon-side resolver picks the first matching local_directory by
daemon_id, so allowing two rows on the same daemon — even at different
paths — let the agent silently write into whichever sorted first. Tighten
the invariant top to bottom:

- server: `findLocalDirectoryConflict` rejects any second row sharing a
  daemon_id, regardless of `local_path` or label. Bundled-create surface in
  `CreateProject` runs the same daemon-scoped dedupe up front.
- daemon: `findLocalDirectoryAssignment` fails fast when it finds more than
  one row pinned to the current daemon (older API client / direct DB
  writes can still produce that state — refuse to guess).
- desktop UI: hide the "Add local directory" action once the current
  daemon owns a row on this project, with a hint and a defensive toast on
  the call path; foreign-daemon rows stay visible read-only as before.
- Tests:
  * daemon: new `two local_directory rows on this daemon fail fast` /
    `local_directory rows on different daemons coexist` cases.
  * handler: rewrite the legacy `LabelShadow` cases as
    `DaemonScopedConflict` / `BundledLocalDirectoryDaemonConflict` —
    asserts 409 on same-daemon different-path, 201 on per-daemon bundles.
- Locales: en + zh-Hans copy for the new hint + toast.

Co-authored-by: multica-agent <github@multica.ai>

* chore(sqlc): drop stale skills_local in UpdateAgentCustomEnv (MUL-2618)

Follow-up to the main-merge in 0f8e8ca7: the auto-merge preserved most
of main's skills_local revert but kept the column reference inside the
UpdateAgentCustomEnv scanner because that block hadn't been touched by
either side. Re-running `sqlc generate` regenerates the file without
skills_local in this query, matching the rest of the file and the
post-revert schema.

Co-authored-by: multica-agent <github@multica.ai>

* feat(create-project): binary source picker — repos OR local directory

Turn the create-project dialog's "Repos" pill into a binary Source
picker. A project's source is mutually exclusive: either a set of
GitHub repos (worktree mode, default) or a single local working
directory (local mode, desktop-only). Mirrors the constraint the
backend will enforce next.

Behavior:
- Pill shows the active mode's selection (GitHub icon + repo count, or
  folder icon + local label/path).
- Popover has a 2-tab segmented control at the top; the Local tab is
  hidden entirely on web (local_directory needs a daemon_id).
- Local tab requires the daemon online — amber notice + disabled picker
  when offline, re-renders automatically via useLocalDaemonStatus.
- Switching tabs preserves the other side's stash, but handleSubmit
  only emits the resource matching the active sourceMode, so abandoned
  picks never leak into the created project.

Backend mutual-exclusion validation + the resources-section
conditional-add-button still to come — this PR just unblocks the
dialog so it can be demoed.

* fix(mobile): cover waiting_local_directory in run row status maps (MUL-2618)

---------

Co-authored-by: multica-agent <github@multica.ai>
Co-authored-by: Multica J <j@multica.ai>
2026-05-27 13:44:31 +08:00
DENG
7bc1aa7563 fix(daemon): detect Codex Desktop bundle CLI (#3332)
Co-authored-by: codex <codex@multica.local>
2026-05-27 13:39:54 +08:00
Multica Eve
311cf4d998 fix(agent): surface Codex app-server no-progress diagnostics (MUL-2688)
Refs #3262.
2026-05-26 18:42:47 +08:00
Multica Eve
744b474199 revert(agent): remove per-agent local skill toggle (MUL-2603) (#3286)
* Revert "feat(agents): hide skills_local toggle for runtimes that don't honour it (MUL-2603) (#3276)"

This reverts commit 0b50c5a209.

Co-authored-by: multica-agent <github@multica.ai>

* Revert "fix(agent): surface host OAuth token via env var on macOS isolation (MUL-2603) (#3267)"

This reverts commit a67bf81225.

Co-authored-by: multica-agent <github@multica.ai>

* Revert "fix(agents): tighten skills-tab intro and drop redundant import hint (#3265)"

This reverts commit d8075a5775.

Co-authored-by: multica-agent <github@multica.ai>

* Revert "fix(agent): mirror $HOME/.claude.json into isolated config dir (MUL-2661) (#3261)"

This reverts commit 40da88fc16.

Co-authored-by: multica-agent <github@multica.ai>

* Revert "feat(agent): per-agent toggle to isolate host-machine skills (MUL-2603) (#3200)"

This reverts commit 960befa56f.

Co-authored-by: multica-agent <github@multica.ai>

* Add migration cleanup for reverted agent skills toggle

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-26 17:00:01 +08:00
Bohan Jiang
6bc3d14eb3 fix(daemon/execenv): refresh stale Codex config copies across env reuse (MUL-2646) (#3268)
* fix(daemon/execenv): refresh stale Codex config copies across env reuse (MUL-2646)

`copyFileIfExists` previously short-circuited whenever the per-task
`codex-home/{config.toml,config.json,instructions.md}` already existed,
so once the files were seeded at first Prepare they were never refreshed
again — even though `Reuse()` calls `prepareCodexHomeWithOpts` on every
resume. A user who rotated their Codex `~/.codex/config.toml` between
runs (e.g. switching the active `[model_providers.X]` `base_url`, or
pointing `env_key` at a freshly rotated API key) kept reading the stale
per-task copy on session resume. Codex then issued requests to the new
URL using the old key and the API rejected the token.

Treat any existing `dst` as something to drop and re-copy from the
current shared source, mirroring the symlink path that already refreshes
`auth.json` (#2126). The daemon-managed sandbox / multi-agent / memory
blocks are applied via marker-bracketed idempotent passes after the
copy, so a re-copy + re-ensure cycle preserves them.

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon/execenv): drop per-task Codex copy when shared source removed (MUL-2646)

Extend the MUL-2646 fix to the deletion arm of "sync the shared source":
`syncCopiedFile` (renamed from `copyFileIfExists`) now also removes the
per-task `dst` when the shared `src` is absent. The prior version
short-circuited on missing src and left `config.toml` / `config.json` /
`instructions.md` from the previous Prepare lingering in the per-task
home — so a user who removed a provider by deleting `~/.codex/config.toml`,
or pulled `config.json` / `instructions.md` out of the shared home, would
keep replaying the stale copy on session resume.

For `config.toml` the subsequent `ensureCodex{Sandbox,MultiAgent,Memory}Config`
passes recreate the file with only the daemon-managed default blocks, so
removing the shared file cleanly drops every user-managed
`[model_providers.X]` / `model_provider` line. For `config.json` and
`instructions.md` there is no daemon default, so they disappear in
lockstep with the shared source.

Adds `TestPrepareCodexHome_DropsCopiedConfigWhenSharedSourceRemoved`
covering the new path, and extends the refresh-arm test to assert the
multi-agent / memory marker blocks are still present after the copy is
refreshed.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-26 14:52:53 +08:00
Bohan Jiang
960befa56f feat(agent): per-agent toggle to isolate host-machine skills (MUL-2603) (#3200)
* feat(agent): per-agent toggle to isolate host-machine skills (MUL-2603)

Adds an agent-scoped `skills_local` switch ("ignore" default / "merge") so
shared agents stop inheriting the operator's user-global Claude skill
directory. A single broken local skill on one operator's machine was
crashing the Claude CLI before it ever read stdin — the daemon saw a
"broken pipe" with no recoverable signal (GitHub #3052).

- DB: migration 108 adds `agent.skills_local` (NOT NULL DEFAULT 'ignore'),
  with sqlc CreateAgent/UpdateAgent updates and handler validation.
- Claude runtime: when the agent is in "ignore" mode the backend points
  CLAUDE_CONFIG_DIR at an empty per-task scratch dir under the task cwd
  (fallback: OS temp), strips any inherited override, and cleans up after
  the run. Workspace skills under `{cwd}/.claude/skills/` still load.
  "merge" preserves the legacy inherit-from-machine behavior; Codex and
  other isolated backends are no-ops.
- UI: new Skills toggle in the Create Agent dialog and the Agent → Skills
  tab, with EN/zh-Hans copy and SkillsLocalToggle shared between the two.
- Tests: unit coverage for the new env helper, isolation dir lifecycle,
  full Claude execute paths (ignore + merge), and the handler tristate
  contract. Existing skills-tab test updated for the new copy.
- Docs: updated `/skills` docs (EN + ZH) and added a 0.3.7 changelog entry
  in the landing-page i18n.

Co-authored-by: multica-agent <github@multica.ai>

* fix(agent): preserve claude login + validate skills_local input (MUL-2603)

Address Elon's review on PR #3200:

1. Skill isolation no longer drops the operator's Claude login. The
   per-task scratch dir now mirrors every entry under `~/.claude/`
   as symlinks except `skills/`, so `.credentials.json`, settings,
   plugins, etc. reach the CLI exactly as on the host while the
   user-global skills directory stays hidden. Without this, default
   `ignore` would have broken every Claude agent on a non-API-key
   host the moment migration 108 landed.

2. Internal CreateAgent callers (agent_template, onboarding_shim)
   now set `SkillsLocal: "ignore"`. The Go zero value was about to
   trip the migration-108 CHECK constraint and 500 template /
   onboarding agent creation.

3. Create / update handler validation no longer normalizes garbage
   to "ignore". The strict 400 path is now reachable on bad client
   input; the drift-safe `normalizeSkillsLocal` stays on the read
   side only.

UI copy + docs clarified that the toggle is Claude-only; other
runtimes ignore the setting.

Verification:
- `go test ./...` green (full suite locally).
- `pnpm --filter @multica/views exec vitest run agents/components/tabs/skills-tab.test.tsx` green.
- Handler DB-backed tests still skip locally without docker (same
  as Elon's run) — CI will validate the create / update paths
  against migration 108.

Co-authored-by: multica-agent <github@multica.ai>

* fix(agent): mirror effective claude config dir with windows fallback (MUL-2603)

Address Elon's second-round review on PR #3200:

1. The per-task scratch dir now mirrors the *effective* host Claude
   config dir, not unconditionally `~/.claude/`. Precedence: agent
   `custom_env` CLAUDE_CONFIG_DIR > parent process env > `~/.claude/`.
   Without this, an operator who pinned Claude at a managed install
   (custom env CLAUDE_CONFIG_DIR) would get the wrong credentials in
   the scratch dir, because `buildClaudeEnv` strips that env before
   handing it to the child. We resolve the source up front and feed
   it to the mirror, so the override env still points at the right
   bytes.

2. Mirror entries now go through platform-aware linkers. On Windows
   without Developer Mode / admin, `os.Symlink` is denied, which
   previously left the scratch dir empty and broke Claude Code auth
   on default `ignore`. The new helpers try symlink first, then fall
   back to a directory junction (`mklink /J`) for dirs or a hardlink
   (same-volume content share) / copy for files. Mirrors the
   execenv/codex_home_link_windows.go pattern.

3. Tests:
   - `TestResolveHostClaudeConfigDir` locks in the custom_env >
     parent_env > `~/.claude` precedence.
   - `TestNewIsolatedClaudeConfigDirMirrorsCustomHostDir` confirms
     the scratch dir picks up `.credentials.json` from a synthetic
     custom host dir, proving the source resolution actually
     propagates into the mirror.
   - `TestNewIsolatedClaudeConfigDirEmptyHostIsNoop` documents the
     env-var-auth-only case (no host source ⇒ empty scratch dir).
   - `TestMirrorHostClaudeExceptSkillsWith_FallbackWhenSymlinkFails`
     exercises the Windows-no-Developer-Mode path via the new
     `mirrorHostClaudeExceptSkillsWith` seam, asserting credentials
     and sub-dir children still reach the scratch dir after the
     symlink stand-in fails.
   - `TestMirrorHostClaudeExceptSkillsWith_PropagatesFirstLinkError`
     confirms callers see the per-entry error when even fallback
     fails (so the warn-log fires on broken Windows installs).
   - `TestCopyFileRoundTrip` covers the last-resort copy fallback
     and its EXCL no-overwrite contract.
   - `TestClaudeExecuteIsolatesUsesCustomEnvSource` is the
     end-to-end check: an agent with custom_env CLAUDE_CONFIG_DIR
     reads its credentials from the pinned dir, not `~/.claude/`.

4. Docs: `apps/docs/content/docs/skills.{mdx,zh.mdx}` updated to
   describe the effective-source resolution and the Windows
   fallback chain so the docs match the runtime behaviour.

Verification:
- `go test ./...` green (full server suite locally, including
  `pkg/agent` 23 cases covering the new + existing isolation
  paths).
- `GOOS=windows GOARCH=amd64 go vet ./pkg/agent/...` and
  `go test -c -o /dev/null` both compile clean, confirming the
  Windows-tagged linker file builds.

Co-authored-by: multica-agent <github@multica.ai>

* fix(agent): default skills_local to merge to preserve legacy behavior (MUL-2603)

Per Bohan's product decision on PR #3200, the per-agent host-skill toggle
defaults to "merge" — the pre-MUL-2603 inherit-from-machine behavior —
so existing personal workflows that rely on locally installed Claude
Skills keep working unchanged. Agent owners explicitly opt into "ignore"
when they need to harden a shared agent against a broken local skill on
one operator's machine (GitHub #3052).

Also audited all 11 runtimes for user-global skill discovery paths and
documented the scope of the toggle. Only Claude reads a user-global
`~/.claude/skills/`; Codex isolates via `CODEX_HOME`, the ACP backends
(Hermes / Kimi / Kiro) and the JSON-stream backends (Copilot / Cursor /
Gemini / Pi / OpenCode / OpenClaw) anchor discovery to the task workdir
and never read a user-global skill directory. UI copy and docs now say
"for runtimes that support it (currently Claude Code)" everywhere so
the scope is explicit.

Changes:

- Migration 108: column default flipped to 'merge'.
- Handler CreateAgent: missing field → "merge"; explicit "ignore" /
  "merge" still validated, garbage still 400.
- normalizeSkillsLocal: drift-safe coercion now lands on "merge" for
  anything that isn't the exact literal "ignore".
- agent_template.go / onboarding_shim.go: internal CreateAgent callers
  send "merge" instead of "ignore" to match the new default.
- Claude runtime (`claude.go`): isolate-mode gate flipped from
  `SkillsLocal != "merge"` to `SkillsLocal == "ignore"`, so "" (legacy
  daemons / older clients) and "merge" both walk `~/.claude/` directly.
- Create Agent dialog + Skills tab: toggle defaults to on (merge); only
  duplicate of an explicit "ignore" agent carries through. The
  isolation opt-in is now `skills_local: "ignore"` when the user flips
  off; "merge" is omitted from the request body.
- i18n (EN + zh-Hans): copy reframed — "On (default) — merged"; "Off —
  ignored. Recommended for shared agents".
- Docs (`/skills`, `/guides/agents.zh`): describe new default and
  enumerate which runtimes act on the toggle.
- Landing changelog 0.3.7: retitled "Per-Agent Local-Skill Toggle"; note
  the on-by-default behavior + off-to-isolate framing.
- Tests:
  - `TestClaudeExecuteIsolatesHostSkillsWhenIgnoreOptedIn` replaces the
    old by-default isolation case (now requires explicit "ignore").
  - New `TestClaudeExecuteDefaultModeKeepsHostConfigDir` locks in that
    default ExecOptions preserve the host CLAUDE_CONFIG_DIR.
  - `TestClaudeExecuteIsolatesUsesCustomEnvSource` now explicitly opts
    into "ignore" mode.
  - Handler tests: omitted → "merge"; explicit "ignore" round-trips;
    preserve-existing test seeds "ignore" and asserts "merge" flip-back.
  - `TestNormalizeSkillsLocal_DriftStaysSafe`: only literal "ignore"
    maps to ignore; everything else → "merge".
  - `skills-tab.test.tsx`: toggle ON by default; flip OFF when agent
    opted into "ignore". Intro-text matcher anchored to a more specific
    phrase so it no longer collides with the toggle hint copy.

Verification:
- `go test ./...` green (full server suite locally).
- `GOOS=windows GOARCH=amd64 go vet ./pkg/agent/...` and
  `go test -c -o /dev/null` both compile clean (windows-tagged linker
  file still builds).
- `pnpm typecheck` green across all packages and apps.
- `pnpm --filter @multica/views test` 88 files / 771 tests green.
- `pnpm --filter @multica/core test` 43 files / 390 tests green.
- Handler DB-backed tests still skip locally without docker; CI will
  validate the create / update paths against migration 108.

Co-authored-by: multica-agent <github@multica.ai>

* chore(landing): drop 0.3.7 changelog entry from this PR (MUL-2603)

The landing-page release notes belong in a separate release-prep PR, not in the feature PR.

Co-authored-by: multica-agent <github@multica.ai>

* fix(agent): propagate skills_local=ignore to codex user-skill seed (MUL-2603)

Make the per-agent skills_local toggle real for Codex too, not just Claude.
Previously the toggle was only consumed by the Claude backend, while the
daemon's execenv layer always seeded Codex's per-task CODEX_HOME with the
host machine's user-installed skills from ~/.codex/skills/. A shared Codex
agent with skills_local=ignore could still inherit a broken local skill
from one operator's machine.

Now: PrepareParams/ReuseParams carry SkillsLocal; hydrateCodexSkills
skips seedUserCodexSkills when SkillsLocal == "ignore" so the per-task
CODEX_HOME exposes only workspace skills to the codex CLI. Default
("merge", or empty from older servers/clients) preserves existing
inherit-from-machine behavior. UI / docs are updated to reflect the
contract honestly: Claude Code and Codex honor the toggle; other
runtimes (Hermes / Kimi / Kiro / Copilot / Cursor / Gemini / Pi /
OpenCode / OpenClaw) leave $HOME untouched and discover user-level
skills natively, so the toggle is a no-op for them today.

New tests: TestPrepareCodexSkillsLocalIgnoreSkipsUserSeed,
TestPrepareCodexSkillsLocalMergeSeedsUserSkills, and
TestReuseCodexSkillsLocalIgnoreSkipsUserSeed cover Prepare(ignore),
Prepare(merge), and the toggle-flip-on-reuse path.

Co-authored-by: multica-agent <github@multica.ai>

* docs(skills): scope skills_local toggle copy to Claude Code + Codex (MUL-2603)

Off-state hint and Skills tab intro now explicitly call out Claude Code +
Codex as the only runtimes that honor the toggle, with "other runtimes
ignore this setting" wired into both states (en + zh-Hans), so users on
non-Claude/Codex agents don't read "Off" as runtime-wide isolation.

Docs (skills.mdx, skills.zh.mdx, guides/agents.zh.mdx) stop describing
Hermes / Kimi / Gemini / Copilot / Cursor / Pi / OpenCode / OpenClaw / Kiro
as having native user-level skill discovery; the daemon simply does not
manage user-level skill discovery for those runtimes today, and the toggle
is a no-op regardless of where it is set.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-26 13:26:33 +08:00
Bohan Jiang
13f74e651a feat(agents): remove custom_env from agent resources, add audited env endpoint (MUL-2600) (#3209)
* feat(agents): remove custom_env from agent resources, add audited env endpoint (MUL-2600)

The agent resource shape (list / get / create / update / archive /
restore responses + WebSocket events) no longer carries `custom_env`
values. Reads/writes of env now flow exclusively through a dedicated
`/api/agents/{id}/env` endpoint that is owner/admin-only, rejects
agent-actor sessions, applies a "****" sentinel preserve guard on
PUT, and writes a persistent audit row per reveal/update.

Why
- `multica agent list --output json` historically returned plaintext
  `custom_env` for owner/admin callers (the redaction gate gave only
  members the masked map). Any agent token running on the workspace
  inherits its owner's role and could read every other agent's
  secrets just by listing.
- Patching list/get redaction alone (PR #3175 direction) left
  symmetric leaks via mutation responses, WS events, the "reveal"
  path itself (no actor-aware auth), and a `****` overwrite footgun
  on UpdateAgent.

What changed
- Backend: drop `custom_env` from AgentResponse; add coarse
  `has_custom_env` + `custom_env_key_count`. Strip env handling from
  UpdateAgent (silently ignored if sent). Keep CreateAgent's
  custom_env acceptance.
- Backend: new GET/PUT `/api/agents/{id}/env` handlers in
  `internal/handler/agent_env.go`:
  - resolveActor → 403 for agent actors (closes the lateral-movement
    path).
  - Owner/admin role gate via existing helper.
  - PUT honours value == "****" as "preserve existing value".
  - Both write to `activity_log` with `agent_env_revealed` /
    `agent_env_updated` actions. Audit details record key names only,
    never values.
- Daemon claim path (`ClaimAgentTask`) unchanged — `TaskAgentData`
  still carries plaintext env for runtime injection.
- SQL: new `UpdateAgentCustomEnv` query; sqlc regenerated (v1.31.1).
- CLI: new `multica agent env get|set` subcommands. `--custom-env*`
  flags removed from `multica agent update`; the no-fields error
  now points to the new path.
- Frontend: drop env fields from `Agent` + `UpdateAgentRequest`; add
  `getAgentEnv` / `updateAgentEnv` client methods; rewrite env-tab
  to show "N variables configured" + explicit "Reveal & edit"
  button, fetching values only on intentional reveal.
- Locales: parity-safe additions to en + zh-Hans.
- Docs: agents-create.{mdx,zh.mdx} reflect the new threat model and
  endpoint.
- Mobile: schema drops `custom_env` / `custom_env_redacted`, adds
  metadata fields.

Tests
- Handler tests pinned the new invariants: no env in list/get
  responses, owner reveal happy-path + audit row, agent-actor 403,
  `****` sentinel preserves real values, UpdateAgent silently
  ignores `custom_env`, pure `mergeAgentEnv` cases.
- CLI tests pivot to the new flag surface: `agent update` MUST NOT
  expose the env flags; `agent env set` MUST expose
  --custom-env-stdin/--custom-env-file.
- Frontend test fixtures updated; pnpm typecheck / test / lint
  pass cleanly.

This is a breaking API change. Scripts that read `custom_env` from
`/api/agents` must migrate to `GET /api/agents/{id}/env`.

Co-authored-by: multica-agent <github@multica.ai>

* fix(agents): close actor-spoofing + audit fail-closed in env endpoints (MUL-2600)

Addresses Elon's review of #3209:

* Mint a task-scoped `mat_` token per claim, bound to (agent, task,
  workspace, owner). Daemon injects it into the agent process in place
  of its own credential. Auth middleware authoritatively rebuilds
  X-User-ID / X-Agent-ID / X-Task-ID from the token row and sets
  X-Actor-Source=task_token; that header is server-set only — incoming
  values are stripped before any auth branch runs. resolveActor honors
  the header so an agent that strips X-Agent-ID / X-Task-ID still
  resolves as actor=agent.
* GetAgentEnv / UpdateAgentEnv are now fail-closed on audit-log
  failures: GET refuses to return plaintext, PUT persists inside the
  same tx as the audit row so they commit/roll back together.
* PUT /api/agents/{id} returns 400 when the body carries custom_env
  instead of silently dropping it — directs callers to the audited env
  endpoint.
* Agent actors never see mcp_config, even when the underlying member
  is owner/admin; mutation broadcasts go through a redaction shim so
  WS subscribers don't pick it up either.
* Fix backend test that asserted dense JSON (jsonb::text renders
  whitespace) and frontend test that assumed a unique "Test User"
  match.

Co-authored-by: multica-agent <github@multica.ai>

* fix(agents): close residual MUL-2600 gaps from review (MUL-2600)

Migration 108 FK now correctly references agent_task_queue(id) instead
of the non-existent agent_task table; the previous name blocked CI
backend migrations.

Task-token-authenticated requests can no longer be re-routed at a
different workspace by passing workspace_slug / workspace_id /
?workspace_id / a URL workspace param. ResolveWorkspaceIDFromRequest
and resolveWorkspaceUUID both short-circuit on X-Actor-Source=task_token
and return only the token-bound X-Workspace-ID; buildMiddleware adds a
defence-in-depth 403 if any URL-resolved workspace disagrees with the
token binding.

mcp_config no longer leaks back to agent actors through UpdateAgent /
CreateAgent / ArchiveAgent / RestoreAgent HTTP responses — the same
redactAgentResponseForActor helper that GetAgent/ListAgents use is now
applied to mutation responses too. WS broadcasts were already redacted
via broadcastAgentResponse.

FailTask and every TaskService cancel path (CancelTask /
CancelTasksForIssue / CancelTasksForAgent / CancelTasksByTriggerComment
/ BroadcastCancelledTasks) now eagerly DeleteTaskTokensByTask so the
mat_ token's 24h window doesn't outlive a terminated task. Failure is
non-fatal — the FK cascade and expiry remain durable guards.

Doc-only: clarify that PUT /api/agents/{id} now hard-rejects bodies
that carry custom_env (was previously "silently ignores").

Tests:
- middleware: TestResolveWorkspaceIDFromRequest gains a task_token
  case asserting client-supplied slug/id/query cannot override the
  bound workspace.
- handler: TestUpdateAgent_RedactsMcpConfigForAgentActor and
  TestUpdateAgent_KeepsMcpConfigForMemberActor pin the mutation-
  response redaction contract per actor type.

Co-authored-by: multica-agent <github@multica.ai>

* fix(agents): match redacted mcp_config as JSON null, not Go nil (MUL-2600)

`AgentResponse.McpConfig` is `json.RawMessage` without `omitempty`, so
the redacted response serialises as `"mcp_config": null`. On decode,
`json.RawMessage` keeps the literal bytes `null` rather than collapsing
to Go nil, which made the assertion fire on a non-leak.

The product contract (field always present, distinguished from "no
config" via `mcp_config_redacted`) is intentional, so adjust the test
to check for "no secret-bearing content" instead of weakening the
contract via `omitempty`.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-25 18:42:48 +08:00
Bohan Jiang
cd71b0fe05 fix(daemon): disable Codex native auto-memory in per-task config.toml (#3202)
Codex CLI's auto-memory subsystem writes summaries to
`$CODEX_HOME/memories/raw_memories.md` and `state_*.sqlite`, then reads
them back on the next turn. The daemon never cleared these files across
Reuse(), and Codex CLI may also pull from user-level `~/.codex/memories/`
entirely outside the per-task isolation. Either path leaks unrelated
context into new Multica tasks — multica#3130 saw `D:\Project\MoHaYu\
WowChat` Raw Memories injected into a brand-new issue's first turn.

Write a daemon-managed block into the per-task `config.toml` that sets
`features.memories = false`, `memories.generate_memories = false`, and
`memories.use_memories = false`. Codex then neither writes nor reads
its memory subsystem regardless of where the residual files live. The
user's global `~/.codex/config.toml` is never touched.

Pattern mirrors `ensureCodexMultiAgentConfig`: idempotent managed-block
upsert, two TOML layout variants (root dotted-key vs. inside a `[features]`
/ `[memories]` table) to satisfy strict toml-rs parsing, and a
`MULTICA_CODEX_MEMORY` env-var escape hatch.

MUL-2598

Co-authored-by: multica-agent <github@multica.ai>
2026-05-25 15:17:38 +08:00
LinYushen
8e9df90d32 feat: include repo description in agent brief (#3203)
Add Description field to RepoData structs so that workspace repo
descriptions (set via the settings UI) are preserved through
normalization and rendered in the agent brief as:
  - <url> — <description>

When no description is set, the existing format is unchanged.

Closes MUL-2610

Co-authored-by: multica-agent <github@multica.ai>
2026-05-25 15:16:22 +08:00
Bohan Jiang
a55c03a0b3 fix(agent): inject Workspace Context into agent brief (MUL-2542) (#3078)
* fix(agent): inject Workspace Context into agent brief (MUL-2542)

The per-workspace `workspace.context` field (Settings → General) was
stored in the DB but never reached the agent prompt. Plumb it from the
workspace row through the claim response, the daemon's Task struct and
TaskContextForEnv, and render it as `## Workspace Context` in the meta
brief above `## Available Commands`. Heading is skipped when the field
is empty so workspaces that haven't set a context don't see a bare
header. Applies to every task kind — issue, comment, chat, autopilot,
quick-create — so the shared system prompt is consistent regardless of
trigger source.

Co-authored-by: multica-agent <github@multica.ai>

* chore(server): gofmt files touched by workspace-context injection

Run gofmt on the files that buildWorkspaceContext injection touched.
Cleans up composite-literal alignment in execenv task context and
struct-tag alignment in Task / AgentTaskResponse / RegisterRequest.
No behavior change.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
Co-authored-by: J <agent-j@multica.ai>
2026-05-22 17:23:27 +08:00
Bohan Jiang
c967ae0e0e feat(issues): platform-owned parent notify on child done (MUL-2538) (#3055)
* feat(issues): platform-owned parent notify on child done (MUL-2538)

When a child issue transitions from a non-done status into `done` and has
an open parent, the server now posts a top-level platform-generated
comment on the parent itself. Replaces the agent-prompt rule shipped in
PR #2918, which produced self-mention loops, planner ping-pong, and
accidental `MUL-` prefix hardcoding because the agent did not always know
the workspace prefix.

- Migration 107 widens `comment.author_type` to allow `system`; the
  zero UUID is used as the sentinel `author_id` (the column stays NOT
  NULL, callers branch on `author_type === 'system'`).
- `Handler.notifyParentOfChildDone` fires from both `UpdateIssue` and
  `BatchUpdateIssues`. Guards: prev status != done, new status == done,
  parent set, parent not in `done`/`cancelled`. Bypasses the
  CreateComment HTTP path so the assignee on_comment trigger and the
  mention-trigger paths do not fire — the comment content carries only
  the safe issue mention for the child, no `mention://agent/...` /
  `mention://member/...` / `mention://squad/...` links.
- `runtime_config.go` downgrades the Parent/Sub-issue Protocol rule 1
  to an explicit "do NOT post one yourself" guardrail; rule 2 (sub-issue
  creation `--status todo` vs `backlog`) is unchanged.
- New handler test exercises the happy path, idempotency, reopen+done,
  parent done/cancelled guards, and the no-parent case. Runtime-config
  tests reassert the new wording and the banned strings from the prior
  revision.

Co-authored-by: multica-agent <github@multica.ai>

* fix(issues): isolate system comments + wire GH merge path (MUL-2538)

Addresses the two must-fix items from the PR #3055 second review:

1. The platform-generated `comment:created` event (author_type='system')
   was running through the generic comment listeners, which (a) tried to
   subscribe the zero-UUID author and (b) parsed @mentions from the body
   for inbox notifications. Both subscriber_listeners and
   notification_listeners now early-return on author_type='system' so the
   event becomes a pure WS broadcast for the timeline — no inbox rows,
   no transcluded-mention attack surface.

2. advanceIssueToDone (the GitHub merge auto-done path) only published
   issue:updated and skipped notifyParentOfChildDone, so a child closed
   via merged PR — the dominant completion path — left the parent
   silent. The helper is now invoked on the same prev/updated pair, with
   the existing guards (transition + parent state) protecting double-fire.

Tests:
- New cmd/server/notification_listeners_test:
  TestNotification_SystemCommentSkipsInboxAndMentions (parent subscribers
  and smuggled @mention targets stay quiet),
  TestSubscriberSystemCommentDoesNotSubscribe (zero-UUID never reaches
  AddIssueSubscriber).
- New internal/handler/github_test:
  TestWebhook_MergedPR_ChildWithParent_NotifiesParent fires a real
  pull_request closed-merged webhook against a child and asserts the
  parent receives exactly one safe system comment with the workspace's
  real identifier (no `mention://agent|member|squad` links).

Co-authored-by: multica-agent <github@multica.ai>

* fix(runtime): drop parent-notification guidance from agent brief (MUL-2538)

Per Bohan's product call on PR #3055: the platform now owns the
child-done parent notification, so the runtime brief should not mention
the parent-comment path at all — not as an instruction, not as a "do
not do it" guardrail. The previous revision kept rule 1 of the Parent /
Sub-issue Protocol as a "Do NOT post your own parent-notification
comment." sentence; that still puts the concept in front of the agent
every run, which is exactly what we are trying to avoid.

What changes:
- Delete the "Parent / Sub-issue Protocol" preamble and rule 1 from
  buildMetaSkillContent. The remaining content — the `--status todo`
  vs `--status backlog` rule for creating sub-issues — now lives in a
  dedicated `## Sub-issue Creation` section, since the parent/child
  framing it previously sat under is gone.
- The system comment on the parent stays exactly as in 366f6e2: the
  agent simply does not need to know about it.

Tests:
- runtime_config_test.go is rewritten around the new section name and
  the wider "no parent-notification guidance" canary; the banned list
  now covers both the original PR #2918 wording and the intermediate
  "do NOT post one" wording.

System comment UI: the frontend already renders `author_type === "system"`
with author name "Multica" (`useActorName`) and the MulticaIcon avatar
(`ActorAvatar` via `isSystem`), matching Bohan's "looks like a normal
comment, author is multica + multica logo" requirement — no frontend
changes needed.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-22 14:51:43 +08:00