multica

mirror of https://github.com/multica-ai/multica.git synced 2026-07-05 13:29:44 +02:00

Author	SHA1	Message	Date
Bohan Jiang	674be86add	fix(tasks): cancel autopilot run_only & quick_create tasks (MUL-2827) (#3615 ) CancelTaskByUser (POST /api/tasks/{taskId}/cancel) keyed cancellation off issue_id / chat_session_id alone, so any task whose only source link was autopilot_run_id (run_only autopilots) or quick_create context fell into the dead else branch and 404'd with "task not found" — even though the task was visible (and showed a cancel X) on the agent Activity tab. Enforce tenancy uniformly through the task's owning agent instead: agent_id is NOT NULL on every task row (ON DELETE CASCADE), and agents are workspace-scoped, so GetAgentTaskInWorkspace (task JOIN agent ON workspace) is a single tenant guard that works regardless of which optional source FK is set — including orphan tasks whose autopilot_run_id was SET NULL after the autopilot was deleted. Privacy layers on top: chat tasks stay creator-only, and every other task mirrors the agent Activity / snapshot private-agent visibility gate via canAccessPrivateAgent so the id-only endpoint is never more permissive than the surface that exposes the task. Tests cover run_only (same-ws success, cross-ws 404 no-mutation), quick_create, retry clones, issue-task regression, chat non-creator 403, and private-agent plain-member 403. Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-01 22:11:27 +08:00
Naiyuan Qing	3187bbf90c	feat(comments): re-add since-delta + cold-start thread read + parent-root write normalization (#3494 ) * feat(comments): since-delta new-comment hint + default-on comment session resume (#3432) * feat(db): add unresolved comment count + list filter queries Add CountUnresolvedComments (excludes the agent's own comments) and ListUnresolvedCommentsForIssue. Both are additive — existing callers stay on the unfiltered queries — so old clients are unaffected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(handler): support unresolved-only comment listing Wire an additive `unresolved` query param into ListComments. Defaults off so an old CLI that never sends it gets unchanged behavior; only true/1 enable it. Rejects combining unresolved with thread/recent (whole-issue filter vs navigation models). Includes filter + count query tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(handler): plumb unresolved count + thread root into claim, gate comment resume Populate trigger_parent_id (thread root of the trigger comment) and unresolved_count (excludes the agent's own comments) on comment-triggered claim responses. Both fields are omitempty so old daemons ignore them. Gate comment-triggered session resume behind MULTICA_RESUME_COMMENT_SESSION (default off): resumed comment turns can inherit the prior turn's "Done." final message, so this stays an explicit rollout switch. The runtime-match and poisoned-session guards still apply regardless of the flag. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(daemon): inject unresolved-comments hint + resolve step into agent brief Add a shared BuildUnresolvedCommentsHint helper rendered on both the per-turn prompt and the CLAUDE.md workflow (kept in sync per PR #2816). It ships only the count and the relevant CLI call — never comment bodies — so the server stays cheap. Thread case points at --thread <root>; issue case points at --unresolved. Suppressed when the count is 0. Also add a workflow step telling the agent to `multica comment resolve <thread-root>` once a thread is fully handled, so the unresolved set converges. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(cli): add comment list --unresolved and comment resolve command Add an --unresolved filter to `issue comment list` (wired to the server's unresolved param, rejected when combined with --thread/--recent) and a top-level `comment resolve <id>` command that POSTs to the existing /api/comments/{id}/resolve endpoint, letting an agent close threads it has fully handled. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(comments): since-delta new-comment hint + default-on comment resume Simplifies the comment-triggered agent flow down to what's actually needed: - New-comment awareness is now a pure time delta: the claim response carries new_comment_count + new_comments_since (anchored on the prior run's started_at, never completed_at so a long run can't miss comments). The per-turn prompt and CLAUDE.md workflow render one line — "N new comment(s) since your last run, --since <ts>" — via a shared BuildNewCommentsHint so the two surfaces can't drift. Cold start (no prior run) falls back to a plain read. - Comment-triggered tasks resume the prior session by default (same runtime), dropping the MULTICA_RESUME_COMMENT_SESSION rollout gate. The "Focus on THIS comment" prompt guard defends against inheriting the prior turn's "Done." marker; GetLastTaskSession still excludes poisoned sessions. - Drops the resolved-based machinery from the first draft: CountUnresolvedComments / ListUnresolvedCommentsForIssue queries, the `comment list --unresolved` flag, the `multica comment resolve` command, and the resolve workflow step. - Removes the verbose cursor-pagination paragraph from the comment prompt; the --thread/--recent/--since flags stay in the CLI/API, just no longer explained inline every turn. Compatibility: new claim fields are omitempty (old daemons ignore them). Comment resume is default-on and affects even old daemons, which already consume prior_session_id. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(comments): collapse reply parent_id to thread root on write Comment threads are a 2-level model (root + flat replies, like Linear/Slack), enforced today only by the UI and the agent path — the CreateComment handler stored whatever parent_id it was handed, and the agent-side flatten walked just one level, so a reply-to-a-reply could land at depth 3+. Add GetThreadRoot (a recursive walk to the parent_id=NULL root) and run both write paths (handler.CreateComment, service.createAgentComment) through it, so every stored reply's parent_id IS its thread root. Readers can now treat parent_id as the thread root without re-walking. The agent-drift guard still compares the raw parent_id to the trigger comment before normalization. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(comments): cold-start reads triggering thread, warm keeps --thread pointer The since-delta rework dropped the thread-first read on the COLD path: a first-time agent fell back to the flat `comment list` dump (oldest-first, cap 2000), burying the trigger's context in ancient chatter. Point cold start at the triggering conversation instead via a shared BuildColdCommentsHint (`--thread <trigger> --tail 30` + a --recent pointer for cross-thread background). On the WARM path, --since is a pure time delta and can miss the triggering thread's pre-anchor history, so BuildNewCommentsHint now also emits a --thread pointer. Both surfaces (per-turn prompt + CLAUDE.md workflow) render via the shared helpers so they cannot drift (PR #2816 rule). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-29 10:38:37 +08:00
Naiyuan Qing	d90732750f	Revert "feat(comments): since-delta new-comment hint + default-on comment ses…" (#3455 ) This reverts commit `5e78e5100a`.	2026-05-28 17:52:59 +08:00
Naiyuan Qing	5e78e5100a	feat(comments): since-delta new-comment hint + default-on comment session resume (#3432 ) * feat(db): add unresolved comment count + list filter queries Add CountUnresolvedComments (excludes the agent's own comments) and ListUnresolvedCommentsForIssue. Both are additive — existing callers stay on the unfiltered queries — so old clients are unaffected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(handler): support unresolved-only comment listing Wire an additive `unresolved` query param into ListComments. Defaults off so an old CLI that never sends it gets unchanged behavior; only true/1 enable it. Rejects combining unresolved with thread/recent (whole-issue filter vs navigation models). Includes filter + count query tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(handler): plumb unresolved count + thread root into claim, gate comment resume Populate trigger_parent_id (thread root of the trigger comment) and unresolved_count (excludes the agent's own comments) on comment-triggered claim responses. Both fields are omitempty so old daemons ignore them. Gate comment-triggered session resume behind MULTICA_RESUME_COMMENT_SESSION (default off): resumed comment turns can inherit the prior turn's "Done." final message, so this stays an explicit rollout switch. The runtime-match and poisoned-session guards still apply regardless of the flag. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(daemon): inject unresolved-comments hint + resolve step into agent brief Add a shared BuildUnresolvedCommentsHint helper rendered on both the per-turn prompt and the CLAUDE.md workflow (kept in sync per PR #2816). It ships only the count and the relevant CLI call — never comment bodies — so the server stays cheap. Thread case points at --thread <root>; issue case points at --unresolved. Suppressed when the count is 0. Also add a workflow step telling the agent to `multica comment resolve <thread-root>` once a thread is fully handled, so the unresolved set converges. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(cli): add comment list --unresolved and comment resolve command Add an --unresolved filter to `issue comment list` (wired to the server's unresolved param, rejected when combined with --thread/--recent) and a top-level `comment resolve <id>` command that POSTs to the existing /api/comments/{id}/resolve endpoint, letting an agent close threads it has fully handled. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(comments): since-delta new-comment hint + default-on comment resume Simplifies the comment-triggered agent flow down to what's actually needed: - New-comment awareness is now a pure time delta: the claim response carries new_comment_count + new_comments_since (anchored on the prior run's started_at, never completed_at so a long run can't miss comments). The per-turn prompt and CLAUDE.md workflow render one line — "N new comment(s) since your last run, --since <ts>" — via a shared BuildNewCommentsHint so the two surfaces can't drift. Cold start (no prior run) falls back to a plain read. - Comment-triggered tasks resume the prior session by default (same runtime), dropping the MULTICA_RESUME_COMMENT_SESSION rollout gate. The "Focus on THIS comment" prompt guard defends against inheriting the prior turn's "Done." marker; GetLastTaskSession still excludes poisoned sessions. - Drops the resolved-based machinery from the first draft: CountUnresolvedComments / ListUnresolvedCommentsForIssue queries, the `comment list --unresolved` flag, the `multica comment resolve` command, and the resolve workflow step. - Removes the verbose cursor-pagination paragraph from the comment prompt; the --thread/--recent/--since flags stay in the CLI/API, just no longer explained inline every turn. Compatibility: new claim fields are omitempty (old daemons ignore them). Comment resume is default-on and affects even old daemons, which already consume prior_session_id. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 15:58:42 +08:00
Bohan Jiang	341ce7bfa5	feat: support local working directory for projects (MUL-2618 v1) (#3283 ) * feat(project): add local_directory project_resource type (MUL-2662) Adds a second project_resource type alongside github_repo so a project can be pinned to an existing directory on a specific daemon (the v1 of the local-working-directory flow tracked in MUL-2618). The ref schema is { local_path, daemon_id, label? }; local_path must be absolute and daemon_id is required. The same (daemon_id, local_path) pair is allowed on multiple projects by design — no UNIQUE constraint is added. Implementation reuses the existing project_resource API surface: the new type is wired through the validator switch with no migration, no new events, and no daemon-handler changes (daemon already passes through arbitrary resource types via ProjectResources). The CLI gains --local-path / --daemon-id / --ref-label shortcuts so `multica project resource add --type local_directory` mirrors the existing `--type github_repo --url ...` ergonomics; the generic --ref flag still works for both types. Tests cover the full CRUD lifecycle, the same-path-across-projects allowance, the same-path-same-project conflict, the validator rejections (missing/blank/relative path, missing daemon_id, wrong payload type), and the cross-platform isAbsoluteLocalPath helper. Co-authored-by: multica-agent <github@multica.ai> * feat(project): add update endpoint + label-shadow guard for project_resource (MUL-2662) Addresses the Elon review on PR #3263: - Add PUT /api/projects/{id}/resources/{resourceId} with sqlc query, matching handler, CLI `project resource update`, and a new EventProjectResourceUpdated WS event. resource_type stays immutable; ref/label/position are all individually optional. - Catch same-project (daemon_id, local_path) collisions where only the embedded label differs — the row-level UNIQUE only matches the full ref JSON, so a label typo would otherwise let the same working directory bind twice. - Tests cover the update lifecycle (label-only / ref / clear / 404 / invalid path) and the label-shadow conflict on both create and update; the in-place rename still succeeds because the conflict scan ignores the row being edited. Incidental: regenerating sqlc picked up a missing skills_local scan in UpdateAgentCustomEnv that drifted in from #3200. Co-authored-by: multica-agent <github@multica.ai> * fix(project): close bundled-create label-shadow gap + merge resource_ref on CLI update (MUL-2662) Two follow-ups from MUL-2662 review round 2: - CreateProject inline resources path now dedupes local_directory entries on (daemon_id, local_path) before opening the transaction. The DB-level UNIQUE(project_id, resource_type, resource_ref) constraint only fires on a full JSON match, so two rows with the same target but different `label` would otherwise slip past. Standalone POST/PUT already cover this via findLocalDirectoryConflict; bundled create was the missing surface. - `multica project resource update` now seeds resource_ref from the existing row before applying per-type shortcut flags, so `--default-branch-hint x` on its own no longer constructs a payload missing `url` (which the server 400s on). Local_directory partial edits get the same merge behavior. Co-authored-by: multica-agent <github@multica.ai> * feat(desktop): local_directory project_resource UI (MUL-2665) (#3273) * feat(desktop): local_directory project_resource UI (MUL-2665) First UI surface for the local-working-directory flow tracked in MUL-2618. Lets users on the desktop pin a project to an existing folder on this machine; web stays read-only since the per-daemon check can't be done in the browser. What's new for the renderer: - ProjectResourcesSection grows a desktop-only "Add local directory" button next to the existing GitHub-repo popover. Clicking it opens Electron's native folder picker, validates the path through a new IPC pair (existence + r/w), and submits a project_resource of resource_type=local_directory with daemon_id pulled live from daemonAPI.getStatus. - LocalDirectoryRow renders the rename pencil + path tooltip, and greys out when ref.daemon_id != this machine's daemon_id (with a "only available on the machine that registered this directory" tooltip). Delete stays enabled so users can drop stale registrations from any device. - LocalDirectoryHint sits above the issue-detail comment composer and shows "Agent will work in-place at {label} ({path})" when the issue's project has a local_directory matching this daemon. Hidden on web. - TaskStatusPill picks up a new "waiting_for_directory_release" stage that the daemon will publish when it dequeues a task but can't acquire the path lock. The render is in place now so the daemon sibling subtask can wire the status string without an additional UI PR. Plumbing: - @multica/core/types gains LocalDirectoryResourceRef + UpdateProjectResourceRequest, and the api client gets the matching PUT method backed by the server endpoint that landed in `2ac3faebb` (MUL-2662). A useUpdateProjectResource hook drives the in-place label edit. - New Electron handlers under apps/desktop/src/main/local-directory.ts: local-directory:pick -> dialog.showOpenDialog (openDirectory) local-directory:validate -> stat + access(R_OK + W_OK) exposed through the preload as desktopAPI.pickDirectory / validateLocalDirectory. View code talks to them via a thin packages/views/platform helper that returns reason=unsupported on web instead of crashing. - useLocalDaemonStatus exposes the local daemon's id, device name, and running flag from daemonAPI.onStatusChange so the renderer can do the cross-device match without coupling to the desktop preload typings. Tests: - pickStageKeys gets a unit test covering the new stage and proving the directory-release status outranks availability hints. - LocalDirectoryHint tests cover the four render branches (no project, no daemon, foreign daemon, matching daemon). - i18n parity stays green; new keys added under projects.resources.* and chat.status_pill.stages.waiting_for_directory_release in both locales. Out of scope (will land separately): - The daemon-side waiting/lock signal that flips the pill into the new state. - Adding local_directory to the create-project modal's bulk attach flow. - Docs page refresh for project-resources.mdx — left for the MUL-2618 umbrella sweep. Co-authored-by: multica-agent <github@multica.ai> * fix(desktop): hide rename for foreign daemon local_directory rows (MUL-2618) Address review nit on #3273: the rename pencil was gated only by `canEdit`, so a foreign / unknown-daemon row still showed it even though the spec says cross-device rows are disabled. Gate rename on `!mismatch` so it disappears on those rows; delete stays available so a stale registration can still be dropped from any device. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai> * feat(daemon): local_directory execution + path mutex + GC exception (MUL-2663) (#3274) * feat(daemon): local_directory execution + path mutex + GC exception (MUL-2663) Wires up the daemon side of the local_directory project_resource introduced in MUL-2662. When a task is dispatched against a project whose resources include a local_directory pinned to this daemon's UUID, the daemon now: - Validates the path (absolute, exists, daemon process can read+write, not in the system-root / $HOME blacklist) and fails the task fast on any precondition violation, with a user-readable reason. - Serialises concurrent tasks on the same on-disk path via a daemon-local LocalPathLocker keyed by symlink-resolved realpath. The lock is held for the entire task lifetime (claim → context write → agent → result report). - When the lock is contended, the daemon flips the row to a new waiting_local_directory status on the server (carrying a wait_reason like "<path> (held by task <short id>)") so the UI can render "等待本地目录释放" instead of leaving the row silently in dispatched past the sweeper timeout. The status accepts being woken into running once the lock is acquired. - Sets execenv.WorkDir to the user's path (no copy, no mount). envRoot still lives under workspacesRoot/<wsID>/ and hosts output/, logs/, and .gc_meta.json — the daemon's logbook for the run. - Stamps GCMeta.LocalDirectory=true so the GC loop never RemoveAlls envRoot for these tasks (gcActionClean → gcActionCleanArtifacts, gcActionOrphan → gcActionSkip). The user's directory was never under envRoot to begin with, so this is defense in depth. - Skips execenv.Reuse for local_directory tasks because the prior WorkDir is the user's path and reusing it through that code path loses the envRoot association the GC loop needs. Prepare is cheap here (no clone, no copy), so always running it is fine. Server-side protocol changes: - New CHECK value 'waiting_local_directory' on agent_task_queue.status plus a wait_reason TEXT column (migration 109). - All cancel / active / counted-as-running / orphan-recovery queries expanded to include the new status; FailStaleTasks intentionally excludes it (the daemon owns the wait). - New SQL MarkAgentTaskWaitingLocalDirectory(id, reason) and a relaxed StartAgentTask that accepts both dispatched and waiting_local_directory as preconditions (and clears wait_reason on the way through). - New POST /api/daemon/tasks/{taskId}/wait-local-directory endpoint, TaskService.MarkTaskWaitingLocalDirectory broadcaster, and matching daemon Client.MarkTaskWaitingLocalDirectory. Tests cover: path blacklist + R/W enforcement, mutex serialisation + ctx-cancelled wait, lock handover between two tasks, GC never returns gcActionClean / gcActionOrphan for local_directory rows (with negative control for the standard path), and Prepare/Cleanup correctly substitute + protect the user's WorkDir. The desktop UI side (UI for adding a local_directory resource, surfacing the "等待本地目录" badge) is MUL-2665; the agent-task lifecycle changes (no branch switch, dirty-tree tolerant, auto-commit) are MUL-2664. This PR targets the shared MUL-2618 v1 feature branch agent/j/912b8cb1, not main; the whole v1 will be merged to main together when complete. Co-authored-by: multica-agent <github@multica.ai> * fix(daemon): tighten local_directory status, symlink, cancel handling (MUL-2618) Address the 3 must-fix items from Elon's review of PR #3274. 1. Status string unified. The server / daemon publish `waiting_local_directory`; align views, locales, and the pickStageKeys test (PR #3273 had used `waiting_for_directory_release` on a placeholder string). Without this, the daemon's wait state never reached the pill once the two siblings merged. 2. validateLocalPath now also runs the blacklist against the symlink-resolved realpath, with macOS's `/etc` -> `/private/etc` redirect handled via `isBlacklistedRealPath` which compares canonical forms. Without this, a symlink such as `/Users/me/proj/home -> /Users/me` slipped the literal $HOME check while every daemon write still landed in the user's home. Tests cover symlink-to-home, symlink-to-system-root, and the negative case (symlink to a regular subdirectory). 3. acquireLocalDirectoryLockIfNeeded now spins up a cancellation watcher inside `onWait` (lazy — the fast path stays free) so the gap between dispatch and StartTask responds to server-side cancel or row deletion. If the watcher fires while the daemon is parked on the path mutex, the lock-wait context is cancelled, Acquire returns promptly, and the helper exits silently the same way the run-phase poller does. New TestAcquireLocalDirectoryLock_CancelDuringWait exercises the path end-to-end with a fake server. Co-authored-by: multica-agent <github@multica.ai> * fix(daemon): unconditional canonical blacklist + Windows drive-root generalisation (MUL-2618) - validateLocalPath now always runs isBlacklistedRealPath on the symlink-resolved path, not only when it differs from absPath. The old guard let users type the canonical form of an OS-symlinked banned root (e.g. /private/tmp, /private/etc, /private/var on macOS) straight through, since EvalSymlinks is a no-op on already-canonical input. - Windows drive-root rejection moved off the static C/D/E/F enumeration onto filepath.VolumeName via a new isDriveRoot helper, so removable / network drives mounted at G:..Z: and UNC \\server\share roots are also blocked. systemRootBlacklist keeps the well-known C:\ trees only. - Tests: macOS-only case exercises direct /private/{tmp,etc,var}; a new TestIsDriveRoot covers the Windows generalisation (skipped on POSIX runners by runtime guard). Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai> * feat(views): wire waiting_local_directory end-to-end in issue UI + presence (MUL-2618) Connect the daemon-emitted `task:waiting_local_directory` and `task:running` events through to issue execution log, sticky agent banner, activity indicator, and agent presence so a parked task is no longer invisible on the issue page. - Add `waiting_local_directory` to `AgentTask.status` and the typed `task:running` / `task:waiting_local_directory` WS event payloads. - Chat realtime sync writes both new statuses into the pending-task cache so the chat StatusPill flips out of a stale `dispatched` frame. - ExecutionLogSection: count `waiting_local_directory` as active, add tone + status label, treat parked tasks the same as dispatched for time anchor / transcript visibility / terminate-confirm note. - AgentLiveCard: subscribe to both new events, rank the parked state between dispatched and queued, and surface a "is waiting for the local directory" banner with the muted "Clock" treatment used for queued. - IssueAgentActivityIndicator: route parked tasks into the queued bucket so the hover stack and chip stay visible. - derive-presence: parked tasks count toward `queuedCount` so the agent workload chip stays out of `idle` while the daemon waits on the path lock. - Locales: add `agent_live.is_waiting_local_directory` and `execution_log.status_waiting_local_directory` (en + zh-Hans). Co-authored-by: multica-agent <github@multica.ai> * feat(project): enforce one local_directory per (project, daemon) (MUL-2618) The daemon-side resolver picks the first matching local_directory by daemon_id, so allowing two rows on the same daemon — even at different paths — let the agent silently write into whichever sorted first. Tighten the invariant top to bottom: - server: `findLocalDirectoryConflict` rejects any second row sharing a daemon_id, regardless of `local_path` or label. Bundled-create surface in `CreateProject` runs the same daemon-scoped dedupe up front. - daemon: `findLocalDirectoryAssignment` fails fast when it finds more than one row pinned to the current daemon (older API client / direct DB writes can still produce that state — refuse to guess). - desktop UI: hide the "Add local directory" action once the current daemon owns a row on this project, with a hint and a defensive toast on the call path; foreign-daemon rows stay visible read-only as before. - Tests: * daemon: new `two local_directory rows on this daemon fail fast` / `local_directory rows on different daemons coexist` cases. * handler: rewrite the legacy `LabelShadow` cases as `DaemonScopedConflict` / `BundledLocalDirectoryDaemonConflict` — asserts 409 on same-daemon different-path, 201 on per-daemon bundles. - Locales: en + zh-Hans copy for the new hint + toast. Co-authored-by: multica-agent <github@multica.ai> * chore(sqlc): drop stale skills_local in UpdateAgentCustomEnv (MUL-2618) Follow-up to the main-merge in `0f8e8ca7`: the auto-merge preserved most of main's skills_local revert but kept the column reference inside the UpdateAgentCustomEnv scanner because that block hadn't been touched by either side. Re-running `sqlc generate` regenerates the file without skills_local in this query, matching the rest of the file and the post-revert schema. Co-authored-by: multica-agent <github@multica.ai> * feat(create-project): binary source picker — repos OR local directory Turn the create-project dialog's "Repos" pill into a binary Source picker. A project's source is mutually exclusive: either a set of GitHub repos (worktree mode, default) or a single local working directory (local mode, desktop-only). Mirrors the constraint the backend will enforce next. Behavior: - Pill shows the active mode's selection (GitHub icon + repo count, or folder icon + local label/path). - Popover has a 2-tab segmented control at the top; the Local tab is hidden entirely on web (local_directory needs a daemon_id). - Local tab requires the daemon online — amber notice + disabled picker when offline, re-renders automatically via useLocalDaemonStatus. - Switching tabs preserves the other side's stash, but handleSubmit only emits the resource matching the active sourceMode, so abandoned picks never leak into the created project. Backend mutual-exclusion validation + the resources-section conditional-add-button still to come — this PR just unblocks the dialog so it can be demoed. * fix(mobile): cover waiting_local_directory in run row status maps (MUL-2618) --------- Co-authored-by: multica-agent <github@multica.ai> Co-authored-by: Multica J <j@multica.ai>	2026-05-27 13:44:31 +08:00
Multica Eve	744b474199	revert(agent): remove per-agent local skill toggle (MUL-2603) (#3286 ) * Revert "feat(agents): hide skills_local toggle for runtimes that don't honour it (MUL-2603) (#3276)" This reverts commit `0b50c5a209`. Co-authored-by: multica-agent <github@multica.ai> * Revert "fix(agent): surface host OAuth token via env var on macOS isolation (MUL-2603) (#3267)" This reverts commit `a67bf81225`. Co-authored-by: multica-agent <github@multica.ai> * Revert "fix(agents): tighten skills-tab intro and drop redundant import hint (#3265)" This reverts commit `d8075a5775`. Co-authored-by: multica-agent <github@multica.ai> * Revert "fix(agent): mirror $HOME/.claude.json into isolated config dir (MUL-2661) (#3261)" This reverts commit `40da88fc16`. Co-authored-by: multica-agent <github@multica.ai> * Revert "feat(agent): per-agent toggle to isolate host-machine skills (MUL-2603) (#3200)" This reverts commit `960befa56f`. Co-authored-by: multica-agent <github@multica.ai> * Add migration cleanup for reverted agent skills toggle Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: multica-agent <github@multica.ai>	2026-05-26 17:00:01 +08:00
LinYushen	bf8a346cf0	feat(runtimes): cascade-archive agents on runtime delete (MUL-2667) (#3266 ) * feat(runtimes): cascade-archive agents on runtime delete (MUL-2667) Replace the bare 409 "cannot delete runtime: it has active agents" with a structured response carrying the blocking agent list, and wire a cascade endpoint that archives those agents, cancels their tasks, pauses dangling autopilots and deletes the runtime in a single transaction. The unified DeleteRuntimeDialog opens directly in cascade mode when the runtime has bound agents, pivots from light to cascade if the strict DELETE refuses with runtime_has_active_agents, and re-prompts when the cascade refuses with runtime_delete_plan_changed (live agent set drifted while the dialog was open). The online-local self-healing rule is preserved at the affordance level (kebab hidden, Diagnostics button disabled with tooltip) and re-checked at confirm time as defence in depth. Co-authored-by: multica-agent <github@multica.ai> * fix(runtimes): close cascade race + i18n delete dialog (PR #3266 review) - Acquire FOR UPDATE on the runtime row at the top of the cascade tx so FK-validated agent INSERTs/UPDATEs that would point at this runtime block until commit, and lock each currently-active agent row via ListActiveAgentsByRuntimeForUpdate so a concurrent archive/move of an existing active row also blocks. - Switch the bulk archive from runtime-keyed (ArchiveAgentsByRuntime) to ID-keyed (ArchiveAgentsByIDs), narrowed to the user-confirmed expected_active_agent_ids set. Combined with the runtime row lock, this guarantees no agent outside the confirmed plan can be silently archived between plan-compare and archive even at read-committed. - Wire delete-runtime-dialog.tsx to runtimes locale via useT(); add detail.delete_dialog.{light,cascade} keys (EN with _one/_other plurals, zh-Hans _other) covering titles, descriptions, warning, notices, checkbox, buttons, table headers, presence labels, and toasts. Resolves the i18next/no-literal-string CI failure. - Locale parity test passes (51 tests). All 4 dialog test cases pass unmodified (EN copy preserves original wording). Full views vitest: 91 files / 792 tests green; full server go test: green. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-26 14:59:38 +08:00
Bohan Jiang	960befa56f	feat(agent): per-agent toggle to isolate host-machine skills (MUL-2603) (#3200 ) * feat(agent): per-agent toggle to isolate host-machine skills (MUL-2603) Adds an agent-scoped `skills_local` switch ("ignore" default / "merge") so shared agents stop inheriting the operator's user-global Claude skill directory. A single broken local skill on one operator's machine was crashing the Claude CLI before it ever read stdin — the daemon saw a "broken pipe" with no recoverable signal (GitHub #3052). - DB: migration 108 adds `agent.skills_local` (NOT NULL DEFAULT 'ignore'), with sqlc CreateAgent/UpdateAgent updates and handler validation. - Claude runtime: when the agent is in "ignore" mode the backend points CLAUDE_CONFIG_DIR at an empty per-task scratch dir under the task cwd (fallback: OS temp), strips any inherited override, and cleans up after the run. Workspace skills under `{cwd}/.claude/skills/` still load. "merge" preserves the legacy inherit-from-machine behavior; Codex and other isolated backends are no-ops. - UI: new Skills toggle in the Create Agent dialog and the Agent → Skills tab, with EN/zh-Hans copy and SkillsLocalToggle shared between the two. - Tests: unit coverage for the new env helper, isolation dir lifecycle, full Claude execute paths (ignore + merge), and the handler tristate contract. Existing skills-tab test updated for the new copy. - Docs: updated `/skills` docs (EN + ZH) and added a 0.3.7 changelog entry in the landing-page i18n. Co-authored-by: multica-agent <github@multica.ai> * fix(agent): preserve claude login + validate skills_local input (MUL-2603) Address Elon's review on PR #3200: 1. Skill isolation no longer drops the operator's Claude login. The per-task scratch dir now mirrors every entry under `~/.claude/` as symlinks except `skills/`, so `.credentials.json`, settings, plugins, etc. reach the CLI exactly as on the host while the user-global skills directory stays hidden. Without this, default `ignore` would have broken every Claude agent on a non-API-key host the moment migration 108 landed. 2. Internal CreateAgent callers (agent_template, onboarding_shim) now set `SkillsLocal: "ignore"`. The Go zero value was about to trip the migration-108 CHECK constraint and 500 template / onboarding agent creation. 3. Create / update handler validation no longer normalizes garbage to "ignore". The strict 400 path is now reachable on bad client input; the drift-safe `normalizeSkillsLocal` stays on the read side only. UI copy + docs clarified that the toggle is Claude-only; other runtimes ignore the setting. Verification: - `go test ./...` green (full suite locally). - `pnpm --filter @multica/views exec vitest run agents/components/tabs/skills-tab.test.tsx` green. - Handler DB-backed tests still skip locally without docker (same as Elon's run) — CI will validate the create / update paths against migration 108. Co-authored-by: multica-agent <github@multica.ai> * fix(agent): mirror effective claude config dir with windows fallback (MUL-2603) Address Elon's second-round review on PR #3200: 1. The per-task scratch dir now mirrors the effective host Claude config dir, not unconditionally `~/.claude/`. Precedence: agent `custom_env` CLAUDE_CONFIG_DIR > parent process env > `~/.claude/`. Without this, an operator who pinned Claude at a managed install (custom env CLAUDE_CONFIG_DIR) would get the wrong credentials in the scratch dir, because `buildClaudeEnv` strips that env before handing it to the child. We resolve the source up front and feed it to the mirror, so the override env still points at the right bytes. 2. Mirror entries now go through platform-aware linkers. On Windows without Developer Mode / admin, `os.Symlink` is denied, which previously left the scratch dir empty and broke Claude Code auth on default `ignore`. The new helpers try symlink first, then fall back to a directory junction (`mklink /J`) for dirs or a hardlink (same-volume content share) / copy for files. Mirrors the execenv/codex_home_link_windows.go pattern. 3. Tests: - `TestResolveHostClaudeConfigDir` locks in the custom_env > parent_env > `~/.claude` precedence. - `TestNewIsolatedClaudeConfigDirMirrorsCustomHostDir` confirms the scratch dir picks up `.credentials.json` from a synthetic custom host dir, proving the source resolution actually propagates into the mirror. - `TestNewIsolatedClaudeConfigDirEmptyHostIsNoop` documents the env-var-auth-only case (no host source ⇒ empty scratch dir). - `TestMirrorHostClaudeExceptSkillsWith_FallbackWhenSymlinkFails` exercises the Windows-no-Developer-Mode path via the new `mirrorHostClaudeExceptSkillsWith` seam, asserting credentials and sub-dir children still reach the scratch dir after the symlink stand-in fails. - `TestMirrorHostClaudeExceptSkillsWith_PropagatesFirstLinkError` confirms callers see the per-entry error when even fallback fails (so the warn-log fires on broken Windows installs). - `TestCopyFileRoundTrip` covers the last-resort copy fallback and its EXCL no-overwrite contract. - `TestClaudeExecuteIsolatesUsesCustomEnvSource` is the end-to-end check: an agent with custom_env CLAUDE_CONFIG_DIR reads its credentials from the pinned dir, not `~/.claude/`. 4. Docs: `apps/docs/content/docs/skills.{mdx,zh.mdx}` updated to describe the effective-source resolution and the Windows fallback chain so the docs match the runtime behaviour. Verification: - `go test ./...` green (full server suite locally, including `pkg/agent` 23 cases covering the new + existing isolation paths). - `GOOS=windows GOARCH=amd64 go vet ./pkg/agent/...` and `go test -c -o /dev/null` both compile clean, confirming the Windows-tagged linker file builds. Co-authored-by: multica-agent <github@multica.ai> * fix(agent): default skills_local to merge to preserve legacy behavior (MUL-2603) Per Bohan's product decision on PR #3200, the per-agent host-skill toggle defaults to "merge" — the pre-MUL-2603 inherit-from-machine behavior — so existing personal workflows that rely on locally installed Claude Skills keep working unchanged. Agent owners explicitly opt into "ignore" when they need to harden a shared agent against a broken local skill on one operator's machine (GitHub #3052). Also audited all 11 runtimes for user-global skill discovery paths and documented the scope of the toggle. Only Claude reads a user-global `~/.claude/skills/`; Codex isolates via `CODEX_HOME`, the ACP backends (Hermes / Kimi / Kiro) and the JSON-stream backends (Copilot / Cursor / Gemini / Pi / OpenCode / OpenClaw) anchor discovery to the task workdir and never read a user-global skill directory. UI copy and docs now say "for runtimes that support it (currently Claude Code)" everywhere so the scope is explicit. Changes: - Migration 108: column default flipped to 'merge'. - Handler CreateAgent: missing field → "merge"; explicit "ignore" / "merge" still validated, garbage still 400. - normalizeSkillsLocal: drift-safe coercion now lands on "merge" for anything that isn't the exact literal "ignore". - agent_template.go / onboarding_shim.go: internal CreateAgent callers send "merge" instead of "ignore" to match the new default. - Claude runtime (`claude.go`): isolate-mode gate flipped from `SkillsLocal != "merge"` to `SkillsLocal == "ignore"`, so "" (legacy daemons / older clients) and "merge" both walk `~/.claude/` directly. - Create Agent dialog + Skills tab: toggle defaults to on (merge); only duplicate of an explicit "ignore" agent carries through. The isolation opt-in is now `skills_local: "ignore"` when the user flips off; "merge" is omitted from the request body. - i18n (EN + zh-Hans): copy reframed — "On (default) — merged"; "Off — ignored. Recommended for shared agents". - Docs (`/skills`, `/guides/agents.zh`): describe new default and enumerate which runtimes act on the toggle. - Landing changelog 0.3.7: retitled "Per-Agent Local-Skill Toggle"; note the on-by-default behavior + off-to-isolate framing. - Tests: - `TestClaudeExecuteIsolatesHostSkillsWhenIgnoreOptedIn` replaces the old by-default isolation case (now requires explicit "ignore"). - New `TestClaudeExecuteDefaultModeKeepsHostConfigDir` locks in that default ExecOptions preserve the host CLAUDE_CONFIG_DIR. - `TestClaudeExecuteIsolatesUsesCustomEnvSource` now explicitly opts into "ignore" mode. - Handler tests: omitted → "merge"; explicit "ignore" round-trips; preserve-existing test seeds "ignore" and asserts "merge" flip-back. - `TestNormalizeSkillsLocal_DriftStaysSafe`: only literal "ignore" maps to ignore; everything else → "merge". - `skills-tab.test.tsx`: toggle ON by default; flip OFF when agent opted into "ignore". Intro-text matcher anchored to a more specific phrase so it no longer collides with the toggle hint copy. Verification: - `go test ./...` green (full server suite locally). - `GOOS=windows GOARCH=amd64 go vet ./pkg/agent/...` and `go test -c -o /dev/null` both compile clean (windows-tagged linker file still builds). - `pnpm typecheck` green across all packages and apps. - `pnpm --filter @multica/views test` 88 files / 771 tests green. - `pnpm --filter @multica/core test` 43 files / 390 tests green. - Handler DB-backed tests still skip locally without docker; CI will validate the create / update paths against migration 108. Co-authored-by: multica-agent <github@multica.ai> * chore(landing): drop 0.3.7 changelog entry from this PR (MUL-2603) The landing-page release notes belong in a separate release-prep PR, not in the feature PR. Co-authored-by: multica-agent <github@multica.ai> * fix(agent): propagate skills_local=ignore to codex user-skill seed (MUL-2603) Make the per-agent skills_local toggle real for Codex too, not just Claude. Previously the toggle was only consumed by the Claude backend, while the daemon's execenv layer always seeded Codex's per-task CODEX_HOME with the host machine's user-installed skills from ~/.codex/skills/. A shared Codex agent with skills_local=ignore could still inherit a broken local skill from one operator's machine. Now: PrepareParams/ReuseParams carry SkillsLocal; hydrateCodexSkills skips seedUserCodexSkills when SkillsLocal == "ignore" so the per-task CODEX_HOME exposes only workspace skills to the codex CLI. Default ("merge", or empty from older servers/clients) preserves existing inherit-from-machine behavior. UI / docs are updated to reflect the contract honestly: Claude Code and Codex honor the toggle; other runtimes (Hermes / Kimi / Kiro / Copilot / Cursor / Gemini / Pi / OpenCode / OpenClaw) leave $HOME untouched and discover user-level skills natively, so the toggle is a no-op for them today. New tests: TestPrepareCodexSkillsLocalIgnoreSkipsUserSeed, TestPrepareCodexSkillsLocalMergeSeedsUserSkills, and TestReuseCodexSkillsLocalIgnoreSkipsUserSeed cover Prepare(ignore), Prepare(merge), and the toggle-flip-on-reuse path. Co-authored-by: multica-agent <github@multica.ai> * docs(skills): scope skills_local toggle copy to Claude Code + Codex (MUL-2603) Off-state hint and Skills tab intro now explicitly call out Claude Code + Codex as the only runtimes that honor the toggle, with "other runtimes ignore this setting" wired into both states (en + zh-Hans), so users on non-Claude/Codex agents don't read "Off" as runtime-wide isolation. Docs (skills.mdx, skills.zh.mdx, guides/agents.zh.mdx) stop describing Hermes / Kimi / Gemini / Copilot / Cursor / Pi / OpenCode / OpenClaw / Kiro as having native user-level skill discovery; the daemon simply does not manage user-level skill discovery for those runtimes today, and the toggle is a no-op regardless of where it is set. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-26 13:26:33 +08:00
Bohan Jiang	13f74e651a	feat(agents): remove custom_env from agent resources, add audited env endpoint (MUL-2600) (#3209 ) * feat(agents): remove custom_env from agent resources, add audited env endpoint (MUL-2600) The agent resource shape (list / get / create / update / archive / restore responses + WebSocket events) no longer carries `custom_env` values. Reads/writes of env now flow exclusively through a dedicated `/api/agents/{id}/env` endpoint that is owner/admin-only, rejects agent-actor sessions, applies a "**" sentinel preserve guard on PUT, and writes a persistent audit row per reveal/update. Why - `multica agent list --output json` historically returned plaintext `custom_env` for owner/admin callers (the redaction gate gave only members the masked map). Any agent token running on the workspace inherits its owner's role and could read every other agent's secrets just by listing. - Patching list/get redaction alone (PR #3175 direction) left symmetric leaks via mutation responses, WS events, the "reveal" path itself (no actor-aware auth), and a `` overwrite footgun on UpdateAgent. What changed - Backend: drop `custom_env` from AgentResponse; add coarse `has_custom_env` + `custom_env_key_count`. Strip env handling from UpdateAgent (silently ignored if sent). Keep CreateAgent's custom_env acceptance. - Backend: new GET/PUT `/api/agents/{id}/env` handlers in `internal/handler/agent_env.go`: - resolveActor → 403 for agent actors (closes the lateral-movement path). - Owner/admin role gate via existing helper. - PUT honours value == "*" as "preserve existing value". - Both write to `activity_log` with `agent_env_revealed` / `agent_env_updated` actions. Audit details record key names only, never values. - Daemon claim path (`ClaimAgentTask`) unchanged — `TaskAgentData` still carries plaintext env for runtime injection. - SQL: new `UpdateAgentCustomEnv` query; sqlc regenerated (v1.31.1). - CLI: new `multica agent env get\|set` subcommands. `--custom-env` flags removed from `multica agent update`; the no-fields error now points to the new path. - Frontend: drop env fields from `Agent` + `UpdateAgentRequest`; add `getAgentEnv` / `updateAgentEnv` client methods; rewrite env-tab to show "N variables configured" + explicit "Reveal & edit" button, fetching values only on intentional reveal. - Locales: parity-safe additions to en + zh-Hans. - Docs: agents-create.{mdx,zh.mdx} reflect the new threat model and endpoint. - Mobile: schema drops `custom_env` / `custom_env_redacted`, adds metadata fields. Tests - Handler tests pinned the new invariants: no env in list/get responses, owner reveal happy-path + audit row, agent-actor 403, `***` sentinel preserves real values, UpdateAgent silently ignores `custom_env`, pure `mergeAgentEnv` cases. - CLI tests pivot to the new flag surface: `agent update` MUST NOT expose the env flags; `agent env set` MUST expose --custom-env-stdin/--custom-env-file. - Frontend test fixtures updated; pnpm typecheck / test / lint pass cleanly. This is a breaking API change. Scripts that read `custom_env` from `/api/agents` must migrate to `GET /api/agents/{id}/env`. Co-authored-by: multica-agent <github@multica.ai> fix(agents): close actor-spoofing + audit fail-closed in env endpoints (MUL-2600) Addresses Elon's review of #3209: * Mint a task-scoped `mat_` token per claim, bound to (agent, task, workspace, owner). Daemon injects it into the agent process in place of its own credential. Auth middleware authoritatively rebuilds X-User-ID / X-Agent-ID / X-Task-ID from the token row and sets X-Actor-Source=task_token; that header is server-set only — incoming values are stripped before any auth branch runs. resolveActor honors the header so an agent that strips X-Agent-ID / X-Task-ID still resolves as actor=agent. * GetAgentEnv / UpdateAgentEnv are now fail-closed on audit-log failures: GET refuses to return plaintext, PUT persists inside the same tx as the audit row so they commit/roll back together. * PUT /api/agents/{id} returns 400 when the body carries custom_env instead of silently dropping it — directs callers to the audited env endpoint. * Agent actors never see mcp_config, even when the underlying member is owner/admin; mutation broadcasts go through a redaction shim so WS subscribers don't pick it up either. * Fix backend test that asserted dense JSON (jsonb::text renders whitespace) and frontend test that assumed a unique "Test User" match. Co-authored-by: multica-agent <github@multica.ai> * fix(agents): close residual MUL-2600 gaps from review (MUL-2600) Migration 108 FK now correctly references agent_task_queue(id) instead of the non-existent agent_task table; the previous name blocked CI backend migrations. Task-token-authenticated requests can no longer be re-routed at a different workspace by passing workspace_slug / workspace_id / ?workspace_id / a URL workspace param. ResolveWorkspaceIDFromRequest and resolveWorkspaceUUID both short-circuit on X-Actor-Source=task_token and return only the token-bound X-Workspace-ID; buildMiddleware adds a defence-in-depth 403 if any URL-resolved workspace disagrees with the token binding. mcp_config no longer leaks back to agent actors through UpdateAgent / CreateAgent / ArchiveAgent / RestoreAgent HTTP responses — the same redactAgentResponseForActor helper that GetAgent/ListAgents use is now applied to mutation responses too. WS broadcasts were already redacted via broadcastAgentResponse. FailTask and every TaskService cancel path (CancelTask / CancelTasksForIssue / CancelTasksForAgent / CancelTasksByTriggerComment / BroadcastCancelledTasks) now eagerly DeleteTaskTokensByTask so the mat_ token's 24h window doesn't outlive a terminated task. Failure is non-fatal — the FK cascade and expiry remain durable guards. Doc-only: clarify that PUT /api/agents/{id} now hard-rejects bodies that carry custom_env (was previously "silently ignores"). Tests: - middleware: TestResolveWorkspaceIDFromRequest gains a task_token case asserting client-supplied slug/id/query cannot override the bound workspace. - handler: TestUpdateAgent_RedactsMcpConfigForAgentActor and TestUpdateAgent_KeepsMcpConfigForMemberActor pin the mutation- response redaction contract per actor type. Co-authored-by: multica-agent <github@multica.ai> * fix(agents): match redacted mcp_config as JSON null, not Go nil (MUL-2600) `AgentResponse.McpConfig` is `json.RawMessage` without `omitempty`, so the redacted response serialises as `"mcp_config": null`. On decode, `json.RawMessage` keeps the literal bytes `null` rather than collapsing to Go nil, which made the assertion fire on a non-leak. The product contract (field always present, distinguished from "no config" via `mcp_config_redacted`) is intentional, so adjust the test to check for "no secret-bearing content" instead of weakening the contract via `omitempty`. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-25 18:42:48 +08:00
YOMXXX	29c2a5d18f	fix(daemon): reclaim stale dispatched claims (MUL-2485) (#2872 ) * fix(daemon): reclaim stale dispatched claims * fix(daemon): widen stale claim reclaim window	2026-05-21 17:06:55 +08:00
iYuan	2f1f90c11a	fix(agent): retry codex semantic inactivity fresh (#2593 )	2026-05-20 20:03:39 +08:00
Bohan Jiang	2bec2221d2	feat(agent): per-agent thinking_level for claude + codex (MUL-2339) (#2865 ) * feat(agent): persist thinking_level per agent (MUL-2339) Adds a nullable `thinking_level` column to the `agent` table so the backend can route a runtime-native reasoning/effort token (e.g. Claude's `xhigh`, Codex's `minimal`) through to the agent CLI on every dispatch. The column is intentionally TEXT rather than an enum — Claude and Codex publish overlapping but distinct vocabularies and we want the persisted value to round-trip exactly through whichever CLI receives it. NULL is the "use runtime default" sentinel that every downstream consumer reads as "do not inject --effort / reasoning_effort". This commit is just the storage layer (migration + sqlc); subsequent commits wire it through the API, daemon, and agent backends. Co-authored-by: multica-agent <github@multica.ai> * feat(agent-backend): inject reasoning effort for claude + codex (MUL-2339) Extends ExecOptions with a runtime-native ThinkingLevel string and wires it into the Claude and Codex backends. Discovery is driven by the local CLI so the daemon advertises whatever the host install supports rather than a hand-maintained list that goes stale. Per Elon's PR1 review: - Claude: parses `claude --help` to learn the `--effort` superset and projects through a per-model allow-list (xhigh is Opus-only; max is session-only on the smaller models). Falls back to a conservative static list when the binary is missing or help drift hides the line. - Codex: drives `codex debug models --output json` so per-model reasoning subsets and the documented default come directly from the CLI. The older config-error probe trick is gone — the JSON path is stable and doesn't pollute stderr with an intentional misconfig. - Cache key includes (provider, executablePath, cliVersion) so a CLI upgrade invalidates entries that referenced the older help / catalog. Per Trump's PR1 constraint, all three Codex injection points (thread/start.config, thread/resume.config, turn/start.effort) flow through one helper (`applyCodexReasoningEffort`) so they cannot drift independently. The shared `codexReasoningCases` fixture in `thinking_test.go` asserts the same value→{shape, key} contract at each site for every level the runtimes know about. Claude's `--effort` is also added to `claudeBlockedArgs` so a user custom_args entry can't silently outvote the daemon-injected value. Co-authored-by: multica-agent <github@multica.ai> * feat(api): wire thinking_level through API + daemon contract (MUL-2339) End-to-end plumbing for the per-agent reasoning/effort setting: - AgentResponse / TaskAgentData now carry `thinking_level`; the daemon's claim response includes it and the daemon's executor passes it through to agent.ExecOptions, where the Claude and Codex backends already know what to do with it. - ModelEntry on the runtime-models wire format gains a `thinking` block carrying `supported_levels` + `default_level` per model so the UI can render a runtime-aware picker without the server having to know about the local CLI install. `handleModelList` projects the agent-package catalog (including the new Thinking field) into the wire shape. - CreateAgent / UpdateAgent gate the field with a synchronous provider enum check (claude / codex only today). UpdateAgent is tri-state: field omitted = no change, "" = explicit clear (new `ClearAgentThinkingLevel` query, mirrors the existing mcp_config null pattern), non-empty = validate then set. Per Trump's PR1 review, the API NEVER auto-clears on a runtime/model swap and ALWAYS returns 400 on an unknown literal value — same shape across CreateAgent, UpdateAgent, and combined patches that move runtime + level in one request. Per-model combination failures (e.g. `xhigh` against a model that only supports up to `high`) surface as a daemon-side task error, not a silent server-side rewrite. TS types follow the same shape: `Agent.thinking_level`, `CreateAgentRequest`/`UpdateAgentRequest` add the field, `RuntimeModel` grows a `thinking` block. Older backends omit the field, which the front-end treats as "no picker for this model" — installed desktop builds keep working. Co-authored-by: multica-agent <github@multica.ai> * fix(agent): correct codex debug models argv + pin via runner test (MUL-2339) `codex debug models --output json` is rejected by codex-cli 0.131.0 — the subcommand emits JSON on stdout by default and has no `--output` flag. Drop the flag and add `--bundled` to skip the network refresh discovery doesn't need. Move the argv to a package-level var and add a test that runs a fake `codex` to assert the binary actually receives exactly `debug models --bundled`, so the contract can't silently drift on the next refactor. Also teach ValidateThinkingLevel to resolve an empty model to the provider's default model entry. Without this, every default-model task with a persisted thinking_level would be misjudged "unknown model" by the daemon guard. Co-authored-by: multica-agent <github@multica.ai> * fix(api): reject runtime switch that would leave invalid thinking_level (MUL-2339) A PATCH that changed `runtime_id` without touching `thinking_level` used to silently keep the existing value, so a Claude agent storing `max` could land on a Codex runtime where `max` is not a recognised token at all, and the daemon would receive a literal-invalid level. Hold the same "always 400 on literal-invalid, never silent coerce" rule on this implicit path. When runtime_id changes and the existing value is not in the new provider's enum, return 400 with the recovery options (clear via `thinking_level=""` or re-set in the same PATCH). Add coverage for both the kept-when-still-valid and the rejected cases, plus the two recovery paths (clear and replace). Co-authored-by: multica-agent <github@multica.ai> * fix(daemon): guard runTask with per-model thinking_level validator (MUL-2339) ValidateThinkingLevel existed but had no call site — `task.Agent. ThinkingLevel` flowed straight into ExecOptions, so `xhigh` configured on a non-Opus Claude model, or API-side stale values that escaped the provider enum gate, would be injected anyway. Run the validator before building ExecOptions. Invalid combinations log a warning and drop the level instead of failing the task: the agent still runs, just at the runtime's default reasoning effort. Discovery errors fail open (keep the level, let the CLI surface any objection) so a transient `claude --help` failure can't strand work. Empty model is forwarded as-is; the validator resolves it to the provider's default model internally per the cross-package contract. Co-authored-by: multica-agent <github@multica.ai> * chore(agent): drop stale `--output json` comments + unused scanner (MUL-2339) Codex CLI's `debug models` subcommand emits JSON without an `--output` flag, and `parseCodexDebugModels` never read from the bufio.Scanner. Sync the comments with the actual invocation and remove the dead init. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-20 12:30:10 +08:00
LinYushen	319b23eb39	Revert "feat(task): add claim lease mechanism (Phase 2, MUL-2246) (#2660 )" (#2674 ) This reverts commit `3137feecdf`.	2026-05-15 16:07:23 +08:00
LinYushen	b7a58c06ac	Revert "feat(task): wire claim lease into TaskService and sweeper (MUL-2246) …" (#2673 ) This reverts commit `bb32be0e50`.	2026-05-15 16:06:58 +08:00
LinYushen	bb32be0e50	feat(task): wire claim lease into TaskService and sweeper (MUL-2246) (#2662 ) * feat(task): wire claim lease queries into TaskService and sweeper (MUL-2246) - ClaimTask now uses ClaimAgentTaskWithLease (generates claim_token + lease) - StartTask accepts optional claim_token for token-verified start - AgentTaskResponse includes claim_token for daemon to use - Daemon client sends claim_token in StartTask body - Sweeper calls RequeueExpiredClaimLeases each tick - Legacy daemons without claim_token still work (graceful fallback) Co-authored-by: multica-agent <github@multica.ai> * fix(task): address PR #2662 review blockers (MUL-2246) 1. ClaimAgentTaskForRuntime: push runtime_id into atomic SQL WHERE clause so runtime A cannot claim tasks queued for runtime B under the same agent. 2. Legacy StartAgentTask: add claim_token IS NULL guard so leased rows cannot be started without token verification. Handler rejects malformed tokens with 400 instead of silently degrading to legacy path. 3. StartAgentTaskWithClaimToken: validate claim_expires_at >= now(), preserve claim_token until terminal state (only clear claim_expires_at), use CTE + UNION ALL for idempotent retry when daemon resends after a lost StartTask response. Return 409 Conflict on token mismatch/expiry. Co-authored-by: multica-agent <github@multica.ai> * fix(daemon): StartTask 409 handling, transport retry, claim_token on FailTask (MUL-2246) - StartTask 409 (claim superseded): release slot, don't call FailTask - StartTask transport timeout/5xx: retry once with same token, then check task status before failing - FailTask now sends claim_token; server-side FailAgentTask SQL adds AND (claim_token IS NULL OR claim_token = @claim_token) guard so stale daemons cannot fail tasks that have been re-claimed Co-authored-by: multica-agent <github@multica.ai> * fix(task): close FailTask token bypass and RequeueExpiredClaimLeases liveness gap (MUL-2246) Blocker 1 - FailTask token validation: - SQL: change (param IS NULL OR claim_token = param) to (param IS NULL AND claim_token IS NULL) OR claim_token = param so tokenless requests can only fail legacy (tokenless) rows. - task.go: malformed claim_token now returns ErrInvalidClaimToken (400) instead of being silently dropped to NULL. - Handler: maps ErrInvalidClaimToken→400, ErrClaimTokenInvalid→409. - Service: when UPDATE returns no rows but task is still active, return ErrClaimTokenInvalid (token mismatch) instead of silent success. Blocker 2 - RequeueExpiredClaimLeases runtime liveness: - SQL: JOIN agent_runtime, only requeue tasks where runtime is 'online'. Dead/offline runtime tasks stay dispatched for FailTasksForOfflineRuntimes. - FOR UPDATE → FOR UPDATE OF atq (required with JOIN). Regression tests: - task_claim_token_test.go: malformed, tokenless-on-tokened, wrong-token - requeue_lease_test.go: SQL must JOIN agent_runtime with online filter Co-authored-by: multica-agent <github@multica.ai> * fix(task): move expired lease requeue to ClaimTaskForRuntime preflight, add heartbeat freshness backstop (MUL-2246) - Add RequeueExpiredClaimLeasesForRuntime: per-runtime preflight self-requeue in ClaimTaskForRuntime. Runtime proves liveness by actively claiming, so no heartbeat check needed. - Update global RequeueExpiredClaimLeases to require ar.last_seen_at freshness (stale_threshold_secs param). Prevents requeuing to a dead runtime in the 90s gap between lease expiry (60s) and offline detection (150s). - Add regression tests verifying the heartbeat freshness check and that the preflight query does not join agent_runtime. Co-authored-by: multica-agent <github@multica.ai> * fix(task): use LivenessStore for global requeue, move preflight before empty-cache (MUL-2246) Blocker 1: Global RequeueExpiredClaimLeases now uses LivenessStore.IsAliveBatch to verify runtimes are truly alive before requeuing expired leases. When LivenessStore is unavailable (no Redis), global requeue is skipped entirely — the preflight self-requeue in ClaimTaskForRuntime handles live runtimes. This closes the 60-150s gap where a dead runtime still appears online in DB. Blocker 2: Moved RequeueExpiredClaimLeasesForRuntime BEFORE EmptyClaim.IsEmpty fast-path in ClaimTaskForRuntime. Expired leases are now requeued (which bumps the empty cache via notifyTaskAvailable) before the empty check can short-circuit the claim path. Also adds ListRuntimesWithExpiredClaimLeases SQL query and LivenessChecker interface on TaskService. Co-authored-by: multica-agent <github@multica.ai> * fix(task): wire EmptyClaimCache into backend taskSvc for backstop requeue (MUL-2246) The backend taskSvc used by the sweeper only had Liveness wired but not EmptyClaim. When global backstop requeue called notifyTaskAvailable, s.EmptyClaim.Bump() was a nil no-op — the handler's empty-cache was never invalidated, so the daemon's next claim hit a stale empty verdict. Fix: wire the same Redis-backed EmptyClaimCache into the backend taskSvc in main.go (same Redis keys as router.go:139 handler instance). Add regression test verifying backstop requeue invalidates the handler's empty-cache. Co-authored-by: multica-agent <github@multica.ai> * fix(task): global backstop must not requeue — alive runtimes use preflight, dead stay dispatched (MUL-2246) - RequeueExpiredClaimLeases is now a no-op (returns 0 always) - Alive runtimes self-requeue via ClaimTaskForRuntime preflight - Dead runtimes stay dispatched for FailTasksForOfflineRuntimes - Rewriting to queued on dead runtime creates 2h blackhole (offline sweeper only handles dispatched/running) - Test actually calls RequeueExpiredClaimLeases and asserts 0 in all cases Co-authored-by: multica-agent <github@multica.ai> * fix(daemon): remove duplicate usage reporting block after merge conflict (MUL-2246) The merge resolution introduced a second ReportTaskUsage call after the status check, duplicating the usage-before-early-return block that already runs right after runner.run. Remove the duplicate and add a regression test asserting /usage is called exactly once on the normal completion path. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-15 15:15:31 +08:00
LinYushen	3137feecdf	feat(task): add claim lease mechanism (Phase 2, MUL-2246) (#2660 ) Add claim_token + claim_expires_at columns to agent_task_queue and three new SQL queries for the claim lease protocol: - ClaimAgentTaskWithLease: generates a UUID token and sets a lease expiry when claiming a task, so the daemon must prove it received the response - StartAgentTaskWithClaimToken: validates the token on StartTask, preventing stale daemons from starting requeued tasks - RequeueExpiredClaimLeases: moves dispatched tasks with expired leases back to queued for re-claim This closes the reliability gap where a claim response lost in transit leaves a task stuck in dispatched until the 60s dispatch timeout fires. Co-authored-by: multica-agent <github@multica.ai>	2026-05-15 15:14:05 +08:00
Jiayuan Zhang	4d6b5ad06f	fix(squad): wake leader when dual-role agent posts as worker (MUL-2218) (#2626 ) * fix(squad): wake leader when dual-role agent posts as worker (MUL-2218) The squad-leader self-trigger guard skipped a comment whenever the author equalled the squad's leader id, regardless of the role the agent was acting in. For an agent that holds both leader and worker roles in the same squad, this meant the leader role never reacted to its own worker output and the issue stalled. Tag each enqueued task with is_leader_task and consult the agent's most recent task on the issue from both self-trigger guards (comment path + @squad mention path) — skip only when that task was itself a leader task. Co-authored-by: multica-agent <github@multica.ai> * fix(squad): inherit is_leader_task on retry task clone (MUL-2218) CreateRetryTask cloned a parent task into a fresh queued attempt but omitted is_leader_task from the column list, so the child silently fell back to the column default (false). For a leader task that hit auto-retry through MaybeRetryFailedTask, the retried task posed as a worker task — the self-trigger guard then no longer recognised the leader's own comments, re-opening the very loop MUL-2218 closes. Inherit p.is_leader_task in the clone and add a query-level test that covers both leader and worker retries. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-14 15:23:36 +02:00
Bohan Jiang	f5c2994aed	feat(workspace): revoke a member's runtimes when they leave or are removed (#2401 ) * feat(workspace): revoke a member's runtimes when they leave or are removed Previously, leaving or being removed from a workspace only deleted the member row — every runtime the departed user owned in that workspace remained in the DB, kept its daemon_token valid, and stayed reachable to the workspace's other members. The departed user lost access but their machine kept doing work. This change converges the runtime state in the same transaction as the member-row deletion: agents pinned to those runtimes are archived, in-flight tasks are cancelled (so the daemon's per-task status poller interrupts the running agent gracefully), the runtimes are forced offline, and the daemon_token rows are deleted. After commit the DaemonTokenCache is invalidated and agent:archived / daemon:register events fire so connected clients reconcile immediately. Server-side state convergence is the production safety net; the daemon_token revoke takes effect once the mdt_ flow is live (today most daemons fall back to PAT/JWT, and the member-row deletion is what stops those requests via requireWorkspaceMember). Daemon-side handling (recognising the resulting 401/404 and tearing down the local pairing for that workspace) lands in a follow-up. Co-authored-by: multica-agent <github@multica.ai> * fix(workspace): also cancel tasks for archived agents on member revoke CancelAgentTasksByRuntime only matched tasks whose runtime_id was in the revoked set, missing a real path: agent.runtime_id can be reassigned via UpdateAgent, but agent_task_queue.runtime_id keeps the value from when the task was queued. So an agent currently bound to the leaving member's runtime gets archived correctly, but its older tasks still pinned to a prior runtime stay 'queued' — and ClaimAgentTask does not gate on agent.archived_at, so those orphaned tasks remain claimable by the prior runtime. Replace CancelAgentTasksByRuntime with CancelAgentTasksByRuntimeOrAgent, which OR-matches runtime_ids and the archived agent IDs in one UPDATE. Pass the archived agent IDs through from revokeAndRemoveMember. Adds TestDeleteMember_CancelsTasksFromAgentReassignment as a regression guard: same agent, two runtimes, the older task on the surviving runtime must end up cancelled while the surviving runtime stays online. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-11 15:06:50 +08:00
Multica Eve	a2dd80d4f6	feat(autopilot): skip dispatch when assignee runtime is offline (MUL-1899) (#2311 ) * feat(autopilot): skip dispatch when assignee runtime is offline (MUL-1899) Prevents scheduled autopilots from accumulating doomed tasks against offline / archived / unbound agents. Before this change, a paused laptop or crashed daemon would let a 5-minute-cron autopilot pile up thousands of queued agent_task_queue rows that no runtime would ever drain — this is the dominant source of the 89k stuck-task backlog flagged in MUL-1899. DispatchAutopilot now performs a pre-flight admission check on the assignee agent's runtime status. If the runtime is not 'online' (or the agent is archived / has no runtime bound / has no assignee), the run is recorded as 'skipped' with a failure_reason and no task is enqueued. Skipped runs still emit autopilot:run.done so the UI / activity feed reflect that the trigger fired and was evaluated. Skipped runs are deliberately NOT counted toward the failure-ratio auto-pause: a user who closes their laptop overnight should not have their autopilot paused. Sustained server-side failures keep their existing pause path via the failure monitor. Tests: added an integration test that creates an offline runtime and asserts DispatchAutopilot records a skipped run with no task enqueued. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: multica-agent <github@multica.ai> * feat(scheduler): expire stale queued tasks via TTL sweeper (MUL-1899) Companion to the dispatch-time admission gate added in this PR. The admission gate prevents new tasks from being enqueued against an offline runtime, but it does not drain the historical backlog (~89k stuck queued rows observed at MUL-1899 baseline) and does not help when a runtime goes offline after a task has already been queued. This adds a passive TTL sweeper: - New SQL query `ExpireStaleQueuedTasks` transitions queued tasks older than the TTL to status='failed' with failure_reason='queued_expired' and a clear error message. - Sweep is capped per tick (`queuedExpireBatchSize`, default 500) via a CTE+LIMIT so that draining a large backlog cannot monopolise the DB on a single tick. At 30s ticks the worst case is 60k rows/hour. - Wired into the existing 30s `runRuntimeSweeper` loop alongside `sweepStaleTasks` and reuses `taskSvc.HandleFailedTasks` so the expired tasks broadcast `task:failed` events, reconcile agent status, and roll back any in-progress issues — same lifecycle as any other failed task. - Default TTL = 2h. Conservatively above any reasonable "queued behind a long-running task" window (default agent timeout is 2h, sweeper runs every 30s) so legitimate work isn't expired. - Integration tests cover the happy path (stale → expired, fresh → left alone, correct status/reason/error) and the per-tick batch cap. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: multica-agent <github@multica.ai> * fix(autopilot): address review blockers from PR #2311 (MUL-1899) GPT-Boy review of the offline-runtime + queued-TTL PR flagged four blockers; this commit addresses them all. 1. Restore the 'skipped' autopilot_run status in the DB constraint. Migration 043 had removed 'skipped' along with the now-defunct concurrency_policy feature, so the new admission gate's INSERT of status='skipped' violated `autopilot_run_status_check` and broke `TestAutopilotDispatchSkipsWhenRuntimeOffline` in CI. New migration 079 re-adds 'skipped' to the CHECK list. The down migration migrates skipped → failed before re-tightening, mirror- ing what 043 did for the original removal. 2. Make `ExpireStaleQueuedTasks` race-safe. The CTE-then-UPDATE pattern could clobber a task that the daemon claimed between victim selection and the outer update. Two guards added: - `FOR UPDATE SKIP LOCKED` in the CTE so we never wait on a row that's currently being claimed (and never block the claim path either). - The outer UPDATE now re-checks `t.status = 'queued'` AND the TTL predicate so even if a row's lock is released after a successful claim, we cannot transition a now-dispatched/ running task to 'failed'. 3. Add a partial index for the queued-TTL sweeper. `idx_agent_task_queue_queued_created_at` on `created_at WHERE status = 'queued'` — keeps the 30s sweep query (status=queued AND created_at < ... ORDER BY created_at LIMIT 500) cheap even when historical terminal rows accumulate (~89k+ at MUL-1899 baseline). The partial predicate keeps the index tiny because only in-flight rows live in 'queued'. 4. Fix the failure-monitor denominator. `SelectAutopilotsExceedingFailureThreshold` had been counting 'skipped' toward total runs, which would have diluted the failure ratio: a 100%-failing autopilot could mask itself behind a wall of admission skips. With 'skipped' restored as a real status, the auto-pause monitor must explicitly exclude it from BOTH numerator and denominator — admission skips are neither a success nor a failure. Verified: `go test ./cmd/server/... ./internal/service/...` passes (including TestAutopilotDispatchSkipsWhenRuntimeOffline, TestExpireStaleQueuedTasks, TestExpireStaleQueuedTasksRespectsBatch Limit). `go build ./... && go vet ./...` clean. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: multica-agent <github@multica.ai> * fix(migrations): split queued-task TTL index into concurrent migration Per PR #2311 review: agent_task_queue is a hot table, so building the new partial index with plain CREATE INDEX inside migration 079 would hold ACCESS EXCLUSIVE on the queue and block dispatch during deploy. The migration runner does not allow CONCURRENTLY to share a file with other statements (documented in 068), so split the index into its own single-statement file 080 — matching the existing pattern in 035 / 067 / 074 / 075 / 078. Migration 079 keeps the autopilot_run constraint change. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: multica-agent <github@multica.ai>	2026-05-09 15:07:57 +08:00
Bohan Jiang	6d9ebb0fdd	fix(daemon): unblock issues stuck on a poisoned-image agent session (#2314 ) * fix(daemon): treat upstream API 400 invalid_request_error as poisoned session A markdown-linked image in an issue description that the agent downloads as a tiny CDN auth-error file and Read's as a PNG poisons the conversation: the LLM API rejects the bad image with 400 invalid_request_error, the session_id is pinned mid-flight, and every follow-up task on the issue (comment-trigger, auto-retry) resumes the same poisoned conversation and hits the same 400 — the issue can no longer be executed even after the description is cleaned up. Mirror the existing fallback-output classifier on the error side: detect "API Error: ... 400 ... invalid_request_error" in the agent error string, persist failure_reason='api_invalid_request', and add it to the GetLastTaskSession exclusion list so the next task starts a fresh session that re-reads the (now-clean) description. Co-authored-by: multica-agent <github@multica.ai> * fix(daemon): unblock issues already poisoned by API 400 invalid_request_error The forward-only classifier from the previous commit only tags new failures. Issues like MUL-1918 already have multiple failed-task rows whose failure_reason is the pre-fix default 'agent_error', and GetLastTaskSession falls back to those legacy rows on the next claim — so deploying the classifier alone leaves existing poisoned issues stuck (GPT-Boy review on PR #2314). Two complementary changes: - Migration 079 backfills failure_reason='api_invalid_request' on every pre-existing 'agent_error' row whose error text matches the canonical Anthropic 400 invalid_request_error shape. Keeps observability consistent (multica issue runs / UI now report the right reason). - GetLastTaskSession adds a defensive ILIKE clause on error text. Closes the deploy-window gap where the old binary could write a new 'agent_error' row between the migration running and the new code taking over, and protects against future error-format variants the daemon classifier might miss. Plus regression tests covering the legacy + new coexistence case GPT-Boy flagged, and a guard rail asserting benign 'agent_error' failures (timeouts, tool errors) still resume their session. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-09 14:39:10 +08:00
Jiayuan Zhang	0cd50e14eb	feat(agent-live-card): show queued tasks in issue live banner (MUL-1897) (#2307 ) The issue-detail "agent live" banner only showed dispatched/running tasks. A task that was queued — runtime offline, busy on a prior task, or held behind a coalesced sibling — left the issue silent until claim, which reads as "the trigger never landed". Include 'queued' in `ListActiveTasksByIssue`, then branch the renderer: queued banners use a non-spinning Clock, "{name} 排队中 / is queued" copy, "queued for Ns" elapsed anchored on `created_at`, and hide the transcript button (no execution log yet). Cancel still works because `CancelAgentTask` already accepts queued. Client-side re-sort by lifecycle (running → dispatched → queued) so the sticky slot stays on the most-active task even when a queued sibling was created more recently. Co-authored-by: multica-agent <github@multica.ai>	2026-05-09 07:33:12 +02:00
Bohan Jiang	b17f975a17	docs(cli): clarify `issue rerun` semantics (current assignee, fresh session) (#2304 ) * docs(cli): clarify `issue rerun` semantics The CLI table described `multica issue rerun <id>` as "Rerun the most recent agent task", which led users to expect it would re-run whichever agent ran last. The actual behavior is to enqueue a fresh task for the issue's current agent assignee, regardless of who ran most recently — see `TaskService.RerunIssue` in `server/internal/service/task.go`. Also fix a stale claim in `tasks.mdx`: the "Manual rerun" section described session inheritance as "Yes", but commit `b1345685` made manual rerun pass `force_fresh_session=true` precisely to avoid replaying a poisoned session. Only automatic retry still inherits the session. Updates EN + ZH mirrors of `cli.mdx` and `tasks.mdx`. Co-authored-by: multica-agent <github@multica.ai> * docs(tasks): tighten rerun trigger surface; clean stale Go comments Apply review feedback on PR #2304: - `tasks.mdx` / `tasks.zh.mdx`: rerun is triggered via CLI or the `/api/issues/{id}/rerun` endpoint, not "UI or CLI" — there's no rerun affordance in web/desktop today. - `tasks.mdx` / `tasks.zh.mdx`: comparison table — manual rerun applies to "Issues with an agent assignee", not "All sources". The handler rejects with `issue is not assigned to an agent` for anything else, and there's no rerun path for chat or autopilot tasks. - `task_lifecycle.go`: `RerunIssue` doc comment claimed the new task "carries the most recent session_id/work_dir so the agent can resume". That has been false since `b1345685` — rewrite to reflect the actual `force_fresh_session=true` contract. - `agent.sql` (regenerated `agent.sql.go`): `GetLastTaskSession` doc said it serves "auto-retry / manual rerun"; manual rerun is now routed around it via `force_fresh_session=true`. Note both the auto-retry path it does serve and the rerun escape hatch. No logic change. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-09 12:46:37 +08:00
LinYushen	250ada1fb3	chore(db): drop unused agent_task_queue.last_heartbeat_at (#2212 ) Drops the unused agent_task_queue.last_heartbeat_at column and removes the hot-path task heartbeat write.	2026-05-07 15:45:29 +08:00
Bohan Jiang	60b215f44f	feat(chat): support deleting chat sessions (#2115 ) * feat(chat): support deleting chat sessions Replaces the unreachable archive endpoint with a real hard delete and exposes it from the chat history panel. - DELETE /api/chat/sessions/{id} now hard-deletes the session and its messages (CASCADE), cancels any in-flight tasks before removal so the daemon doesn't keep running work whose result has nowhere to land, and broadcasts chat:session_deleted. - Frontend adds a per-row delete button with a confirmation dialog, optimistically drops the session from both list caches, and clears the active session pointer locally + on other tabs via the WS handler. Co-authored-by: multica-agent <github@multica.ai> * fix(chat): make session delete atomic and keep archived sessions read-only Address review feedback on #2115. - DeleteChatSession now runs lock + cancel + delete in a single tx and only broadcasts events post-commit. The new LockChatSessionForDelete query takes FOR UPDATE on chat_session, which blocks the FK validation of any concurrent SendChatMessage trying to enqueue a task for this session — that insert fails after we commit, so it can no longer produce an orphaned task whose chat_session_id is nulled by ON DELETE SET NULL. Cancel failure now aborts the delete instead of warn-and-continue. - SendChatMessage refuses non-active sessions again. The archive code path is gone, but legacy rows with status='archived' may still exist in the DB; keep the guard until we explicitly migrate them. - Frontend re-reads allChatSessionsOptions to disable ChatInput on legacy archived sessions so the UX matches the server-side guard. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-06 13:22:53 +08:00
Bright Zheng	cf47d9b702	fix: guard session resume by runtime (#1905 )	2026-05-03 10:51:31 +08:00
Bohan Jiang	2dddfaa196	feat(daemon): Redis empty-claim fast path for /tasks/claim polling (#1860 ) * feat(daemon): Redis empty-claim fast path for /tasks/claim polling Daemons poll /tasks/claim every 30s per runtime; the steady-state warm-empty case currently runs ListPendingTasksByRuntime against Postgres on every poll. This collapses that path: - New ListQueuedClaimCandidatesByRuntime query restricts to status = 'queued' (the old query also returned 'dispatched' rows that can never be reclaimed) and is backed by a partial index keyed on (runtime_id, priority DESC, created_at ASC). - New EmptyClaimCache caches the negative verdict in Redis with a 30s TTL. ClaimTaskForRuntime checks the cache before SELECT and populates it on confirmed-empty results. - notifyTaskAvailable now invalidates the runtime's empty key before kicking the daemon WS, so newly enqueued tasks become claimable immediately rather than waiting out the TTL. - AutopilotService.dispatchRunOnly now goes through TaskService.NotifyTaskEnqueued so run_only tasks get the same invalidate-then-wakeup contract as every other enqueue path. Co-authored-by: multica-agent <github@multica.ai> * fix(daemon): close MarkEmpty/Bump race in empty-claim fast path GPT-Boy's review on PR #1860 caught a real concurrency bug. Under the prior implementation it was possible for a slow claim to write an empty verdict AFTER a concurrent enqueue had already invalidated it: T1 claim: SELECT -> empty T2 enqueue: INSERT row, DEL empty key (no-op, key not set yet), wakeup T1 claim: SET empty (writes a stale "empty" verdict) T3 wakeup: IsEmpty -> hit -> returns null The just-queued task would then sit idle until the empty key's TTL expired (up to 30s). Replace the DEL-based invalidation with a per-runtime version counter: - CurrentVersion(rt) is a Redis INCR counter at mul:claim:runtime:version:<rt> with a 24h sliding TTL. - Claim samples version BEFORE the SELECT and passes it to MarkEmpty, which stores the verdict's value as the observed-version string. - IsEmpty MGETs both keys and trusts the verdict only when the empty-key value equals the current version. - Enqueue Bumps the version (INCR + EXPIRE) before the wakeup, causing any verdict written under a prior version to be rejected on the next read. Also bound every Redis call from this cache with a 250ms timeout — notifyTaskAvailable uses a background context so a wedged Redis must not block enqueue. Tests against a real Redis (REDIS_TEST_URL) cover: - MarkEmpty + IsEmpty under matching version returns hit - Bump invalidates a prior empty verdict (race-fix pin) - A MarkEmpty written under a stale pre-Bump version is rejected - TTL clamping, per-runtime isolation, nil-cache safety - notifyTaskAvailable Bumps before the wakeup fires Co-authored-by: multica-agent <github@multica.ai> * chore(daemon): renumber claim-candidate index migration to 067 Slot 064 was taken on main by 064_notification_preference. The migration runner tracks per-version in schema_migrations and would silently skip the second 064_*, leaving the index uncreated. Rename to 067 (next free slot). Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-04-30 15:50:05 +08:00
Bohan Jiang	b1345685a3	fix(task): rerun starts a fresh session, skip poisoned resume (#1928 ) * fix(task): rerun starts a fresh session, skip poisoned resume When a task ended in a known agent fallback ("I reached the iteration limit and couldn't generate a summary.", "Put your final update inside the content string. Keep it concise.") the (agent_id, issue_id) resume lookup would still pick that session, so a manual rerun inherited the poisoned state and reproduced the same bad output. Two complementary guards: 1. Daemon classifies poisoned terminal output and routes it through the blocked path with failure_reason set ('iteration_limit' / 'agent_fallback_message'). GetLastTaskSession excludes failed tasks with those reasons, so even comment-triggered tasks no longer resume them. Tasks that failed mid-flight (timeout, runtime_recovery, etc.) are still resumable, preserving MUL-1128's auto-retry contract. 2. Manual rerun marks the new task force_fresh_session=true. The daemon claim handler skips the resume lookup entirely when the flag is set, capturing the user-intent signal that "the prior output was bad" even when poisoned classification misses a future fallback wording. Auto-retry of orphaned mid-flight failures (MaybeRetryFailedTask → CreateRetryTask) does not take this path, so it keeps resuming. Tests: classifyPoisonedOutput unit test; integration tests assert the SQL filter excludes poisoned classifiers, RerunIssue flips the flag, and the normal enqueue path leaves it false. Co-authored-by: multica-agent <github@multica.ai> * fix(daemon): cap poisoned-output matcher to short trimmed text GPT-Boy review on MUL-1630: the previous strings.Contains match would classify any output that quoted the marker substring — including a review/analysis that simply discussed the marker itself. Real fallback messages are short single-sentence affairs, so cap the candidate at ~one paragraph and trim whitespace before matching. Adds regression tests covering a long quoting review and a marker buried in a long real conclusion; both must stay classified as completed. Co-authored-by: multica-agent <github@multica.ai> * fix(migrations): rename 065 force_fresh_session → 066 to clear collision main introduced 065_project_resources after this branch was cut, so both files shared the 065_ prefix. The readiness check (server/cmd/server/health.go → migrations.LatestVersion) takes the last entry by lexical order, which is 065_project_resources, leaving this branch's 065_force_fresh_session unguarded — a deploy that applied project_resources but not force_fresh_session would still report ready, and the next enqueue / rerun / claim would crash on "column force_fresh_session does not exist". Renaming to 066_force_fresh_session puts it strictly after project_resources so readiness blocks until it's applied. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-04-30 14:17:53 +08:00
Bohan Jiang	9baa72cc68	fix: polish quick-create UX (kind labeling, dark toast, placeholder) (#1831 ) * fix: polish quick-create UX (kind labeling, dark toast, placeholder) Three small fixes shaken out from using the agent-create flow: - AgentTaskResponse now carries a `kind` discriminator ("comment" \| "autopilot" \| "chat" \| "quick_create" \| "direct"), computed from the existing FK shape with no extra DB access. The Activity row uses it to label quick-create tasks as "Creating issue" instead of falling through to the generic "Untracked" — once the agent finishes and the new issue is linked, the row transitions to the normal identifier+title display. - Sonner Toaster reads `resolvedTheme` instead of `theme`, so toasts follow the actual dark/light state. Forwarding "system" let sonner pick its own answer from `prefers-color-scheme`, which in the Electron renderer can disagree with next-themes' `html.dark` class — the toast rendered light on a dark UI. - Agent-create placeholder rephrased to a more conversational example with a project reference: "let Bohan fix the inbox loading slowness in the Web project". Drops the priority hint (priority isn't widely used) and matches how people actually instruct the agent. * fix(quick-create): link new issue back to task on completion Addresses the review on PR #1831: completed quick-create tasks were left with issue_id=NULL forever, so the activity row stayed on "Creating issue" instead of transitioning to the normal MUL-XXX + title rendering once the agent finished. - Server: notifyQuickCreateCompleted now writes the resolved issue id back to agent_task_queue.issue_id via a new LinkTaskToIssue query (guarded by `issue_id IS NULL` so it only ever fills the unset quick-create case). Best-effort: a write failure logs but doesn't block the inbox notification. - Frontend: defensive wording fallback — kind=quick_create rows in terminal status (completed/failed/cancelled) now render as "Quick create" instead of the active "Creating issue" label, covering rows whose link write failed or whose agent never produced an issue at all.	2026-04-29 15:40:59 +08:00
Naiyuan Qing	f745a3bbbe	feat(agent): presence v3 + execution log + trigger summary (#1823 ) * refactor(views): migrate agent/runtime/skill lists to TanStack DataTable Replace the per-page CSS Grid + minmax(min, fr) + sticky-first-col + truncate implementation with a TanStack Table backend rendered through a Dice UI-style DataTable shell. Column widths are now px-based via column.size, so cells no longer shrink or auto-truncate as the viewport narrows; when the sum of columns exceeds the viewport, the container scrolls horizontally instead. - Add @tanstack/react-table to the catalog (8.21.3) and wire it into packages/ui (dep) and packages/views (peerDep). - packages/ui: new DataTable + DataTableColumnHeader + lib/data-table.ts (getColumnPinningStyle), adapted from Dice UI's registry. The shell renders <table> directly (skipping shadcn's <Table> wrapper) so its own outer overflow controls both axes — no nested overflow conflicts. - packages/views: each list now declares ColumnDef[] with explicit cell renderers. Row click navigates to detail via onRowClick (instead of wrapping <tr> in <a>, which is invalid HTML); kebab dropdowns stopPropagation so they don't trigger the row navigation. - Drop the previous AGENT_LIST_GRID / GRID_WITH_OWNER / ROW_GRID templates and the sticky-first-col / subgrid mechanics that came with them. agent-list-item.tsx is removed; runtime-list.tsx and skills-page.tsx are trimmed to thin wrappers. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(agent): cap description at 255 chars (db + api + ui) Symmetric enforcement across DB, server, and UI: - Migration 060: pre-flight truncate of any oversize rows, then ADD CONSTRAINT NOT VALID + VALIDATE CONSTRAINT so the new check doesn't block writes during validation. - Server handler validates utf8.RuneCountInString on Create/Update and rejects over-limit input with 400. - Front-end gets AGENT_DESCRIPTION_MAX_LENGTH in core/agents/constants (single source of truth shared by the create dialog + edit modal + test suite) and a CharCounter component that warns at 90% and errors past the cap. - Description editor moves from a 288px popover to a roomy modal. Editor body is mounted only while the dialog is open, so the local draft state is locked in at mount time and never reset by an external WS update — the React-recommended replacement for the useEffect(reset, [value]) anti-pattern. Counted in code points everywhere (rune count / spread length / char_length) so multibyte input agrees across all three layers. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(views): data-table polish across runtime + skill lists Builds on the DataTable migration in `2be0f287`: - Add ColumnMeta.grow flag — declared via TanStack module augmentation in ui/lib/data-table.ts. Columns marked meta.grow skip their inline width so fixed table-layout assigns them the leftover container space (no spacer column). The Title-grows / others-fixed pattern from Linear / GitHub PR rows. - Authoritative table min-width = sum of column.size, applied to the <table> itself (fixed-layout ignores cell-level min-width per spec, so the floor has to live on the table). - Header tightens to h-8 + uppercase + tracking-wider; pinned cells switch to opaque bg + group-hover so they cover content scrolling beneath them and follow row hover state. - Toolbar slot removed from DataTable (callers wrap the toolbar themselves now — keeps DataTable single-purpose). Also: hover-card popup stops contextmenu / auxclick / dblclick from bubbling out (in addition to click). Stops the popup from triggering ancestor handlers (e.g. issue list rows) on right-click / middle-click without breaking Base UI's outside-click dismiss, which listens to pointerdown — pointerdown is deliberately NOT stopped. Runtime + skill list pages updated to use the new sizing model. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(agent): drop LastTaskState, introduce 3-state Workload Continues the presence-model rework started in #1794 / #1798. The previous LastTaskState union (running / completed / failed / cancelled / idle) carried historical outcome at the list level — a runtime-healthy agent whose last task failed showed a sticky red dot indistinguishable from a daemon-dead agent. New model: presence is two orthogonal "right-now" dimensions: AgentAvailability — runtime reachability only (online / unstable / offline). Drives the dot colour everywhere. Workload — current load (working / queued / idle). Three states, never historical. Failure / completion / cancellation are surfaced via Recent Work + Inbox, not list-level state. `queued` (= nothing running, ≥1 queued) is an honest "stuck on offline runtime" signal. To avoid amber flashes during the brief enqueue→claim race on healthy runtimes, the queued chip composes with availability: muted on online, warning amber otherwise. Activity tab cleanup that follows from the new model: - failureReasonLabel relocated from agents/presence.ts to tabs/task-failure.ts (presence no longer owns historical state). - Recent Work paginates (5 initial, +20 per "Show more"); chat-session tasks are filtered out of every Agent-scoped surface to keep "team work" separate from private chat. - Agents page drops the lastTaskFilter chip group; users find broken agents via Inbox / Recent Work, not a list-level filter. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(task): trigger summary snapshot + task:queued lifecycle event Two task-lifecycle improvements that ship together because they share the same enqueue/retry hot paths and changes interleave inside task.go: 1. trigger_summary snapshot (migration 061) New nullable column on agent_task_queue. Comment-triggered tasks snapshot the comment content; autopilot tasks snapshot the run title. Truncated to 200 runes via strings.Builder so multibyte input counts correctly without O(N²) concatenation. Snapshot survives source edits/deletes — every task row self-describes across surfaces (issue detail Execution log, agent activity tooltip, inbox) without joining back to the originating row. Retry rows inherit the parent's snapshot (CreateRetryTask SELECT) so the description stays meaningful across attempts. The UI is responsible for stacking "Retry #N" context on top. 2. task:queued WS event New protocol event covering the ∅ → queued transition. Front-end types/events.ts registers it; use-realtime-sync's task: prefix path already invalidates task caches via onAny, so old clients without this exact-match subscription still refresh correctly. Specific subscribers (sticky banner) get sub-second updates instead of waiting for daemon claim. Retry path now broadcasts task:queued (not task:dispatch) — same status transition shape as enqueue, so all "new task created" paths agree on one event type. Ordering: broadcastTaskEvent runs before notifyTaskAvailable so the queued event is published into the WS bus before the daemon is poked. Without this, a fast daemon could claim and emit task:dispatch over the wire before the in-process queued broadcast fan-out reached clients — race window is tiny but unsafe-by-construction. Per-agent task list (agentTasksKeys.all) and per-issue task list (["issues","tasks"]) added to the task: invalidation set so Activity tab Recent Work and the Execution log section stay fresh. Type contracts: AgentTask gains parent_task_id / attempt / trigger_comment_id (already returned by the API, just missing from TS) plus the new trigger_summary field. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(issue): ExecutionLogSection — unified active+past runs panel Replaces two pieces: - the click-to-expand timeline that lived inside AgentLiveCard - the standalone TaskRunHistory below the main content with a single right-panel section that lists every agent run for the issue. Active runs sit at the top (always visible when present); past runs collapse behind a "Show past runs (N)" toggle, sorted failed → cancelled → completed within group. Active rows show the trigger summary, status + relative time, and Cancel / Transcript actions on hover (gradient backdrop fades the status text rather than hard-clipping). Past rows show the same shape minus Cancel. Retry tasks prepend "Retry #N · " to the inherited summary so they're distinguishable from their parent (which would otherwise share the exact same trigger text). Cache key registered as issueKeys.tasks(issueId); the global useRealtimeSync task: prefix path already invalidates ["issues","tasks"] on every task lifecycle event, so the section stays fresh without local WS subscriptions. AgentLiveCard slims down to a header-only "agent is working" sticky banner — keeps the at-a-glance "is anyone working on this right now" signal and the Stop / Transcript actions, drops the inline timeline that ExecutionLogSection now owns. Subscribes to both task:queued and task:dispatch so retries (which only emit queued) land in the banner without waiting for daemon claim. issue-detail mounts ExecutionLogSection in the right panel and removes the now-defunct TaskRunHistory call site. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-29 14:50:58 +08:00
Bohan Jiang	2d9c153695	feat: quick-create issue (async agent + inbox completion) (#1786 ) * feat(server): add quick-create issue async task path Adds POST /api/issues/quick-create which validates the picked agent's reachability up front (not archived, has runtime, runtime online) then queues an issue-less agent task whose context JSONB carries the user's natural-language prompt + requester + workspace. Daemon claim resolves the workspace from the context, and the prompt builder switches to a quick-create template instructing the agent to translate the prompt into a single multica issue create call. Task completion writes a success inbox item to the requester pointing at the newly-created issue (located by querying the agent's most recent issue in the workspace since task start, so we don't depend on agent stdout shape). Failures write an action_required inbox item carrying the original prompt + agent id so the frontend can offer "Edit as advanced form" without losing input. * feat(views): quick-create issue modal + inbox failure CTA Adds a streamlined create-issue UI bound to the c shortcut: pick an agent, type one line, submit. The modal closes immediately and the agent translates the prompt into a multica issue create call in the background. Shift+c keeps the legacy advanced form for users who want every field. The "Advanced" button inside the new modal seeds the shared issue-draft store with the prompt + picked agent so switching mid-flow doesn't lose input. Last-used agent persists per (user, workspace) via a workspace-aware zustand store so frequent users skip the picker on every open. Inbox renders quick_create_done items with a status pin to the new issue and quick_create_failed items with an "Edit as advanced form" CTA that re-seeds the legacy modal with the original prompt. ApiError now carries the parsed JSON body so the modal can branch on the structured agent_unavailable code without parsing the error message. * fix(quick-create): execenv injection, claim race, private-agent permission Addresses GPT-Boy review on #1786: 1. execenv was rendering the assignment-task issue_context.md / runtime workflow even for quick-create, telling the agent to call `multica issue get/status/comment add` against an empty IssueID. Adds QuickCreatePrompt to TaskContextForEnv, plus a quick-create branch in renderIssueContext + the runtime_config workflow that instructs the agent to run a single `multica issue create` and exit, with explicit "do NOT call issue get/status/comment add" guards. 2. ClaimAgentTask serialized only on issue_id / chat_session_id, so concurrent quick-creates on the same agent (both NULL on those columns) ran in parallel — making the success-inbox lookup race over "most recent issue by this agent". Adds a third OR clause that treats "all four FKs NULL" as a serialization key for the same agent, so quick-create tasks on a given agent run one at a time. 3. QuickCreateIssue handler bypassed the private-agent ownership rule that validateAssigneePair enforces elsewhere — a user could POST a private agent_id they didn't own and trigger it. Now routes the picked agent through validateAssigneePair before the runtime liveness check. 4. Clarifies the quick-create-store namespacing comment to match the actual workspace-aware StateStorage convention used by the other issue stores (per-user is browser-profile-local). * fix(quick-create): branch Output section + deterministic origin lookup Addresses GPT-Boy's second-pass review on #1786: 1. The runtime_config.go Output section forced "Final results MUST be delivered via multica issue comment add" for every non-autopilot task — quick-create still got this conflicting instruction even though there's no issue to comment on. Switched the Output block to a three-way switch so quick-create gets a tailored "stdout is captured automatically; do NOT call comment add" branch matching the autopilot variant. 2. Completion lookup was "most recent issue created by this agent since task.started_at", which races against concurrent issue creates by the same agent (assignment task running alongside quick-create when max_concurrent_tasks > 1). Replaced with a deterministic origin link: - Migration 060 extends issue.origin_type CHECK to allow 'quick_create'. - Daemon sets MULTICA_QUICK_CREATE_TASK_ID env var when running a quick-create task. - multica issue create CLI reads the env var and stamps the new issue with origin_type=quick_create + origin_id=<task_id>. - Server CreateIssue handler accepts (origin_type, origin_id) from trusted callers (only "quick_create" is allowed; the pair is rejected unless both fields are provided together). - notifyQuickCreateCompleted now calls GetIssueByOrigin keyed on (workspace_id, "quick_create", task.ID) — no more time-window racing against parallel agent activity. The old GetRecentIssueByCreatorSince query is removed.	2026-04-29 14:05:26 +08:00
Naiyuan Qing	21e3cfaa01	Agent runtime status redesign: split presence into availability + last-task (#1794 ) * feat(agent-status): add workspace live-tasks endpoint and TaskFailureReason type Lays the API + type contract for the front-end agent presence cache: - New `GET /api/active-tasks` returns active (queued/dispatched/running) tasks plus failed tasks within the last 2 minutes for the current workspace. The 2-minute window powers a UI-side auto-clearing "Failed" agent state without back-end pollers. - `agent_task_queue` has no workspace_id column, so the query JOINs agent; `SELECT atq.` keeps `failure_reason` (migration 055) on the wire. - Adds `TaskFailureReason` to `AgentTask` so the UI can map the 5 backend classifiers (agent_error / timeout / runtime_offline / runtime_recovery / manual) to copy without parsing free-text errors. - New `api.getActiveTasksForWorkspace()` client method; workspace is resolved server-side from the X-Workspace-Slug header (no path param, matching /api/agents and /api/runtimes conventions). Includes the joint engineering plan and designer brief that scope the broader Agent / Runtime status redesign — Phase 0 is this contract plus the front-end derivation layer landing in the next commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> feat(agent-status): derive presence/health states with WS sync and desktop IPC bridge Adds the front-end derivation layer that turns raw server data into the user-facing 5-state agent / 4-state runtime enums. UI files are deliberately untouched in this commit — derivation lives behind hooks (useAgentPresence, useRuntimeHealth) that any component can call with zero additional network traffic. Architecture: - Derivation is pure functions in packages/core/{agents,runtimes}; the back-end stays free of UI translation. Agents algorithm: runtime offline > recent failed (2-min window) > running > queued > available. Runtimes algorithm: status + last_seen_at -> online / recently_lost / offline / about_to_gc. - A single workspace-wide active-tasks query backs all per-agent presence reads, eliminating N+1 across hover cards, list rows, and pickers. 30-second tick re-renders the hooks so the failed window expires even when no underlying data changes. - WS task lifecycle events (dispatch / completed / failed / cancelled) invalidate active-tasks via the prefix dispatcher. completed/failed were removed from specificEvents so they go through both the prefix invalidate and the existing chat ws.on() handlers. Reconnect refetch picks up active-tasks too. - Desktop bridges window.daemonAPI.onStatusChange directly into the runtimes cache via setQueryData, giving the local daemon sub-second feedback (vs. 75s server sweep). Bridge is wsId-bound so workspace switches automatically rebind the subscription; daemon_id matching covers the same-daemon-multiple-providers case. 24 derivation unit tests cover all branches plus null/empty/boundary inputs (FAILED_WINDOW_MS edges, null last_seen_at, missing completed_at). Full core suite: 112 tests passing. Typecheck green across all 8 workspace packages. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(agent-status): redesign agent runtime status as two orthogonal dimensions Splits the conflated 5-state agent presence into two independent axes: - AgentAvailability (3-state): online / unstable / offline — drives the dot indicator everywhere a dot appears. Pure runtime reachability; never sticky-red because of a past task outcome. - LastTaskState (5-state): running / completed / failed / cancelled / idle — surfaced as text + icon on focused surfaces (hover card, agent detail page, agents list, runtime detail). Never colours the dot. Major changes: * Domain layer: AgentPresence union → AgentAvailability + LastTaskState. derive-presence split into deriveAgentAvailability + deriveLastTaskState + deriveAgentPresenceDetail orchestrator. Tests reorganised into three groups (availability invariants, last-task invariants, composition). * Visual config: presenceConfig (5 entries) → availabilityConfig (3) + taskStateConfig (5). availabilityOrder + lastTaskOrder for filter chips. * Workspace-level presence prefetch: new useWorkspacePresencePrefetch hook + WorkspacePresencePrefetch mount component, wired into DashboardLayout (web) and WorkspaceRouteLayout (desktop). Hover cards render synchronously with no skeleton flash on first hover. * ActorAvatar hover: flipped default — disableHoverCard removed, enableHoverCard added (default false). Opt-in at ~14 decision-moment surfaces; pickers / decoration sub-chips stay plain. Status dot decoupled (showStatusDot prop) so picker rows can show presence without nesting popovers. * Hover cards: AgentProfileCard simplified — availability dot only, Detail link top-right (logs live on the detail page). New MemberProfileCard mirrors the structure: name + role + email + top-2 owned agents (sorted by 30d run count) with click-through to agent detail. * Agents list: split Status into two columns — availability (3-color dot + label) and Last run (task icon + label, optional running counts). Two independent filter chip groups (Status + Last run); combination acts as intersection ("online + failed" finds broken- but-alive agents). * Other UI surfaces (issue list/board/detail, comments, autopilots, projects, runtimes, mention autocomplete, subscribers picker) updated to the new dot semantics; status dot now strictly 3-color. Server changes accompany the client redesign — workspace-wide agent-task-snapshot endpoint, runtime usage queries, etc. — to feed the derive layer with the data it needs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(agent-detail): drop last-task chip from detail header + inspector The Recent work section on the agent detail page already shows the same data (with task titles, timestamps, error context) — surfacing "Completed" / "Failed" / etc. up in the header was redundant chrome. Detail surfaces now show only the 3-state availability dot. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(tables): handle narrow viewports across agents / skills / runtimes Three table layouts were squeezing content into adjacent cells at intermediate widths. Each fix is small and targeted: * runtime-list: the Runtime cell's base name had `shrink-0`, so it refused to truncate when its grid column was narrowed under width pressure — the name visually overflowed into the Health column ("ClaudeOnline" etc). Removed shrink-0, added truncate. The Health column was also a fixed 9.5rem reservation for the worst-case "Recently lost · 2m 14s ago" copy; switched to minmax(0,1fr) so it competes fairly with Runtime. * skills-page: had a single grid template with no responsive breakpoints — all 6 columns were rendered at any width and got visually jammed below md. Added a <md template that drops Source + Updated; the row markup hides those cells via `hidden md:block` / `md:contents`. * agent-list-item: the new Last run column was reserved at minmax(8rem, max-content); on narrow md viewports the 8rem floor pushed the row past available width. Changed to minmax(0,max-content) so the cell shrinks under pressure (its content already truncates). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(agent-card): hover-only Detail + add Runtime row + breathing room Three small polish tweaks to the agent hover card: - Detail link gets `mr-1` + fades in only on card hover (group-hover). It was visually flush against the popover edge and competing for attention; now it stays out of the way during a quick glance and surfaces only when the user is dwelling on the card. - Runtime row is back, in the meta block (cloud/local icon + runtime name). The earlier removal was over-aggressive — knowing where an agent runs is part of "who is this agent". The wifi badge stays dropped because the availability dot in the header already conveys reachability. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(runtime): wifi-style health icon (4-state) for runtime list + agent card Replaces the 6px coloured dot with a wifi-shape icon that carries both state (Wifi vs WifiOff) and severity (success/warning/muted/destructive). Mapping: - online → Wifi (success) - recently_lost → WifiHigh (warning) — transient hiccup, fewer bars - offline → WifiOff (muted) — long unreachable - about_to_gc → WifiOff (destructive) — sweeper coming soon Used in two places: - Runtime list: replaces HealthDot in the dedicated leading-icon column. Bumped the column from 0.5rem (dot-sized) to 0.875rem (icon-sized). - Agent profile card RuntimeRow: derives runtime health from runtime + clock (matching the 4-state semantics) and renders HealthIcon next to the runtime name. Cloud runtimes always read as online. The duplicate signal with the header availability dot is intentional — it confirms WHICH runtime is the one currently in the dot's state. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 19:21:13 +08:00
Bohan Jiang	b77acdf642	fix(comments): cancel triggered tasks when comment is deleted (#1747 ) When a user deletes a comment that triggered an agent task, the agent would still run with the now-deleted content baked into its prompt (fetched at task claim time) — manifesting as "the agent still sees the deleted comment". The FK ON DELETE SET NULL only nullified trigger_comment_id; the queued task itself was never cancelled. DeleteComment now cancels any queued/dispatched/running task whose trigger is the deleted comment, before the comment row is removed.	2026-04-27 18:24:07 +08:00
devv-eve	ba2f19d631	fix: refresh agent status from active tasks (#1733 ) Co-authored-by: Eve <eve@multica.ai>	2026-04-27 13:34:24 +08:00
jmoney8896	3c3e3bd330	fix(task): reconcile agent status when cancelling tasks by issue (#1587 ) (#1648 ) CancelTasksForIssue silently dropped the list of affected tasks, so whenever an issue transitioned to "cancelled" or "done" while a task was still active (6 call sites in issue.go), the underlying agent was left stuck at status="working" indefinitely and required a manual `multica agent update <id> --status idle` to self-correct. This matches the symptom reported in #1587: task rows move to "cancelled" via a non-user-initiated path, agent status never reconciles. Change CancelAgentTasksByIssue from :exec to :many (also tack on completed_at = now() for consistency with CancelAgentTasksByIssueAndAgent), then update CancelTasksForIssue to iterate the returned rows and call ReconcileAgentStatus + broadcast task:cancelled per affected task — mirroring the pattern already used by CancelTask and RerunIssue. No test added; the change is small and mirrors well-covered paths. Happy to add a mock-backed test in a follow-up if reviewers prefer. Refs #1587 Refs #1149	2026-04-26 10:58:42 +08:00
LinYushen	0b1333fb00	feat(server): orphan-task recovery + auto-retry + manual rerun (MUL-1128) (#1476 ) * feat(server): orphan-task recovery + auto-retry + manual rerun (MUL-1128) When the daemon process crashed mid-task the issue was stuck at in_progress for up to 2.5h: the in-flight task timeout was the only mechanism that ever moved the row, and the runtime heartbeat sweeper only fires after the runtime stays offline for 45s — a quick restart beats both windows. This change implements the A+B plan from the issue thread: A. lifecycle hygiene - migration 055 adds attempt / max_attempts / parent_task_id / failure_reason / last_heartbeat_at to agent_task_queue - new daemon-auth endpoint POST /runtimes/{id}/recover-orphans: daemon calls it on every register so the server fails any dispatched/running tasks the previous process left behind - new daemon-auth endpoint POST /tasks/{id}/session: persists the agent's session_id + work_dir mid-flight so a crash doesn't lose the resume pointer (claude+codex emit MessageStatus with SessionID; daemon forwards on the first one it sees) - FailAgentTask / FailStaleTasks / FailTasksForOfflineRuntimes now set failure_reason ('agent_error' / 'timeout' / 'runtime_offline') B. auto-retry with resume context - TaskService.MaybeRetryFailedTask spawns a fresh queued attempt carrying parent's session_id/work_dir when the failure reason is infrastructure-shaped (timeout, runtime_offline, runtime_recovery) and attempt < max_attempts; skips autopilot - wired into the runtime sweeper paths and TaskService.FailTask so the user transparently sees a new in_progress run instead of a stuck row - new user-auth POST /api/issues/{id}/rerun + multica issue rerun CLI for the manual escape hatch Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(server): address PR review for orphan-task recovery (MUL-1128) Three review-must-fix items on top of the A+B implementation: 1. recover-orphans now funnels through TaskService.HandleFailedTasks, the same shared post-failure pipeline used by the runtime sweeper. This guarantees task:failed events are emitted, agent status is reconciled, and issues stuck in_progress with no remaining active task are reset to todo even when no auto-retry is created (max_attempts exhausted, autopilot, non-retryable reason). 2. RerunIssue now uses CancelAgentTasksByIssueAndAgent, scoped to the issue's current assignee. The previous implementation called CancelAgentTasksByIssue, which would collateral-cancel parallel @-mention agents on the same issue. 3. GetLastTaskSession now considers both completed and failed tasks (mirroring GetLastChatTaskSession), ordering by the most recent timestamp. With UpdateAgentTaskSession pinning session_id/work_dir mid-flight, an auto-retry or manual rerun of a daemon-crash failure now actually resumes the prior conversation context instead of starting fresh — matching the stated B-branch behaviour. go build / go vet pass; the existing service and agent test suites pass. runtime_sweeper / handler integration tests require a local DB with the 055 migration (and the pre-existing 050 first_executed_at column). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-04-22 13:08:37 +08:00
Jiayuan Zhang	b291db11c2	feat(agents): add per-agent model field with provider-aware dropdown (#1399 ) Adds a first-class `model` field on agents so users can pick the LLM model from the create / settings UI instead of editing `custom_env` / `custom_args`. Each provider's dropdown is populated from the live CLI when possible (`opencode models`, `pi --list-models`, `openclaw agents list --json`, `cursor-agent --list-models`, hermes ACP `session/new` → `SessionModelState`), with a static catalog for providers that don't enumerate. Daemon resolves the runtime model as `agent.model → MULTICA_<PROVIDER>_MODEL → ""` — empty passes through so each backend's CLI picks its own default, avoiding static-guess drift. Per-provider honouring: - Claude / Codex / OpenCode / Cursor / Gemini / Pi / Copilot — CLI `--model` / thread payload. - OpenClaw — `opts.Model` is mapped to `--agent <name>` (the CLI rejects `--model`). - Hermes — `session/set_model` ACP RPC; stderr is sniffed for provider-level errors so HTTP 4xx from the configured LLM surfaces instead of "empty output"; explicit-model failures mark the task `failed`. Supporting changes: migration 050 adds `agent.model`; daemon ↔ server heartbeat piggyback carries a model-discovery request; new REST endpoints under `/api/runtimes/{id}/models`; `multica agent create --model` / `update --model`; shared `ModelDropdown` in `packages/views/agents` (searchable, creatable, provider-grouped, default-badge, runtime-supported gate).	2026-04-21 00:06:34 +08:00
devv-eve	5fa1da448f	fix(chat): preserve chat session resume pointer across failures (#1360 ) * fix(chat): preserve chat session resume pointer across failures The chat 'forgets earlier messages' bug came from PriorSessionID being silently lost in several edge cases: - UpdateChatSessionSession unconditionally overwrote chat_session.session_id, so any task that completed without a session_id (early agent crash, missing result) wiped the resume pointer to NULL. - CompleteAgentTask + UpdateChatSessionSession ran in separate calls. A follow-up chat message claimed in between resumed against a stale (or NULL) session and started over. - FailAgentTask never wrote session_id back, so a task that established a real session before failing lost its resume pointer. - ClaimTaskByRuntime only trusted chat_session.session_id and never fell back to the existing GetLastChatTaskSession query, so a single bad turn could permanently drop the conversation memory. This change: - Use COALESCE in UpdateChatSessionSession so empty inputs preserve the existing pointer; surface DB errors instead of swallowing them. - Run CompleteAgentTask/FailAgentTask + UpdateChatSessionSession inside the same transaction (TaskService now takes a TxStarter). - Extend FailAgentTask + the daemon FailTask path (client, handler, service) to forward session_id/work_dir, so failed/blocked tasks that built a real session still record it. - Fall back to GetLastChatTaskSession in ClaimTaskByRuntime when the chat_session pointer is missing, and include failed tasks in that lookup so a single failure can't lose the conversation. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(daemon): forward session_id/work_dir on blocked + timeout paths runTask previously dropped result.SessionID and env.WorkDir on the non-completed return paths: - timeout returned a naked error, so handleTask called FailTask with empty session info and the chat resume pointer was either left stale or eventually overwritten with NULL. - blocked / failed (default branch) returned a TaskResult without SessionID / WorkDir, so even though FailTask now COALESCEs into chat_session, there was no value to write through. - the empty-output completion path was the same: it raised an error even when a real session_id had been built. All three paths now return a TaskResult that carries the SessionID / WorkDir the backend produced. Combined with the COALESCE-based update in UpdateChatSessionSession and the FailTask plumbing introduced in PR #1360, the next chat turn can always resume from the latest agent session — even when the previous turn timed out, was rate-limited, or returned an empty completion — instead of starting over with no memory of the conversation. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(copilot): capture session id from session.start as fallback The Copilot backend only read sessionId from the synthetic 'result' event, ignoring the one already present on session.start. When the CLI was killed before result arrived (timeout, cancel, crash, or a session.error mid-turn), the daemon reported SessionID="" and the chat-session resume pointer could not advance — causing the chat to silently drop conversation memory on the next turn. Capture session.start.sessionId into state up front, and only let 'result' overwrite it when it actually carries one. result still wins when present (it is the authoritative end-of-turn record). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(copilot): parse premiumRequests as float to preserve session id Copilot CLI v1.0.32 serializes premiumRequests as a float (e.g. 7.5), not an integer. Our copilotResultUsage struct typed it as int, which made the entire 'result' line fail json.Unmarshal — silently dropping sessionId on every turn. This was the real cause of chat memory loss: the daemon reported SessionID="" to the server, chat_session.session_id stayed NULL, and the next chat turn never received --resume <id>, so each turn started a fresh Copilot session with no prior context. Add a regression test using the real JSON line from CLI v1.0.32 that asserts sessionId is preserved when premiumRequests is fractional. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Devv <devv@Devvs-Mac-mini.local> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: Eve <eve@multica.ai> Co-authored-by: yushen <ldnvnbl@gmail.com>	2026-04-19 22:50:33 -07:00
Korkyzer	63800f05ff	fix(agent): add per-agent mcp_config field to restore MCP access (#1168 ) * fix(agent): add per-agent mcp_config field to restore MCP access Closes #1111 The --strict-mcp-config flag was added defensively in #592 to prevent Claude agents from inheriting MCP state from the outer Claude Code session. It was meant to be paired with --mcp-config <path> to inject a controlled set of MCPs, but that path was never implemented, which silently stripped all user-scope MCPs from spawned agents. This PR completes the original design by: - Adding a nullable mcp_config jsonb column to the agents table - Wiring mcp_config through AgentResponse, Create/Update requests - Piping it into ExecOptions.McpConfig in the daemon - Serializing to a temp file and passing --mcp-config <path> in buildClaudeArgs - Blocklisting --mcp-config in claudeBlockedArgs to prevent override via custom_args Does not touch Codex provider (tracked separately in #674). Does not implement Multica MCP auto-injection (out of scope). * fix: disambiguate JSON null vs absent for mcp_config	2026-04-18 01:35:22 +08:00
Bohan Jiang	ce447c7f06	feat(agent): add custom CLI arguments support (#986 ) * feat(agent): add custom CLI arguments support Allow users to configure custom CLI arguments per agent that get appended to the agent subprocess command at launch time. This enables use cases like specifying different models (--model o3), max turns, or other provider-specific flags without needing separate runtimes. Changes: - Add custom_args JSONB column to agent table (migration 041) - Update API handler to accept/return custom_args in create/update - Pass custom_args through claim endpoint to daemon - Append custom_args to CLI commands for all agent backends - Add ExecOptions.CustomArgs field in agent package - Add Custom Args tab in agent detail UI - Add --custom-args flag to CLI agent create/update commands Closes MUL-802 * fix(agent): filter protocol-critical flags from custom_args Add per-backend filtering of custom_args to prevent users from accidentally overriding flags that the daemon hardcodes for its communication protocol (e.g. --output-format, --input-format, --permission-mode for Claude). This follows the same pattern as custom_env's isBlockedEnvKey: we only block the small, stable set of flags that would break the daemon↔agent protocol — not every possible dangerous flag. Workspace members are trusted for everything else. Each backend defines its own blocked set: - Claude: -p, --output-format, --input-format, --permission-mode - Gemini: -p, --yolo, -o - Codex: --listen - OpenCode: --format - OpenClaw: --local, --json, --session-id, --message - Hermes: none (ACP is positional) Includes unit tests for the filtering logic. * fix(agent): address code review nits for custom_args - Replace module-level `nextArgId` counter with `crypto.randomUUID()` in custom-args-tab.tsx to avoid SSR ID conflicts - Add unit tests for custom args passthrough and blocked-arg filtering in both Claude and Gemini arg builders	2026-04-15 14:58:53 +08:00
Jiang Bohan	4165401d16	feat(agent): support custom environment variables for router/proxy mode Add per-agent custom_env configuration that gets injected into the agent subprocess at launch time. This enables users to configure custom API endpoints (ANTHROPIC_BASE_URL), API keys (ANTHROPIC_API_KEY), and cloud provider modes (CLAUDE_CODE_USE_BEDROCK, CLAUDE_CODE_USE_VERTEX) without requiring code changes. Changes: - Migration 040: add custom_env JSONB column to agent table - Backend: custom_env in agent CRUD API + claim endpoint - Daemon: merge custom_env into subprocess environment variables - Frontend: env var editor in agent settings (key-value pairs with visibility toggle for sensitive values) Closes #816 Related: #807, #809	2026-04-13 16:47:56 +08:00
yushen	50f9e673e8	feat(chat): add agent chat feature (full stack) Implement the Master Agent chat feature allowing users to chat with agents directly from a floating window, separate from the issue-based workflow. Backend: - New chat_session and chat_message tables (migration 033) - Make issue_id nullable on agent_task_queue for chat tasks - REST API: create/list/get/archive sessions, send/list messages - EnqueueChatTask in TaskService with session_id persistence - WS events: chat:message, chat:done - Daemon: chat task type with separate prompt builder - ClaimTaskByRuntime populates chat context (session, message, repos) Frontend: - ChatSession/ChatMessage types + API client methods - core/chat: TanStack Query options, mutations with optimistic updates, WS updaters - features/chat: Zustand store, ChatFab (floating button), ChatWindow with real-time streaming via task:message events - Mounted in dashboard layout (bottom-right corner) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 14:19:46 +08:00
Bohan Jiang	98af9f442c	Merge pull request #471 from multica-ai/agent/j/959392dd feat: support multiple agents running on same issue	2026-04-08 12:45:56 +08:00
devv-eve	7c79611309	refactor: remove agent triggers config field (#469 ) * refactor: remove agent triggers config field Remove the triggers field from agent configuration. The on_assign, on_comment, and on_mention behaviors are now always enabled (hardcoded), as decided in the Agentflow design discussion (MUL-372). Changes: - Database: migration 032 drops triggers column from agent table - Backend: remove triggers from create/update agent APIs and response - Backend: simplify trigger-checking logic to always-enabled - Frontend: remove TriggersTab UI and AgentTrigger types - Tests: remove trigger config unit tests (no longer configurable) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: also remove agent tools config field Remove the tools field from agent configuration alongside triggers. The tools field was a placeholder — stored in the DB and shown in the UI but never passed to the daemon or used at runtime. - Database: migration 032 now also drops tools column - Backend: remove tools from create/update agent APIs and response - Frontend: remove ToolsTab UI, AgentTool type, and tools tab - Update landing page copy Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(test): remove tools/triggers columns from test fixtures The test fixtures still referenced the dropped tools and triggers columns when inserting agent rows, causing CI failures. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Devv <devv@Devvs-Mac-mini.local> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 22:02:28 +08:00
Jiang Bohan	b94108768e	feat: support multiple agents running concurrently on the same issue - Relax ClaimAgentTask SQL constraint from per-issue to per-(issue, agent) serialization, allowing different agents to run in parallel on the same issue - Update GetActiveTaskForIssue API to return all active tasks (array) instead of just the first one - Refactor AgentLiveCard to render one card per active task, routing WebSocket messages by task_id for independent timelines - Fix shouldEnqueueOnComment to use per-agent dedup so a mentioned agent's pending task doesn't block the assigned agent's on_comment trigger Closes MUL-160	2026-04-07 18:19:57 +08:00
LinYushen	09764c5f51	feat(agent): replace hard delete with archive/restore (#346 ) * feat(agent): replace hard delete with archive/restore Replace agent deletion with soft archive pattern. Archived agents are preserved in the database with all historical references intact but cannot be assigned, mentioned, or trigger tasks. Backend: - Add archived_at/archived_by columns to agent table (migration 031) - Replace DELETE /api/agents/{id} with POST /api/agents/{id}/archive - Add POST /api/agents/{id}/restore endpoint - ListAgents excludes archived by default (?include_archived=true to include) - Skip archived agents in task triggers (on_assign, on_comment, on_mention) - Block assignment to archived agents - Cancel pending tasks on archive - New events: agent:archived, agent:restored (replacing agent:deleted) Frontend: - Agent type includes archived_at/archived_by fields - Mention autocomplete and assignee picker filter out archived agents - Agent list shows archived agents with muted styling - Agent detail shows archive banner with restore button - Delete button replaced with Archive button and updated confirmation dialog - API client: archiveAgent/restoreAgent replace deleteAgent Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(agent): self-review fixes for archive feature - Fix: workspace store now fetches agents with include_archived=true so archived agents are actually visible in the frontend (the archived UI was dead code before — ListAgents excludes archived by default) - Fix: add error logging for CancelAgentTasksByAgent in ArchiveAgent - Fix: add idempotency guards — return 409 Conflict when archiving an already-archived agent or restoring a non-archived agent - Fix: revert unnecessary extra GetAgent query in ReconcileAgentStatus (archived agents won't have running tasks after CancelAgentTasksByAgent) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 17:33:52 +08:00
LinYushen	a80d61f8e1	fix(task): enforce per-issue serial execution in task claiming (#330 ) Add NOT EXISTS check to ClaimAgentTask SQL to prevent claiming a queued task when the same issue already has a dispatched/running task. This ensures serial execution within an issue while preserving parallel execution across different issues (concurrency group pattern). Also add defensive guard in the frontend task:dispatch handler to avoid replacing an active task's LiveLog timeline mid-execution. Closes MUL-183 Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 14:12:00 +08:00
Jiang Bohan	4780540bd2	merge: resolve conflicts with main	2026-04-01 13:12:23 +08:00
Jiang Bohan	365b5b5f0e	feat(agent): add task cancellation with stop button Users can now interrupt running agents via a "Stop" button on the live card. The daemon polls task status every 5 seconds and kills the agent process when cancellation is detected. Changes: - New CancelAgentTask SQL query and CancelTask service method - POST /api/issues/{id}/tasks/{taskId}/cancel endpoint - Daemon polls GetTaskStatus during execution, cancels context on match - Frontend: Stop button on AgentLiveCard, task:cancelled WS event	2026-03-31 18:43:37 +08:00
Jiayuan	37881adbed	feat(server): trigger agents via @mention in comments When a user @mentions an agent in any issue's comment, the system now enqueues a task for that agent. The agent reads the issue context and replies to the triggering comment thread. Changes: - Add shared util.ParseMentions for mention parsing (used by both comment handler and notification listeners) - Add EnqueueTaskForMention to TaskService for explicit agent targeting - Add on_mention trigger type support in agent trigger config - Add HasPendingTaskForIssueAndAgent SQL query for per-agent dedup - Add enqueueMentionedAgentTasks in CreateComment handler Safety: prevents self-trigger (agent mentioning itself), dedup with assignee on_comment trigger, terminal issue status check, and per-agent pending task dedup.	2026-03-31 15:30:24 +08:00
LinYushen	961de18c97	feat(agents): reply as thread instead of top-level comment (#205 ) * feat(agents): reply as thread instead of top-level comment When an agent responds to a user comment, the reply is now nested under the triggering comment (parent_id) instead of appearing as a separate top-level comment. Also enables on_comment trigger by default for newly created agents. - Add trigger_comment_id column to agent_task_queue (migration 028) - Pass triggering comment ID through EnqueueTaskForIssue → task → createAgentComment - Include parent_id in WebSocket broadcast for agent comments - Default agent creation includes both on_assign and on_comment triggers Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(cli): add --parent flag to comment add for threaded replies The agent posts comments via the CLI, so the correct fix is giving it a --parent flag rather than wiring trigger_comment_id through the task infrastructure. The agent reads the comment list, decides which comment to reply to, and passes --parent <comment-id>. - Add --parent flag to `multica issue comment add` - Update agent runtime instructions to explain --parent usage Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(daemon): pass trigger_comment_id to agent execution context The agent now knows which comment triggered its task and gets an explicit instruction to reply to it using --parent. The trigger_comment_id flows from the DB through the claim response, daemon Task struct, and into issue_context.md where the agent sees it. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(comments): agent replies to thread root, matching frontend behavior When the triggering comment is itself a reply (has parent_id), resolve to the thread root so the agent's reply stays in the same flat thread. This matches the frontend where all replies share the top-level parent. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(cli): show parent_id and full IDs in comment list The table output now includes a PARENT column and shows full comment IDs (not truncated) so agents can see thread structure and use --parent. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(daemon): instruct agents to always use --output json Agents now see explicit guidance to use --output json for all read commands, ensuring they get structured data with full IDs and parent_id for proper threading. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat(daemon): differentiate comment-trigger vs assign-trigger context When triggered by a comment, the agent now gets clear instructions: - Primary goal is to read and respond to the comment - Do NOT change issue status just because you replied - Only change status if explicitly requested This prevents the agent from seeing "In Review" and stopping, since it now understands the task is to reply, not to re-evaluate the issue. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(daemon): split workflow by trigger type in CLAUDE.md/AGENTS.md The Workflow section in the agent's runtime config now shows a comment-reply workflow when triggered by a comment (read comments, find trigger, reply, don't change status) vs the full assignment workflow (set in_progress, do work, set in_review). Previously the agent always saw the assignment workflow, causing it to check the issue status, see "In Review", and stop without reading or replying to the triggering comment. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor(daemon): remove duplicate workflow from issue_context.md Workflow instructions now live only in CLAUDE.md/AGENTS.md (runtime_config.go). issue_context.md keeps just the task data: issue ID, trigger type, and triggering comment ID. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix(task): skip duplicate comment on completion for comment-triggered tasks When triggered by a comment, the agent posts its own reply via CLI with --parent. The task completion path was also creating a comment from the agent's stdout output, resulting in duplicates. Now only assignment-triggered tasks auto-post output as a comment. Error messages from FailTask are still posted regardless of trigger type. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-31 13:48:39 +08:00

1 2

62 Commits