multica

mirror of https://github.com/multica-ai/multica.git synced 2026-07-05 13:29:44 +02:00

Author	SHA1	Message	Date
yushen	ae4191fab1	fix(ui): pass node.instance_id instead of node.id to deleteNode mutation Fleet expects the actual AWS instance_id (e.g. i-0123456789abcdef0), not the internal DB id. Updated the mutate call in cloud-runtime-dialog to pass node.instance_id so the correct value reaches Fleet's DELETE /api/v1/nodes endpoint. Co-authored-by: multica-agent <github@multica.ai>	2026-05-21 17:59:32 +08:00
yushen	bc79a94b5f	fix(api): use instance_id in deleteCloudRuntimeNode body Fleet API requires instance_id, not id. Fixes 'instance_id is required' error. MUL-2510 Co-authored-by: multica-agent <github@multica.ai>	2026-05-21 17:53:33 +08:00
LinYushen	adec90c621	MUL-2510 feat: add delete button to fleet nodes list (#3007) * feat: add delete button to fleet nodes list - Add deleteCloudRuntimeNode method to API client (DELETE /api/cloud-runtime/nodes/:nodeId) - Add useDeleteCloudRuntimeNode mutation hook in cloud-runtime.ts - Add delete button with Trash2 icon to CloudRuntimeNodeRow component - Include confirmation dialog, loading state, and toast notifications - Add i18n keys for en and zh-Hans locales Co-authored-by: multica-agent <github@multica.ai> * fix(api): correct deleteCloudRuntimeNode contract to match server - Change from DELETE /api/cloud-runtime/nodes/:nodeId (no body) to DELETE /api/cloud-runtime/nodes with JSON body { id: nodeId } - Use fetchRaw + Content-Type header to match server's withBody proxy - Add contract test verifying URL, method, body, and Content-Type Fixes review feedback on MUL-2510 Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-21 17:46:26 +08:00
Bohan Jiang	ae530ef057	docs(runtime): tighten issue-metadata write bar (MUL-2507) (#3004) The previous wording invited agents to pin too much: any opened PR, external link, or "fact future agents will want one-glance access to" was framed as worth writing, with no explicit upper bound. In practice this caused metadata bags to accumulate single-run details and description-summary noise instead of the small set of repeatedly-read values the feature was designed for. Rework the agent runtime brief and the CLI docs to lead with the bar: write a key only when it is materially important AND likely to be re-read by future runs on the same issue. "Most runs write zero new keys" is now stated as the expected case, and the workflow exit step is rewritten to mirror the same gate. Recommended-key list, safety boundaries, and stale-key cleanup are preserved so the locked-in test anchors still pass. Co-authored-by: multica-agent <github@multica.ai>	2026-05-21 17:20:43 +08:00
Bohan Jiang	ab0228c2a1	feat(issues): collapse long metadata bags in sidebar MUL-2503 (#3003 ) * feat(issues): collapse long metadata bags in sidebar (MUL-2503) The metadata KV strip rendered every key inline, so issues with many pinned keys pushed the rest of the sidebar far down. Keep the first four rows visible and tuck the remainder behind a Show N more / Show less toggle once the bag reaches five keys, mirroring the PR list collapse rule. Co-authored-by: multica-agent <github@multica.ai> * refactor(issues): hide metadata behind a JSON dialog (MUL-2503) Metadata is an agent-facing free-form KV bag — the values almost never mean anything to a human reader, and every property humans actually care about already has a dedicated sidebar field (status, priority, assignee, etc.). Rendering the first four keys inline still pushed real signal down and added visual noise for no benefit, so drop the inline strip entirely. Replace the section with a small `{ }` Metadata button at the bottom of the sidebar that opens a Dialog showing the formatted JSON. The button hides itself when the bag is empty, so the common case stays completely quiet. Removes the prior collapse threshold (and its `Show N more` / `Show less` strings) since there is nothing to collapse anymore. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-21 17:18:57 +08:00
LinYushen	e288eff2c5	feat: server auto-generates PAT for cloud runtime bootstrap (#3002) When bootstrap is enabled and no PAT is available from the request header or Authorization bearer token, the server now generates a new PAT automatically and forwards it to the cloud service. This removes the need for the frontend to pass X-User-PAT — the server handles it entirely.	2026-05-21 17:07:44 +08:00
YOMXXX	29c2a5d18f	fix(daemon): reclaim stale dispatched claims (MUL-2485) (#2872) * fix(daemon): reclaim stale dispatched claims * fix(daemon): widen stale claim reclaim window	2026-05-21 17:06:55 +08:00
Tom Qiao	81e8aa5812	test(core): add unit tests for reserved-slugs (#2985) Co-authored-by: Tom Qiao <tomqiaozc@users.noreply.github.com> Co-authored-by: Claude Opus 4 <noreply@anthropic.com>	2026-05-21 16:54:45 +08:00
Bohan Jiang	0c767c0052	feat(issues): per-issue metadata KV (MUL-2017) (#2845 ) * feat(issues): per-issue metadata KV (MUL-2017) Adds a small JSONB KV map to every issue for agent pipeline state (attempts, PR number, pipeline status, ...). Keys match a narrow regex, values are primitives (string / number / bool), capped at 50 keys per issue and 8KB per blob. Defense-in-depth via two CHECK constraints (object shape + size). All mutations are single-key atomic (jsonb_set / `- key`). `UpdateIssue` intentionally does NOT touch metadata: a whole-blob overwrite would race with concurrent agent writes. GET /api/issues/:id/metadata PUT /api/issues/:id/metadata/:key body: { "value": <primitive> } DELETE /api/issues/:id/metadata/:key Containment filter on list: GET /api/issues?metadata=<json-object> uses PG `@>` against a `jsonb_path_ops` GIN index. Mirrored across ListIssues, CountIssues, ListOpenIssues, and the hand-rolled ListGroupedIssues SQL so CLI/API and UI grouped views stay consistent. CLI: multica issue metadata {list,get,set,delete} multica issue list --metadata key=value (repeatable, AND) set has --type to override the default value-sniffing Co-authored-by: multica-agent <github@multica.ai> * fix(issues): metadata test bugs + wire realtime + read-only display (MUL-2017) - Fix two failing handler tests blocking backend CI: - reset decode target after delete so map merge does not mask removal - url.PathEscape the key segment so spaces no longer panic NewRequest - Wire issue_metadata:changed end to end so the detail / list / my-issues caches stay in sync with set/delete events (other tabs, CLI writes). - Add a read-only Metadata strip to the issue detail sidebar; hidden when the issue has no keys so it stays quiet in the common case. 
Co-authored-by: multica-agent <github@multica.ai> * feat(runtime): teach agents to read/write issue metadata (MUL-2017) Add an `## Issue Metadata` section to the runtime brief plus a `metadata list` step on entry and a `metadata set`/`delete` step on exit. Section only emits when the task carries an issue id (comment- or assignment-triggered); chat / quick-create / run-only autopilot stay clean so they don't fire failing CLI calls. Co-authored-by: multica-agent <github@multica.ai> * fix(issues): bump metadata migration to 105 and drop attempts as example (MUL-2017) main is now at 104_drop_runtime_timezone; the migrator picks LatestVersion() by sorted filename, so a slot before the tail would let DBs that have already run 099–104 think they're up-to-date while the issue.metadata column is missing — runtime would then fail with column does not exist. Renumbering to 105 puts the migration at the tail and forces it to run. Also drop attempts as a positive example across docs/code comments and test fixtures — the runtime instruction prompt already lists it under "What NOT to pin" (runtime bookkeeping). Replace with pr_number, which is in the recommended-keys set, so docs/tests speak the same language as the prompt. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-21 16:35:45 +08:00
Multica Eve	66c0464140	fix: simplify cloud runtime create form (#3000) Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: multica-agent <github@multica.ai>	2026-05-21 16:34:11 +08:00
Bohan Jiang	9a5d8a52f3	fix(timezone): harden hourly-rollup rollout against straight-through migrate MUL-2488 (#2998 ) * fix(timezone): harden hourly-rollup rollout against straight-through migrate MUL-2488 PR #2968 introduced the new task_usage_hourly rollup but assumed operators would stop migrate between 102 and 103 to run the one-shot cmd/backfill_task_usage_hourly. Two pieces made that unsafe in practice: 1. The Dockerfile only shipped server / multica / migrate, so a deployed container has no backfill binary to run between phases. 2. cmd/migrate has no per-version stop, and entrypoint.sh runs `migrate up` to the latest version, so 103 silently drops the legacy daily rollups even when nobody ran the backfill — leaving usage dashboards at zero despite source data being intact in task_usage. Changes: - Build cmd/backfill_task_usage_hourly into the runtime image alongside the other binaries so operators can `docker exec` the backfill instead of needing a source checkout. - Add a fail-closed plpgsql guard at the top of migration 103 that aborts the migration when task_usage has rows but task_usage_hourly is empty. Fresh databases (no task_usage rows) are exempt because the new triggers from 102 will populate the hourly table on the first event. Already-applied databases are unaffected — schema_migrations tracks by version only, so 103 is not re-run. Co-authored-by: multica-agent <github@multica.ai> * fix(timezone): use watermark coverage for hourly-rollup guard The previous check only required `task_usage_hourly` to be non-empty, which an interrupted backfill or a manual `rollup_task_usage_hourly_window` call both satisfy. The completion signal we actually trust is `task_usage_hourly_rollup_state.watermark_at` — backfill only stamps it to `now() - 5 min` after every monthly slice succeeded, and the cron worker only advances it on a real tick. Default after migration 101 is `1970-01-01`, so an unrun or partial backfill is trivially detected. 
Also corrects the comment about fresh-install behavior: the triggers in 102 only enqueue dirty keys for agent_task_queue / issue / task_usage DELETE — they do not write hourly rows. INSERT/UPDATE flows through the `updated_at` watermark window of `rollup_task_usage_hourly()`, which only runs once the operator registers it as a pg_cron job. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-21 16:26:42 +08:00
Multica Eve	51b3c5291f	feat: add env-gated cloud runtime launcher (MUL-2453) (#2995) * feat: add env-gated cloud runtime launcher Co-authored-by: multica-agent <github@multica.ai> * fix: address cloud runtime frontend nits Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: multica-agent <github@multica.ai>	2026-05-21 15:41:31 +08:00
Bohan Jiang	51c6e90363	docs: finish /projects link fix + tidy AWS_ENDPOINT_URL description (#2996) Followup to #2979. One missed /issues → /projects link in agents.mdx plus two AWS_ENDPOINT_URL row nits (URL/URLs repetition and trailing period) in SELF_HOSTING_ADVANCED.md and the Chinese self-hosting page. MUL-2498 Co-authored-by: multica-agent <github@multica.ai>	2026-05-21 15:35:39 +08:00
YYClaw	614dfae884	MUL-2488 feat(timezone): Scheduling / Viewing two-layer timezone architecture (#2968 ) * docs(timezone): add scheduling/viewing timezone architecture RFC * feat(db): replace daily rollups with task_usage_hourly, add user.timezone Migrations 100-104: add "user".timezone (Viewing tz), build the UTC hourly task_usage_hourly rollup with its pipeline, drop the legacy task_usage_daily / task_usage_dashboard_daily pipelines, and drop the agent_runtime.timezone column. Report queries now slice day boundaries at read time by the caller-supplied @tz instead of materialising in a fixed tz. Regenerate sqlc. * feat(server): add task_usage_hourly backfill command Replace the two legacy backfill commands (daily / dashboard_daily) with a single backfill_task_usage_hourly that loads historical task_usage into the new UTC hourly rollup, sliced per workspace. * refactor(server): resolve viewing timezone in report handlers Report handlers resolve the Viewing tz per request (?tz query param, then user.timezone, then UTC) and pass it to the hourly-rollup queries. Drop the UseDailyRollup feature flags and the old raw-scan/daily-rollup dual paths, remove the /api/usage endpoints, and stop the daemon from reporting and the runtime handler from accepting host timezone. * refactor(core): switch report queries to viewing timezone API client and dashboard/runtime queries send ?tz with each report request, the user schema/types carry the new timezone field, and the runtime timezone field/mutation is removed. * feat(views): add viewing timezone preference and UI Add the useViewingTimezone hook and a Timezone setting in Preferences; report charts and the dashboard week boundary follow the viewer tz. Remove the runtime detail timezone editor and its locale strings. 
* fix(test): update fixtures and stabilize tests for timezone refactor The timezone architecture refactor changed several types without updating dependent test code: - RuntimeDevice no longer has a timezone field — drop it from the create-agent-dialog runtime fixture. - User now requires a timezone field — add it to the apps/web mockUser fixture. - The PreferencesTab timezone tests asserted on the async save handler (PATCH then store update) with a bare expect, racing the mutation's settle callback, and timed out querying the Select's ~600-option IANA list on a loaded CI runner. Wrap the assertions in waitFor and extend the timeout for those three tests. * docs(timezone): document self-host migration order and trigger invariant Add a SELF-HOST UPGRADE ORDER runbook to the backfill command's package comment: applying migrations 100-104 in a single migrate-up drops the legacy daily rollups before the hourly backfill runs, leaving dashboards empty until cron catches up. Add an INVARIANT comment on trg_atq_dirty_hourly noting that agent_id must be added to the trigger's OF list if it ever becomes mutable, otherwise dirty buckets for the old agent_id are silently missed. * style(runtimes): drop trailing blank line in runtime-detail	2026-05-21 15:33:47 +08:00
Tom Qiao	d0666138ec	docs: fix broken anchor links and truncated env-var description (#2979) Three docs issues spotted while reading: - agents.mdx and agents.zh.mdx: [project](/issues) -> [project](/projects) - cloud-quickstart.mdx: troubleshooting anchor #daemon-cant-reach-the-server did not exist; the heading is "Daemon can't connect to the server" - SELF_HOSTING_ADVANCED.md and getting-started/self-hosting.zh.mdx: AWS_ENDPOINT_URL row description was truncated; append " URLs." Co-authored-by: Tom Qiao <tomqiaozc@users.noreply.github.com>	2026-05-21 15:32:58 +08:00
Multica Eve	41cb91abd9	feat: add cloud runtime fleet proxy API (MUL-2453) (#2986) * feat: add cloud runtime fleet proxy API Co-authored-by: multica-agent <github@multica.ai> * test: cover cloud runtime handler nits Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: multica-agent <github@multica.ai>	2026-05-21 15:06:10 +08:00
Bohan Jiang	1c892aa3f9	fix(projects): default project view to compact (#2975) The compact view was the original list layout and is what users expect on this page; the post-#2840 default of comfortable changed long-standing behavior. Reset the unpersisted default (and the cross-workspace fallback in `merge`) back to compact. Updates the view-store tests accordingly. MUL-2464 Co-authored-by: multica-agent <github@multica.ai>	2026-05-21 14:07:40 +08:00
Anderson Shindy Oki	65feb890b8	feat: Add project list responsive compact and comfortable views (MUL-2464) (#2840) * feat: Add project screen compact and comfortable views * wip * i18n * refactor and add search * refactor	2026-05-21 13:56:11 +08:00
兰之	7e55813460	fix(ui): show tooltip when create-issue button is disabled due to empty title (#2943) Co-Authored-By: Xiaomi MiMo V2.5 Pro	2026-05-21 13:43:31 +08:00
Bohan Jiang	7f9e4e829d	feat(comments): thread-internal --tail pagination + reply cursor (MUL-2421) (#2846 ) * feat(comments): thread-internal pagination via --tail + reply cursor (MUL-2421) Long threads inside a single issue still forced agents to read every reply once they used --thread, even after MUL-2387 fixed cross-thread noise. This adds reply-level paging so a 200-reply thread can be navigated tail-first without dragging the whole conversation into prompt context. - New SQL query ListThreadCommentsForIssuePaged: same recursive root walk as the legacy thread query, but caps reply count and supports an (created_at, id) composite cursor. Root is unconditional — even tail=0 emits it so the reader keeps the "what is this thread about" context. - Handler ListComments: parses `tail` (non-negative, ThreadTailSet flag preserves the tail=0 intent), threads it through to the paged query, and re-uses X-Multica-Next-Before / X-Multica-Next-Before-Id for the reply cursor. Cursor's meaning is now context-dependent: thread cursor under --recent, reply cursor under --thread + --tail. - CLI: new --tail flag (only valid with --thread; mutually exclusive with --recent), reply-cursor semantics for --before / --before-id when paired with --thread + --tail, stderr label flips to "Next reply cursor" so an operator copy-pasting the cursor knows which scope it scrolls. - Tests cover the new contract: tail=N keeps newest N + root, tail=0 is root-only, anchor on a nested reply still walks up, reply cursor scrolls older replies page-by-page, since combined with tail filters after the cut, and the negative-flag-combination matrix. Out of scope: prompt template update to hint at `--thread <id> --tail 30` on long threads — separate follow-up per the issue. 
Co-authored-by: multica-agent <github@multica.ai> * fix(comments): only emit reply cursor when older reply exists (MUL-2421) The thread-tail path emitted `X-Multica-Next-Before` whenever the page filled to exactly the requested reply count, even when there was nothing older to scroll to. So `--thread <root> --tail 3` on a thread with exactly 3 replies sent a cursor that, when followed, returned just the root — a wasted round-trip that surfaced as a phantom "older replies" affordance in the agent prompt. Switch to a `reply_limit + 1` probe: ask the SQL for one extra row, trim the oldest overflow before responding, and only emit the cursor when an older reply actually existed. The exact-boundary case (replyCount == tail with no overflow) now returns no cursor. Also documents `--thread/--tail/--recent/--before` and the cursor semantics in CLI_AND_DAEMON.md, which was the second must-fix in the MUL-2421 review. Co-authored-by: multica-agent <github@multica.ai> * fix(comments): suppress reply cursor when --since covers older replies (MUL-2421) In the thread + tail + since path the server still emitted a reply cursor whenever there was an older reply on disk, regardless of `since`. If the oldest retained reply on the page was already `<= since`, every older reply was guaranteed to be filtered out too, so the next page only ever returned the root — wasting round-trips until the agent walked the whole pre-`since` history. Mirror the recent + since suppression: when `replies[0].CreatedAt <= since`, drop the cursor. Test covers the exact case from Elon's review: tail=2 overflow, body keeps a fresher reply, but the cursor target (oldest retained reply) is already past `since` — header must be empty. 
Co-authored-by: multica-agent <github@multica.ai> * feat(prompt): default comment-trigger reads to --thread --tail 30 (MUL-2421) Comment-triggered agents previously defaulted the trigger-thread read to the unbounded `--thread <id> --output json`, which dumps the full thread into the prompt — exactly the kind of context bloat MUL-2387 fixed at the cross-thread layer but never bounded inside a single thread. Use the new `--tail` flag landed earlier in this PR (server + CLI) as the default for both the per-turn prompt and the runtime-config Workflow: - `--thread <trigger-id> --tail 30 --output json` is the new default. Root is always included so "what is this about" context survives. - If 30 replies aren't enough, the prompt now spells out the reply cursor: re-feed the stderr `Next reply cursor: --before <ts> --before-id <reply-id>` pair back to walk older replies. - `--recent 20` stays as the cross-thread background fallback, with an explicit callout that the same `--before` / `--before-id` flags walk threads (not replies) in that mode. - Available Commands core line now surfaces `--tail N` and both stderr cursor labels so non-workflow callers also discover the flag. - `--since` callouts reflect the post-MUL-2421 combinable mode names (`--thread --tail` / `--recent`). Tests (`prompt_test.go`, `execenv_test.go`) pin the new defaults and add a regression guard against the unbounded `--thread` recipe sneaking back in. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-21 13:43:15 +08:00
Bohan Jiang	8a135d2982	fix(ws): truncate unparseable frame payload in client warn log (#2974) The post-#2946 onmessage guard logs the raw event.data alongside the warning. A malformed or rogue server can stream arbitrarily large garbage and bloat the renderer / desktop main-process log buffers, so cap the logged payload to the first 200 chars and append a "(truncated, N chars total)" suffix when truncation occurs. MUL-2490 Co-authored-by: multica-agent <github@multica.ai>	2026-05-21 13:37:42 +08:00
YOMXXX	83e90c9530	fix(ws): log auth frame write failures (#2946)	2026-05-21 13:33:12 +08:00
Bohan Jiang	ef6a944063	fix(cli): accept slug + short UUID prefix in workspace get/update/member (#2972) * fix(cli): accept slug + short UUID prefix in workspace get/update/member (MUL-2385) `workspace list` shows the 8-char short UUID prefix, name, and slug by default; `workspace get`/`update`/`member list` only accepted full UUIDs. That broke the natural list -> get flow: every value the user could copy from list output was rejected. They had to either rerun list with `--full-id` or parse the JSON output -- both implementation-detail level operations. Extend `resolveWorkspaceByIDOrSlug` with a short UUID prefix fallback (>=4 hex chars, ambiguous matches return all candidates), introduce `resolveWorkspaceRef`/`resolveWorkspaceArg` helpers that fetch the caller's accessible workspaces and resolve UUID/slug/prefix in one call, and wire them into get/update/member list (switch already used the same list-then-resolve pattern). Full UUIDs short-circuit the extra `/api/workspaces` round trip; access control remains on the downstream endpoint. Also add a one-line tip after `workspace list` table output pointing users at get/update/switch with the same identifier columns, and broaden the command Use strings to `<id|slug|prefix>` so help reflects the new behavior. Refs https://github.com/multica-ai/multica/issues/2750 Co-authored-by: multica-agent <github@multica.ai> * chore(cli): include prefix hint in workspace list footer Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-21 13:08:44 +08:00
YOMXXX	ed2957ddf8	fix(claude): record result model usage (#2899)	2026-05-21 13:00:12 +08:00
iYuan	2f1f90c11a	fix(agent): retry codex semantic inactivity fresh (#2593)	2026-05-20 20:03:39 +08:00
Bohan Jiang	688dcb017c	fix(agents): drop confusing "default" badge from model picker (MUL-2477) (#2938) The model dropdown already exposes a "Default (provider)" option meaning "follow the CLI's current selection". Tagging the runtime's preferred model with a small "default" chip created two competing notions of "default" in the same UI and confused users. Remove the chip from both the create-agent ModelDropdown and the inspector ModelPicker; keep the underlying RuntimeModel.default flag intact since thinking-prop-row still uses it as a fallback heuristic. Co-authored-by: multica-agent <github@multica.ai>	2026-05-20 18:07:57 +08:00
Multica Eve	cf000d1e93	docs(changelog): add 2026-05-20 release notes (#2932) Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: multica-agent <github@multica.ai> v0.3.4	2026-05-20 17:28:08 +08:00
Naiyuan Qing	317bca40c1	feat(squads): show skeleton on squad detail initial load (#2930) Replaces the plain "Loading..." text fallback in SquadDetailPage with a skeleton that mirrors the loaded page's two-column layout (left inspector + right tabs panel), matching the SquadsListSkeleton work shipped in #2890. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai>	2026-05-20 17:21:52 +08:00
Bohan Jiang	8d4f4caf4a	MUL-2338 fix(comments): allow agent self-mention to enqueue cross-issue handoff (#2928 ) * fix(comments): allow agent self-mention to enqueue cross-issue handoff The @mention path in CreateComment unconditionally skipped any self-mention. That dropped the child→parent handoff between issues assigned to the same agent: the child run posted `@J` on the parent issue, the guard tripped, and the parent's J was never woken — the chain silently broke. Drop the self-trigger `continue` in the agent mention branch. Runtime ready / private-agent gate / HasPendingTaskForIssueAndAgent dedup all remain, so a same-issue self-mention while a queued or dispatched task exists is still deduped; a running task no longer pre-empts a new follow-up (the existing queue coalescing handles that). Three regression tests: - cross-issue self-mention enqueues a task on the target issue - same-issue self-mention while running queues a follow-up - same-issue self-mention with a pre-existing queued/dispatched task is deduped MUL-2338 Co-authored-by: multica-agent <github@multica.ai> * test(handler): assign per-workspace issue number in self-mention fixture The fixture inserts two issues in the same test workspace; without an explicit number both default to 0 and the second insert violates uq_issue_workspace_number, taking the backend CI job down on PR #2928. Mirror the workspace-counter advancement pattern from issue_scheduled_test.go so each fixture issue gets a unique number. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-20 17:18:41 +08:00
YOMXXX	34f16e2c7a	fix(opencode): deny interactive questions in daemon mode (#2878) * fix(opencode): deny interactive questions in daemon mode * fix(opencode): avoid permission env ordering bypass	2026-05-20 17:17:31 +08:00
Naiyuan Qing	85e363370e	Revert "feat(issues): Working filter + agent-working badge on board (MUL-2452…" (#2927) This reverts commit `dee5c7cf50`.	2026-05-20 16:47:41 +08:00
Naiyuan Qing	b040165f4e	feat(squads): skeleton loader + AlertDialog archive confirm (MUL-2437) (#2890) * feat(squads): skeleton loader + AlertDialog archive confirm (MUL-2437) - Replace `Loading...` text on the squads list with a Skeleton placeholder matching the SquadCard shape (avatar + title + subtitle), aligning with the Agents / Dashboard pattern. - Replace the native `confirm()` on the squad detail Archive button with the project's AlertDialog (destructive variant, pending-disabled, i18n copy interpolating the squad name). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * fix(squads): drop misleading restore copy from archive confirm (MUL-2437) Archive is irreversible — there is no unarchive command (see apps/docs/content/docs/squads.mdx:113). Aligns dialog copy with docs: tells the user the action can't be undone and to create a new squad if they need the routing back. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai>	2026-05-20 16:43:58 +08:00
Naiyuan Qing	dee5c7cf50	feat(issues): Working filter + agent-working badge on board (MUL-2452) (#2924 ) * feat(issues): surface "agent working" on board + add Working filter (MUL-2452) Adds a brand-color "agent working" badge to board cards / list rows so users can see at a glance which issues have an active agent task, plus a new "Working" toggle on the `/issues` and `/my-issues` headers (next to the existing scope segmented control) that filters to those issues. The toggle shows an avatar stack of the agents currently active on the current surface + scope. Pure frontend: re-shapes the existing workspace-wide `agentTaskSnapshot` cache via two new selectors (`activeTasksByIssueOptions` / `workingIssueIdsOptions`), no new SQL, endpoint, or DB field; WS `task:` events already invalidate the snapshot so the badge / filter update in realtime. Project detail page keeps the per-card badge but intentionally omits the header toggle (`showWorkingToggle={false}`) to leave the project surface's filter dimensions unchanged. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> fix(issues): working filter column header reflects filtered count (MUL-2452) Assignee-grouped board column headers kept showing the unfiltered cache total when Working was on, because `PaginatedAssigneeBoardColumn` passed `useLoadMoreByAssigneeGroup`'s cache-derived `total` straight to `BoardColumn`. The hook still needs the cache total for hasMore, but the displayed count must follow the visible-after-filter set. Split the two: when Working is active the column header now uses `group.totalCount` (set by applyWorkingFilterToGroups) for the assignee path, and `issueIds.length` for the status path. Load-more keeps reading from cache so paginated columns still see the full server total. 
Regression tests cover applyWorkingFilterToGroups (total rewrite + empty-group preservation), filterIssues workingOnly combinations, and an end-to-end assertion via IssuesPage that proves the column header equals the filtered count, not the cached value. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai>	2026-05-20 16:35:58 +08:00
Bohan Jiang	aeb284cbeb	feat(runtime): teach agents the parent/sub-issue protocol (MUL-2338) (#2918 ) * feat(runtime): teach agents the parent/sub-issue protocol (MUL-2338) Adds a Parent / Sub-issue Protocol section to the runtime brief built by `buildMetaSkillContent`, emitted whenever the agent is running on a real Multica issue (assignment- or comment-triggered). Two behaviors are now documented for every issue-bound agent: - A. When wrapping up a child issue, post the final result and switch to `in_review` on this issue first, then post a single top-level comment on the parent. Mention the parent assignee only when it is another agent on a still-open parent — never self-mention, never @ member / squad, never re-trigger a `done` / `cancelled` parent. - B. When creating sub-issues, choose `--status backlog` for sub-issues that must wait and `--status todo` for the one to start immediately; promote with `multica issue status <id> todo` when its turn comes. The signal is explicitly framed as best-effort — no server-side state sync, no claim of a guaranteed handshake. The section is skipped for chat, quick-create, and run-only autopilot runs, which have no parent/child semantics. Tests in runtime_config_test.go assert that the section is present in both issue workflows, absent in the three non-issue modes, and that the wording does not introduce a non-existent `multica issue list --parent` command or promise a reliable handshake. Co-authored-by: multica-agent <github@multica.ai> * fix(runtime): split Step A of parent/sub-issue protocol by trigger type (MUL-2338) Comment-triggered runs were inheriting an unconditional `multica issue status <this-issue-id> in_review` from Step A, which conflicts with the comment-triggered workflow rule "Do NOT change the issue status unless the comment explicitly asks for it" (Elon's blocking review on PR #2918). Step A now branches on trigger type: - Assignment-triggered: keep "post final results + flip in_review". 
- Comment-triggered: complete the reply per the existing workflow rule, only flip status when the triggering comment asked for it, and gate the parent-notification steps on actually closing out child work. Tests lock the boundary: comment-triggered briefs must not contain the unconditional in_review command, must echo the existing status guardrail inside Step A, and must spell out the "closing out" gate. Assignment-triggered briefs still carry the unconditional flip. Co-authored-by: multica-agent <github@multica.ai> * fix(runtime): simplify parent/sub-issue mention rule to always @ parent assignee (MUL-2338) Per Bohan's directive on PR #2918: the per-case mention table (same agent / member / squad / closed parent) is overkill prompt complexity. Replace it with a single rule: always @mention the parent's assignee using the URL that matches assignee_type. The platform's existing run dedup handles re-triggers, and a single rule is easier for agents to follow predictably. Preserves the existing comment-triggered boundary (Step A still does NOT add an unconditional in_review flip on comment-triggered runs). Co-authored-by: multica-agent <github@multica.ai> * refactor(runtime): compress parent/sub-issue protocol to 3-rule convention (MUL-2338) Drop the spec-flavored A/B sub-headings and per-case mention table; keep three numbered rules (close out child, notify parent, pick backlog vs todo) plus a one-line best-effort preamble. The comment-triggered branch still re-asserts the "do not change status unless asked" guardrail and gates parent notification on actually closing out child work; the assignment-triggered branch still flips to `in_review`. Section is now 7 lines instead of 29. A new TestParentSubIssueProtocolIsCompact guards the ≤10-line ceiling so this stays a convention, not a spec. 
Co-authored-by: multica-agent <github@multica.ai> * fix(runtime): make sub-issue creation rule unconditional in parent/sub-issue protocol (MUL-2338) Elon's review on PR #2918: the preamble previously gated all three rules on the current issue having `parent_issue_id`, but rule 3 (creating sub-issues) needs to reach top-level parents that have no parent themselves — that is exactly where the `todo` vs `backlog` decision matters most. Move the gate from the preamble onto rules 1 and 2 per-rule; rule 3 now applies to any issue-bound run. Section stays at 7 newlines (≤10). Co-authored-by: multica-agent <github@multica.ai> * refactor(runtime): unify parent/sub-issue protocol as mechanism description (MUL-2338) Drop the if/else split between assignment- and comment-triggered runs in the Parent / Sub-issue Protocol section: both runs now read the same two-rule description of how the parent/child mechanism works. The comment-triggered workflow rule "Do NOT change the issue status unless the comment explicitly asks for it" naturally short-circuits the parent notification (no status flip → not closing out the child → skip), so the protocol no longer needs to branch on TriggerCommentID. Tests collapse the two trigger-specific cases into one parameterized test, and the assignment vs comment status-flip invariants are now anchored on the real workflow command (with substituted issue id) instead of the protocol's removed `<this-issue-id>` placeholder. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-20 16:20:33 +08:00
Angular	1f978bf1ec	feat(autopilot): link created issues to projects (#2908) * feat(autopilot): link created issues to projects * test(autopilot): cover project flag	2026-05-20 15:37:23 +08:00
Bohan Jiang	ffc0c5ab2e	docs(agent-inspector): sync thinking_level comments with no-override semantics (MUL-2339) (#2923 ) Follow-up to #2919 review nits — comments still described the empty thinking_level as "use runtime default" and claimed ThinkingPicker callers guaranteed non-empty levels. Both were stale after the semantics changed: - packages/core/types/agent.ts: clarify that "" clears the override and the local CLI config / built-in default decides at runtime. - thinking-picker.tsx: document that the stale-orphan clear path in ThinkingPropRow mounts the picker with an empty levels list plus a persisted value, so callers do not guarantee non-empty levels. Co-authored-by: multica-agent <github@multica.ai>	2026-05-20 15:34:27 +08:00
Bohan Jiang	b7082a01f1	fix(issues): retry button targets the row's agent (MUL-2457) (#2921 ) * fix(issues): retry button targets the row's agent, not the assignee (MUL-2457) The execution log retry button used to re-fire the issue's current assignee instead of the agent that actually ran the clicked row. After a reassignment, or for squad workers / @-mention agents, the rerun landed on the wrong agent. POST /api/issues/{id}/rerun now accepts an optional task_id: when set, the rerun targets that task's agent (and reuses its leader/worker role). An empty body keeps the assignee-driven CLI/API contract. The execution-log retry button passes task.id, so per-row retry always fires the correct agent. enqueueMentionTask gained a forceFreshSession parameter so the new mention-path rerun keeps the same fresh-session contract as the assignee path. Co-authored-by: multica-agent <github@multica.ai> * fix(issues): inherit trigger provenance + fix cross-issue test (MUL-2457) Address review feedback on PR #2921: 1. RerunIssue now inherits TriggerCommentID from the source task when sourceTaskID is valid. Without this, a per-row rerun of a comment- or mention-triggered task degrades into a generic issue run because the daemon's buildCommentPrompt path keys on TriggerCommentID. The inherited summary is rebuilt naturally inside the enqueue helpers (buildCommentTriggerSummary derives it from the comment ID). 2. The new cross-issue rejection test inserted a second issue without `number`, hitting uq_issue_workspace_number on a same-workspace collision with the fixture's issue. Both inserts now claim the next available per-workspace number (MAX(number)+1) — matching the pattern used by notification_listeners_test. Added TestRerunIssueInheritsTriggerCommentFromSourceTask to lock the trigger provenance contract. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-20 15:30:03 +08:00
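The rerun targeting rule in the commit above can be sketched as follows. This is a minimal illustration, not the server handler: `resolveRerunTarget`, `Task`, `Issue`, and `RerunTarget` are hypothetical names, and the only behaviors taken from the commit message are that an omitted `task_id` keeps the assignee-driven contract, a valid `task_id` targets that task's agent and inherits its trigger comment, and a task from another issue is rejected.

```typescript
// Hypothetical shapes for illustration only.
type Task = { id: string; issueId: string; agentId: string; triggerCommentId: string | null };
type Issue = { id: string; assigneeAgentId: string };
type RerunTarget =
  | { ok: true; agentId: string; triggerCommentId: string | null }
  | { ok: false; error: string };

// Decide which agent a rerun should fire, given an optional source task id.
function resolveRerunTarget(issue: Issue, tasks: Task[], taskId?: string): RerunTarget {
  if (taskId === undefined) {
    // Empty body: keep the assignee-driven contract.
    return { ok: true, agentId: issue.assigneeAgentId, triggerCommentId: null };
  }
  const source = tasks.find((t) => t.id === taskId);
  if (!source || source.issueId !== issue.id) {
    // Unknown task, or a task that belongs to a different issue.
    return { ok: false, error: "task_id does not belong to this issue" };
  }
  // Per-row rerun: same agent as the clicked row, same trigger provenance.
  return { ok: true, agentId: source.agentId, triggerCommentId: source.triggerCommentId };
}

const issue: Issue = { id: "issue-1", assigneeAgentId: "agent-new" };
const tasks: Task[] = [
  { id: "task-1", issueId: "issue-1", agentId: "agent-old", triggerCommentId: "comment-9" },
  { id: "task-2", issueId: "issue-2", agentId: "agent-x", triggerCommentId: null },
];

const perRow = resolveRerunTarget(issue, tasks, "task-1");
const legacy = resolveRerunTarget(issue, tasks);
const foreign = resolveRerunTarget(issue, tasks, "task-2");
```

Here `perRow` resolves to the agent that ran the clicked row even though the issue has since been reassigned, `legacy` falls back to the current assignee, and `foreign` is rejected.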
Angular	314e91fa6d	fix(chat): guard optimistic task message ids (#2901)	2026-05-20 15:18:42 +08:00
Bohan Jiang	68270e238e	MUL-2339: polish(agent-inspector): optimistic updates + picker layout + thinking-default semantics (#2919 ) * polish(agent-inspector): optimistic updates + picker layout + thinking-default semantics Round of cleanup on the agent inspector pickers after using them end-to-end: 1. Optimistic updates (`agent-detail-page.tsx`) The `handleUpdate` callback that backs every inspector picker (thinking / model / visibility / concurrency / runtime / name / description / avatar) was strictly sequential: `await api.updateAgent → invalidateQueries → toast.success`. Each pick waited 0.5-2s for the network round trip before the trigger chip updated, which read as visible UI lag. Snapshot the cached agent list, patch the matching agent synchronously via `setQueryData`, then run the network request in the background. On error roll back to the snapshot before the toast surfaces the cause. All inspector pickers now respond instantly. 2. Block-in-inline fix in Model + Thinking pickers `PickerItem` wraps its children in a flex `<span>`. The picker bodies had `<div>` children, which is block-in-inline (invalid HTML5) and triggers a browser layout quirk that off-aligns descendants — model IDs floated to the center under their labels in ModelPicker, descriptions indented unevenly under levels in ThinkingPicker. Replace the inner `<div>`s with `<span block text-left>` so the layout is deterministic across rows. 3. Visual polish in Thinking picker Label was `font-medium` at the parent's default `text-sm` (14px), chunky next to the 10px description. Drop to `text-[13px]`, bump description to `text-[11px] leading-snug` with `mt-0.5` so the contrast between rows feels less jarring. 4. Match Model picker's row typography to Thinking's Same `text-[13px]` for label + `text-[10px] mt-0.5` for the model ID. Both pickers now read as the same component family. 5. 
"Default" semantics: follow CLI config, not model factory default The chip displayed "Default" / "default" badge when no `thinking_level` was set, alongside a `[default]` chip on the model's factory-advertised default option in the menu. That was misleading: when Multica omits `--effort` (because picker is unset), it's the user's local CLI config (claude/codex) that decides the reasoning level — not the model's factory default. Showing "medium [default]" while the user has xhigh in their CLI config lies about what actually fires at the API. - Trigger label: "Default" → "Follow CLI config" (zh: "跟随 CLI 配置") - Footer clear button: "Use model default" → "Follow CLI config" - Footer tooltip: explicitly mentions claude/codex CLI config - Inline `[default]` badge on the factory-default option: removed - `defaultLevel` prop chain (picker + prop-row + test): cleaned up as now-dead code 6. Stop hiding the Thinking row while discovery loads `if (levels.length === 0 && !value) return null` hid the row while the runtime-models query was still in flight, which subscribed-then-unsubscribed from useQuery in such a way that the discovery only fired when the user manually opened the Model picker. Gate the early return on `!isLoading && !isFetching` so ThinkingPropRow stays mounted (and thus its useQuery keeps subscribed) until discovery returns; row appears as soon as data arrives, no Model-picker tap required. 7. Drop the inline tooltip on Thinking picker items The same description was rendered both inline under the label (always visible) and as a hover tooltip (overlapping the next row). The hover bubble was redundant — removed. Tests - `pnpm --filter @multica/views test thinking-picker` → 7/7 pass after renaming the "Default" assertion + clearing the unused defaultLevel test prop. - `pnpm --filter @multica/views typecheck` clean. 
* fix(test): align thinking-prop-row tests with renamed copy + loading-aware row gate CI surfaced 3 broken assertions in `thinking-prop-row.test.tsx` — all consequences of the polish PR's behaviour changes that the test file hadn't tracked: - "hides the row when ... no thinking levels and nothing is persisted" The row now stays mounted while runtime-models discovery is in flight (so the useQuery subscription actually survives long enough to issue the request — fixes the bug where Thinking only appeared after manually opening the Model picker). The assertion asserted absence only after `initiate` was called, but loading is still in progress at that point. Wrap the absence assertion in `waitFor` so it waits for the row to disappear after the query settles. - "clears the orphan value via the picker footer" Tooltip copy changed from "Clear and fall back to this model's default reasoning level" → "Clear the override and let the local CLI config decide the reasoning level". Update the regex. - "renders the row with \"Default\" when value is empty" Trigger label changed from "Default" → "Follow CLI config" to reflect that Multica omits --effort and the local CLI config decides. Update the assertion + test name. `pnpm --filter @multica/views test` → 701/701 pass. * fix(agent-inspector): drop loading-row gate + per-field optimistic rollback (MUL-2339) Addressing review feedback on #2919: - ThinkingPropRow no longer keeps the row visible during discovery. The previous explanation ("early return null aborts the useQuery subscription") was wrong — React doesn't unmount a component that returns null, so hooks (and their subscriptions) stay live. The loading-aware gate only succeeded in showing an empty "Follow CLI config" row that opened to an empty menu before discovery settled. Restore the simple `levels empty && !value -> null` behavior; the sibling ModelPicker mounts unconditionally and keeps the shared runtime-models query active regardless. 
- AgentDetailPage.handleUpdate now rolls back only the fields the failing PATCH wrote, instead of restoring a whole-list snapshot. A whole-list snapshot rollback discards any concurrent successful inspector mutation that landed between snapshot and rollback. Per- field rollback + a final invalidate converges the cache on server truth without clobbering unrelated optimistic writes. - Sync the now-stale "use model/runtime default" wording in the thinking-related JSDoc and type comments: empty thinking_level is a "no override" sentinel — the backend omits --effort and the upstream CLI config decides — not a Multica-known default level. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-20 15:18:34 +08:00
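The difference between a whole-list snapshot rollback and the per-field rollback described above can be illustrated with a small sketch. It uses a plain array in place of the query cache, and `applyPatch`, `snapshotFields`, and the `Agent` shape are hypothetical helpers rather than the code in `agent-detail-page.tsx`.

```typescript
// Hypothetical agent shape; the real type lives in the Multica codebase.
type Agent = { id: string; name: string; model: string; thinking_level: string };
type AgentPatch = Partial<Omit<Agent, "id">>;

// Return a new list with the patch merged into the matching agent.
function applyPatch(list: Agent[], id: string, patch: AgentPatch): Agent[] {
  return list.map((a) => (a.id === id ? { ...a, ...patch } : a));
}

// Capture the previous values of only the fields the patch is about to write.
function snapshotFields(list: Agent[], id: string, patch: AgentPatch): AgentPatch {
  const agent = list.find((a) => a.id === id);
  const prev: AgentPatch = {};
  if (!agent) return prev;
  for (const key of Object.keys(patch) as (keyof AgentPatch)[]) {
    prev[key] = agent[key];
  }
  return prev;
}

let agentCache: Agent[] = [{ id: "a1", name: "Bot", model: "m1", thinking_level: "" }];

// Mutation A (which will fail) optimistically writes thinking_level.
const patchA: AgentPatch = { thinking_level: "high" };
const prevA = snapshotFields(agentCache, "a1", patchA);
agentCache = applyPatch(agentCache, "a1", patchA);

// Mutation B (which succeeds) lands in between and writes name.
agentCache = applyPatch(agentCache, "a1", { name: "Renamed" });

// A fails: restore only the fields A wrote. B's write survives.
agentCache = applyPatch(agentCache, "a1", prevA);
```

Restoring a full snapshot taken before mutation A would also have reverted `name`, which is the clobbering the commit avoids. The final invalidate mentioned in the commit then reconciles the cache with the server.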
Bohan Jiang	eaf8b14866	fix(installer): post-merge nits from #2881 (MUL-2458) (#2922 ) - Capture `brew tap` output and print the same diagnostic tail on failure that `brew install` already prints, so #2867-style "no signal" reports are gone from both Homebrew failure paths. - Add a `brew tap` failure regression case to `scripts/install.test.sh` and refactor the test runner to share sandbox/curl-stub setup; both cases now also assert the diagnostic tail is emitted. - Move the shell installer test out of the heavy backend job into a dedicated `installer` matrix job that runs on `ubuntu-latest` and `macos-latest`, since the installer targets macOS/Homebrew and BSD vs GNU `tar` / `sed` / `mktemp` differences are the next likely break. - Surface `MULTICA_INSTALL_DIR`, `MULTICA_BIN_DIR`, and `MULTICA_SELFHOST_REF` in `install.sh --help` so `MULTICA_BIN_DIR` stops looking like a test-only knob. Co-authored-by: multica-agent <github@multica.ai>	2026-05-20 15:18:17 +08:00
Jiayuan Zhang	41753d17a2	feat(desktop): pin tab (MUL-2449) (#2914 ) * feat(desktop): pin tab — keep parked tabs anchored across navigations (MUL-2449) Adds tab pinning to the desktop tab bar. Pinned tabs render as icon-only at the left, suppress the X close button, and intercept any `navigation.push()` that would change their pathname — those are redirected into a new tab so the pinned tab stays parked on its original route. Search/hash/back/forward stay in-tab so pinned filter and drawer state still work. Implements the FINAL combo from the MUL-2449 RFC §4: right-click menu + ⌘⇧P shortcut (D1 a+c), icon-only visual (D1v i), pathname-change → new tab with same-path-allowed (D2a/b A), back / refresh allowed (D2c/d A), pinned auto-cluster left and persist (D3a/b A), pinned can't be X-closed (D3c A), dedupe respected (D4a A), default Issues tab pinnable (D4b A), drag clamped to its zone (D4c A), deep link prefers pinned (D4e A). Store changes: - Tab.pinned added; togglePin maintains the "pinned first" invariant by inserting at the zone boundary. - moveTab clamps cross-zone drags so dnd-kit can't violate the ordering. - Persistence bumped v2 → v3 with a defaulting migration (pinned=false). Rehydrate sorts pinned-first as a defensive net. Navigation: - tryRouteToPinnedNewTab compares the active tab router's live pathname to the target. Same-pathname push (query / hash / sub-router) falls through to the router; different pathname → openTab + setActiveTab (foreground; respects dedupe). UI: - Tab bar wraps each tab in a shadcn ContextMenu with Pin/Unpin + Close (Close disabled for pinned or last-remaining tab). - Pinned tabs use a narrower icon-only layout with an accent left border and a divider between the pinned and unpinned groups. - Global keydown listener registers ⌘⇧P / Ctrl+Shift+P to toggle pin on the active tab. Tests: - tab-store: togglePin ordering, moveTab boundary clamping, v2→v3 migration. 
- navigation: pinned push → new foreground tab; same-pathname push stays in tab; cross-workspace still wins over pin. Co-authored-by: multica-agent <github@multica.ai> * test(desktop): cover TabNavigationProvider.push pin interception (MUL-2449) Add pathname-diff / same-pathname cases for the per-tab navigation adapter. Existing tests only exercised the root-level DesktopNavigationProvider, but in-tab AppLink / page clicks flow through TabNavigationProvider — so a future refactor that drops the pin check from that provider would silently regress. Co-authored-by: multica-agent <github@multica.ai> * refactor(desktop): pin tab — hover button, full title, drop ⌘⇧P (MUL-2449) Jiayuan's interactive review of PR #2914 surfaced three changes to the RFC's D1 (entry / visual) decisions: 1. Drop the ⌘⇧P global shortcut — it added a keybinding for a low-frequency action and crowded the shortcut namespace. 2. Reveal a Pin / Unpin button on tab hover instead of relying on the right-click menu as the primary entry; right-click remains as a fallback (and for Close). 3. Pinned tabs keep their full title and width. The only weak visual differences vs. unpinned tabs are the accent left border and the suppressed X close button. Removes the global keydown listener (no other doc / handler referenced it). Adds a hover-only Pin / Unpin span next to the existing close affordance, both gated by group-hover. Drops the icon-only width / hidden-title styling for pinned tabs. Tests: new tab-bar.test.tsx covers Pin / Unpin button rendering, click handlers (togglePin), the hidden-X invariant on pinned tabs, and the full-title rendering. 146 passed, typecheck clean. Co-authored-by: multica-agent <github@multica.ai> * refactor(desktop): pin tab — drop accent left border, swap leading icon to Pin (MUL-2449) Jiayuan reported that the accent left border on pinned tabs reads as a heavy black edge in light mode and looks unrefined. 
Replace it with a quieter identifier: pinned tabs swap their route icon for a Pin glyph in the leading slot (same size, no extra horizontal space). The hidden X close button stays as the secondary cue. RFC §3 D1v moves from iii FINAL to iv FINAL; iii is demoted to v2 FINAL → v3 REMOVED. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Lambda <lambda@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-05-20 09:14:43 +02:00
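The store invariants named in this commit (pinned tabs first, inserts at the zone boundary, drags clamped to their own zone) can be sketched as below. This is a simplified model under assumed names: the `Tab` shape here carries only `id` and `pinned`, and the function bodies are an illustration of the described behavior, not the desktop tab store itself.

```typescript
// Simplified tab shape for illustration.
type Tab = { id: string; pinned: boolean };

// With pinned tabs always first, the pinned count is also the boundary index.
function pinnedCount(tabs: Tab[]): number {
  return tabs.filter((t) => t.pinned).length;
}

// Flip the pin flag and re-insert the tab at the zone boundary.
function togglePin(tabs: Tab[], id: string): Tab[] {
  const tab = tabs.find((t) => t.id === id);
  if (!tab) return tabs;
  const rest = tabs.filter((t) => t.id !== id);
  const boundary = pinnedCount(rest);
  const toggled: Tab = { ...tab, pinned: !tab.pinned };
  // Newly pinned: end of the pinned zone. Newly unpinned: start of the unpinned zone.
  return [...rest.slice(0, boundary), toggled, ...rest.slice(boundary)];
}

// Move a tab, clamping the target index so it cannot cross into the other zone.
function moveTab(tabs: Tab[], from: number, to: number): Tab[] {
  const tab = tabs[from];
  if (!tab) return tabs;
  const boundary = pinnedCount(tabs);
  const min = tab.pinned ? 0 : boundary;
  const max = tab.pinned ? boundary - 1 : tabs.length - 1;
  const target = Math.min(Math.max(to, min), max);
  const next = tabs.slice();
  next.splice(from, 1);
  next.splice(target, 0, tab);
  return next;
}

const initialTabs: Tab[] = [
  { id: "issues", pinned: false },
  { id: "inbox", pinned: false },
  { id: "settings", pinned: false },
];
const afterPin = togglePin(initialTabs, "settings");
const pinnedDragRight = moveTab(afterPin, 0, 2);
const unpinnedDragLeft = moveTab(afterPin, 2, 0);
```

In this run `afterPin` orders the tabs as settings, issues, inbox. Dragging the pinned tab to the far right leaves it at index 0, and dragging `inbox` to the far left stops at index 1, just after the pinned zone.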
Angular	edded77691	fix(installer): fall back when brew install fails (#2881)	2026-05-20 15:14:18 +08:00
Bohan Jiang	9d3b6e2241	feat(agent): inspector picker for thinking_level (MUL-2339) (#2912 ) * feat(agent): inspector picker for thinking_level (MUL-2339) PR1 (#2865) shipped the backend — column, daemon-side discovery, Claude/Codex injection, API validation — but the agent detail inspector had no UI to set the value. Users could only configure thinking_level via custom_env / API. This wires up the picker so it lives next to Runtime and Model where everything else editable already lives. Picker is per-(runtime, model): it reuses the same `runtimeModelsOptions` query the Model picker already runs (60s cache, no extra round-trip) and reads the active model's `thinking.supported_levels`. When the list is empty — every provider except Claude/Codex today, or a Claude model that doesn't expose `--effort` — the entire PropRow is hidden, not just rendered inert. The picker never gets to invent value/label pairs itself; they come verbatim from each CLI's own catalog (`Low`, `Extra high`, …) so the user sees exactly what `claude --effort` / `/effort` and Codex's TUI show. The `default_level` from the catalog is badged inside the popover so the user knows which value `""` (the persisted "use model default" sentinel) maps to. The clear footer sends `""` explicitly, which the backend already understands as the tri-state "explicit clear" branch of UpdateAgent. Invalid combinations (e.g. picking a value not in the target provider's enum after a runtime swap in the same PATCH) hit the existing 400 path on the server and surface as a toast via the inspector's standard `onUpdate` error handler — no extra client-side guard needed. Exports `RuntimeModelThinking` and `RuntimeModelThinkingLevel` from `@multica/core/types` so views consumers can refer to them by name. i18n keys added in EN and zh-Hans (parity test green). 
Co-authored-by: multica-agent <github@multica.ai> * fix(agent): preserve unknown thinking_level in picker label Stale persisted values (model swap, CLI catalog shrink) used to render as 'Default' even though the backend would still ship the orphaned token. Fall back to the raw value when no entry matches so the user sees what's actually saved and can clear it. Co-authored-by: multica-agent <github@multica.ai> * test(agent): unit tests for thinking-picker label + clear flow Covers the default-vs-set trigger label, the unknown-token preservation path added in `3452fae3f`, the read-only display, picking and re-picking into onChange, and the clear footer's empty-string emission. Co-authored-by: multica-agent <github@multica.ai> * fix(agent): keep Thinking row visible when value is stale (MUL-2339) Inspector was hiding the row whenever the active model had no supported_levels, which also hid persisted orphan tokens (model swap into a non-thinking runtime, or a CLI catalog that shrank). PR1's per-model invalid behavior is daemon-side warn/drop, not a synchronous DB clear, so the frontend has to surface the raw value and let the user explicit-clear it via the picker footer. Hide the row only when levels are empty AND value is empty; otherwise keep it. Extract ThinkingPropRow into its own file so the row-level logic is unit-testable. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-20 13:47:19 +08:00
Bohan Jiang	2bec2221d2	feat(agent): per-agent thinking_level for claude + codex (MUL-2339) (#2865 ) * feat(agent): persist thinking_level per agent (MUL-2339) Adds a nullable `thinking_level` column to the `agent` table so the backend can route a runtime-native reasoning/effort token (e.g. Claude's `xhigh`, Codex's `minimal`) through to the agent CLI on every dispatch. The column is intentionally TEXT rather than an enum — Claude and Codex publish overlapping but distinct vocabularies and we want the persisted value to round-trip exactly through whichever CLI receives it. NULL is the "use runtime default" sentinel that every downstream consumer reads as "do not inject --effort / reasoning_effort". This commit is just the storage layer (migration + sqlc); subsequent commits wire it through the API, daemon, and agent backends. Co-authored-by: multica-agent <github@multica.ai> * feat(agent-backend): inject reasoning effort for claude + codex (MUL-2339) Extends ExecOptions with a runtime-native ThinkingLevel string and wires it into the Claude and Codex backends. Discovery is driven by the local CLI so the daemon advertises whatever the host install supports rather than a hand-maintained list that goes stale. Per Elon's PR1 review: - Claude: parses `claude --help` to learn the `--effort` superset and projects through a per-model allow-list (xhigh is Opus-only; max is session-only on the smaller models). Falls back to a conservative static list when the binary is missing or help drift hides the line. - Codex: drives `codex debug models --output json` so per-model reasoning subsets and the documented default come directly from the CLI. The older config-error probe trick is gone — the JSON path is stable and doesn't pollute stderr with an intentional misconfig. - Cache key includes (provider, executablePath, cliVersion) so a CLI upgrade invalidates entries that referenced the older help / catalog. 
Per Trump's PR1 constraint, all three Codex injection points (thread/start.config, thread/resume.config, turn/start.effort) flow through one helper (`applyCodexReasoningEffort`) so they cannot drift independently. The shared `codexReasoningCases` fixture in `thinking_test.go` asserts the same value→{shape, key} contract at each site for every level the runtimes know about. Claude's `--effort` is also added to `claudeBlockedArgs` so a user custom_args entry can't silently outvote the daemon-injected value. Co-authored-by: multica-agent <github@multica.ai> * feat(api): wire thinking_level through API + daemon contract (MUL-2339) End-to-end plumbing for the per-agent reasoning/effort setting: - AgentResponse / TaskAgentData now carry `thinking_level`; the daemon's claim response includes it and the daemon's executor passes it through to agent.ExecOptions, where the Claude and Codex backends already know what to do with it. - ModelEntry on the runtime-models wire format gains a `thinking` block carrying `supported_levels` + `default_level` per model so the UI can render a runtime-aware picker without the server having to know about the local CLI install. `handleModelList` projects the agent-package catalog (including the new Thinking field) into the wire shape. - CreateAgent / UpdateAgent gate the field with a synchronous provider enum check (claude / codex only today). UpdateAgent is tri-state: field omitted = no change, "" = explicit clear (new `ClearAgentThinkingLevel` query, mirrors the existing mcp_config null pattern), non-empty = validate then set. Per Trump's PR1 review, the API NEVER auto-clears on a runtime/model swap and ALWAYS returns 400 on an unknown literal value — same shape across CreateAgent, UpdateAgent, and combined patches that move runtime + level in one request. Per-model combination failures (e.g. `xhigh` against a model that only supports up to `high`) surface as a daemon-side task error, not a silent server-side rewrite. 
TS types follow the same shape: `Agent.thinking_level`, `CreateAgentRequest`/`UpdateAgentRequest` add the field, `RuntimeModel` grows a `thinking` block. Older backends omit the field, which the front-end treats as "no picker for this model" — installed desktop builds keep working. Co-authored-by: multica-agent <github@multica.ai> * fix(agent): correct codex debug models argv + pin via runner test (MUL-2339) `codex debug models --output json` is rejected by codex-cli 0.131.0 — the subcommand emits JSON on stdout by default and has no `--output` flag. Drop the flag and add `--bundled` to skip the network refresh discovery doesn't need. Move the argv to a package-level var and add a test that runs a fake `codex` to assert the binary actually receives exactly `debug models --bundled`, so the contract can't silently drift on the next refactor. Also teach ValidateThinkingLevel to resolve an empty model to the provider's default model entry. Without this, every default-model task with a persisted thinking_level would be misjudged "unknown model" by the daemon guard. Co-authored-by: multica-agent <github@multica.ai> * fix(api): reject runtime switch that would leave invalid thinking_level (MUL-2339) A PATCH that changed `runtime_id` without touching `thinking_level` used to silently keep the existing value, so a Claude agent storing `max` could land on a Codex runtime where `max` is not a recognised token at all, and the daemon would receive a literal-invalid level. Hold the same "always 400 on literal-invalid, never silent coerce" rule on this implicit path. When runtime_id changes and the existing value is not in the new provider's enum, return 400 with the recovery options (clear via `thinking_level=""` or re-set in the same PATCH). Add coverage for both the kept-when-still-valid and the rejected cases, plus the two recovery paths (clear and replace). 
Co-authored-by: multica-agent <github@multica.ai> * fix(daemon): guard runTask with per-model thinking_level validator (MUL-2339) ValidateThinkingLevel existed but had no call site — `task.Agent. ThinkingLevel` flowed straight into ExecOptions, so `xhigh` configured on a non-Opus Claude model, or API-side stale values that escaped the provider enum gate, would be injected anyway. Run the validator before building ExecOptions. Invalid combinations log a warning and drop the level instead of failing the task: the agent still runs, just at the runtime's default reasoning effort. Discovery errors fail open (keep the level, let the CLI surface any objection) so a transient `claude --help` failure can't strand work. Empty model is forwarded as-is; the validator resolves it to the provider's default model internally per the cross-package contract. Co-authored-by: multica-agent <github@multica.ai> * chore(agent): drop stale `--output json` comments + unused scanner (MUL-2339) Codex CLI's `debug models` subcommand emits JSON without an `--output` flag, and `parseCodexDebugModels` never read from the bufio.Scanner. Sync the comments with the actual invocation and remove the dead init. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-20 12:30:10 +08:00
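The tri-state `thinking_level` rule from the API commits above (omitted = no change, `""` = explicit clear, non-empty = validate then set, and always 400 on a literal-invalid value, including the implicit runtime-switch path) can be sketched like this. The real handler is server-side; `applyThinkingLevelPatch` and `UpdateResult` are hypothetical names, and `allowed` stands in for the enum of whichever provider the agent will be on after the PATCH. The token sets in the usage lines are partial and illustrative, using only tokens named in the commit messages.

```typescript
type UpdateResult =
  | { status: 200; thinking_level: string | null }
  | { status: 400; error: string };

// Resolve the persisted thinking_level after a PATCH, or reject the request.
function applyThinkingLevelPatch(
  current: string | null,
  patch: { thinking_level?: string },
  allowed: ReadonlySet<string>,
): UpdateResult {
  const next = patch.thinking_level;
  if (next === undefined) {
    // Field omitted: keep the value, but never let a runtime switch strand an invalid one.
    if (current !== null && !allowed.has(current)) {
      return { status: 400, error: `thinking_level "${current}" is not valid for the target runtime` };
    }
    return { status: 200, thinking_level: current };
  }
  if (next === "") {
    // Explicit clear: stored as null, the "no override" sentinel.
    return { status: 200, thinking_level: null };
  }
  if (!allowed.has(next)) {
    return { status: 400, error: `unknown thinking_level "${next}"` };
  }
  return { status: 200, thinking_level: next };
}

// Illustrative, partial token set for the target provider (assumption).
const codexLevels: ReadonlySet<string> = new Set(["minimal"]);

// An agent storing "max" is moved to a Codex runtime in each of these PATCHes.
const untouched = applyThinkingLevelPatch("max", {}, codexLevels);
const cleared = applyThinkingLevelPatch("max", { thinking_level: "" }, codexLevels);
const replaced = applyThinkingLevelPatch("max", { thinking_level: "minimal" }, codexLevels);
```

The three calls mirror the rejected case and the two recovery paths the commit describes: `untouched` is a 400, while `cleared` and `replaced` succeed. Per-model mismatches are deliberately out of scope here, since the commits leave those to the daemon-side validator.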
Jiayuan Zhang	292226f632	fix(runtimes): use official Gemini spark icon (MUL-2447) (#2904 ) * fix(runtimes): use official Gemini spark icon (MUL-2447) Gemini provider was falling through to the default Monitor icon in the runtime list. Add the official 4-point spark mark with Google's blue → purple → pink gradient, matching the SVG style/sizing of the other provider icons. Co-authored-by: multica-agent <github@multica.ai> * fix(runtimes): use current Gemini multicolor spark gradient (MUL-2447) Per review on PR #2904: the previous 3-stop blue/purple/pink gradient was the legacy Bard-era Gemini spark. Update to the 5-stop cyan → blue → purple → pink → orange gradient used by the current Gemini app/web multicolor mark. Co-authored-by: multica-agent <github@multica.ai> * fix(runtimes): switch Gemini icon to aurora multicolor treatment (MUL-2447) Co-authored-by: multica-agent <github@multica.ai> * fix(runtimes): align Gemini aurora color positions and smooth spark path Swap yellow/green radial gradient anchors so colors land at the official positions: top red / right blue / left yellow / bottom green, matching gemini.google.com's current aurora spark. Replace the arc-based 4-point spark outline with a cubic-bezier version normalized to the 24-viewBox so the inset between tips is smoother and closer to the gstatic source. Co-authored-by: multica-agent <github@multica.ai> * fix(runtimes): use Simple Icons Google Gemini mark (MUL-2447) Drop the hand-crafted aurora gradient approximation and inline the canonical "Google Gemini" path from Simple Icons (CC0 1.0), rendered in the Simple Icons brand color (#8E75B2). This matches the pattern used by the other provider marks in this file (Claude/Codex from Bootstrap Icons, etc.) instead of trying to manually approximate the official multicolor wash from gemini.google.com (which paints via a clipPath over an embedded raster). 
Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Lambda <lambda@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-05-20 12:27:53 +08:00
Jiayuan Zhang	72339f347b	fix(desktop): keep local machine row visible after stopping daemon (#2906 ) The Start button lives in `DaemonRuntimeActions`, which is rendered in the per-machine detail pane and only when the selected machine is flagged `isCurrent`. After the user manually stopped the daemon, `status.daemonId` went back to undefined, so no machine could be matched as `isCurrent` — the local row either disappeared (when the server-side runtime had been GC'd) or moved into the "remote" section (when it was still present but unmatched). Either way the Start button was unreachable until the app was restarted. Two-part fix: - `DesktopRuntimesPage` now caches the last-known daemonId/deviceName so the local match keeps working while the runtime is still on the server (recently_lost / offline window). - `buildRuntimeMachines` accepts an `ensureLocalMachine` flag; when no real runtime matches, a placeholder local row is synthesized so the Start button still has a home. Desktop opts in via a new `hasLocalMachine` prop on `RuntimesPage`. The empty state is also suppressed when this prop is set so the placeholder row isn't hidden behind the "register a runtime" hint on first launch. Co-authored-by: Lambda <lambda@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-05-20 06:16:20 +02:00
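The second half of this fix, synthesizing a placeholder local row when no real runtime matches, can be sketched as below. The function name `buildMachines` and the `Runtime` / `Machine` shapes are assumptions for illustration; only the `ensureLocalMachine` flag and the cached last-known `daemonId` / `deviceName` come from the commit message.

```typescript
// Hypothetical shapes; the real builder takes more inputs.
type Runtime = { daemonId: string; deviceName: string };
type Machine = { daemonId: string | null; deviceName: string; isCurrent: boolean; placeholder: boolean };
type LocalHint = { daemonId?: string; deviceName?: string };

function buildMachines(runtimes: Runtime[], local: LocalHint, ensureLocalMachine: boolean): Machine[] {
  const machines: Machine[] = runtimes.map((r) => ({
    daemonId: r.daemonId,
    deviceName: r.deviceName,
    // `local.daemonId` may be a cached last-known id after the daemon was stopped.
    isCurrent: local.daemonId !== undefined && r.daemonId === local.daemonId,
    placeholder: false,
  }));
  if (ensureLocalMachine && !machines.some((m) => m.isCurrent)) {
    // No real runtime matched: synthesize a local row so the Start button has a home.
    machines.unshift({
      daemonId: null,
      deviceName: local.deviceName ?? "This machine",
      isCurrent: true,
      placeholder: true,
    });
  }
  return machines;
}

const remoteOnly: Runtime[] = [{ daemonId: "d-remote", deviceName: "build-box" }];
const desktopRows = buildMachines(remoteOnly, { deviceName: "laptop" }, true);
const webRows = buildMachines(remoteOnly, {}, false);
```

With the flag set, `desktopRows` gains a placeholder row flagged `isCurrent`; without it, `webRows` contains only the real remote runtime.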
Jiayuan Zhang	fc8528d64d	feat(autopilot): support assigning to a squad (MUL-2429) (#2888 ) * feat(autopilot): support assigning autopilot to a squad (MUL-2429) Path A (Squad-as-Leader) from the RFC: when an autopilot's assignee is a squad, dispatch resolves to squad.leader_id and executes against the leader's runtime — semantics match a human manually assigning the issue to that squad, no fan-out. Backend scope only; frontend picker change is a follow-up PR. Changes: - 096_autopilot_squad_assignee migration: drop agent FK on autopilot.assignee_id, add assignee_type column (default 'agent'), add autopilot_run.squad_id attribution column. - service.AgentReadiness: single source of truth for archived / runtime-bound / runtime-online checks. Shared by autopilot admission gate, run_only dispatch, and isSquadLeaderReady. - service.resolveAutopilotLeader: translates assignee_type/id to the agent that actually runs the work. - dispatchCreateIssue: stamps issue with assignee_type='squad' for squad autopilots and enqueues via EnqueueTaskForSquadLeader. - dispatchRunOnly: belt-and-braces readiness re-check after resolving squad → leader so a leader that went offline between admission and dispatch produces a clean failure instead of a doomed task. - handler.CreateAutopilot / UpdateAutopilot: accept assignee_type with squad/agent existence + leader-archived validation. Backward-compatible default of "agent" preserves the contract for older clients. - Analytics: AutopilotRunStarted/Completed/Failed events carry assignee_type and squad_id; PostHog can now group autopilot runs by squad without joining back to the autopilot row. Co-authored-by: multica-agent <github@multica.ai> * fix(autopilot): reject archived squads, route post-admission skips, cleanup dangling-agent autopilots (MUL-2429) Addresses three review findings on PR #2888: 1. 
Archived squad handling: validateAutopilotAssignee now rejects squads with archived_at set; resolveAutopilotLeader returns errSquadArchived so the admission gate fails closed; DeleteSquad now mirrors the issue transfer for autopilot rows (TransferSquadAutopilotsToLeader) so surviving autopilots flip to assignee_type='agent' (leader) instead of dangling at the archived squad. 2. dispatchRunOnly post-admission readiness: introduces errDispatchSkipped sentinel, recognised by DispatchAutopilot via handleDispatchSkip so the run is recorded as `skipped` (not `failed`). Manual triggers no longer 500 when the leader's runtime goes offline between admission and task creation. New TestManualTriggerDoesNotErrorOnPostAdmissionSkip locks the behaviour in. 3. Dangling agent assignee after migration 096 dropped the FK: shouldSkipDispatch now distinguishes pgx.ErrNoRows / errSquadArchived (hard skip — retrying won't help) from transient DB errors (fail-open). DeleteAgentRuntime pauses autopilots that target agents about to be hard-deleted (ListArchivedAgentIDsByRuntime + PauseAutopilotsByAgentAssignees) so the breakage surfaces as a paused row in the UI instead of a quiet skip-burning loop. Unit tests cover the sentinel unwrap contract and errSquadArchived errors.Is behaviour. Integration test TestAutopilotDispatchSkipsWhenRuntimeOffline re-verified against a fresh DB with migration 096 applied. Co-authored-by: multica-agent <github@multica.ai> * fix(autopilot): bump last_run_at on post-admission skip (MUL-2429) Match recordSkippedRun (pre-flight skip) and the success path so the scheduler / "last seen" UI both reflect that this tick evaluated the trigger, even when the post-admission readiness gate caught a late regression. Addresses Emacs review caveat #1 on PR #2888. Co-authored-by: multica-agent <github@multica.ai> * feat(autopilot): mixed agent/squad assignee picker in dialog (MUL-2429) End-to-end UI for assigning an autopilot to a squad. 
Closes the PR #2888 backend gap: the squad-as-assignee feature was already wired in Go (Path A, RFC §4) but the desktop dialog never offered the choice. - core/types/autopilot: add `AutopilotAssigneeType`, surface `assignee_type` on `Autopilot` + Create/Update request payloads. - views/autopilots/pickers/agent-picker: switch to a polymorphic AssigneeSelection (`{type, id}`); render agents and squads as two grouped sections with shared pinyin search. - views/autopilots/autopilot-dialog: maintain `assigneeType` state, send it on create/update, render the trigger avatar / hover dot with `assignee.type`. - views/autopilots/autopilots-page + autopilot-detail-page: render the assignee row using `autopilot.assignee_type` so squad-typed autopilots show the squad avatar + name, not a broken agent lookup. - locales: add `agents_group` / `squads_group` / `select_assignee` keys (en + zh-Hans), keep legacy `select_agent` for callers that still reference it. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Lambda <lambda@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-05-20 05:30:13 +02:00
Jiayuan Zhang	4a487adfeb	feat(github): split canView / canManage in settings tab for read-only members (MUL-2413) (#2898) Wires the frontend half of the read-only RFC. The Settings → GitHub tab now always issues the installation list query for any workspace member (the backend gates it via `RequireWorkspaceMember` after PR #2886) and gets `can_manage` straight from the API response. The render matrix covers the six cases the RFC calls out: - configured + connected + admin → Disconnect + (optional) Connected by - configured + connected + member → read-only "Connected to" + read_only_hint - configured + not connected + admin → Connect button + dev description - configured + not connected + member → contact_admin_to_connect hint - not configured + admin → operator banner + disabled Connect - not configured + member → contact_admin_to_connect hint New i18n keys (en + zh-Hans): read_only_hint, connected_by, contact_admin_to_connect. The unused github.manage_hint string is removed (its non-admin branch now resolves to one of the two new hints depending on connection state). GitHubInstallation gains an optional `connected_by` display name so the UI can render the "Connected by {name}" line without further changes once the backend exposes the field. Co-authored-by: Lambda <lambda@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-05-20 04:19:28 +02:00
Jiayuan Zhang	e48f6a84d6	feat(github): expose read-only installation list to workspace members (MUL-2413) (#2886 ) * feat(github): expose read-only installation list to workspace members (MUL-2413) Relax `GET /api/workspaces/{id}/github/installations` from owner/admin-only to any workspace member so the Settings → Integrations tab no longer renders blank for non-admins (the original symptom of MUL-2413). The handler now reads the caller's role from the workspace middleware: - owner / admin keep the full row including the numeric `installation_id` (the connect / disconnect handle) and receive `can_manage: true`. - every other role (member / guest) receives rows with `installation_id` omitted and `can_manage: false`, giving them visibility into "is GitHub wired up?" without the management handle. `GET /github/connect` and `DELETE /github/installations/{id}` stay under the admin/owner middleware group — this PR only relaxes the read path. Tests: `TestListGitHubInstallations_RoleGating` exercises admin, owner, member, and guest paths against the real DB-backed handler fixture and asserts the field stripping + `can_manage` contract. Refs: MUL-2413 Co-authored-by: multica-agent <github@multica.ai> * fix(github): redact installation_id from realtime broadcasts (MUL-2413) GET /github/installations strips the numeric installation_id for non-admin members, but the github_installation:created / uninstall / suspend WS events were still publishing it, so the same handle was reachable from any workspace client subscribed to the workspace scope. Broadcast both payload variants without it — the frontend uses these events only to invalidate the installations query, so admins re-query the list endpoint to recover the management handle. Also adds a router-level test that mounts the production middleware split (member-visible list vs. owner/admin connect+delete) so a future routing change can't silently widen the write surface. 
Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Lambda <lambda@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-05-20 04:17:45 +02:00
Naiyuan Qing	5b8303b83c	fix(editor): fill modal viewport in attachment preview (MUL-2431) (#2891) In the attachment preview modal, image and video previews used `max-h-full max-w-full`, which let small assets render at their natural size and leave the modal mostly empty. Switch to `h-full w-full` so the preview always occupies the modal viewport, relying on `object-contain` to preserve aspect ratio without upscaling beyond the intrinsic bounds. Only touches `packages/views/editor/attachment-preview-modal.tsx` for the image (line 355) and video (line 373) branches; pdf, audio, markdown, html, and text branches keep their existing layout. Co-authored-by: multica-agent <github@multica.ai>	2026-05-20 09:16:08 +08:00

3199 Commits