Commit Graph

3199 Commits

Author SHA1 Message Date
yushen
ae4191fab1 fix(ui): pass node.instance_id instead of node.id to deleteNode mutation
Fleet expects the actual AWS instance_id (e.g. i-0123456789abcdef0),
not the internal DB id. Updated the mutate call in cloud-runtime-dialog
to pass node.instance_id so the correct value reaches Fleet's
DELETE /api/v1/nodes endpoint.

Co-authored-by: multica-agent <github@multica.ai>
2026-05-21 17:59:32 +08:00
yushen
bc79a94b5f fix(api): use instance_id in deleteCloudRuntimeNode body
Fleet API requires instance_id, not id. Fixes 'instance_id is required' error.

MUL-2510

Co-authored-by: multica-agent <github@multica.ai>
2026-05-21 17:53:33 +08:00
LinYushen
adec90c621 MUL-2510 feat: add delete button to fleet nodes list (#3007)
* feat: add delete button to fleet nodes list

- Add deleteCloudRuntimeNode method to API client (DELETE /api/cloud-runtime/nodes/:nodeId)
- Add useDeleteCloudRuntimeNode mutation hook in cloud-runtime.ts
- Add delete button with Trash2 icon to CloudRuntimeNodeRow component
- Include confirmation dialog, loading state, and toast notifications
- Add i18n keys for en and zh-Hans locales

Co-authored-by: multica-agent <github@multica.ai>

* fix(api): correct deleteCloudRuntimeNode contract to match server

- Change from DELETE /api/cloud-runtime/nodes/:nodeId (no body) to
  DELETE /api/cloud-runtime/nodes with JSON body { id: nodeId }
- Use fetchRaw + Content-Type header to match server's withBody proxy
- Add contract test verifying URL, method, body, and Content-Type

Fixes review feedback on MUL-2510

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-21 17:46:26 +08:00
Bohan Jiang
ae530ef057 docs(runtime): tighten issue-metadata write bar (MUL-2507) (#3004)
The previous wording invited agents to pin too much: any opened PR,
external link, or "fact future agents will want one-glance access to"
was framed as worth writing, with no explicit upper bound. In practice
this caused metadata bags to accumulate single-run details and
description-summary noise instead of the small set of repeatedly-read
values the feature was designed for.

Rework the agent runtime brief and the CLI docs to lead with the bar:
write a key only when it is materially important AND likely to be
re-read by future runs on the same issue. "Most runs write zero new
keys" is now stated as the expected case, and the workflow exit step
is rewritten to mirror the same gate. Recommended-key list, safety
boundaries, and stale-key cleanup are preserved so the locked-in test
anchors still pass.

Co-authored-by: multica-agent <github@multica.ai>
2026-05-21 17:20:43 +08:00
Bohan Jiang
ab0228c2a1 feat(issues): collapse long metadata bags in sidebar MUL-2503 (#3003)
* feat(issues): collapse long metadata bags in sidebar (MUL-2503)

The metadata KV strip rendered every key inline, so issues with many
pinned keys pushed the rest of the sidebar far down. Keep the first
four rows visible and tuck the remainder behind a Show N more / Show
less toggle once the bag reaches five keys, mirroring the PR list
collapse rule.

Co-authored-by: multica-agent <github@multica.ai>

* refactor(issues): hide metadata behind a JSON dialog (MUL-2503)

Metadata is an agent-facing free-form KV bag — the values almost never
mean anything to a human reader, and every property humans actually care
about already has a dedicated sidebar field (status, priority, assignee,
etc.). Rendering the first four keys inline still pushed real signal
down and added visual noise for no benefit, so drop the inline strip
entirely.

Replace the section with a small `{ }` Metadata button at the bottom of
the sidebar that opens a Dialog showing the formatted JSON. The button
hides itself when the bag is empty, so the common case stays completely
quiet. Removes the prior collapse threshold (and its `Show N more` /
`Show less` strings) since there is nothing to collapse anymore.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-21 17:18:57 +08:00
LinYushen
e288eff2c5 feat: server auto-generates PAT for cloud runtime bootstrap (#3002)
When bootstrap is enabled and no PAT is available from the request
header or Authorization bearer token, the server now generates a new
PAT automatically and forwards it to the cloud service.

This removes the need for the frontend to pass X-User-PAT — the
server handles it entirely.
2026-05-21 17:07:44 +08:00
YOMXXX
29c2a5d18f fix(daemon): reclaim stale dispatched claims (MUL-2485) (#2872)
* fix(daemon): reclaim stale dispatched claims

* fix(daemon): widen stale claim reclaim window
2026-05-21 17:06:55 +08:00
Tom Qiao
81e8aa5812 test(core): add unit tests for reserved-slugs (#2985)
Co-authored-by: Tom Qiao <tomqiaozc@users.noreply.github.com>
Co-authored-by: Claude Opus 4 <noreply@anthropic.com>
2026-05-21 16:54:45 +08:00
Bohan Jiang
0c767c0052 feat(issues): per-issue metadata KV (MUL-2017) (#2845)
* feat(issues): per-issue metadata KV (MUL-2017)

Adds a small JSONB KV map to every issue for agent pipeline state (attempts,
PR number, pipeline status, ...). Keys match a narrow regex, values are
primitives (string / number / bool), capped at 50 keys per issue and 8KB
per blob. Defense-in-depth via two CHECK constraints (object shape + size).
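
The limits above can be sketched as a single-key write check. This is a minimal Python illustration, not the server's code: the key pattern `KEY_RE` is a hypothetical stand-in for the "narrow regex" (the commit does not spell it out), and measuring the 8KB cap on the JSON-encoded blob is likewise an assumption.

```python
import json
import re

KEY_RE = re.compile(r"^[a-z][a-z0-9_]{0,63}$")  # hypothetical pattern, not from the commit
MAX_KEYS = 50                # cap stated in the commit message
MAX_BLOB_BYTES = 8 * 1024    # "8KB per blob"; measuring the JSON encoding is an assumption


def validate_set(metadata: dict, key: str, value) -> dict:
    """Return a copy of `metadata` with `key` set, or raise ValueError."""
    if not KEY_RE.match(key):
        raise ValueError(f"invalid key: {key!r}")
    # Only primitives are allowed: string / number / bool (None, lists, dicts are rejected).
    if not isinstance(value, (str, int, float, bool)):
        raise ValueError("value must be a string, number, or bool")
    updated = {**metadata, key: value}
    if len(updated) > MAX_KEYS:
        raise ValueError("too many keys")
    if len(json.dumps(updated).encode("utf-8")) > MAX_BLOB_BYTES:
        raise ValueError("metadata blob too large")
    return updated
```

Overwriting an existing key does not count against the key cap, because the check runs on the merged map rather than on the incoming key alone.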

All mutations are single-key atomic (jsonb_set / `- key`). `UpdateIssue`
intentionally does NOT touch metadata: a whole-blob overwrite would race
with concurrent agent writes.

  GET    /api/issues/:id/metadata
  PUT    /api/issues/:id/metadata/:key   body: { "value": <primitive> }
  DELETE /api/issues/:id/metadata/:key

Containment filter on list: GET /api/issues?metadata=<json-object> uses
PG `@>` against a `jsonb_path_ops` GIN index. Mirrored across ListIssues,
CountIssues, ListOpenIssues, and the hand-rolled ListGroupedIssues SQL so
CLI/API and UI grouped views stay consistent.

CLI: multica issue metadata {list,get,set,delete}
  multica issue list --metadata key=value (repeatable, AND)
  set has --type to override the default value-sniffing

Co-authored-by: multica-agent <github@multica.ai>

* fix(issues): metadata test bugs + wire realtime + read-only display (MUL-2017)

- Fix two failing handler tests blocking backend CI:
  - reset decode target after delete so map merge does not mask removal
  - url.PathEscape the key segment so spaces no longer panic NewRequest
- Wire issue_metadata:changed end to end so the detail / list / my-issues
  caches stay in sync with set/delete events (other tabs, CLI writes).
- Add a read-only Metadata strip to the issue detail sidebar; hidden when
  the issue has no keys so it stays quiet in the common case.

Co-authored-by: multica-agent <github@multica.ai>

* feat(runtime): teach agents to read/write issue metadata (MUL-2017)

Add an `## Issue Metadata` section to the runtime brief plus a
`metadata list` step on entry and a `metadata set`/`delete` step on
exit. Section only emits when the task carries an issue id (comment- or
assignment-triggered); chat / quick-create / run-only autopilot stay
clean so they don't fire failing CLI calls.

Co-authored-by: multica-agent <github@multica.ai>

* fix(issues): bump metadata migration to 105 and drop attempts as example (MUL-2017)

main is now at 104_drop_runtime_timezone; the migrator picks
LatestVersion() by sorted filename, so a slot before the tail would
let DBs that have already run 099–104 think they're up-to-date while
the issue.metadata column is missing — runtime would then fail with
column does not exist. Renumbering to 105 puts the migration at the
tail and forces it to run.
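
A simplified model of the ordering problem, in Python. It assumes the migrator runs only files whose numeric prefix is greater than the database's latest applied version; the real migrator may differ in detail. Only `104_drop_runtime_timezone` comes from the commit message. The other filenames and the `pending_migrations` helper are hypothetical.

```python
def pending_migrations(filenames: list[str], db_version: int) -> list[str]:
    """Return the migrations this simplified model would still run."""
    def version(name: str) -> int:
        return int(name.split("_", 1)[0])

    # Files are considered in sorted-filename order; anything at or below
    # the database's latest applied version is treated as already applied.
    return [name for name in sorted(filenames) if version(name) > db_version]


existing = ["103_example", "104_drop_runtime_timezone"]  # "103_example" is a placeholder

# A slot before the tail is silently skipped on a DB already at 104.
print(pending_migrations(existing + ["098_issue_metadata"], 104))
# A slot at the tail is picked up.
print(pending_migrations(existing + ["105_issue_metadata"], 104))
```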

Also drop attempts as a positive example across docs/code comments and
test fixtures — the runtime instruction prompt already lists it under
"What NOT to pin" (runtime bookkeeping). Replace with pr_number, which
is in the recommended-keys set, so docs/tests speak the same language
as the prompt.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-21 16:35:45 +08:00
Multica Eve
66c0464140 fix: simplify cloud runtime create form (#3000)
Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-21 16:34:11 +08:00
Bohan Jiang
9a5d8a52f3 fix(timezone): harden hourly-rollup rollout against straight-through migrate MUL-2488 (#2998)
* fix(timezone): harden hourly-rollup rollout against straight-through migrate

MUL-2488

PR #2968 introduced the new task_usage_hourly rollup but assumed operators
would stop migrate between 102 and 103 to run the one-shot
cmd/backfill_task_usage_hourly. Two pieces made that unsafe in practice:

1. The Dockerfile only shipped server / multica / migrate, so a deployed
   container has no backfill binary to run between phases.
2. cmd/migrate has no per-version stop, and entrypoint.sh runs `migrate up`
   to the latest version, so 103 silently drops the legacy daily rollups
   even when nobody ran the backfill — leaving usage dashboards at zero
   despite source data being intact in task_usage.

Changes:

- Build cmd/backfill_task_usage_hourly into the runtime image alongside
  the other binaries so operators can `docker exec` the backfill instead
  of needing a source checkout.
- Add a fail-closed plpgsql guard at the top of migration 103 that
  aborts the migration when task_usage has rows but task_usage_hourly is
  empty. Fresh databases (no task_usage rows) are exempt because the new
  triggers from 102 will populate the hourly table on the first event.

Already-applied databases are unaffected — schema_migrations tracks by
version only, so 103 is not re-run.

Co-authored-by: multica-agent <github@multica.ai>

* fix(timezone): use watermark coverage for hourly-rollup guard

The previous check only required `task_usage_hourly` to be non-empty,
which an interrupted backfill or a manual `rollup_task_usage_hourly_window`
call both satisfy. The completion signal we actually trust is
`task_usage_hourly_rollup_state.watermark_at` — backfill only stamps it
to `now() - 5 min` after every monthly slice succeeded, and the cron
worker only advances it on a real tick. Default after migration 101 is
`1970-01-01`, so an unrun or partial backfill is trivially detected.

Also corrects the comment about fresh-install behavior: the triggers in
102 only enqueue dirty keys for agent_task_queue / issue / task_usage
DELETE — they do not write hourly rows. INSERT/UPDATE flows through the
`updated_at` watermark window of `rollup_task_usage_hourly()`, which
only runs once the operator registers it as a pg_cron job.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-21 16:26:42 +08:00
Multica Eve
51b3c5291f feat: add env-gated cloud runtime launcher (MUL-2453) (#2995)
* feat: add env-gated cloud runtime launcher

Co-authored-by: multica-agent <github@multica.ai>

* fix: address cloud runtime frontend nits

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-21 15:41:31 +08:00
Bohan Jiang
51c6e90363 docs: finish /projects link fix + tidy AWS_ENDPOINT_URL description (#2996)
Followup to #2979. One missed /issues → /projects link in agents.mdx
plus two AWS_ENDPOINT_URL row nits (URL/URLs repetition and trailing
period) in SELF_HOSTING_ADVANCED.md and the Chinese self-hosting page.

MUL-2498

Co-authored-by: multica-agent <github@multica.ai>
2026-05-21 15:35:39 +08:00
YYClaw
614dfae884 MUL-2488 feat(timezone): Scheduling / Viewing two-layer timezone architecture (#2968)
* docs(timezone): add scheduling/viewing timezone architecture RFC

* feat(db): replace daily rollups with task_usage_hourly, add user.timezone

Migrations 100-104: add "user".timezone (Viewing tz), build the UTC
hourly task_usage_hourly rollup with its pipeline, drop the legacy
task_usage_daily / task_usage_dashboard_daily pipelines, and drop the
agent_runtime.timezone column. Report queries now slice day boundaries
at read time by the caller-supplied @tz instead of materialising in a
fixed tz. Regenerate sqlc.
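
Slicing day boundaries at read time can be illustrated with a small Python sketch. It assumes hourly buckets keyed by their UTC start time and uses a fixed-offset timezone for simplicity. The `daily_totals` helper and the sample numbers are illustrative, not taken from the report queries.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone, tzinfo


def daily_totals(hourly: dict[datetime, int], tz: tzinfo) -> dict[str, int]:
    """Group UTC hourly buckets into calendar days of the viewer's timezone."""
    totals: dict[str, int] = defaultdict(int)
    for bucket_start_utc, amount in hourly.items():
        # The stored bucket stays in UTC; only the grouping key is localized.
        local_day = bucket_start_utc.astimezone(tz).date().isoformat()
        totals[local_day] += amount
    return dict(totals)


hourly = {
    datetime(2026, 5, 20, 15, tzinfo=timezone.utc): 10,
    datetime(2026, 5, 20, 16, tzinfo=timezone.utc): 5,
    datetime(2026, 5, 20, 17, tzinfo=timezone.utc): 1,
}
utc_plus_8 = timezone(timedelta(hours=8))
print(daily_totals(hourly, timezone.utc))  # one day in UTC
print(daily_totals(hourly, utc_plus_8))    # split across two local days
```

The same stored rows answer both views, which is the point of keeping the rollup in UTC and applying the timezone only when reading.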

* feat(server): add task_usage_hourly backfill command

Replace the two legacy backfill commands (daily / dashboard_daily) with
a single backfill_task_usage_hourly that loads historical task_usage
into the new UTC hourly rollup, sliced per workspace.

* refactor(server): resolve viewing timezone in report handlers

Report handlers resolve the Viewing tz per request (?tz query param,
then user.timezone, then UTC) and pass it to the hourly-rollup queries.
Drop the UseDailyRollup feature flags and the old raw-scan/daily-rollup
dual paths, remove the /api/usage endpoints, and stop the daemon from
reporting and the runtime handler from accepting host timezone.
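
The resolution order described above (`?tz`, then `user.timezone`, then UTC) amounts to a first-non-empty lookup. A minimal Python sketch follows; the function name is hypothetical, and treating blank strings as "not set" is an assumption. Validation of the timezone name is deliberately left out because the commit does not describe it.

```python
from typing import Optional


def resolve_viewing_tz(query_tz: Optional[str], user_tz: Optional[str]) -> str:
    """Pick the Viewing timezone: request parameter, then user setting, then UTC."""
    for candidate in (query_tz, user_tz):
        if candidate and candidate.strip():
            return candidate.strip()
    return "UTC"
```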

* refactor(core): switch report queries to viewing timezone

API client and dashboard/runtime queries send ?tz with each report
request, the user schema/types carry the new timezone field, and the
runtime timezone field/mutation is removed.

* feat(views): add viewing timezone preference and UI

Add the useViewingTimezone hook and a Timezone setting in Preferences;
report charts and the dashboard week boundary follow the viewer tz.
Remove the runtime detail timezone editor and its locale strings.

* fix(test): update fixtures and stabilize tests for timezone refactor

The timezone architecture refactor changed several types without
updating dependent test code:

- RuntimeDevice no longer has a timezone field — drop it from the
  create-agent-dialog runtime fixture.
- User now requires a timezone field — add it to the apps/web mockUser
  fixture.
- The PreferencesTab timezone tests asserted on the async save handler
  (PATCH then store update) with a bare expect, racing the mutation's
  settle callback, and timed out querying the Select's ~600-option IANA
  list on a loaded CI runner. Wrap the assertions in waitFor and extend
  the timeout for those three tests.

* docs(timezone): document self-host migration order and trigger invariant

Add a SELF-HOST UPGRADE ORDER runbook to the backfill command's package
comment: applying migrations 100-104 in a single migrate-up drops the
legacy daily rollups before the hourly backfill runs, leaving dashboards
empty until cron catches up.

Add an INVARIANT comment on trg_atq_dirty_hourly noting that agent_id
must be added to the trigger's OF list if it ever becomes mutable,
otherwise dirty buckets for the old agent_id are silently missed.

* style(runtimes): drop trailing blank line in runtime-detail
2026-05-21 15:33:47 +08:00
Tom Qiao
d0666138ec docs: fix broken anchor links and truncated env-var description (#2979)
Three docs issues spotted while reading:
- agents.mdx and agents.zh.mdx: [project](/issues) -> [project](/projects)
- cloud-quickstart.mdx: troubleshooting anchor #daemon-cant-reach-the-server
  did not exist; the heading is "Daemon can't connect to the server"
- SELF_HOSTING_ADVANCED.md and getting-started/self-hosting.zh.mdx:
  AWS_ENDPOINT_URL row description was truncated; append " URLs."

Co-authored-by: Tom Qiao <tomqiaozc@users.noreply.github.com>
2026-05-21 15:32:58 +08:00
Multica Eve
41cb91abd9 feat: add cloud runtime fleet proxy API (MUL-2453) (#2986)
* feat: add cloud runtime fleet proxy API

Co-authored-by: multica-agent <github@multica.ai>

* test: cover cloud runtime handler nits

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-21 15:06:10 +08:00
Bohan Jiang
1c892aa3f9 fix(projects): default project view to compact (#2975)
The compact view was the original list layout and is what users expect
on this page; the post-#2840 default of comfortable changed long-standing
behavior. Reset the unpersisted default (and the cross-workspace fallback
in `merge`) back to compact. Updates the view-store tests accordingly.

MUL-2464

Co-authored-by: multica-agent <github@multica.ai>
2026-05-21 14:07:40 +08:00
Anderson Shindy Oki
65feb890b8 feat: Add project list responsive compact and comfortable views (MUL-2464) (#2840)
* feat: Add project screen compact and comfortable views

* wip

* i18n

* refactor and add search

* refactor
2026-05-21 13:56:11 +08:00
兰之
7e55813460 fix(ui): show tooltip when create-issue button is disabled due to empty title (#2943)
Co-Authored-By: Xiaomi MiMo V2.5 Pro
2026-05-21 13:43:31 +08:00
Bohan Jiang
7f9e4e829d feat(comments): thread-internal --tail pagination + reply cursor (MUL-2421) (#2846)
* feat(comments): thread-internal pagination via --tail + reply cursor (MUL-2421)

Long threads inside a single issue still forced agents to read every reply
once they used --thread, even after MUL-2387 fixed cross-thread noise. This
adds reply-level paging so a 200-reply thread can be navigated tail-first
without dragging the whole conversation into prompt context.

- New SQL query ListThreadCommentsForIssuePaged: same recursive root walk
  as the legacy thread query, but caps reply count and supports an
  (created_at, id) composite cursor. Root is unconditional — even tail=0
  emits it so the reader keeps the "what is this thread about" context.
- Handler ListComments: parses `tail` (non-negative, ThreadTailSet flag
  preserves the tail=0 intent), threads it through to the paged query,
  and re-uses X-Multica-Next-Before / X-Multica-Next-Before-Id for the
  reply cursor. Cursor's meaning is now context-dependent: thread cursor
  under --recent, reply cursor under --thread + --tail.
- CLI: new --tail flag (only valid with --thread; mutually exclusive
  with --recent), reply-cursor semantics for --before / --before-id when
  paired with --thread + --tail, stderr label flips to "Next reply cursor"
  so an operator copy-pasting the cursor knows which scope it scrolls.
- Tests cover the new contract: tail=N keeps newest N + root, tail=0 is
  root-only, anchor on a nested reply still walks up, reply cursor
  scrolls older replies page-by-page, since combined with tail filters
  after the cut, and the negative-flag-combination matrix.

Out of scope: prompt template update to hint at `--thread <id> --tail 30`
on long threads — separate follow-up per the issue.

Co-authored-by: multica-agent <github@multica.ai>

* fix(comments): only emit reply cursor when older reply exists (MUL-2421)

The thread-tail path emitted `X-Multica-Next-Before` whenever the page
filled to exactly the requested reply count, even when there was nothing
older to scroll to. So `--thread <root> --tail 3` on a thread with
exactly 3 replies sent a cursor that, when followed, returned just the
root — a wasted round-trip that surfaced as a phantom "older replies"
affordance in the agent prompt.

Switch to a `reply_limit + 1` probe: ask the SQL for one extra row, trim
the oldest overflow before responding, and only emit the cursor when an
older reply actually existed. The exact-boundary case (replyCount ==
tail with no overflow) now returns no cursor.
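
The `reply_limit + 1` probe can be sketched in Python as follows. This is an in-memory illustration of the idea, not the handler or the SQL: `Reply`, `tail_page`, and the integer timestamps are hypothetical, the thread root is left out, and the sketch is meant for `tail >= 1`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Reply:
    id: int
    created_at: int  # hypothetical integer timestamp


def tail_page(replies: list[Reply], tail: int) -> tuple[list[Reply], Optional[Reply]]:
    """Return the newest `tail` replies (oldest first) and a cursor, or None."""
    newest_first = sorted(replies, key=lambda r: (r.created_at, r.id), reverse=True)
    probe = newest_first[: tail + 1]          # ask for one extra row
    has_older = len(probe) > tail             # the extra row proves an older reply exists
    page = list(reversed(probe[:tail]))       # trim the overflow, restore oldest-first order
    # The cursor points at the oldest retained reply, and only when something older exists.
    cursor = page[0] if has_older and page else None
    return page, cursor
```

With exactly three replies and `tail=3` the probe returns three rows, so no cursor is emitted. A fourth, older reply makes the probe overflow and produces a cursor at the oldest reply kept on the page.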

Also documents `--thread/--tail/--recent/--before` and the cursor
semantics in CLI_AND_DAEMON.md, which was the second must-fix in the
MUL-2421 review.

Co-authored-by: multica-agent <github@multica.ai>

* fix(comments): suppress reply cursor when --since covers older replies (MUL-2421)

In the thread + tail + since path the server still emitted a reply cursor
whenever there was an older reply on disk, regardless of `since`. If the
oldest retained reply on the page was already `<= since`, every older
reply was guaranteed to be filtered out too, so the next page only ever
returned the root — wasting round-trips until the agent walked the whole
pre-`since` history. Mirror the recent + since suppression: when
`replies[0].CreatedAt <= since`, drop the cursor.

Test covers the exact case from Elon's review: tail=2 overflow, body
keeps a fresher reply, but the cursor target (oldest retained reply) is
already past `since` — header must be empty.

Co-authored-by: multica-agent <github@multica.ai>

* feat(prompt): default comment-trigger reads to --thread --tail 30 (MUL-2421)

Comment-triggered agents previously defaulted the trigger-thread read to
the unbounded `--thread <id> --output json`, which dumps the full thread
into the prompt — exactly the kind of context bloat MUL-2387 fixed at the
cross-thread layer but never bounded inside a single thread.

Use the new `--tail` flag landed earlier in this PR (server + CLI) as the
default for both the per-turn prompt and the runtime-config Workflow:

- `--thread <trigger-id> --tail 30 --output json` is the new default.
  Root is always included so "what is this about" context survives.
- If 30 replies aren't enough, the prompt now spells out the reply
  cursor: re-feed the stderr `Next reply cursor: --before <ts>
  --before-id <reply-id>` pair back to walk older replies.
- `--recent 20` stays as the cross-thread background fallback, with an
  explicit callout that the same `--before` / `--before-id` flags walk
  *threads* (not replies) in that mode.
- Available Commands core line now surfaces `--tail N` and both stderr
  cursor labels so non-workflow callers also discover the flag.
- `--since` callouts reflect the post-MUL-2421 combinable mode names
  (`--thread --tail` / `--recent`).

Tests (`prompt_test.go`, `execenv_test.go`) pin the new defaults and add
a regression guard against the unbounded `--thread` recipe sneaking back
in.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-21 13:43:15 +08:00
Bohan Jiang
8a135d2982 fix(ws): truncate unparseable frame payload in client warn log (#2974)
The post-#2946 onmessage guard logs the raw event.data alongside the
warning. A malformed or rogue server can stream arbitrarily large
garbage and bloat the renderer / desktop main-process log buffers, so
cap the logged payload to the first 200 chars and append a
"(truncated, N chars total)" suffix when truncation occurs.

MUL-2490
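
The cap described above can be sketched as a small helper. This is a Python illustration only (the client itself is not shown in this log): the function name is hypothetical, and the single space before the suffix is an assumption about the exact formatting.

```python
MAX_LOGGED_CHARS = 200  # cap stated in the commit message


def truncate_for_log(payload: str, limit: int = MAX_LOGGED_CHARS) -> str:
    """Return short payloads unchanged; otherwise keep the first `limit` chars plus a suffix."""
    if len(payload) <= limit:
        return payload
    return f"{payload[:limit]} (truncated, {len(payload)} chars total)"
```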

Co-authored-by: multica-agent <github@multica.ai>
2026-05-21 13:37:42 +08:00
YOMXXX
83e90c9530 fix(ws): log auth frame write failures (#2946)
2026-05-21 13:33:12 +08:00
Bohan Jiang
ef6a944063 fix(cli): accept slug + short UUID prefix in workspace get/update/member (#2972)
* fix(cli): accept slug + short UUID prefix in workspace get/update/member (MUL-2385)

`workspace list` shows the 8-char short UUID prefix, name, and slug by
default; `workspace get`/`update`/`member list` only accepted full UUIDs.
That broke the natural list -> get flow: every value the user could copy
from list output was rejected. They had to either rerun list with
`--full-id` or parse the JSON output -- both implementation-detail level
operations.

Extend `resolveWorkspaceByIDOrSlug` with a short UUID prefix fallback
(>=4 hex chars, ambiguous matches return all candidates), introduce
`resolveWorkspaceRef`/`resolveWorkspaceArg` helpers that fetch the
caller's accessible workspaces and resolve UUID/slug/prefix in one call,
and wire them into get/update/member list (switch already used the same
list-then-resolve pattern). Full UUIDs short-circuit the extra
`/api/workspaces` round trip; access control remains on the downstream
endpoint.
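
The lookup order can be sketched in Python. It assumes exact ID or slug matches win and the short-prefix match is only a fallback, as the wording above suggests. `Workspace`, `resolve_workspace`, and the sample IDs are hypothetical, and the full-UUID short-circuit that skips the list call is not modelled.

```python
import re
from dataclasses import dataclass

PREFIX_RE = re.compile(r"[0-9a-f]{4,}")  # at least 4 hex chars, per the commit message


@dataclass(frozen=True)
class Workspace:
    id: str
    slug: str


def resolve_workspace(ref: str, workspaces: list[Workspace]) -> list[Workspace]:
    """Return every workspace matching `ref`; more than one result means ambiguous."""
    needle = ref.strip().lower()
    exact = [w for w in workspaces if w.id == needle or w.slug == needle]
    if exact:
        return exact
    if PREFIX_RE.fullmatch(needle):
        # Fallback: short UUID prefix; all candidates are returned so the
        # caller can report the ambiguity.
        return [w for w in workspaces if w.id.startswith(needle)]
    return []
```

Returning a list rather than a single match leaves the "ambiguous prefix" message to the caller, which can print all candidates.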

Also add a one-line tip after `workspace list` table output pointing
users at get/update/switch with the same identifier columns, and
broaden the command Use strings to `<id|slug|prefix>` so help reflects
the new behavior.

Refs https://github.com/multica-ai/multica/issues/2750

Co-authored-by: multica-agent <github@multica.ai>

* chore(cli): include prefix hint in workspace list footer

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-21 13:08:44 +08:00
YOMXXX
ed2957ddf8 fix(claude): record result model usage (#2899)
2026-05-21 13:00:12 +08:00
iYuan
2f1f90c11a fix(agent): retry codex semantic inactivity fresh (#2593)
2026-05-20 20:03:39 +08:00
Bohan Jiang
688dcb017c fix(agents): drop confusing "default" badge from model picker (MUL-2477) (#2938)
The model dropdown already exposes a "Default (provider)" option meaning
"follow the CLI's current selection". Tagging the runtime's preferred
model with a small "default" chip created two competing notions of
"default" in the same UI and confused users. Remove the chip from both
the create-agent ModelDropdown and the inspector ModelPicker; keep the
underlying RuntimeModel.default flag intact since thinking-prop-row
still uses it as a fallback heuristic.

Co-authored-by: multica-agent <github@multica.ai>
2026-05-20 18:07:57 +08:00
Multica Eve
cf000d1e93 docs(changelog): add 2026-05-20 release notes (#2932)
Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
v0.3.4
2026-05-20 17:28:08 +08:00
Naiyuan Qing
317bca40c1 feat(squads): show skeleton on squad detail initial load (#2930)
Replaces the plain "Loading..." text fallback in SquadDetailPage with a
skeleton that mirrors the loaded page's two-column layout (left inspector
+ right tabs panel), matching the SquadsListSkeleton work shipped in #2890.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-20 17:21:52 +08:00
Bohan Jiang
8d4f4caf4a MUL-2338 fix(comments): allow agent self-mention to enqueue cross-issue handoff (#2928)
* fix(comments): allow agent self-mention to enqueue cross-issue handoff

The @mention path in CreateComment unconditionally skipped any
self-mention. That dropped the child→parent handoff between issues
assigned to the same agent: the child run posted `@J` on the parent
issue, the guard tripped, and the parent's J was never woken — the chain
silently broke.

Drop the self-trigger `continue` in the agent mention branch. Runtime
ready / private-agent gate / HasPendingTaskForIssueAndAgent dedup all
remain, so a same-issue self-mention while a queued or dispatched task
exists is still deduped; a running task no longer pre-empts a new
follow-up (the existing queue coalescing handles that).

Three regression tests:
  - cross-issue self-mention enqueues a task on the target issue
  - same-issue self-mention while running queues a follow-up
  - same-issue self-mention with a pre-existing queued/dispatched task
    is deduped

MUL-2338

Co-authored-by: multica-agent <github@multica.ai>

* test(handler): assign per-workspace issue number in self-mention fixture

The fixture inserts two issues in the same test workspace; without an
explicit number both default to 0 and the second insert violates
uq_issue_workspace_number, taking the backend CI job down on PR #2928.

Mirror the workspace-counter advancement pattern from
issue_scheduled_test.go so each fixture issue gets a unique number.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-20 17:18:41 +08:00
YOMXXX
34f16e2c7a fix(opencode): deny interactive questions in daemon mode (#2878)
* fix(opencode): deny interactive questions in daemon mode

* fix(opencode): avoid permission env ordering bypass
2026-05-20 17:17:31 +08:00
Naiyuan Qing
85e363370e Revert "feat(issues): Working filter + agent-working badge on board (MUL-2452…" (#2927)
This reverts commit dee5c7cf50.
2026-05-20 16:47:41 +08:00
Naiyuan Qing
b040165f4e feat(squads): skeleton loader + AlertDialog archive confirm (MUL-2437) (#2890)
* feat(squads): skeleton loader + AlertDialog archive confirm (MUL-2437)

- Replace `Loading...` text on the squads list with a Skeleton placeholder
  matching the SquadCard shape (avatar + title + subtitle), aligning with
  the Agents / Dashboard pattern.
- Replace the native `confirm()` on the squad detail Archive button with
  the project's AlertDialog (destructive variant, pending-disabled, i18n
  copy interpolating the squad name).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* fix(squads): drop misleading restore copy from archive confirm (MUL-2437)

Archive is irreversible — there is no unarchive command (see
apps/docs/content/docs/squads.mdx:113). Aligns dialog copy with
docs: tells the user the action can't be undone and to create a
new squad if they need the routing back.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-20 16:43:58 +08:00
Naiyuan Qing
dee5c7cf50 feat(issues): Working filter + agent-working badge on board (MUL-2452) (#2924)
* feat(issues): surface "agent working" on board + add Working filter (MUL-2452)

Adds a brand-color "agent working" badge to board cards / list rows so
users can see at a glance which issues have an active agent task, plus a
new "Working" toggle on the `/issues` and `/my-issues` headers (next to
the existing scope segmented control) that filters to those issues. The
toggle shows an avatar stack of the agents currently active on the
current surface + scope. Pure frontend: re-shapes the existing
workspace-wide `agentTaskSnapshot` cache via two new selectors
(`activeTasksByIssueOptions` / `workingIssueIdsOptions`), no new SQL,
endpoint, or DB field; WS `task:*` events already invalidate the
snapshot so the badge / filter update in realtime.

Project detail page keeps the per-card badge but intentionally omits the
header toggle (`showWorkingToggle={false}`) to leave the project
surface's filter dimensions unchanged.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* fix(issues): working filter column header reflects filtered count (MUL-2452)

Assignee-grouped board column headers kept showing the unfiltered cache
total when Working was on, because `PaginatedAssigneeBoardColumn` passed
`useLoadMoreByAssigneeGroup`'s cache-derived `total` straight to
`BoardColumn`. The hook still needs the cache total for hasMore, but the
displayed count must follow the visible-after-filter set.

Split the two: when Working is active the column header now uses
`group.totalCount` (set by applyWorkingFilterToGroups) for the assignee
path, and `issueIds.length` for the status path. Load-more keeps reading
from cache so paginated columns still see the full server total.

Regression tests cover applyWorkingFilterToGroups (total rewrite +
empty-group preservation), filterIssues workingOnly combinations, and an
end-to-end assertion via IssuesPage that proves the column header equals
the filtered count, not the cached value.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-20 16:35:58 +08:00
Bohan Jiang
aeb284cbeb feat(runtime): teach agents the parent/sub-issue protocol (MUL-2338) (#2918)
* feat(runtime): teach agents the parent/sub-issue protocol (MUL-2338)

Adds a Parent / Sub-issue Protocol section to the runtime brief built by
`buildMetaSkillContent`, emitted whenever the agent is running on a real
Multica issue (assignment- or comment-triggered). Two behaviors are now
documented for every issue-bound agent:

- A. When wrapping up a child issue, post the final result and switch to
  `in_review` on this issue first, then post a single top-level comment
  on the parent. Mention the parent assignee only when it is another
  agent on a still-open parent — never self-mention, never @ member /
  squad, never re-trigger a `done` / `cancelled` parent.
- B. When creating sub-issues, choose `--status backlog` for sub-issues
  that must wait and `--status todo` for the one to start immediately;
  promote with `multica issue status <id> todo` when its turn comes.

The signal is explicitly framed as best-effort — no server-side state
sync, no claim of a guaranteed handshake. The section is skipped for
chat, quick-create, and run-only autopilot runs, which have no
parent/child semantics.

Tests in runtime_config_test.go assert that the section is present in
both issue workflows, absent in the three non-issue modes, and that the
wording does not introduce a non-existent `multica issue list --parent`
command or promise a reliable handshake.

Co-authored-by: multica-agent <github@multica.ai>

* fix(runtime): split Step A of parent/sub-issue protocol by trigger type (MUL-2338)

Comment-triggered runs were inheriting an unconditional
`multica issue status <this-issue-id> in_review` from Step A, which
conflicts with the comment-triggered workflow rule "Do NOT change the
issue status unless the comment explicitly asks for it" (Elon's blocking
review on PR #2918). Step A now branches on trigger type:

- Assignment-triggered: keep "post final results + flip in_review".
- Comment-triggered: complete the reply per the existing workflow rule,
  only flip status when the triggering comment asked for it, and gate
  the parent-notification steps on actually closing out child work.

Tests lock the boundary: comment-triggered briefs must not contain the
unconditional in_review command, must echo the existing status
guardrail inside Step A, and must spell out the "closing out" gate.
Assignment-triggered briefs still carry the unconditional flip.

Co-authored-by: multica-agent <github@multica.ai>

* fix(runtime): simplify parent/sub-issue mention rule to always @ parent assignee (MUL-2338)

Per Bohan's directive on PR #2918: the per-case mention table (same agent /
member / squad / closed parent) is overkill prompt complexity. Replace it
with a single rule: always @mention the parent's assignee using the URL
that matches assignee_type. The platform's existing run dedup handles
re-triggers, and a single rule is easier for agents to follow predictably.

Preserves the existing comment-triggered boundary (Step A still does NOT
add an unconditional in_review flip on comment-triggered runs).

Co-authored-by: multica-agent <github@multica.ai>

* refactor(runtime): compress parent/sub-issue protocol to 3-rule convention (MUL-2338)

Drop the spec-flavored A/B sub-headings and per-case mention table; keep
three numbered rules (close out child, notify parent, pick backlog vs
todo) plus a one-line best-effort preamble. The comment-triggered
branch still re-asserts the "do not change status unless asked"
guardrail and gates parent notification on actually closing out child
work; the assignment-triggered branch still flips to `in_review`.

Section is now 7 lines instead of 29. A new TestParentSubIssueProtocolIsCompact
guards the ≤10-line ceiling so this stays a convention, not a spec.

Co-authored-by: multica-agent <github@multica.ai>

* fix(runtime): make sub-issue creation rule unconditional in parent/sub-issue protocol (MUL-2338)

Elon's review on PR #2918: the preamble previously gated all three
rules on the current issue having `parent_issue_id`, but rule 3
(creating sub-issues) needs to reach top-level parents that have no
parent themselves — that is exactly where the `todo` vs `backlog`
decision matters most. Move the gate from the preamble onto rules 1
and 2 per-rule; rule 3 now applies to any issue-bound run. Section
stays at 7 newlines (≤10).

Co-authored-by: multica-agent <github@multica.ai>

* refactor(runtime): unify parent/sub-issue protocol as mechanism description (MUL-2338)

Drop the if/else split between assignment- and comment-triggered runs in
the Parent / Sub-issue Protocol section: both runs now read the same
two-rule description of how the parent/child mechanism works. The
comment-triggered workflow rule "Do NOT change the issue status unless
the comment explicitly asks for it" naturally short-circuits the parent
notification (no status flip → not closing out the child → skip), so the
protocol no longer needs to branch on TriggerCommentID.

Tests collapse the two trigger-specific cases into one parameterized
test, and the assignment vs comment status-flip invariants are now
anchored on the real workflow command (with substituted issue id)
instead of the protocol's removed `<this-issue-id>` placeholder.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-20 16:20:33 +08:00
Angular
1f978bf1ec feat(autopilot): link created issues to projects (#2908)
* feat(autopilot): link created issues to projects

* test(autopilot): cover project flag
2026-05-20 15:37:23 +08:00
Bohan Jiang
ffc0c5ab2e docs(agent-inspector): sync thinking_level comments with no-override semantics (MUL-2339) (#2923)
Follow-up to #2919 review nits — comments still described the empty
thinking_level as "use runtime default" and claimed ThinkingPicker callers
guaranteed non-empty levels. Both were stale after the semantics changed:

- packages/core/types/agent.ts: clarify that "" clears the override and
  the local CLI config / built-in default decides at runtime.
- thinking-picker.tsx: document that the stale-orphan clear path in
  ThinkingPropRow mounts the picker with an empty levels list plus a
  persisted value, so callers do not guarantee non-empty levels.

Co-authored-by: multica-agent <github@multica.ai>
2026-05-20 15:34:27 +08:00
Bohan Jiang
b7082a01f1 fix(issues): retry button targets the row's agent (MUL-2457) (#2921)
* fix(issues): retry button targets the row's agent, not the assignee (MUL-2457)

The execution log retry button used to re-fire the issue's current
assignee instead of the agent that actually ran the clicked row. After
a reassignment, or for squad workers / @-mention agents, the rerun
landed on the wrong agent.

POST /api/issues/{id}/rerun now accepts an optional task_id: when set,
the rerun targets that task's agent (and reuses its leader/worker
role). An empty body keeps the assignee-driven CLI/API contract.

The execution-log retry button passes task.id, so per-row retry always
fires the correct agent. enqueueMentionTask gained a forceFreshSession
parameter so the new mention-path rerun keeps the same fresh-session
contract as the assignee path.
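
A minimal sketch of the request contract described above. Only the endpoint path and the optional `task_id` field come from this commit message; the helper name `buildRerunRequest` and the `RerunRequest` shape are hypothetical.

```typescript
// Hypothetical request shape; only the path and `task_id` are from the
// commit message.
interface RerunRequest {
  method: "POST";
  url: string;
  body: { task_id?: string };
}

function buildRerunRequest(issueId: string, taskId?: string): RerunRequest {
  // An empty body keeps the assignee-driven contract. Setting task_id
  // retargets the rerun at the agent that ran that specific task.
  const body: { task_id?: string } = {};
  if (taskId) body.task_id = taskId;
  return {
    method: "POST",
    url: `/api/issues/${encodeURIComponent(issueId)}/rerun`,
    body,
  };
}

// Per-row retry from the execution log passes the row's task id.
const perRowRerun = buildRerunRequest("issue-1", "task-42");
// Callers that omit task_id still rerun the current assignee.
const assigneeRerun = buildRerunRequest("issue-1");
```

Keeping `task_id` optional means existing CLI/API callers need no change.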

Co-authored-by: multica-agent <github@multica.ai>

* fix(issues): inherit trigger provenance + fix cross-issue test (MUL-2457)

Address review feedback on PR #2921:

1. RerunIssue now inherits TriggerCommentID from the source task when
   sourceTaskID is valid. Without this, a per-row rerun of a comment-
   or mention-triggered task degrades into a generic issue run because
   the daemon's buildCommentPrompt path keys on TriggerCommentID. The
   inherited summary is rebuilt naturally inside the enqueue helpers
   (buildCommentTriggerSummary derives it from the comment ID).
2. The new cross-issue rejection test inserted a second issue without
   `number`, hitting uq_issue_workspace_number on a same-workspace
   collision with the fixture's issue. Both inserts now claim the next
   available per-workspace number (MAX(number)+1) — matching the
   pattern used by notification_listeners_test.

Added TestRerunIssueInheritsTriggerCommentFromSourceTask to lock the
trigger provenance contract.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-20 15:30:03 +08:00
Angular
314e91fa6d fix(chat): guard optimistic task message ids (#2901)
2026-05-20 15:18:42 +08:00
Bohan Jiang
68270e238e MUL-2339: polish(agent-inspector): optimistic updates + picker layout + thinking-default semantics (#2919)
* polish(agent-inspector): optimistic updates + picker layout + thinking-default semantics

Round of cleanup on the agent inspector pickers after using them end-to-end:

1. **Optimistic updates** (`agent-detail-page.tsx`)
   The `handleUpdate` callback that backs every inspector picker
   (thinking / model / visibility / concurrency / runtime / name /
   description / avatar) was strictly sequential:
   `await api.updateAgent → invalidateQueries → toast.success`. Each pick
   waited 0.5-2s for the network round trip before the trigger chip
   updated, which read as visible UI lag.
   Snapshot the cached agent list, patch the matching agent
   synchronously via `setQueryData`, then run the network request in
   the background. On error roll back to the snapshot before the toast
   surfaces the cause. All inspector pickers now respond instantly.

2. **Block-in-inline fix in Model + Thinking pickers**
   `PickerItem` wraps its children in a flex `<span>`. The picker
   bodies had `<div>` children, which is block-in-inline (invalid
   HTML5) and triggers a browser layout quirk that off-aligns
   descendants — model IDs floated to the center under their labels
   in ModelPicker, descriptions indented unevenly under levels in
   ThinkingPicker. Replace the inner `<div>`s with `<span block
   text-left>` so the layout is deterministic across rows.

3. **Visual polish in Thinking picker**
   Label was `font-medium` at the parent's default `text-sm` (14px),
   chunky next to the 10px description. Drop to `text-[13px]`, bump
   description to `text-[11px] leading-snug` with `mt-0.5` so the
   contrast between rows feels less jarring.

4. **Match Model picker's row typography to Thinking's**
   Same `text-[13px]` for label + `text-[10px] mt-0.5` for the model
   ID. Both pickers now read as the same component family.

5. **"Default" semantics: follow CLI config, not model factory default**
   The chip displayed a "Default" / "default" badge when no
   `thinking_level` was set, alongside a `[default]` chip on the
   model's factory-advertised default option in the menu. That was
   misleading: when Multica omits `--effort` (because picker is
   unset), it's the user's *local CLI config* (claude/codex) that
   decides the reasoning level — not the model's factory default.
   Showing "medium [default]" while the user has xhigh in their CLI
   config lies about what actually fires at the API.
   - Trigger label: "Default" → "Follow CLI config" (zh: "跟随 CLI 配置")
   - Footer clear button: "Use model default" → "Follow CLI config"
   - Footer tooltip: explicitly mentions claude/codex CLI config
   - Inline `[default]` badge on the factory-default option: removed
   - `defaultLevel` prop chain (picker + prop-row + test): cleaned up
     as now-dead code

6. **Stop hiding the Thinking row while discovery loads**
   `if (levels.length === 0 && !value) return null` hid the row
   while the runtime-models query was still in flight, which
   subscribed-then-unsubscribed from useQuery in such a way that
   the discovery only fired when the user manually opened the Model
   picker. Gate the early return on `!isLoading && !isFetching` so
   ThinkingPropRow stays mounted (and thus its useQuery keeps
   subscribed) until discovery returns; row appears as soon as
   data arrives, no Model-picker tap required.

7. **Drop the inline tooltip on Thinking picker items**
   The same description was rendered both inline under the label
   (always visible) and as a hover tooltip (overlapping the next
   row). The hover bubble was redundant — removed.

Tests
- `pnpm --filter @multica/views test thinking-picker` → 7/7 pass after
  renaming the "Default" assertion + clearing the unused defaultLevel
  test prop.
- `pnpm --filter @multica/views typecheck` clean.

* fix(test): align thinking-prop-row tests with renamed copy + loading-aware row gate

CI surfaced 3 broken assertions in `thinking-prop-row.test.tsx` —
all consequences of the polish PR's behaviour changes that the test
file hadn't tracked:

- "hides the row when ... no thinking levels and nothing is persisted"
  The row now stays mounted while runtime-models discovery is in
  flight (so the useQuery subscription actually survives long enough
  to issue the request — fixes the bug where Thinking only appeared
  after manually opening the Model picker). The test asserted
  absence only after `initiate` was called, but loading is still in
  progress at that point. Wrap the absence assertion in `waitFor`
  so it waits for the row to disappear after the query settles.

- "clears the orphan value via the picker footer"
  Tooltip copy changed from "Clear and fall back to this model's
  default reasoning level" → "Clear the override and let the local
  CLI config decide the reasoning level". Update the regex.

- "renders the row with \"Default\" when value is empty"
  Trigger label changed from "Default" → "Follow CLI config" to
  reflect that Multica omits --effort and the local CLI config
  decides. Update the assertion + test name.

`pnpm --filter @multica/views test` → 701/701 pass.

* fix(agent-inspector): drop loading-row gate + per-field optimistic rollback (MUL-2339)

Addressing review feedback on #2919:

- ThinkingPropRow no longer keeps the row visible during discovery.
  The previous explanation ("early return null aborts the useQuery
  subscription") was wrong — React doesn't unmount a component that
  returns null, so hooks (and their subscriptions) stay live. The
  loading-aware gate only succeeded in showing an empty "Follow CLI
  config" row that opened to an empty menu before discovery settled.
  Restore the simple `levels empty && !value -> null` behavior; the
  sibling ModelPicker mounts unconditionally and keeps the shared
  runtime-models query active regardless.

- AgentDetailPage.handleUpdate now rolls back only the fields the
  failing PATCH wrote, instead of restoring a whole-list snapshot.
  A whole-list snapshot rollback discards any concurrent successful
  inspector mutation that landed between snapshot and rollback. Per-
  field rollback + a final invalidate converges the cache on server
  truth without clobbering unrelated optimistic writes.

- Sync the now-stale "use model/runtime default" wording in the
  thinking-related JSDoc and type comments: empty thinking_level is a
  "no override" sentinel — the backend omits --effort and the upstream
  CLI config decides — not a Multica-known default level.
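
The per-field rollback described above can be sketched as follows. This is an illustration under assumptions, not the code in `agent-detail-page.tsx`: the `Agent` fields, the `Map`-based cache stand-in, and `applyOptimistic` are all hypothetical.

```typescript
// Illustrative agent shape; the real type is not shown in this log.
type Agent = { id: string; name: string; model: string; thinking_level: string };

// A tiny in-memory stand-in for the query cache.
const agentCache = new Map<string, Agent>();

// Apply a patch optimistically and return a function that undoes only
// the fields this patch wrote, leaving other fields as they are now.
function applyOptimistic(id: string, patch: Partial<Agent>): () => void {
  const current = agentCache.get(id);
  if (!current) return () => {};
  const previous: Partial<Agent> = {};
  for (const key of Object.keys(patch) as (keyof Agent)[]) {
    previous[key] = current[key];
  }
  agentCache.set(id, { ...current, ...patch });
  return () => {
    const latest = agentCache.get(id);
    if (latest) agentCache.set(id, { ...latest, ...previous });
  };
}

agentCache.set("a1", { id: "a1", name: "Bot", model: "m1", thinking_level: "" });
const rollbackModel = applyOptimistic("a1", { model: "m2" }); // suppose this PATCH fails
applyOptimistic("a1", { name: "Renamed" });                   // a concurrent PATCH that succeeds
rollbackModel();                                              // restores model only
```

A whole-list snapshot taken before the first patch would also have reverted the rename. The sketch omits the final invalidate that reconciles the cache with the server.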

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-20 15:18:34 +08:00
Bohan Jiang
eaf8b14866 fix(installer): post-merge nits from #2881 (MUL-2458) (#2922)
- Capture `brew tap` output and print the same diagnostic tail on
  failure that `brew install` already prints, so #2867-style "no
  signal" reports are gone from both Homebrew failure paths.
- Add a `brew tap` failure regression case to `scripts/install.test.sh`
  and refactor the test runner to share sandbox/curl-stub setup; both
  cases now also assert the diagnostic tail is emitted.
- Move the shell installer test out of the heavy backend job into a
  dedicated `installer` matrix job that runs on `ubuntu-latest` and
  `macos-latest`, since the installer targets macOS/Homebrew and BSD vs
  GNU `tar` / `sed` / `mktemp` differences are the next likely break.
- Surface `MULTICA_INSTALL_DIR`, `MULTICA_BIN_DIR`, and
  `MULTICA_SELFHOST_REF` in `install.sh --help` so `MULTICA_BIN_DIR`
  stops looking like a test-only knob.

Co-authored-by: multica-agent <github@multica.ai>
2026-05-20 15:18:17 +08:00
Jiayuan Zhang
41753d17a2 feat(desktop): pin tab (MUL-2449) (#2914)
* feat(desktop): pin tab — keep parked tabs anchored across navigations (MUL-2449)

Adds tab pinning to the desktop tab bar. Pinned tabs render as icon-only at
the left, suppress the X close button, and intercept any `navigation.push()`
that would change their pathname — those are redirected into a new tab so
the pinned tab stays parked on its original route. Search/hash/back/forward
stay in-tab so pinned filter and drawer state still work.

Implements the FINAL combo from the MUL-2449 RFC §4: right-click menu +
⌘⇧P shortcut (D1 a+c), icon-only visual (D1v i), pathname-change → new tab
with same-path-allowed (D2a/b A), back / refresh allowed (D2c/d A), pinned
auto-cluster left and persist (D3a/b A), pinned can't be X-closed (D3c A),
dedupe respected (D4a A), default Issues tab pinnable (D4b A), drag clamped
to its zone (D4c A), deep link prefers pinned (D4e A).

Store changes:
  - Tab.pinned added; togglePin maintains the "pinned first" invariant by
    inserting at the zone boundary.
  - moveTab clamps cross-zone drags so dnd-kit can't violate the ordering.
  - Persistence bumped v2 → v3 with a defaulting migration (pinned=false).
    Rehydrate sorts pinned-first as a defensive net.
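
A minimal sketch of the two store invariants above, assuming a simplified `Tab` shape and pure functions instead of the real store actions. Only the `pinned` field and the names `togglePin` / `moveTab` come from this commit message.

```typescript
// Illustrative tab shape; only `pinned` is named in the commit message.
interface Tab {
  id: string;
  pinned: boolean;
}

// Toggle a tab's pin state and re-insert it at the pinned/unpinned
// boundary so pinned tabs always come first.
function togglePin(tabs: Tab[], id: string): Tab[] {
  const tab = tabs.find((t) => t.id === id);
  if (!tab) return tabs;
  const rest = tabs.filter((t) => t.id !== id);
  const boundary = rest.filter((t) => t.pinned).length;
  const toggled: Tab = { ...tab, pinned: !tab.pinned };
  return [...rest.slice(0, boundary), toggled, ...rest.slice(boundary)];
}

// Move a tab, clamping the destination index to the tab's own zone so a
// drag can never place an unpinned tab before a pinned one (or vice versa).
function moveTab(tabs: Tab[], from: number, to: number): Tab[] {
  const tab = tabs[from];
  if (!tab) return tabs;
  const pinnedCount = tabs.filter((t) => t.pinned).length;
  const min = tab.pinned ? 0 : pinnedCount;
  const max = tab.pinned ? pinnedCount - 1 : tabs.length - 1;
  const target = Math.min(Math.max(to, min), max);
  const next = tabs.filter((_, i) => i !== from);
  next.splice(target, 0, tab);
  return next;
}
```

Because the insertion index is always the count of pinned tabs among the others, pinning lands a tab at the end of the pinned zone and unpinning lands it at the start of the unpinned zone.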

Navigation:
  - tryRouteToPinnedNewTab compares the active tab router's live pathname
    to the target. Same-pathname push (query / hash / sub-router) falls
    through to the router; different pathname → openTab + setActiveTab
    (foreground; respects dedupe).
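
The routing decision above can be sketched as a pure predicate. The names `pathnameOf` and `shouldOpenNewTab` are hypothetical. The real `tryRouteToPinnedNewTab` also opens and activates the new tab, and the cross-workspace rule is not modeled here.

```typescript
// Strip the query string and hash so only the pathname is compared.
function pathnameOf(target: string): string {
  const cut = target.search(/[?#]/);
  return cut === -1 ? target : target.slice(0, cut);
}

// Returns true when a push from the active tab should be redirected
// into a new tab: the tab is pinned and the pathname would change.
function shouldOpenNewTab(
  pinned: boolean,
  currentPathname: string,
  target: string,
): boolean {
  if (!pinned) return false;
  return pathnameOf(target) !== currentPathname;
}
```

Under this sketch a pinned `/issues` tab keeps `/issues?filter=mine` in place, while a push to `/projects` is redirected to a new tab.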

UI:
  - Tab bar wraps each tab in a shadcn ContextMenu with Pin/Unpin + Close
    (Close disabled for pinned or last-remaining tab).
  - Pinned tabs use a narrower icon-only layout with an accent left border
    and a divider between the pinned and unpinned groups.
  - Global keydown listener registers ⌘⇧P / Ctrl+Shift+P to toggle pin on
    the active tab.

Tests:
  - tab-store: togglePin ordering, moveTab boundary clamping, v2→v3
    migration.
  - navigation: pinned push → new foreground tab; same-pathname push stays
    in tab; cross-workspace still wins over pin.

Co-authored-by: multica-agent <github@multica.ai>

* test(desktop): cover TabNavigationProvider.push pin interception (MUL-2449)

Add pathname-diff / same-pathname cases for the per-tab navigation
adapter. Existing tests only exercised the root-level
DesktopNavigationProvider, but in-tab AppLink / page clicks flow
through TabNavigationProvider — so a future refactor that drops the
pin check from that provider would silently regress.

Co-authored-by: multica-agent <github@multica.ai>

* refactor(desktop): pin tab — hover button, full title, drop ⌘⇧P (MUL-2449)

Jiayuan's interactive review of PR #2914 surfaced three changes to the
RFC's D1 (entry / visual) decisions:

  1. Drop the ⌘⇧P global shortcut — it added a keybinding for a
     low-frequency action and crowded the shortcut namespace.
  2. Reveal a Pin / Unpin button on tab hover instead of relying on the
     right-click menu as the primary entry; right-click remains as a
     fallback (and for Close).
  3. Pinned tabs keep their full title and width. The only visual
     differences from unpinned tabs are subtle: the accent left border
     and the suppressed X close button.

Removes the global keydown listener (no other doc / handler referenced
it). Adds a hover-only Pin / Unpin span next to the existing close
affordance, both gated by group-hover. Drops the icon-only width /
hidden-title styling for pinned tabs.

Tests: new tab-bar.test.tsx covers Pin / Unpin button rendering, click
handlers (togglePin), the hidden-X invariant on pinned tabs, and the
full-title rendering. 146 passed, typecheck clean.

Co-authored-by: multica-agent <github@multica.ai>

* refactor(desktop): pin tab — drop accent left border, swap leading icon to Pin (MUL-2449)

Jiayuan reported that the accent left border on pinned tabs reads as a
heavy black edge in light mode and looks unrefined. Replace it with a
quieter identifier: pinned tabs swap their route icon for a Pin glyph
in the leading slot (same size, no extra horizontal space). The hidden
X close button stays as the secondary cue. RFC §3 D1v moves from
iii FINAL to iv FINAL; iii is demoted to v2 FINAL → v3 REMOVED.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Lambda <lambda@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-20 09:14:43 +02:00
Angular
edded77691 fix(installer): fall back when brew install fails (#2881)
2026-05-20 15:14:18 +08:00
Bohan Jiang
9d3b6e2241 feat(agent): inspector picker for thinking_level (MUL-2339) (#2912)
* feat(agent): inspector picker for thinking_level (MUL-2339)

PR1 (#2865) shipped the backend — column, daemon-side discovery,
Claude/Codex injection, API validation — but the agent detail inspector
had no UI to set the value. Users could only configure thinking_level
via custom_env / API. This wires up the picker so it lives next to
Runtime and Model where everything else editable already lives.

Picker is per-(runtime, model): it reuses the same `runtimeModelsOptions`
query the Model picker already runs (60s cache, no extra round-trip)
and reads the active model's `thinking.supported_levels`. When the list
is empty — every provider except Claude/Codex today, or a Claude model
that doesn't expose `--effort` — the entire PropRow is hidden, not just
rendered inert. The picker never gets to invent value/label pairs
itself; they come verbatim from each CLI's own catalog (`Low`,
`Extra high`, …) so the user sees exactly what `claude --effort` /
`/effort` and Codex's TUI show.

The `default_level` from the catalog is badged inside the popover so
the user knows which value `""` (the persisted "use model default"
sentinel) maps to. The clear footer sends `""` explicitly, which the
backend already understands as the tri-state "explicit clear" branch
of UpdateAgent. Invalid combinations (e.g. picking a value not in the
target provider's enum after a runtime swap in the same PATCH) hit
the existing 400 path on the server and surface as a toast via the
inspector's standard `onUpdate` error handler — no extra client-side
guard needed.

Exports `RuntimeModelThinking` and `RuntimeModelThinkingLevel` from
`@multica/core/types` so views consumers can refer to them by name.
i18n keys added in EN and zh-Hans (parity test green).

Co-authored-by: multica-agent <github@multica.ai>

* fix(agent): preserve unknown thinking_level in picker label

Stale persisted values (model swap, CLI catalog shrink) used to render
as 'Default' even though the backend would still ship the orphaned
token. Fall back to the raw value when no entry matches so the user
sees what's actually saved and can clear it.

Co-authored-by: multica-agent <github@multica.ai>

* test(agent): unit tests for thinking-picker label + clear flow

Covers the default-vs-set trigger label, the unknown-token preservation
path added in 3452fae3f, the read-only display, picking and re-picking
into onChange, and the clear footer's empty-string emission.

Co-authored-by: multica-agent <github@multica.ai>

* fix(agent): keep Thinking row visible when value is stale (MUL-2339)

Inspector was hiding the row whenever the active model had no
supported_levels, which also hid persisted orphan tokens (model swap
into a non-thinking runtime, or a CLI catalog that shrank). PR1's
per-model invalid behavior is daemon-side warn/drop, not a synchronous
DB clear, so the frontend has to surface the raw value and let the
user explicit-clear it via the picker footer.

Hide the row only when levels are empty AND value is empty; otherwise
keep it. Extract ThinkingPropRow into its own file so the row-level
logic is unit-testable.
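
A minimal sketch of the row-level logic, assuming a simplified level shape. The function names and the `emptyLabel` parameter are hypothetical; the real component reads its levels from the runtime-models query.

```typescript
// Illustrative level shape; the real catalog type is not shown here.
interface ThinkingLevel {
  value: string;
  label: string;
}

// The row is hidden only when there is nothing to pick and nothing saved.
function isThinkingRowVisible(levels: ThinkingLevel[], value: string): boolean {
  return !(levels.length === 0 && value === "");
}

// Known values show their catalog label. Unknown (stale) values show the
// raw token so the user can see, and clear, what is actually persisted.
function triggerLabel(
  levels: ThinkingLevel[],
  value: string,
  emptyLabel: string,
): string {
  if (value === "") return emptyLabel;
  const match = levels.find((l) => l.value === value);
  return match ? match.label : value;
}
```

With an empty level list and a persisted `xhigh`, the row stays visible and the trigger shows `xhigh`, which is the orphan case this commit targets.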

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-20 13:47:19 +08:00
Bohan Jiang
2bec2221d2 feat(agent): per-agent thinking_level for claude + codex (MUL-2339) (#2865)
* feat(agent): persist thinking_level per agent (MUL-2339)

Adds a nullable `thinking_level` column to the `agent` table so the
backend can route a runtime-native reasoning/effort token (e.g. Claude's
`xhigh`, Codex's `minimal`) through to the agent CLI on every dispatch.

The column is intentionally TEXT rather than an enum — Claude and Codex
publish overlapping but distinct vocabularies and we want the persisted
value to round-trip exactly through whichever CLI receives it. NULL is
the "use runtime default" sentinel that every downstream consumer reads
as "do not inject --effort / reasoning_effort".

This commit is just the storage layer (migration + sqlc); subsequent
commits wire it through the API, daemon, and agent backends.

Co-authored-by: multica-agent <github@multica.ai>

* feat(agent-backend): inject reasoning effort for claude + codex (MUL-2339)

Extends ExecOptions with a runtime-native ThinkingLevel string and wires
it into the Claude and Codex backends. Discovery is driven by the local
CLI so the daemon advertises whatever the host install supports rather
than a hand-maintained list that goes stale.

Per Elon's PR1 review:
- Claude: parses `claude --help` to learn the `--effort` superset and
  projects through a per-model allow-list (xhigh is Opus-only; max is
  session-only on the smaller models). Falls back to a conservative
  static list when the binary is missing or help drift hides the line.
- Codex: drives `codex debug models --output json` so per-model
  reasoning subsets and the documented default come directly from the
  CLI. The older config-error probe trick is gone — the JSON path is
  stable and doesn't pollute stderr with an intentional misconfig.
- Cache key includes (provider, executablePath, cliVersion) so a CLI
  upgrade invalidates entries that referenced the older help / catalog.
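
The cache-key rule in the last bullet can be illustrated as below. The daemon code is not shown in this log, so TypeScript is used only for illustration, and `DiscoveryKey`, `cacheKey`, and `cachedLevels` are hypothetical names.

```typescript
// The three components named in the commit message.
interface DiscoveryKey {
  provider: string;
  executablePath: string;
  cliVersion: string;
}

// JSON-encode the tuple so separators inside a path cannot collide.
function cacheKey(k: DiscoveryKey): string {
  return JSON.stringify([k.provider, k.executablePath, k.cliVersion]);
}

const discoveryCache = new Map<string, string[]>();

// Return cached levels for this exact (provider, path, version) tuple,
// running discovery only on a miss.
function cachedLevels(k: DiscoveryKey, discover: () => string[]): string[] {
  const key = cacheKey(k);
  const hit = discoveryCache.get(key);
  if (hit !== undefined) return hit;
  const levels = discover();
  discoveryCache.set(key, levels);
  return levels;
}
```

Since the version is part of the key, an upgraded CLI misses the cache and triggers a fresh discovery without any explicit eviction.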

Per Trump's PR1 constraint, all three Codex injection points
(thread/start.config, thread/resume.config, turn/start.effort) flow
through one helper (`applyCodexReasoningEffort`) so they cannot drift
independently. The shared `codexReasoningCases` fixture in
`thinking_test.go` asserts the same value→{shape, key} contract at
each site for every level the runtimes know about.

Claude's `--effort` is also added to `claudeBlockedArgs` so a user
custom_args entry can't silently outvote the daemon-injected value.

Co-authored-by: multica-agent <github@multica.ai>

* feat(api): wire thinking_level through API + daemon contract (MUL-2339)

End-to-end plumbing for the per-agent reasoning/effort setting:

- AgentResponse / TaskAgentData now carry `thinking_level`; the daemon's
  claim response includes it and the daemon's executor passes it through
  to agent.ExecOptions, where the Claude and Codex backends already know
  what to do with it.
- ModelEntry on the runtime-models wire format gains a `thinking` block
  carrying `supported_levels` + `default_level` per model so the UI can
  render a runtime-aware picker without the server having to know about
  the local CLI install. `handleModelList` projects the agent-package
  catalog (including the new Thinking field) into the wire shape.
- CreateAgent / UpdateAgent gate the field with a synchronous provider
  enum check (claude / codex only today). UpdateAgent is tri-state:
  field omitted = no change, "" = explicit clear (new
  `ClearAgentThinkingLevel` query, mirrors the existing mcp_config null
  pattern), non-empty = validate then set.
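
The tri-state handling in the last bullet can be sketched as a small decision function. The server handler is not shown in this log; `resolveThinkingLevel`, the `ThinkingPatch` union, and the caller-supplied allow-list are assumptions for illustration.

```typescript
type ThinkingPatch =
  | { kind: "unchanged" }
  | { kind: "clear" }
  | { kind: "set"; value: string }
  | { kind: "reject"; reason: string };

// Decide what an update request means for thinking_level:
// field omitted = no change, "" = explicit clear, otherwise validate.
function resolveThinkingLevel(
  body: { thinking_level?: string },
  allowed: ReadonlySet<string>,
): ThinkingPatch {
  if (body.thinking_level === undefined) return { kind: "unchanged" };
  if (body.thinking_level === "") return { kind: "clear" };
  if (!allowed.has(body.thinking_level)) {
    return { kind: "reject", reason: `unknown thinking_level: ${body.thinking_level}` };
  }
  return { kind: "set", value: body.thinking_level };
}
```

Distinguishing "omitted" from `""` is what lets a client clear the override without a separate endpoint; a `reject` result corresponds to the 400 path described in this commit.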

Per Trump's PR1 review, the API NEVER auto-clears on a runtime/model
swap and ALWAYS returns 400 on an unknown literal value — same shape
across CreateAgent, UpdateAgent, and combined patches that move
runtime + level in one request. Per-model combination failures (e.g.
`xhigh` against a model that only supports up to `high`) surface as a
daemon-side task error, not a silent server-side rewrite.

TS types follow the same shape: `Agent.thinking_level`,
`CreateAgentRequest`/`UpdateAgentRequest` add the field, `RuntimeModel`
grows a `thinking` block. Older backends omit the field, which the
front-end treats as "no picker for this model" — installed desktop
builds keep working.

Co-authored-by: multica-agent <github@multica.ai>

* fix(agent): correct codex debug models argv + pin via runner test (MUL-2339)

`codex debug models --output json` is rejected by codex-cli 0.131.0 —
the subcommand emits JSON on stdout by default and has no `--output`
flag. Drop the flag and add `--bundled` to skip the network refresh
that discovery doesn't need. Move the argv to a package-level var and add
a test that runs a fake `codex` to assert the binary actually
receives exactly `debug models --bundled`, so the contract can't
silently drift on the next refactor.

Also teach ValidateThinkingLevel to resolve an empty model to the
provider's default model entry. Without this, every default-model
task with a persisted thinking_level would be misjudged "unknown
model" by the daemon guard.

Co-authored-by: multica-agent <github@multica.ai>

* fix(api): reject runtime switch that would leave invalid thinking_level (MUL-2339)

A PATCH that changed `runtime_id` without touching `thinking_level`
used to silently keep the existing value, so a Claude agent storing
`max` could land on a Codex runtime where `max` is not a recognised
token at all, and the daemon would receive a literal-invalid level.

Hold the same "always 400 on literal-invalid, never silent coerce"
rule on this implicit path. When runtime_id changes and the existing
value is not in the new provider's enum, return 400 with the
recovery options (clear via `thinking_level=""` or re-set in the
same PATCH).
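
A minimal sketch of the implicit-path check described above, assuming the new provider's allowed levels are available as a set. `checkRuntimeSwitch` and `SwitchResult` are hypothetical names, and the error text is illustrative.

```typescript
type SwitchResult =
  | { ok: true }
  | { ok: false; status: 400; message: string };

// Called when runtime_id changes and the request does not touch
// thinking_level: the stored value must still be valid for the new provider.
function checkRuntimeSwitch(
  existingLevel: string | null,
  newProviderLevels: ReadonlySet<string>,
): SwitchResult {
  if (existingLevel === null || existingLevel === "") return { ok: true };
  if (newProviderLevels.has(existingLevel)) return { ok: true };
  return {
    ok: false,
    status: 400,
    message:
      `thinking_level "${existingLevel}" is not valid for the new runtime; ` +
      `clear it with thinking_level="" or set a valid value in the same PATCH`,
  };
}
```

The value is never coerced: the request either passes unchanged or is rejected with the recovery options spelled out.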

Add coverage for both the kept-when-still-valid and the rejected
cases, plus the two recovery paths (clear and replace).

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): guard runTask with per-model thinking_level validator (MUL-2339)

ValidateThinkingLevel existed but had no call site — `task.Agent.
ThinkingLevel` flowed straight into ExecOptions, so `xhigh` configured
on a non-Opus Claude model, or API-side stale values that escaped the
provider enum gate, would be injected anyway.

Run the validator before building ExecOptions. Invalid combinations
log a warning and drop the level instead of failing the task: the
agent still runs, just at the runtime's default reasoning effort.
Discovery errors fail open (keep the level, let the CLI surface any
objection) so a transient `claude --help` failure can't strand work.
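
The guard's two failure modes can be sketched as follows. The daemon code is not shown in this log, so TypeScript is used only for illustration; `guardThinkingLevel` and its callback parameters are hypothetical.

```typescript
// Returns the level to inject, or "" when the level should be dropped.
// `validate` returns false for an invalid combination and throws when
// discovery itself fails.
function guardThinkingLevel(
  level: string,
  validate: (level: string) => boolean,
  warn: (message: string) => void,
): string {
  if (level === "") return "";
  try {
    if (validate(level)) return level;
  } catch {
    // Discovery error: fail open and let the CLI surface any objection.
    return level;
  }
  // Invalid combination: warn and run at the runtime's default effort.
  warn(`dropping unsupported thinking_level "${level}"`);
  return "";
}
```

Treating "invalid" and "could not check" differently keeps a transient discovery failure from silently changing how a task runs.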

Empty model is forwarded as-is; the validator resolves it to the
provider's default model internally per the cross-package contract.

Co-authored-by: multica-agent <github@multica.ai>

* chore(agent): drop stale `--output json` comments + unused scanner (MUL-2339)

Codex CLI's `debug models` subcommand emits JSON without an `--output`
flag, and `parseCodexDebugModels` never read from the bufio.Scanner.
Sync the comments with the actual invocation and remove the dead init.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-20 12:30:10 +08:00
Jiayuan Zhang
292226f632 fix(runtimes): use official Gemini spark icon (MUL-2447) (#2904)
* fix(runtimes): use official Gemini spark icon (MUL-2447)

Gemini provider was falling through to the default Monitor icon in the
runtime list. Add the official 4-point spark mark with Google's
blue → purple → pink gradient, matching the SVG style/sizing of the
other provider icons.

Co-authored-by: multica-agent <github@multica.ai>

* fix(runtimes): use current Gemini multicolor spark gradient (MUL-2447)

Per review on PR #2904: the previous 3-stop blue/purple/pink gradient
was the legacy Bard-era Gemini spark. Update to the 5-stop cyan → blue
→ purple → pink → orange gradient used by the current Gemini app/web
multicolor mark.

Co-authored-by: multica-agent <github@multica.ai>

* fix(runtimes): switch Gemini icon to aurora multicolor treatment (MUL-2447)

Co-authored-by: multica-agent <github@multica.ai>

* fix(runtimes): align Gemini aurora color positions and smooth spark path

Swap yellow/green radial gradient anchors so colors land at the official
positions: top red / right blue / left yellow / bottom green, matching
gemini.google.com's current aurora spark. Replace the arc-based 4-point
spark outline with a cubic-bezier version normalized to the 24-viewBox
so the inset between tips is smoother and closer to the gstatic source.

Co-authored-by: multica-agent <github@multica.ai>

* fix(runtimes): use Simple Icons Google Gemini mark (MUL-2447)

Drop the hand-crafted aurora gradient approximation and inline the
canonical "Google Gemini" path from Simple Icons (CC0 1.0), rendered
in the Simple Icons brand color (#8E75B2). This matches the pattern
used by the other provider marks in this file (Claude/Codex from
Bootstrap Icons, etc.) instead of trying to manually approximate the
official multicolor wash from gemini.google.com (which paints via a
clipPath over an embedded raster).

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Lambda <lambda@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-20 12:27:53 +08:00
Jiayuan Zhang
72339f347b fix(desktop): keep local machine row visible after stopping daemon (#2906)
The Start button lives in `DaemonRuntimeActions`, which is rendered in
the per-machine detail pane and only when the selected machine is
flagged `isCurrent`. After the user manually stopped the daemon,
`status.daemonId` went back to undefined, so no machine could be
matched as `isCurrent` — the local row either disappeared (when the
server-side runtime had been GC'd) or moved into the "remote" section
(when it was still present but unmatched). Either way the Start button
was unreachable until the app was restarted.

Two-part fix:

- `DesktopRuntimesPage` now caches the last-known daemonId/deviceName
  so the local match keeps working while the runtime is still on the
  server (recently_lost / offline window).
- `buildRuntimeMachines` accepts an `ensureLocalMachine` flag; when no
  real runtime matches, a placeholder local row is synthesized so the
  Start button still has a home. Desktop opts in via a new
  `hasLocalMachine` prop on `RuntimesPage`. The empty state is also
  suppressed when this prop is set so the placeholder row isn't hidden
  behind the "register a runtime" hint on first launch.
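
The placeholder half of the fix can be pictured roughly as below. This is a minimal TypeScript sketch, not the code from the PR: only `buildRuntimeMachines`, `ensureLocalMachine`, and `isCurrent` are named in this message, while the `Runtime` / `Machine` shapes, the `localDaemonId` / `localDeviceName` options, the `isPlaceholder` flag, and the "This machine" fallback label are assumptions made for illustration.

```typescript
// Hypothetical shapes; the real types in the repo may differ.
interface Runtime {
  id: string;
  daemonId: string;
  deviceName: string;
}

interface Machine {
  daemonId: string | undefined;
  deviceName: string;
  isCurrent: boolean;
  isPlaceholder: boolean;
  runtimes: Runtime[];
}

interface BuildOptions {
  // Last-known local daemon id, cached by the caller after the daemon stops.
  localDaemonId?: string;
  localDeviceName?: string;
  // When true, synthesize a local row if no real runtime matches.
  ensureLocalMachine?: boolean;
}

function buildRuntimeMachines(runtimes: Runtime[], opts: BuildOptions = {}): Machine[] {
  const machines: Machine[] = [];

  // Group runtimes by daemon; a machine is "current" when it matches the local daemon id.
  for (const rt of runtimes) {
    let machine = machines.filter((m) => m.daemonId === rt.daemonId)[0];
    if (!machine) {
      machine = {
        daemonId: rt.daemonId,
        deviceName: rt.deviceName,
        isCurrent: opts.localDaemonId !== undefined && rt.daemonId === opts.localDaemonId,
        isPlaceholder: false,
        runtimes: [],
      };
      machines.push(machine);
    }
    machine.runtimes.push(rt);
  }

  // No real runtime matched the local machine: add a placeholder row so the
  // per-machine detail pane (and its Start button) still has something to select.
  const hasLocal = machines.some((m) => m.isCurrent);
  if (opts.ensureLocalMachine && !hasLocal) {
    machines.unshift({
      daemonId: opts.localDaemonId,
      deviceName: opts.localDeviceName ?? "This machine",
      isCurrent: true,
      isPlaceholder: true,
      runtimes: [],
    });
  }
  return machines;
}
```

With the flag set, an empty runtime list still yields one `isCurrent` row; without it, the function behaves as a plain grouping step, which is presumably why the behaviour is opt-in for desktop only.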

Co-authored-by: Lambda <lambda@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-20 06:16:20 +02:00
Jiayuan Zhang
fc8528d64d feat(autopilot): support assigning to a squad (MUL-2429) (#2888)
* feat(autopilot): support assigning autopilot to a squad (MUL-2429)

Path A (Squad-as-Leader) from the RFC: when an autopilot's assignee is a
squad, dispatch resolves to squad.leader_id and executes against the
leader's runtime — semantics match a human manually assigning the issue
to that squad, no fan-out.

Backend scope only; frontend picker change is a follow-up PR.

Changes:
- 096_autopilot_squad_assignee migration: drop agent FK on
  autopilot.assignee_id, add assignee_type column (default 'agent'),
  add autopilot_run.squad_id attribution column.
- service.AgentReadiness: single source of truth for archived /
  runtime-bound / runtime-online checks. Shared by autopilot
  admission gate, run_only dispatch, and isSquadLeaderReady.
- service.resolveAutopilotLeader: translates assignee_type/id to the
  agent that actually runs the work.
- dispatchCreateIssue: stamps issue with assignee_type='squad' for
  squad autopilots and enqueues via EnqueueTaskForSquadLeader.
- dispatchRunOnly: belt-and-braces readiness re-check after resolving
  squad → leader so a leader that went offline between admission and
  dispatch produces a clean failure instead of a doomed task.
- handler.CreateAutopilot / UpdateAutopilot: accept assignee_type with
  squad/agent existence + leader-archived validation. Backward-compatible
  default of "agent" preserves the contract for older clients.
- Analytics: AutopilotRunStarted/Completed/Failed events carry
  assignee_type and squad_id; PostHog can now group autopilot runs by
  squad without joining back to the autopilot row.
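
The resolution and readiness steps listed above can be sketched as follows. The backend in this PR is Go; the TypeScript below is only an illustration of the described control flow, and every shape and field name in it (`Agent`, `Squad`, `runtimeOnline`, the `reason` strings) is an assumption rather than the actual service code.

```typescript
// Hypothetical in-memory shapes standing in for DB rows.
type AssigneeType = "agent" | "squad";

interface Agent {
  id: string;
  archived: boolean;
  runtimeId?: string;
  runtimeOnline: boolean;
}

interface Squad {
  id: string;
  leaderId: string;
}

interface Autopilot {
  assigneeType: AssigneeType;
  assigneeId: string;
}

interface Readiness {
  ready: boolean;
  reason?: "archived" | "no_runtime" | "runtime_offline";
}

// Single place for the archived / runtime-bound / runtime-online checks.
function agentReadiness(agent: Agent): Readiness {
  if (agent.archived) return { ready: false, reason: "archived" };
  if (!agent.runtimeId) return { ready: false, reason: "no_runtime" };
  if (!agent.runtimeOnline) return { ready: false, reason: "runtime_offline" };
  return { ready: true };
}

// Translate assignee_type/id into the agent that actually runs the work:
// a squad assignee resolves to the squad's leader, with no fan-out.
function resolveAutopilotLeader(
  autopilot: Autopilot,
  agents: { [id: string]: Agent | undefined },
  squads: { [id: string]: Squad | undefined },
): Agent | undefined {
  if (autopilot.assigneeType === "squad") {
    const squad = squads[autopilot.assigneeId];
    return squad ? agents[squad.leaderId] : undefined;
  }
  return agents[autopilot.assigneeId];
}
```

Calling `agentReadiness` again on the resolved leader just before dispatch mirrors the "belt-and-braces" re-check described for `dispatchRunOnly`.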

Co-authored-by: multica-agent <github@multica.ai>

* fix(autopilot): reject archived squads, route post-admission skips, cleanup dangling-agent autopilots (MUL-2429)

Addresses three review findings on PR #2888:

1. Archived squad handling: validateAutopilotAssignee now rejects squads
   with archived_at set; resolveAutopilotLeader returns errSquadArchived
   so the admission gate fails closed; DeleteSquad now mirrors the issue
   transfer for autopilot rows (TransferSquadAutopilotsToLeader) so
   surviving autopilots flip to assignee_type='agent' (leader) instead
   of dangling at the archived squad.

2. dispatchRunOnly post-admission readiness: introduces errDispatchSkipped
   sentinel, recognised by DispatchAutopilot via handleDispatchSkip so
   the run is recorded as `skipped` (not `failed`). Manual triggers no
   longer 500 when the leader's runtime goes offline between admission
   and task creation. New TestManualTriggerDoesNotErrorOnPostAdmissionSkip
   locks the behaviour in.

3. Dangling agent assignee after migration 096 dropped the FK:
   shouldSkipDispatch now distinguishes pgx.ErrNoRows / errSquadArchived
   (hard skip — retrying won't help) from transient DB errors
   (fail-open). DeleteAgentRuntime pauses autopilots that target agents
   about to be hard-deleted (ListArchivedAgentIDsByRuntime +
   PauseAutopilotsByAgentAssignees) so the breakage surfaces as a paused
   row in the UI instead of a quiet skip-burning loop.

Unit tests cover the sentinel unwrap contract and errSquadArchived
errors.Is behaviour. Integration test
TestAutopilotDispatchSkipsWhenRuntimeOffline re-verified against a fresh
DB with migration 096 applied.

Co-authored-by: multica-agent <github@multica.ai>

* fix(autopilot): bump last_run_at on post-admission skip (MUL-2429)

Match recordSkippedRun (pre-flight skip) and the success path so the
scheduler / "last seen" UI both reflect that this tick evaluated the
trigger, even when the post-admission readiness gate caught a late
regression.

Addresses Emacs review caveat #1 on PR #2888.

Co-authored-by: multica-agent <github@multica.ai>

* feat(autopilot): mixed agent/squad assignee picker in dialog (MUL-2429)

End-to-end UI for assigning an autopilot to a squad. Closes the PR #2888
backend gap: the squad-as-assignee feature was already wired in Go (Path A,
RFC §4) but the desktop dialog never offered the choice.

- core/types/autopilot: add `AutopilotAssigneeType`, surface
  `assignee_type` on `Autopilot` + Create/Update request payloads.
- views/autopilots/pickers/agent-picker: switch to a polymorphic
  AssigneeSelection (`{type, id}`); render agents and squads as two
  grouped sections with shared pinyin search.
- views/autopilots/autopilot-dialog: maintain `assigneeType` state, send
  it on create/update, render the trigger avatar / hover dot with
  `assignee.type`.
- views/autopilots/autopilots-page + autopilot-detail-page: render the
  assignee row using `autopilot.assignee_type` so squad-typed autopilots
  show the squad avatar + name, not a broken agent lookup.
- locales: add `agents_group` / `squads_group` / `select_assignee` keys
  (en + zh-Hans), keep legacy `select_agent` for callers that still
  reference it.
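
As a rough illustration of the polymorphic selection described above: `AutopilotAssigneeType`, `assignee_type`, and the `{type, id}` shape come from this message, while the helper names and the `assignee_id` payload field are assumptions for this sketch, not the code in the PR.

```typescript
type AutopilotAssigneeType = "agent" | "squad";

// What the picker hands back to the dialog.
interface AssigneeSelection {
  type: AutopilotAssigneeType;
  id: string;
}

// Hypothetical slice of the create/update payload.
interface AssigneePayload {
  assignee_id: string;
  assignee_type: AutopilotAssigneeType;
}

// Hypothetical slice of an autopilot as returned by the API; older rows or
// clients may omit assignee_type, in which case "agent" is the default.
interface AutopilotAssigneeFields {
  assignee_id: string;
  assignee_type?: AutopilotAssigneeType;
}

function toAssigneePayload(selection: AssigneeSelection): AssigneePayload {
  return { assignee_id: selection.id, assignee_type: selection.type };
}

function selectionFromAutopilot(autopilot: AutopilotAssigneeFields): AssigneeSelection {
  return { type: autopilot.assignee_type ?? "agent", id: autopilot.assignee_id };
}
```

Falling back to `"agent"` when the field is missing matches the backward-compatible default the backend commit describes.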

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Lambda <lambda@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-20 05:30:13 +02:00
Jiayuan Zhang
4a487adfeb feat(github): split canView / canManage in settings tab for read-only members (MUL-2413) (#2898)
Wires the frontend half of the read-only RFC. The Settings → GitHub tab
now always issues the installation list query for any workspace member
(the backend gates it via `RequireWorkspaceMember` after PR #2886) and
gets `can_manage` straight from the API response. The render matrix
covers the six cases the RFC calls out:

- configured + connected + admin   → Disconnect + (optional) Connected by
- configured + connected + member  → read-only "Connected to" + read_only_hint
- configured + not connected + admin   → Connect button + dev description
- configured + not connected + member  → contact_admin_to_connect hint
- not configured + admin               → operator banner + disabled Connect
- not configured + member              → contact_admin_to_connect hint
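
The six cases above reduce to a small decision function. The sketch below is illustrative only: the input flags and the view names are hypothetical labels for the rows of the matrix, not identifiers from the PR.

```typescript
// Hypothetical names for the distinct render outcomes in the matrix.
type GitHubTabView =
  | "manage_connection"   // Disconnect + (optional) Connected by
  | "read_only_connected" // read-only "Connected to" + read_only_hint
  | "connect"             // Connect button + dev description
  | "operator_banner"     // operator banner + disabled Connect
  | "contact_admin";      // contact_admin_to_connect hint

interface GitHubTabInput {
  configured: boolean; // is the GitHub integration configured on the server?
  connected: boolean;  // does the workspace have an installation?
  canManage: boolean;  // can_manage from the API response
}

function resolveGitHubTabView(input: GitHubTabInput): GitHubTabView {
  const { configured, connected, canManage } = input;
  if (!configured) {
    return canManage ? "operator_banner" : "contact_admin";
  }
  if (connected) {
    return canManage ? "manage_connection" : "read_only_connected";
  }
  return canManage ? "connect" : "contact_admin";
}
```

Two of the six rows share the `contact_admin_to_connect` hint, so the matrix collapses to five distinct views.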

New i18n keys (en + zh-Hans): read_only_hint, connected_by, contact_admin_to_connect.
The unused github.manage_hint string is removed (its non-admin branch
now resolves to one of the two new hints depending on connection state).

GitHubInstallation gains an optional `connected_by` display name so the
UI can render the "Connected by {name}" line without further changes
once the backend exposes the field.

Co-authored-by: Lambda <lambda@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-20 04:19:28 +02:00
Jiayuan Zhang
e48f6a84d6 feat(github): expose read-only installation list to workspace members (MUL-2413) (#2886)
* feat(github): expose read-only installation list to workspace members (MUL-2413)

Relax `GET /api/workspaces/{id}/github/installations` from owner/admin-only
to any workspace member so the Settings → Integrations tab no longer renders
blank for non-admins (the original symptom of MUL-2413).

The handler now reads the caller's role from the workspace middleware:
- owner / admin keep the full row including the numeric `installation_id`
  (the connect / disconnect handle) and receive `can_manage: true`.
- every other role (member / guest) receives rows with `installation_id`
  omitted and `can_manage: false`, giving them visibility into "is GitHub
  wired up?" without the management handle.
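
The role gating described above amounts to a per-role projection of each row. The real handler is Go; the TypeScript below only sketches the contract, and it assumes `can_manage` is attached to each row and that rows carry an `account_login` field (both are guesses made for illustration).

```typescript
type WorkspaceRole = "owner" | "admin" | "member" | "guest";

// Hypothetical stored row.
interface InstallationRow {
  installation_id: number;
  account_login: string;
}

// What a caller receives; installation_id is absent for non-managers.
interface InstallationView {
  installation_id?: number;
  account_login: string;
  can_manage: boolean;
}

function presentInstallations(rows: InstallationRow[], role: WorkspaceRole): InstallationView[] {
  const canManage = role === "owner" || role === "admin";
  return rows.map((row) => {
    const view: InstallationView = { account_login: row.account_login, can_manage: canManage };
    // Only owners and admins get the numeric handle used to connect / disconnect.
    if (canManage) {
      view.installation_id = row.installation_id;
    }
    return view;
  });
}
```

Omitting the key entirely, rather than sending `null` or `0`, keeps the management handle out of the serialized response for members and guests.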

`GET /github/connect` and `DELETE /github/installations/{id}` stay under
the admin/owner middleware group — this PR only relaxes the read path.

Tests: `TestListGitHubInstallations_RoleGating` exercises admin, owner,
member, and guest paths against the real DB-backed handler fixture and
asserts the field stripping + `can_manage` contract.

Refs: MUL-2413
Co-authored-by: multica-agent <github@multica.ai>

* fix(github): redact installation_id from realtime broadcasts (MUL-2413)

GET /github/installations strips the numeric installation_id for non-admin
members, but the github_installation:created / uninstall / suspend WS
events were still publishing it, so the same handle was reachable from
any workspace client subscribed to the workspace scope. Broadcast both
payload variants without it — the frontend uses these events only to
invalidate the installations query, so admins re-query the list endpoint
to recover the management handle.

Also adds a router-level test that mounts the production middleware split
(member-visible list vs. owner/admin connect+delete) so a future routing
change can't silently widen the write surface.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Lambda <lambda@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-20 04:17:45 +02:00
Naiyuan Qing
5b8303b83c fix(editor): fill modal viewport in attachment preview (MUL-2431) (#2891)
In the attachment preview modal, image and video previews used
`max-h-full max-w-full`, which let small assets render at their
natural size and leave the modal mostly empty. Switch to
`h-full w-full` so the preview always occupies the modal viewport,
relying on `object-contain` to preserve the aspect ratio while small
assets scale up to fit.

Only touches `packages/views/editor/attachment-preview-modal.tsx`
for the image (line 355) and video (line 373) branches; pdf, audio,
markdown, html, and text branches keep their existing layout.

Co-authored-by: multica-agent <github@multica.ai>
2026-05-20 09:16:08 +08:00