multica

mirror of https://github.com/multica-ai/multica.git synced 2026-06-17 03:38:32 +02:00

Author	SHA1	Message	Date
Naiyuan Qing	3ce4cf6f2f	fix(lists): navigate rows via onClick, not a nested row anchor (#4146) Clicking a row's ⋯ kebab (or any in-row control) full-page reloaded the app. The row was a whole-row <AppLink>, so a child's stopPropagation stopped the event before AppLink's onClick (which calls preventDefault to cancel native anchor navigation and do an SPA push) could run — leaving the browser to perform the native <a> navigation, i.e. a full reload. It was also invalid HTML: interactive content (button/menu) nested in an <a>. Rework all five ListGrid row surfaces (agents, runtimes, skills, autopilots, squads) to a plain <div> row whose whole-row navigation is a mouse onClick (new useRowLink hook): left-click pushes, cmd/ctrl/middle opens a background tab. Interactive cells (checkbox, kebab) stopPropagation so they never trigger row nav — and with no <a> ancestor there is no native navigation to cancel, so the reload class of bug is gone. Names are plain text since the row itself is the click target. projects is unchanged — its inline-editable cells make it a deliberate name-link exception. Also fixes two adjacent defects found in the same menus: - agents/runtimes kebab triggers reused the shared <Button>, which lacks the data-popup-open styling the other surfaces have, so the trigger vanished and lost its background while its menu was open. Switch them to the bare-button trigger with data-popup-open: visible + highlighted. - agents archive menu items used className="text-destructive" instead of variant="destructive", so the base focus style overrode the red on hover. Switch to variant (list row + detail page). Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-15 16:56:38 +08:00
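The row-click rule described in the commit above (left-click pushes, cmd/ctrl/middle opens a background tab) can be sketched as a pure decision function. This is a minimal illustration, not the actual useRowLink hook: the names `resolveRowNav`, `RowClickLike`, and `RowNavAction` are hypothetical, and wiring the result to a router push or a new tab is left out.

```typescript
// Hypothetical sketch of the decision a row-level click handler has to make.
// It only classifies the click; performing the navigation is out of scope.
type RowNavAction = "push" | "new-tab" | "ignore";

// Structural subset of a mouse event, so the function runs without a DOM.
interface RowClickLike {
  button: number; // 0 = left button, 1 = middle button
  metaKey: boolean; // cmd on macOS
  ctrlKey: boolean;
}

function resolveRowNav(e: RowClickLike): RowNavAction {
  // Middle click always targets a background tab.
  if (e.button === 1) return "new-tab";
  // Any other non-left button (e.g. right click) is not row navigation.
  if (e.button !== 0) return "ignore";
  // Left click with a modifier also targets a background tab.
  if (e.metaKey || e.ctrlKey) return "new-tab";
  // Plain left click: in-app (SPA) navigation.
  return "push";
}
```

Because the row is no longer an anchor, a cell that stops propagation simply never reaches this handler, and there is no native navigation left to cancel.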
Naiyuan Qing	63cf0ed308	feat(lists): rebuild all six list surfaces on a shared Linear-style list grid (#4038 ) * fix(issues): render thread replies in chronological order (#3691) collectThreadReplies walked the parent_id tree depth-first, so an agent reply forced to nest under its trigger comment rendered before earlier sibling replies (A-D-B-C instead of A-B-C-D) whenever the agent returned late. Sort the collected subtree by created_at (id tie-break) so the thread reads in arrival order — the same order the server already feeds agents via `comment list --thread` (ListThreadCommentsForIssue). All other consumers of the array (resolution derivation, fold bars, counts, deep-link) are order-independent. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * feat(skills): rebuild skills list on shared Linear-style list grid - new ListGrid primitives (subgrid: single source of truth for column tracks) - skills list: sortable columns, used-by avatar stack, source/creator columns, row kebab + batch toolbar with add-to-agent and delete - skill view store in core; addAgentSkills client method; HoverCheck extracted to views/common (issues header now imports the shared copy) - locale keys for list actions/filters and the reworked detail page Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * feat(skills): rework detail page into overview/files tabs - tabs directly under the breadcrumb header: overview (default) and files - overview: identity block + rendered SKILL.md as the main column, right rail with metadata card (source/creator/updated, inline name+description edit toggle) and used-by panel with bind/unbind - files: file tree + viewer/editor unchanged; SKILL.md "edit" jumps here - header kebab menu (copy skill ID, delete); page-level save bar shared by both tabs; tab state persisted in ?tab= - file tree: ARIA tree roles + roving-tabindex keyboard navigation - drop the old right sidebar (metadata dl, permissions paragraph) Co-Authored-By: Claude Fable 5 
<noreply@anthropic.com> * revert(skills): restore detail page to main, keep branch list-only Drop the overview/files tabs rework from this branch so the PR scope is the list rebuild only. skill-detail-page.tsx and file-tree.tsx are back to the main versions; the locale detail/file_tree sections are restored to match. The detail rework is preserved on stash/skills-detail-tabs for a follow-up PR. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * feat(skills): drop description column from skills list Description is agent-facing routing metadata, not a scannable list property — Linear's display options expose no description column for the same reason. Removes the cell, column key, display toggle, lg grid track, skeleton cells, and the now-dead table.description / table.no_description locale keys. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * feat(skills): drive list column hiding by container width, drop by priority Replace viewport sm:/lg: breakpoints with Tailwind v4 container query variants (@2xl/@4xl) on the list wrapper, so an open sidebar or split pane narrows the column set instead of squashing tracks. Remove the min-w-fit + overflow-x-auto horizontal-scroll fallback: when space runs out, low-priority columns (created/source/creator, then updated) drop and return as the container widens; name and usedBy never drop. ListGrid conventions comment updated — this is the template for all list pages. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * feat(skills): virtualize list rows with @tanstack/react-virtual Linear-style headless virtualization: the virtualizer computes the visible index range and offsets; offsets land as padding on the scrolling ListGridBody so mounted rows stay direct subgrid children and column alignment is untouched. Fixed 48px rows skip per-row measurement. 
Hideable column tracks move from max-content to deterministic widths (CSS vars) — with only the visible slice mounted, content-driven tracks would resize during scroll. A user-hidden column zeroes its var so the track still collapses; per-cell max-w caps move into the tracks. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(skills): list tiers must fit their container trigger width The @4xl tier's track sum (~1080px with gaps) exceeded its 896px trigger; with the horizontal-scroll fallback gone, the right-side columns were clipped unreachably between 896-1080px. Move tier 3 to @5xl (1024px), trim usedBy/source/creator tracks, and document the fit invariant with its arithmetic next to the template and in the ListGrid conventions. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * feat(skills): show description as subtext under the skill name Lives in the name track as a second truncated line (max-w 36rem, title attr for the full text) — no track, no header, no slot in the responsive arithmetic. Both lines fit the fixed 48px row, so the virtualizer contract is untouched; rows without a description center the name. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * Revert "feat(skills): show description as subtext under the skill name" This reverts commit `f39721301b`. * fix(skills): anchor batch toolbar to the page, not the viewport fixed bottom-6 left-1/2 centered the bar on the window; with the sidebar open the list's visual center sits ~120px right of the window center, so the bar looked off-center (worse with desktop split panes). Page root becomes the positioning context (relative) and the bar uses absolute — same rule applies to future list pages. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * feat(skills): show matching count next to search while list is narrowed "n / total" appears right of the search box only when search or filters are active — idle state would duplicate the total already in the page header. 
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * feat(autopilots): derive trigger kinds, next run, last run status in list The list endpoint only selected the autopilot table, so the list UI could not answer "is this automation working" without N+1 detail calls. Each list row now carries trigger_kinds + next_run_at (enabled triggers only — the columns describe how it fires today) and last_run_status (most recent run). Fields are omitempty and absent from detail/create/update responses; clients must treat them as optional per the API compatibility rules. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * feat(autopilots): list schema, parsed client, and view store in core - listAutopilots now runs through parseWithFallback with a zod schema (this endpoint was a bare fetch — overdue per the API compatibility rules); malformed bodies degrade to an empty list, old-server rows without assignee_type or the new derived fields parse cleanly, and enum drift passes through as plain strings - Autopilot type gains the three optional list-only derived fields - New autopilots view store (scope/sort/columns/filters, persisted per workspace): status is the promoted scope dimension so it does NOT appear in filters — one dimension lives in exactly one place Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * feat(autopilots): rebuild list on shared ListGrid with scope buttons Same skeleton as the skills list (container-query tiers, deterministic var-width tracks with documented fit arithmetic, virtualized 48px rows, sortable headers, filter + display toolbar, page-anchored batch toolbar), plus the autopilots-specific pieces: - Status is the promoted SCOPE dimension: 全部/运行中/已暂停/已归档 segmented buttons with full-set counts; "all" = active+paused (archived gets its own visible home, Linear archive semantics); status is therefore absent from the filter dropdown - Columns: name (paused marker inline), assignee (agent/squad), trigger kind badges, last run (outcome dot + 
time, enum-drift safe default), next run; mode/creator/created opt-in hidden - Filters: assignee, trigger kind, mode, creator (composite type:id values for polymorphic actors); sort name/lastRun/nextRun/created with lastRun desc default - Row kebab (pause/resume/archive/unarchive/delete) and batch toolbar share one delete dialog; status changes ride useUpdateAutopilot's optimistic cache - Fix noUncheckedIndexedAccess errors the branch had never typechecked (skills virtual rows, UsedByCell, added_toast) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(autopilots): scope buttons follow the issues header pattern Replace the bespoke segmented-pill control with the existing scope button convention from the issues page: outline buttons with bg-accent active state on md+, collapsing to a radio dropdown below md. Counts stay (stage inventories from the full set). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(skills,autopilots): toolbar small-screen treatment follows issues header Below md: the search box (and its result count) disappear entirely, and the filter/display controls collapse to square icon-only buttons (labels and the clear-X are md+), matching the issues header's responsive pattern. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(skills,autopilots): two-zone columns — WYSIWYG with scroll escape valve Static width tiers silently hid user-enabled columns (toggle on, nothing appears — autopilots' mode/creator/created sat behind a 1280px container gate no laptop reaches; skills' source/created behind 1024px). 
Tiers can't know how many columns are enabled, so the mechanism is replaced, not retuned: - ≥@2xl container: every enabled column renders; the grid carries min-width = Σ(enabled tracks + gaps) (pure constants, no measurement) and the wrapper scrolls horizontally only when the enabled set outgrows the container - <@2xl: static core set (skills: name+usedBy; autopilots: name+assignee), no scroll, toggles don't apply Per-tier templates and the hand-maintained fit arithmetic retire; ListGrid conventions updated accordingly. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(skills,autopilots): widen name column minimums (120px base, 200px wide) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * feat(autopilots): drop the archived scope and the list search box Archiving never existed as a UI flow (the DB status value is only reachable via direct API; the detail page disables its switch when archived), so the list stops inventing it: no archived scope, no archive/unarchive row or batch actions. API-archived rows are excluded everywhere; a persisted retired scope value falls back to "all". The search box goes too — scope buttons already partition the small set, search is redundant (product call). Skills keeps its search (no scope there). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * feat(skills,autopilots): quiet outline create buttons in page headers Page-header chrome shouldn't carry the loudest element on the page: the create button becomes outline with text on md+ and collapses to a square plus icon below md (same responsive treatment as the toolbar controls). Primary stays reserved for empty-state CTAs. Agents follows when its list migrates to ListGrid. 
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * feat(agents): rebuild list on shared ListGrid with identity rows Same skeleton as the skills/autopilots lists (two-zone container responsiveness, deterministic var tracks + min-width scroll escape valve, virtualized fixed-height rows, issues-style scope buttons, page-anchored batch toolbar, quiet outline create button), plus the agents-specific decisions: - Identity rows: the documented exception to the single-line rule — avatar + name + description two-line cells, 64px rows (agents are few, identity-rich entities); the italic "no description" placeholder is gone, empty descriptions just center the name - Scope: Mine (historical default) \| All \| Archived with full-set counts; archived ignores the ownership lens; no search box - The 7d sparkline column is replaced by a sortable "Last active" column derived from the same 30-day activity buckets (zero API change) — per-row-normalized mini bars can't be compared across rows, and the default sort finally has a visible anchor; the detailed histogram stays on the hover card / detail page - Workload folds into the status cell ("Online · 2 tasks") — a 0-2 integer doesn't earn a column - Columns: status, runtime, last active, runs (30d); model/created opt-in hidden; filters: availability, runtime - Operations unchanged: row kebab reuses AgentRowActions (cancel-tasks/duplicate/archive/restore with permissions); batch archive (confirmed) + restore; no delete — the API has none - View store extended (scope incl. 
archived, sort, columns, filters); agent-columns.tsx (DataTable columns) deleted Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(agents): trim status track to its real worst case (160 -> 144px) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * feat(runtimes): machine detail's runtime table on the shared ListGrid The master-detail console keeps its shape (machines are few and strongly categorized; left list, charts, update section untouched) — only the right pane's runtimes table moves from TanStack DataTable to the ListGrid family, taking the paradigm pieces that earn their keep at 1-5 rows: subgrid template + var tracks, two-zone container responsiveness (the pane is squeezed by the machine list, so the core-set collapse below @2xl matters more here than on full-width pages), min-width scroll escape valve, shared header/row/hover visual language. Deliberately NOT taken: virtualization, sorting, filters, column toggles, and batch selection — dead weight at this row count, and batch-deleting runtimes (a cascade-confirm operation) is unsafe by design. Workload folds into the health cell ("Online · Working 2") like the agents status cell; the owner column keeps its only-when-multiple- owners rule via a zeroed track var. runtime-columns.tsx is deleted; the row-menu/CLI tests render the exported cells directly. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(runtimes): collapse the kebab track when no row has actions On a healthy local machine every row's only action (delete) is hidden by the self-healing rule, leaving a permanent ~64px dead zone after the CLI column. The action track now follows the owner column's conditional-var mechanism: zeroed unless at least one row will show the menu. 
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(runtimes): drop doubled header border, align create button with convention PageHeader already carries border-b; the content wrappers' border-t stacked a second line right under it (the only list page doing this). "Add a computer" follows the chrome-button convention: outline with text on md+, square plus icon below md — primary stays reserved for the empty state CTA. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(runtimes): health cell load suffix matches the agents status cell "Healthy · 2 tasks" instead of the old workload vocabulary ("Working 2 +1q") — the count is unit-bearing and both surfaces now speak one language. The queued-anomaly distinction the old words hinted at belongs to the health layer if it ever earns surfacing. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(lists): pin overflow-y-hidden on the horizontal-scroll wrappers CSS coerces overflow-x:auto into overflow:auto on both axes, which silently armed the list wrappers with a vertical scrollbar they were never meant to have. Combined with the h-full grid's percentage resolution across scrollbar-induced reflows, the wrapper's vertical bar and horizontal bar fed each other in a non-converging layout loop (visible as two stacked, flickering scrollbars on the agents list — the same latent loop exists in all four wrappers; agents' wider min-width and 64px rows just hit the trigger zone first). Vertical scrolling belongs solely to ListGridBody; declare overflow-y-hidden explicitly to break the loop. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(agents): single scroll container for the list (trial before rollout) Both scroll axes move to the outer wrapper; the grid drops h-full and the rows wrapper drops its own overflow. Kills the percentage-height bridge between the two scroll elements that fed the flickering double scrollbars and clipped the last row under the horizontal scrollbar. 
Sticky header pins inside the scroller; vertical scrollbar now spans the full pane (Linear's structure). Skills/autopilots follow after visual confirmation. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(lists): roll single scroll container out to skills/autopilots, add bottom clearance ListGridBody retires its own scrolling entirely (the agents trial confirmed the structure): both axes live on the single outer wrapper, grids drop the h-full percentage bridge, virtualizers point at the wrapper. The rows wrapper gains LIST_GRID_BOTTOM_CLEARANCE (64px) appended to the virtualization padding so the last row scrolls clear of the chat FAB (~48px at bottom-right) and the batch toolbar (~62px). Runtimes' machine table is untouched: content-height at the top of a tall pane, no bridge and no practical FAB overlap. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * feat(squads): rebuild list on shared ListGrid (identity rows, minimal) The last list joins the family. Squads are the fewest entity (1-5 rows), so this is the agents identity-row shell on the runtime-list minimal skeleton: ListGrid subgrid + var tracks + two-zone responsiveness + single scroll container, but NO virtualization, checkbox, or batch. - Identity two-line rows (squad avatar + name + description, 64px) like agents; columns: name / leader / members (polymorphic ActorAvatar stack from member_preview), creator + created opt-in hidden - Scope Mine/All (creator-based, issues-header styling, <md dropdown); no archived scope (list API hard-filters archived + no restore endpoint), no search (scope-bearing), no filters (set too small) - Sort name (default) / members / created - Row kebab = Archive (= the delete endpoint, which archives + transfers issues/autopilots to the leader); workspace owner/admin only, so the kebab track collapses for non-admins. Reuses the existing archive_dialog copy. No batch. 
- View store extended (scope + sort + columns); zero API change — pure frontend (member_preview/count already in the list payload) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * feat(agents,squads): owner/created-by columns + owner filter Surface ownership as a real column on both lists, named by what the field actually means in each permission model: - Agents: "Owner" — owner_id is the creator (set at creation, never transferred) and carries management rights. Promoted to a default- visible column (avatar + name); the half-baked inline owner avatar in the name cell is removed ("You" badge stays). - Squads: "Created by" (NOT Owner) — creator_id holds no rights (archiving is workspace-admin only), so Owner would mislead. Now a default-visible column with avatar + name. Agents also gains an Owner filter, kept orthogonal to the Mine scope by the single-axis rule: "Mine" is the clean no-filter personal view, so applying any filter (owner or otherwise) leaves Mine for All, and clicking Mine clears all filters. Owner and Mine therefore never coexist — no "mine + owner=someone-else = empty" contradiction. Squads keep the plain Mine/All toggle (too few rows for a creator filter). Both lists keep a Created (date) column, opt-in hidden. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(agents): backfill new filter dimensions on rehydrate (owners crash) A view payload persisted before the owners filter existed overwrote the default filters wholesale on rehydrate, dropping filters.owners to undefined and crashing the list's filter predicate (.length on undefined). The store merge now deep-merges filters over EMPTY_AGENT_FILTERS so newly-added dimensions always get their default. Regression test added. 
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(skills,autopilots): deep-merge filters on rehydrate too Same latent crash the agents store just hit: the copied view-store merge spread persisted.filters wholesale, so adding a new filter dimension later would drop it to undefined for users with older persisted state. Harden skills and autopilots the same way (merge over their EMPTY__FILTERS) before that bug can ship. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> feat(projects): rebuild table view on ListGrid + filters + pin/delete kebab Projects is the dual-view list: the compact table moves onto the shared ListGrid (subgrid tracks, two-zone responsiveness, single scroll container, FAB bottom clearance) while the comfortable card grid stays as the alternate view, toggled by a restyled view switch (Table/Cards outline buttons, active = bg-accent). Inline editing is preserved — rows are NOT whole-row links; the name navigates and status/priority/ lead stay click-to-edit (matching prior behaviour, no navigate-vs-edit conflict). - View store extended: viewMode + sort (name/priority/status/progress/ created) + hidden columns + filters (status/priority/lead); merge deep-merges filters (migration-safe). No scope (lead optional/often an agent; status is a 5-value lifecycle → filter, not scope). - Toolbar: search (kept — scopeless list) + result count + Filter (status/priority/lead) + Display (sort+columns, table view only). - Row kebab: Pin/Unpin (any member, reuses the existing project pin API — zero new endpoints) + Delete (workspace admin). Pin is the flexible per-user favourite the list previously lacked. - Zero API change; status/priority filtering is client-side like the other lists. 
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(projects): GRID_COLS must be a literal string (Tailwind can't see interpolation) The table view's grid-cols template interpolated ${STATUS_WIDTH}px, so Tailwind never generated the arbitrary-value class — the grid collapsed to one column and every cell stacked vertically. Inline the literal 116px. This is the documented ListGrid rule (keep the class literal so Tailwind scans it). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(projects): single view-toggle button, decouple Display from view mode Two fixes from the same principle — view mode is pure presentation and must not couple to anything: - The view switch is now ONE button that flips table ⇄ cards (shows the current view's icon+label, tooltip names the target), instead of two side-by-side buttons. - The Display (sort/columns) control no longer disappears when you switch to cards — it was gated on isCompact, so flipping the view made it vanish (the "filter gone after switching" weirdness). It's always present now; only the columns section inside the popover is table-only (cards have no columns). Sort applies to both views. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(projects,squads): projects multi-select + squads FAB clearance/toast Cross-list consistency audit fixes: - projects: add multi-select (checkbox column + select-all header + page-anchored batch toolbar) — it's a dozens-scale full-page list like skills/autopilots/agents but was the only one missing it. Batch ops: Pin all (any member) + Delete (workspace admin). Table view only (cards have no checkboxes). GRID template + min-width updated for the checkbox track. - squads: add the FAB bottom clearance the other full-page lists have (last row/kebab was sliding under the chat FAB). - squads: archive success toast was showing the dialog's question title ("Archive this squad?"); use a proper "Squad archived" key. 
Intentional and left as-is (documented): squads/runtimes have no multi-select/virtualization (1-5 rows); projects table isn't virtualized yet (dual-view + card grid; tracked as low-risk debt). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * feat(agents,squads): close the filter/column consistency gaps Apply the principle "every categorical column is filterable" where it was missing: - agents: add a Model filter (model was a categorical column with no filter). Distinct non-empty models from the in-scope rows. - squads: add filters entirely (it had leader/creator columns + a column-toggle panel but no Filter button — the only such outlier). Leader (agent) + Creator (member) filters, with the result count and the same Filter dropdown shape as the other lists. Store gains SquadListFilters + toggleFilter/clearFilters + migration-safe filters deep-merge. autopilots creator stays default-hidden per product call (not every "who made it" must be visible). Filter stores' partialize tests updated. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(autopilots): match list-page root to flex-1 convention skills/agents/projects roots use `relative flex flex-1 min-h-0 flex-col`; autopilots used `h-full`. Both anchor the batch toolbar correctly, but align the flex sizing for consistency across the six list surfaces. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Fable 5 <noreply@anthropic.com>	2026-06-15 14:12:24 +08:00
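The rehydrate crash fixed in the commit above (a persisted view written before `filters.owners` existed replaced the default filters wholesale) comes down to a shallow versus deep merge. The sketch below assumes a simplified store shape: `EMPTY_AGENT_FILTERS` and the `owners` dimension are named in the commit, while `AgentFilters`, `AgentViewState`, `PersistedAgentView`, and `mergeAgentView` are hypothetical names used only for illustration.

```typescript
// Simplified, assumed shape of the agents view store.
interface AgentFilters {
  availability: string[];
  runtimes: string[];
  owners: string[]; // the dimension added after some users had persisted state
}

interface AgentViewState {
  scope: string;
  filters: AgentFilters;
}

// Older persisted payloads may lack any filter dimension.
type PersistedAgentView = {
  scope?: string;
  filters?: Partial<AgentFilters>;
};

const EMPTY_AGENT_FILTERS: AgentFilters = {
  availability: [],
  runtimes: [],
  owners: [],
};

// Deep-merge the persisted filters over the defaults so a dimension missing
// from an old payload keeps its default instead of becoming undefined.
function mergeAgentView(
  persisted: PersistedAgentView | undefined,
  current: AgentViewState,
): AgentViewState {
  return {
    scope: persisted?.scope ?? current.scope,
    filters: { ...EMPTY_AGENT_FILTERS, ...persisted?.filters },
  };
}
```

With a wholesale replacement, `filters.owners.length` would throw for the old payload; with the merge above it evaluates to 0.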
YOMXXX	34d4cd3a28	feat(openclaw): support connecting to existing OpenClaw gateway (#3260 ) [MUL-3158] (#3664 ) * feat(openclaw): support connecting to existing OpenClaw gateway (#3260) When the daemon host is a lightweight dev machine or CI coordinator, the heavy agent work (LLM inference, code execution, tool use) often belongs on a more powerful remote server already running an OpenClaw gateway. Multica historically hard-coded `openclaw agent --local`, forcing every turn to execute in-process on the daemon host. This change adds an opt-in gateway routing mode controlled per-agent via `runtime_config`: { "mode": "gateway", "gateway": { "host": "...", "port": 18789, "token": "...", "tls": false } } - Backend: ExecOptions gains OpenclawMode + OpenclawGateway; buildOpenclawArgs drops `--local` when mode == "gateway". Per-task openclaw-config.json wrapper pins gateway.{host,port,auth.{mode,token},tls} so users do not need to edit the daemon host's `~/.openclaw/openclaw.json` to point at a different endpoint. - Daemon: AgentData carries the raw runtime_config; decoding is fail-soft (malformed JSON falls back to local mode rather than blocking dispatch). - API: gateway.token is masked to "**" on every GET; PATCH replays the sentinel back, and the update handler restores the persisted token so the round-trip never destroys the secret. Defense-in-depth masking on WS broadcasts, plus String/MarshalJSON masking on the in-memory struct to block stray `%+v` / json.Marshal leaks. - UI: openclaw-only "Routing" tab on the agent detail page with mode selector + structured endpoint form. Token uses a "saved — submit a new value to rotate" UX and matching backend preserve hook. Empty `runtime_config` keeps the historical embedded behaviour, so existing agents are unaffected. 
fix(openclaw): address #3664 review — drop dead gateway field, gate pin on mode Per Bohan-J's review: - Remove the dead ExecOptions.OpenclawGateway field (+ its String/MarshalJSON and the daemon.go construction block). It carried the plaintext bearer token but was never read — buildOpenclawArgs only consumes OpenclawMode and the live gateway path runs through execenv.OpenclawGatewayPin — so this narrows the secret's footprint. - Gate the gateway pin on mode=="gateway" in decodeOpenclawRuntimeConfig: a {"mode":"local","gateway":{...,"token"}} payload no longer writes the token into the 0o600 per-task wrapper that --local makes openclaw ignore. - Warn on an unrecognized non-empty mode (e.g. "gatway") instead of silently falling back to local. - Run preserveMaskedGatewayToken in CreateAgent too, so a literal "***" at create time can't persist as a real bearer token. - Document the gateway host:port trust boundary (SSRF note for shared daemon hosts). Adds regression tests for the local-mode pin drop and the unknown-mode warning.	2026-06-13 15:33:28 +08:00
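The mask-and-preserve round trip for `gateway.token` described in the commit above can be sketched as follows. This is an illustration in TypeScript only; the commit's own helpers live in the backend and their signatures are not shown here. The field names follow the `runtime_config` example in the commit, `preserveMaskedGatewayToken` is the name the commit uses, and the sentinel value, `maskGatewayToken`, and the choice to leave an empty token unmasked are assumptions.

```typescript
interface GatewayConfig {
  host: string;
  port: number;
  token: string;
  tls: boolean;
}

// Assumed sentinel; the real masked value is whatever the API returns on GET.
const MASKED_TOKEN = "***";

// Applied to every read: the stored secret never leaves the server.
function maskGatewayToken(cfg: GatewayConfig): GatewayConfig {
  return cfg.token === "" ? cfg : { ...cfg, token: MASKED_TOKEN };
}

// Applied to every write (update and create): a replayed sentinel is swapped
// back for the persisted token, or dropped when nothing is persisted yet, so
// the sentinel itself can never be stored as a real bearer token.
function preserveMaskedGatewayToken(
  incoming: GatewayConfig,
  persisted?: GatewayConfig,
): GatewayConfig {
  if (incoming.token !== MASKED_TOKEN) return incoming; // a new value rotates the token
  return { ...incoming, token: persisted?.token ?? "" };
}
```

A client that reads the config and writes it back unchanged therefore leaves the stored token intact, while submitting any other value rotates it.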
Bohan Jiang	f415099c4a	MUL-3263: support managed MCP config for Cursor (#4081) * feat: support managed MCP config for Cursor Co-authored-by: multica-agent <github@multica.ai> * fix: address Cursor MCP review feedback Co-authored-by: multica-agent <github@multica.ai> * docs: include Cursor in skills MCP support Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-13 02:07:00 +08:00
Bohan Jiang	1ddf89a8f2	feat(daemon): enable Antigravity (agy) per-agent model selection (MUL-3125) (#3894 ) * feat(daemon): wire agy --model and model discovery for Antigravity agy 1.0.6 added a --model flag and an `agy models` catalog command, which were the #1 blocker in the earlier agy-backend review (MUL-3125). The antigravity backend already shipped but deliberately dropped opts.Model because agy 1.0.1 had no way to select a model. - buildAntigravityArgs now passes --model <display name> when opts.Model is set; the value is the exact `agy models` display string (spaces + parens), passed as a single exec arg so no shell quoting is needed. - Block --model in custom_args so it can't override the managed value. - ListModels("antigravity") enumerates via `agy models` (no static fallback: agy silently no-ops on unrecognised models, so a stale guess would turn a typo into a successful empty run). - ModelSelectionSupported now returns true for every built-in provider; the hook stays for any future model-less runtime. - Daemon probe reads MULTICA_ANTIGRAVITY_MODEL for the daemon-wide default. Co-authored-by: multica-agent <github@multica.ai> * docs(providers): mark Antigravity model selection as supported Antigravity gained --model in agy 1.0.6 (MUL-3125). Update the provider matrix + prose (en/zh/ja/ko) from "managed internally / no --model" to dynamic discovery via `agy models`, and refresh the now-stale picker comments. Flag the display-string (not slug) shape and agy's silent no-op on unrecognised values. Co-authored-by: multica-agent <github@multica.ai> * fix(daemon): reject unknown Antigravity model at spawn (MUL-3125) agy exits 0 with empty output on an unrecognised --model, so a stale/typo'd value would surface as a 'completed' but empty task. Validate opts.Model against the `agy models` catalog in Execute before spawning: a non-empty model the CLI does not advertise fails fast with an actionable error listing the real choices. 
opts.Model is the single funnel for agent.model and the MULTICA_ANTIGRAVITY_MODEL default, so this one check covers every source (UI free-text, API, persisted value, env) — addressing Elon's review that a UI-only guard is bypassable. Validation is fail-OPEN: if the catalog can't be discovered we pass the value through and let agy resolve it, so a discovery hiccup never blocks a run. Pure antigravityModelError() is unit-tested (valid / unknown / near-miss / empty-model / empty-catalog); verified live against real agy 1.0.6. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-08 15:32:53 +08:00
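The fail-open validation described in this commit can be sketched roughly as follows. This is a minimal illustration, not the repository's code: `modelError` is a hypothetical stand-in for the `antigravityModelError` helper named above (its real signature is not shown in the message), and the catalog entries are made-up display names.

```go
package main

import (
	"fmt"
	"strings"
)

// modelError is a hypothetical stand-in for antigravityModelError.
// It returns nil when the run may proceed, and an error only when the model
// is non-empty, the catalog is known, and the model is not listed in it.
func modelError(model string, catalog []string) error {
	if model == "" {
		return nil // no explicit model requested: nothing to validate
	}
	if len(catalog) == 0 {
		return nil // fail-open: discovery failed, pass the value through
	}
	for _, name := range catalog {
		if name == model { // exact display-string match, no normalization
			return nil
		}
	}
	return fmt.Errorf("unknown model %q; available: %s",
		model, strings.Join(catalog, ", "))
}

func main() {
	catalog := []string{"Model A (High)", "Model B (Low)"} // made-up names

	fmt.Println(modelError("Model A (High)", catalog)) // listed: no error
	fmt.Println(modelError("model a (high)", catalog)) // near-miss: rejected
	fmt.Println(modelError("anything", nil))           // empty catalog: fail-open
}
```

Keeping the check pure (no CLI call inside it) is what makes the valid / unknown / near-miss / empty-model / empty-catalog cases easy to unit-test.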
Naiyuan Qing	a02b3dfb4a	feat(issues): move agent live signal into the issue-detail header (#3879 ) * feat(issues): move agent live signal into the issue-detail header Replace the in-body sticky "agent is working" card (AgentLiveCard) with a compact chip in the issue-detail header, so the live signal sits in one fixed place and never competes with sticky banners in the content column. - New IssueAgentHeaderChip: avatar(s) + live-ticking blue elapsed time; click opens a popover listing every active task. - Popover reuses ExecutionLogSection's ActiveTaskRow (now exported) so the popover and the right panel are literally the same row — no duplication. - PopoverContent gains an optional keepMounted so the row's confirm dialog survives the popover closing on Stop. - Running rows in ExecutionLogSection drop the blue spinner for a live-ticking blue elapsed timer (panel + popover share this). - Source the chip from the workspace agent-task snapshot filtered by issue (same source as board/list indicators, zero extra network); delete the old AgentLiveCard + its test and its heavy per-issue WS machinery. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * feat(issues): live event count on the agent chip + execution-log rows Show a live "N events (elapsed)" on running agents, consistent across the header chip, its popover rows, and the right-panel execution log. - Read the shared per-task message cache (taskMessagesOptions, kept live by useRealtimeSync's global task:message handler) instead of a bespoke subscription — one source of truth, deduped across chip / popover / panel / transcript, no extra WS wiring. - Extract <RunningStat> (event count in info-blue + elapsed in muted parens) so all surfaces render the running stat identically. - ExecutionLogSection running rows now show the same "N events (elapsed)"; the transcript opened from them streams live from the shared cache. - Chip: single running shows events (elapsed); multiple shows "N working". 
- i18n: add agent_live.event_count (4 locales). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 14:51:20 +08:00
Naiyuan Qing	ef8dabd35d	feat(lark): split agent integration UI into inspector status + tab actions (#3830) The agent Lark binding surfaced the same connect/disconnect affordance in two places on one page — the left inspector's INTEGRATIONS section and the right pane's Integrations tab both rendered the full LarkAgentBindButton, so the destructive Disconnect lived in two spots. Split by role: - Inspector (left): a compact, read-only status row (green dot + region chip + "Connected to Lark") that deep-links into the Integrations tab. New LarkAgentBotStatusRow, opted into via LarkAgentBindButton's onShowConnectedDetails prop. - Integrations tab (right): keeps the full badge, now the single home for Manage / Disconnect. The badge itself is reworked to a two-row layout — status (left) + soft `destructive`-variant Disconnect (right) on row 1, "Manage in Lark" demoted to a muted secondary link on row 2. Cross-sibling navigation goes through a one-shot navIntent channel on AgentOverviewPane that routes via requestTabChange, so the unsaved-changes guard still fires when jumping from the inspector. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-05 17:35:52 +08:00
Bohan Jiang	d98fc85088	feat(agents): Integrations tab with Lark Bot bind entry + Lark Bot docs (MUL-2988) (#3751 ) * feat(agents): add Integrations tab with Lark Bot bind entry The agent detail page now has an Integrations tab alongside the inspector's Integrations section. It reuses the shared LarkAgentBindButton so the scan-to-bind / already-connected logic stays single-sourced, and adds the not-configured / coming-soon / members-only states the sidebar has no room for. The tab only appears once the deployment has Lark configured. MUL-2988 Co-authored-by: multica-agent <github@multica.ai> * docs: add Lark Bot integration guide Covers binding a Multica agent to a Lark Bot (scan-to-install), using it (DM / @-mention / /issue), management, permissions, and self-host setup. Added in all four locales under the Integrations nav section. MUL-2988 Co-authored-by: multica-agent <github@multica.ai> * fix(agents): show bound Lark state when install_supported is false install_supported governs only whether NEW scan-installs can complete; already-installed bots stay manageable when the transport is unwired (server/internal/handler/lark.go). LarkAgentBindButton checked the install_supported gate before the existing-installation check, so a bound agent on such a deployment showed 'coming soon' / nothing instead of 'Connected + Manage in Lark'. Reorder the guard (existing active install → badge, before the install_supported gate) and mirror it in the new Integrations tab. Adds regression tests for both surfaces. MUL-2988 Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-06-04 15:48:22 +08:00
Bohan Jiang	8c98940b79	Lark Bot integration MVP: migration + service boundary (MUL-2671) (#3277) * feat(db): add Lark integration migration (MUL-2671) Introduces seven tables for the Feishu (Lark) Bot integration MVP — per-agent PersonalAgent installations, user/chat bindings, inbound dedup + non-content drop audit, outbound card mapping, and short-lived single-use member binding tokens. Schema notes: - chat_session schema unchanged; Lark routes through a separate binding table rather than adding a metadata JSONB column. - Outbound card mapping is task/message scoped so multiple runs on the same session can't stomp each other's cards. - lark_inbound_audit stores routing / identity / drop_reason ONLY, never message body — the audit channel for unbound users and group messages that don't address the Bot. - app_secret stores ciphertext (encryption helper lands in a follow-up commit on this branch); DB never sees plaintext. Co-authored-by: multica-agent <github@multica.ai> * feat(util): add secretbox AES-256-GCM helper for at-rest secrets First consumer is lark_installation.app_secret (MUL-2671 §4.4), but the helper is intentionally generic — future per-tenant secrets that must not appear in a DB dump can reuse it. Construction: AES-256-GCM with a per-message random nonce, providing authenticated encryption. Tampered ciphertext fails Open instead of silently decrypting to garbage. Master key loaded from a base64 env var via LoadKey; key rotation is not in scope yet. Co-authored-by: multica-agent <github@multica.ai> * refactor(issues): extract IssueService.Create as single create entry (MUL-2671) Establishes the service-layer boundary mandated by Elon's second review of MUL-2671 §4.8: issue creation no longer lives inside the HTTP handler. Both the HTTP POST /issues handler and the future Lark /issue command call into service.IssueService.Create, so duplicate guard, issue numbering, attachment linking, broadcast, analytics, and agent/squad enqueue stay aligned. 
Handler responsibilities shrink to parsing the HTTP request, doing actor resolution / validation (transport-specific), and converting service results into the IssueResponse + 201. The transaction-wrapped core, attachment link, event publish, analytics capture, and agent/squad enqueue all move into service.IssueService.Create. A BroadcastPayload callback on the service keeps the WS broadcast shape (the full IssueResponse) without forcing the service to depend on handler-layer response types. Co-authored-by: multica-agent <github@multica.ai> * feat(integrations): add Lark package skeleton (MUL-2671) Establishes the architectural boundaries Elon's second review mandated as first-PR blockers without dragging in OAuth, WebSocket, or card-patching code (those land in follow-up PRs): - ChatSessionService interface — channel-aware chat-session entry point for Lark, deliberately separate from the HTTP SendChatMessage handler. The HTTP handler's single-creator guard (creator_id == request user_id) is correct for the browser client but rejects group chat_sessions by construction; Lark needs its own service. - AuditLogger interface — the only path for recording dropped events. Its signature deliberately omits message body, enforcing the drop-audit policy (MUL-2671 §4.7) at the type level: unbound users and non-addressed group messages can't accidentally end up in chat_session. - Typed IDs (OpenID, ChatID) prevent UUIDs from being conflated with Lark-side identifiers at compile time. - DropReason constants align dashboard/audit queries across callers. Co-authored-by: multica-agent <github@multica.ai> * refactor(issues): move parent/project workspace check into IssueService (MUL-2671) Parent existence and project workspace membership now live inside IssueService.Create, inside the same transaction as the duplicate guard and counter increment. 
The HTTP handler stops re-implementing the lookup; every future create entry (Lark /issue, MCP, API keys) inherits the same boundary without copy-pasting the SQL. Adds two error sentinels (ErrParentIssueNotFound, ErrProjectNotFound) so transports can translate to their own error shapes. Handler-level cross-workspace tests guard the boundary against future regressions. Co-authored-by: multica-agent <github@multica.ai> * fix(db): harden Lark migration safety foundation — TTL cap + workspace FK (MUL-2671) Two storage-layer hardenings that move the must-fix line off "the app layer enforces it" and onto the schema itself, so future write paths or hand-inserted rows cannot regress the invariants. 1) lark_binding_token TTL cap. The DB CHECK was 1 hour as defense-in-depth while the app constant was 15 minutes; the CHECK now matches the product cap (15 minutes). Application constant docstring updated to reflect that storage enforces the same bound. 2) lark_user_binding workspace membership. The table previously only FK'd to workspace / user / installation independently, so a binding could exist for a user no longer in the workspace, or claim a workspace different from its installation's. Two composite FKs close the gap structurally: * (installation_id, workspace_id) → lark_installation(id, workspace_id) — guarantees a binding's workspace_id always matches its installation's workspace_id. A new UNIQUE (id, workspace_id) on lark_installation is added as the FK target. * (workspace_id, multica_user_id) → member(workspace_id, user_id) with ON DELETE CASCADE — when a user is removed from the workspace, the binding cascades away in the same transaction. There is no longer a path where lark_user_binding outlives workspace membership. These two FKs are the schema-level proof for §4.3's "unbound or non-workspace members cannot leak content into chat_session" invariant. 
Co-authored-by: multica-agent <github@multica.ai> * feat(integrations/lark): inbound services + /issue dispatcher (MUL-2671) Lands the inbound service layer for the Lark Bot MVP, sitting on top of the migration + service-boundary scaffold from the previous commits. What ships: - sqlc queries for all seven lark_* tables (idempotent dedup insert, CAS WS-lease, single-use binding-token consume, etc.) plus GetMostRecentUserChatMessage for the /issue fallback. - AuditLogger backed by lark_inbound_audit; signature deliberately body-free so callers cannot leak content into the drop log. - ChatSessionService: find-or-create chat_session via the binding table (winner-takes-all on the UNIQUE race), append-with-dedup, /issue parser, "previous user message" fallback for bare `/issue` invocation. - Dispatcher orchestrates the inbound pipeline in one place: installation routing → group-mention filter → identity check → ensure session → append+dedup → /issue → enqueue chat task. Group sessions use the installer as creator (stable workspace identity); p2p uses the sender. Agent-offline path falls through with OutcomeAgentOffline so the WS adapter can reply with the offline notice from §4.6. - BindingTokenService: random URL-safe token, SHA-256 stored hash, 15-min TTL pinned at the application AND the DB CHECK; Redeem returns the same opaque error for all rejection cases (no timing oracle on replay). - Unit tests for the parser (13 cases), dispatcher (8 cases via fake Queries/Chat/Audit/IssueCreator/Enqueuer), and binding-token hash/entropy. Real-DB integration tests for OAuth + token redeem land alongside the HTTP handlers in the next commit. Out of scope for this commit (next ones on the same feature branch): OAuth callback, HTTP routes, WebSocket hub, outbound card patcher, frontend. 
Co-authored-by: multica-agent <github@multica.ai> * feat(integrations/lark): installation HTTP surface + secretbox-gated wiring (MUL-2671) Lands the HTTP boundary on top of the inbound services from the previous commit. What ships: - InstallationService.Upsert: the only path that writes lark_installation. Encrypts app_secret with the secretbox passed in at construction time; refuses to fall back to plaintext storage (returns an error from the constructor if no Box is supplied), so a misconfigured dev environment cannot accidentally land a row with cleartext credentials. Revoke flips status without DELETE so audit trail survives. - HTTP handlers under /api/workspaces/{id}/lark/: * GET /installations — member-visible (Integrations tab renders for non-admins). Soft 200 with empty list + configured:false when MULTICA_LARK_SECRET_KEY is unset, so the tab does not error on self-host that has not opted in. * POST /installations — admin-only; 503 when not configured. Re-validates agent_id ∈ workspace before accepting credentials so a cross-workspace agent UUID is rejected. * DELETE /installations/{id} — admin-only; workspace-scoped lookup so one workspace cannot revoke another's installation by UUID guess. - POST /api/lark/binding/redeem (user-scoped, no workspace context): the only path that mints a lark_user_binding row from user action. Redeemer identity comes from the session, not the token, so a stolen link cannot bind an open_id to an attacker's Multica user. The composite FK on lark_user_binding cascades the binding away if the user is not (or no longer) a workspace member, so a non-member who steals the link gets 403 at the DB layer. - Two new event-bus types in protocol.events: EventLarkInstallationCreated, EventLarkInstallationRevoked. - Router wiring: MULTICA_LARK_SECRET_KEY drives a conditional initialization of h.LarkInstallations + h.LarkBindingTokens. When unset, the integration disables itself with an INFO log and the rest of the server boots normally. 
- Handler tests cover all four not-configured short-circuits. Happy-path integration tests (real DB, full create→list→revoke cycle and token mint→redeem) ship alongside the WS hub PR. Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): close binding-token rebind & typed task errors (MUL-2671) Two must-fixes from PR review on HEAD `87ad15e1`: 1. Binding-token redeem could be used to grab an already-bound Lark open_id. Two changes harden the path: - lark.sql `CreateLarkUserBinding` now gates ON CONFLICT DO UPDATE on `multica_user_id = EXCLUDED.multica_user_id`, so a cross-user rebind via a second valid token returns zero rows instead of silently switching ownership. - `BindingTokenService.RedeemAndBind` consumes the token and writes the binding row inside one transaction. A failed bind no longer burns the token; a successful bind never leaves a consumed-but-unused token. Distinct typed errors: ErrBindingTokenInvalid (410), ErrBindingAlreadyAssigned (409), ErrBindingNotWorkspaceMember (403). The handler maps each to its own status code. 2. Dispatcher collapsed every `EnqueueChatTask` error to `OutcomeAgentOffline`, hiding infra failure and misusing the "offline" label for cases (e.g. archived agent) where it doesn't fit. Now: - `service.EnqueueChatTask` returns `ErrChatTaskAgentNoRuntime` and `ErrChatTaskAgentArchived` as sentinel errors; DB / load / insert failures stay wrapped as ordinary errors. - Dispatcher uses `errors.Is` to map only the productizable cases (`OutcomeAgentOffline`, new `OutcomeAgentArchived`); any other error is returned to the WS adapter so it can retry or page instead of disguising the outage as an offline card. A daemon that's merely disconnected is still NOT an error — as long as `agent.runtime_id` is set the chat task enqueues and waits for the daemon to claim it on next online (returns `OutcomeIngested`). 
Co-authored-by: multica-agent <github@multica.ai> * ci: re-trigger workflow on lark MVP must-fix HEAD Co-authored-by: multica-agent <github@multica.ai> * ci: re-trigger workflow on lark MVP must-fix HEAD (retry) Co-authored-by: multica-agent <github@multica.ai> * test(integrations/lark): guard binding-token sentinel contract (MUL-2671) Two unit tests that document and protect the must-fix invariants without requiring a DB: 1. TestRedeemAndBindRequiresTxStarter — if a future refactor wires up BindingTokenService without a TxStarter, RedeemAndBind must fail fast with a clear error rather than nil-panic on Begin. The atomicity contract (consume + bind commit together) depends on that transaction existing. 2. TestBindingErrorSentinelsAreDistinct — the HTTP handler maps ErrBindingTokenInvalid → 410, ErrBindingAlreadyAssigned → 409, ErrBindingNotWorkspaceMember → 403. Accidentally aliasing them (e.g. var ErrBindingAlreadyAssigned = ErrBindingTokenInvalid) would silently regress the response codes without any other test catching it. Co-authored-by: multica-agent <github@multica.ai> * feat(integrations/lark): WS hub orchestrator + outbound card patcher (MUL-2671) The hub owns one supervisor goroutine per active installation. Each supervisor acquires the WS lease via the existing CAS query, runs an EventConnector (interface — real Lark wire protocol lands in a follow-up behind it), renews the lease on a tighter cadence than the TTL, and backs off (with jitter) on connector failure. Lease loss tears the connector down cleanly; revocation is reaped on the next sweep. Per-process node id satisfies §4.4 multi-replica safety: at most one Hub globally holds the lease for any installation. The patcher subscribes to task / chat-done events on the existing events.Bus and keeps the per-task Lark interactive card in sync (thinking → streaming → final | error). 
Card binding is per-task as required by §4.5; throttled patches via an in-memory last-patched map; final / error transitions bypass the throttle so the user always sees the terminal state. The Renderer is plug-replaceable so the product card template can evolve without touching transport. The APIClient interface centralizes the Lark Open Platform surface this package needs (send card, patch card, send binding prompt, exchange OAuth code). The default stubAPIClient returns ErrAPIClientNotConfigured for every transport call so a misconfigured deployment fails loudly instead of dropping cards silently. Real implementation lands in a follow-up; OAuth callback + frontend entries land in the next commits on this branch. Co-authored-by: multica-agent <github@multica.ai> * feat(integrations/lark): OAuth install start / callback (MUL-2671) OAuthService builds a signed-state Lark authorization URL the frontend can render as a QR (or open directly), then on callback verifies the HMAC-protected state, exchanges the OAuth code for installation credentials via APIClient.ExchangeOAuthCode, and persists the row via InstallationService.Upsert (which keeps app_secret encryption inside a single chokepoint). State token format: workspaceID.agentID.initiatorID.expiresUnix.nonce.sig — HMAC-SHA256 over the first five fields with a deployment-level secret. TTL defaults to 10 minutes (covered by tests). Three failure modes (invalid state / expired state / missing code) map to typed errors so the HTTP handler can emit a single lark_error= query param the frontend uses to pick copy. Both endpoints degrade cleanly: the at-rest key gate (already in place) returns 503 from /install/start when the InstallationService is nil, and the OAuth gate (MULTICA_LARK_OAUTH_APP_ID / _SECRET / _REDIRECT_URI / _STATE_SECRET) returns configured:false from /install/start so the frontend can render "configure manually instead" without an error banner. 
/install/callback always finishes with a redirect to /settings?tab=lark carrying either lark_installed=1 or lark_error=<code>. Tests cover signed-URL shape, missing-config rejection, tampered state, expired state, propagated exchange error, and the no-config redirect path on the HTTP handler. Co-authored-by: multica-agent <github@multica.ai> * feat(views/lark): settings tab + agent bind button + /lark/bind redemption page (MUL-2671) Adds the user-facing Lark surface across the shared packages: - packages/core/types/lark.ts — wire shapes that mirror server/internal/handler/lark.go. Optional fields default to undefined so older desktop builds keep parsing if the server adds new keys (CLAUDE.md → API Response Compatibility). - packages/core/lark/{queries,index}.ts — TanStack Query options keyed by workspace id; realtime sync invalidates `installations(wsId)` on `lark_installation:` events. - packages/core/api/client.ts — listLarkInstallations, getLarkInstallURL, deleteLarkInstallation, redeemLarkBindingToken. - packages/views/settings/components/lark-tab.tsx — Settings → Lark panel. Listing is member-visible (matches backend); disconnect is admin-only. Empty state points users at the per-Agent bind entry, matching the (workspace_id, agent_id) UNIQUE: there is no "pick an agent" UI here because the bind URL is per-agent. - LarkAgentBindButton (same file) is the per-Agent CTA the Agent detail page imports. Opens the OAuth URL in a new tab; the callback bounces back to /settings?tab=lark with a query param the panel reads for inline confirmation copy. - packages/views/lark/bind-page.tsx — the Bot's "you need to bind" destination. Requires session before redeeming, distinguishes the 410/409/403 backend responses into distinct copy. - apps/web/app/lark/bind/page.tsx — Next.js route wrapping the shared bind page in a Suspense boundary (Next 15 useSearchParams rule). 
i18n: all user-facing strings land in en/zh-Hans, settings tab nav includes a Sparkles-iconed Lark entry, bind-page copy lives under common.lark_bind so it works pre-workspace-context too. typecheck + lint clean. Co-authored-by: multica-agent <github@multica.ai> chore(integrations/lark): wire outbound Patcher into server bootstrap (MUL-2671) Constructs the Patcher next to the existing Installation/BindingToken wiring in router.go and Register()s it on the event bus. With the stub APIClient any actual transport call surfaces ErrAPIClientNotConfigured; once the real Lark client lands, swap NewStubAPIClient for the real implementation here without touching the Patcher's subscription logic. doc.go updated to reflect everything the package now contains (Hub, Patcher, OAuthService, APIClient interface). The Hub itself is NOT booted here yet — it needs an EventConnector implementation for the Lark long-connection wire protocol, which lands in a follow-up; the orchestrator code and its unit tests are in place so that follow-up can focus on the WS protocol rather than lifecycle plumbing. Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): address 5 must-fix items from Elon's second review (MUL-2671) - Hub: renewer cancels run ctx on lease loss so the connector exits even if its wire I/O is blocked, keeping the §4.4 ownership invariant intact under lease theft. - Hub: EventEmitter returns (DispatchResult, error) so the real connector can post the matching Lark-side card (needs_binding, agent_offline, agent_archived) and react to infra failures instead of silently logging at the seam. - Dispatcher: top-level message_id dedup runs before group filter and identity check, so a reconnect storm cannot re-fire binding prompts or re-spam not_addressed_in_group audit rows; the in-AppendUserMessage dedup is removed since the table-level UNIQUE is the ultimate backstop. 
- OAuth: HandleCallback auto-binds the installer via the new InstallerBinder seam (BindingTokenService implements it), so the §2.1 "scan to bind, you're done" promise holds end-to-end. validateExchangeResult now requires installer open_id; new error reason codes wired through the callback redirect. - Frontend / handler: install_supported listing field + StartLarkInstall short-circuit on stub APIClient hide install entry points (Settings tab + per-agent button) while no real Lark HTTP client is wired, so users do not land in an OAuth flow that fails at exchange. Includes tests for each fix (lease-loss cancel, emit error propagation, dedup ordering, OAuth installer-bind contract, stub-client install gate) and i18n strings for the new preview state. Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): two-phase dedup so infra failures do not swallow messages (MUL-2671) The pre-fix top-level dedup wrote the lark_inbound_message_dedup row before EnsureChatSession / AppendUserMessage. An infra error in either step left the row in place and a WS-adapter retry was mis-classified as a duplicate, so the user's Lark message was permanently lost without ever landing in chat_session. Make dedup two-phase: - ClaimLarkInboundDedup acquires an in-flight claim (processed_at NULL). Stale claims older than 60 s are re-takeable so a process crash does not strand the message_id. - MarkLarkInboundDedupProcessed flips processed_at on durable success (audit row OR chat_message + session touch). - ReleaseLarkInboundDedup deletes the in-flight row on infra failure before any durable side effect, so the retry can re-claim immediately. Dispatcher.Handle now finalizes the claim exactly once based on whether the inner pipeline reached a durable outcome — chat_message commit being the transition point (errors past it Mark, errors before it Release). 
Regression tests cover the two failure variants Elon flagged plus the inverse invariants (durable-error Marks, drops Mark, in-flight replays drop, stale claims re-claim). Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): owner-fence dedup claim to close the double-write windows (MUL-2671) The two-phase Claim/Mark/Release fix from the previous commit closed the "infra error swallows a replay" gap but left two windows that could still write a chat_message twice for the same Lark message_id: 1. Stale-reclaim race. Worker A claims at t=0, runs slowly past the 60 s staleness TTL but is still alive. Worker B sees the row as stale and re-takes the claim. A reaches AppendUserMessage and commits a second chat_message. 2. Mark window. Worker A commits chat_message but the post-pipeline MarkLarkInboundDedupProcessed fails (DB hiccup) or the process crashes before it runs. 60 s later a retry treats the in-flight row as stale, re-claims it, and writes a second chat_message. Close both with owner fencing + same-tx Mark: - lark_inbound_message_dedup now carries a `claim_token` UUID; ClaimLarkInboundDedup mints a fresh one on insert and on stale re-take, so a reclaim ROTATES the token. - MarkLarkInboundDedupProcessed and ReleaseLarkInboundDedup are fenced on (message_id, claim_token, processed_at IS NULL) and return rowsAffected. Zero means our token is no longer live, and the caller treats it as a no-op (not an error). - AppendUserMessage invokes MarkLarkInboundDedupProcessed INSIDE its chat_message+session tx (qtx). If the token has been rotated by a concurrent reclaim, the Mark matches zero rows and the method returns ErrClaimLost; the deferred Rollback unwinds the chat_message insert, so the other holder is the sole writer. The durable write and the Mark therefore commit (or roll back) atomically — there is no "committed but not yet Marked" window for a crash or retry to exploit. 
Dispatcher.processClaimed now returns a tri-state dedupFinalize directive (none / mark / release): finalizeNone for the in-tx Mark path (and ErrClaimLost), finalizeMark for audit-drop branches and the defensive post-Append-success fallback, finalizeRelease for pre-durable infra errors. ErrClaimLost is translated into OutcomeDropped + DropReasonDuplicate at the Handle boundary, matching what the WS adapter expects for a "another worker is the writer" outcome. Regression tests: - TestDispatcher_StaleReclaimRaceDoesNotDoubleWrite injects worker B's reclaim via a beforeAppend hook so the claim_token rotates between Claim and AppendUserMessage. Asserts worker A's AppendUserMessage returns ErrClaimLost (no chat_message committed), the dispatcher surfaces a duplicate drop, the token rotated to a value distinct from A's original, and a follow-up replay still duplicate-drops. - TestDispatcher_InTxMarkPreventsPostCommitReclaim verifies the "Mark window" case is unreachable: a successful in-tx Mark produces exactly one Mark call (no post-finalize duplicate), the row is terminal, and a retry with dedupReclaim=true still duplicate-drops without re-rotating the token. - TestDispatcher_InTxMarkSucceedsAndSkipsPostFinalize pins the positive contract: DedupMarked=true must make applyFinalize a no-op (no extra Mark, no Release). fakeQueries gains a fakeDedupRow model carrying (processed, token, rotations) so the test seam matches production's UPDATE-with-WHERE semantics; fakeChat gains a beforeAppend hook to inject race timing. go test ./... and go vet ./... pass. Co-authored-by: multica-agent <github@multica.ai> * feat(integrations/lark): real Lark HTTP APIClient for IM v1 send/patch (MUL-2671) Lands the production Lark Open Platform HTTP APIClient that replaces the stub for outbound transport. The patcher's "thinking → streaming → final | error" card lifecycle and the dispatcher's binding-prompt card both now reach Lark for real once MULTICA_LARK_HTTP_ENABLED=true. 
Scope of this stage: - tenant_access_token retrieval via /open-apis/auth/v3/tenant_access_token/internal, cached in-process per app_id with a 60s safety margin against Lark's `expire` value. Sub-2-minute expires are clamped to 120s so we never cache an entry that's already past its safe window. - SendInteractiveCard: POST /open-apis/im/v1/messages?receive_id_type=chat_id returning the Lark message_id the Patcher persists in lark_outbound_card_message for later patches. - PatchInteractiveCard: PATCH /open-apis/im/v1/messages/:id with the full re-rendered card body (Lark's update endpoint replaces, not deep-merges). - SendBindingPromptCard: open_id-targeted interactive card with a primary "去绑定" CTA pointing at the redemption URL. Template is co-located with the transport so the dispatcher never has to know about Lark's card schema. - Token-error invalidation: Lark codes 99991663 (expired) / 99991664 (invalid) drop the cached token so the next call refreshes from /tenant_access_token/internal instead of looping on a stale entry. Out of scope (deferred to follow-up stages): - ExchangeOAuthCode stays unimplemented behind ErrAPIClientNotConfigured. The PersonalAgent install handshake's response shape (returning per-installation app credentials in a single call) is not yet verified against the production endpoint, and a silent mis-fill of OAuthExchangeResult would corrupt lark_installation rows past validateExchangeResult. Operators continue to use the manual-paste InstallationService path until the OAuth stage lands. - Inbound WS EventConnector — Hub's ConnectorFactory still needs a real wire-protocol implementation. Wiring: - MULTICA_LARK_HTTP_ENABLED=true switches router.go from the stub to the real client. MULTICA_LARK_HTTP_BASE_URL overrides the default open.feishu.cn host (set to open.larksuite.com for the Lark international tenant, or to an httptest URL for integration tests). 
- The OAuth handler now also receives the real client (its ExchangeOAuthCode still surfaces ErrAPIClientNotConfigured, so callback behavior is unchanged until that stage lands). Tests (19 new cases against an httptest.Server fake): - happy path send/patch/binding-prompt round trips, asserting URL query params, body shape, Authorization header - token cache: 3 sends share one /tenant_access_token/internal hit - token refresh after clock-driven expiry - sub-margin expire clamping (10s expire → cached for >= safety margin of wall-clock) - Lark error code surfacing (230001 send, 230002 patch, 10003 auth) - token-expired (99991663) invalidates the cache; caller's retry re-fetches and succeeds - non-2xx HTTP status surfaces "http 500: …" - input validation: missing chat_id short-circuits BEFORE auth round-trip, missing card json / open_id / bind url all fail pre-flight without hitting Lark - ExchangeOAuthCode still returns ErrAPIClientNotConfigured - binding-prompt template carries the BindURL and the localized "去绑定" CTA in valid JSON go build ./..., go vet ./..., and go test ./internal/integrations/lark/... pass. Pre-existing handler/router integration tests that require a real Postgres connection are unaffected by this change. Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): split outbound vs OAuth-install capability + card update_multi (MUL-2671) Address Elon's two must-fix items from the HEAD `a09993b1` review: 1. HTTP outbound and OAuth-install are now distinct APIClient capabilities. The new SupportsOAuthInstall() reports whether the install flow can succeed end-to-end (i.e. ExchangeOAuthCode is implemented); the real httpAPIClient still returns IsConfigured() = true (send / patch / binding prompt work) but SupportsOAuthInstall() = false until the PersonalAgent install-time response shape is pinned. 
Handler-side `install_supported` and StartLarkInstall now gate on SupportsOAuthInstall, so a half-wired client never reveals the scan-to-bind UI. larkOAuthErrorReason also maps ErrAPIClientNotConfigured to a dedicated `oauth_exchange_unimplemented` reason so a raw callback hit no longer masquerades as `internal_error`. 2. defaultRenderer now emits config.update_multi=true on every Kind. Lark refuses to apply PatchInteractiveCard to a card whose initial config doesn't declare it shared/updatable, so the absent flag would make every patch after the first send silently no-op on the wire while the local outbound status row still flipped to streaming/final. Tests cover both halves of each fix: - TestHTTPClient_SupportsOAuthInstall_FalseUntilExchangeLands + TestHTTPClient_StubReportsBothCapabilitiesFalse pin the new capability surface. - TestStartLarkInstall_TransportOnlyClientReportsNotConfigured + TestListLarkInstallations_TransportOnlyClientReportsInstallNotSupported pin the handler gate at exactly the half-wired state. - TestLarkOAuthErrorReason_APIClientNotConfigured pins the mapping for both the bare sentinel and the fmt.Errorf-wrapped form HandleCallback produces. - TestDefaultRendererConfigCarriesUpdateMulti covers every CardKind. - TestHTTPClient_(Send\|Patch)InteractiveCard_DefaultRendererBodyHasUpdateMulti verify the wire body Lark actually receives carries update_multi through both send and patch transport paths. Co-authored-by: multica-agent <github@multica.ai> * feat(integrations/lark): real OAuth code exchange + agent-detail bind entry (MUL-2671) Stages the install side of the MVP critical path on top of the real HTTP outbound work: - httpAPIClient.ExchangeOAuthCode runs the production Lark v2 OAuth flow: POST /authen/v2/oauth/token to swap the authorization code for the installer's open_id, then GET /bot/v3/info under the parent app's tenant_access_token to fetch bot_open_id. 
Result feeds InstallationParams unchanged so OAuthService.HandleCallback's auto-bind step lights up automatically. - HTTPClientConfig gains OAuthAppID/OAuthAppSecret, read from the same MULTICA_LARK_OAUTH_APP_ID/_APP_SECRET env vars the OAuthConfig consumes. SupportsOAuthInstall now mirrors that pair so the install capability gate is honest: outbound transport without OAuth creds reports configured-but-not-install-supported, exactly like before. - Agent detail inspector wires the LarkAgentBindButton in a new Integrations section, viewer-hidden by canEdit. The button still self-hides when SupportsOAuthInstall is false, so a deployment without OAuth creds renders the section empty rather than CTA-broken. - Capability wording cleaned across handler / router / lark-tab to say "OAuth-install capability" instead of "real APIClient wired", and the misleading TransportOnly... test was renamed/refocused on the early-return branch it actually exercises (Elon non-blocking note). Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): identity-only OAuth + atomic bind (MUL-2671) Addresses Elon's round-4 must-fix items on PR #3277: 1. OAuth v2 token → user_info chain now matches Lark's official user-OAuth shape. `httpAPIClient.ExchangeOAuthCode` POSTs /open-apis/authen/v2/oauth/token (RFC 6749: top-level access_token, NO open_id), then GETs /open-apis/authen/v1/user_info with the user_access_token as Bearer to obtain the installer's open_id / union_id. The test fixture now reflects the real wire shape (separate user_info handler; no synthetic open_id in the token response). 2. `OAuthExchangeResult` is identity-only — drops the synthesized shared-parent AppID / AppSecret / BotOpenID return that broke the UNIQUE(app_id) constraint and the dispatcher's per-app_id routing. 
`OAuthService.HandleCallback` no longer Upserts an installation row: it looks up the lark_installation already provisioned via the manual-paste POST /lark/installations route and binds the installer onto it. Two new typed errors — ErrInstallationNotProvisioned and ErrInstallationRevoked — map to `installation_not_provisioned` / `installation_revoked` reasons at the HTTP boundary so the UI can guide the admin. The PersonalAgent install API (which would deliver per-installation bot credentials at scan time) remains a follow-up; until it lands the OAuth flow is identity-binding only and the agent-detail bind button stays hidden on deployments without OAuth env (capability gate unchanged). 3. The installation lookup + installer bind run inside a single DB transaction so a concurrent revoke / re-provision between the read and the binding insert cannot leak a half-applied state. `InstallerBinder.BindInstaller` is renamed to `BindInstallerTx` and accepts the OAuth-service-owned transaction's qtx; the binding_token redemption path is unchanged. 4. `validateExchangeResult` is simplified to require only the installer's open_id; the obsolete ErrExchangeMissingAppID / AppSecret / BotOpenID sentinels are removed (no caller can trip them now). The oauth_test suite is rewritten to use a stub failTxStarter so tests covering state-token verification and exchange-error propagation remain DB-free, while a new TestOAuthCallbackOpensTxAfterValidExchange pins the post-must-fix order (state ok + exchange ok ⇒ Begin runs before any lookup or bind, and a Begin failure aborts cleanly with no bind). Verified locally: - go build ./... / go vet ./... clean - go test ./internal/integrations/lark/... ✓ - go test ./internal/handler -run 'Lark\|Binding\|OAuth' ✓ - go test ./internal/util/secretbox/... ./internal/service/... 
✓ Co-authored-by: multica-agent <github@multica.ai> * feat(integrations/lark): device-flow scan-to-install (MUL-2671) Replaces the manual paste-credentials install path + identity-only OAuth callback (rejected in product review: too many steps before a user sees value) with a true single-step scan-to-install built on Lark's RFC 8628 device-flow registration endpoint (POST accounts.feishu.cn/oauth/v1/app/registration) — the same protocol the official larksuite/oapi-sdk-go/scene/registration package and zarazhangrui/feishu-claude-code-bridge use. User journey: admin clicks "Bind to Lark" on the Agent detail page → QR dialog opens → admin scans in the Lark app on their phone → authorizes the new PersonalAgent → dialog auto-closes with the new installation visible. No app_id / app_secret to copy, no Lark developer console visit, no Multica-side OAuth env to configure. Backend (server/internal/integrations/lark): - registration.go — inline ~280-line RFC 8628 client. Begin posts archetype=PersonalAgent / auth_method=client_secret / request_user_info=open_id; Poll follows the upstream SDK's state machine including the tenant-brand mid-stream domain swap to accounts.larksuite.com when a Lark-international account authorizes. SDK is NOT vendored — one endpoint isn't worth dragging the full oapi-sdk-go + transitive deps. - registration_service.go — owns the in-process session store + background polling goroutine. On success calls APIClient.GetBotInfo (the new IM-side endpoint added below) and writes lark_installation + the installer's lark_user_binding inside one DB transaction so a half-applied install can never land. Stable error_reason codes (expired / access_denied / lark_protocol_error / bot_info_failed / installation_conflict / installer_bind_failed / internal_error) drive the UI copy without parsing prose. 
- client.go / http_client.go — drops ExchangeOAuthCode and SupportsOAuthInstall (no longer applicable: device-flow returns identity alongside credentials in one response); adds GetBotInfo which mints a tenant_access_token from the freshly-minted client_id / client_secret and calls /open-apis/bot/v3/info for the bot_open_id. install_supported now gates on IsConfigured() (real HTTP client wired) instead of a separate OAuth capability. - binding_token.go — absorbs InstallerBindParams / InstallerBinder (previously in oauth.go), retargets the doc-comment from the OAuth caller to the device-flow caller. - Deletes oauth.go + oauth_test.go entirely. Handler & router (server/internal/handler, server/cmd/server): - POST /api/workspaces/{id}/lark/install/begin — opens a new registration session, returns {session_id, qr_code_url, expires_in_seconds, poll_interval_seconds}. Admin-only. - GET /api/workspaces/{id}/lark/install/{sessionId}/status — polling endpoint, returns {status, installation_id?, error_reason?, error_message?}. Workspace-scoped lookup so a stolen session_id cannot be polled from another workspace. Admin-only. - Removes POST /lark/installations (paste form), GET /lark/install/start (OAuth-redirect entry), and GET /api/lark/install/callback (OAuth redirect target). - Removes MULTICA_LARK_OAUTH_APP_ID / _APP_SECRET / _REDIRECT_URI / _STATE_SECRET / _AUTHORIZE_URL / _SUCCESS_URL env vars. Self-host operators no longer need a parent Lark app at all. Frontend (packages/core, packages/views): - New types BeginLarkInstallResponse / LarkInstallStatusResponse + matching API methods (beginLarkInstall / getLarkInstallStatus); drops getLarkInstallURL. - LarkAgentBindButton opens LarkInstallDialog instead of a window.open() to Lark's authorize page. The dialog uses react-qr-code (catalog) to render the verification_uri_complete inline as SVG (no external CDN image), polls status at the server-supplied cadence, auto-closes on success, offers "scan again" on terminal failure. 
Per CLAUDE.md "Enum drift downgrades, not crashes", the error_reason switch has a default fallback so an older desktop build on a newer server still renders the generic failure copy. - Adds the device-flow strings to en + zh-Hans settings.json; removes the obsolete OAuth-not-configured copy. Verified locally: - go build ./... / go vet ./... clean - go test ./internal/integrations/lark/...: all green (existing tests + 15 new registration / GetBotInfo tests) - go test ./internal/handler -run 'Lark|Binding': all green - pnpm typecheck: all 6 packages clean - pnpm lint: 0 errors (15 pre-existing warnings, none in changed files) - pnpm --filter @multica/views test: 859/859 pass Pre-existing failures in server/internal/middleware (column "profile_description" missing from local test DB) reproduce against the parent commit and are unrelated to this change. Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): gate bind CTA to workspace admins, terminate QR polling on 4xx (MUL-2671) Two frontend must-fixes from the PR #3277 second-round review: 1. LarkAgentBindButton now self-hides for non-admin viewers in addition to the existing install_supported check. The agent-detail page mounts the button under `canEdit`, where canEditAgent lets agent owners through even when they are not workspace admins, but the backend gates POST /lark/install/begin and the status poll on owner/admin (router.go:478-487), so the previous behavior shipped a CTA that was guaranteed to 403. The new gate reads workspace role from the same member list the settings tab already uses. 2. The status polling loop now terminates on 404 (session gone: server restarted, multi-instance routing, or in-process GC swept it) and 403/401 (permission revoked mid-session). Previously every error path scheduled another setTimeout, which trapped the user on a stale QR forever. 
ApiError gives us the HTTP status verbatim; terminal responses set status=error with stable error_reason codes (session_lost, forbidden) that flow through the existing dialog switch + retry/close affordances. 5xx + network blips still retry. i18n: new install_error_session_lost / install_error_forbidden in en and zh-Hans, with default fallback preserved per the enum-drift rule. Coverage: 6 new vitest cases: admin/owner allow, member deny, unsupported-install deny, and the two terminal-error polling paths using fake timers to assert the loop stops scheduling. Also clears a handful of stale OAuth/manual-install doc comments flagged in the review (non-blocker cleanup): doc.go's §10 now points at RegistrationService, installation.go's input-shape doc loses the OAuth-callback half, and client.go's stubAPIClient comments no longer reference OAuth callbacks. Co-authored-by: multica-agent <github@multica.ai> * docs(integrations/lark): describe gate as device-flow install in agent-detail integrations comment (MUL-2671) The comment block above the agent-detail Integrations section still described the capability gate as 'server-side OAuth-install'. The OAuth path is gone (install is now device-flow per RFC 8628), so the comment now reads 'server-side device-flow install capability gate'. Pure comment change; behavior is unchanged. Cleans up the nit Elon called out in the PR #3277 second-round review (MUL-2671). Co-authored-by: multica-agent <github@multica.ai> * feat(integrations/lark): wire inbound pipeline + WS Hub at boot (MUL-2671) Stage 3.a of MUL-2671. Hub class, Dispatcher, ChatSessionService and AuditLogger have all been implemented and tested in prior PRs but none of them was constructed at boot, so the in-process plumbing was never exercised end-to-end. 
This change wires them together behind the same `MULTICA_LARK_SECRET_KEY` gate that already gates InstallationService / RegistrationService, and starts the Hub under the existing `sweepCtx` so it winds down alongside the other long-running workers after HTTP drain. The real long-conn EventConnector is still pending; the factory hands every supervisor a shared NoopConnector that holds the lease and emits nothing. That lets staging exercise the lease / supervisor / shutdown lifecycle against real DB rows without committing to the Lark wire protocol implementation. Swapping in the real connector is a single line change in the same router block; the Dispatcher / ChatSessionService / Hub seams stay frozen. ## Why a noop placeholder, not a stub-or-skip The Hub's value is mostly its lifecycle: §4.4 ownership lease, LeaseRenewInterval / LeaseTTL, supervisor reap on revoke, clean release on shutdown. None of that runs unless the Hub is actually started. Holding off until the real connector lands means the next PR has to debut both pieces simultaneously; wiring the supervisor loop first lets the real connector PR be a focused, reviewable swap. ## Changes - `internal/integrations/lark/noop_connector.go` — `NoopConnector` implementing `EventConnector`: blocks on ctx until the Hub cancels (lease loss / shutdown / revoke), emits no events, logs on enter/exit so operators see exactly which installation the supervisor is holding the lease for. - `internal/integrations/lark/noop_connector_test.go` — verifies the connector blocks until ctx cancel, returns nil on clean exit, never invokes the emit callback, and the factory shares a single connector instance across installations. - `internal/handler/handler.go` — new `LarkHub lark.Hub` field on `Handler`. Nil when the Lark integration is disabled. 
- `cmd/server/router.go` — inside the existing Lark wiring block, construct `AuditLogger`, `ChatSessionService` (with `pgxpool.Pool` for the in-tx dedup Mark), `Dispatcher` (wiring `h.IssueService` and `h.TaskService` so `/issue`-created issues share counter / duplicate guard / project boundary / broadcast / analytics with the rest of the product), and the `Hub` with the `NoopConnectorFactory`. `NewRouterWithOptions` now returns `(chi.Router, handler.Handler)` so main.go can drive Hub lifecycle; `NewRouter` discards the handler. - `cmd/server/main.go` — start the Hub under `sweepCtx` after the other background workers, and `Wait` on it after HTTP drain + sweep cancel so the lease renewer can issue a final release before exit. Skipped entirely when `h.LarkHub == nil`. ## Test plan - [x] `go build ./...` clean - [x] `go vet ./...` clean - [x] `go test ./internal/integrations/lark/...` (new noop tests + existing hub / dispatcher / chat_service / registration / binding_token / outbound / issue_command suites) — all pass - [x] `go test ./internal/handler -run 'TestLark\|TestRedeemLarkBinding'` pass — handler-side Lark surfaces unchanged - [x] `go test ./internal/service/... ./internal/util/secretbox/...` pass - [x] `pnpm --filter @multica/views exec vitest run settings/components/lark-tab` pass (6/6) — frontend lark surfaces unchanged - [ ] Local broad `go test ./internal/handler/...` still blocked by the pre-existing test DB schema drift Elon flagged in the previous round (`column "metadata" does not exist`, unrelated to this change); CI is the authoritative check. - [ ] Manual end-to-end deferred until the real long-conn EventConnector lands (next stage). MUL-2671 Co-authored-by: multica-agent <github@multica.ai> fix(integrations/lark): bound Hub lease release + shutdown wait (MUL-2671) Lease release used context.Background(); a stalled DB pool could pin shutdown indefinitely. 
Add LeaseReleaseTimeout (5s default) and ShutdownTimeout (15s default) to HubConfig, route releaseLease through a bounded context, and expose WaitWithTimeout for main.go so a wedged supervisor degrades to LeaseTTL expiry on the next replica instead of blocking process exit. Also correct the LarkHub field comment in handler.go: the Hub is wired whenever the at-rest secret key is set, independent of whether the outbound HTTP APIClient is configured. Co-authored-by: multica-agent <github@multica.ai> * feat(integrations/lark): real WS long-conn connector + ctx-cancel-breaks-read (MUL-2671) Replaces NoopConnectorFactory with a production EventConnector that opens Lark's event-subscription WebSocket. Gated behind MULTICA_LARK_WS_ENABLED so staging boots stay on the noop path until operators opt in, and falls back to noop with a warning when the WS flag is set without MULTICA_LARK_HTTP_ENABLED (the real connector needs the cached tenant_access_token). Why this connector exists separately from the Hub: gorilla/websocket ReadMessage blocks on the underlying TCP socket and does not observe context. The watchdog goroutine inside WSLongConnConnector.Run closes the conn the moment ctx fires, so lease loss / shutdown breaks the blocking read in bounded time — exactly the invariant Hub renewLeaseUntil's runCancel depends on for the "at most one active WS per installation across replicas" guarantee. Tests cover this explicitly (TestWSConnectorRunReturnsOnCtxCancelEvenWhenReadIsBlocked). The Lark wire surface is split into three swappable seams so the transport layer stays tested in isolation: - EndpointFetcher (POST /event-subscription/v1/connection_token) resolves a one-shot wss URL per Run. No caching — replaying a one-shot token would look like a Lark outage. - FrameDecoder turns one raw JSON envelope into an InboundMessage or a "control / heartbeat / drop" verdict. Decoder errors log + drop the frame; they do NOT tear down the connection. 
- CredentialsProvider wraps InstallationService.DecryptAppSecret so plaintext app_secret lives in memory only during a Run. Also fixes the handler.go LarkHub comment: it still said "joins on Wait during graceful shutdown" but main.go has used WaitWithTimeout (bounded wait) for several commits. Comment now matches. Co-authored-by: multica-agent <github@multica.ai> * feat(integrations/lark): align WS to official binary Frame protocol + DispatchResult outbound replies (MUL-2671) Two must-fix items from Elon's review of PR #3277: 1. WS protocol layer rewritten to match the official Lark Go SDK (`larksuite/oapi-sdk-go/v3/ws`): - Bootstrap is `POST /callback/ws/endpoint` with AppID/AppSecret in the body (no tenant_access_token bearer). Response carries wss URL + ClientConfig (PingInterval / ReconnectInterval / ReconnectNonce / ReconnectCount). - `service_id` is parsed from the wss URL query and used as Frame.Service on every outbound frame. - Wire envelope is the binary protobuf `pbbp2.Frame` (hand-rolled via protowire to avoid pulling the whole SDK in, byte-identical field tags). JSON payloads are nested inside Frame.Payload. - Inbound data frames are ACKed with a `Response{code:200,...}` JSON payload that reuses the inbound headers; infra failures produce code=500 so Lark retries. - Ping is the app-layer binary `NewPingFrame(serviceID)` at the server-supplied cadence; WebSocket protocol PING is removed (Lark ignores it). Server-initiated pings get a pong reply. - ctx-cancel-breaks-read invariant preserved via the watchdog goroutine that closes the conn on ctx.Done; the read loop and ping goroutine serialize their writes through a single mutex. 2. `DispatchResult` outbound replies wired via a new `OutcomeReplier`: - `OutcomeNeedsBinding` mints a one-shot binding token and sends the binding prompt card to the sender's open_id. - `OutcomeAgentOffline` / `OutcomeAgentArchived` push a notice card into the chat with the agent name + Chinese copy matching §4.6. 
- `OutcomeIngested` stays owned by the Patcher; `OutcomeDropped` is silent. - The replier is best-effort: outbound failures are logged and swallowed so a Lark outage cannot stall the inbound pipeline. - Hub installs the noop replier by default; router wires the production `LarkOutcomeReplier` when APIClient.IsConfigured(). PersonalAgent long-conn risk surfaced (open per Feishu docs: `长连接模式仅支持企业自建应用`, i.e. "long-connection mode supports only enterprise self-built apps"). The implementation works for any app archetype; the open question is whether `/callback/ws/endpoint` accepts PersonalAgent credentials in practice. The Lark code+msg is surfaced verbatim from the bootstrap response so an operator running the smoke test sees the exact failure rather than a generic timeout. Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): byte-compat Frame marshal, chunk reassembly, ACK off reply critical path (MUL-2671) Three protocol blockers from Elon's review of `9540008a`: 1. Frame.Marshal is now byte-identical to oapi-sdk-go/v3/ws/pbbp2.Frame: - SeqID/LogID/Service/Method (proto2 req) emit unconditionally even at zero - PayloadEncoding/PayloadType/LogIDNew emit unconditionally per gogo generated MarshalToSizedBuffer (no zero-guard) - Payload uses the SDK's `!= nil` guard (nil omits, []byte{} emits 0-length) - ACK payload JSON matches SDK's NewResponseByCode + json.Marshal output ({"code":N,"headers":null,"data":null}) Golden tests pin exact byte sequences for ping/pong/ACK/full/zero frames; verified against the real SDK pbbp2.pb.go MarshalToSizedBuffer producing identical bytes. 2. Multi-frame events (sum>1) are reassembled via the new chunkAssembler: - 5s sliding TTL (matches SDK combine() cache TTL) - Lazy GC on admit (no separate sweeper goroutine) - Out-of-order seq + duplicate seq idempotent - Partial chunks are NOT ACKed (SDK behaviour: only the final chunk's ACK confirms the whole event so Lark can retry on partial loss) - Connector wires assembler per-Run; state dies with the session 3. 
OutcomeReplier detached from ACK critical path: - HubConfig.ReplyTimeout default 2.5s, strictly under Lark's 3s ACK deadline - handleEvent dispatches synchronously (fast DB path), then spawns the replier under a fresh background ctx with WithTimeout(ReplyTimeout) - Hub.replyWg tracks in-flight replies; Hub.Wait / WaitWithTimeout drain them so shutdown is bounded - Noop replier short-circuits inline (no goroutine cost when outbound APIClient isn't configured) Proof tests: - TestHubScheduleReplyReturnsImmediately: scheduleReply with a 10s slow replier returns in <50ms - TestHubReplyTimeoutCancelsHungReplier: hung replier ctx fires at ReplyTimeout - TestHubWaitDrainsInFlightReplies: Wait blocks until replies finish - TestHubACKNotBlockedByOutboundReply: end-to-end through the connector — data-frame ACK lands within 500ms even when the replier hangs 5s PersonalAgent real-env smoke remains Bohan's decision; this PR closes the technical blockers Elon flagged. Co-authored-by: multica-agent <github@multica.ai> * docs(service/issue): narrow position concurrency claim to create-create (MUL-2671) Elon's review of the merge resolution flagged that the comment on the new NextTopPosition call promised more than the code guarantees: concurrent manual reorder via UpdateIssue(position) does NOT take the workspace row lock that IncrementIssueCounter holds, so a create racing a reorder can still land on the same position. Rewrite the comment to only claim create-create serialization, which is the behaviour the lock actually delivers. No code change. Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): keep device-flow polling on RFC 8628 HTTP 400 (MUL-2671) Lark's device-flow polling endpoint returns HTTP 400 with the JSON body `{"error":"authorization_pending"}` while the user hasn't scanned the QR yet — this is the RFC 8628 spec, and the upstream oapi-sdk-go implements the same handling. 
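The RFC 8628 handling referred to here can be sketched as routing on the JSON body's `error` field rather than on the HTTP status. This is an illustration under stated assumptions, not the repository's `doForm`: `pollVerdict`, `classifyPoll` and the verdict names are hypothetical, and treating unknown error codes as terminal is an assumption of the sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pollVerdict is a hypothetical result type for one device-flow poll.
type pollVerdict int

const (
	granted        pollVerdict = iota // 2xx body with no error field
	keepPolling                       // authorization_pending
	slowDown                          // slow_down: keep polling at a longer interval
	terminal                          // access_denied, expired_token, unknown codes
	transportError                    // body is not JSON (proxy page, gateway timeout)
)

// classifyPoll decodes the body first and routes on its `error` field.
// Only a body that does not parse as JSON falls back to an http_NNN code.
func classifyPoll(status int, body []byte) (pollVerdict, string) {
	var parsed struct {
		Error string `json:"error"`
	}
	if err := json.Unmarshal(body, &parsed); err != nil {
		return transportError, fmt.Sprintf("http_%d", status)
	}
	switch parsed.Error {
	case "":
		if status >= 200 && status < 300 {
			return granted, ""
		}
		// JSON without an error field on a non-2xx status: nothing to route on.
		return transportError, fmt.Sprintf("http_%d", status)
	case "authorization_pending":
		return keepPolling, parsed.Error
	case "slow_down":
		return slowDown, parsed.Error
	default:
		// Assumption: anything else ends the session.
		return terminal, parsed.Error
	}
}
```

Under this routing, an HTTP 400 carrying `authorization_pending` keeps the session alive, while an HTTP 502 with an HTML body yields `http_502`.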
Our previous doForm treated ANY non-2xx as a terminal protocol error, so every install session was killed by the first poll (~5s after begin) and the install dialog appeared silently empty: the frontend received status=error + lark_protocol_error before the user could even read the description. Fix: doForm now decodes the JSON body first; if it parses, the caller (Begin / Poll) routes on the body's `error` field, where the existing switch correctly maps authorization_pending / slow_down to "keep polling" and access_denied / expired_token to terminal failure. Only unparseable bodies (5xx HTML proxy pages, gateway timeouts) still surface as a typed http_NNN RegistrationError. Three regression tests pin the new behaviour: - HTTP 400 + authorization_pending → res.Status="authorization_pending" - HTTP 400 + access_denied → res.Err.Code="access_denied" (terminal) - HTTP 502 + HTML body → http_502 RegistrationError Verified against the live local env: install/begin -> 200, status stays "pending" through the first poll cycle, no longer flips to "error" within seconds. Co-authored-by: multica-agent <github@multica.ai> * fix(views/lark): reset closedRef on every mount so StrictMode double-mount renders QR (MUL-2671) Empty QR dialog body in the dev env: Bohan opened the bind dialog and got an empty white area where the QR should have been — no QR, no "starting" placeholder, no error text. Backend was returning the QR URL correctly; the bug was on the frontend. Root cause: React 19 / Next.js dev StrictMode mounts every component twice (mount → cleanup → mount). The component instance is REUSED across the simulated remount, which means useRef objects are preserved. The dialog's `closedRef` lifecycle: 1. Mount #1: closedRef={current:false}, beginSession() kicked off (HTTP request still in flight) 2. Cleanup runs: closedRef.current=true 3. Mount #2: beginSession() kicked off again, BUT the ref still reads {current:true} from step 2 4. Both promises resolve. 
Both hit the post-await guard `if (closedRef.current) return;` and bail out before setSession(). 5. Result: session stays null forever. Every conditional in the dialog body (beginning/session-pending/success/error) is false → empty body. Fix: reset closedRef.current=false at the START of the effect, not just at component construction. The cleanup-then-mount pair now re-arms the guard so subsequent setSession calls actually land. Regression test wraps the dialog in <StrictMode> and asserts the QR appears within 2s with the correct value — fails closed if anyone removes the reset. Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): drop EventTaskCompleted subscription so the chat reply doesn't get overwritten by "Done." (MUL-2671) Bohan reproduced on the live dev env: agent replies show only a card saying "Done." in Lark, even though Multica's own chat panel has the real "Hello! I'm cc…" reply. Tasks succeed end-to-end, but the user loses the reply on the Lark side. Root cause: TaskService.CompleteTask publishes two events for every chat task IN ORDER: 1. broadcastChatDone(...) → ChatDonePayload{Content: "Hello!..."} 2. broadcastTaskEvent(Completed) → map[string]any{task_id, agent_id,...} (no `content` key) The Patcher subscribed to BOTH and routed each to finalize(). The first patch correctly rendered the reply text, the second patched the same card with an empty payload — chatDoneContent() returned "" and the renderer fell back to "Done." (default empty-body copy). The second patch wins because Lark stores whatever was last applied. Fix: stop subscribing to EventTaskCompleted in the Patcher and remove the corresponding switch arm. EventChatDone is the canonical "agent finished replying" signal for the Lark card path; EventTaskCompleted is still emitted to the bus for other listeners (web UI, analytics, task usage) where the lack of content doesn't matter. 
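The overwrite described above is a last-write-wins effect, which a small model can reproduce. The sketch below is not the Patcher's code: `event`, `card`, `render`, `finalize` and `completeTaskEvents` are hypothetical stand-ins, and the event kinds are plain strings chosen for illustration.

```go
package main

// event is a simplified bus event; content is empty for task_completed,
// mirroring the payload without a `content` key.
type event struct {
	kind    string // "chat_done" or "task_completed"
	content string
}

// card models a message whose body is whatever patch was applied last.
type card struct {
	body    string
	patches int
}

// render falls back to "Done." for an empty body, like the old renderer.
func render(content string) string {
	if content == "" {
		return "Done."
	}
	return content
}

// finalize patches the card once for every event kind the patcher subscribes to.
func finalize(c *card, subscribed map[string]bool, events []event) {
	for _, ev := range events {
		if !subscribed[ev.kind] {
			continue
		}
		c.body = render(ev.content) // later patches replace earlier ones
		c.patches++
	}
}

// completeTaskEvents returns the two events in the order they are published.
func completeTaskEvents(reply string) []event {
	return []event{
		{kind: "chat_done", content: reply},
		{kind: "task_completed"},
	}
}
```

Subscribing to both kinds leaves the card at "Done." after two patches; subscribing to `chat_done` alone leaves the reply text after exactly one patch.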
Regression test TestPatcherIgnoresEventTaskCompletedForChatTasks emits ChatDone followed by TaskCompleted on a streaming card and asserts: exactly one patch, body contains the agent reply, body does NOT contain "Done.". If anyone re-adds the EventTaskCompleted subscription, this fails immediately. Co-authored-by: multica-agent <github@multica.ai> * feat(integrations/lark): chat replies as plain text IM messages, not card chrome (MUL-2671) Bohan reported on the live dev env that even with the agent's reply shown correctly, every message is wrapped in an interactive card with the agent name as the header — it feels like a system notification, not a normal chat reply. He wants the reply to land as a regular Lark text bubble. Changes: - Add APIClient.SendTextMessage backed by Lark's /open-apis/im/v1/messages with msg_type=text. JSON-encodes the {"text": ...} envelope Lark requires so callers pass raw strings. - Patcher.Register no longer subscribes to EventTaskQueued / EventTaskRunning. There is no more thinking → running → final card lifecycle on the success path: it added card chrome without buying anything for free-form chat. - On EventChatDone, the new sendChatReply path posts the assistant message content as plain text. Empty content is silently dropped rather than rendered as "Done." (the prior fallback that confused Bohan). - Failure path keeps a one-shot error card on EventTaskFailed — the visual distinction from a normal reply is genuinely useful, and failures are rare enough that the chrome isn't noisy. - Throttle / lastPatched map / MinPatchInterval / shouldPatch / markPatched / loadCardOrSkip are all removed; nothing in the new flow patches. Tests: - TestPatcherSendsPlainTextOnChatDone pins the new contract: exactly one SendTextMessage call, no card sends or patches, content matches the ChatDonePayload. - TestPatcherDropsEmptyChatReply pins the "no more Done. fallback" decision — empty content drops, period. 
- TestPatcherFailEventSendsErrorCard pins the failure path still uses a card (one-shot, no patching). - TestPatcherIgnoresEventTaskCompletedForChatTasks rewritten for text path: ChatDone then TaskCompleted yields exactly one text send, no duplicate. - TestPatcherSkipsWhenNoChatSessionBinding and TestPatcherSwallowsInstallationLoadErrors rewritten to drive EventChatDone (the new entry point) instead of TaskQueued. - TestPatcherSendsThinkingCardOnTaskQueued deleted (no more thinking card). Co-authored-by: multica-agent <github@multica.ai> * feat(integrations/lark): pre-fill PersonalAgent bot name as "<agent> - Multica" (MUL-2823) (#3520) The device-flow install left the bot at Lark's auto-generated "{用户姓名}的智能助手". Lark's registration scene supports pre-filling the name via a `name` query param on the verification/QR URL (mirrors the upstream SDK's AppPreset.Name) — a user-editable default that rides on the QR URL, not the begin POST body (which has no name field). BeginInstall already loads the agent for its ownership check, so we keep it and thread `<agent.Name> - Multica` through Begin → decorateQRCodeURL. A blank name degrades to plain "Multica". There is no post-install rename API (bot/v3 is read-only; no bot/v3/update), so the install-time pre-fill is the only programmatic lever; the user can still edit the name on the creation form. Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): restore /issue confirmation + pin SendTextMessage wire (MUL-2671) Two recovered/added contracts off Trump's review of HEAD `fe381a07`: 1) /issue confirmation in Lark was a casualty of the plain-text refactor. The pre-refactor `RenderInput.IssueNumber` field was declared but never actually rendered into the card body, so even in the original card-based flow the user never saw a "Created [MUL-42]" confirmation. 
Now the OutcomeReplier handles OutcomeIngested + IssueID.Valid by sending a plain text message: Created MUL-42 — fix login bug https://multica.example/issues/MUL-42 Composed from a new DispatchResult.IssueIdentifier + IssueTitle, populated by the Dispatcher from workspace.IssuePrefix + issue.Number / issue.Title. Workspace lookup is best-effort: a Postgres blip on workspace gets a "#42" fallback rather than silently dropping the confirmation. The agent's own chat reply (if any) continues to land separately via the Patcher on EventChatDone — these are two semantically distinct messages and the user benefits from seeing both. 2) SendTextMessage is the wire layer Trump flagged for missing coverage. Three new wire tests pin: - happy path: POST /open-apis/im/v1/messages?receive_id_type=chat_id, msg_type=text, Bearer <tenant_access_token>, double-JSON content envelope - special-character round trip: newlines, double quotes, backslashes, tabs, Chinese + emoji, JSON-lookalike strings. The inner {"text": ...} is encoded once at JSON.Marshal time and once again when the outer body serializes; losing either pass corrupts the message and the bug is invisible without a contract pin. - Lark error path: non-zero `code` surfaces as a wrapped error with the code embedded. Tests: - TestDispatcher_IssueCreationFromCommand asserts IssueIdentifier ("MUL-42") and IssueTitle propagate through DispatchResult. - TestDispatcher_IssueIdentifierFallsBackToNumberOnWorkspaceLookupErr pins the "#7" degrade-graceful fallback. - TestLarkOutcomeReplierIssueCreatedSendsConfirmation pins the text body (identifier + title + deep link) and asserts no card send on this path. - TestLarkOutcomeReplierOutcomeIngestedSilentWithoutIssue pins the silent-on-plain-chat default so we don't accidentally start emitting a confirmation for every message. - TestHTTPClient_SendTextMessage_* covers the wire contract. Frontend locale parity (en + zh-Hans, 53 tests) is currently green on this HEAD; no changes needed. 
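The "double-JSON content envelope" pinned by the SendTextMessage wire tests can be illustrated with a minimal sketch. It assumes the request body carries `receive_id`, `msg_type`, and a string-valued `content` field; the Go type and the helper names `buildTextMessageBody` / `decodeTextMessageBody` are hypothetical and are not taken from the repository.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// textMessageBody is a hypothetical stand-in for the outer request body.
// Content is a string that itself holds JSON, which is why the text ends
// up encoded twice.
type textMessageBody struct {
	ReceiveID string `json:"receive_id"`
	MsgType   string `json:"msg_type"`
	Content   string `json:"content"`
}

// buildTextMessageBody encodes the raw text once into the inner
// {"text": ...} envelope and a second time when the outer body is marshalled.
func buildTextMessageBody(chatID, text string) ([]byte, error) {
	inner, err := json.Marshal(map[string]string{"text": text})
	if err != nil {
		return nil, fmt.Errorf("encode text envelope: %w", err)
	}
	return json.Marshal(textMessageBody{
		ReceiveID: chatID,
		MsgType:   "text",
		Content:   string(inner),
	})
}

// decodeTextMessageBody reverses both encoding passes so a test can check
// that special characters survive the round trip.
func decodeTextMessageBody(body []byte) (string, error) {
	var outer textMessageBody
	if err := json.Unmarshal(body, &outer); err != nil {
		return "", err
	}
	var inner struct {
		Text string `json:"text"`
	}
	if err := json.Unmarshal([]byte(outer.Content), &inner); err != nil {
		return "", err
	}
	return inner.Text, nil
}

func main() {
	raw := "line one\nquote \" backslash \\ tab\t {\"text\":\"lookalike\"} 你好"
	body, err := buildTextMessageBody("oc_example", raw)
	if err != nil {
		panic(err)
	}
	got, err := decodeTextMessageBody(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(got == raw)
}
```

Dropping either encoding pass makes the decode step fail or return different text, which is the corruption the special-character test is meant to catch.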
Co-authored-by: multica-agent <github@multica.ai> * fix(views/locales): add missing ko keys for Lark MVP (MUL-2671) Trump flagged on PR #3277 review that the ko bundle was missing the Lark-MVP-only keys that en + zh-Hans both carry. The parity test caught it cleanly after main was merged in (Korean PR landed on main between the prior review and this one): common.lark_bind.* (13 keys) settings.page.tabs.lark (1 key) settings.lark.* (45 keys) agents.inspector.section_integrations (1 key) Korean translations are professional/concise — "Lark" stays as the brand name (matches how en keeps "Lark" + "(飞书)" parenthetically; ko/users searching for the product expect "Lark"), and product copy follows the zh-Hans tone where Multica nouns ("에이전트", "워크스페이스") are romanized loan words consistent with the rest of the ko bundle. Slot ordering preserved against EN: - page.tabs.lark sits between github and integrations - inspector.section_integrations sits right after section_skills Verified: pnpm exec vitest run locales/parity → 105/105 pass. Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): /issue origin_type CHECK + Hub restart on credentials rotation (MUL-2671) Two live-env bugs Bohan reproduced: 1) /issue command crashed the WS connector. Dispatcher writes origin_type='lark_chat' on issues born from `/issue`, but the issue_origin_type_check CHECK constraint was last extended in migration 060 for quick_create — it doesn't list lark_chat, so every Lark /issue tripped SQLSTATE 23514 and bubbled up as an infra error. The infra error tore down the WS connector, Lark retried the same message, the new connector tripped the same constraint and crashed again. Repro in the live env: three crashes from the same /issue event over ~40s, each leaving the user with no confirmation in Lark. Migration 111 extends the CHECK list: CHECK (origin_type IN ('autopilot', 'quick_create', 'lark_chat')) 2) Re-scanning an already-bound agent silenced the bot. 
The device flow re-registers with Lark, which mints a brand-new bot (fresh app_id + app_secret); RegistrationService.finishSuccess upserts into lark_installation by agent_id, so the row's credentials rotate in place. But the running supervisor held the OLD inst struct by value and kept a WS open against the OLD bot's app_id — so all events to the NEW bot went nowhere. Bohan's "claude code 现在不能在飞书里回复了" symptom maps exactly to this: log timeline: 16:29:57 cc connector connected with app_id=cli_aa9398dd... (OLD) 16:34:07 lark registration: install complete (rotation) → row.app_id is now cli_aa93f36f... (NEW) → old WS still subscribed to OLD app_id; new app_id receives nothing Fix: Hub.sweep now compares each installation row's credentials fingerprint (app_id + bot_open_id + sha256(app_secret_encrypted)) against the snapshot the running supervisor was started with. On diff, cancel the old supervisor and start a fresh one inline. A monotonic gen counter on the supervisor entry disambiguates the old goroutine's deferred cleanup from the new entry the rotation path already swapped in. Tests: - TestHubRestartsSupervisorOnCredentialsRotation pins the new path: starts hub on app_one, rotates the row to app_two, asserts the connector factory is called again with the fresh AppID. - TestHubDoesNotRestartSupervisorOnUnchangedRow pins the negative case so an unchanged row doesn't degenerate into a per-sweep busy-loop. - Existing hub tests (lease, supervise, shutdown, ACK timing, noop replier) all green. Verification: - go test ./internal/integrations/lark/... -race -count=1 ok - go build ./... 
clean - migration applied locally; \d+ issue confirms lark_chat in CHECK Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): per-supervisor lease token to fence rotation handoff (MUL-2671) Elon flagged a race in HEAD be8d4cef's rotation path: both the old and the new supervisors of the same Hub used the hub-wide nodeID as their WS lease token, so an old supervisor's post-cancel releaseLease(nodeID) would CAS-match the lease row the successor had just acquired with the SAME token and DELETE it. Symptom would be a silently empty lease row a few hundred ms after every device-flow re-scan — no replica owning the install, no events delivered, the "bot goes quiet" pattern Bohan hit the first time but now from the fencing side rather than the credentials side. Fix: leaseToken(nodeID, gen) composes "<nodeID>-g<gen>", where gen is the monotonic counter already attached to each supervisorEntry. The nodeID prefix keeps cross-replica observability (an operator inspecting lark_installation.ws_lease_token can still map back to a process) while the -g suffix makes the OLD supervisor's release target the OLD row state. Once the rotation path swaps in the new supervisor, the row's CurrentToken is the new -g(N+1) token, so the old -gN release's WHERE clause no-ops instead of clobbering. acquireLease / renewLeaseUntil / releaseLease now take an explicit token argument; supervise threads its leaseToken through. The plumbing isn't pretty, but having an explicit argument at every call site is the only way the rotation invariant survives subsequent refactors — without it, a future caller could quietly reintroduce "just use h.nodeID" and the race is back. Two regression tests: - TestHubRotationStaleReleaseDoesNotClearSuccessorLease drives the fake lease state machine directly: 1. old acquires(tokenA) 2. rotation lands; new acquires(tokenB) 3. old's stale release(tokenA) fires Asserts owner ends up still tokenB. 
Hub-wide-nodeID code would fail step 3 by clearing the entry. - TestHubRotationEndToEndKeepsSuccessorLeased runs the same scenario through the live supervise loop: starts hub, rotates the row, waits for sup2 to take over with a distinct token, sleeps past sup1's unwind, asserts the row is still held by a non-sup1 token. Catches the bug even when the goroutine timing is non-deterministic. Verification: go test ./internal/integrations/lark/... -race -count=1 ok go build ./... clean go vet ./... clean Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): route group @-mentions via union_id, not open_id (MUL-2671) In a Lark group with multiple Multica bots installed, the bot whose WS received the event sometimes failed to recognize that it was the @-target while the OTHER bot's supervisor falsely fired. Bohan's controlled three- message test (only @A, only @B, @both) hit this: @A and @B alone went unanswered, @both got picked up by A only. Root cause: the `mentions[].id.open_id` field Lark puts on the WS event is structurally INVERSE to `/bot/v3/info`'s `bot.open_id` across the two WSes. From A's WS perspective, the wire-form open_id for "A was @-ed" is NOT equal to A's API-side open_id, but IS equal to what B's WS sees on its side, and vice versa. The decoder's `mention.open_id == inst.BotOpenID` match therefore fires on the wrong bot in multi-bot groups. Only `union_id` (the Lark-tenant-scoped stable identifier) is consistent across both WSes. Changes: - migration 112 adds nullable `lark_installation.bot_union_id` - sqlc query exposes UpsertLarkInstallation/CreateLarkInstallation with bot_union_id, plus a focused SetLarkInstallationBotUnionID for the backfill path - httpAPIClient.GetBotInfo now follows /bot/v3/info with /contact/v3/ users/{open_id}?user_id_type=open_id and returns both identifiers on BotInfo. 
Soft-fails on contact-scope denial: install still succeeds with an empty UnionID, and the decoder falls back to the legacy open_id match for single-bot deployments. - RegistrationService.finishSuccess persists union_id alongside open_id during the device-flow finalize. - ws_frame_decoder.containsMention prefers union_id and only walks open_id when the installation row has not been backfilled yet. - BackfillBotUnionIDs runs once at server boot for installations created before migration 112; bounded per-row 10s timeout and a pure soft-fail policy so a slow Lark round-trip cannot block startup. - regression tests cover the three decoder paths: union_id match wins over open_id mismatch, union_id mismatch overrides open_id match, and open_id fallback when union_id is unknown. Co-authored-by: multica-agent <github@multica.ai> * chore: drop trailing blank lines at EOF on four files (MUL-2671) git diff --check origin/main..origin/pr-3277 flagged these as new blank lines at EOF; clearing so the diff stays clean for review. Co-authored-by: multica-agent <github@multica.ai> * fix(views/locales): add missing ja keys for Lark MVP + section_integrations (MUL-2671) CI frontend job tripped on the ja locale parity check: ja is missing the lark_bind block in common.json, the lark block + page.tabs.lark in settings.json, and inspector.section_integrations in agents.json. The ko fix earlier covered Korean; ja was added separately on main and the merge surfaced these gaps. Translations mirror the en source and follow the same voice as the existing ja bundle. Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): rewrite @_user_N placeholders into clean body (MUL-2671) When Lark dispatches a group `im.message.receive_v1`, the message text contains opaque `@_user_1`, `@_user_2`, … placeholders and the real identity is in `mentions[]`. 
We were forwarding the raw text to the agent, so a Bohan-typed "@Bot ping test" arrived as "@_user_1 ping test" — neither human-readable nor useful as LLM context, and the agent was paying tokens to figure out which `@_user_N` was even itself. The new resolveMentions pass: * strips the bot's own mention entirely (the dispatcher already routes the event on AddressedToBot; re-emitting @<self> in front of every message adds zero signal and pollutes context), * substitutes other participants with `@<displayName>` so a follow-up "@Alice" reads naturally, * collapses horizontal whitespace introduced by the strip while preserving original newlines. Bot identity check uses the same union_id-preferred + open_id fallback as containsMention, so the rewrite stays consistent with the routing path. Tests cover the four shapes: bot self-mention, mixed bot + other-user mention, multi-line body with stripped mention, and a no-mention body that should be left untouched. Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): union_id-first self mention strip + token-aware scan + local whitespace cleanup (MUL-2671) Three review blockers on the mention rewrite from PR review: 1. isBotMention now mirrors containsMention's union_id-first policy. When the installation row knows our union_id, we trust it exclusively (open_id is structurally inverted in multi-bot groups — matching on it would re-introduce the routing bug we fixed two commits ago). open_id fallback fires only when union_id is absent. New tests: @-ing both bots in one message correctly strips only self and renders the sibling as @<name>; open_id-matches-but-union_id-differs does NOT strip. 2. resolveMentions no longer collapses or trims whitespace globally. Indentation, tabs, code blocks, tables — all preserved verbatim. 
When the self mention is removed we eat exactly one adjacent horizontal space (the one after the placeholder, or, when the mention sits at end-of-input, a single space already emitted right before it). New test exercises a multi-line indented + tabbed body and asserts the whole shape survives. 3. Prefix-collision-safe replacement. A chat with 11+ participants exposes both `@_user_1` and `@_user_10`; naive ReplaceAll for `@_user_1` would mangle the substring of `@_user_10`. The resolver now does a single-pass token scan with the mention list sorted longest-key-first, so the longer placeholder always wins at any scan position. New test covers the @_user_1 / @_user_10 case explicitly. Also drops the temporary INFO-level diag logging the previous commit added — root cause was confirmed (union_id swap in the manual backfill; not a decoder bug). Co-authored-by: multica-agent <github@multica.ai> * fix(integrations/lark): scope inbound dedup per (installation_id, message_id) (MUL-2671) Root cause of the residual "@Cc gets dropped as not_addressed_in_group" even after the union_id swap landed: lark_inbound_message_dedup was keyed on `message_id` alone. In a Lark group chat where the workspace has multiple Multica bots installed, Lark delivers the SAME message_id to every bot's WS supervisor. Whichever WS claimed first then ran its own AddressedToBot check; the bot that was actually @-ed lost the dedup race, found the row already terminal (`processed_at IS NOT NULL`), and was dropped as `duplicate` BEFORE it could evaluate its own mention. Net: every @ silently disappeared if Lark happened to route the OTHER bot's WS first. The dedup gate's original purpose (idempotency against WS reconnect replay) is per-installation by definition, so the right key is composite (installation_id, message_id). 
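The effect of the composite key can be sketched with an in-memory stand-in for the dedup table. This is an illustration only: `dedupStore`, `dedupKey`, and `claim` are hypothetical names, and the real gate is the sqlc-backed `lark_inbound_message_dedup` table rather than a map.

```go
package main

import "fmt"

// dedupKey is the composite key: the same message_id delivered to two
// installations yields two distinct keys.
type dedupKey struct {
	InstallationID string
	MessageID      string
}

// dedupStore is an in-memory stand-in for the dedup table.
type dedupStore struct {
	claimed map[dedupKey]bool
}

func newDedupStore() *dedupStore {
	return &dedupStore{claimed: make(map[dedupKey]bool)}
}

// claim returns true the first time an installation sees a message ID and
// false when the same installation sees it again (for example on a WS
// reconnect replay).
func (s *dedupStore) claim(installationID, messageID string) bool {
	k := dedupKey{InstallationID: installationID, MessageID: messageID}
	if s.claimed[k] {
		return false
	}
	s.claimed[k] = true
	return true
}

func main() {
	s := newDedupStore()
	fmt.Println(s.claim("inst-A", "om_1")) // true: first delivery to bot A
	fmt.Println(s.claim("inst-B", "om_1")) // true: same message, other bot
	fmt.Println(s.claim("inst-A", "om_1")) // false: replay to bot A
}
```

With a key on `message_id` alone, the second call would return false and bot B would never evaluate its own mention.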
Changes: - migration 113 drops + recreates lark_inbound_message_dedup with installation_id NOT NULL REFERENCES lark_installation(id) ON DELETE CASCADE and PRIMARY KEY (installation_id, message_id). The table is a 24h transient cache, so dropping existing rows is safe. - sqlc queries: ClaimLarkInboundDedup / MarkLarkInboundDedupProcessed / ReleaseLarkInboundDedup all now take installation_id. - AppendUserMessageParams carries InstallationID through to the in-tx Mark call so the chat_message+dedup atomicity stays intact. - Dispatcher passes inst.ID to claim + applyFinalize + AppendUserMessage. - Test fakes key dedup state on (installation_id, message_id) via a composite map key; all existing pre-seeded rows use a seedDedupKey helper bound to the default activeInstallation fixture so the prior staleness / token-rotation / in-tx mark tests still exercise the same regression they did before. - New regression TestDispatcher_DedupIsScopedPerInstallation pins the multi-bot invariant: a row pre-seeded for installation A does NOT block installation B's first delivery of the same message_id; B runs through its own group-filter / identity / ingest pipeline. Co-authored-by: multica-agent <github@multica.ai> * feat(integrations/lark): render markdown chat replies via schema-2.0 card (MUL-2671) The agent's chat replies were going out as msg_type=text, so every `bold`, fenced code block, list, table, and link in the body showed up as literal markdown characters in Lark — the user saw raw asterisks, hashes, pipes instead of formatted text. Bohan reported this and pointed at zarazhangrui/lark-coding-agent-bridge as the shape to emulate. The bridge repo uses Lark interactive cards with the schema-2.0 envelope and a `tag: "markdown"` body element; Lark's client renders that to formatted text (GFM-ish: bold/italic, headings, lists, links, fenced code blocks, tables, blockquotes). 
They expose multiple reply modes (card / markdown-as-post / text) gated by user config; we go a step simpler — auto-detect markdown syntax in the agent's body and route accordingly: - containsMarkdown(): cheap substring + regex pass for fenced code blocks, headings, list markers, bold/italic, tables, links, blockquotes, horizontal rules, inline code. Biases toward false- positive — wrapping prose in a card still renders fine, but missing a real markdown block leaves raw characters visible. - APIClient gains SendMarkdownCard / SendMarkdownCardParams. Implementation marshals the schema-2.0 envelope verbatim: {schema:"2.0", body:{elements:[{tag:"markdown", content: md}]}}. Stub returns ErrAPIClientNotConfigured. - Patcher.sendChatReply now branches on containsMarkdown: markdown → SendMarkdownCard, plain prose → SendTextMessage. A one-liner "sure, on it" stays as a normal IM bubble (no card chrome); anything with markdown gets the rendered card. Tests: TestContainsMarkdown pins the heuristic across plain prose and ten markdown shapes; TestPatcherRoutesMarkdownReplyToCard and TestPatcherRoutesPlainReplyToText cover the router; new HTTP wire test TestHTTPClient_SendMarkdownCard_HappyPath contract-pins the card envelope (msg_type=interactive, schema 2.0, markdown tag, verbatim body). Full lark suite passes. Co-authored-by: multica-agent <github@multica.ai> * fix(service/issue): route analytics.IssueCreated through obsmetrics.RecordEvent (MUL-2671) CI's TestNoNakedAnalyticsCaptureInHandlersOrServices guard caught the post-merge analytics call in IssueService.captureCreatedAnalytics that still used s.Analytics.Capture(...) directly. Main added that lint to prevent the Prometheus and PostHog sides from drifting — any new analytics.* event must go through obsmetrics.RecordEvent so the business-metrics collector and the PostHog client fire from the same call site. 
Fix mirrors how TaskService handles it: IssueService gains a Metrics obsmetrics.BusinessMetrics field (router wires it via h.IssueService.Metrics = opts.BusinessMetrics next to the existing TaskService line), and the in-service Capture call becomes obsmetrics.RecordEvent(s.Analytics, s.Metrics, ...). nil-safe by construction: RecordEvent treats a nil Metrics as PostHog-only. Co-authored-by: multica-agent <github@multica.ai> feat(views/lark): swap Bind CTA for Connected+Manage link when agent already has an installation (MUL-2671) Bohan reported the agent-detail Bind button keeps inviting the user to re-scan the QR even when the agent already has an active Lark PersonalAgent connected, and re-scanning silently upserts the installation row, leaving the previously-created Lark bot dangling as a zombie. Frustrating UX and an actual product footgun. Anti-zombie guard at the only entry point: LarkAgentBindButton now checks the cached installations listing for an active row pinned to this agent_id. When one exists, the install CTA is gone, replaced by a small Connected pill + a "Manage in Lark" link that opens the Bot's app page in Lark's developer console (open.feishu.cn/app/<app_id>) in a new tab. That's where scopes, display name, and additional permission requests actually live; re-scanning was never the right answer for managing an existing bot. Scoping is per-agent: an active installation on a DIFFERENT agent in the same workspace doesn't affect this agent's button, and a revoked installation falls back to the bind CTA so the user can re-create. Tests cover all four states (own-active / own-revoked / other-agent-active / no-installation) and pin the Manage link's href + target=_blank + noopener. i18n: three new keys in settings.json (en / zh-Hans / ja / ko): agent_bot_connected_label, agent_bot_manage_link, agent_bot_manage_tooltip. Locale parity test still 157/157. 
The dev console host is hardcoded to open.feishu.cn — operators on the Lark international tenant currently get the wrong host; future-proof fix wants the backend to surface a per-installation dev_console_url on the listings response, called out in a code comment. Co-authored-by: multica-agent <github@multica.ai> * feat(views/settings): collapse Lark into Integrations + render agent identity (MUL-2671) Lark was its own top-level workspace settings tab while Integrations sat empty next to it. As more integrations land, the sidebar would balloon with one tab per provider. Move the Lark surface into Integrations as the first hosted integration; the old ?tab=lark URL redirects through LEGACY_WORKSPACE_TAB_REDIRECTS so bookmarks still resolve. The Connected bots list was leaking the raw Lark app_id (cli_…) as the row title with bot_open_id (ou_…) underneath — meaningless to product users. Since the binding is 1:1 with a Multica Agent, join on agent_id and render the agent's avatar + name via the workspace-standard ActorAvatar + useActorName.getAgentName. Deleted agents fall back to "Unknown Agent" so the row is still actionable for cleanup. Tests: stub useActorName + ActorAvatar in lark-tab.test.tsx and add LarkTab connected-bot tests covering the agent identity render and the deleted-agent fallback. Drop the now-dead integrations.* + page.tabs.lark + lark.bot_open_id_label keys across all four locales — parity still 157/157, views suite 1141/1141. Co-authored-by: multica-agent <github@multica.ai> * feat(views/settings): wrap Lark in a named section inside Integrations (MUL-2671) Integrations is meant to host multiple providers (Slack, Linear etc. as they land), so the Lark content should sit under a Lark heading rather than fill the tab directly — otherwise the first additional integration would feel like it broke the IA. 
Add a "Lark" / "飞书" section heading above LarkTab using the same h2 chrome the other settings tabs use, and pin lark.section_title across all four locales (parity 169/169). Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai> Co-authored-by: J <j@multica.ai>	2026-06-03 19:12:14 +08:00
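One step from the squashed commit above, the prefix-collision-safe replacement of `@_user_N` placeholders, can be sketched as a single-pass scan over keys sorted longest first. This is a simplified illustration under stated assumptions: `rewriteMentions` is a hypothetical name, the map of placeholder to display name is an assumed input shape, and the self-mention strip and whitespace handling described in the commit are left out.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// rewriteMentions replaces placeholder keys such as "@_user_1" with
// "@<displayName>" in one left-to-right pass over the text.
func rewriteMentions(text string, names map[string]string) string {
	keys := make([]string, 0, len(names))
	for k := range names {
		if k != "" {
			keys = append(keys, k)
		}
	}
	// Longest key first, so "@_user_10" is tried before "@_user_1" at
	// every scan position.
	sort.Slice(keys, func(i, j int) bool {
		if len(keys[i]) != len(keys[j]) {
			return len(keys[i]) > len(keys[j])
		}
		return keys[i] < keys[j]
	})

	var out strings.Builder
	for i := 0; i < len(text); {
		matched := false
		for _, k := range keys {
			if strings.HasPrefix(text[i:], k) {
				out.WriteString("@" + names[k])
				i += len(k)
				matched = true
				break
			}
		}
		if !matched {
			// Copy the byte unchanged; all other text is preserved verbatim.
			out.WriteByte(text[i])
			i++
		}
	}
	return out.String()
}

func main() {
	names := map[string]string{"@_user_1": "Alice", "@_user_10": "Bob"}
	fmt.Println(rewriteMentions("@_user_10 ping @_user_1", names))
	// prints "@Bob ping @Alice"
}
```

A naive `strings.ReplaceAll` for `@_user_1` applied first would turn `@_user_10` into `@Alice0`, which is the collision the longest-first ordering avoids.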
Multica Eve	e2720f7d33	feat: add opencode thinking variants Adds OpenCode model variant discovery for thinking controls, passes saved thinking_level through opencode run --variant, and hardens verbose model parsing with fallback coverage.	2026-06-02 13:15:14 +08:00
YYClaw	a6b83fef41	fix(agents): surface archived status for retired agents (#3608) Retired agents (agent.archived_at set) previously read as offline across the agent dot, hover card, detail badge, and squad member list; a leftover online runtime row could even make them look reachable. Add a dedicated archived presence/status that wins over every runtime/task signal so a retired agent never reads as live or merely offline. - Add archived to AgentAvailability and SquadMemberStatusValue unions - Short-circuit deriveAgentPresenceDetail before runtime/task scan - Backend deriveSquadMemberStatus returns archived instead of offline - Render gray Archive dot/label; skip workload + reassign affordances - en/ko/zh-Hans locale strings	2026-06-02 13:03:15 +08:00
Jiayuan Zhang	ad09baa045	feat(agents): add runtime machine filter to Agents tab (MUL-2846) (#3580 ) * feat(agents): add runtime machine filter to Agents tab (MUL-2846) Add a dropdown filter to the Agents tab toolbar that lets the user narrow the list to agents bound to a specific runtime machine. The filter reuses `buildRuntimeMachines` from the runtimes package so the machine grouping (Local / Remote / Cloud) matches the Runtimes page sidebar, and the per-machine agent counts respect the current scope (Mine/All) so the numbers reflect what the user would see if they clicked the row. Only rendered in the Active view; the Archived view's toolbar is unchanged. If the selected machine is GC'd while the user is on the page (daemon stopped, runtime deleted), the filter auto-resets to 'All runtimes' instead of leaving the list empty. The no-matches state now surfaces 'No agents on <machine>' when the machine filter is the reason for zero results. Adds new `runtime_filter` and `no_matches.runtime_filtered` / `no_matches.search_runtime_filtered` i18n keys in en, zh-Hans, and ko. 7 new unit tests in `runtime-machine-filter-dropdown.test.tsx`. Co-authored-by: multica-agent <github@multica.ai> * fix(agents): address code review on runtime machine filter - Plumb localDaemonId / localMachineName / hasLocalMachine / currentUserId through AgentsPage → buildRuntimeMachines so the Local section and device-name consolidation match the Runtimes page on both web and Desktop. Adds a DesktopAgentsPage wrapper that bridges daemonAPI the same way DesktopRuntimesPage does. - Make the 'All runtimes' badge use the in-scope total instead of summing per-machine counts, so an agent bound to a GC'd runtime doesn't silently vanish from the count. - Move Date.now() out of the machines useMemo into a useState lazy init so the snapshot stays stable per mount. - Drop unused i18n keys (all_description / this_machine / reset) from runtime_filter in en / zh-Hans / ko. 
- Add a regression test for the All-runtimes badge divergence. Co-authored-by: multica-agent <github@multica.ai> * fix(agents): machine-scoped availability counts + Base UI menu items Follow-up to the previous code-review round (Emacs review at `1144b6023`). #1 (medium) — Availability counts now respect the selected machine. Introduce an inScopeOnMachine memo (inScope narrowed by the selected runtime machine, but NOT by availability chip or search) and use it as the base for both availabilityCounts and the AvailabilityFilterRow's totalCount, so the chips reflect 'agents on this machine' once a machine is selected. filteredAgents is now derived from inScopeOnMachine so the availability chip and search further refine within the machine scope. The dropdown's 'All runtimes' badge still uses inScope.length — it's the count the user would see if they cleared the filter, so it should stay unfiltered. #2 (low) — Dropdown rows now use DropdownMenuItem instead of raw <button>. Replaces the bare <button> in RuntimeMachineFilterItem with the shared DropdownMenuItem wrapper (Base UI Menu.Item). The rows are now registered as proper menu items: keyboard navigation (arrow keys, Enter, Space), typeahead, ARIA role='menuitem' semantics, and auto-close on selection (closeOnClick: true) all work. Active styling is preserved via data-active, and a data-highlighted variant on the inactive style matches Base UI's keyboard-focus appearance. Tests updated to use role-based queries (getByRole('menuitem')) and add a regression that verifies the menu is properly registered with Base UI. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Lambda <lambda@multica.ai> Co-authored-by: multica-agent <github@multica.ai> Co-authored-by: MiniMax M3 <M3@multica.local>	2026-06-01 10:17:56 +08:00
Multica Eve	c9c269675c	fix: align MCP support docs and UI gate (#3553) Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: multica-agent <github@multica.ai>	2026-05-30 18:24:45 +08:00
Multica Eve	eda2150a97	fix(agents): show MCP tab for ACP runtimes (#3534) Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: multica-agent <github@multica.ai>	2026-05-29 20:00:05 +08:00
Naiyuan Qing	645af40ed9	refactor(views): unify detail/list headers into shared BreadcrumbHeader (#3510 ) * refactor(views): unify detail/list headers into shared BreadcrumbHeader Replace four hand-rolled, divergent header styles (workspace-name root, "/" separator, back-arrow, raw div) with one shared BreadcrumbHeader component. The mental model is now identical everywhere: leading crumbs are the thing's real containers and clicking one navigates up. - New packages/views/layout/breadcrumb-header.tsx (segments/leaf/actions) - Detail pages (issue, project, runtime, skill, autopilot, agent, squad) now render `{Section} › name`; org name removed as a breadcrumb root - Issue breadcrumb shows the single most-direct container only (parent wins over project; they are orthogonal columns), never a fabricated chain; bare issue shows just its title - Issue leaf (identifier + title) is now a clickable link to the issue detail page with a subtle hover:opacity-80 - Issues / My Issues list headers drop the workspace prefix, matching the icon + title style of the other list pages Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(views): update breadcrumb tests for unified header behavior The header unification changed three observable behaviors the tests asserted against: - issue detail no longer renders the workspace name as a breadcrumb root - bare issue shows only its (now clickable) title leaf, no ancestor crumbs - the project "Unknown project" error placeholder was removed Rewrite the two affected issue-detail tests to assert the new leaf-link and no-project-crumb behavior, drop the obsolete Unknown-project test, and update the issues-page header test to assert the workspace prefix is gone. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-29 14:27:51 +08:00
Bohan Jiang	fa076d38f2	MUL-2778 feat(agent): wire mcp_config through OpenClaw runtime (#3450 ) * MUL-2778 feat(agent): wire mcp_config through OpenClaw runtime The MCP config tab (#3419) lets admins save mcp_config on an agent, and recent work (#3439) plumbed it through the three ACP runtimes. OpenClaw still ignored the field, leaving the Tab silently inert for any OpenClaw-backed agent. Translate the agent's Claude-style `{"mcpServers": {...}}` into the per-task OpenClaw wrapper's `mcp.servers` block — OpenClaw resolves MCP via its own config schema rather than ExecOptions, so the existing OPENCLAW_CONFIG_PATH preparer is the right seam. Fail closed on malformed JSON / entries missing `command` or `url`, matching the fail-closed posture the preparer already uses for the agents.list step. Null / absent mcp_config leaves the wrapper free of an `mcp` key so the user's global mcp.servers flows through untouched; an explicit empty managed set (`{}` / `{"mcpServers":{}}`) is honoured as "admin saved no servers" mirroring `hasManagedCodexMcpConfig`. Strict-mode replacement (drop user-only servers entirely) would require OpenClaw to do a per-key replace rather than a deep merge at `mcp.servers`; the comment documents that caveat rather than relying on undocumented behaviour. Also adds `openclaw` to `MCP_SUPPORTED_PROVIDERS` so the MCP Tab actually surfaces in the agent overview pane, and pins the new visibility case with a renderPane test. Co-authored-by: multica-agent <github@multica.ai> * MUL-2778 fix(agent): make openclaw mcp_config strict-replace via sanitized snapshot Elon flagged on #3450 that the previous wiring let user-only mcp.servers leak through the wrapper's `$include` of the live user config: deep-merge at `mcp.servers` keeps user-only names, and the strict-empty case (`{ "mcpServers": {} }`) silently inherited user globals. 
Switch the strict-replace path to write a sanitized snapshot of the user's fully resolved config (via `openclaw config get --json`) with the `mcp` block stripped, then have the wrapper `$include` the snapshot instead of the live user file. With the user's `mcp` gone from the $include resolution, the wrapper's `mcp.servers` is the only definition the embedded OpenClaw sees — managed only, including the explicit empty set. The snapshot lives in envRoot at 0o600 alongside the wrapper so the GC reaper sweeps it with the rest of the task scratch, and no extra OPENCLAW_INCLUDE_ROOTS entry is needed (same-dir $include). Fail-closed on `config get --json` errors so the daemon never silently falls back to the leaky $include path. The inherit branch (null mcp_config) still uses the live user file directly — no extra CLI roundtrip and no snapshot is written. New tests pin the contract Elon's review required: - TestPrepareOpenclawConfigStrictReplacesUserMcpServers: user has global_one + shared, managed has shared + managed_only → wrapper has exactly {shared (managed value), managed_only}; global_one does NOT leak; snapshot file has the user's `mcp` stripped while preserving gateway / providers / API keys. - TestPrepareOpenclawConfigStrictEmptyManagedSetDropsUserMcp: empty managed set drops user's global_one (both `{}` and `{"mcpServers":{}}` cases). - TestPrepareOpenclawConfigNullMcpConfigKeepsUserInclude: null path inherits the live user config, writes no snapshot, makes no extra CLI call. - TestPrepareOpenclawConfigFailsClosedOnResolvedConfigError: errors during `config get --json` surface; no stale wrapper or snapshot. - TestPrepareOpenclawConfigManagedSetFreshInstall: fresh install with managed mcp_config skips the snapshot dance entirely. Also tightens en + zh-Hans MCP Tab copy to mention OpenClaw goes via the per-task wrapper, and to use OpenClaw's own `transport` field rather than Claude's `type` for HTTP/SSE entries. 
Co-authored-by: multica-agent <github@multica.ai> * MUL-2778 fix(agent): narrow openclaw snapshot strip to mcp.servers only Elon's third-round must-fix: the previous strict-replace snapshot deleted the entire `mcp` block, which wiped out non-server settings under `mcp` like `sessionIdleTtlMs`. Those are documented OpenClaw config keys (https://docs.openclaw.ai/gateway/configuration-reference#mcp) outside the MCP Tab's scope — the agent's saved mcp_config only manages server definitions, so other `mcp.*` tuning the user set must survive. Replace the blanket `delete(resolved, "mcp")` with a stripUserMcpServers helper that: - deletes only `mcp.servers` when `mcp` is an object - drops the parent `mcp` key only when the object is empty after the strip (so we don't emit `mcp: {}` placeholders) - leaves non-object `mcp` values untouched (we only know how to strip servers from the documented shape) Pinned with TestPrepareOpenclawConfigStrictPreservesNonServerMcpKeys: user resolved has both `mcp.sessionIdleTtlMs: 300000` and `mcp.servers.global_one`; after the strict path runs the snapshot keeps the TTL and drops the servers map, and the wrapper's `mcp.servers` is exactly the managed set with no leak. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-05-28 18:43:02 +08:00
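Note on the narrowed strip in #3450: the three rules listed for the `stripUserMcpServers` helper can be sketched as below. This is a minimal illustration only. The signature, the `map[string]any` representation of the resolved config, and the sample JSON are assumptions, not the daemon's actual code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// stripUserMcpServers removes only mcp.servers from a resolved config.
// Hypothetical signature: the real helper's shape is not shown in the log.
func stripUserMcpServers(resolved map[string]any) {
	mcp, ok := resolved["mcp"].(map[string]any)
	if !ok {
		// Absent or non-object `mcp`: leave it untouched.
		return
	}
	// Drop server definitions, keep other mcp.* tuning such as sessionIdleTtlMs.
	delete(mcp, "servers")
	if len(mcp) == 0 {
		// Nothing left under `mcp`: avoid emitting an empty placeholder object.
		delete(resolved, "mcp")
	}
}

func main() {
	// Sample input is illustrative, not taken from a real OpenClaw config.
	raw := `{"gateway":{"port":1},"mcp":{"sessionIdleTtlMs":300000,"servers":{"global_one":{"command":"x"}}}}`
	var resolved map[string]any
	if err := json.Unmarshal([]byte(raw), &resolved); err != nil {
		panic(err)
	}
	stripUserMcpServers(resolved)
	out, err := json.Marshal(resolved)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

The strip mutates the map in place, which suits a snapshot that is written once and then discarded with the task scratch directory.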
Bohan Jiang	bae8a84abd	MUL-2767 feat(agent): add Antigravity runtime backend (#3427 ) * feat(agent): add Antigravity runtime backend Adds Google's Antigravity CLI (`agy`) as the 12th supported coding-tool runtime, alongside Claude / Codex / Cursor / Copilot / Gemini / Hermes / Kimi / Kiro / OpenCode / OpenClaw / Pi. The CLI emits plain assistant text on stdout (no structured event stream), so the backend streams stdout line-by-line as `MessageText` events and accumulates the same text as the final `Result.Output`. Session resumption uses `--conversation <id>`; because the conversation UUID is not echoed on stdout, the daemon routes `--log-file` to a temp file and recovers the id from the glog-formatted log lines. MUL-2767 Co-authored-by: multica-agent <github@multica.ai> * fix(agent): correct Antigravity capability contract from Elon review - ModelSelectionSupported now returns false for antigravity. `agy` has no --model flag and antigravityBackend deliberately drops opts.Model, so the UI must render a disabled "Managed by runtime" picker instead of an empty dropdown plus a silently-ignored manual-entry field. Also stop seeding AgentEntry.Model from MULTICA_ANTIGRAVITY_MODEL — the backend would silently ignore it. - Antigravity skills now write to {workDir}/.agents/skills/, the CLI's native workspace path (inherits Gemini CLI's layout per https://antigravity.google/docs/gcli-migration). Previously they went to the .agent_context/skills/ fallback that the CLI doesn't scan. Runtime brief moves antigravity into the native-discovery branch and local_skills.go points the user-level skill root at ~/.gemini/antigravity-cli/skills for Runtime → local skill import. 
- Doc + UI comment sync: providers matrix / install-agent-runtime / cloud-quickstart / agents-create / tasks (session-resume support) / skills / README all now list Antigravity in the right buckets, and the model-picker / model-dropdown comments cite antigravity (not the stale hermes reference) as the supported=false example. New tests: TestAntigravityModelSelectionUnsupported, TestInjectRuntimeConfigAntigravity (native discovery wording), TestWriteContextFilesAntigravityNativeSkills (.agents/skills/ landing, .agent_context/skills/ NOT written). Co-authored-by: multica-agent <github@multica.ai> * feat(provider-logo): swap inline placeholder for real Antigravity PNG Replaces the hand-drawn planet+arc placeholder with the official asset shipped from Downloads. Stored next to the component; bundlers (Next.js / electron-vite) resolve the PNG import to a URL string at build time. Added a small assets.d.ts so packages/views' tsc accepts PNG / SVG module imports — there was no prior asset usage in this package to register the declaration. --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-05-28 15:40:05 +08:00
Bohan Jiang	d39da9f7f0	MUL-2764: feat(agents): add MCP config tab to agent detail page (#3419 ) * MUL-2764: feat(agents): add MCP config tab to agent detail page Backend already stores `mcp_config` and the daemon forwards it to the runtime CLI via `--mcp-config`; this only adds the UI entry point. The new tab presents a JSON editor that pretty-prints the existing config, validates the buffer on every keystroke, and saves through the existing `PUT /api/agents/{id}` path. Clearing the editor sends `mcp_config: null`, which the handler reads as "wipe the column" and the daemon falls back to the CLI's own default. When the caller can't see secrets (agent actor, or a non-owner non-admin member), the server already returns `mcp_config: null` with `mcp_config_redacted: true`; the tab renders a read-only "configured but hidden" state in that case so a non-privileged member cannot silently overwrite an admin-owned config by saving an empty editor. Co-authored-by: multica-agent <github@multica.ai> * fix(agents): MCP tab — preserve in-flight edits + warn non-Claude runtimes - Fix stale-editor sync: compare the local draft against the previous original via a ref, so a background agent refetch updates an untouched editor instead of being silently ignored. Without this, a draft equal to the OLD original was treated as user-edited after the prop changed, and the next Save would write the old config back over a concurrent admin edit. - Surface a notice inside the tab when the agent's runtime provider is not Claude — today's daemon only forwards mcp_config via Claude's --mcp-config, so saving on e.g. a Codex agent was silent but ineffective. - Tests for both: rerender resyncs an untouched editor, rerender preserves an in-flight edit, warning renders on non-Claude / hides on Claude. 
MUL-2764 Co-authored-by: multica-agent <github@multica.ai> * MUL-2764: feat(agents): codex MCP support + hide MCP tab on unsupported runtimes - Backend: codex.go now translates agent.mcp_config (Claude-style `{"mcpServers": {...}}`) into `-c mcp_servers.<name>=<inline-toml>` flags for `codex app-server`, so MCP servers configured in the UI reach Codex's per-task config layer. Bad mcp_config JSON downgrades to a warn-and-skip so it can't break the agent launch. - Frontend: AgentOverviewPane hides the MCP tab when the agent's runtime provider doesn't read mcp_config — only `claude` and `codex` are supported today, every other provider sees no MCP tab. The previous in-tab warning is removed (no longer reachable). - New shared helper `providerSupportsMcpConfig` lives in `@multica/core/agents` so views and any future caller share one list of MCP-aware providers. - Tests: new go-side coverage for stdio + url + multi-server inputs, TOML string escaping, malformed-input fallback, and arg ordering vs custom_args; new views-side coverage for which providers surface the MCP tab. En + zh-Hans copy and parity test refreshed. Co-authored-by: multica-agent <github@multica.ai> * MUL-2764: fix(agents): keep codex mcp_config secrets out of argv/logs Move the agent's mcp_config from a `-c mcp_servers.<id>=<inline-toml>` argv flag into a daemon-managed `[mcp_servers.]` block inside the per-task `$CODEX_HOME/config.toml`. mcp_servers.<id>.env is a documented Codex config field and the UI already treats mcp_config as redacted for non-admins; argv would have leaked those values into `ps aux` and the `agent command` log line. The file is forced to 0600 to keep secrets in the daemon owner's lane regardless of the seed file's mode. Also drop user-supplied `-c/--config mcp_servers.` entries from custom_args. Codex `-c` is last-wins (verified against codex-cli 0.132.0), so without filtering, a custom_args entry could silently shadow whatever the MCP Tab saved. 
Strip inherited `[mcp_servers.]` tables from the per-task config.toml when the agent has its own mcp_config, mirroring Claude's `--strict-mcp-config`: avoids TOML "table already exists" errors on name collisions and matches admin expectations that the MCP Tab is the authoritative source for that task. Co-authored-by: multica-agent <github@multica.ai> MUL-2764: fix(agents): codex mcp_config three-state semantics + custom_args compat Address the third review pass: 1. Distinguish nil vs present-but-empty mcp_config. `{}` and `{"mcpServers":{}}` now count as "admin saved an explicit (empty) managed set" — strip inherited user `[mcp_servers.]` and pin an empty managed marker block. Only SQL NULL / JSON `null` map to "absent" and fall back to the user's global `~/.codex/config.toml`. This aligns Codex with the API's three-state contract (omit / null / object) and with Claude's `--strict-mcp-config` semantics. 2. Fail closed on `ensureCodexMcpConfig` errors and on managed mcp_config without CODEX_HOME. Previous warn-and-launch would silently inherit the user's global MCP servers and look identical to a successful apply — exactly the surprise the MCP Tab is meant to remove. 3. Only filter `-c mcp_servers.` from `custom_args`/`extra_args` when the agent has a managed mcp_config. Pre-MUL-2764 agents that configured MCP via custom_args keep working; once an admin opts in via the MCP Tab the daemon owns the `mcp_servers` namespace and overrides are dropped (last-wins safety). 4. Update mcp_config locale intro to mention $CODEX_HOME/config.toml instead of the now-removed `-c mcp_servers.*` argv path. Tests: - Split `TestEnsureCodexMcpConfigEmptyInputsAreNoop` into `TestEnsureCodexMcpConfigAbsentLeavesUserTablesAlone` (nil/null) and `TestEnsureCodexMcpConfigEmptyManagedSetStripsUserMcp` (`{}`, `{"mcpServers":{}}`). - Add `TestEnsureCodexMcpConfigEmptyManagedSetIdempotent` to pin byte-identical reruns on the empty managed marker block. 
- Add `TestHasManagedCodexMcpConfig` covering the eight relevant inputs. - Add `TestBuildCodexArgsPreservesCustomMcpOverridesWhenUnmanaged` and `TestBuildCodexArgsDropsCustomMcpOverridesWhenManaged` to pin the new gating. - Add `TestCodexExecuteFailsClosedWhenMcpConfigInvalid` and `TestCodexExecuteFailsClosedWhenManagedMcpButNoCodexHome` for the Execute paths. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: J <j@multica.ai> Co-authored-by: multica-agent <github@multica.ai>	2026-05-28 15:11:28 +08:00
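Note on the three-state contract in #3419: the absent / empty-managed / invalid split described above might look roughly like the following. This is a sketch under assumptions. `hasManagedMcpConfig` is a hypothetical stand-in, not the repository's `hasManagedCodexMcpConfig`, and the real function may take different inputs.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// hasManagedMcpConfig classifies a raw mcp_config value (hypothetical helper).
//   - empty bytes or JSON null: absent, so the user's global config is inherited
//   - any JSON object, including {} and {"mcpServers":{}}: managed
//   - anything else: error, so the caller can fail closed
func hasManagedMcpConfig(raw []byte) (bool, error) {
	trimmed := bytes.TrimSpace(raw)
	if len(trimmed) == 0 || bytes.Equal(trimmed, []byte("null")) {
		return false, nil
	}
	var doc struct {
		McpServers map[string]json.RawMessage `json:"mcpServers"`
	}
	if err := json.Unmarshal(trimmed, &doc); err != nil {
		return false, fmt.Errorf("invalid mcp_config: %w", err)
	}
	return true, nil
}

func main() {
	for _, in := range []string{"null", "{}", `{"mcpServers":{}}`, "{"} {
		managed, err := hasManagedMcpConfig([]byte(in))
		fmt.Printf("%q managed=%v err=%v\n", in, managed, err)
	}
}
```

Returning an error for malformed input, rather than treating it as absent, is what lets the caller refuse to launch instead of silently inheriting user-level servers.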
Alex	746c0c4456	MUL-2746 fix(avatar): normalize relative avatar urls in desktop/web (#3100 ) * fix(avatar): normalize relative avatar urls in desktop/web Co-authored-by: multica-agent <github@multica.ai> * fix: test Co-authored-by: multica-agent <github@multica.ai> * fix(avatar): normalize avatar url in AvatarPicker preview MUL-2746. The picker is used by create-agent and create-squad, and also prefills from a template's `avatar_url` when duplicating an agent. The upload result / template URL is root-relative in local-storage setups, so on Desktop (file:// runtime) the preview <img> resolves against the local filesystem and the avatar fails to render. Route the value through `resolvePublicFileUrl` for rendering only; the stored URL stays raw so the parent's create call still posts what the backend expects. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai> Co-authored-by: J (Multica agent) <agents@multica.ai>	2026-05-27 16:02:08 +08:00
Naiyuan Qing	bd1fb10afa	chore: react-doctor cleanup — button types, useContext→use(), toSorted, error fixes (#3350 ) - Add explicit type="button" to 61 <button> elements missing the attribute - Replace useContext() with React 19 use() across 16 context consumers - Replace [...arr].sort() with arr.toSorted() in 12 web/desktop files (mobile excluded — Hermes lacks toSorted support) - Fix rules-of-hooks violation: useSidebar try/catch → useSidebarSafe null check - Fix nested component definition: useMemo wrapping HeaderRight → useCallback - Fix missing ARIA: add aria-expanded + aria-controls to combobox in create-squad React Doctor score: 23 → 30. No behavioral changes, no business logic modified. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-27 14:57:07 +08:00
Bohan Jiang	735f18a4ef	fix(agents): drop the import-hint callout on agent Skills tab (again) (#3301 ) #3265 already removed this blue "Importing creates a workspace copy..." banner, but #3286 (the skills_local toggle revert) brought it back as collateral. Re-remove it — this tab isn't where skill imports happen (that lives behind Skills page → Add Skill → From Runtime), so the callout is pure noise here. Also flip the header row back to items-center now that the intro is once again the only thing in it.	2026-05-26 18:41:25 +08:00
Multica Eve	744b474199	revert(agent): remove per-agent local skill toggle (MUL-2603) (#3286 ) * Revert "feat(agents): hide skills_local toggle for runtimes that don't honour it (MUL-2603) (#3276)" This reverts commit `0b50c5a209`. Co-authored-by: multica-agent <github@multica.ai> * Revert "fix(agent): surface host OAuth token via env var on macOS isolation (MUL-2603) (#3267)" This reverts commit `a67bf81225`. Co-authored-by: multica-agent <github@multica.ai> * Revert "fix(agents): tighten skills-tab intro and drop redundant import hint (#3265)" This reverts commit `d8075a5775`. Co-authored-by: multica-agent <github@multica.ai> * Revert "fix(agent): mirror $HOME/.claude.json into isolated config dir (MUL-2661) (#3261)" This reverts commit `40da88fc16`. Co-authored-by: multica-agent <github@multica.ai> * Revert "feat(agent): per-agent toggle to isolate host-machine skills (MUL-2603) (#3200)" This reverts commit `960befa56f`. Co-authored-by: multica-agent <github@multica.ai> * Add migration cleanup for reverted agent skills toggle Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Eve <eve@multica-ai.local> Co-authored-by: multica-agent <github@multica.ai>	2026-05-26 17:00:01 +08:00
Bohan Jiang	0b50c5a209	feat(agents): hide skills_local toggle for runtimes that don't honour it (MUL-2603) (#3276 ) * feat(agents): hide skills_local toggle for runtimes that don't honour it (MUL-2603) Only Claude Code and Codex runtimes actually enforce `skills_local` at exec time today — Claude isolates `~/.claude/skills/` via `CLAUDE_CONFIG_DIR`, Codex isolates `~/.codex/skills/` via per-task `CODEX_HOME`. Every other runtime currently stores the field but treats it as a no-op, which made the toggle in the Create Agent dialog and Skills tab misleading for those runtimes. Gate the toggle on `runtime.provider` so it only renders for the providers the daemon currently isolates. Centralise the supported-provider list as `isSkillsLocalSupportedProvider()` in `packages/core/agents` and reuse it from the create dialog and the Skills tab. The create dialog also drops `skills_local` from the payload when the selected runtime is unsupported, so a runtime swap can't leave a stale `ignore` opt-in pinned where it would never take effect. Docs (EN + ZH) updated to say the toggle is hidden — not just "a no-op" — for the unsupported runtimes. Co-authored-by: multica-agent <github@multica.ai> * docs(agents): align skills_local hint and type comment with claude+codex boundary Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-26 16:09:51 +08:00
Bohan Jiang	d8075a5775	fix(agents): tighten skills-tab intro and drop redundant import hint (#3265 ) Three small UI cleanups on the agent Skills tab: - The blue "Importing creates a workspace copy that your team can edit and reuse" callout was visual clutter — drop it (and the Info icon import that it relied on). - The intro paragraph conflated two things: the workspace-skills concept (applies to every runtime) and the Allow-locally-installed-skills toggle (only honoured by Claude Code and Codex; verified — none of copilot/cursor/gemini/opencode/openclaw/hermes/pi/kimi/kiro read agent.SkillsLocal). Rewrite the intro to only describe the main concept; the toggle's own local_hint_on/off strings still carry the Claude/Codex caveat where it belongs. - The trimmed intro now fits one line, so flip the header row from items-start to items-center so the text sits on the same baseline as the "Add skill" button instead of clinging to its top edge.	2026-05-26 14:58:12 +08:00
Bohan Jiang	960befa56f	feat(agent): per-agent toggle to isolate host-machine skills (MUL-2603) (#3200 ) * feat(agent): per-agent toggle to isolate host-machine skills (MUL-2603) Adds an agent-scoped `skills_local` switch ("ignore" default / "merge") so shared agents stop inheriting the operator's user-global Claude skill directory. A single broken local skill on one operator's machine was crashing the Claude CLI before it ever read stdin — the daemon saw a "broken pipe" with no recoverable signal (GitHub #3052). - DB: migration 108 adds `agent.skills_local` (NOT NULL DEFAULT 'ignore'), with sqlc CreateAgent/UpdateAgent updates and handler validation. - Claude runtime: when the agent is in "ignore" mode the backend points CLAUDE_CONFIG_DIR at an empty per-task scratch dir under the task cwd (fallback: OS temp), strips any inherited override, and cleans up after the run. Workspace skills under `{cwd}/.claude/skills/` still load. "merge" preserves the legacy inherit-from-machine behavior; Codex and other isolated backends are no-ops. - UI: new Skills toggle in the Create Agent dialog and the Agent → Skills tab, with EN/zh-Hans copy and SkillsLocalToggle shared between the two. - Tests: unit coverage for the new env helper, isolation dir lifecycle, full Claude execute paths (ignore + merge), and the handler tristate contract. Existing skills-tab test updated for the new copy. - Docs: updated `/skills` docs (EN + ZH) and added a 0.3.7 changelog entry in the landing-page i18n. Co-authored-by: multica-agent <github@multica.ai> * fix(agent): preserve claude login + validate skills_local input (MUL-2603) Address Elon's review on PR #3200: 1. Skill isolation no longer drops the operator's Claude login. The per-task scratch dir now mirrors every entry under `~/.claude/` as symlinks except `skills/`, so `.credentials.json`, settings, plugins, etc. reach the CLI exactly as on the host while the user-global skills directory stays hidden. 
Without this, default `ignore` would have broken every Claude agent on a non-API-key host the moment migration 108 landed. 2. Internal CreateAgent callers (agent_template, onboarding_shim) now set `SkillsLocal: "ignore"`. The Go zero value was about to trip the migration-108 CHECK constraint and 500 template / onboarding agent creation. 3. Create / update handler validation no longer normalizes garbage to "ignore". The strict 400 path is now reachable on bad client input; the drift-safe `normalizeSkillsLocal` stays on the read side only. UI copy + docs clarified that the toggle is Claude-only; other runtimes ignore the setting. Verification: - `go test ./...` green (full suite locally). - `pnpm --filter @multica/views exec vitest run agents/components/tabs/skills-tab.test.tsx` green. - Handler DB-backed tests still skip locally without docker (same as Elon's run) — CI will validate the create / update paths against migration 108. Co-authored-by: multica-agent <github@multica.ai> * fix(agent): mirror effective claude config dir with windows fallback (MUL-2603) Address Elon's second-round review on PR #3200: 1. The per-task scratch dir now mirrors the effective host Claude config dir, not unconditionally `~/.claude/`. Precedence: agent `custom_env` CLAUDE_CONFIG_DIR > parent process env > `~/.claude/`. Without this, an operator who pinned Claude at a managed install (custom env CLAUDE_CONFIG_DIR) would get the wrong credentials in the scratch dir, because `buildClaudeEnv` strips that env before handing it to the child. We resolve the source up front and feed it to the mirror, so the override env still points at the right bytes. 2. Mirror entries now go through platform-aware linkers. On Windows without Developer Mode / admin, `os.Symlink` is denied, which previously left the scratch dir empty and broke Claude Code auth on default `ignore`. 
The new helpers try symlink first, then fall back to a directory junction (`mklink /J`) for dirs or a hardlink (same-volume content share) / copy for files. Mirrors the execenv/codex_home_link_windows.go pattern. 3. Tests: - `TestResolveHostClaudeConfigDir` locks in the custom_env > parent_env > `~/.claude` precedence. - `TestNewIsolatedClaudeConfigDirMirrorsCustomHostDir` confirms the scratch dir picks up `.credentials.json` from a synthetic custom host dir, proving the source resolution actually propagates into the mirror. - `TestNewIsolatedClaudeConfigDirEmptyHostIsNoop` documents the env-var-auth-only case (no host source ⇒ empty scratch dir). - `TestMirrorHostClaudeExceptSkillsWith_FallbackWhenSymlinkFails` exercises the Windows-no-Developer-Mode path via the new `mirrorHostClaudeExceptSkillsWith` seam, asserting credentials and sub-dir children still reach the scratch dir after the symlink stand-in fails. - `TestMirrorHostClaudeExceptSkillsWith_PropagatesFirstLinkError` confirms callers see the per-entry error when even fallback fails (so the warn-log fires on broken Windows installs). - `TestCopyFileRoundTrip` covers the last-resort copy fallback and its EXCL no-overwrite contract. - `TestClaudeExecuteIsolatesUsesCustomEnvSource` is the end-to-end check: an agent with custom_env CLAUDE_CONFIG_DIR reads its credentials from the pinned dir, not `~/.claude/`. 4. Docs: `apps/docs/content/docs/skills.{mdx,zh.mdx}` updated to describe the effective-source resolution and the Windows fallback chain so the docs match the runtime behaviour. Verification: - `go test ./...` green (full server suite locally, including `pkg/agent` 23 cases covering the new + existing isolation paths). - `GOOS=windows GOARCH=amd64 go vet ./pkg/agent/...` and `go test -c -o /dev/null` both compile clean, confirming the Windows-tagged linker file builds. 
Co-authored-by: multica-agent <github@multica.ai> * fix(agent): default skills_local to merge to preserve legacy behavior (MUL-2603) Per Bohan's product decision on PR #3200, the per-agent host-skill toggle defaults to "merge" — the pre-MUL-2603 inherit-from-machine behavior — so existing personal workflows that rely on locally installed Claude Skills keep working unchanged. Agent owners explicitly opt into "ignore" when they need to harden a shared agent against a broken local skill on one operator's machine (GitHub #3052). Also audited all 11 runtimes for user-global skill discovery paths and documented the scope of the toggle. Only Claude reads a user-global `~/.claude/skills/`; Codex isolates via `CODEX_HOME`, the ACP backends (Hermes / Kimi / Kiro) and the JSON-stream backends (Copilot / Cursor / Gemini / Pi / OpenCode / OpenClaw) anchor discovery to the task workdir and never read a user-global skill directory. UI copy and docs now say "for runtimes that support it (currently Claude Code)" everywhere so the scope is explicit. Changes: - Migration 108: column default flipped to 'merge'. - Handler CreateAgent: missing field → "merge"; explicit "ignore" / "merge" still validated, garbage still 400. - normalizeSkillsLocal: drift-safe coercion now lands on "merge" for anything that isn't the exact literal "ignore". - agent_template.go / onboarding_shim.go: internal CreateAgent callers send "merge" instead of "ignore" to match the new default. - Claude runtime (`claude.go`): isolate-mode gate flipped from `SkillsLocal != "merge"` to `SkillsLocal == "ignore"`, so "" (legacy daemons / older clients) and "merge" both walk `~/.claude/` directly. - Create Agent dialog + Skills tab: toggle defaults to on (merge); only duplicate of an explicit "ignore" agent carries through. The isolation opt-in is now `skills_local: "ignore"` when the user flips off; "merge" is omitted from the request body. - i18n (EN + zh-Hans): copy reframed — "On (default) — merged"; "Off — ignored. 
Recommended for shared agents". - Docs (`/skills`, `/guides/agents.zh`): describe new default and enumerate which runtimes act on the toggle. - Landing changelog 0.3.7: retitled "Per-Agent Local-Skill Toggle"; note the on-by-default behavior + off-to-isolate framing. - Tests: - `TestClaudeExecuteIsolatesHostSkillsWhenIgnoreOptedIn` replaces the old by-default isolation case (now requires explicit "ignore"). - New `TestClaudeExecuteDefaultModeKeepsHostConfigDir` locks in that default ExecOptions preserve the host CLAUDE_CONFIG_DIR. - `TestClaudeExecuteIsolatesUsesCustomEnvSource` now explicitly opts into "ignore" mode. - Handler tests: omitted → "merge"; explicit "ignore" round-trips; preserve-existing test seeds "ignore" and asserts "merge" flip-back. - `TestNormalizeSkillsLocal_DriftStaysSafe`: only literal "ignore" maps to ignore; everything else → "merge". - `skills-tab.test.tsx`: toggle ON by default; flip OFF when agent opted into "ignore". Intro-text matcher anchored to a more specific phrase so it no longer collides with the toggle hint copy. Verification: - `go test ./...` green (full server suite locally). - `GOOS=windows GOARCH=amd64 go vet ./pkg/agent/...` and `go test -c -o /dev/null` both compile clean (windows-tagged linker file still builds). - `pnpm typecheck` green across all packages and apps. - `pnpm --filter @multica/views test` 88 files / 771 tests green. - `pnpm --filter @multica/core test` 43 files / 390 tests green. - Handler DB-backed tests still skip locally without docker; CI will validate the create / update paths against migration 108. Co-authored-by: multica-agent <github@multica.ai> * chore(landing): drop 0.3.7 changelog entry from this PR (MUL-2603) The landing-page release notes belong in a separate release-prep PR, not in the feature PR. 
Co-authored-by: multica-agent <github@multica.ai> * fix(agent): propagate skills_local=ignore to codex user-skill seed (MUL-2603) Make the per-agent skills_local toggle real for Codex too, not just Claude. Previously the toggle was only consumed by the Claude backend, while the daemon's execenv layer always seeded Codex's per-task CODEX_HOME with the host machine's user-installed skills from ~/.codex/skills/. A shared Codex agent with skills_local=ignore could still inherit a broken local skill from one operator's machine. Now: PrepareParams/ReuseParams carry SkillsLocal; hydrateCodexSkills skips seedUserCodexSkills when SkillsLocal == "ignore" so the per-task CODEX_HOME exposes only workspace skills to the codex CLI. Default ("merge", or empty from older servers/clients) preserves existing inherit-from-machine behavior. UI / docs are updated to reflect the contract honestly: Claude Code and Codex honor the toggle; other runtimes (Hermes / Kimi / Kiro / Copilot / Cursor / Gemini / Pi / OpenCode / OpenClaw) leave $HOME untouched and discover user-level skills natively, so the toggle is a no-op for them today. New tests: TestPrepareCodexSkillsLocalIgnoreSkipsUserSeed, TestPrepareCodexSkillsLocalMergeSeedsUserSkills, and TestReuseCodexSkillsLocalIgnoreSkipsUserSeed cover Prepare(ignore), Prepare(merge), and the toggle-flip-on-reuse path. Co-authored-by: multica-agent <github@multica.ai> * docs(skills): scope skills_local toggle copy to Claude Code + Codex (MUL-2603) Off-state hint and Skills tab intro now explicitly call out Claude Code + Codex as the only runtimes that honor the toggle, with "other runtimes ignore this setting" wired into both states (en + zh-Hans), so users on non-Claude/Codex agents don't read "Off" as runtime-wide isolation. 
Docs (skills.mdx, skills.zh.mdx, guides/agents.zh.mdx) stop describing Hermes / Kimi / Gemini / Copilot / Cursor / Pi / OpenCode / OpenClaw / Kiro as having native user-level skill discovery; the daemon simply does not manage user-level skill discovery for those runtimes today, and the toggle is a no-op regardless of where it is set. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-26 13:26:33 +08:00
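Note on #3200: two of the rules pinned by its tests are small enough to sketch, the drift-safe `skills_local` coercion (only the literal "ignore" isolates) and the host config dir precedence (custom_env > parent env > `~/.claude`). The function signatures, the injected `parentEnv` lookup, and the sample paths are assumptions made for illustration and do not reproduce the daemon's code.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// normalizeSkillsLocal coerces stored values on the read side:
// only the exact literal "ignore" opts into isolation, everything else is "merge".
func normalizeSkillsLocal(v string) string {
	if v == "ignore" {
		return "ignore"
	}
	return "merge"
}

// resolveHostClaudeConfigDir picks the directory to mirror (hypothetical signature).
// Precedence: agent custom_env, then the parent process env, then <home>/.claude.
func resolveHostClaudeConfigDir(customEnv map[string]string, parentEnv func(string) string, home string) string {
	if dir := customEnv["CLAUDE_CONFIG_DIR"]; dir != "" {
		return dir
	}
	if dir := parentEnv("CLAUDE_CONFIG_DIR"); dir != "" {
		return dir
	}
	if home == "" {
		// No host source at all, for example env-var-only auth.
		return ""
	}
	return filepath.Join(home, ".claude")
}

func main() {
	noParent := func(string) string { return "" }
	fmt.Println(normalizeSkillsLocal(""), normalizeSkillsLocal("ignore"))
	pinned := map[string]string{"CLAUDE_CONFIG_DIR": "/opt/managed"}
	fmt.Println(resolveHostClaudeConfigDir(pinned, noParent, "/home/op"))
	fmt.Println(resolveHostClaudeConfigDir(nil, noParent, "/home/op"))
}
```

Passing the parent environment as a function keeps the precedence testable without mutating the process environment.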
Bohan Jiang	13f74e651a	feat(agents): remove custom_env from agent resources, add audited env endpoint (MUL-2600) (#3209 ) * feat(agents): remove custom_env from agent resources, add audited env endpoint (MUL-2600) The agent resource shape (list / get / create / update / archive / restore responses + WebSocket events) no longer carries `custom_env` values. Reads/writes of env now flow exclusively through a dedicated `/api/agents/{id}/env` endpoint that is owner/admin-only, rejects agent-actor sessions, applies a "**" sentinel preserve guard on PUT, and writes a persistent audit row per reveal/update. Why - `multica agent list --output json` historically returned plaintext `custom_env` for owner/admin callers (the redaction gate gave only members the masked map). Any agent token running on the workspace inherits its owner's role and could read every other agent's secrets just by listing. - Patching list/get redaction alone (PR #3175 direction) left symmetric leaks via mutation responses, WS events, the "reveal" path itself (no actor-aware auth), and a `` overwrite footgun on UpdateAgent. What changed - Backend: drop `custom_env` from AgentResponse; add coarse `has_custom_env` + `custom_env_key_count`. Strip env handling from UpdateAgent (silently ignored if sent). Keep CreateAgent's custom_env acceptance. - Backend: new GET/PUT `/api/agents/{id}/env` handlers in `internal/handler/agent_env.go`: - resolveActor → 403 for agent actors (closes the lateral-movement path). - Owner/admin role gate via existing helper. - PUT honours value == "*" as "preserve existing value". - Both write to `activity_log` with `agent_env_revealed` / `agent_env_updated` actions. Audit details record key names only, never values. - Daemon claim path (`ClaimAgentTask`) unchanged — `TaskAgentData` still carries plaintext env for runtime injection. - SQL: new `UpdateAgentCustomEnv` query; sqlc regenerated (v1.31.1). - CLI: new `multica agent env get|set` subcommands.
`--custom-env` flags removed from `multica agent update`; the no-fields error now points to the new path. - Frontend: drop env fields from `Agent` + `UpdateAgentRequest`; add `getAgentEnv` / `updateAgentEnv` client methods; rewrite env-tab to show "N variables configured" + explicit "Reveal & edit" button, fetching values only on intentional reveal. - Locales: parity-safe additions to en + zh-Hans. - Docs: agents-create.{mdx,zh.mdx} reflect the new threat model and endpoint. - Mobile: schema drops `custom_env` / `custom_env_redacted`, adds metadata fields. Tests - Handler tests pinned the new invariants: no env in list/get responses, owner reveal happy-path + audit row, agent-actor 403, `***` sentinel preserves real values, UpdateAgent silently ignores `custom_env`, pure `mergeAgentEnv` cases. - CLI tests pivot to the new flag surface: `agent update` MUST NOT expose the env flags; `agent env set` MUST expose --custom-env-stdin/--custom-env-file. - Frontend test fixtures updated; pnpm typecheck / test / lint pass cleanly. This is a breaking API change. Scripts that read `custom_env` from `/api/agents` must migrate to `GET /api/agents/{id}/env`. Co-authored-by: multica-agent <github@multica.ai> fix(agents): close actor-spoofing + audit fail-closed in env endpoints (MUL-2600) Addresses Elon's review of #3209: * Mint a task-scoped `mat_` token per claim, bound to (agent, task, workspace, owner). Daemon injects it into the agent process in place of its own credential. Auth middleware authoritatively rebuilds X-User-ID / X-Agent-ID / X-Task-ID from the token row and sets X-Actor-Source=task_token; that header is server-set only — incoming values are stripped before any auth branch runs. resolveActor honors the header so an agent that strips X-Agent-ID / X-Task-ID still resolves as actor=agent. 
* GetAgentEnv / UpdateAgentEnv are now fail-closed on audit-log failures: GET refuses to return plaintext, PUT persists inside the same tx as the audit row so they commit/roll back together. * PUT /api/agents/{id} returns 400 when the body carries custom_env instead of silently dropping it — directs callers to the audited env endpoint. * Agent actors never see mcp_config, even when the underlying member is owner/admin; mutation broadcasts go through a redaction shim so WS subscribers don't pick it up either. * Fix backend test that asserted dense JSON (jsonb::text renders whitespace) and frontend test that assumed a unique "Test User" match. Co-authored-by: multica-agent <github@multica.ai> * fix(agents): close residual MUL-2600 gaps from review (MUL-2600) Migration 108 FK now correctly references agent_task_queue(id) instead of the non-existent agent_task table; the previous name blocked CI backend migrations. Task-token-authenticated requests can no longer be re-routed at a different workspace by passing workspace_slug / workspace_id / ?workspace_id / a URL workspace param. ResolveWorkspaceIDFromRequest and resolveWorkspaceUUID both short-circuit on X-Actor-Source=task_token and return only the token-bound X-Workspace-ID; buildMiddleware adds a defence-in-depth 403 if any URL-resolved workspace disagrees with the token binding. mcp_config no longer leaks back to agent actors through UpdateAgent / CreateAgent / ArchiveAgent / RestoreAgent HTTP responses — the same redactAgentResponseForActor helper that GetAgent/ListAgents use is now applied to mutation responses too. WS broadcasts were already redacted via broadcastAgentResponse. FailTask and every TaskService cancel path (CancelTask / CancelTasksForIssue / CancelTasksForAgent / CancelTasksByTriggerComment / BroadcastCancelledTasks) now eagerly DeleteTaskTokensByTask so the mat_ token's 24h window doesn't outlive a terminated task. Failure is non-fatal — the FK cascade and expiry remain durable guards. 
Doc-only: clarify that PUT /api/agents/{id} now hard-rejects bodies that carry custom_env (was previously "silently ignores"). Tests: - middleware: TestResolveWorkspaceIDFromRequest gains a task_token case asserting client-supplied slug/id/query cannot override the bound workspace. - handler: TestUpdateAgent_RedactsMcpConfigForAgentActor and TestUpdateAgent_KeepsMcpConfigForMemberActor pin the mutation- response redaction contract per actor type. Co-authored-by: multica-agent <github@multica.ai> * fix(agents): match redacted mcp_config as JSON null, not Go nil (MUL-2600) `AgentResponse.McpConfig` is `json.RawMessage` without `omitempty`, so the redacted response serialises as `"mcp_config": null`. On decode, `json.RawMessage` keeps the literal bytes `null` rather than collapsing to Go nil, which made the assertion fire on a non-leak. The product contract (field always present, distinguished from "no config" via `mcp_config_redacted`) is intentional, so adjust the test to check for "no secret-bearing content" instead of weakening the contract via `omitempty`. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-25 18:42:48 +08:00
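The sentinel-preserve rule on `PUT /api/agents/{id}/env` described in the commit above can be illustrated with a small pure function. This is only a sketch in TypeScript, not the Go `mergeAgentEnv` helper itself: the sentinel string `"<preserve>"`, the function signature, and the choice to drop keys missing from the request are all assumptions made for illustration.

```typescript
// Hypothetical placeholder; the real sentinel string is not confirmed by the log.
const PRESERVE_SENTINEL = "<preserve>";

type EnvMap = Record<string, string>;

// Merge an incoming PUT body with the stored env map.
// Assumption: PUT replaces the whole map, so keys absent from `incoming` are removed.
function mergeAgentEnvSketch(
  existing: EnvMap,
  incoming: EnvMap,
  sentinel: string = PRESERVE_SENTINEL,
): EnvMap {
  const merged: EnvMap = {};
  for (const [key, value] of Object.entries(incoming)) {
    if (value === sentinel) {
      // Sentinel means "keep what is stored"; a sentinel for an unknown key is dropped.
      const previous = existing[key];
      if (previous !== undefined) merged[key] = previous;
    } else {
      merged[key] = value;
    }
  }
  return merged;
}
```

With this shape, a client that only ever saw masked values can resubmit the form without overwriting real secrets with the mask.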
Naiyuan Qing	e0b756f515	feat(issues): redesign board card layout + extract useTimeAgo i18n hook (#3064 ) * refactor(views): replace static timeAgo with shared useTimeAgo hook The previous timeAgo helper in packages/core/utils.ts hardcoded English output ("2d ago"), producing "更新于 2d ago" mixed-language strings in zh locale. Replaced with a localized useTimeAgo() hook in packages/views/i18n, backed by common.time.{just_now,minutes_ago, hours_ago,days_ago} translation keys. Migrated all 10 view-side call sites and removed the static function. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(issues): redesign board card layout Properties were piling onto the bottom row (assignee + priority badge + start date + due date) until it overflowed. Restructured into four semantic rows: - Top: priority icon (left, icon-only — color already conveys urgency) + identifier; agent activity indicator (right) - Title - Chip row: project + labels - Meta row: assignee (left, avatar + name when only property present; bare avatar otherwise) + start/due dates + child progress Long agent/team names truncate cleanly (min-w-0 + max-w-[160px]) and dates/progress are shrink-0 so they never compress. When the meta row contains only an assignee, the right side fills with "Updated 2d ago" to avoid a half-empty row. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 15:21:10 +08:00
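A rough sketch of the bucketing that a localized `useTimeAgo` hook could sit on top of: the function returns a translation key plus a count, and leaves the actual string to the i18n layer. The key names follow the `common.time.*` keys listed in the commit; the thresholds and the `timeAgoBucket` name are assumptions.

```typescript
type TimeAgoKey = "just_now" | "minutes_ago" | "hours_ago" | "days_ago";

interface TimeAgoBucket {
  key: TimeAgoKey; // suffix of a common.time.* translation key
  count: number;   // interpolation value; 0 for just_now
}

// Pick a translation key and count for the elapsed time between two instants.
// Thresholds (60 s, 60 min, 24 h) are assumed, not taken from the repository.
function timeAgoBucket(then: Date, now: Date): TimeAgoBucket {
  const seconds = Math.max(0, Math.floor((now.getTime() - then.getTime()) / 1000));
  if (seconds < 60) return { key: "just_now", count: 0 };
  const minutes = Math.floor(seconds / 60);
  if (minutes < 60) return { key: "minutes_ago", count: minutes };
  const hours = Math.floor(minutes / 60);
  if (hours < 24) return { key: "hours_ago", count: hours };
  return { key: "days_ago", count: Math.floor(hours / 24) };
}
```

Because the function never builds an English string, a zh locale cannot end up with mixed output such as "更新于 2d ago".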
Naiyuan Qing	fedd0f1694	feat(issues): live agent activity chip + per-issue indicator + filter (#3058 ) * feat(server): broadcast task:running event The dispatched → running transition was silent: only task:queued, task:dispatch, task:cancelled, task:completed and task:failed broadcast over WS. Any UI that distinguishes "queued" from "running" (e.g. the new issue-card agent activity indicator) would lag by up to the 30s agentTaskSnapshot staleTime on the most user-visible transition. StartTask now broadcasts task:running so the workspace snapshot invalidates immediately, keeping the agent activity UI live. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(issues): live agent activity chip + per-issue indicator + filter Surfaces "which agents are working on what, right now" in the Issues and My Issues views, with a one-click filter to narrow the list to issues that have a running agent task. Two visual surfaces: - Workspace chip in the header (left of Filter). Shows the brand-tinted avatar stack of agents currently running on visible issues. Click toggles a page-scoped filter; idle state renders a static "0 working" button with a hover-card placeholder. When the filter is active the chip pins to brand fill across hover and popover states (the Button outline variant otherwise repaints back to neutral). A muted "Viewing only working agents" hint sits to the left of the chip whenever the filter is on, so users notice the active state without having to hover. - Per-issue indicator on every board card and list row (top-right of the identifier line). Renders the avatar stack of agents in running or queued state on that issue, full-opacity ring at brand/70 when ≥1 is running, half-opacity stack when only queued. Returns null when nothing is in flight. Both surfaces open the same hover-card body that lists each active task with the agent avatar, status dot (composed via the existing availability + workload tokens), and a live-ticking duration. 
Adds a new "All" scope to /my-issues that unions assignee, creator, and involves_user_id via three parallel fetches deduped on the client — no backend changes for this part. The chip's count and the quick-filter both use the page's currently visible issue ids so they stay in sync with the active scope. State is per-user (Zustand + localStorage) and the agentRunningFilter is intentionally omitted from partialize — running state changes second-to-second and a stored toggle would land users in an unexplained empty list. WS task:running, already added in the preceding commit, drives real-time updates without polling. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(issues): swap indicator ring pulse for shimmer text label Earlier iterations layered a brand ring with various opacity-pulse cadences around the per-issue avatar stack. Every tuning attempt was either invisible (transparent ring + faded pulse) or oppressive (a visible ring that flashed on a dense board). Moves the "alive" signal onto a small text label and reuses chat's existing `animate-chat-text-shimmer` utility — a soft light sweep across the glyphs that already powers the ChatGPT-style "thinking" cue in task-status-pill. Indicator now reads as a 12 px avatar stack + 10 px label: - Running → full-opacity avatars + shimmering localized "Working" - Queued → half-opacity avatars + muted static "Queued" - Idle → render nothing (unchanged) Avatars and the surrounding card stay completely still; only the few glyphs animate. The label is i18n-driven via the existing `status_running` / `status_queued` keys, so no locale changes are required. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-22 14:20:42 +08:00
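The "All" scope above unions three fetch results on the client. A minimal sketch of that dedupe step, assuming each issue carries a string `id`; the `unionById` name and the first-occurrence-wins rule are illustrative choices, not taken from the repository.

```typescript
interface HasId {
  id: string;
}

// Union several result lists, keeping the first occurrence of each id
// and preserving the order in which ids were first seen.
function unionById<T extends HasId>(...lists: T[][]): T[] {
  const seen = new Map<string, T>();
  for (const list of lists) {
    for (const item of list) {
      if (!seen.has(item.id)) seen.set(item.id, item);
    }
  }
  return Array.from(seen.values());
}
```

The three lists here would be the assignee, creator, and involves_user_id results once the parallel requests have settled.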
YYClaw	614dfae884	MUL-2488 feat(timezone): Scheduling / Viewing two-layer timezone architecture (#2968 ) * docs(timezone): add scheduling/viewing timezone architecture RFC * feat(db): replace daily rollups with task_usage_hourly, add user.timezone Migrations 100-104: add "user".timezone (Viewing tz), build the UTC hourly task_usage_hourly rollup with its pipeline, drop the legacy task_usage_daily / task_usage_dashboard_daily pipelines, and drop the agent_runtime.timezone column. Report queries now slice day boundaries at read time by the caller-supplied @tz instead of materialising in a fixed tz. Regenerate sqlc. * feat(server): add task_usage_hourly backfill command Replace the two legacy backfill commands (daily / dashboard_daily) with a single backfill_task_usage_hourly that loads historical task_usage into the new UTC hourly rollup, sliced per workspace. * refactor(server): resolve viewing timezone in report handlers Report handlers resolve the Viewing tz per request (?tz query param, then user.timezone, then UTC) and pass it to the hourly-rollup queries. Drop the UseDailyRollup feature flags and the old raw-scan/daily-rollup dual paths, remove the /api/usage endpoints, and stop the daemon from reporting and the runtime handler from accepting host timezone. * refactor(core): switch report queries to viewing timezone API client and dashboard/runtime queries send ?tz with each report request, the user schema/types carry the new timezone field, and the runtime timezone field/mutation is removed. * feat(views): add viewing timezone preference and UI Add the useViewingTimezone hook and a Timezone setting in Preferences; report charts and the dashboard week boundary follow the viewer tz. Remove the runtime detail timezone editor and its locale strings. 
* fix(test): update fixtures and stabilize tests for timezone refactor The timezone architecture refactor changed several types without updating dependent test code: - RuntimeDevice no longer has a timezone field — drop it from the create-agent-dialog runtime fixture. - User now requires a timezone field — add it to the apps/web mockUser fixture. - The PreferencesTab timezone tests asserted on the async save handler (PATCH then store update) with a bare expect, racing the mutation's settle callback, and timed out querying the Select's ~600-option IANA list on a loaded CI runner. Wrap the assertions in waitFor and extend the timeout for those three tests. * docs(timezone): document self-host migration order and trigger invariant Add a SELF-HOST UPGRADE ORDER runbook to the backfill command's package comment: applying migrations 100-104 in a single migrate-up drops the legacy daily rollups before the hourly backfill runs, leaving dashboards empty until cron catches up. Add an INVARIANT comment on trg_atq_dirty_hourly noting that agent_id must be added to the trigger's OF list if it ever becomes mutable, otherwise dirty buckets for the old agent_id are silently missed. * style(runtimes): drop trailing blank line in runtime-detail	2026-05-21 15:33:47 +08:00
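The Viewing timezone fallback order from the commit above (`?tz` query param, then `user.timezone`, then UTC) can be sketched as follows. The server side is Go; this TypeScript version only illustrates the ordering. The validity check via `Intl.DateTimeFormat` and both function names are assumptions.

```typescript
// True when the runtime recognises `tz` as an IANA zone name.
// Intl.DateTimeFormat throws a RangeError for unknown zones.
function isValidTimeZone(tz: string): boolean {
  try {
    new Intl.DateTimeFormat("en-US", { timeZone: tz });
    return true;
  } catch {
    return false;
  }
}

// Resolve the Viewing timezone: query param first, then the stored
// user preference, then UTC. Skipping invalid values is an assumption.
function resolveViewingTimezone(
  queryTz: string | null | undefined,
  userTz: string | null | undefined,
): string {
  for (const candidate of [queryTz, userTz]) {
    if (candidate && isValidTimeZone(candidate)) return candidate;
  }
  return "UTC";
}
```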
iYuan	2f1f90c11a	fix(agent): retry codex semantic inactivity fresh (#2593 )	2026-05-20 20:03:39 +08:00
Bohan Jiang	688dcb017c	fix(agents): drop confusing "default" badge from model picker (MUL-2477) (#2938 ) The model dropdown already exposes a "Default (provider)" option meaning "follow the CLI's current selection". Tagging the runtime's preferred model with a small "default" chip created two competing notions of "default" in the same UI and confused users. Remove the chip from both the create-agent ModelDropdown and the inspector ModelPicker; keep the underlying RuntimeModel.default flag intact since thinking-prop-row still uses it as a fallback heuristic. Co-authored-by: multica-agent <github@multica.ai>	2026-05-20 18:07:57 +08:00
Bohan Jiang	ffc0c5ab2e	docs(agent-inspector): sync thinking_level comments with no-override semantics (MUL-2339) (#2923 ) Follow-up to #2919 review nits — comments still described the empty thinking_level as "use runtime default" and claimed ThinkingPicker callers guaranteed non-empty levels. Both were stale after the semantics changed: - packages/core/types/agent.ts: clarify that "" clears the override and the local CLI config / built-in default decides at runtime. - thinking-picker.tsx: document that the stale-orphan clear path in ThinkingPropRow mounts the picker with an empty levels list plus a persisted value, so callers do not guarantee non-empty levels. Co-authored-by: multica-agent <github@multica.ai>	2026-05-20 15:34:27 +08:00
Bohan Jiang	68270e238e	MUL-2339: polish(agent-inspector): optimistic updates + picker layout + thinking-default semantics (#2919 ) * polish(agent-inspector): optimistic updates + picker layout + thinking-default semantics Round of cleanup on the agent inspector pickers after using them end-to-end: 1. Optimistic updates (`agent-detail-page.tsx`) The `handleUpdate` callback that backs every inspector picker (thinking / model / visibility / concurrency / runtime / name / description / avatar) was strictly sequential: `await api.updateAgent → invalidateQueries → toast.success`. Each pick waited 0.5-2s for the network round trip before the trigger chip updated, which read as visible UI lag. Snapshot the cached agent list, patch the matching agent synchronously via `setQueryData`, then run the network request in the background. On error roll back to the snapshot before the toast surfaces the cause. All inspector pickers now respond instantly. 2. Block-in-inline fix in Model + Thinking pickers `PickerItem` wraps its children in a flex `<span>`. The picker bodies had `<div>` children, which is block-in-inline (invalid HTML5) and triggers a browser layout quirk that off-aligns descendants — model IDs floated to the center under their labels in ModelPicker, descriptions indented unevenly under levels in ThinkingPicker. Replace the inner `<div>`s with `<span block text-left>` so the layout is deterministic across rows. 3. Visual polish in Thinking picker Label was `font-medium` at the parent's default `text-sm` (14px), chunky next to the 10px description. Drop to `text-[13px]`, bump description to `text-[11px] leading-snug` with `mt-0.5` so the contrast between rows feels less jarring. 4. Match Model picker's row typography to Thinking's Same `text-[13px]` for label + `text-[10px] mt-0.5` for the model ID. Both pickers now read as the same component family. 5. 
"Default" semantics: follow CLI config, not model factory default The chip displayed "Default" / "default" badge when no `thinking_level` was set, alongside a `[default]` chip on the model's factory-advertised default option in the menu. That was misleading: when Multica omits `--effort` (because picker is unset), it's the user's local CLI config (claude/codex) that decides the reasoning level — not the model's factory default. Showing "medium [default]" while the user has xhigh in their CLI config lies about what actually fires at the API. - Trigger label: "Default" → "Follow CLI config" (zh: "跟随 CLI 配置") - Footer clear button: "Use model default" → "Follow CLI config" - Footer tooltip: explicitly mentions claude/codex CLI config - Inline `[default]` badge on the factory-default option: removed - `defaultLevel` prop chain (picker + prop-row + test): cleaned up as now-dead code 6. Stop hiding the Thinking row while discovery loads `if (levels.length === 0 && !value) return null` hid the row while the runtime-models query was still in flight, which subscribed-then-unsubscribed from useQuery in such a way that the discovery only fired when the user manually opened the Model picker. Gate the early return on `!isLoading && !isFetching` so ThinkingPropRow stays mounted (and thus its useQuery keeps subscribed) until discovery returns; row appears as soon as data arrives, no Model-picker tap required. 7. Drop the inline tooltip on Thinking picker items The same description was rendered both inline under the label (always visible) and as a hover tooltip (overlapping the next row). The hover bubble was redundant — removed. Tests - `pnpm --filter @multica/views test thinking-picker` → 7/7 pass after renaming the "Default" assertion + clearing the unused defaultLevel test prop. - `pnpm --filter @multica/views typecheck` clean. 
* fix(test): align thinking-prop-row tests with renamed copy + loading-aware row gate CI surfaced 3 broken assertions in `thinking-prop-row.test.tsx` — all consequences of the polish PR's behaviour changes that the test file hadn't tracked: - "hides the row when ... no thinking levels and nothing is persisted" The row now stays mounted while runtime-models discovery is in flight (so the useQuery subscription actually survives long enough to issue the request — fixes the bug where Thinking only appeared after manually opening the Model picker). The assertion asserted absence only after `initiate` was called, but loading is still in progress at that point. Wrap the absence assertion in `waitFor` so it waits for the row to disappear after the query settles. - "clears the orphan value via the picker footer" Tooltip copy changed from "Clear and fall back to this model's default reasoning level" → "Clear the override and let the local CLI config decide the reasoning level". Update the regex. - "renders the row with \"Default\" when value is empty" Trigger label changed from "Default" → "Follow CLI config" to reflect that Multica omits --effort and the local CLI config decides. Update the assertion + test name. `pnpm --filter @multica/views test` → 701/701 pass. * fix(agent-inspector): drop loading-row gate + per-field optimistic rollback (MUL-2339) Addressing review feedback on #2919: - ThinkingPropRow no longer keeps the row visible during discovery. The previous explanation ("early return null aborts the useQuery subscription") was wrong — React doesn't unmount a component that returns null, so hooks (and their subscriptions) stay live. The loading-aware gate only succeeded in showing an empty "Follow CLI config" row that opened to an empty menu before discovery settled. Restore the simple `levels empty && !value -> null` behavior; the sibling ModelPicker mounts unconditionally and keeps the shared runtime-models query active regardless. 
- AgentDetailPage.handleUpdate now rolls back only the fields the failing PATCH wrote, instead of restoring a whole-list snapshot. A whole-list snapshot rollback discards any concurrent successful inspector mutation that landed between snapshot and rollback. Per- field rollback + a final invalidate converges the cache on server truth without clobbering unrelated optimistic writes. - Sync the now-stale "use model/runtime default" wording in the thinking-related JSDoc and type comments: empty thinking_level is a "no override" sentinel — the backend omits --effort and the upstream CLI config decides — not a Multica-known default level. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-20 15:18:34 +08:00
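The difference between a whole-list snapshot rollback and the per-field rollback described above is easiest to see on plain data. The sketch below leaves out the query cache entirely and works on arrays; the `Agent` shape and the helper names are assumptions for illustration.

```typescript
interface Agent {
  id: string;
  name: string;
  model: string;
  thinking_level: string;
}

type AgentPatch = Partial<Omit<Agent, "id">>;

// Optimistically apply a patch to one agent in the cached list.
function applyPatch(list: Agent[], id: string, patch: AgentPatch): Agent[] {
  return list.map((agent) => (agent.id === id ? { ...agent, ...patch } : agent));
}

// Undo a failed patch by restoring only the fields it wrote,
// using the values the agent had before that patch was applied.
function rollbackFields(list: Agent[], id: string, failed: AgentPatch, before: Agent): Agent[] {
  const restore: AgentPatch = {};
  for (const key of Object.keys(failed) as (keyof AgentPatch)[]) {
    restore[key] = before[key];
  }
  return applyPatch(list, id, restore);
}
```

If a model change fails while a concurrent rename succeeded, `rollbackFields` restores only `model` and the rename survives; restoring a whole-list snapshot would have discarded it.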
Bohan Jiang	9d3b6e2241	feat(agent): inspector picker for thinking_level (MUL-2339) (#2912 ) * feat(agent): inspector picker for thinking_level (MUL-2339) PR1 (#2865) shipped the backend — column, daemon-side discovery, Claude/Codex injection, API validation — but the agent detail inspector had no UI to set the value. Users could only configure thinking_level via custom_env / API. This wires up the picker so it lives next to Runtime and Model where everything else editable already lives. Picker is per-(runtime, model): it reuses the same `runtimeModelsOptions` query the Model picker already runs (60s cache, no extra round-trip) and reads the active model's `thinking.supported_levels`. When the list is empty — every provider except Claude/Codex today, or a Claude model that doesn't expose `--effort` — the entire PropRow is hidden, not just rendered inert. The picker never gets to invent value/label pairs itself; they come verbatim from each CLI's own catalog (`Low`, `Extra high`, …) so the user sees exactly what `claude --effort` / `/effort` and Codex's TUI show. The `default_level` from the catalog is badged inside the popover so the user knows which value `""` (the persisted "use model default" sentinel) maps to. The clear footer sends `""` explicitly, which the backend already understands as the tri-state "explicit clear" branch of UpdateAgent. Invalid combinations (e.g. picking a value not in the target provider's enum after a runtime swap in the same PATCH) hit the existing 400 path on the server and surface as a toast via the inspector's standard `onUpdate` error handler — no extra client-side guard needed. Exports `RuntimeModelThinking` and `RuntimeModelThinkingLevel` from `@multica/core/types` so views consumers can refer to them by name. i18n keys added in EN and zh-Hans (parity test green). 
Co-authored-by: multica-agent <github@multica.ai> * fix(agent): preserve unknown thinking_level in picker label Stale persisted values (model swap, CLI catalog shrink) used to render as 'Default' even though the backend would still ship the orphaned token. Fall back to the raw value when no entry matches so the user sees what's actually saved and can clear it. Co-authored-by: multica-agent <github@multica.ai> * test(agent): unit tests for thinking-picker label + clear flow Covers the default-vs-set trigger label, the unknown-token preservation path added in `3452fae3f`, the read-only display, picking and re-picking into onChange, and the clear footer's empty-string emission. Co-authored-by: multica-agent <github@multica.ai> * fix(agent): keep Thinking row visible when value is stale (MUL-2339) Inspector was hiding the row whenever the active model had no supported_levels, which also hid persisted orphan tokens (model swap into a non-thinking runtime, or a CLI catalog that shrank). PR1's per-model invalid behavior is daemon-side warn/drop, not a synchronous DB clear, so the frontend has to surface the raw value and let the user explicit-clear it via the picker footer. Hide the row only when levels are empty AND value is empty; otherwise keep it. Extract ThinkingPropRow into its own file so the row-level logic is unit-testable. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-20 13:47:19 +08:00
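The row-visibility and label rules for the Thinking picker can be written as two pure functions. This sketch follows the commit title ("keep Thinking row visible when value is stale") and the raw-value fallback described above: the row disappears only when there are no levels and nothing is persisted. The function names and the `emptyLabel` parameter are assumptions.

```typescript
interface ThinkingLevel {
  value: string;
  label: string;
}

// Hide the row only when the model exposes no levels AND nothing is persisted.
// A stale persisted value keeps the row visible so the user can clear it.
function shouldShowThinkingRow(levels: ThinkingLevel[], value: string): boolean {
  return !(levels.length === 0 && value === "");
}

// Label for the trigger chip: the catalog label when the value is known,
// the raw token when it is orphaned, and `emptyLabel` when no override is set.
function thinkingTriggerLabel(levels: ThinkingLevel[], value: string, emptyLabel: string): string {
  if (value === "") return emptyLabel;
  const match = levels.find((level) => level.value === value);
  return match ? match.label : value;
}
```

Passing the empty-state label in as a parameter keeps the logic independent of whatever copy the UI uses for the unset state.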
Qi Yijiazhen	d9ae891064	fix(avatar): stop bg-muted bleeding through transparent images (#2670 ) ActorAvatar applies bg-muted on its container regardless of whether an image is loaded, so transparent regions of PNG/SVG avatars reveal the grey placeholder. agent-detail-inspector also wraps ActorAvatar in an outer bg-muted div, layering a second grey square. Make bg-muted conditional on the fallback state in ActorAvatar, and drop the redundant bg-muted from avatar-picker's image-loaded branch and the two inspector wrappers. Empty-state placeholders unchanged.	2026-05-18 18:23:46 +08:00
Naiyuan Qing	20c2f45b4a	fix(views): surface backend error messages on mutation failures (MUL-2317) (#2772 ) * fix(views): surface backend error messages on mutation failures (MUL-2317) Mutation toasts across the views package were swallowing the backend `error` string and showing only a generic i18n fallback. This made it impossible for users to see why an operation failed (most visibly: creating an issue with a duplicate title produced a vague "Failed to create issue" toast). The fix has three pieces: 1. Create-issue duplicate branch (A段) - New schema `DuplicateIssueErrorBodySchema` in core/api/schemas.ts. - `create-issue.tsx` parses `ApiError.body` via `parseWithFallback` and renders a dedicated amber-toned toast with a "view existing" link when the server returns `{ code: "active_duplicate_issue", issue: {...} }`. Schema drift downgrades to the normal error toast. - Schema intentionally omits `issue.status` so the toast does not depend on `StatusIcon`, which has no fallback for unknown enums. 2. User-facing mutation failure toasts (B段) - 47 sites converted to `err instanceof Error && err.message ? err.message : <existing fallback>` — preserves all existing code-specific branches (slug conflict, agent_unavailable, daemon_version_unsupported) and i18n keys. - Covers Type 1 (onError) and Type 2 (catch block) patterns across issues, projects, autopilots, inbox, runtimes, squads, comments, batch actions, workspace create, and agent config tabs. 3. Autopilot partial-success (Type 3) - New i18n keys `toast_create_partial_with_reason` / `toast_update_partial_with_reason` (double-brace `{{reason}}`). - `autopilot-dialog.tsx` captures `err.message` in the schedule `catch` and routes to the `_with_reason` variant when present, preserving the partial-success semantic (autopilot saved, schedule failed) while exposing the actual reason. 
Explicitly out of scope: - `packages/core/` mutation hooks (no global onError, no UI dependency) - No `toastApiError` helper (matches existing 14+ correct sites) - Sub-issue link aggregate `Promise.allSettled` keeps count-based toast (N independent requests cannot collapse to one err.message); only added a dev-side `console.error` per rejection. - Clipboard catches and `useUpdateChatSession` (not API mutation toasts) Tests: - `packages/core/api/schemas.test.ts` — schema contract (valid body, forward-compat fields, rename rejection, missing issue, wrong types). - `packages/views/modals/create-issue.test.tsx` — duplicate toast + view link, schema-drift fallback, err.message surfacing, non-Error fallback (4 new cases). - `packages/views/autopilots/components/autopilot-dialog-i18n.test.ts` — real i18next, asserts rendered text contains the reason verbatim (guards against `{reason}` vs `{{reason}}` regression). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * fix(autopilots): unify rotate-token catch + cover dialog partial-success render Address reviewer feedback on PR #2772: 1. webhook-token rotate (`autopilot-detail-page.tsx`) now follows the `err.message ?? fallback` ternary used by the sibling trigger delete/add paths, instead of swallowing the error. 2. Extract `formatSchedulePartialFailureToast` so the dialog's partial-success branches and the i18n test exercise the same helper. The test now drives the actual format function, so a variable-name typo at the call site (e.g. `{ msg }` instead of `{ reason }`) fails the substring assertion. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * test(modals): drop user.type for title in success path to dodge CI 5s timeout The success-path test typed the 42-character title via userEvent which triggers a controlled re-render per keystroke. 
On the slower CI runner the whole test crept up to ~5s and intermittently tripped the default vitest timeout. Setting the value in one shot via fireEvent.change cuts the cost while leaving the submit + toast interactions on userEvent. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai>	2026-05-18 13:44:10 +08:00
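The fallback expression quoted in the commit above (`err instanceof Error && err.message ? err.message : <existing fallback>`) is small enough to show in isolation. The commit explicitly chose not to add a shared `toastApiError` helper, so the wrapper function below exists only to make the expression testable; its name is hypothetical.

```typescript
// Prefer the backend-provided message; fall back to the generic i18n string
// when the thrown value is not an Error or carries an empty message.
function mutationErrorMessage(err: unknown, fallback: string): string {
  return err instanceof Error && err.message ? err.message : fallback;
}
```

At a call site this would read roughly as `toast.error(mutationErrorMessage(err, t("...")))`, with the translation key left as whatever the site already used.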
Naiyuan Qing	0079a73430	fix(views): narrow agent/squad create dialogs to max-w-2xl (#2706 ) Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai>	2026-05-15 23:09:15 +08:00
Naiyuan Qing	f64d182fd1	fix(views): narrow agent/squad create dialogs from max-w-5xl to max-w-4xl (#2688 ) Both create dialogs were too wide at 5xl (1024px). Align with the codebase convention for full create dialogs (create-project, create-issue expanded) which use max-w-4xl (896px). Keeps both modals consistent. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai>	2026-05-15 17:59:45 +08:00
Jiayuan Zhang	2f0e5b589e	[codex] Add member and agent task views	2026-05-15 07:23:00 +02:00
Jiayuan Zhang	675ed02aa6	MUL-2216: persist Mine/All tab selection on Agents and Squads pages (#2624 ) * MUL-2216: feat(agents,squads): persist Mine/All tab selection per workspace Tab selection on the Agents and Squads list pages was held in component-local state, so navigating into a detail page and back remounted the list and reset the tab to the default "Mine". Move `scope` into Zustand stores backed by `persist` + `createWorkspaceAwareStorage`, matching the pattern used by the Issues view store. Selection now survives list → detail → back navigation and page reloads, scoped per workspace. Only `scope` is persisted; `search`, `sort`, and other ephemeral filters intentionally still reset on remount. Co-authored-by: multica-agent <github@multica.ai> * fix(views): reset scope to mine when switching to a workspace with no persisted value zustand persist.rehydrate() is a no-op when storage returns null, so workspaces with no entry kept the previous workspace's in-memory scope ("all" leaked from one workspace into the next). Provide a custom merge that resets to the default "mine" when no persisted state is present. Add coverage for the missing-storage workspace-switch case for both Agents and Squads. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-14 14:11:22 +02:00
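The custom merge from the second fix above can be sketched without importing zustand. The function has the same `(persistedState, currentState)` shape as the `merge` option of the `persist` middleware; the `ScopeState` type and the validation of the stored value are assumptions for illustration.

```typescript
type Scope = "mine" | "all";

interface ScopeState {
  scope: Scope;
}

const DEFAULT_SCOPE: Scope = "mine";

// Merge persisted state into the in-memory state on rehydrate.
// When nothing usable is persisted for the workspace, reset to the default
// instead of keeping the previous workspace's in-memory scope.
function mergeScope(persisted: unknown, current: ScopeState): ScopeState {
  if (persisted && typeof persisted === "object" && "scope" in persisted) {
    const scope = (persisted as { scope: unknown }).scope;
    if (scope === "mine" || scope === "all") return { ...current, scope };
  }
  return { ...current, scope: DEFAULT_SCOPE };
}
```

A default shallow merge would return `current` unchanged when nothing is stored, which is how "all" leaked from one workspace into the next.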
Naiyuan Qing	43b9a1173c	refactor(agents): drop template chooser from create-agent dialog (#2615 ) * refactor(agents): drop template chooser from create-agent dialog Removes the blank-vs-template chooser, the template picker, and the template detail step. The "Create agent" entry point now opens directly on the form. The createAgentFromTemplate API and types remain untouched — this only removes the UI entry. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * docs(squads): fix stale comment about createAgentFromTemplate Squad-scoped create flow no longer goes through the template path; the dialog now only calls api.createAgent then api.addSquadMember. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai>	2026-05-14 18:05:37 +08:00
Naiyuan Qing	77b929fd3e	feat(squads): add agent live peek hover card on member avatars (#2608 ) * feat(squads): add agent live peek hover card on member avatars Squad members tab now opens a live-state peek card on agent avatar hover/focus — workload, current issue (clickable), and last activity. Identity (description / runtime / skills / owner) stays on the existing AgentProfileCard; new AgentLivePeekCard is the second `hoverCardVariant` on ActorAvatar so the 23+ existing profile-card call sites keep their behaviour. Reuses the workspace agent-task snapshot already fetched by the presence dot, so this adds zero new requests per row. Failed terminal tasks surface as a small ⚠ on the last-activity line without polluting workload (workload stays current-state only, matching the deliberate split documented in core/agents/types.ts). Co-authored-by: multica-agent <github@multica.ai> * fix(squads): only enable hover card for agent avatars Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-14 17:30:08 +08:00
Naiyuan Qing	0c4133ef5b	feat(agents): rewrite template catalog as 25 lightweight starters (#2587 ) * feat(agents): rewrite template catalog as 25 lightweight starters Replaces every Phase-1 template with a curated set built around the "persona + intake + scaffold + hard negatives" instruction shape. Cross- platform survey (Cursor / Cline / Roo / Continue / Custom GPTs) showed the industry baseline for starter agents is "few but sharp" — single intent, no methodology buy-in, mostly prompt-only. The original catalog went the opposite direction (avg 2.5 skills, six-skill Full-stack methodology stack) and felt heavy for first-time use. Catalog shape: - 25 templates across 7 categories: Engineering (8), Product (4), Writing (5), Design (3), Communication (2), Team (1), Productivity (2). New Product / Design / Communication / Team domains fill gaps the old Eng-heavy catalog ignored. - 16 / 25 are prompt-only (no skill fan-out). Avg 0.56 skill per template vs. 2.5 prior. Heaviest is 2 skills, only for templates whose intent cannot be expressed in instructions alone (Playwright runner, single- file HTML bundlers, design + UX-guidelines pair). - Universal top-frequency intents that the old catalog missed are now covered: Code Explainer (intent #1 across every platform surveyed), Translator (中英), Summarizer, Writing Critic, PRD Drafter/Critic, RCA Writer, ADR Writer, PR Description Writer, Commit Message Writer. Loader allows 0-skill templates: - server/internal/agenttmpl/loader.go drops the "must declare at least one skill" validation; comment explains the picker's "Prompt only" rendering path. - loader_test.go: removed the corresponding negative case, added TestLoadFromFS_PromptOnlyTemplate as a regression guard. - agent_template.go handler is unchanged — every len(tmpl.Skills) call site was already 0-safe (empty fan-out short-circuits the fetch phase and the in-tx loop both skip cleanly). 
Frontend: - template-picker.tsx: 18 new lucide icons (BookOpen, Bug, GitPullRequest, GitCommit, AlertTriangle, Scale, ClipboardList, Microscope, UserRound, Target, Highlighter, Languages, AlignLeft, GraduationCap, Lightbulb, Type, MessageSquare, Briefcase). Card renders a "Prompt only" badge when skills.length === 0 instead of "0 skills". - template-detail.tsx: skill list section is hidden entirely for prompt- only templates — a header reading "Includes 0 skills" above an empty list was just visual noise. Instructions section below carries the agent's identity for these. - locales/en + zh-Hans agents.json: new create_dialog.template_card. prompt_only key ("Prompt only" / "纯指令"). Verification: - go test ./internal/agenttmpl/ — 9/9 pass, including TestLoad_RealTemplates which fails closed if any new JSON is malformed. - pnpm typecheck — all 6 packages clean. - pnpm --filter @multica/views test — 482/482 pass. - pnpm lint — 0 errors. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(agents): add category filter pills to template picker 25 templates across 7 categories made the picker scroll-heavy on first open. Add a single-select category filter row above the grid so a PM can isolate Product templates in one click, an engineer can jump straight to Engineering, etc. Visual reuses the IssuesHeader scope-toggle pattern verbatim — Button variant="outline" + active class swap (bg-accent / text-muted-foreground) — so the affordance reads the same as the existing filter pills in issues / squads / runtimes / my-issues. flex-wrap keeps the 8 pills (All + 7 categories) honest on narrow widths. Counts are inlined into the label ("Engineering (8)") rather than shown as a separate badge — single-line-tall pills look right next to the picker grid, and surfacing the per-category density up front doubles as a hint at the catalog's "less but sharper" intent. 
When a specific category is active, the grid renders flat (no section headers) — the active pill already names what's on screen, and a header reading "Engineering" above an only-Engineering grid is visual duplication. "All" falls back to the prior grouped layout. State is component-local (no URL sync, no persistence) since the picker is dialog-internal transient state — closing the dialog naturally resets the filter, which is the expected behaviour for a "choose from a catalog" surface. i18n: new `create_dialog.template_picker.filter_all` key in en + zh. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-14 14:12:18 +08:00
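Note on 0c4133ef5b: the pill labels and the "Prompt only" badge described above can be sketched as two pure helpers. This is an illustration under assumptions. `TemplateSummary`, `categoryPills`, and `skillBadge` are hypothetical names, the "All" pill is assumed to carry no count, and the singular/plural wording for non-empty skill lists is a guess.

```typescript
// Hypothetical summary shape for one catalog entry.
interface TemplateSummary {
  slug: string;
  category: string;
  skills: string[];
}

// Builds the filter row: "All" first, then one pill per category with the
// count inlined into the label, e.g. "Engineering (8)".
function categoryPills(templates: TemplateSummary[]): string[] {
  const counts = new Map<string, number>();
  for (const t of templates) {
    counts.set(t.category, (counts.get(t.category) ?? 0) + 1);
  }
  // Map iteration follows insertion order, so categories keep catalog order.
  return ["All", ...Array.from(counts, ([name, n]) => `${name} (${n})`)];
}

// Card badge: prompt-only templates get a dedicated label instead of "0 skills".
function skillBadge(template: TemplateSummary): string {
  const n = template.skills.length;
  if (n === 0) return "Prompt only";
  return `${n} ${n === 1 ? "skill" : "skills"}`; // pluralisation is assumed
}
```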
LinYushen	d1c8c213e4	feat: extend pinyin search to all Agent/Member/Squad selectors (#2582) Integrate matchesPinyin into: - AssigneePicker (issue assignee selector) - IssuesHeader (assignee filter bar) - AgentPicker (autopilot agent selector) - SquadDetailPage (add member/agent picker) - QuickCreateIssue (agent/squad picker) - CreateProject (lead picker) - ProjectDetail (lead picker) - ProjectsPage (lead filter) - AgentsPage (agent search) - SquadsPage (squad search) Closes MUL-2179 extended scope. Co-authored-by: multica-agent <github@multica.ai>	2026-05-14 13:57:38 +08:00
Bohan Jiang	f15a745182	feat(squads): add Create Agent entry on Squad detail (MUL-2178) (#2579 ) Adds a Create Agent button on the Squad detail Members tab, visible only to workspace owner/admin (matching the AddSquadMember backend gate). The dialog reuses the existing CreateAgentDialog — both the manual and template paths now accept an optional squadId; when set, the dialog runs addSquadMember after createAgent / createAgentFromTemplate and skips the navigation to the agent detail page so the user lands back on the Members tab. Atomicity is best-effort frontend-serial (no new backend transaction): on partial failure the dialog surfaces a warning toast and the agent remains addable from the existing Add Member flow. Co-authored-by: multica-agent <github@multica.ai>	2026-05-14 13:32:28 +08:00
Naiyuan Qing	52d032335a	feat(agents): expose runtime + model on create-from-template (#2565 ) Template create used to silently default the runtime to "first usable" and never collected a model — users had no idea where the new agent would run or which model it would use until they opened the detail page. Add a Runtime + Model picker pair above the skill list on the template-detail step so the choice is visible (and overridable) before the one-click Use action. - Extract RuntimePicker out of create-agent-dialog so the form and the template-detail step share one popover; selection seeding moves into the picker too, since it's the only place that knows the active filter (mine/all). Parent keeps just the duplicate-mode pre-fill. - Mirror RuntimePicker's label-row + trigger DOM in ModelDropdown so the two pickers render at identical heights when sat side-by-side (fixes a 6-8px misalignment caused by inconsistent label-row sizing). - Send model in createAgentFromTemplate; server side already accepts the field (CreateAgentFromTemplateRequest.Model, omitempty), empty string still falls through to the runtime's default model. - Drop the runtime_register_first fallback hint that made the Runtime trigger two-line in the empty state, breaking alignment with Model's one-line trigger. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-14 11:33:39 +08:00
Naiyuan Qing	623d29f276	feat(agents): one-click create from curated templates (Phase 1) (#2520 ) * docs(agents): three-phase agent quick-create plan Captures the full design for moving agent creation from manual form + one-by-one skill attachment to a tiered experience: - Phase 1 (this PR): one-click curated templates, AI-free. - Phase 2 (next): AI-recommended skills via the existing quick-create task mechanism — no new server-side LLM dependency. - Phase 3 (later): AI creates the whole agent end-to-end, composing Phase 2 with a new `multica agent create` CLI driver. Documents the architectural decisions that keep all three phases on existing infrastructure (no SSE, no server-side LLM SDK, no new WS channels), the two soft blockers Phase 1 unlocks for later phases (createSkillWithFiles TX composability + skill same-name dedupe), and the scope decisions we explicitly opted out of (Anthropic plugin marketplace, ClawHub UI affordances). Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(skills): harden import against invalid UTF-8 and binary files PG rejects two byte patterns in a TEXT column. Both crashed real skill imports we hit while assembling the template catalog: - Embedded NUL (0x00) -> SQLSTATE 22021. Already stripped by sanitizeNullBytes, kept as-is. - Other invalid UTF-8 (e.g. 0x91 — Windows-1252 smart quote in a skill whose author saved prose from Word). sanitizeNullBytes now also runs strings.ToValidUTF8 over the content so the second class no longer takes the whole import down. For non-text payloads (images, fonts, archives, compiled binaries), sanitization isn't the right fix — agents never read those as text, and the bytes can't survive a TEXT column at all. addFile now skips them by extension before the per-bundle cap counters tick, logging the skip so an unexpected drop leaves a breadcrumb. Function name kept for compatibility with the many call sites; both behaviours are strict supersets of the original. 
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(skills): split createSkillWithFiles for tx composition + add workspace find-or-create query Two soft blockers cleared so create-from-template (next commit) can fold N skill creates and the agent + binding writes into one outer transaction: 1. createSkillWithFiles used to Begin/Commit its own tx. Caller composition was impossible — N invocations meant N separate transactions and no atomicity over the whole materialise step. Pull the body into createSkillWithFilesInTx(ctx, qtx, input); the original function becomes a thin wrapper that manages its own tx for standalone callers. Existing call sites: zero behaviour change. 2. Add GetSkillByWorkspaceAndName sqlc query — workspace skill lookup by name, anchored to UNIQUE(workspace_id, name) from migration 008. Lets the template materialiser implement find-or-create: reuse the workspace's existing skill row when a template references the same name, rather than crashing on the unique constraint or polluting the workspace with `<name>-2` clones. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(agents): agent template catalog + create-from-template endpoint Server-side foundation for Phase 1 of the quick-create roadmap (see docs/agent-quick-create-plan.md). Adds: - server/internal/agenttmpl/ — embed-loaded catalog of curated agent templates. Each template ships pre-written instructions plus a list of skill URLs that get materialised into the workspace at create time. Validation runs at startup (init() panics on a malformed template) so a bad JSON ships as a deploy-time defect, not a runtime 500. Slug must equal the filename basename so the URL router is mirror-symmetric with the file layout. - 11 starter templates covering Engineering / Writing / Building / Testing (code-reviewer, frontend-builder, planner, docs-writer, one-pager, html-slides, full-stack-engineer, …). 
- Three new endpoints, all behind RequireWorkspaceMember: GET /api/agent-templates — picker list (no instructions) GET /api/agent-templates/:slug — detail with instructions POST /api/agents/from-template — materialise + create Create flow: 1. Auth + runtime authorization happen BEFORE the GitHub fan-out so a 403 never wastes 20s of upstream fetches. 2. Pre-flight dedupe by cached_name reuses workspace skills without an HTTP fetch — second create-from-the-same-template drops from 20s to <100ms. 3. Parallel fetch (30s per-URL timeout) for the remaining skills. 4. Single transaction: every skill insert, the agent insert, and the agent_skill bindings. On any upstream fetch failure the TX rolls back and the API returns 422 with `failed_urls` so the UI can name the bad source(s). 5. extra_skill_ids (user-supplied additions) are verified through GetSkillInWorkspace per id before attach, so a malicious client can't graft a skill from another workspace via UUID guessing. - multica agent create --from-template <slug> CLI flag dispatches to the new endpoint with a 60s ceiling, matching `multica skill import`. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(agents): one-click create-from-template UI Frontend half of Phase 1. CreateAgentDialog becomes a state machine spanning four steps: chooser → Start blank / From template cards blank-form → existing manual form (post-chooser) duplicate-form → existing form pre-filled from a duplicated agent template-picker → grid of templates, click navigates to detail template-detail → instructions + skill list preview + one-click Use Picking a template never lands on the form: name auto-deduped against existingAgentNames, runtime = first usable one, visibility = private. Refinement happens on the agent detail page if needed. Same rationale the doc spells out — templates exist precisely to skip configuration. 
New components, all collapsible-by-default so quick-create stays fast: - template-picker.tsx — categorised grid, lucide icons + semantic accent tokens resolved through static maps so Tailwind's JIT picks up every variant (dynamic class strings would silently miss). - template-detail.tsx — instructions preview, skill list with cached descriptions, Use CTA. Renders the failedURLs banner when a 422 fires — the only step that can trigger that response. - instructions-editor.tsx — collapsed preview-card / expanded full ContentEditor. - skill-multi-select.tsx + skill-picker-list.tsx — shared multi- select surface, also adopted by the existing skill-add-dialog. - avatar-picker.tsx — agent avatar upload, mirrors the inspector's visual language. Schema-defended client (CLAUDE.md → API Response Compatibility): the three new endpoints are wired through parseWithFallback with lenient zod schemas. Desktop builds outlive any given server — a future field rename / wrapping must not white-screen older installs. listAgentTemplates accepts both the current bare array and a future {templates: [...]} envelope. Coverage: 7 new schema-test cases in schema.test.ts (null body, missing skills/instructions, malformed create response, envelope migration). Catalog + detail go through TanStack Query with staleTime: Infinity — workspace-independent static data, no per-mount refetch. Other: - skill-add-dialog becomes a true multi-select (Confirm button + checkbox list); attached skills are filtered out of the list. - agents-page hands the freshly-created Agent back to the dialog so a follow-up setAgentSkills can attach the form-selected skills. - agent-overview-pane drops the mx-auto/max-w-2xl frame on config- tab content; the wider dialog visual language reads better with tabs filling the column. - Every new UI string lives in both en/agents.json and zh-Hans/agents.json under create_dialog.* / tab_body.skills.* — locales/parity.test.ts blocks drift in CI. 
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(ci): align skill import test + drop next-only lint suppression - TestFetchFromSkillsSh_ResolvesRootLevelSkillMd now expects assets/logo.png to be skipped; matches the new addFile binary-extension guard (`6fafd86e`). The .png is intentionally dropped so PG TEXT inserts don't hit SQLSTATE 22021. - packages/views shares zero next/* deps, so the @next/next/no-img-element eslint plugin isn't loaded there. The eslint-disable directive referencing it produced a hard "rule not found" error in CI lint. Raw <img> is the right primitive in views; remove the disable comment. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> * test(agents): wrap CreateAgentDialog tests in workspace/navigation providers The dialog now calls useNavigation() and useWorkspacePaths(), both of which throw outside their providers. The existing tests rendered the dialog bare and tripped both new requirements: - NavigationProvider — supply a stub adapter so push() works for the agent-detail redirect. - WorkspaceSlugProvider — useWorkspacePaths() requires a slug. The blank-vs-template chooser is now the default first step; the existing tests target the runtime picker on the manual form, so the helper auto-clicks "Start blank" when no template is passed (duplicate-mode tests skip the chooser). Manual afterEach(cleanup) + document.body wipe. Base UI's Dialog portal renders into document.body and leaves focus-guard/inert wrapper divs behind across tests, so the second test in the suite saw two "All" / "My Runtime" matches and getByText failed. The wipe is local to this file rather than the shared setup because it isn't a global issue — only suites that open Base UI dialogs hit it. 
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: multica-agent <github@multica.ai>	2026-05-13 18:26:04 +08:00
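Note on 623d29f276: the commit says `listAgentTemplates` tolerates both the current bare array and a future `{templates: [...]}` envelope. The sketch below shows one way such lenient parsing could look. It deliberately uses plain type checks rather than the zod schemas and `parseWithFallback` helper the commit mentions, and `TemplateListItem`, `asRecord`, and `parseTemplateList` are hypothetical names.

```typescript
// Hypothetical, trimmed-down item shape; the real schema has more fields.
interface TemplateListItem {
  slug: string;
  name: string;
  skills: string[];
}

function asRecord(value: unknown): Record<string, unknown> | null {
  return typeof value === "object" && value !== null && !Array.isArray(value)
    ? (value as Record<string, unknown>)
    : null;
}

// Accepts a bare array or an envelope object; anything else yields [].
function parseTemplateList(body: unknown): TemplateListItem[] {
  const envelope = asRecord(body);
  const wrapped = envelope ? envelope["templates"] : undefined;
  const rawList: unknown[] = Array.isArray(body)
    ? body
    : Array.isArray(wrapped)
      ? wrapped
      : [];

  const items: TemplateListItem[] = [];
  for (const raw of rawList) {
    const rec = asRecord(raw);
    if (!rec) continue;
    const slug = rec["slug"];
    if (typeof slug !== "string") continue; // unusable without an identifier
    const name = rec["name"];
    const skills = rec["skills"];
    items.push({
      slug,
      name: typeof name === "string" ? name : slug,
      // A missing or malformed skills field degrades to an empty list.
      skills: Array.isArray(skills)
        ? skills.filter((s: unknown): s is string => typeof s === "string")
        : [],
    });
  }
  return items;
}
```

The point of the design is that an older desktop build degrades to an empty or partial picker rather than a white screen when the server response changes shape.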
Bohan Jiang	63d215e1c3	feat(runtime): visibility (public/private) gate on CreateAgent / UpdateAgent (#2419 ) * feat(runtime): visibility (public/private) gate on CreateAgent / UpdateAgent Closes the hole where a plain workspace member could pick another member's runtime in the Create Agent dialog and bind an agent to it — the backend wasn't checking runtime ownership, so the agent ran on someone else's hardware / tokens. Reported on GH #1804. Schema - Migration 083 adds agent_runtime.visibility ('private' default, 'public') with a CHECK constraint. Existing rows default to private — same ownership semantics as before, no behavior change for legacy data. Backend - canUseRuntimeForAgent predicate: allow when caller is workspace owner/admin, the runtime owner, or the runtime is public. - CreateAgent and UpdateAgent both gate on it: UpdateAgent matters because a plain member could otherwise create on their own runtime, then re-bind to a private one. - PATCH /api/runtimes/:id accepts { visibility } — owner/admin only, validated against the same private/public allow-list. Frontend - Create-agent dialog renders other-owned private runtimes disabled with a Lock badge + tooltip explaining who to ask. - Inspector runtime-picker disables the same set so re-binding fails the same way at the UI layer. - Runtime detail diagnostics gains a Visibility editor (owner/admin) or read-only chip (everyone else). - Runtime list shows a private/public chip next to the name. Tests - Go: canUseRuntimeForAgent truth table; CreateAgent / UpdateAgent end-to-end gate tests (admin / runtime owner / plain member); PATCH visibility owner / admin / member / invalid-value coverage. - Vitest: create-agent dialog disabled state on private/public runtimes, default-runtime selection skips locked rows; runtime detail visibility editor → mutation, read-only fallback. Migrating runtimes: existing rows default to private to preserve the "owner only" status quo. 
Owners switch to public via the detail page diagnostics card. Co-authored-by: multica-agent <github@multica.ai> * fix(runtime): apply timezone+visibility atomically; don't seed locked template runtime Two issues surfaced in review of MUL-2062: 1. PATCH /api/runtimes/:id ran the timezone branch first, which: - returned early on a tz no-op, silently dropping a concurrent `visibility` patch in the same body; - committed the timezone mutation (+ usage rollup rebuild) before validating visibility, so an invalid visibility left the row half-updated. Validate every field first, then run the mutations in order. The no-op short-circuit now only triggers when nothing else is requested. 2. The Create Agent dialog in duplicate mode unconditionally seeded `template.runtime_id` as the selected runtime, even when that runtime is now private and owned by someone else — the user saw a selected row they couldn't submit (Create → backend 403). Fall back to the first usable runtime when the template's runtime is locked, and gate the Create button on `selectedRuntimeLocked` as defense in depth. Tests: - Go: TestUpdateAgentRuntime_CombinedPatchAppliesBoth (tz no-op + visibility flip), TestUpdateAgentRuntime_InvalidVisibilityDoesNotMutateTimezone (atomic-fail invariant). - Vitest: duplicate template pointing at a locked runtime now seeds the first usable one; Create button stays disabled when no usable alternative exists. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-11 22:53:07 +08:00
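Note on 63d215e1c3: the `canUseRuntimeForAgent` rule is stated in the commit as "allow when caller is workspace owner/admin, the runtime owner, or the runtime is public". The real predicate lives in the Go backend. What follows is a TypeScript rendering of that rule for illustration only, with hypothetical `Caller` and `Runtime` shapes.

```typescript
type WorkspaceRole = "owner" | "admin" | "member";

// Hypothetical shapes; field names are not taken from the repository.
interface Caller {
  userId: string;
  role: WorkspaceRole;
}

interface Runtime {
  ownerId: string;
  visibility: "private" | "public";
}

function canUseRuntimeForAgent(caller: Caller, runtime: Runtime): boolean {
  // Workspace owners and admins may bind agents to any runtime.
  if (caller.role === "owner" || caller.role === "admin") return true;
  // The runtime's own owner may always use it.
  if (runtime.ownerId === caller.userId) return true;
  // Everyone else only gets public runtimes.
  return runtime.visibility === "public";
}
```

Because the same check is described for both CreateAgent and UpdateAgent, a plain member cannot create an agent on their own runtime and then re-bind it to someone else's private one.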
Bohan Jiang	b26f850d4e	feat(agents): gate private-agent surfaces with allowed_principals predicate (#2359 ) * feat(agents): gate private-agent surfaces with allowed_principals predicate Tighten chat/@-mention, history, edit, and delete entry points so private agents are only reachable by their owner or workspace owner/admin. Agent-to- agent traffic still bypasses the gate so A2A collaboration keeps working. - New canAccessPrivateAgent predicate in handler/agent_access.go; used by comment.enqueueMentionedAgentTasks (replacing the inline check), GetAgent, ListAgents (filter), ListAgentTasks, GetWorkspaceAgentRunCounts / Activity30d / TaskSnapshot (workspace-wide aggregations no longer leak private-agent existence + counts), chat.CreateChatSession, chat.SendChatMessage (re-checks on every send so role changes can't leave a stale session as a back-door), and autopilot.shouldSkipDispatch (caller = autopilot creator). - allowed_principals is computed inline as {agent.owner_id} ∪ workspace owner/admin members. No new table — manual config is intentionally not exposed in v1; the predicate is the extension seam. - Front-end agent detail page distinguishes 403 (private agent the caller can't access) from 404 (deleted/missing) and renders a "no access" placeholder with a back-to-agents button. - Go tests cover the pure predicate matrix + the four protected surfaces; vitest passes for the affected views. Co-authored-by: multica-agent <github@multica.ai> * feat(agents): gate issue assignment with the private-agent predicate Refactor validateAssigneePair to call the shared canAccessPrivateAgent helper. This closes the back door where a plain member could assign a private agent to an issue and let normal task dispatch run it, side- stepping the chat / @-mention gate. Agent callers (X-Agent-ID) bypass so A2A delegation onto a private assignee still works. Add an integration test covering all three callers (workspace owner, agent owner, plain member). 
Co-authored-by: multica-agent <github@multica.ai> * fix(agents): close three private-agent gate bypasses found in PR review 1. X-Agent-ID forgery (resolveActor): require X-Task-ID alongside X-Agent-ID before trusting the agent identity. Without this a plain workspace member could set X-Agent-ID to any visible agent UUID and short-circuit the gate to "actor=agent, allow". Daemons already pair the two headers, so legitimate A2A traffic is unaffected. 2. Chat history read path (chat.go): GetChatSession / ListChatMessages / GetPendingChatTask / MarkChatSessionRead now go through a new gateChatSessionForUser helper that re-applies canAccessPrivateAgent after the ownership check, so a session creator whose role was later downgraded loses transcript access. ListChatSessions and ListPendingChatTasks filter their result sets by the same predicate. 3. Cross-workspace @mention (comment.enqueueMentionedAgentTasks): resolve the mentioned agent via GetAgentInWorkspace scoped to the issue's workspace so a UUID belonging to a different workspace's private agent can't slip past the gate (the gate was being applied against the current workspace's role table, which is the wrong one). Regression tests cover each bypass, plus an update to the resolveActor unit test to reflect the new "X-Agent-ID without X-Task-ID falls back to member" contract. Co-authored-by: multica-agent <github@multica.ai> * test(handler): seed X-Task-ID alongside X-Agent-ID in existing agent-caller tests After tightening resolveActor to require both headers (X-Agent-ID + X-Task-ID) for the "agent" actor identity, three existing tests that set only X-Agent-ID started failing because their requests now resolve to "member" instead of "agent". Add createHandlerTestTaskForAgent helper and seed a task per agent-caller assertion. 
Also patch TestAgentExplicitMentionStillTriggers — it still passed only because the @mention path doesn't care about author type for member callers, but the test claims to exercise the agent path, so make it faithful. Co-authored-by: multica-agent <github@multica.ai> * test(handler): finish X-Task-ID seeding + fix cross-workspace mention test schema The previous CI run still failed in two places: 1. server/cmd/server integration tests — postCommentAsAgent → authRequestWithAgent only set X-Agent-ID, so resolveActor downgraded the request to "member" and the on_comment chain produced the wrong task counts. Fix: authRequestWithAgent now also sets X-Task-ID, fetched or seeded by a new ensureAgentTask(agentID) helper. 2. TestMentionAgent_RejectsCrossWorkspaceAgentUUID's hand-crafted comment INSERT was missing comment.workspace_id, which migration 025 made NOT NULL. Pass testWorkspaceID into the seed row. Build + vet clean locally; both packages compile. Co-authored-by: multica-agent <github@multica.ai> --------- Co-authored-by: multica-agent <github@multica.ai>	2026-05-11 12:39:45 +08:00
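Note on b26f850d4e: two rules from this commit interact. The agent identity is only trusted when X-Agent-ID arrives together with X-Task-ID, and private agents are reachable only by their owner, workspace owner/admin, or another agent. The sketch below illustrates that interaction in TypeScript, although the actual code is Go. All type names are hypothetical, and the `taskBelongsToAgent` lookup is an assumption: the commit does not say how the pairing is verified.

```typescript
type WorkspaceRole = "owner" | "admin" | "member";

type Actor =
  | { kind: "agent"; agentId: string; taskId: string }
  | { kind: "member"; userId: string; role: WorkspaceRole };

interface RequestIdentity {
  userId: string;
  role: WorkspaceRole;
  agentIdHeader?: string; // value of X-Agent-ID, if present
  taskIdHeader?: string; // value of X-Task-ID, if present
}

// X-Agent-ID alone is not trusted; without a matching task the request
// falls back to the authenticated member.
function resolveActor(
  req: RequestIdentity,
  taskBelongsToAgent: (taskId: string, agentId: string) => boolean, // assumed check
): Actor {
  const { agentIdHeader, taskIdHeader } = req;
  if (agentIdHeader && taskIdHeader && taskBelongsToAgent(taskIdHeader, agentIdHeader)) {
    return { kind: "agent", agentId: agentIdHeader, taskId: taskIdHeader };
  }
  return { kind: "member", userId: req.userId, role: req.role };
}

interface AgentRecord {
  ownerId: string;
  isPrivate: boolean;
}

function canAccessPrivateAgent(actor: Actor, agent: AgentRecord): boolean {
  if (actor.kind === "agent") return true; // agent-to-agent traffic bypasses the gate
  if (!agent.isPrivate) return true;
  return actor.userId === agent.ownerId || actor.role === "owner" || actor.role === "admin";
}
```

Under these assumptions, a plain member who forges only X-Agent-ID resolves to a member actor and is held to the same gate as any other member.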
Bohan Jiang	fe8326fa0c	feat(agents): add search box to skill picker dialog (#2269) Filters available skills by name + description (case-insensitive) as the user types. Auto-focuses on open and clears the query on close. Shows a distinct "no match" empty state vs. the existing "all assigned" one. Closes #2266 Co-authored-by: multica-agent <github@multica.ai>	2026-05-08 17:12:11 +08:00
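Note on fe8326fa0c: filtering "by name + description (case-insensitive)" comes down to a substring match on both fields. This is a minimal sketch with a hypothetical `Skill` shape and `filterSkills` name. Trimming the query and returning the full list for an empty query are assumptions, not behaviour stated in the commit.

```typescript
// Hypothetical skill shape for illustration.
interface Skill {
  id: string;
  name: string;
  description: string;
}

function filterSkills(skills: Skill[], query: string): Skill[] {
  const needle = query.trim().toLowerCase();
  if (needle === "") return skills; // assumed: empty query shows everything
  return skills.filter(
    (s) =>
      s.name.toLowerCase().includes(needle) ||
      s.description.toLowerCase().includes(needle),
  );
}
```

An empty result for a non-empty query corresponds to the "no match" empty state, which the commit keeps distinct from the "all assigned" state, where there are no skills left to offer at all.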

90 Commits