100 Commits

Author SHA1 Message Date
LinYushen
cb68669c73 feat(composio): gate MCP apps behind feature flag (#4876)
* feat(composio): server-side connect flow + connections REST (Notion MVP) (MUL-3720) (#4608)

* feat(composio): server-side connect flow + connections REST (Notion MVP) (MUL-3720)

Compose the merged server/pkg/composio SDK into a user-facing connection
manager: signed-state connect handshake, local user_composio_connection
mirror, idempotent disconnect, and a per-user MCP session helper (not yet
wired into task dispatch).

- migration 127_user_composio_connection (no FK/cascade, per DB rules)
- sqlc queries: upsert (idempotent on user_id+connected_account_id), list
  active, owner-scoped get, mark revoked
- internal/integrations/composio: signed HMAC-SHA256 state, BeginConnect,
  CompleteCallback (idempotent upsert), ListConnections, Disconnect
  (upstream 404 = idempotent success), CreateMCPSession (no-op when empty,
  pins connected_accounts per toolkit), CallbackRedirect
- REST handlers under /api/integrations/composio (user-scoped, 503 when
  COMPOSIO_API_KEY unset): connect/init, callback (302), connections list,
  delete
- router wiring gated by COMPOSIO_API_KEY; COMPOSIO_AUTH_CONFIGS_JSON maps
  toolkit->auth_config (MVP: notion); state secret from COMPOSIO_STATE_SECRET
  or derived from JWT_SECRET; callback base from COMPOSIO_CALLBACK_BASE_URL
  or MULTICA_PUBLIC_URL
- tests: state (expire/tamper/wrong-secret), service (mapping, callback
  idempotency, non-success, disconnect owner/404 idempotency, MCP pin),
  handlers (httptest), redact regression for Bearer mcp_ tokens

MVP scope: Notion only; no task-dispatch overlay, sharing, or webhook
event handling (later stages).

Co-authored-by: multica-agent <github@multica.ai>

* fix(composio): bind callback account to user + idempotent revoked disconnect (MUL-3720)

Address PR 4608 review (CHANGES_REQUESTED):

- callback: verify connected_account_id with Composio before mirroring it.
  The signed state only proved user/toolkit/exp, so a valid state paired with
  a tampered connected_account_id would be written verbatim. CompleteCallback
  now calls ListConnectedAccounts and fails closed (ErrAccountVerification)
  unless the account belongs to the state's user (composio_user_id == multica
  user id) and was created under the toolkit's auth config. No row is written
  on mismatch / unknown account / upstream error.

- disconnect: short-circuit to a no-op when the local row is already revoked,
  before touching upstream. Previously a second DELETE re-hit Composio and a
  non-404 upstream error surfaced as a 502, breaking the 204-idempotent
  contract.

- CreateMCPSession: document the v1 single-active-connection-per-(user,toolkit)
  constraint and make duplicate selection deterministic (newest-wins, rows are
  connected_at DESC) instead of order-dependent map overwrite. Stage 3 owns the
  real single-account-enforcement vs multi-account-shape decision.

Tests: tampered/wrong-auth-config/unknown-account callback rejection, revoked-row
disconnect no-op (asserts upstream not re-hit). composio pkg 85% coverage; all
green.

Co-authored-by: multica-agent <github@multica.ai>

* feat(composio): list all toolkits + dynamic auth-config resolution (MUL-3720)

Yushen's follow-up to the Notion MVP: surface the full Composio toolkit
catalog, render it in Settings, and drop the static env mapping in favor of
dynamic auth-config discovery.

Config correctness (per Composio docs):
- Remove COMPOSIO_AUTH_CONFIGS_JSON entirely. The toolkit→auth_config mapping
  is now resolved at request time from the project's /auth_configs (cached,
  5-min TTL), so enabling a toolkit is a dashboard action, not a redeploy.
- Do NOT add COMPOSIO_PROJECT_ID. The project API key (x-api-key) authenticates
  to exactly one project; the project is resolved from the key. Only org-level
  endpoints use x-org-api-key, which this integration never calls.

Backend:
- SDK: server/pkg/composio/auth_configs.go — ListAuthConfigs (toolkit_slug,
  is_composio_managed, show_disabled, limit, cursor).
- service: dynamic resolver (authConfigMap cache; betterAuthConfig prefers a
  custom/white-label config over Composio-managed, newest wins); BeginConnect
  and CompleteCallback resolve via it; ListToolkits fetches the full catalog
  (paginated, capped) annotated with connectable = has an enabled auth config,
  connectable-first ordering.
- handler + route: GET /api/integrations/composio/toolkits (user-scoped, 503
  when COMPOSIO_API_KEY unset) returning slug/name/logo/category/connectable.

Frontend:
- core: ComposioToolkit/ComposioConnection types, api client methods, and
  composio query options (@multica/core/composio).
- views: Settings → Integrations now has a Composio section rendering every
  toolkit as a card with search. Connect is gated on `connectable`;
  non-connectable toolkits show a muted "not configured" hint instead of a
  dead button. Connected toolkits show a badge + Disconnect (with confirm).
- i18n: composio block added to en/zh-Hans/ja/ko settings.

Tests: SDK + service (dynamic resolution, custom-over-managed preference,
connectable flag, resolver-error soft-degrade) and handler toolkits endpoint;
composio pkg 85.7% coverage. go build/vet/gofmt clean; core+views typecheck,
core+views lint, and core tests (691) all green.

Co-authored-by: multica-agent <github@multica.ai>

* fix(composio): close cross-toolkit callback fail-open by signing auth_config_id into state (MUL-3720)

Re-review blocker: CompleteCallback resolved the toolkit's auth config at
callback time and ignored a resolve error/empty result, while
verifyAccountOwnership skipped the auth-config comparison when the expected
value was empty. A user could then pass another toolkit's connected_account_id
into this toolkit's callback — the owner check passed and it was written under
the wrong toolkit_slug/account binding.

Fix: the auth_config_id is already resolved in BeginConnect (before the state
is signed), so sign it into the state and compare it exactly at callback. No
re-resolve, no fail-open. verifyAccountOwnership now fails closed when the
expected auth config is empty (rejects instead of skipping) and requires an
exact match — closing the cross-toolkit binding gap.

Tests: state round-trips auth_config_id; BeginConnect signs it; callback
rejects wrong/cross-toolkit auth config and an empty (no-mapping) auth config
fails closed. composio pkg 85.2% coverage, all green.

Frontend (non-blocking): the Composio settings tab now surfaces an error when
the connections query fails instead of silently rendering everything as
unconnected.

Co-authored-by: multica-agent <github@multica.ai>

* fix(composio): hide Settings section entirely when integration unconfigured (MUL-3720)

Decision (option 2, hide-then-merge): don't show a card that leaks the internal
COMPOSIO_API_KEY env-var name to every end user. IntegrationsTab now gates the
whole Composio section (heading + body) on the toolkits query — a 503 means the
key is unset, so the section is withheld instead of rendering the not-configured
card. Admin-only setup guidance is a later, role-gated affordance.

Removed the notConfigured card (and now-unused ApiError import) from
ComposioTab; it only mounts when configured. views typecheck + lint clean.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>

* feat(composio): Stage 2 frontend polish — callback toast, last_used & expired UI, e2e (MUL-3718) (#4688)

* feat(composio): callback toast + refresh, last_used & expired UI, e2e (MUL-3718)

Co-authored-by: multica-agent <github@multica.ai>

* fix(composio): real callback redirect route + StrictMode-safe toast dedup (MUL-3718 review)

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>

* fix(composio): callback endpoint should not require Multica auth (MUL-3843) (#4709)

* fix(composio): move OAuth callback out of the Auth group (MUL-3843)

Composio 302-redirects the browser to /api/integrations/composio/callback
at the end of the OAuth flow, but PR #4608 mounted it inside the cookie-auth
middleware group. When the session cookie is absent (expired session,
SameSite=Strict / Safari ITP, private window, self-hosted callback subdomain)
the Auth middleware returned a hard 401 and a JSON blob instead of the
settings redirect, breaking the flow.

Identity never came from the cookie anyway: it is carried by the HMAC-signed
state param that CompleteCallback verifies (signature, expiry, replay) and
cross-checked by verifyAccountOwnership; h.Composio == nil still 503s. So the
callback is registered alongside the other public OAuth/webhook routes; the
other four composio endpoints stay session-gated.

Refs MUL-3843, MUL-3715.

Co-authored-by: multica-agent <github@multica.ai>

* fix(composio): correct stale callback routing comments (MUL-3843)

The package header and ComposioCallback doc comments still described the
callback as sitting under the Auth middleware group. After the route was
moved out (this PR), update both to state it is a public route whose identity
comes from the signed state — addressing review nit from 张大彪.

Refs MUL-3843.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>

* feat(composio): inject MCP overlay into agent runtime at task dispatch (MUL-3721) (#4704)

Stage 3 of the Composio epic. Wires the per-user Composio MCP session into
every agent task so the agent process sees the initiator's connected tools
without any prompt-time plumbing.

Server side
  - Migration 128 adds agent_task_queue.runtime_mcp_overlay JSONB plus a
    BEFORE-UPDATE trigger that wipes the column on any transition into a
    terminal status (completed / failed / cancelled). A trigger is the single
    source of truth — future queries that flip status cannot bypass it.
  - composio.Service.BuildTaskOverlay(userID) reuses CreateMCPSession and
    emits the Claude-style { mcpServers: { composio: { type: http, url,
    headers } } } shape the daemon's existing sidecar generators consume.
    Returns (nil, nil) on zero active connections so we never burn a
    Composio session for a user with nothing to call.
  - TaskService grows a Composio ComposioOverlayBuilder seam, wired in
    router.go after composiointeg.NewService succeeds. Five enqueue paths
    (issue / mention / quick-create / chat / auto-retry) attach the overlay
    after CreateAgentTask returns and before the daemon is notified — so
    every claim reads a settled row, with no second daemon hop. Best-effort:
    a builder failure logs and proceeds with no overlay.
  - resolveInitiatorFromTriggerComment derives the initiator user from the
    trigger comment when it was authored by a member. Agent-authored
    triggers are not treated as initiators (their connected-apps view is
    empty by construction).

Daemon side
  - handler/daemon.go claim path merges task.runtime_mcp_overlay onto
    agent.mcp_config via mergeMCPOverlay before populating
    TaskAgentData.McpConfig. Overlay wins on server-name collisions
    because it carries the live user-scoped session URL. Errors fall back
    to the agent config unchanged — a bad overlay must not surprise-disable
    saved MCP tools. The existing execenv sidecar generators (cursor /
    codex / openclaw / opencode / hermes-kiro) need no changes: they keep
    consuming the merged result through TaskAgentData.McpConfig.

Tests
  - 9 merge cases (mcp_overlay_test): both-nil short-circuit, agent-only
    pass-through, overlay-only canonicalization, two-side merge, name
    collision (overlay wins), top-level key preservation, malformed agent
    fallback, malformed overlay fallback, non-object server rejection.
  - 4 dispatch cases (composio): zero-connections returns nil without
    CreateSession, happy-path emits the right shape with the right user
    id, empty-URL defensive branch, SDK error surfacing.
  - 4 TaskService helper cases: nil Composio is a no-op (Queries-safe),
    invalid initiator does not call the builder, nil overlay skips the
    UPDATE, builder error swallowed without panic.
  - Migration 128 verified to roll up + down + up cleanly against the test
    database.

Out of scope (deferred): assignment-triggered enqueue paths with no
trigger comment get no overlay attached today (no initiator UUID flows
through enqueueIssueTask in that case). Retry paths recompute the overlay
fresh from the parent's initiator_user_id instead of inheriting the bearer
from the parent row, so a stale token can never resurface on a retry.

Co-authored-by: Eve <eve@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>

* feat(composio): per-agent allowlist + originator-scoped MCP overlay (MUL-3869) (#4736)

* feat(composio): per-agent allowlist + originator-scoped MCP overlay (MUL-3869)

Stage 3.1 of the Composio epic (MUL-3721 parent). PR #4704 wired in the
runtime_mcp_overlay column and a per-task dispatch hook; this change
inverts the default from "all-on" to opt-in and locks the overlay to the
agent owner's own connected apps:

- Agents carry composio_toolkit_allowlist TEXT[]. NULL or [] => no MCP.
  Owner-only read/write; non-owner GET/PUT silently redacts/drops the
  field (same shape as mcp_config).
- agent_task_queue carries originator_user_id UUID. Set from the
  top-of-chain HUMAN at every enqueue path:
    * issue/mention comment by member  -> author_id
    * issue/mention comment by agent   -> inherit via comment.source_task_id
                                          -> parent task originator_user_id
    * quick-create                     -> requester_id
    * chat                             -> initiator_user_id
    * retry                            -> SQL-inherited from parent row
    * autopilot                        -> NULL (system-driven)
- BuildTaskOverlay (composio dispatch) now takes (ctx, originatorUserID,
  agent) and short-circuits on five gates: invalid originator,
  originator != agent.owner_id, empty allowlist, empty intersection of
  allowlist ∩ active connections, defensive empty session URL. Composio
  CreateSession is called with BOTH `toolkits.slugs` (the intersection)
  AND `connected_accounts` (the pinned account ids), narrowing the
  tool-router twice.
- The originator-vs-owner gate closes the agent-fanout privacy hole: any
  workspace member who can @-mention a public agent used to project the
  owner's connected apps into their run. Now the overlay only mounts
  when the human at the top of the chain IS the agent owner.

Tests:
- dispatch_test.go covers all 5 gates plus uppercase/whitespace slug
  normalisation.
- task_runtime_mcp_overlay_test.go covers the no-op gates of the new
  applyRuntimeMCPOverlay signature.
- agent_composio_allowlist_test.go (handler): owner roundtrip
  (list/empty/null), workspace-admin silent-drop, owner-only GET
  visibility, pure normaliseComposioToolkitAllowlist.
- resolve_originator_test.go (service, DB-backed): member-authored,
  agent-authored inherits via comment.source_task_id, invalid id.

Migration 129 up/down/up verified against docker postgres.

Co-authored-by: multica-agent <github@multica.ai>

* chore(composio): gofmt + regenerate sqlc with v1.31.1 (MUL-3869 review nits)

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>

* fix(composio): accept nested connected account auth config

* feat(views): creator-only MCP tab for per-agent Composio allowlist (MUL-3870) (#4743)

Stage 3.2 frontend on top of the Stage 3.1 backend (MUL-3869, 4708dba97).
Adds an agent-detail tab that lets the agent owner pick which of their own
active Composio connections this agent may mount as MCP servers, writing the
selection to agent.composio_toolkit_allowlist via the existing PUT /api/agents.

- core/types: composio_toolkit_allowlist (+ _redacted) on Agent; tri-state
  composio_toolkit_allowlist on UpdateAgentRequest (omit/no-change, null/clear,
  array/replace), matching the backend contract.
- core/agents: useUpdateAgentAllowlist - optimistic mutation hook (patches the
  cached workspace agent list, rolls back on error, invalidates on settle).
- views: AgentMcpTab renders the owner's active connections as checkboxes;
  empty state links to Settings -> Integrations; defensive redacted state.
- views: wired into AgentOverviewPane as tab "composio_mcp", labeled "MCP Apps"
  to disambiguate from the existing raw-JSON "MCP" (mcp_config) tab. The entry
  is gated to the creator (currentUserId === agent.owner_id), matching the
  backend's owner-only read/write of the allowlist.
- i18n: tabs.composio_mcp + tab_body.composio_mcp.* in en/ja/ko/zh-Hans.
- tests: agent-mcp-tab.test.tsx (gating, toggle->allowlist body, active-only,
  empty, redacted); e2e/agent-mcp.spec.ts (creator sees tab + PUT body,
  non-creator hidden) with Composio + agent endpoints mocked at the boundary.

Note: the product spec says "creator"; the schema has no creator_id - the
backend gate and redaction are keyed on owner_id, so the tab uses owner_id.

Co-authored-by: multica-agent <github@multica.ai>

* fix(composio): mount remote MCP for codex

* feat(agents): agent invocation permission system (MUL-3963) (#4844)

* feat(agents): agent invocation permission system (permission_mode + invocation targets)

MUL-3963: split who may INVOKE an agent out of the overloaded visibility
column into an explicit, extensible model on feature/composio-integration.

- DB: agent.permission_mode (private|public_to) + agent_invocation_target
  table (workspace/member/team targets) + lossless backfill from visibility
  (migration 130).
- canInvokeAgent: owner-only for private (NO admin bypass, NO A2A bypass);
  public_to honours the allow-list; A2A judged by the top-of-chain originator.
- All trigger paths rewired: issue assign, comment @agent/@squad, chat,
  quick-create, autopilot, squad leader, child-done.
- Agent API: permission_mode + invocation_targets on responses and
  create/update (owner-only writes); legacy visibility kept as a derived field
  so old clients never see a permission widening.
- Composio: BuildTaskOverlay now FOLLOWS invocation permission and uses the
  agent OWNER connection (removed the originator==owner gate); front-end warns
  when a shared agent enables Composio apps.
- CLI: --permission-mode / --public-to-workspace / --public-to-member (legacy
  --visibility still mapped).
- Frontend: AccessPicker (Private / workspace / specific people / team soon),
  permission rules mirror canInvokeAgent, Composio warning banner.
- Tests: migration backfill, admin cannot invoke others private, public_to
  workspace/member whitelist, A2A by originator, Composio overlay uses owner
  connection.

Co-authored-by: multica-agent <github@multica.ai>

* feat(agents): stackable, mixed public_to invocation targets (MUL-3963)

Follow-up on PR #4844: public_to now supports selecting MULTIPLE, MIXED
targets on one agent (e.g. Public to workspace + specific people + team),
with canInvokeAgent admitting on ANY matching target (OR).

- Frontend AccessPicker: reworked from a single exclusive kind into a
  stackable multi-select — an "Everyone in workspace" toggle, a member
  multi-select checklist, and a (disabled, v1) team placeholder can be
  combined freely. Emits the full union of selected targets; empty union
  collapses to Private. Existing team targets are preserved across saves.
  Added the access.public_group locale string (en/zh-Hans/ja/ko).
- Backend already supported this (agent_invocation_target is multi-row per
  agent; create/update take a target ARRAY and batch-replace the whole
  allow-list; canInvokeAgent OR-matches). Added tests to lock it in:
  mixed member+team targets, overlapping-member batch replace, and
  workspace+member stacking then narrowing.

Refs MUL-3963.

Co-authored-by: multica-agent <github@multica.ai>

* fix(agents): address review on invocation permission (MUL-3963)

张大彪 review on PR #4844 — three blockers + product ruling + nits:

1. Migration 130: drop the FK/cascade on agent_invocation_target
   (agent_id, created_by) per the Multica no-FK rule; relationships are now
   maintained in the app layer (matching MUL-3515 §4). Added
   DeleteAgentInvocationTargetsByArchivedRuntimeAgents and call it before
   DeleteArchivedAgentsByRuntime in all three runtime-delete paths
   (runtime.go x2, runtime_profile.go) so hard-deleting agents can't orphan
   target rows.
2. revokeAndRemoveMember: prune the leaving member's member-target grants
   (DeleteAgentInvocationTargetsByMember) in the same tx as the member-row
   delete, so a re-invited user can't reclaim a stale invocation grant.
3. Empty public_to is a phantom — parsePermissionInput now normalises a
   public_to with no resolvable targets to a single workspace target, so
   `--permission-mode public_to` alone (and any empty target array) means
   "public to workspace" instead of "shared but nobody can run it".

Product ruling: the system/no-human-originator → workspace-target path in
canInvokeAgent is a deliberate, documented exception (webhook/system/
workspace-wide automation); member/team targets still fail closed without a
resolved originator. Documented in code + locked with a test.

Nits: refreshed the stale "originator must be owner" comments — models.go
(via migration 130 COMMENT ON COLUMN + sqlc regen for composio_toolkit_allowlist
and originator_user_id) and agent-mcp-tab.tsx — to the owner-connection +
invocation-permission rules.

Tests: member remove/re-add regression, system workspace exception + member
fail-closed, empty public_to → workspace (plus the earlier mixed/overlap/
batch-replace suite). Migration 130 applied to the test DB; Go handler/service/
composio suites green; views typecheck clean.

Refs MUL-3963.

Co-authored-by: multica-agent <github@multica.ai>

* fix(agents): scope member invocation-target cleanup to one workspace (MUL-3963)

张大彪 3rd review — cross-workspace permission bug + comment nits:

- DeleteAgentInvocationTargetsByMember was a GLOBAL delete by user id, so
  removing a user from workspace A also wiped their member-target grants on
  agents in workspace B. Scoped it to a single workspace by joining through
  agent.workspace_id; revokeAndRemoveMember now passes (workspaceID, userID).
- Regression test TestRevokeMember_InvocationTargetCleanupIsWorkspaceScoped:
  same user allow-listed by agents in two workspaces; removal from one leaves
  the other workspace's target intact.
- Nits: refreshed the remaining stale "originator == agent.owner_id" /
  "owner-vs-originator" comments — CreateRetryTask (agent.sql, regenerated),
  and the AgentResponse allowlist doc + ListAgents/UpdateAgent redaction
  rationale in agent.go — to the owner-connection + invocation-permission rule.

Migration 130 applied to the test DB; Go handler/service/composio suites green;
go vet clean.

Refs MUL-3963.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>

* fix(agents): agent access owner-only editable, read-only for others (MUL-3963) (#4853)

* fix(agents): make agent access owner-only editable, read-only for others (MUL-3963)

Interaction bug: a non-owner (incl. workspace admin) could open the AccessPicker
and set an agent public — the backend silently ignored it and the UI bounced
back to private. Access is owner-only, so non-owners must see a read-only state
and the backend must reject real changes explicitly.

Frontend:
- AccessPicker renders a static, non-interactive read-only state when the
  viewer is not the owner: the current access value + a lock affordance + a
  tooltip "Only the agent owner can change who can run this agent." No clickable
  trigger is rendered, so a non-owner can never open a control the backend would
  reject (the GitHub/Notion pattern for permission settings you can see but not
  edit). The editable multi-select picker is unchanged for the owner.
- agent-detail-inspector gates the picker on ownership specifically
  (currentUserId === agent.owner_id), NOT the general canEdit (which also admits
  admins, who may edit other fields but not access).
- New locale key access.owner_only_readonly (en/zh-Hans/ja/ko).

Backend:
- UpdateAgent now returns an explicit 403 when a non-owner submits a REAL
  permission change (permissionInputChangesAgent compares requested mode +
  target set against the persisted state); a no-op resubmit (admin PATCH-as-PUT
  echoing unchanged permission) is still tolerated so admin edits of other
  fields keep working. Replaces the previous silent-drop that caused the bounce.

Tests:
- access-picker.test.tsx: non-owner gets a non-interactive read-only display
  with the owner-only tooltip; owner gets an interactive picker; owner can pick
  a member and stack workspace + member.
- TestUpdateAgent_AccessChangeIsOwnerOnly: admin real change → 403; admin no-op
  resubmit → 200; admin editing other fields → 200; owner change → 200.

Incidental: fixed a pre-existing base typecheck break in
slash-command-suggestion.test.tsx (stray `signal` arg not in the suggestion
items type) that otherwise fails the whole @multica/views typecheck.

Refs MUL-3963.

Co-authored-by: multica-agent <github@multica.ai>

* fix(agents): compare legacy visibility, not expanded permission, for no-op detection (MUL-3963)

PR #4853 review: permissionInputChangesAgent expanded a legacy-only
visibility:"private" into a real private permission and compared it against the
agent's actual permission. A member-only public_to agent derives legacy
visibility "private", so an admin PATCH-as-PUT echoing visibility:"private"
while editing another field was misread as a public_to→private downgrade and
rejected with 403 — contradicting the "unchanged permission no-op is allowed"
contract.

Fix (per review): when a request carries ONLY legacy `visibility` (no
permission_mode / invocation_targets), derive the agent's CURRENT legacy
visibility from its real targets and compare the legacy string values. Equal =
no-op (allowed); a real legacy change (e.g. "workspace") still returns 403.
Requests that carry permission_mode / invocation_targets keep the precise
mode+target comparison.

Regression test TestUpdateAgent_LegacyVisibilityNoOpForMemberOnlyPublicTo:
member-only public_to agent — admin submitting visibility:"private" + a
non-permission field → 200 with targets unchanged; admin submitting
visibility:"workspace" → 403.

Go handler/composio suites green; migration 130 applied; go vet clean.

Refs MUL-3963.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>

* feat(composio): brief agents on connected apps

* feat(composio): gate MCP apps behind feature flag

* fix(mobile): parse agent invocation permissions

* fix(tests): update agent fixtures for access fields

---------

Co-authored-by: multica-agent <github@multica.ai>
Co-authored-by: Multica Eve <eve@devv.ai>
Co-authored-by: Eve <eve@multica.ai>
Co-authored-by: Eve <eve@multica-ai.local>
2026-07-03 14:18:43 +08:00
Bohan Jiang
3a6d3522c8 feat(slack): two-command channel reads — chat history (overview) + chat thread [id] (MUL-3871) (#4762)
Replaces the single scoped `multica chat history --scope` read with two clean
noun-commands so the agent can navigate a channel with many threads (e.g. read
the specific thread a user referred to):

- `multica chat history` — the channel OVERVIEW: recent top-level messages, each
  thread tagged with thread_id + reply_count + latest_reply (it does NOT expand
  thread contents). Backed by GET /api/chat/history + slack.History.ChannelOverview
  (conversations.history).
- `multica chat thread [id]` — read one thread: no id = the thread you're in,
  an id = a specific thread IN THE SAME channel. Backed by GET /api/chat/thread +
  slack.History.Thread (conversations.replies; DM falls back to history).

The channel stays server-pinned to the session; a thread id is only a
within-channel locator, so the security boundary (no cross-channel reads) is
unchanged. `--scope` is removed.

The prompt now teaches both commands and, via a new chat_in_thread signal
(derived from the binding: last_thread_id != last_message_id), tells the agent
which to start with — `chat history` for a top-level @mention, `chat thread` for
an in-thread one.

Tests: slack ChannelOverview/Thread (current/by-id/DM-fallback/no-binding/clamp),
handler both endpoints + auth, prompt top-level vs in-thread guidance.

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-07-01 12:46:47 +08:00
Bohan Jiang
a961d63611 feat(slack): make the chat agent explicitly channel-aware (MUL-3871) (#4755)
Before this, the chat prompt only carried a generic, always-on hint ('if this
came from a chat channel...'), and the task carried no channel signal — so the
agent never definitively knew it was inside Slack. For an ambiguous ask like
'what did you just talk about', it could read Multica instead of the Slack
conversation.

- Thread a chat_channel_type ('slack') signal: the server sets it on the chat
  task response when the session has a Slack binding
  (GetChannelChatSessionBindingBySession); the daemon Task carries it.
- buildChatPrompt now emits an EXPLICIT block only when channel-backed: 'You are
  operating inside a Slack conversation … this conversation and its history live
  in Slack, NOT in Multica … read it with multica chat history, do NOT look in
  Multica.' Web-only chat sessions get no such block (their history is the
  Multica chat_session the agent already resumes).

Tests: slack-backed prompt asserts the explicit Slack/“NOT in Multica”/command
copy; web-only prompt asserts the block is absent.

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-30 17:24:46 +08:00
beast
20eecfb093 fix(projects): honor repo resource checkout refs (MUL-3593) (#4470) 2026-06-24 16:25:17 +08:00
Bohan Jiang
5038c983c0 MUL-3281: Add daemon skill bundle refs (#4445)
* feat: add daemon skill bundle refs

Co-authored-by: multica-agent <github@multica.ai>

* fix: tighten skill bundle resolve safeguards

Co-authored-by: multica-agent <github@multica.ai>

* feat: add task prepare lease

Co-authored-by: multica-agent <github@multica.ai>

* fix: isolate prepare lease concurrent index migration

Co-authored-by: multica-agent <github@multica.ai>

* fix: keep prepare lease active through start

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-23 16:19:16 +08:00
Naiyuan Qing
3ce97453b3 fix(issues): pre-trigger preview + run-confirm + handoff UX polish (MUL-3375) (#4454)
* fix(issues): stop issue-trigger preview flicker

The pre-trigger preview re-rendered/refetched on every workspace task
event: WS task lifecycle invalidated issueTriggerPreviewAll (staleTime 0),
forcing a background refetch whose isFetching was surfaced as isLoading,
collapsing and reopening CreateRunHint's reveal band.

The assign source (create / assignee change) cancels existing tasks before
enqueuing, so its verdict can't shift from a task event at all; the status
source's pending dedup could, but the preview is advisory and the write
path re-evaluates authoritatively, so a rare stale label is harmless. Drop
the WS invalidation so the preview refetches only on input (signature)
change. Keep the comment-trigger invalidation — its verdict genuinely
changes mid-compose and its chips drive an immediate, unconfirmed send.

Align the hook's data handling with the comment-trigger preview:
keepPreviousData so an input switch swaps in place instead of collapsing,
and treat only the first load (no prior data) as loading.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(issues): skip run-confirm modal for backlog assign

Assigning a Backlog issue to an agent/squad never starts a run (the
parking lot — server/internal/service/issue_trigger.go), so the
pre-trigger confirm modal only rendered an empty "won't start" box with
a single Apply button. Apply directly instead: the single path checks
issue.status, the batch path skips only when every selected issue is
Backlog (mixed selections still confirm — the non-backlog ones trigger).
Mirrors the existing backlog short-circuit in handleBatchStatus.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(modals): run-confirm loading state + submit spinner

The dialog grew in height after open: it rendered the short "won't
start" variant while POST /api/issues/preview-trigger was in flight, then
the note box appeared when the predicate landed. Keep the note box
mounted (disabled) during loading so assign mode opens at its resolved
height, and show a Spinner + 'checking' headline while loading.

Submit had no feedback — buttons only disabled, which read as frozen for
note assigns (the request starts an agent server-side). Track which
footer action is in flight and show a Spinner on the clicked button.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(issues): show handoff note in execution-log trigger text

An assignment-triggered run that carried a handoff note showed the
generic "Initial run" label. Surface the note inline (truncated, like
comment triggers show their text) so the row reads as the handoff.

taskToResponse now populates handoff_note for all callers (dropping the
now-redundant explicit set in ClaimTaskByRuntime); the field is added to
the AgentTask type + zod schema (optional, additive — old clients ignore
it via the loose schema, new clients fall back to "Initial run").

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 16:15:44 +08:00
Naiyuan Qing
4ab335b8a5 MUL-3416: Issue pre-trigger preview + Handoff Note (#4383)
* feat(issues): unify run-enqueue decision behind WillEnqueueRun + preview endpoint

Collapse the issue update/batch enqueue copies into one service predicate
service.IssueService.WillEnqueueRun, shared verbatim with a new dry-run
endpoint POST /api/issues/preview-trigger so the four entry points stop
drifting (squad/self-loop/batch omissions, MUL-3375). The private-agent gate
stays at the HTTP boundary: write paths inject allow-all, preview injects the
real gate so it never leaks a private agent's readiness.

Add suppress_run to issue update/batch: the change applies but no run starts.
Remove the now-dead handler mirrors shouldEnqueueSquadLeaderOnAssign /
isSquadLeaderReady. service.Create and the comment trigger chain are untouched.

Tests: preview behavior, preview<->write-path match, batch aggregation,
member no-trigger, suppress_run skip, malformed-body 400.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* feat(issues): inject handoff note into assigned runs via first-class task field

Add an optional handoff_note carried by issue assign/promote into the run's
opening prompt and issue_context.md, via a dedicated agent_task_queue column
(migration 122) and a daemon assignment-handoff render branch — never a
fabricated comment, never trigger_comment_id (MUL-3375 §6.1).

Thread the note through enqueueIssueTask/enqueueMentionTask + WithHandoff
public variants and dispatchIssueRun; suppress_run or a parked write drops it
(no run = nothing to inject). Soft version gate: MinHandoffCLIVersion +
HandoffSupported, surfaced per-trigger as handoff_supported in the preview so
the UI can gray the note box on old daemons; the assignment never hard-fails.

Tests: daemon prompt + issue_context render via the assignment branch (not
quick-create/comment), version helper matrix, note persists on the task,
suppressed assign enqueues nothing.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* feat(issues): leave a display-only handoff record on the timeline

When an assign/promote with a handoff note starts a run, write one
type='handoff' timeline record via TaskService.RecordHandoff — a direct
Queries.CreateComment + timeline event that bypasses Handler.CreateComment, so
it never reaches triggerTasksForComment and cannot start a second run
(MUL-3375 §6.2, the must-not-retrigger invariant). Author is the actor who
handed off; body is the note. Migration 123 admits the 'handoff' comment type.
Recorded only on a real run start: suppress_run or a parked write writes
nothing. enqueueSquadLeaderTask now reports whether it enqueued so the trace
is gated on an actual dispatch.

Test: exactly one handoff record on assign-with-note, exactly one task (no
re-trigger), and no record when suppressed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* feat(issues): frontend plumbing for issue-trigger preview + handoff (core)

Add api.previewIssueTrigger + IssueTriggerPreviewSchema (zod parseWithFallback),
the use-issue-trigger-preview hook, issueKeys.issueTriggerPreview(+All) with WS
queue-state invalidation, suppress_run/handoff_note on UpdateIssueRequest, the
'handoff' CommentType, and stripping of the control fields from optimistic
update/batch cache patches (MUL-3375 §9).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* fix(issues): exclude handoff records from new-comment counting

type='handoff' is a display-only timeline record, not conversation. Exclude it
from CountNewCommentsSince so a handoff note never inflates the count of
"new comments to catch up on" fed to a claiming agent (MUL-3375 §12). Analytics
already excludes it (RecordHandoff is a direct write that emits no analytics
event), and the comment-trigger path is already bypassed.

Test: a handoff record does not bump the new-comment count; a real comment does.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* feat(issues): pre-trigger preview UI, handoff note, timeline card (web/desktop)

Wire the §9 frontend onto the preview endpoint + handoff fields:
- Delete the backlog blocking dialog (backlog-agent-hint*) and its modal type;
  the over-eager nag is gone. Backlog awareness is now a passive label.
- RunConfirmModal: single assign + batch assign/status route here. Shows the
  backend predicate's verdict ("将启动 @X" / "将启动 N 个" / parked), an optional
  handoff note (assign only, soft-gated by handoff_supported), and 暂不启动 —
  then applies via update/batch. No frontend guessing.
- create modal: passive CreateRunHint ("将启动 @X" / backlog parked).
- single status change stays a direct apply (unchanged).
- timeline: render type='handoff' as a distinct, non-interactive handoff card.
- i18n run_confirm + handoff_card across en/ja/ko/zh-Hans; drop backlog action
  keys; locale parity green.

Tests: use-issue-actions (assign → run-confirm modal, member → direct),
create-issue + comment-card suites updated/green; views typecheck + lint clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* test(issues): use a valid anchor in the handoff count-exclusion test

CountNewCommentsSince filters id <> @anchor_id; SQL id <> NULL is NULL and
excludes every row, so an empty anchor made the control assertion read 0. The
production caller always passes a real anchor — mirror that with a non-matching
sentinel uuid.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* test(issues): RunConfirmModal apply logic (start/suppress/note-gate/batch)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* test(core): preview schema malformed/missing/null fallback coverage

Cover IssueTriggerPreviewSchema via parseWithFallback (MUL-3375): well-formed
parse, top-level + item default fills (empty/older backend), and fallback to
{ triggers: [], total_count: 0 } for malformed shapes, a dropped required
issue_id, a wrong-typed total_count, and null/non-object bodies — so the four
entry points degrade to "nothing will start" instead of throwing.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* refactor(issues): remove display-only handoff timeline record (留痕)

The handoff "留痕" timeline record (type='handoff' comment written on run
start) was judged superfluous and dropped per product call. This removes
only the display-only trace; the handoff NOTE injection into the run's
opening prompt + issue_context.md is untouched.

- backend: drop RecordHandoff + its call in dispatchIssueRun
- db: drop the `type <> 'handoff'` exclusion in CountNewCommentsSince and
  migration 123 (comment_type_check reverts to the 4-type set from 001);
  no production data exists for this unreleased feature
- frontend: drop the "handoff" CommentType, HandoffCard, and handoff_card
  i18n (all locales)
- tests: drop handoff_count_test.go and the record-write assertions in
  issue_trigger_preview_test.go (note-injection tests retained)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* feat(issues): dismissable run-confirm modal + team-handoff copy

Two fixes to the pre-trigger confirm modal (MUL-3375).

1. Dismissable: switch RunConfirmModal from AlertDialog to the standard
   shadcn Dialog so it has the close (X) button + Esc + click-outside.
   Previously the only choices were "start" / "don't start now" with no
   way to abort the action entirely; dismissing now cancels with no write.

2. Copy: rework the action-surface wording away from the backend term
   "run" toward team-handoff voice — 指派 / 开始 / 交接 (run stays only on
   record surfaces). Unifies the note's three names to "交接说明", and
   parallels the rewrite across en/ja/ko.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* chore(agent): bump handoff note min CLI version to 0.3.28

The daemon release that renders handoff notes ships in 0.3.28 (0.3.27
was the prior tag), so move the soft-gate threshold up. Below this the
note is silently dropped and the frontend grays the note box — assignment
is never blocked.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(issues): skip run-confirm when batch-moving issues to backlog

A move into backlog never starts a run (service/issue_trigger.go), so the
pre-trigger confirm modal degenerated to an empty "won't start" box with a
single Apply button — pure friction. Apply directly instead, matching the
single-issue status path. Other target statuses still route through the
modal.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(issues): refine pre-trigger preview hint and copy

- Move the create-issue run hint to a reveal band (grid 0fr→1fr) above the
  property toolbar. It was sharing the footer button row and, lacking a
  width constraint, reflowed the submit buttons whenever it appeared.
  Restyle to a borderless, comment-style avatar+caption that is purely a
  caption (non-interactive avatar).
- Distinguish squad from agent in the pre-trigger copy: a squad's leader
  evaluates and delegates rather than "starting work" itself. Add
  will_start_named_squad / will_start_squad / create_will_start_squad across
  en/zh/ja/ko (reusing the squad_leader_* evaluate→arrange vocabulary) and
  branch run-confirm + the create hint on squad assignees.
- Bold the assignee name in the run-confirm headline via a language-safe
  sentinel split (no per-language prefix/suffix keys).
- Align zh "开始处理" → "开始工作" on the single-assign copy.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(issues): stub ActorAvatar in create-issue suite

CreateRunHint now renders an ActorAvatar for agent/squad assignees, which
pulls in getActorInitials/getActorAvatarUrl + the workspace/presence/navigation
hook tree. This form-focused suite only stubbed getActorName, so the
squad-forwarding test crashed with "getActorInitials is not a function". Stub
the avatar inert — its own behavior is covered elsewhere.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Walt <walt@multica.ai>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-23 13:17:13 +08:00
Bohan Jiang
da72e2fa22 feat(daemon): inject project description into the agent brief (MUL-3465) (#4395)
* feat(daemon): inject project description into the agent brief

Issues bound to a project only surfaced the project title in the runtime
brief; the project description (durable, project-wide context the owner
sets) was loaded but dropped. Carry it end-to-end:

- claim handler reads proj.Description onto the response (issue-bound and
  quick-create paths)
- new ProjectDescription field on AgentTaskResponse, daemon Task, and
  TaskContextForEnv
- rendered in the brief's `## Project Context` section and written to
  .multica/project/resources.json as project_description

Empty descriptions render nothing (no extra heading). Updated the
projects-and-resources built-in skill docs in the same change.

MUL-3465

Co-authored-by: multica-agent <github@multica.ai>

* feat(projects): clarify project description is injected as agent context

The project description is now durable context injected into every task's
brief, but the UI still presented it as a plain "Description" field, so
existing descriptions could silently become agent input. Add a hint under
the description editor on the project detail page and in the create-project
modal, in all four locales, stating it is shared with agents as context for
every task in the project. No data-semantics change.

Addresses review feedback on PR #4395. MUL-3465

Co-authored-by: multica-agent <github@multica.ai>

* test(handler): assert project description flows through task claim

The execenv tests cover brief rendering, but nothing pinned the claim
handler boundary where proj.Description is read onto the response. Add
two tests — issue-bound and quick-create paths — so a regression in that
assignment fails loudly instead of silently dropping the description.

Addresses review feedback on PR #4395. MUL-3465

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-22 23:39:27 +08:00
Jiayuan Zhang
eb6dffdbc6 MUL-3341: clear incompatible model on runtime switch
Closes MUL-3341
2026-06-17 08:23:20 +02:00
Bohan Jiang
f9c193e06b fix: fail closed on agent task auth tokens (#4142)
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-15 16:34:35 +08:00
YOMXXX
34d4cd3a28 feat(openclaw): support connecting to existing OpenClaw gateway (#3260) [MUL-3158] (#3664)
* feat(openclaw): support connecting to existing OpenClaw gateway (#3260)

When the daemon host is a lightweight dev machine or CI coordinator, the
heavy agent work (LLM inference, code execution, tool use) often belongs
on a more powerful remote server already running an OpenClaw gateway.
Multica historically hard-coded `openclaw agent --local`, forcing every
turn to execute in-process on the daemon host.

This change adds an opt-in gateway routing mode controlled per-agent via
`runtime_config`:

  {
    "mode": "gateway",
    "gateway": { "host": "...", "port": 18789, "token": "...", "tls": false }
  }

- Backend: ExecOptions gains OpenclawMode + OpenclawGateway; buildOpenclawArgs
  drops `--local` when mode == "gateway". Per-task openclaw-config.json
  wrapper pins gateway.{host,port,auth.{mode,token},tls} so users do not
  need to edit the daemon host's `~/.openclaw/openclaw.json` to point at
  a different endpoint.
- Daemon: AgentData carries the raw runtime_config; decoding is fail-soft
  (malformed JSON falls back to local mode rather than blocking dispatch).
- API: gateway.token is masked to "***" on every GET; PATCH replays the
  sentinel back, and the update handler restores the persisted token so
  the round-trip never destroys the secret. Defense-in-depth masking on
  WS broadcasts, plus String/MarshalJSON masking on the in-memory struct
  to block stray `%+v` / json.Marshal leaks.
- UI: openclaw-only "Routing" tab on the agent detail page with mode
  selector + structured endpoint form. Token uses a "saved — submit a
  new value to rotate" UX and matching backend preserve hook.

Empty `runtime_config` keeps the historical embedded behaviour, so
existing agents are unaffected.

* fix(openclaw): address #3664 review — drop dead gateway field, gate pin on mode

Per Bohan-J's review:

- Remove the dead ExecOptions.OpenclawGateway field (+ its String/MarshalJSON and
  the daemon.go construction block). It carried the plaintext bearer token but was
  never read — buildOpenclawArgs only consumes OpenclawMode and the live gateway
  path runs through execenv.OpenclawGatewayPin — so this narrows the secret's
  footprint.
- Gate the gateway pin on mode=="gateway" in decodeOpenclawRuntimeConfig: a
  {"mode":"local","gateway":{...,"token"}} payload no longer writes the token into
  the 0o600 per-task wrapper that --local makes openclaw ignore.
- Warn on an unrecognized non-empty mode (e.g. "gatway") instead of silently
  falling back to local.
- Run preserveMaskedGatewayToken in CreateAgent too, so a literal "***" at create
  time can't persist as a real bearer token.
- Document the gateway host:port trust boundary (SSRF note for shared daemon hosts).

Adds regression tests for the local-mode pin drop and the unknown-mode warning.
2026-06-13 15:33:28 +08:00
Bohan Jiang
c8ab73d38d MUL-3244: Bind quick-create attachments to created issues (#4062)
* fix: bind quick-create attachments to created issues

Co-authored-by: multica-agent <github@multica.ai>

* test: use real image markdown in quick-create attachment test

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-12 16:45:38 +08:00
Bohan Jiang
24b162cdbc feat(daemon): surface the real task initiator to the agent runtime (MUL-2645) (#3899)
* feat(daemon): surface the real task initiator to the agent runtime (MUL-2645)

In a multi-person workspace the agent runtime only ever saw the runtime
OWNER identity: the brief's `## Requesting User` is sourced from
runtime.OwnerID and the task-scoped token is owner-bound, so every
requester (whoever commented, @mentioned, or chatted) appeared to the
agent as the owner. Agents that route by initiator for permission,
privacy, or audit all misjudged.

Resolve the real task initiator at claim time and surface it distinctly
from the owner:
- comment / mention trigger -> triggering comment's author (member or agent)
- chat task -> chat session creator (sessions are creator-only)
- on-assign / autopilot / quick-create -> no attributable initiator (omitted)

Adds initiator_{type,id,name,email} to the claim response, the daemon
Task, and TaskContextForEnv, rendered into the brief as a new
`## Task Initiator` section. The section documents the privacy boundary:
the agent's credentials stay owner-scoped, so this is an attested
identity for the agent's own routing/privacy logic, not act-as. No DB
migration — both paths are derivable from existing rows.

Tests: brief rendering (member/agent/omit/sanitize) + email guard unit
tests, and claim-handler tests for the comment and chat paths.

Co-authored-by: multica-agent <github@multica.ai>

* fix(chat): store real sender as task initiator, not chat_session creator (MUL-2645)

Review fix (Niko, PR #3899). v1 resolved the chat task initiator from
chat_session.creator_id at claim time. That is correct for web chat and
Lark p2p (creator == sender), but WRONG for Lark group chats: the group
session creator is deliberately the installer (stable identity across
member churn), not the message sender. So in a Lark group, every member
who triggered the agent showed up in the brief as the installer/owner —
the exact bug this issue is about, still live at that entry point.

Capture the real sender at enqueue time instead of deriving it from the
session creator at claim time:

- migration 117: agent_task_queue.initiator_user_id (FK user, ON DELETE
  SET NULL); NULL for non-chat and pre-migration rows.
- EnqueueChatTask now takes an explicit initiatorUserID. Web chat passes
  the authenticated request user; the Lark dispatcher threads the inbound
  sender (binding.MulticaUserID) through scheduleRun -> flushChatRun. The
  debouncer keeps the latest scheduled flush per session, so in a multi-
  sender silence window the LATEST sender wins (documented + tested).
- claim handler resolves the initiator from task.initiator_user_id and
  drops the creator_id fallback entirely.

The Lark group session creator stays the installer (unchanged) — only the
task initiator is corrected, keeping the two concepts cleanly separate.

Tests: dispatcher group regression (initiator = sender, not installer),
latest-sender-wins, p2p initiator assertion; the chat claim handler test
now sets creator != initiator and asserts the stored sender wins.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-08 19:29:57 +08:00
Bohan Jiang
3808049361 fix(codex): set semantic thread names (#3887)
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-08 14:53:31 +08:00
Naiyuan Qing
b9334dd59f fix: anchor comment triggers to thread roots (#3746)
Co-authored-by: multica-agent <github@multica.ai>
2026-06-04 13:47:05 +08:00
LinYushen
de900b2ba6 feat(server): funnel/community/commercial business metrics + PostHog pairing (MUL-2949) (#3698)
* feat(server): funnel/community/commercial business metrics + PostHog pairing (MUL-2949)

PR3 of the Grafana board metrics split (parent MUL-2328).

Adds 23 new Prometheus counter/histogram families to the PR2 BusinessMetrics
collector covering the activation/community/commercial funnels, and binds
every PostHog event emission to a matching metric increment so the two sides
cannot drift.

Funnel: signup, workspace_created, team_invite_sent/accepted, onboarding_*,
cloud_waitlist_joined.
Content: issue_created, chat_message_sent, agent_created, squad_created,
autopilot_created, issue_executed.
Runtime: runtime_registered/ready/failed/offline + ready_seconds histogram,
daemon_ws_message_received_total.
Autopilot: autopilot_run_started/terminal/skipped.
Webhook/GitHub: webhook_delivery_total, github_event_received_total,
github_pr_review_total, github_pr_merge_seconds histogram.
CloudRuntime: cloudruntime_request_total + duration histogram, wired through
a small RequestRecorder interface so the cloudruntime package stays decoupled
from metrics.
Commercial: feedback_submitted, contact_sales_submitted.

The pairing helper metrics.RecordEvent(client, m, ev) emits the PostHog
event AND increments the matching counter via IncForEvent dispatch, reading
labels from the analytics event Properties. Every existing
h.Analytics.Capture(analytics.X(...)) call site has been migrated to the
helper across handler/, service/, and cmd/server/runtime_sweeper.go.

Lint enforcement (server/internal/metrics/business_pairing_test.go):
- TestEveryAnalyticsEventHasPrometheusCounter: every Event* constant in
  analytics/events.go either dispatches via IncForEvent or is in the
  taskMetricEvents allow-list (PR2 typed RecordTask* methods).
- TestNoNakedAnalyticsCaptureInHandlersOrServices: AST-walks handler/
  service/cmd-server for direct Analytics.Capture(...) calls — only
  service/task.go's captureTaskEvent helper is allow-listed.
- TestEveryAnalyticsRecordEventTakesAnalyticsHelper: validates the third
  arg of every metrics.RecordEvent call is built from analytics.*.

Cardinality protection: all new label values pass through fixed allow-lists
in labels_pr3.go; unknown values collapse to 'other'/'unknown'/'error'.

Refs:
- Spec MUL-2328 / MUL-2949.
- Builds on PR2 (MUL-2948) — collectors registered through the same
  BusinessMetrics struct, no separate Registry.
- Uses PR1's taskfailure.Reason (MUL-2946) for runtime_failed's failure_reason
  label via NormalizeFailureReason.

Out of scope: Sampler-class metrics (PR4 / MUL-2947), pr_review_total
emission point (no review event handler exists yet — counter is defined,
TODO to wire up when /api/webhooks/github grows pull_request_review handling).

Co-authored-by: multica-agent <github@multica.ai>

* fix(server): tighten PR3 review items — signup_source bucket, fill platform/kind/form_source enums, onboarding_started server emission, lint scope (MUL-2949)

Addresses 张大彪's review on #3698:

1. signup_source: NormalizeSignupSource added to labels_pr3.go with a
   fixed allow-list bucket (direct/google/twitter/linkedin/.../other).
   Parses JSON cookie payload for utm_source/source/referrer fields,
   strips URL schemes, maps well-known hostnames to channel buckets.
   PostHog event still ships the raw cookie value for analytics; only
   the Prometheus label is bucketed.

2. Filled the unknown/other label gaps:
   - analytics.IssueCreated and analytics.ChatMessageSent now take a
     platform parameter sourced from middleware.ClientMetadataFromContext
     (X-Client-Platform header) at the handler. Autopilot-originated
     issues stamp PlatformServer.
   - analytics.FeedbackSubmitted now takes a kind parameter; CreateFeedback
     reads req.Kind (default "general") so the picker selection lights up
     the metric's kind label instead of long-term "other".
   - analytics.ContactSalesSubmitted now takes a formSource (page /
     onboarding / agents_page); CreateContactSales reads req.Source.
     The metric reads ev.Properties["form_source"] so the analytics
     CoreProperties.Source ("marketing_contact_sales") stays
     backward-compat for PostHog dashboards.

3. analytics.OnboardingStarted helper added; server-side emission lives
   in PatchOnboarding, fired exactly once per user on the first PATCH
   that carries a non-empty questionnaire payload (firstTouch logic
   compares prior bytes against {} / null). Frontend onboarding_started
   keeps firing on page open; the server emission is what guarantees the
   Prometheus counter exists so Grafana can be cross-checked against the
   PostHog funnel without depending on the SDK roundtrip.

4. business_pairing_test.go tightened:
   - TestNoNakedAnalyticsCaptureInHandlersOrServices now allow-lists at
     function granularity (just captureTaskEvent in service/task.go), not
     whole-file. Any future naked Capture in the same file fails CI.
   - TestEveryAnalyticsRecordEventTakesAnalyticsHelper now does def-use
     tracking inside the enclosing FuncDecl: when RecordEvent's third
     arg is an *ast.Ident, the test walks the function body for the
     assignment that defined it and confirms the RHS is an
     analytics.<Helper>(...) call. Bare local idents that didn't
     originate from analytics are now caught.

5. gofmt -w applied across the touched files; gofmt -l clean.

Tests: go test ./internal/metrics/... ./internal/analytics/... pass.
Pre-existing TestClaimTask_/TestWebhook_MergedPR/TestDeleteIssueByIdentifier
failures on origin/main are DB-environment-dependent and not regressions
from this change.

Co-authored-by: multica-agent <github@multica.ai>

* fix(server): normalise onboarding_started platform label + regression test (MUL-2949)

Addresses 张大彪's last review nit:

- IncForEvent's EventOnboardingStarted case now wraps the platform
  property with NormalizePlatform, matching every other platform-bearing
  metric. A misbehaving frontend can no longer leak a raw X-Client-Platform
  header value into the multica_onboarding_started_total{platform=...}
  series.

- New labels_pr3_test.go covers every PR3 normalizer with both a happy-path
  value and an unknown value, asserting the unknown collapses to the
  documented fallback bucket. Includes a focused regression for
  onboarding_started: emits one event with an attacker-shaped platform
  string and asserts the metric only exposes web + unknown label values
  (no raw header bleed).

- testutil.go gains a small GatherForTest helper so the regression test
  can pull the typed MetricFamily map without re-implementing the
  registry-walk dance.

Co-authored-by: multica-agent <github@multica.ai>

* fix(server): NormalizeTaskSource on workspace_created + document lint limitations (MUL-2949)

Final review touch-ups before merge:

- IncForEvent's EventWorkspaceCreated case wraps source through
  NormalizeTaskSource, matching the other source-bearing dispatches
  (issue_created, agent_created, issue_executed). Closes the last raw
  property leak in the dispatcher table.

- business_pairing_test.go inline docstrings now spell out the two
  known limitations of the lint gate that 张大彪 / Eve flagged:
  analyticsBackedIdents matches by ident NAME (not SSA def-use, so a
  nested-scope shadow could pass) and isMetricsRecordEvent hard-codes
  the import alias set. PR description carries a Follow-ups section
  with the same two items so the work is visible after merge.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: 魏和尚 <agent+wei@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-03 16:39:06 +08:00
Multica Eve
9616d78e47 MUL-2785: optimize resumed comment reads (#3509)
* feat(comments): skip default thread read on resumed comment sessions

Co-authored-by: multica-agent <github@multica.ai>

* fix(comments): scope since delta to trigger thread

Co-authored-by: multica-agent <github@multica.ai>

* chore(comments): address thread delta review nits

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-29 14:57:14 +08:00
Naiyuan Qing
3187bbf90c feat(comments): re-add since-delta + cold-start thread read + parent-root write normalization (#3494)
* feat(comments): since-delta new-comment hint + default-on comment session resume (#3432)

* feat(db): add unresolved comment count + list filter queries

Add CountUnresolvedComments (excludes the agent's own comments) and
ListUnresolvedCommentsForIssue. Both are additive — existing callers stay
on the unfiltered queries — so old clients are unaffected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(handler): support unresolved-only comment listing

Wire an additive `unresolved` query param into ListComments. Defaults off
so an old CLI that never sends it gets unchanged behavior; only true/1
enable it. Rejects combining unresolved with thread/recent (whole-issue
filter vs navigation models). Includes filter + count query tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(handler): plumb unresolved count + thread root into claim, gate comment resume

Populate trigger_parent_id (thread root of the trigger comment) and
unresolved_count (excludes the agent's own comments) on comment-triggered
claim responses. Both fields are omitempty so old daemons ignore them.

Gate comment-triggered session resume behind MULTICA_RESUME_COMMENT_SESSION
(default off): resumed comment turns can inherit the prior turn's "Done."
final message, so this stays an explicit rollout switch. The runtime-match
and poisoned-session guards still apply regardless of the flag.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(daemon): inject unresolved-comments hint + resolve step into agent brief

Add a shared BuildUnresolvedCommentsHint helper rendered on both the
per-turn prompt and the CLAUDE.md workflow (kept in sync per PR #2816). It
ships only the count and the relevant CLI call — never comment bodies — so
the server stays cheap. Thread case points at --thread <root>; issue case
points at --unresolved. Suppressed when the count is 0.

Also add a workflow step telling the agent to `multica comment resolve
<thread-root>` once a thread is fully handled, so the unresolved set
converges.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cli): add comment list --unresolved and comment resolve command

Add an --unresolved filter to `issue comment list` (wired to the server's
unresolved param, rejected when combined with --thread/--recent) and a
top-level `comment resolve <id>` command that POSTs to the existing
/api/comments/{id}/resolve endpoint, letting an agent close threads it has
fully handled.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(comments): since-delta new-comment hint + default-on comment resume

Simplifies the comment-triggered agent flow down to what's actually needed:

- New-comment awareness is now a pure time delta: the claim response carries
  new_comment_count + new_comments_since (anchored on the prior run's
  started_at, never completed_at so a long run can't miss comments). The
  per-turn prompt and CLAUDE.md workflow render one line — "N new comment(s)
  since your last run, --since <ts>" — via a shared BuildNewCommentsHint so the
  two surfaces can't drift. Cold start (no prior run) falls back to a plain read.
- Comment-triggered tasks resume the prior session by default (same runtime),
  dropping the MULTICA_RESUME_COMMENT_SESSION rollout gate. The "Focus on THIS
  comment" prompt guard defends against inheriting the prior turn's "Done."
  marker; GetLastTaskSession still excludes poisoned sessions.
- Drops the resolved-based machinery from the first draft: CountUnresolvedComments
  / ListUnresolvedCommentsForIssue queries, the `comment list --unresolved`
  flag, the `multica comment resolve` command, and the resolve workflow step.
- Removes the verbose cursor-pagination paragraph from the comment prompt; the
  --thread/--recent/--since flags stay in the CLI/API, just no longer explained
  inline every turn.

Compatibility: new claim fields are omitempty (old daemons ignore them).
Comment resume is default-on and affects even old daemons, which already
consume prior_session_id.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(comments): collapse reply parent_id to thread root on write

Comment threads are a 2-level model (root + flat replies, like Linear/Slack),
enforced today only by the UI and the agent path — the CreateComment handler
stored whatever parent_id it was handed, and the agent-side flatten walked just
one level, so a reply-to-a-reply could land at depth 3+. Add GetThreadRoot (a
recursive walk to the parent_id=NULL root) and run both write paths
(handler.CreateComment, service.createAgentComment) through it, so every stored
reply's parent_id IS its thread root. Readers can now treat parent_id as the
thread root without re-walking. The agent-drift guard still compares the raw
parent_id to the trigger comment before normalization.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(comments): cold-start reads triggering thread, warm keeps --thread pointer

The since-delta rework dropped the thread-first read on the COLD path: a
first-time agent fell back to the flat `comment list` dump (oldest-first, cap
2000), burying the trigger's context in ancient chatter. Point cold start at the
triggering conversation instead via a shared BuildColdCommentsHint
(`--thread <trigger> --tail 30` + a --recent pointer for cross-thread
background). On the WARM path, --since is a pure time delta and can miss the
triggering thread's pre-anchor history, so BuildNewCommentsHint now also emits a
--thread pointer. Both surfaces (per-turn prompt + CLAUDE.md workflow) render via
the shared helpers so they cannot drift (PR #2816 rule).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-29 10:38:37 +08:00
Bohan Jiang
270fb6aa73 MUL-2792 fix(agent): preserve skills in update/archive/restore response (#3464)
* MUL-2792 fix(agent): preserve skills in update/archive/restore response (#3459)

agentToResponse always initialises Skills as []; the mutation handlers
relied on the caller to refresh it, but only GetAgent and ListAgents
actually did. UpdateAgent / ArchiveAgent / RestoreAgent therefore
returned "skills": [] regardless of what the agent_skill junction table
contained.

The DB write path was never wrong — skills weren't actually deleted —
but the misleading response (and its matching agent:status / archived /
restored WS broadcast) scared users into manually re-running
`agent skills set` and risked scripted clients writing the empty set
back as truth.

Extract the existing GetAgent skill-reload block into attachAgentSkills
and call it from the three buggy handlers. Add regression tests that
attach skills, hit each mutation endpoint, and assert both the response
and the junction table.

Co-authored-by: multica-agent <github@multica.ai>

* fix(agent): attach skills before env/template broadcasts (#3459)

Two follow-up sites flagged in PR #3464 review that shared the same
"agentToResponse zeroes Skills, callers forget to reload" pattern as the
mutation handlers:

- agent_env.go: the agent:status broadcast after UpdateAgentEnv used a
  bare agentToResponse, so subscribers saw skills wiped on every env
  rotation. HTTP body is AgentEnvResponse so the response itself is
  unaffected, but the WS event still misleads any cache that ingests it.
- agent_template.go: CreateAgentFromTemplate attaches imported and extra
  skills inside the tx, then builds the response/agent:created broadcast
  without reloading them — so callers (and any client tracking the
  create event) see the freshly created agent as skill-less despite the
  template having just imported them.

Both call sites now reuse attachAgentSkills introduced for UpdateAgent.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-28 19:06:57 +08:00
Naiyuan Qing
d90732750f Revert "feat(comments): since-delta new-comment hint + default-on comment ses…" (#3455)
This reverts commit 5e78e5100a.
2026-05-28 17:52:59 +08:00
Naiyuan Qing
5e78e5100a feat(comments): since-delta new-comment hint + default-on comment session resume (#3432)
* feat(db): add unresolved comment count + list filter queries

Add CountUnresolvedComments (excludes the agent's own comments) and
ListUnresolvedCommentsForIssue. Both are additive — existing callers stay
on the unfiltered queries — so old clients are unaffected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(handler): support unresolved-only comment listing

Wire an additive `unresolved` query param into ListComments. Defaults off
so an old CLI that never sends it gets unchanged behavior; only true/1
enable it. Rejects combining unresolved with thread/recent (whole-issue
filter vs navigation models). Includes filter + count query tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(handler): plumb unresolved count + thread root into claim, gate comment resume

Populate trigger_parent_id (thread root of the trigger comment) and
unresolved_count (excludes the agent's own comments) on comment-triggered
claim responses. Both fields are omitempty so old daemons ignore them.

Gate comment-triggered session resume behind MULTICA_RESUME_COMMENT_SESSION
(default off): resumed comment turns can inherit the prior turn's "Done."
final message, so this stays an explicit rollout switch. The runtime-match
and poisoned-session guards still apply regardless of the flag.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(daemon): inject unresolved-comments hint + resolve step into agent brief

Add a shared BuildUnresolvedCommentsHint helper rendered on both the
per-turn prompt and the CLAUDE.md workflow (kept in sync per PR #2816). It
ships only the count and the relevant CLI call — never comment bodies — so
the server stays cheap. Thread case points at --thread <root>; issue case
points at --unresolved. Suppressed when the count is 0.

Also add a workflow step telling the agent to `multica comment resolve
<thread-root>` once a thread is fully handled, so the unresolved set
converges.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cli): add comment list --unresolved and comment resolve command

Add an --unresolved filter to `issue comment list` (wired to the server's
unresolved param, rejected when combined with --thread/--recent) and a
top-level `comment resolve <id>` command that POSTs to the existing
/api/comments/{id}/resolve endpoint, letting an agent close threads it has
fully handled.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(comments): since-delta new-comment hint + default-on comment resume

Simplifies the comment-triggered agent flow down to what's actually needed:

- New-comment awareness is now a pure time delta: the claim response carries
  new_comment_count + new_comments_since (anchored on the prior run's
  started_at, never completed_at so a long run can't miss comments). The
  per-turn prompt and CLAUDE.md workflow render one line — "N new comment(s)
  since your last run, --since <ts>" — via a shared BuildNewCommentsHint so the
  two surfaces can't drift. Cold start (no prior run) falls back to a plain read.
- Comment-triggered tasks resume the prior session by default (same runtime),
  dropping the MULTICA_RESUME_COMMENT_SESSION rollout gate. The "Focus on THIS
  comment" prompt guard defends against inheriting the prior turn's "Done."
  marker; GetLastTaskSession still excludes poisoned sessions.
- Drops the resolved-based machinery from the first draft: CountUnresolvedComments
  / ListUnresolvedCommentsForIssue queries, the `comment list --unresolved`
  flag, the `multica comment resolve` command, and the resolve workflow step.
- Removes the verbose cursor-pagination paragraph from the comment prompt; the
  --thread/--recent/--since flags stay in the CLI/API, just no longer explained
  inline every turn.

Compatibility: new claim fields are omitempty (old daemons ignore them).
Comment resume is default-on and affects even old daemons, which already
consume prior_session_id.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 15:58:42 +08:00
Bohan Jiang
1195255e43 MUL-2771: feat(transcript): server-derived relative work_dir chip (#3428)
* MUL-2771: feat(transcript): server-derived relative work_dir chip

Adds a privacy-safe `relative_work_dir` field to the agent task wire
shape so the transcript dialog can show where a task ran without
leaking the user's home directory. Standard tasks strip the daemon's
workspaces root to `<wsUUID>/<taskShort>/workdir`; local_directory
tasks fall back to the trailing two path segments (`repos/foo`),
which keeps enough context for the user to recognise the directory
without exposing $HOME or the username.

The derivation lives in `taskToResponse` so every endpoint that
serves a task — list, snapshot, claim, rerun, cancel, complete,
fail — fills the field consistently. taskToResponse now also
populates `workspace_id`, which the prior shape declared but never
set. shortTaskID mirrors execenv.shortID; a colocated test pins the
two helpers together so future daemon-side layout changes don't
silently degrade the chip into the local_directory fallback.

Replaces the front-end stripping attempt in PR #3379, which passed
issue_id where workspace_id was required and therefore rendered the
full absolute path on every standard task.

Co-authored-by: multica-agent <github@multica.ai>

* MUL-2771: harden privacy guards on transcript work_dir chip

Address second-round review feedback from PR #3428:

1. Drop the `title={task.work_dir}` tooltip in the transcript dialog.
   The visible chip was safe but native browser tooltips re-rendered the
   absolute `/Users/<name>/...` on hover, leaking into screen shares,
   screenshots, and recordings — defeating the stated goal of the chip.
   The absolute path now never reaches the DOM (no title, aria, or data
   attribute).

2. Replace the "tail two segments" fallback for local_directory paths
   with explicit home-prefix stripping plus a basename-only final
   fallback. The old behaviour leaked the username on shallow paths like
   `/Users/alice/foo`, `/home/alice/project`, and `C:\Users\alice\foo`.
   The new behaviour recognises common per-user home layouts on macOS,
   Linux, and Windows (case-insensitive), strips them down to the
   remainder, and falls back to the basename for any path under an
   unrecognised root — a single segment can never carry the home prefix.

3. Align the Go and TypeScript field comments with the real fallback
   policy so future readers see "strip home / basename" instead of the
   outdated "tail two segments" description.

Tests: expanded `TestRelativeWorkDir` to cover shallow `/Users/...`,
`/home/...`, and `C:\Users\...` paths, the exact-home edge cases,
case-insensitive matching, and the non-home basename-only fallback.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-28 15:53:16 +08:00
Bohan Jiang
17714c3ad1 fix(create-issue): preserve parent_issue_id through Create with agent flow (MUL-2534) (#3083)
* fix(create-issue): preserve parent_issue_id through Create with agent flow (MUL-2534)

When the create-issue modal was opened from the "Add sub issue" entry on
an existing issue and the user switched to "Create with agent", the
parent_issue_id was silently dropped: switchToAgent only forwarded
prompt + actor + project_id, the AgentCreatePanel had no notion of
parent context, and the daemon prompt never instructed the agent to
pass --parent <uuid>. The sub-issue intent was lost and the new issue
landed as a standalone.

This fix threads parent_issue_id through the whole pipeline silently —
no new editable form field, the existing carry channel handles it:

- Frontend: ManualCreatePanel.switchToAgent + AgentCreatePanel.switchToManual
  now carry parent_issue_id (and identifier, for display) so the sub-issue
  intent survives mode flips in either direction. AgentCreatePanel reads
  parent from `data`, forwards to api.quickCreateIssue, and renders a
  read-only "Sub-issue of MUL-XX" chip so the user can see the relationship.
- API: quickCreateIssue accepts optional parent_issue_id.
- Backend: QuickCreateIssueRequest validates parent_issue_id belongs to the
  same workspace (same path as CreateIssue), persists it in
  QuickCreateContext, and the daemon claim handler resolves the parent's
  identifier for prompt context.
- Daemon prompt: when ParentIssueID is set, buildQuickCreatePrompt instructs
  the agent to pass `--parent <uuid>` and treat the modal entry point as
  authoritative.

Tests cover all three hops: switchToAgent carry payload, AgentCreatePanel →
api.quickCreateIssue, and the daemon prompt's --parent injection (with both
identifier-present and UUID-only fallback branches).

Co-authored-by: multica-agent <github@multica.ai>

* test(create-issue): cover quick-create parent trust boundary + identifier fallback (MUL-2534)

Address review on PR #3083:

- Add server-side test for POST /api/issues/quick-create parent_issue_id:
  same-workspace parent threads through QuickCreateContext.ParentIssueID,
  foreign-workspace and bogus UUIDs return 400 and never enqueue a task.
- Fall back to `data.parent_issue_identifier` in ManualCreatePanel's
  switchToAgent when the parent detail query hasn't hydrated yet, so the
  agent chip never renders "Sub-issue of " with an empty tail.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-27 14:18:48 +08:00
Multica Eve
744b474199 revert(agent): remove per-agent local skill toggle (MUL-2603) (#3286)
* Revert "feat(agents): hide skills_local toggle for runtimes that don't honour it (MUL-2603) (#3276)"

This reverts commit 0b50c5a209.

Co-authored-by: multica-agent <github@multica.ai>

* Revert "fix(agent): surface host OAuth token via env var on macOS isolation (MUL-2603) (#3267)"

This reverts commit a67bf81225.

Co-authored-by: multica-agent <github@multica.ai>

* Revert "fix(agents): tighten skills-tab intro and drop redundant import hint (#3265)"

This reverts commit d8075a5775.

Co-authored-by: multica-agent <github@multica.ai>

* Revert "fix(agent): mirror $HOME/.claude.json into isolated config dir (MUL-2661) (#3261)"

This reverts commit 40da88fc16.

Co-authored-by: multica-agent <github@multica.ai>

* Revert "feat(agent): per-agent toggle to isolate host-machine skills (MUL-2603) (#3200)"

This reverts commit 960befa56f.

Co-authored-by: multica-agent <github@multica.ai>

* Add migration cleanup for reverted agent skills toggle

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-26 17:00:01 +08:00
Bohan Jiang
960befa56f feat(agent): per-agent toggle to isolate host-machine skills (MUL-2603) (#3200)
* feat(agent): per-agent toggle to isolate host-machine skills (MUL-2603)

Adds an agent-scoped `skills_local` switch ("ignore" default / "merge") so
shared agents stop inheriting the operator's user-global Claude skill
directory. A single broken local skill on one operator's machine was
crashing the Claude CLI before it ever read stdin — the daemon saw a
"broken pipe" with no recoverable signal (GitHub #3052).

- DB: migration 108 adds `agent.skills_local` (NOT NULL DEFAULT 'ignore'),
  with sqlc CreateAgent/UpdateAgent updates and handler validation.
- Claude runtime: when the agent is in "ignore" mode the backend points
  CLAUDE_CONFIG_DIR at an empty per-task scratch dir under the task cwd
  (fallback: OS temp), strips any inherited override, and cleans up after
  the run. Workspace skills under `{cwd}/.claude/skills/` still load.
  "merge" preserves the legacy inherit-from-machine behavior; Codex and
  other isolated backends are no-ops.
- UI: new Skills toggle in the Create Agent dialog and the Agent → Skills
  tab, with EN/zh-Hans copy and SkillsLocalToggle shared between the two.
- Tests: unit coverage for the new env helper, isolation dir lifecycle,
  full Claude execute paths (ignore + merge), and the handler tristate
  contract. Existing skills-tab test updated for the new copy.
- Docs: updated `/skills` docs (EN + ZH) and added a 0.3.7 changelog entry
  in the landing-page i18n.

Co-authored-by: multica-agent <github@multica.ai>

* fix(agent): preserve claude login + validate skills_local input (MUL-2603)

Address Elon's review on PR #3200:

1. Skill isolation no longer drops the operator's Claude login. The
   per-task scratch dir now mirrors every entry under `~/.claude/`
   as symlinks except `skills/`, so `.credentials.json`, settings,
   plugins, etc. reach the CLI exactly as on the host while the
   user-global skills directory stays hidden. Without this, default
   `ignore` would have broken every Claude agent on a non-API-key
   host the moment migration 108 landed.

2. Internal CreateAgent callers (agent_template, onboarding_shim)
   now set `SkillsLocal: "ignore"`. The Go zero value was about to
   trip the migration-108 CHECK constraint and 500 template /
   onboarding agent creation.

3. Create / update handler validation no longer normalizes garbage
   to "ignore". The strict 400 path is now reachable on bad client
   input; the drift-safe `normalizeSkillsLocal` stays on the read
   side only.

UI copy + docs clarified that the toggle is Claude-only; other
runtimes ignore the setting.

Verification:
- `go test ./...` green (full suite locally).
- `pnpm --filter @multica/views exec vitest run agents/components/tabs/skills-tab.test.tsx` green.
- Handler DB-backed tests still skip locally without docker (same
  as Elon's run) — CI will validate the create / update paths
  against migration 108.

Co-authored-by: multica-agent <github@multica.ai>

* fix(agent): mirror effective claude config dir with windows fallback (MUL-2603)

Address Elon's second-round review on PR #3200:

1. The per-task scratch dir now mirrors the *effective* host Claude
   config dir, not unconditionally `~/.claude/`. Precedence: agent
   `custom_env` CLAUDE_CONFIG_DIR > parent process env > `~/.claude/`.
   Without this, an operator who pinned Claude at a managed install
   (custom env CLAUDE_CONFIG_DIR) would get the wrong credentials in
   the scratch dir, because `buildClaudeEnv` strips that env before
   handing it to the child. We resolve the source up front and feed
   it to the mirror, so the override env still points at the right
   bytes.

2. Mirror entries now go through platform-aware linkers. On Windows
   without Developer Mode / admin, `os.Symlink` is denied, which
   previously left the scratch dir empty and broke Claude Code auth
   on default `ignore`. The new helpers try symlink first, then fall
   back to a directory junction (`mklink /J`) for dirs or a hardlink
   (same-volume content share) / copy for files. Mirrors the
   execenv/codex_home_link_windows.go pattern.

3. Tests:
   - `TestResolveHostClaudeConfigDir` locks in the custom_env >
     parent_env > `~/.claude` precedence.
   - `TestNewIsolatedClaudeConfigDirMirrorsCustomHostDir` confirms
     the scratch dir picks up `.credentials.json` from a synthetic
     custom host dir, proving the source resolution actually
     propagates into the mirror.
   - `TestNewIsolatedClaudeConfigDirEmptyHostIsNoop` documents the
     env-var-auth-only case (no host source ⇒ empty scratch dir).
   - `TestMirrorHostClaudeExceptSkillsWith_FallbackWhenSymlinkFails`
     exercises the Windows-no-Developer-Mode path via the new
     `mirrorHostClaudeExceptSkillsWith` seam, asserting credentials
     and sub-dir children still reach the scratch dir after the
     symlink stand-in fails.
   - `TestMirrorHostClaudeExceptSkillsWith_PropagatesFirstLinkError`
     confirms callers see the per-entry error when even fallback
     fails (so the warn-log fires on broken Windows installs).
   - `TestCopyFileRoundTrip` covers the last-resort copy fallback
     and its EXCL no-overwrite contract.
   - `TestClaudeExecuteIsolatesUsesCustomEnvSource` is the
     end-to-end check: an agent with custom_env CLAUDE_CONFIG_DIR
     reads its credentials from the pinned dir, not `~/.claude/`.

4. Docs: `apps/docs/content/docs/skills.{mdx,zh.mdx}` updated to
   describe the effective-source resolution and the Windows
   fallback chain so the docs match the runtime behaviour.

Verification:
- `go test ./...` green (full server suite locally, including
  `pkg/agent` 23 cases covering the new + existing isolation
  paths).
- `GOOS=windows GOARCH=amd64 go vet ./pkg/agent/...` and
  `go test -c -o /dev/null` both compile clean, confirming the
  Windows-tagged linker file builds.

Co-authored-by: multica-agent <github@multica.ai>

* fix(agent): default skills_local to merge to preserve legacy behavior (MUL-2603)

Per Bohan's product decision on PR #3200, the per-agent host-skill toggle
defaults to "merge" — the pre-MUL-2603 inherit-from-machine behavior —
so existing personal workflows that rely on locally installed Claude
Skills keep working unchanged. Agent owners explicitly opt into "ignore"
when they need to harden a shared agent against a broken local skill on
one operator's machine (GitHub #3052).

Also audited all 11 runtimes for user-global skill discovery paths and
documented the scope of the toggle. Only Claude reads a user-global
`~/.claude/skills/`; Codex isolates via `CODEX_HOME`, the ACP backends
(Hermes / Kimi / Kiro) and the JSON-stream backends (Copilot / Cursor /
Gemini / Pi / OpenCode / OpenClaw) anchor discovery to the task workdir
and never read a user-global skill directory. UI copy and docs now say
"for runtimes that support it (currently Claude Code)" everywhere so
the scope is explicit.

Changes:

- Migration 108: column default flipped to 'merge'.
- Handler CreateAgent: missing field → "merge"; explicit "ignore" /
  "merge" still validated, garbage still 400.
- normalizeSkillsLocal: drift-safe coercion now lands on "merge" for
  anything that isn't the exact literal "ignore".
- agent_template.go / onboarding_shim.go: internal CreateAgent callers
  send "merge" instead of "ignore" to match the new default.
- Claude runtime (`claude.go`): isolate-mode gate flipped from
  `SkillsLocal != "merge"` to `SkillsLocal == "ignore"`, so "" (legacy
  daemons / older clients) and "merge" both walk `~/.claude/` directly.
- Create Agent dialog + Skills tab: toggle defaults to on (merge); only
  duplicate of an explicit "ignore" agent carries through. The
  isolation opt-in is now `skills_local: "ignore"` when the user flips
  off; "merge" is omitted from the request body.
- i18n (EN + zh-Hans): copy reframed — "On (default) — merged"; "Off —
  ignored. Recommended for shared agents".
- Docs (`/skills`, `/guides/agents.zh`): describe new default and
  enumerate which runtimes act on the toggle.
- Landing changelog 0.3.7: retitled "Per-Agent Local-Skill Toggle"; note
  the on-by-default behavior + off-to-isolate framing.
- Tests:
  - `TestClaudeExecuteIsolatesHostSkillsWhenIgnoreOptedIn` replaces the
    old by-default isolation case (now requires explicit "ignore").
  - New `TestClaudeExecuteDefaultModeKeepsHostConfigDir` locks in that
    default ExecOptions preserve the host CLAUDE_CONFIG_DIR.
  - `TestClaudeExecuteIsolatesUsesCustomEnvSource` now explicitly opts
    into "ignore" mode.
  - Handler tests: omitted → "merge"; explicit "ignore" round-trips;
    preserve-existing test seeds "ignore" and asserts "merge" flip-back.
  - `TestNormalizeSkillsLocal_DriftStaysSafe`: only literal "ignore"
    maps to ignore; everything else → "merge".
  - `skills-tab.test.tsx`: toggle ON by default; flip OFF when agent
    opted into "ignore". Intro-text matcher anchored to a more specific
    phrase so it no longer collides with the toggle hint copy.

Verification:
- `go test ./...` green (full server suite locally).
- `GOOS=windows GOARCH=amd64 go vet ./pkg/agent/...` and
  `go test -c -o /dev/null` both compile clean (windows-tagged linker
  file still builds).
- `pnpm typecheck` green across all packages and apps.
- `pnpm --filter @multica/views test` 88 files / 771 tests green.
- `pnpm --filter @multica/core test` 43 files / 390 tests green.
- Handler DB-backed tests still skip locally without docker; CI will
  validate the create / update paths against migration 108.

Co-authored-by: multica-agent <github@multica.ai>

* chore(landing): drop 0.3.7 changelog entry from this PR (MUL-2603)

The landing-page release notes belong in a separate release-prep PR, not in the feature PR.

Co-authored-by: multica-agent <github@multica.ai>

* fix(agent): propagate skills_local=ignore to codex user-skill seed (MUL-2603)

Make the per-agent skills_local toggle real for Codex too, not just Claude.
Previously the toggle was only consumed by the Claude backend, while the
daemon's execenv layer always seeded Codex's per-task CODEX_HOME with the
host machine's user-installed skills from ~/.codex/skills/. A shared Codex
agent with skills_local=ignore could still inherit a broken local skill
from one operator's machine.

Now: PrepareParams/ReuseParams carry SkillsLocal; hydrateCodexSkills
skips seedUserCodexSkills when SkillsLocal == "ignore" so the per-task
CODEX_HOME exposes only workspace skills to the codex CLI. Default
("merge", or empty from older servers/clients) preserves existing
inherit-from-machine behavior. UI / docs are updated to reflect the
contract honestly: Claude Code and Codex honor the toggle; other
runtimes (Hermes / Kimi / Kiro / Copilot / Cursor / Gemini / Pi /
OpenCode / OpenClaw) leave $HOME untouched and discover user-level
skills natively, so the toggle is a no-op for them today.

New tests: TestPrepareCodexSkillsLocalIgnoreSkipsUserSeed,
TestPrepareCodexSkillsLocalMergeSeedsUserSkills, and
TestReuseCodexSkillsLocalIgnoreSkipsUserSeed cover Prepare(ignore),
Prepare(merge), and the toggle-flip-on-reuse path.

Co-authored-by: multica-agent <github@multica.ai>

* docs(skills): scope skills_local toggle copy to Claude Code + Codex (MUL-2603)

Off-state hint and Skills tab intro now explicitly call out Claude Code +
Codex as the only runtimes that honor the toggle, with "other runtimes
ignore this setting" wired into both states (en + zh-Hans), so users on
non-Claude/Codex agents don't read "Off" as runtime-wide isolation.

Docs (skills.mdx, skills.zh.mdx, guides/agents.zh.mdx) stop describing
Hermes / Kimi / Gemini / Copilot / Cursor / Pi / OpenCode / OpenClaw / Kiro
as having native user-level skill discovery; the daemon simply does not
manage user-level skill discovery for those runtimes today, and the toggle
is a no-op regardless of where it is set.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-26 13:26:33 +08:00
Bohan Jiang
13f74e651a feat(agents): remove custom_env from agent resources, add audited env endpoint (MUL-2600) (#3209)
* feat(agents): remove custom_env from agent resources, add audited env endpoint (MUL-2600)

The agent resource shape (list / get / create / update / archive /
restore responses + WebSocket events) no longer carries `custom_env`
values. Reads/writes of env now flow exclusively through a dedicated
`/api/agents/{id}/env` endpoint that is owner/admin-only, rejects
agent-actor sessions, applies a "****" sentinel preserve guard on
PUT, and writes a persistent audit row per reveal/update.

Why
- `multica agent list --output json` historically returned plaintext
  `custom_env` for owner/admin callers (the redaction gate gave only
  members the masked map). Any agent token running on the workspace
  inherits its owner's role and could read every other agent's
  secrets just by listing.
- Patching list/get redaction alone (PR #3175 direction) left
  symmetric leaks via mutation responses, WS events, the "reveal"
  path itself (no actor-aware auth), and a `****` overwrite footgun
  on UpdateAgent.

What changed
- Backend: drop `custom_env` from AgentResponse; add coarse
  `has_custom_env` + `custom_env_key_count`. Strip env handling from
  UpdateAgent (silently ignored if sent). Keep CreateAgent's
  custom_env acceptance.
- Backend: new GET/PUT `/api/agents/{id}/env` handlers in
  `internal/handler/agent_env.go`:
  - resolveActor → 403 for agent actors (closes the lateral-movement
    path).
  - Owner/admin role gate via existing helper.
  - PUT honours value == "****" as "preserve existing value".
  - Both write to `activity_log` with `agent_env_revealed` /
    `agent_env_updated` actions. Audit details record key names only,
    never values.
- Daemon claim path (`ClaimAgentTask`) unchanged — `TaskAgentData`
  still carries plaintext env for runtime injection.
- SQL: new `UpdateAgentCustomEnv` query; sqlc regenerated (v1.31.1).
- CLI: new `multica agent env get|set` subcommands. `--custom-env*`
  flags removed from `multica agent update`; the no-fields error
  now points to the new path.
- Frontend: drop env fields from `Agent` + `UpdateAgentRequest`; add
  `getAgentEnv` / `updateAgentEnv` client methods; rewrite env-tab
  to show "N variables configured" + explicit "Reveal & edit"
  button, fetching values only on intentional reveal.
- Locales: parity-safe additions to en + zh-Hans.
- Docs: agents-create.{mdx,zh.mdx} reflect the new threat model and
  endpoint.
- Mobile: schema drops `custom_env` / `custom_env_redacted`, adds
  metadata fields.

Tests
- Handler tests pinned the new invariants: no env in list/get
  responses, owner reveal happy-path + audit row, agent-actor 403,
  `****` sentinel preserves real values, UpdateAgent silently
  ignores `custom_env`, pure `mergeAgentEnv` cases.
- CLI tests pivot to the new flag surface: `agent update` MUST NOT
  expose the env flags; `agent env set` MUST expose
  --custom-env-stdin/--custom-env-file.
- Frontend test fixtures updated; pnpm typecheck / test / lint
  pass cleanly.

This is a breaking API change. Scripts that read `custom_env` from
`/api/agents` must migrate to `GET /api/agents/{id}/env`.

Co-authored-by: multica-agent <github@multica.ai>

* fix(agents): close actor-spoofing + audit fail-closed in env endpoints (MUL-2600)

Addresses Elon's review of #3209:

* Mint a task-scoped `mat_` token per claim, bound to (agent, task,
  workspace, owner). Daemon injects it into the agent process in place
  of its own credential. Auth middleware authoritatively rebuilds
  X-User-ID / X-Agent-ID / X-Task-ID from the token row and sets
  X-Actor-Source=task_token; that header is server-set only — incoming
  values are stripped before any auth branch runs. resolveActor honors
  the header so an agent that strips X-Agent-ID / X-Task-ID still
  resolves as actor=agent.
* GetAgentEnv / UpdateAgentEnv are now fail-closed on audit-log
  failures: GET refuses to return plaintext, PUT persists inside the
  same tx as the audit row so they commit/roll back together.
* PUT /api/agents/{id} returns 400 when the body carries custom_env
  instead of silently dropping it — directs callers to the audited env
  endpoint.
* Agent actors never see mcp_config, even when the underlying member
  is owner/admin; mutation broadcasts go through a redaction shim so
  WS subscribers don't pick it up either.
* Fix backend test that asserted dense JSON (jsonb::text renders
  whitespace) and frontend test that assumed a unique "Test User"
  match.

Co-authored-by: multica-agent <github@multica.ai>

* fix(agents): close residual MUL-2600 gaps from review (MUL-2600)

Migration 108 FK now correctly references agent_task_queue(id) instead
of the non-existent agent_task table; the previous name blocked CI
backend migrations.

Task-token-authenticated requests can no longer be re-routed at a
different workspace by passing workspace_slug / workspace_id /
?workspace_id / a URL workspace param. ResolveWorkspaceIDFromRequest
and resolveWorkspaceUUID both short-circuit on X-Actor-Source=task_token
and return only the token-bound X-Workspace-ID; buildMiddleware adds a
defence-in-depth 403 if any URL-resolved workspace disagrees with the
token binding.

mcp_config no longer leaks back to agent actors through UpdateAgent /
CreateAgent / ArchiveAgent / RestoreAgent HTTP responses — the same
redactAgentResponseForActor helper that GetAgent/ListAgents use is now
applied to mutation responses too. WS broadcasts were already redacted
via broadcastAgentResponse.

FailTask and every TaskService cancel path (CancelTask /
CancelTasksForIssue / CancelTasksForAgent / CancelTasksByTriggerComment
/ BroadcastCancelledTasks) now eagerly DeleteTaskTokensByTask so the
mat_ token's 24h window doesn't outlive a terminated task. Failure is
non-fatal — the FK cascade and expiry remain durable guards.

Doc-only: clarify that PUT /api/agents/{id} now hard-rejects bodies
that carry custom_env (was previously "silently ignores").

Tests:
- middleware: TestResolveWorkspaceIDFromRequest gains a task_token
  case asserting client-supplied slug/id/query cannot override the
  bound workspace.
- handler: TestUpdateAgent_RedactsMcpConfigForAgentActor and
  TestUpdateAgent_KeepsMcpConfigForMemberActor pin the mutation-
  response redaction contract per actor type.

Co-authored-by: multica-agent <github@multica.ai>

* fix(agents): match redacted mcp_config as JSON null, not Go nil (MUL-2600)

`AgentResponse.McpConfig` is `json.RawMessage` without `omitempty`, so
the redacted response serialises as `"mcp_config": null`. On decode,
`json.RawMessage` keeps the literal bytes `null` rather than collapsing
to Go nil, which made the assertion fire on a non-leak.

The product contract (field always present, distinguished from "no
config" via `mcp_config_redacted`) is intentional, so adjust the test
to check for "no secret-bearing content" instead of weakening the
contract via `omitempty`.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-25 18:42:48 +08:00
LinYushen
8e9df90d32 feat: include repo description in agent brief (#3203)
Add Description field to RepoData structs so that workspace repo
descriptions (set via the settings UI) are preserved through
normalization and rendered in the agent brief as:
  - <url> — <description>

When no description is set, the existing format is unchanged.

Closes MUL-2610

Co-authored-by: multica-agent <github@multica.ai>
2026-05-25 15:16:22 +08:00
Bohan Jiang
a55c03a0b3 fix(agent): inject Workspace Context into agent brief (MUL-2542) (#3078)
* fix(agent): inject Workspace Context into agent brief (MUL-2542)

The per-workspace `workspace.context` field (Settings → General) was
stored in the DB but never reached the agent prompt. Plumb it from the
workspace row through the claim response, the daemon's Task struct and
TaskContextForEnv, and render it as `## Workspace Context` in the meta
brief above `## Available Commands`. Heading is skipped when the field
is empty so workspaces that haven't set a context don't see a bare
header. Applies to every task kind — issue, comment, chat, autopilot,
quick-create — so the shared system prompt is consistent regardless of
trigger source.

Co-authored-by: multica-agent <github@multica.ai>

* chore(server): gofmt files touched by workspace-context injection

Run gofmt on the files that buildWorkspaceContext injection touched.
Cleans up composite-literal alignment in execenv task context and
struct-tag alignment in Task / AgentTaskResponse / RegisterRequest.
No behavior change.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
Co-authored-by: J <agent-j@multica.ai>
2026-05-22 17:23:27 +08:00
Kagura
eefc6cebaa feat(server): add workspace-level always_redact_env setting (MUL-2495) (#2367)
* feat(server): add workspace-level always_redact_env setting

When a workspace opts into always_redact_env (via workspace settings JSON),
all agent GET/LIST responses will have custom_env values masked and
mcp_config nulled regardless of the caller's role. This provides a stricter
security posture for single-tenant self-hosts or environments where
screen-sharing or pairing makes plaintext secrets a risk.

The setting is opt-in and defaults to false (preserving existing behavior).
Owners can still write secrets via the update path; they just cannot read
them back through the API when this setting is enabled.

Closes #2352

* fix(server): fail-closed on GetWorkspace, add HTTP tests, distinguish redaction reason

Address review feedback on #2367:

1. GetWorkspace failure now returns 500 instead of silently defaulting
   to alwaysRedact=false (fail-open → fail-closed).

2. Add HTTP-level regression tests for always_redact_env:
   - GetAgent with flag on → owner sees redacted env
   - ListAgents with flag on → owner sees redacted env
   - GetAgent with default settings → owner sees plaintext env

3. Add custom_env_redacted_reason field ('policy' | 'role') to
   distinguish workspace-policy redaction from role-based redaction.
   UI now only sets readOnly when reason is 'role', allowing owners
   to edit env even when always_redact_env is enabled.

4. Write-back footgun tracked in #2999.

Signed-off-by: kagura-agent <kagura.agent.ai@gmail.com>

* fix(test): clear workspace settings before DefaultNoRedactForOwner

Guard against test-order leakage: if a preceding test enabled
always_redact_env on the shared workspace and its cleanup didn't
run (e.g. due to -shuffle or parallel execution), this test would
incorrectly see policy-level redaction. Explicitly reset settings
to NULL before assertions.

Signed-off-by: kagura-agent <kagura.agent.ai@gmail.com>

* fix(ui): make EnvTab read-only when env is redacted by any policy

Previously the readOnly guard only checked for 'role' redaction,
leaving the tab editable under 'policy' redaction. This meant
a user could save the form with '****' placeholder values,
permanently overwriting the actual secrets.

Use the boolean custom_env_redacted flag instead so the tab is
locked regardless of the redaction reason.

Fixes the regression flagged in the third-pass review.

Signed-off-by: kagura-agent <kagura.agent.ai@gmail.com>

* fix: reset workspace settings to empty JSON instead of NULL

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style: gofmt AgentResponse struct alignment

Signed-off-by: kagura-agent <kagura.agent.ai@gmail.com>

---------

Signed-off-by: kagura-agent <kagura.agent.ai@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-22 14:10:09 +08:00
Bohan Jiang
2bec2221d2 feat(agent): per-agent thinking_level for claude + codex (MUL-2339) (#2865)
* feat(agent): persist thinking_level per agent (MUL-2339)

Adds a nullable `thinking_level` column to the `agent` table so the
backend can route a runtime-native reasoning/effort token (e.g. Claude's
`xhigh`, Codex's `minimal`) through to the agent CLI on every dispatch.

The column is intentionally TEXT rather than an enum — Claude and Codex
publish overlapping but distinct vocabularies and we want the persisted
value to round-trip exactly through whichever CLI receives it. NULL is
the "use runtime default" sentinel that every downstream consumer reads
as "do not inject --effort / reasoning_effort".

This commit is just the storage layer (migration + sqlc); subsequent
commits wire it through the API, daemon, and agent backends.

Co-authored-by: multica-agent <github@multica.ai>

* feat(agent-backend): inject reasoning effort for claude + codex (MUL-2339)

Extends ExecOptions with a runtime-native ThinkingLevel string and wires
it into the Claude and Codex backends. Discovery is driven by the local
CLI so the daemon advertises whatever the host install supports rather
than a hand-maintained list that goes stale.

Per Elon's PR1 review:
- Claude: parses `claude --help` to learn the `--effort` superset and
  projects through a per-model allow-list (xhigh is Opus-only; max is
  session-only on the smaller models). Falls back to a conservative
  static list when the binary is missing or help drift hides the line.
- Codex: drives `codex debug models --output json` so per-model
  reasoning subsets and the documented default come directly from the
  CLI. The older config-error probe trick is gone — the JSON path is
  stable and doesn't pollute stderr with an intentional misconfig.
- Cache key includes (provider, executablePath, cliVersion) so a CLI
  upgrade invalidates entries that referenced the older help / catalog.

Per Trump's PR1 constraint, all three Codex injection points
(thread/start.config, thread/resume.config, turn/start.effort) flow
through one helper (`applyCodexReasoningEffort`) so they cannot drift
independently. The shared `codexReasoningCases` fixture in
`thinking_test.go` asserts the same value→{shape, key} contract at
each site for every level the runtimes know about.

Claude's `--effort` is also added to `claudeBlockedArgs` so a user
custom_args entry can't silently outvote the daemon-injected value.

Co-authored-by: multica-agent <github@multica.ai>

* feat(api): wire thinking_level through API + daemon contract (MUL-2339)

End-to-end plumbing for the per-agent reasoning/effort setting:

- AgentResponse / TaskAgentData now carry `thinking_level`; the daemon's
  claim response includes it and the daemon's executor passes it through
  to agent.ExecOptions, where the Claude and Codex backends already know
  what to do with it.
- ModelEntry on the runtime-models wire format gains a `thinking` block
  carrying `supported_levels` + `default_level` per model so the UI can
  render a runtime-aware picker without the server having to know about
  the local CLI install. `handleModelList` projects the agent-package
  catalog (including the new Thinking field) into the wire shape.
- CreateAgent / UpdateAgent gate the field with a synchronous provider
  enum check (claude / codex only today). UpdateAgent is tri-state:
  field omitted = no change, "" = explicit clear (new
  `ClearAgentThinkingLevel` query, mirrors the existing mcp_config null
  pattern), non-empty = validate then set.

Per Trump's PR1 review, the API NEVER auto-clears on a runtime/model
swap and ALWAYS returns 400 on an unknown literal value — same shape
across CreateAgent, UpdateAgent, and combined patches that move
runtime + level in one request. Per-model combination failures (e.g.
`xhigh` against a model that only supports up to `high`) surface as a
daemon-side task error, not a silent server-side rewrite.

TS types follow the same shape: `Agent.thinking_level`,
`CreateAgentRequest`/`UpdateAgentRequest` add the field, `RuntimeModel`
grows a `thinking` block. Older backends omit the field, which the
front-end treats as "no picker for this model" — installed desktop
builds keep working.

Co-authored-by: multica-agent <github@multica.ai>

* fix(agent): correct codex debug models argv + pin via runner test (MUL-2339)

`codex debug models --output json` is rejected by codex-cli 0.131.0 —
the subcommand emits JSON on stdout by default and has no `--output`
flag. Drop the flag and add `--bundled` to skip the network refresh
discovery doesn't need. Move the argv to a package-level var and add
a test that runs a fake `codex` to assert the binary actually
receives exactly `debug models --bundled`, so the contract can't
silently drift on the next refactor.

Also teach ValidateThinkingLevel to resolve an empty model to the
provider's default model entry. Without this, every default-model
task with a persisted thinking_level would be misjudged "unknown
model" by the daemon guard.

Co-authored-by: multica-agent <github@multica.ai>

* fix(api): reject runtime switch that would leave invalid thinking_level (MUL-2339)

A PATCH that changed `runtime_id` without touching `thinking_level`
used to silently keep the existing value, so a Claude agent storing
`max` could land on a Codex runtime where `max` is not a recognised
token at all, and the daemon would receive a literal-invalid level.

Hold the same "always 400 on literal-invalid, never silent coerce"
rule on this implicit path. When runtime_id changes and the existing
value is not in the new provider's enum, return 400 with the
recovery options (clear via `thinking_level=""` or re-set in the
same PATCH).

Add coverage for both the kept-when-still-valid and the rejected
cases, plus the two recovery paths (clear and replace).

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): guard runTask with per-model thinking_level validator (MUL-2339)

ValidateThinkingLevel existed but had no call site — `task.Agent.
ThinkingLevel` flowed straight into ExecOptions, so `xhigh` configured
on a non-Opus Claude model, or API-side stale values that escaped the
provider enum gate, would be injected anyway.

Run the validator before building ExecOptions. Invalid combinations
log a warning and drop the level instead of failing the task: the
agent still runs, just at the runtime's default reasoning effort.
Discovery errors fail open (keep the level, let the CLI surface any
objection) so a transient `claude --help` failure can't strand work.

Empty model is forwarded as-is; the validator resolves it to the
provider's default model internally per the cross-package contract.

Co-authored-by: multica-agent <github@multica.ai>

* chore(agent): drop stale `--output json` comments + unused scanner (MUL-2339)

Codex CLI's `debug models` subcommand emits JSON without an `--output`
flag, and `parseCodexDebugModels` never read from the bufio.Scanner.
Sync the comments with the actual invocation and remove the dead init.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-20 12:30:10 +08:00
Jiayuan Zhang
2ad1cd8ff8 feat(profile): user profile description injected into agent brief (MUL-2406)
## Summary

Adds per-user `profile_description` so coding agents have cheap, durable context about who is asking. v1 per the brief Xeon locked in on [MUL-2406](mention://issue/63a7247c-4f6a-42cf-90d1-7c746e77158a):

- **DB** — `user.profile_description TEXT NOT NULL DEFAULT ''` (migration 096). 2000-rune cap enforced server-side. No nullable / privacy state to manage.
- **API** — `PATCH /api/me` accepts the field; `UserResponse` always emits it. Client wraps `updateMe` in a lenient `UserSchema` + `EMPTY_USER` fallback per CLAUDE.md API Response Compatibility.
- **UI** — Settings → Account gains an "About you" textarea with live `n/2000` counter, `maxLength` guard, and a localized too-long error (EN + zh-Hans).
- **CLI** — `multica user profile get` / `multica user profile update` with `--description / --description-stdin / --description-file / --clear`, mirroring the existing `issue comment add` input-mode menu.
- **Daemon injection** — claim handler resolves the runtime owner and stamps `requesting_user_name` + `requesting_user_profile_description` on the task. `buildMetaSkillContent` emits `## Requesting User` between `## Agent Identity` and `## Available Commands`, blockquoted and framed as background context. The block is omitted entirely when the description is empty (no token cost when unused).

Brief is written **once per task** via `CLAUDE.md` / `AGENTS.md`, not the per-turn prompt — same path the agent already reads for identity, so no extra per-turn cost.

## Test plan

- [x] `go build ./...`, `go vet ./...`, `go test ./internal/cli/ ./internal/daemon/ ./internal/daemon/execenv/ ./cmd/multica/`
- [x] New brief tests: `TestBuildMetaSkillContentEmitsRequestingUser`, `TestBuildMetaSkillContentOmitsRequestingUserWhenEmpty`
- [x] `pnpm typecheck`, `pnpm lint`, `pnpm test` (74 files, 644 tests pass)
- [ ] Handler DB tests (`TestUpdateMe*`) require a migrated test DB — not runnable in this sandbox
- [ ] Manual: open Settings → Account, set a description, confirm the next daemon-run agent's `CLAUDE.md` shows `## Requesting User`
2026-05-19 19:51:28 +02:00
Bohan Jiang
fe1ccb19c9 Revert "MUL-2324 conditionally inject non-core rule blocks (#2771)" (#2802)
This reverts commit e8fb0efe3d.
2026-05-18 17:48:44 +08:00
Multica Eve
e8fb0efe3d MUL-2324 conditionally inject non-core rule blocks (#2771)
* feat(runtime): conditionally inject non-core rule blocks

Co-authored-by: multica-agent <github@multica.ai>

* fix(runtime): tighten mention rule triggers

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-18 12:52:54 +08:00
LinYushen
b7a58c06ac Revert "feat(task): wire claim lease into TaskService and sweeper (MUL-2246) …" (#2673)
This reverts commit bb32be0e50.
2026-05-15 16:06:58 +08:00
LinYushen
bb32be0e50 feat(task): wire claim lease into TaskService and sweeper (MUL-2246) (#2662)
* feat(task): wire claim lease queries into TaskService and sweeper (MUL-2246)

- ClaimTask now uses ClaimAgentTaskWithLease (generates claim_token + lease)
- StartTask accepts optional claim_token for token-verified start
- AgentTaskResponse includes claim_token for daemon to use
- Daemon client sends claim_token in StartTask body
- Sweeper calls RequeueExpiredClaimLeases each tick
- Legacy daemons without claim_token still work (graceful fallback)

Co-authored-by: multica-agent <github@multica.ai>

* fix(task): address PR #2662 review blockers (MUL-2246)

1. ClaimAgentTaskForRuntime: push runtime_id into atomic SQL WHERE clause
   so runtime A cannot claim tasks queued for runtime B under the same agent.

2. Legacy StartAgentTask: add claim_token IS NULL guard so leased rows
   cannot be started without token verification. Handler rejects malformed
   tokens with 400 instead of silently degrading to legacy path.

3. StartAgentTaskWithClaimToken: validate claim_expires_at >= now(),
   preserve claim_token until terminal state (only clear claim_expires_at),
   use CTE + UNION ALL for idempotent retry when daemon resends after a
   lost StartTask response. Return 409 Conflict on token mismatch/expiry.

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): StartTask 409 handling, transport retry, claim_token on FailTask (MUL-2246)

- StartTask 409 (claim superseded): release slot, don't call FailTask
- StartTask transport timeout/5xx: retry once with same token, then
  check task status before failing
- FailTask now sends claim_token; server-side FailAgentTask SQL adds
  AND (claim_token IS NULL OR claim_token = @claim_token) guard so
  stale daemons cannot fail tasks that have been re-claimed

Co-authored-by: multica-agent <github@multica.ai>

* fix(task): close FailTask token bypass and RequeueExpiredClaimLeases liveness gap (MUL-2246)

Blocker 1 - FailTask token validation:
- SQL: change (param IS NULL OR claim_token = param) to
  (param IS NULL AND claim_token IS NULL) OR claim_token = param
  so tokenless requests can only fail legacy (tokenless) rows.
- task.go: malformed claim_token now returns ErrInvalidClaimToken (400)
  instead of being silently dropped to NULL.
- Handler: maps ErrInvalidClaimToken→400, ErrClaimTokenInvalid→409.
- Service: when UPDATE returns no rows but task is still active,
  return ErrClaimTokenInvalid (token mismatch) instead of silent success.

Blocker 2 - RequeueExpiredClaimLeases runtime liveness:
- SQL: JOIN agent_runtime, only requeue tasks where runtime is 'online'.
  Dead/offline runtime tasks stay dispatched for FailTasksForOfflineRuntimes.
- FOR UPDATE → FOR UPDATE OF atq (required with JOIN).

Regression tests:
- task_claim_token_test.go: malformed, tokenless-on-tokened, wrong-token
- requeue_lease_test.go: SQL must JOIN agent_runtime with online filter

Co-authored-by: multica-agent <github@multica.ai>

* fix(task): move expired lease requeue to ClaimTaskForRuntime preflight, add heartbeat freshness backstop (MUL-2246)

- Add RequeueExpiredClaimLeasesForRuntime: per-runtime preflight self-requeue
  in ClaimTaskForRuntime. Runtime proves liveness by actively claiming, so no
  heartbeat check needed.
- Update global RequeueExpiredClaimLeases to require ar.last_seen_at freshness
  (stale_threshold_secs param). Prevents requeuing to a dead runtime in the
  90s gap between lease expiry (60s) and offline detection (150s).
- Add regression tests verifying the heartbeat freshness check and that the
  preflight query does not join agent_runtime.

Co-authored-by: multica-agent <github@multica.ai>

* fix(task): use LivenessStore for global requeue, move preflight before empty-cache (MUL-2246)

Blocker 1: Global RequeueExpiredClaimLeases now uses LivenessStore.IsAliveBatch
to verify runtimes are truly alive before requeuing expired leases. When
LivenessStore is unavailable (no Redis), global requeue is skipped entirely —
the preflight self-requeue in ClaimTaskForRuntime handles live runtimes. This
closes the 60-150s gap where a dead runtime still appears online in DB.

Blocker 2: Moved RequeueExpiredClaimLeasesForRuntime BEFORE EmptyClaim.IsEmpty
fast-path in ClaimTaskForRuntime. Expired leases are now requeued (which bumps
the empty cache via notifyTaskAvailable) before the empty check can
short-circuit the claim path.

Also adds ListRuntimesWithExpiredClaimLeases SQL query and LivenessChecker
interface on TaskService.

Co-authored-by: multica-agent <github@multica.ai>

* fix(task): wire EmptyClaimCache into backend taskSvc for backstop requeue (MUL-2246)

The backend taskSvc used by the sweeper only had Liveness wired but not
EmptyClaim. When global backstop requeue called notifyTaskAvailable,
s.EmptyClaim.Bump() was a nil no-op — the handler's empty-cache was never
invalidated, so the daemon's next claim hit a stale empty verdict.

Fix: wire the same Redis-backed EmptyClaimCache into the backend taskSvc
in main.go (same Redis keys as router.go:139 handler instance).

Add regression test verifying backstop requeue invalidates the handler's
empty-cache.

Co-authored-by: multica-agent <github@multica.ai>

* fix(task): global backstop must not requeue — alive runtimes use preflight, dead stay dispatched (MUL-2246)

- RequeueExpiredClaimLeases is now a no-op (returns 0 always)
- Alive runtimes self-requeue via ClaimTaskForRuntime preflight
- Dead runtimes stay dispatched for FailTasksForOfflineRuntimes
- Rewriting to queued on dead runtime creates 2h blackhole (offline
  sweeper only handles dispatched/running)
- Test actually calls RequeueExpiredClaimLeases and asserts 0 in all cases

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): remove duplicate usage reporting block after merge conflict (MUL-2246)

The merge resolution introduced a second ReportTaskUsage call after the
status check, duplicating the usage-before-early-return block that already
runs right after runner.run. Remove the duplicate and add a regression test
asserting /usage is called exactly once on the normal completion path.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-15 15:15:31 +08:00
Bohan Jiang
fdf19cac8f fix(quick-create): default squad-picked issues to the squad, not the leader (#2611)
When the user opens quick-create with a squad selected, the task is
enqueued against the squad's leader agent — but the squad, not the
leader, is the expected owner. The prompt previously instructed the
leader to "default to YOURSELF" using its own agent UUID, hiding new
issues from the squad's delegation flow.

Surface the squad's id + name on the claim response and branch the
default-assignee instruction in buildQuickCreatePrompt: when SquadID is
present, point --assignee-id at the squad UUID and explicitly forbid
self-assignment.

MUL-2203

Co-authored-by: multica-agent <github@multica.ai>
2026-05-14 17:48:02 +08:00
Naiyuan Qing
86aa5199fc feat(chat): support attachments & images in chat input (#2445)
* docs(plans): chat attachment & image support implementation plan

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* feat(db): add chat_session_id/chat_message_id to attachment

Co-authored-by: multica-agent <github@multica.ai>

* feat(db): sqlc — chat_session_id on CreateAttachment + LinkAttachmentsToChatMessage

Co-authored-by: multica-agent <github@multica.ai>

* feat(file): upload-file accepts chat_session_id form field

Co-authored-by: multica-agent <github@multica.ai>

* feat(chat): SendChatMessage links uploaded attachments to the new message

Co-authored-by: multica-agent <github@multica.ai>

* feat(api): uploadFile accepts chatSessionId; sendChatMessage accepts attachmentIds

Co-authored-by: multica-agent <github@multica.ai>

* feat(core): useFileUpload supports chatSessionId context

Co-authored-by: multica-agent <github@multica.ai>

* feat(chat): support paste/drag/upload attachments in chat input

Co-authored-by: multica-agent <github@multica.ai>

* test(e2e): chat input attachment upload + send round-trip

Co-authored-by: multica-agent <github@multica.ai>

* chore(chat): keep lazy-created session title empty so untitled fallback localizes

Co-authored-by: multica-agent <github@multica.ai>

* fix(chat): address review — dedupe ensureSession + parse upload response

- chat-window: cache in-flight createSession promise in a ref so a file drop
  followed by a quick send no longer spawns two sessions (and orphans the
  attachment on the losing one).
- Attachment type + EMPTY_ATTACHMENT + AttachmentResponseSchema: include the
  new chat_session_id / chat_message_id fields the server now returns.
- uploadFile: route the response through parseWithFallback so a malformed
  body returns EMPTY_ATTACHMENT instead of an undefined-keyed Attachment,
  matching the API boundary rule.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* fix(chat): address PR #2445 review — test ctx, send gating, attachment surface

1. Backend test was 400ing because the handler reads workspace from
   middleware-injected ctx, and `newRequest` only sets the header. Helper
   `withChatTestWorkspaceCtx` mirrors the agent-access-test pattern and
   loads the member row + SetMemberContext before invoking the handler.

2. Attachment metadata now flows end-to-end:
   - new sqlc `ListAttachmentsByChatMessageIDs` (batch lookup, mirrors the
     comment-side query)
   - `chatMessageToResponse` takes `attachments` and `ChatMessageResponse`
     surfaces them — same shape as CommentResponse
   - `ListChatMessages` loads them via a new `groupChatMessageAttachments`
     helper so the chat bubble can render file cards
   - daemon claim path pulls `ListAttachmentsByChatMessage` for the latest
     user message and ships `ChatMessageAttachments` to the daemon
   - `buildChatPrompt` lists id+filename+content_type and instructs the
     agent to `multica attachment download <id>` — fixes the private-CDN
     expiring-URL problem where the markdown URL would have expired by
     the time the agent acts
   - TS `ChatMessage` gains an optional `attachments` field

3. Chat composer now blocks send while uploads are in flight:
   - `pendingUploads` counter increments in handleUpload, SubmitButton
     uses it to disable
   - handleSend also gates on `editorRef.current.hasActiveUploads()` to
     catch the Mod+Enter path that bypasses the button
   - new vitest covers the "drop large file → immediate send" scenario
     where attachment id would otherwise be silently dropped

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* chore: drop implementation plan doc

Process artefact, not something the repo needs to keep.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-12 10:57:54 +08:00
Bohan Jiang
63d215e1c3 feat(runtime): visibility (public/private) gate on CreateAgent / UpdateAgent (#2419)
* feat(runtime): visibility (public/private) gate on CreateAgent / UpdateAgent

Closes the hole where a plain workspace member could pick another member's
runtime in the Create Agent dialog and bind an agent to it — the backend
wasn't checking runtime ownership, so the agent ran on someone else's
hardware / tokens. Reported on GH #1804.

Schema
- Migration 083 adds agent_runtime.visibility ('private' default, 'public')
  with a CHECK constraint. Existing rows default to private — same
  ownership semantics as before, no behavior change for legacy data.

Backend
- canUseRuntimeForAgent predicate: allow when caller is workspace
  owner/admin, the runtime owner, or the runtime is public.
- CreateAgent and UpdateAgent both gate on it: UpdateAgent matters because
  a plain member could otherwise create on their own runtime, then re-bind
  to a private one.
- PATCH /api/runtimes/:id accepts { visibility } — owner/admin only,
  validated against the same private/public allow-list.

Frontend
- Create-agent dialog renders other-owned private runtimes disabled with a
  Lock badge + tooltip explaining who to ask.
- Inspector runtime-picker disables the same set so re-binding fails
  the same way at the UI layer.
- Runtime detail diagnostics gains a Visibility editor (owner/admin) or
  read-only chip (everyone else).
- Runtime list shows a private/public chip next to the name.

Tests
- Go: canUseRuntimeForAgent truth table; CreateAgent / UpdateAgent
  end-to-end gate tests (admin / runtime owner / plain member);
  PATCH visibility owner / admin / member / invalid-value coverage.
- Vitest: create-agent dialog disabled state on private/public runtimes,
  default-runtime selection skips locked rows; runtime detail visibility
  editor → mutation, read-only fallback.

Migrating runtimes: existing rows default to private to preserve the
"owner only" status quo. Owners switch to public via the detail page
diagnostics card.

Co-authored-by: multica-agent <github@multica.ai>

* fix(runtime): apply timezone+visibility atomically; don't seed locked template runtime

Two issues surfaced in review of MUL-2062:

1. PATCH /api/runtimes/:id ran the timezone branch first, which:
   - returned early on a tz no-op, silently dropping a concurrent
     `visibility` patch in the same body;
   - committed the timezone mutation (+ usage rollup rebuild) before
     validating visibility, so an invalid visibility left the row
     half-updated.

   Validate every field first, then run the mutations in order. The
   no-op short-circuit now only triggers when nothing else is requested.

2. The Create Agent dialog in duplicate mode unconditionally seeded
   `template.runtime_id` as the selected runtime, even when that runtime
   is now private and owned by someone else — the user saw a selected
   row they couldn't submit (Create → backend 403). Fall back to the
   first usable runtime when the template's runtime is locked, and gate
   the Create button on `selectedRuntimeLocked` as defense in depth.

Tests:
- Go: TestUpdateAgentRuntime_CombinedPatchAppliesBoth (tz no-op +
  visibility flip), TestUpdateAgentRuntime_InvalidVisibilityDoesNotMutateTimezone
  (atomic-fail invariant).
- Vitest: duplicate template pointing at a locked runtime now seeds
  the first usable one; Create button stays disabled when no usable
  alternative exists.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-11 22:53:07 +08:00
Bohan Jiang
b26f850d4e feat(agents): gate private-agent surfaces with allowed_principals predicate (#2359)
* feat(agents): gate private-agent surfaces with allowed_principals predicate

Tighten chat/@-mention, history, edit, and delete entry points so private
agents are only reachable by their owner or workspace owner/admin. Agent-to-
agent traffic still bypasses the gate so A2A collaboration keeps working.

- New canAccessPrivateAgent predicate in handler/agent_access.go; used by
  comment.enqueueMentionedAgentTasks (replacing the inline check), GetAgent,
  ListAgents (filter), ListAgentTasks, GetWorkspaceAgentRunCounts /
  Activity30d / TaskSnapshot (workspace-wide aggregations no longer leak
  private-agent existence + counts), chat.CreateChatSession,
  chat.SendChatMessage (re-checks on every send so role changes can't leave
  a stale session as a back-door), and autopilot.shouldSkipDispatch
  (caller = autopilot creator).
- allowed_principals is computed inline as {agent.owner_id} ∪ workspace
  owner/admin members. No new table — manual config is intentionally not
  exposed in v1; the predicate is the extension seam.
- Front-end agent detail page distinguishes 403 (private agent the caller
  can't access) from 404 (deleted/missing) and renders a "no access"
  placeholder with a back-to-agents button.
- Go tests cover the pure predicate matrix + the four protected surfaces;
  vitest passes for the affected views.

Co-authored-by: multica-agent <github@multica.ai>

* feat(agents): gate issue assignment with the private-agent predicate

Refactor validateAssigneePair to call the shared canAccessPrivateAgent
helper. This closes the back door where a plain member could assign a
private agent to an issue and let normal task dispatch run it, side-
stepping the chat / @-mention gate. Agent callers (X-Agent-ID) bypass
so A2A delegation onto a private assignee still works.

Add an integration test covering all three callers (workspace owner,
agent owner, plain member).

Co-authored-by: multica-agent <github@multica.ai>

* fix(agents): close three private-agent gate bypasses found in PR review

1. X-Agent-ID forgery (resolveActor): require X-Task-ID alongside
   X-Agent-ID before trusting the agent identity. Without this a plain
   workspace member could set X-Agent-ID to any visible agent UUID and
   short-circuit the gate to "actor=agent, allow". Daemons already
   pair the two headers, so legitimate A2A traffic is unaffected.

2. Chat history read path (chat.go): GetChatSession / ListChatMessages /
   GetPendingChatTask / MarkChatSessionRead now go through a new
   gateChatSessionForUser helper that re-applies canAccessPrivateAgent
   after the ownership check, so a session creator whose role was later
   downgraded loses transcript access. ListChatSessions and
   ListPendingChatTasks filter their result sets by the same predicate.

3. Cross-workspace @mention (comment.enqueueMentionedAgentTasks):
   resolve the mentioned agent via GetAgentInWorkspace scoped to the
   issue's workspace so a UUID belonging to a different workspace's
   private agent can't slip past the gate (the gate was being applied
   against the current workspace's role table, which is the wrong
   one).

Regression tests cover each bypass, plus an update to the resolveActor
unit test to reflect the new "X-Agent-ID without X-Task-ID falls back
to member" contract.

Co-authored-by: multica-agent <github@multica.ai>

* test(handler): seed X-Task-ID alongside X-Agent-ID in existing agent-caller tests

After tightening resolveActor to require both headers (X-Agent-ID +
X-Task-ID) for the "agent" actor identity, three existing tests that
set only X-Agent-ID started failing because their requests now resolve
to "member" instead of "agent". Add createHandlerTestTaskForAgent
helper and seed a task per agent-caller assertion. Also patch
TestAgentExplicitMentionStillTriggers — it still passed only because
the @mention path doesn't care about author type for member callers,
but the test claims to exercise the agent path, so make it faithful.

Co-authored-by: multica-agent <github@multica.ai>

* test(handler): finish X-Task-ID seeding + fix cross-workspace mention test schema

The previous CI run still failed in two places:

1. server/cmd/server integration tests — postCommentAsAgent → authRequestWithAgent
   only set X-Agent-ID, so resolveActor downgraded the request to "member"
   and the on_comment chain produced the wrong task counts. Fix:
   authRequestWithAgent now also sets X-Task-ID, fetched or seeded by a new
   ensureAgentTask(agentID) helper.

2. TestMentionAgent_RejectsCrossWorkspaceAgentUUID's hand-crafted comment
   INSERT was missing comment.workspace_id, which migration 025 made
   NOT NULL. Pass testWorkspaceID into the seed row.

Build + vet clean locally; both packages compile.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-11 12:39:45 +08:00
Multica Eve
ce00e05169 Add canonical PostHog core metrics events (#2302)
* Add canonical PostHog core metrics events

Co-authored-by: multica-agent <github@multica.ai>

* Address analytics review feedback

Co-authored-by: multica-agent <github@multica.ai>

* Tighten analytics review follow-ups

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Devv <devv@Devvs-Mac-mini.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-09 13:12:00 +08:00
Jiayuan Zhang
fe956fc670 feat(issues): add Copy local workdir path to issue menu (#2196)
* feat(issues): add Copy local workdir path to issue menu

Surface the daemon-pinned task work_dir on the AgentTaskResponse and add a
"Copy local workdir path" action to the issue dropdown / context menu. The
action picks the most recent task with a recorded work_dir and writes it
to the clipboard so users can jump straight to the local execution
directory to inspect results.

Co-authored-by: multica-agent <github@multica.ai>

* fix(issues): preserve user activation in Copy local workdir path

Move the task list subscription out of useIssueActions and into
IssueActionsMenuItems, where Base UI lazily mounts the menu content
only after the user opens the menu. The click handler now reads
straight from the cached query result and writes to the clipboard
synchronously, so the awaited fetch no longer drops the browser's
transient user activation when the cache is cold (e.g. opening the
context menu on an issue list row that hasn't pre-populated the
ExecutionLogSection cache).

Per Emacs PR review.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-07 06:05:14 +02:00
Bohan Jiang
d0ac67dea2 fix(skills): drop SKILL.md content from list endpoints (#2180)
* fix(skills): drop SKILL.md content from list endpoints (#2174)

`GET /api/skills` and `GET /api/agents/{id}/skills` were SELECT *'ing the
skill row and shipping the full SKILL.md `content` blob to every caller.
SKILL.md bodies routinely run 50–200KB each, so a workspace with 30–40
skills returned multi-megabyte JSON arrays — past the CLI's 15s timeout
on high-latency links and locking out non-US users entirely.

Add `ListSkillSummariesByWorkspace` / `ListAgentSkillSummaries` sqlc
queries that omit `content`, plus a dedicated `SkillSummaryResponse`
wire shape so the contract is explicit (versus stuffing
`Content: ""` back into the existing struct). Detail endpoints
(`GET /api/skills/{id}`, agent CRUD return values) keep returning the
full body.

`AgentResponse.skills` and the matching TS `Agent.skills` now use
`SkillSummary[]` — frontend list/columns code already only read
id/name/description/config.origin, so the type narrowing matches actual
usage and prevents new code from accidentally depending on a content
field that won't be there.

Co-authored-by: multica-agent <github@multica.ai>

* fix(agents): narrow embedded skills to AgentSkillSummary; gofmt agent.go

GPT-Boy review of #2180: the previous commit typed AgentResponse.Skills as
[]SkillSummaryResponse, but the agent list batch query
(ListAgentSkillsByWorkspace) only joins agent_id/id/name/description, so
the wider type left workspace_id/config/created_at/updated_at as zero
values. Define a dedicated AgentSkillSummary {id,name,description} that
matches what the batch query actually returns and what the frontend
actually reads (`agent.skills.map(s => s.name|s.id)`); the standalone
GET /api/agents/{id}/skills endpoint keeps SkillSummaryResponse for
callers that need the source/origin info.

Switch GetAgent's per-agent skills load from ListAgentSkills (full Skill
rows including content) back to ListAgentSkillSummaries to avoid reading
SKILL.md bodies just to discard them.

Re-run gofmt on agent.go to fix the field-tag alignment that drifted when
Skills changed type.

Co-authored-by: multica-agent <github@multica.ai>

* docs(types): correct SkillSummary JSDoc — Agent.skills is AgentSkillSummary[]

GPT-Boy spotted on review: comment said SkillSummary was "embedded in
Agent.skills", but that field is now AgentSkillSummary[]. Re-point the
reader at the right type to avoid future confusion.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-07 01:36:29 +08:00
Jiayuan Zhang
d492b9d7a6 Revert "feat(quick-create): add preset issue fields (#2002)" (#2042)
This reverts commit a039c4d803.
2026-05-03 20:02:40 +02:00
ayakabot
a039c4d803 feat(quick-create): add preset issue fields (#2002)
Fixed: #2001
2026-05-03 19:37:12 +08:00
Bohan Jiang
da5dbc6224 refactor(repos): drop unused description + tighten create-project layout (#1930)
* refactor(repos): drop unused description + tighten create-project layout

Two related changes that touch the workspace-repos surface together.

1. Remove the per-repo `description` field everywhere it was threaded.
   The only place it ever surfaced was a markdown table column the daemon
   wrote into the agent runtime config, where most rows just read "—"
   anyway. Agents already discover project structure by running
   `multica project` / `multica issue` against the CLI, so the human-
   readable description string carried no real value while taking up an
   extra Settings input row and propagating through six layers (settings
   UI → workspace.repos jsonb → handler RepoData → daemon RepoData →
   repocache.RepoInfo → execenv.RepoContextForEnv).

   - Settings → Repositories drops the description input; the URL field
     now spans the whole row.
   - WorkspaceRepo TS type loses `description`; backend RepoData /
     RepoInfo / RepoContextForEnv all collapse to URL only.
   - Daemon's runtime_config Repositories block changes from a
     `| URL | Description |` markdown table to a simple bullet list.
   - Tests updated; jsonb residue in existing workspaces is dropped at
     normalize time, so no migration needed.

2. Tighten the Create Project modal footer: pull the Status / Priority /
   Lead / Repos pills onto the same row as the Create Project button
   (Linear-style single-row footer) instead of stacking them above it,
   and swap the Repos pill icon from `FolderGit` to a real GitHub mark
   (lucide-react v1 dropped brand icons, so the mark lives inline as a
   small SVG component in this file).

   I tried promoting Repos to its own "Resources" strip above the footer
   to separate the resources abstraction from project metadata, but with
   a single pill it looked too sparse — leaving a TODO comment in the
   footer to revisit once we add Linear / Notion / Figma / Slack
   resource types.

* fix(daemon test): drop residual Description field on RepoData literals

* fix(repos): drop Description residue surfaced after rebase on #1929

Project-resource github_repo lift path (#1929) and registerTaskRepos
both still constructed RepoData{...Description: ...} after the rebase.
Two test sites in daemon_test.go and execenv_test.go also reintroduced
the field. Strip them so the Description-removal change builds and
tests pass with the latest main.
2026-04-30 14:55:03 +08:00
Bohan Jiang
44608713bb feat(projects): typed project resources + agent runtime injection (#1926)
* feat(projects): typed project resources + agent runtime injection

Adds a `project_resource` table that lets a project carry typed pointers
(github_repo today, more later) and surfaces them at agent runtime.

Server
- migration 065: project_resource (resource_type TEXT + resource_ref JSONB)
- sqlc CRUD + handler at /api/projects/{id}/resources
- claim handler attaches project_id/title + resources to issue tasks

Daemon
- TaskContextForEnv carries project context
- writes .multica/project/resources.json into workdir
- adds "## Project Context" block to CLAUDE.md / AGENTS.md / GEMINI.md
  via type-dispatched formatter so new resource types just add a case

CLI
- multica project create --repo <url> attaches repos in one step
- multica project resource add/list/remove

Frontend
- Project create modal: Repos pill (workspace repos + ad-hoc URL)
- Project detail sidebar: collapsible Resources section with attach/remove

Docs
- New "Project Resources" chapter explaining the abstraction and
  exactly what code to touch when adding a new resource type

Co-authored-by: multica-agent <github@multica.ai>

* fix(projects): transactional resources[] on create + generic CLI ref + test fix

Addresses review feedback on PR #1926:

1. CI red: TestProjectResourceLifecycle delete step called withURLParam
   twice, which replaced the chi route context and dropped the project id.
   Switched to the existing withURLParams helper from daemon_test.go.

2. POST /api/projects now accepts resources[] and attaches them in the
   same transaction as the project. Invalid refs roll back the whole
   create — no more half-attached projects on failure. Web modal + CLI
   `project create --repo` both use the new bundled payload.

3. CLI `project resource add` now accepts a generic --ref '<json>' flag
   so a new resource_type works without a CLI change. Per-type
   shortcuts (--url for github_repo) remain as a convenience but are no
   longer the only way in. Docs updated to drop the CLI from the
   "files you must touch" list.

Adds two new server handler tests:
- TestCreateProjectAttachesResources (resources[] happy path)
- TestCreateProjectRollsBackOnInvalidResource (transactional rollback)

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-04-30 14:00:43 +08:00
Bohan Jiang
9baa72cc68 fix: polish quick-create UX (kind labeling, dark toast, placeholder) (#1831)
* fix: polish quick-create UX (kind labeling, dark toast, placeholder)

Three small fixes shaken out from using the agent-create flow:

- AgentTaskResponse now carries a `kind` discriminator
  ("comment" | "autopilot" | "chat" | "quick_create" | "direct"), computed
  from the existing FK shape with no extra DB access. The Activity row
  uses it to label quick-create tasks as "Creating issue" instead of
  falling through to the generic "Untracked" — once the agent finishes
  and the new issue is linked, the row transitions to the normal
  identifier+title display.

- Sonner Toaster reads `resolvedTheme` instead of `theme`, so toasts
  follow the actual dark/light state. Forwarding "system" let sonner
  pick its own answer from `prefers-color-scheme`, which in the Electron
  renderer can disagree with next-themes' `html.dark` class — the toast
  rendered light on a dark UI.

- Agent-create placeholder rephrased to a more conversational example
  with a project reference: "let Bohan fix the inbox loading slowness
  in the Web project". Drops the priority hint (priority isn't widely
  used) and matches how people actually instruct the agent.

* fix(quick-create): link new issue back to task on completion

Addresses the review on PR #1831: completed quick-create tasks were
left with issue_id=NULL forever, so the activity row stayed on
"Creating issue" instead of transitioning to the normal MUL-XXX +
title rendering once the agent finished.

- Server: notifyQuickCreateCompleted now writes the resolved issue id
  back to agent_task_queue.issue_id via a new LinkTaskToIssue query
  (guarded by `issue_id IS NULL` so it only ever fills the unset
  quick-create case). Best-effort: a write failure logs but doesn't
  block the inbox notification.
- Frontend: defensive wording fallback — kind=quick_create rows in
  terminal status (completed/failed/cancelled) now render as
  "Quick create" instead of the active "Creating issue" label,
  covering rows whose link write failed or whose agent never
  produced an issue at all.
2026-04-29 15:40:59 +08:00
Naiyuan Qing
f745a3bbbe feat(agent): presence v3 + execution log + trigger summary (#1823)
* refactor(views): migrate agent/runtime/skill lists to TanStack DataTable

Replace the per-page CSS Grid + minmax(min, fr) + sticky-first-col + truncate
implementation with a TanStack Table backend rendered through a Dice UI-style
DataTable shell. Column widths are now px-based via column.size, so cells
no longer shrink or auto-truncate as the viewport narrows; when the sum of
columns exceeds the viewport, the container scrolls horizontally instead.

- Add @tanstack/react-table to the catalog (8.21.3) and wire it into
  packages/ui (dep) and packages/views (peerDep).
- packages/ui: new DataTable + DataTableColumnHeader + lib/data-table.ts
  (getColumnPinningStyle), adapted from Dice UI's registry. The shell
  renders <table> directly (skipping shadcn's <Table> wrapper) so its own
  outer overflow controls both axes — no nested overflow conflicts.
- packages/views: each list now declares ColumnDef[] with explicit
  cell renderers. Row click navigates to detail via onRowClick (instead of
  wrapping <tr> in <a>, which is invalid HTML); kebab dropdowns
  stopPropagation so they don't trigger the row navigation.
- Drop the previous AGENT_LIST_GRID / GRID_WITH_OWNER / ROW_GRID
  templates and the sticky-first-col / subgrid mechanics that came with
  them. agent-list-item.tsx is removed; runtime-list.tsx and
  skills-page.tsx are trimmed to thin wrappers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(agent): cap description at 255 chars (db + api + ui)

Symmetric enforcement across DB, server, and UI:

- Migration 060: pre-flight truncate of any oversize rows, then ADD
  CONSTRAINT NOT VALID + VALIDATE CONSTRAINT so the new check doesn't
  block writes during validation.
- Server handler validates utf8.RuneCountInString on Create/Update and
  rejects over-limit input with 400.
- Front-end gets AGENT_DESCRIPTION_MAX_LENGTH in core/agents/constants
  (single source of truth shared by the create dialog + edit modal +
  test suite) and a CharCounter component that warns at 90% and errors
  past the cap.
- Description editor moves from a 288px popover to a roomy modal.
  Editor body is mounted only while the dialog is open, so the local
  draft state is locked in at mount time and never reset by an external
  WS update — the React-recommended replacement for the
  useEffect(reset, [value]) anti-pattern.

Counted in code points everywhere (rune count / spread length /
char_length) so multibyte input agrees across all three layers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(views): data-table polish across runtime + skill lists

Builds on the DataTable migration in 2be0f287:

- Add ColumnMeta.grow flag — declared via TanStack module augmentation
  in ui/lib/data-table.ts. Columns marked meta.grow skip their inline
  width so fixed table-layout assigns them the leftover container space
  (no spacer column). The Title-grows / others-fixed pattern from
  Linear / GitHub PR rows.
- Authoritative table min-width = sum of column.size, applied to the
  <table> itself (fixed-layout ignores cell-level min-width per spec,
  so the floor has to live on the table).
- Header tightens to h-8 + uppercase + tracking-wider; pinned cells
  switch to opaque bg + group-hover so they cover content scrolling
  beneath them and follow row hover state.
- Toolbar slot removed from DataTable (callers wrap the toolbar
  themselves now — keeps DataTable single-purpose).

Also: hover-card popup stops contextmenu / auxclick / dblclick from
bubbling out (in addition to click). Stops the popup from triggering
ancestor handlers (e.g. issue list rows) on right-click / middle-click
without breaking Base UI's outside-click dismiss, which listens to
pointerdown — pointerdown is deliberately NOT stopped.

Runtime + skill list pages updated to use the new sizing model.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(agent): drop LastTaskState, introduce 3-state Workload

Continues the presence-model rework started in #1794 / #1798.

The previous LastTaskState union (running / completed / failed /
cancelled / idle) carried historical outcome at the list level — a
runtime-healthy agent whose last task failed showed a sticky red dot
indistinguishable from a daemon-dead agent.

New model: presence is two orthogonal "right-now" dimensions:

  AgentAvailability — runtime reachability only (online / unstable /
                      offline). Drives the dot colour everywhere.
  Workload          — current load (working / queued / idle). Three
                      states, never historical. Failure / completion /
                      cancellation are surfaced via Recent Work + Inbox,
                      not list-level state.

`queued` (= nothing running, ≥1 queued) is an honest "stuck on offline
runtime" signal. To avoid amber flashes during the brief enqueue→claim
race on healthy runtimes, the queued chip composes with availability:
muted on online, warning amber otherwise.

Activity tab cleanup that follows from the new model:
  - failureReasonLabel relocated from agents/presence.ts to
    tabs/task-failure.ts (presence no longer owns historical state).
  - Recent Work paginates (5 initial, +20 per "Show more"); chat-session
    tasks are filtered out of every Agent-scoped surface to keep
    "team work" separate from private chat.
  - Agents page drops the lastTaskFilter chip group; users find broken
    agents via Inbox / Recent Work, not a list-level filter.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(task): trigger summary snapshot + task:queued lifecycle event

Two task-lifecycle improvements that ship together because they share
the same enqueue/retry hot paths and changes interleave inside task.go:

1. trigger_summary snapshot (migration 061)

   New nullable column on agent_task_queue. Comment-triggered tasks
   snapshot the comment content; autopilot tasks snapshot the run title.
   Truncated to 200 runes via strings.Builder so multibyte input counts
   correctly without O(N²) concatenation. Snapshot survives source
   edits/deletes — every task row self-describes across surfaces (issue
   detail Execution log, agent activity tooltip, inbox) without joining
   back to the originating row.

   Retry rows inherit the parent's snapshot (CreateRetryTask SELECT) so
   the description stays meaningful across attempts. The UI is
   responsible for stacking "Retry #N" context on top.

2. task:queued WS event

   New protocol event covering the ∅ → queued transition. Front-end
   types/events.ts registers it; use-realtime-sync's task: prefix path
   already invalidates task caches via onAny, so old clients without
   this exact-match subscription still refresh correctly. Specific
   subscribers (sticky banner) get sub-second updates instead of
   waiting for daemon claim.

   Retry path now broadcasts task:queued (not task:dispatch) — same
   status transition shape as enqueue, so all "new task created" paths
   agree on one event type.

   Ordering: broadcastTaskEvent runs *before* notifyTaskAvailable so
   the queued event is published into the WS bus before the daemon is
   poked. Without this, a fast daemon could claim and emit task:dispatch
   over the wire before the in-process queued broadcast fan-out reached
   clients — race window is tiny but unsafe-by-construction.

   Per-agent task list (agentTasksKeys.all) and per-issue task list
   (["issues","tasks"]) added to the task: invalidation set so Activity
   tab Recent Work and the Execution log section stay fresh.

Type contracts: AgentTask gains parent_task_id / attempt /
trigger_comment_id (already returned by the API, just missing from TS)
plus the new trigger_summary field.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(issue): ExecutionLogSection — unified active+past runs panel

Replaces two pieces:
  - the click-to-expand timeline that lived inside AgentLiveCard
  - the standalone TaskRunHistory below the main content

with a single right-panel section that lists every agent run for the
issue. Active runs sit at the top (always visible when present); past
runs collapse behind a "Show past runs (N)" toggle, sorted failed →
cancelled → completed within group.

Active rows show the trigger summary, status + relative time, and
Cancel / Transcript actions on hover (gradient backdrop fades the
status text rather than hard-clipping). Past rows show the same
shape minus Cancel.

Retry tasks prepend "Retry #N · " to the inherited summary so they're
distinguishable from their parent (which would otherwise share the
exact same trigger text).

Cache key registered as issueKeys.tasks(issueId); the global
useRealtimeSync task: prefix path already invalidates ["issues","tasks"]
on every task lifecycle event, so the section stays fresh without
local WS subscriptions.

AgentLiveCard slims down to a header-only "agent is working" sticky
banner — keeps the at-a-glance "is anyone working on this right now"
signal and the Stop / Transcript actions, drops the inline timeline
that ExecutionLogSection now owns. Subscribes to both task:queued and
task:dispatch so retries (which only emit queued) land in the banner
without waiting for daemon claim.

issue-detail mounts ExecutionLogSection in the right panel and removes
the now-defunct TaskRunHistory call site.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 14:50:58 +08:00
Bohan Jiang
2d9c153695 feat: quick-create issue (async agent + inbox completion) (#1786)
* feat(server): add quick-create issue async task path

Adds POST /api/issues/quick-create which validates the picked agent's
reachability up front (not archived, has runtime, runtime online) then
queues an issue-less agent task whose context JSONB carries the user's
natural-language prompt + requester + workspace. Daemon claim resolves
the workspace from the context, and the prompt builder switches to a
quick-create template instructing the agent to translate the prompt
into a single multica issue create call.

Task completion writes a success inbox item to the requester pointing at
the newly-created issue (located by querying the agent's most recent
issue in the workspace since task start, so we don't depend on agent
stdout shape). Failures write an action_required inbox item carrying the
original prompt + agent id so the frontend can offer "Edit as advanced
form" without losing input.

* feat(views): quick-create issue modal + inbox failure CTA

Adds a streamlined create-issue UI bound to the c shortcut: pick an
agent, type one line, submit. The modal closes immediately and the
agent translates the prompt into a multica issue create call in the
background. Shift+c keeps the legacy advanced form for users who want
every field. The "Advanced" button inside the new modal seeds the
shared issue-draft store with the prompt + picked agent so switching
mid-flow doesn't lose input.

Last-used agent persists per (user, workspace) via a workspace-aware
zustand store so frequent users skip the picker on every open.

Inbox renders quick_create_done items with a status pin to the new
issue and quick_create_failed items with an "Edit as advanced form"
CTA that re-seeds the legacy modal with the original prompt.

ApiError now carries the parsed JSON body so the modal can branch on
the structured agent_unavailable code without parsing the error
message.

* fix(quick-create): execenv injection, claim race, private-agent permission

Addresses GPT-Boy review on #1786:

1. execenv was rendering the assignment-task issue_context.md / runtime
   workflow even for quick-create, telling the agent to call
   `multica issue get/status/comment add` against an empty IssueID.
   Adds QuickCreatePrompt to TaskContextForEnv, plus a quick-create
   branch in renderIssueContext + the runtime_config workflow that
   instructs the agent to run a single `multica issue create` and
   exit, with explicit "do NOT call issue get/status/comment add"
   guards.

2. ClaimAgentTask serialized only on issue_id / chat_session_id, so
   concurrent quick-creates on the same agent (both NULL on those
   columns) ran in parallel — making the success-inbox lookup race
   over "most recent issue by this agent". Adds a third OR clause
   that treats "all four FKs NULL" as a serialization key for the
   same agent, so quick-create tasks on a given agent run one at a
   time.

3. QuickCreateIssue handler bypassed the private-agent ownership rule
   that validateAssigneePair enforces elsewhere — a user could POST a
   private agent_id they didn't own and trigger it. Now routes the
   picked agent through validateAssigneePair before the runtime
   liveness check.

4. Clarifies the quick-create-store namespacing comment to match the
   actual workspace-aware StateStorage convention used by the other
   issue stores (per-user is browser-profile-local).

* fix(quick-create): branch Output section + deterministic origin lookup

Addresses GPT-Boy's second-pass review on #1786:

1. The runtime_config.go Output section forced "Final results MUST be
   delivered via multica issue comment add" for every non-autopilot
   task — quick-create still got this conflicting instruction even
   though there's no issue to comment on. Switched the Output block
   to a three-way switch so quick-create gets a tailored "stdout is
   captured automatically; do NOT call comment add" branch matching
   the autopilot variant.

2. Completion lookup was "most recent issue created by this agent
   since task.started_at", which races against concurrent issue
   creates by the same agent (assignment task running alongside
   quick-create when max_concurrent_tasks > 1). Replaced with a
   deterministic origin link:

   - Migration 060 extends issue.origin_type CHECK to allow
     'quick_create'.
   - Daemon sets MULTICA_QUICK_CREATE_TASK_ID env var when running a
     quick-create task.
   - multica issue create CLI reads the env var and stamps the new
     issue with origin_type=quick_create + origin_id=<task_id>.
   - Server CreateIssue handler accepts (origin_type, origin_id)
     from trusted callers (only "quick_create" is allowed; the pair
     is rejected unless both fields are provided together).
   - notifyQuickCreateCompleted now calls GetIssueByOrigin keyed on
     (workspace_id, "quick_create", task.ID) — no more time-window
     racing against parallel agent activity.

The old GetRecentIssueByCreatorSince query is removed.
2026-04-29 14:05:26 +08:00
Naiyuan Qing
21e3cfaa01 Agent runtime status redesign: split presence into availability + last-task (#1794)
* feat(agent-status): add workspace live-tasks endpoint and TaskFailureReason type

Lays the API + type contract for the front-end agent presence cache:

- New `GET /api/active-tasks` returns active (queued/dispatched/running)
  tasks plus failed tasks within the last 2 minutes for the current
  workspace. The 2-minute window powers a UI-side auto-clearing "Failed"
  agent state without back-end pollers.
- `agent_task_queue` has no workspace_id column, so the query JOINs agent;
  `SELECT atq.*` keeps `failure_reason` (migration 055) on the wire.
- Adds `TaskFailureReason` to `AgentTask` so the UI can map the 5 backend
  classifiers (agent_error / timeout / runtime_offline / runtime_recovery
  / manual) to copy without parsing free-text errors.
- New `api.getActiveTasksForWorkspace()` client method; workspace is
  resolved server-side from the X-Workspace-Slug header (no path param,
  matching /api/agents and /api/runtimes conventions).

Includes the joint engineering plan and designer brief that scope the
broader Agent / Runtime status redesign — Phase 0 is this contract plus
the front-end derivation layer landing in the next commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(agent-status): derive presence/health states with WS sync and desktop IPC bridge

Adds the front-end derivation layer that turns raw server data into the
user-facing 5-state agent / 4-state runtime enums. UI files are
deliberately untouched in this commit — derivation lives behind hooks
(useAgentPresence, useRuntimeHealth) that any component can call with
zero additional network traffic.

Architecture:
- Derivation is pure functions in packages/core/{agents,runtimes}; the
  back-end stays free of UI translation. Agents algorithm: runtime
  offline > recent failed (2-min window) > running > queued > available.
  Runtimes algorithm: status + last_seen_at -> online / recently_lost /
  offline / about_to_gc.
- A single workspace-wide active-tasks query backs all per-agent
  presence reads, eliminating N+1 across hover cards, list rows, and
  pickers. 30-second tick re-renders the hooks so the failed window
  expires even when no underlying data changes.
- WS task lifecycle events (dispatch / completed / failed / cancelled)
  invalidate active-tasks via the prefix dispatcher. completed/failed
  were removed from specificEvents so they go through both the prefix
  invalidate and the existing chat ws.on() handlers. Reconnect refetch
  picks up active-tasks too.
- Desktop bridges window.daemonAPI.onStatusChange directly into the
  runtimes cache via setQueryData, giving the local daemon sub-second
  feedback (vs. 75s server sweep). Bridge is wsId-bound so workspace
  switches automatically rebind the subscription; daemon_id matching
  covers the same-daemon-multiple-providers case.

24 derivation unit tests cover all branches plus null/empty/boundary
inputs (FAILED_WINDOW_MS edges, null last_seen_at, missing
completed_at). Full core suite: 112 tests passing. Typecheck green
across all 8 workspace packages.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(agent-status): redesign agent runtime status as two orthogonal dimensions

Splits the conflated 5-state agent presence into two independent axes:

- AgentAvailability (3-state): online / unstable / offline — drives the
  dot indicator everywhere a dot appears. Pure runtime reachability;
  never sticky-red because of a past task outcome.

- LastTaskState (5-state): running / completed / failed / cancelled /
  idle — surfaced as text + icon on focused surfaces (hover card,
  agent detail page, agents list, runtime detail). Never colours the dot.

Major changes:

* Domain layer: AgentPresence union → AgentAvailability + LastTaskState.
  derive-presence split into deriveAgentAvailability + deriveLastTaskState
  + deriveAgentPresenceDetail orchestrator. Tests reorganised into three
  groups (availability invariants, last-task invariants, composition).

* Visual config: presenceConfig (5 entries) → availabilityConfig (3) +
  taskStateConfig (5). availabilityOrder + lastTaskOrder for filter chips.

* Workspace-level presence prefetch: new useWorkspacePresencePrefetch
  hook + WorkspacePresencePrefetch mount component, wired into
  DashboardLayout (web) and WorkspaceRouteLayout (desktop). Hover cards
  render synchronously with no skeleton flash on first hover.

* ActorAvatar hover: flipped default — disableHoverCard removed,
  enableHoverCard added (default false). Opt-in at ~14 decision-moment
  surfaces; pickers / decoration sub-chips stay plain. Status dot
  decoupled (showStatusDot prop) so picker rows can show presence
  without nesting popovers.

* Hover cards: AgentProfileCard simplified — availability dot only,
  Detail link top-right (logs live on the detail page). New
  MemberProfileCard mirrors the structure: name + role + email +
  top-2 owned agents (sorted by 30d run count) with click-through to
  agent detail.

* Agents list: split Status into two columns — availability (3-color
  dot + label) and Last run (task icon + label, optional running
  counts). Two independent filter chip groups (Status + Last run);
  combination acts as intersection ("online + failed" finds broken-
  but-alive agents).

* Other UI surfaces (issue list/board/detail, comments, autopilots,
  projects, runtimes, mention autocomplete, subscribers picker)
  updated to the new dot semantics; status dot now strictly 3-color.

Server changes accompany the client redesign — workspace-wide
agent-task-snapshot endpoint, runtime usage queries, etc. — to feed
the derive layer with the data it needs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(agent-detail): drop last-task chip from detail header + inspector

The Recent work section on the agent detail page already shows the same
data (with task titles, timestamps, error context) — surfacing
"Completed" / "Failed" / etc. up in the header was redundant chrome.
Detail surfaces now show only the 3-state availability dot.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(tables): handle narrow viewports across agents / skills / runtimes

Three table layouts were squeezing content into adjacent cells at
intermediate widths. Each fix is small and targeted:

* runtime-list: the Runtime cell's base name had `shrink-0`, so it
  refused to truncate when its grid column was narrowed under width
  pressure — the name visually overflowed into the Health column
  ("ClaudeOnline" etc). Removed shrink-0, added truncate. The Health
  column was also a fixed 9.5rem reservation for the worst-case
  "Recently lost · 2m 14s ago" copy; switched to minmax(0,1fr) so it
  competes fairly with Runtime.

* skills-page: had a single grid template with no responsive
  breakpoints — all 6 columns were rendered at any width and got
  visually jammed below md. Added a <md template that drops Source +
  Updated; the row markup hides those cells via `hidden md:block` /
  `md:contents`.

* agent-list-item: the new Last run column was reserved at minmax(8rem,
  max-content); on narrow md viewports the 8rem floor pushed the row
  past available width. Changed to minmax(0,max-content) so the cell
  shrinks under pressure (its content already truncates).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(agent-card): hover-only Detail + add Runtime row + breathing room

Three small polish tweaks to the agent hover card:

- Detail link gets `mr-1` + fades in only on card hover (group-hover).
  It was visually flush against the popover edge and competing for
  attention; now it stays out of the way during a quick glance and
  surfaces only when the user is dwelling on the card.

- Runtime row is back, in the meta block (cloud/local icon + runtime
  name). The earlier removal was over-aggressive — knowing where an
  agent runs is part of "who is this agent". The wifi badge stays
  dropped because the availability dot in the header already conveys
  reachability.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(runtime): wifi-style health icon (4-state) for runtime list + agent card

Replaces the 6px coloured dot with a wifi-shape icon that carries both
state (Wifi vs WifiOff) and severity (success/warning/muted/destructive).

Mapping:
- online        → Wifi (success)
- recently_lost → WifiHigh (warning) — transient hiccup, fewer bars
- offline       → WifiOff (muted)    — long unreachable
- about_to_gc   → WifiOff (destructive) — sweeper coming soon

Used in two places:

- Runtime list: replaces HealthDot in the dedicated leading-icon column.
  Bumped the column from 0.5rem (dot-sized) to 0.875rem (icon-sized).

- Agent profile card RuntimeRow: derives runtime health from runtime +
  clock (matching the 4-state semantics) and renders HealthIcon next
  to the runtime name. Cloud runtimes always read as online. The
  duplicate signal with the header availability dot is intentional —
  it confirms WHICH runtime is the one currently in the dot's state.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 19:21:13 +08:00