37 Commits

Author SHA1 Message Date
LinYushen
cb68669c73 feat(composio): gate MCP apps behind feature flag (#4876)
* feat(composio): server-side connect flow + connections REST (Notion MVP) (MUL-3720) (#4608)

* feat(composio): server-side connect flow + connections REST (Notion MVP) (MUL-3720)

Compose the merged server/pkg/composio SDK into a user-facing connection
manager: signed-state connect handshake, local user_composio_connection
mirror, idempotent disconnect, and a per-user MCP session helper (not yet
wired into task dispatch).

- migration 127_user_composio_connection (no FK/cascade, per DB rules)
- sqlc queries: upsert (idempotent on user_id+connected_account_id), list
  active, owner-scoped get, mark revoked
- internal/integrations/composio: signed HMAC-SHA256 state, BeginConnect,
  CompleteCallback (idempotent upsert), ListConnections, Disconnect
  (upstream 404 = idempotent success), CreateMCPSession (no-op when empty,
  pins connected_accounts per toolkit), CallbackRedirect
- REST handlers under /api/integrations/composio (user-scoped, 503 when
  COMPOSIO_API_KEY unset): connect/init, callback (302), connections list,
  delete
- router wiring gated by COMPOSIO_API_KEY; COMPOSIO_AUTH_CONFIGS_JSON maps
  toolkit->auth_config (MVP: notion); state secret from COMPOSIO_STATE_SECRET
  or derived from JWT_SECRET; callback base from COMPOSIO_CALLBACK_BASE_URL
  or MULTICA_PUBLIC_URL
- tests: state (expire/tamper/wrong-secret), service (mapping, callback
  idempotency, non-success, disconnect owner/404 idempotency, MCP pin),
  handlers (httptest), redact regression for Bearer mcp_ tokens

MVP scope: Notion only; no task-dispatch overlay, sharing, or webhook
event handling (later stages).

Co-authored-by: multica-agent <github@multica.ai>

* fix(composio): bind callback account to user + idempotent revoked disconnect (MUL-3720)

Address PR 4608 review (CHANGES_REQUESTED):

- callback: verify connected_account_id with Composio before mirroring it.
  The signed state only proved user/toolkit/exp, so a valid state paired with
  a tampered connected_account_id would be written verbatim. CompleteCallback
  now calls ListConnectedAccounts and fails closed (ErrAccountVerification)
  unless the account belongs to the state's user (composio_user_id == multica
  user id) and was created under the toolkit's auth config. No row is written
  on mismatch / unknown account / upstream error.

- disconnect: short-circuit to a no-op when the local row is already revoked,
  before touching upstream. Previously a second DELETE re-hit Composio and a
  non-404 upstream error surfaced as a 502, breaking the 204-idempotent
  contract.

- CreateMCPSession: document the v1 single-active-connection-per-(user,toolkit)
  constraint and make duplicate selection deterministic (newest-wins, rows are
  connected_at DESC) instead of order-dependent map overwrite. Stage 3 owns the
  real single-account-enforcement vs multi-account-shape decision.

Tests: tampered/wrong-auth-config/unknown-account callback rejection, revoked-row
disconnect no-op (asserts upstream not re-hit). composio pkg 85% coverage; all
green.
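
The disconnect short-circuit can be sketched as follows. This is a simplified illustration, not the service code: `connection`, `errUpstreamNotFound`, and the `deleteUpstream` callback are hypothetical stand-ins for the local mirror row, the SDK's 404 error, and the Composio delete call.

```go
package main

import "errors"

// errUpstreamNotFound stands in for an upstream 404 (hypothetical sentinel).
var errUpstreamNotFound = errors.New("upstream: connected account not found")

// connection is a hypothetical view of the local mirror row.
type connection struct {
	AccountID string
	Revoked   bool
}

// disconnect is idempotent: an already-revoked row returns before upstream
// is touched, and an upstream 404 is treated as success.
func disconnect(conn *connection, deleteUpstream func(accountID string) error) error {
	if conn.Revoked {
		return nil // second DELETE: no upstream call, still a success
	}
	err := deleteUpstream(conn.AccountID)
	if err != nil && !errors.Is(err, errUpstreamNotFound) {
		return err // a real upstream failure on an active row still surfaces
	}
	conn.Revoked = true
	return nil
}
```

Because the revoked check comes first, a repeated delete can never turn into a 502 caused by a transient upstream error.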

Co-authored-by: multica-agent <github@multica.ai>

* feat(composio): list all toolkits + dynamic auth-config resolution (MUL-3720)

Yushen's follow-up to the Notion MVP: surface the full Composio toolkit
catalog, render it in Settings, and drop the static env mapping in favor of
dynamic auth-config discovery.

Config correctness (per Composio docs):
- Remove COMPOSIO_AUTH_CONFIGS_JSON entirely. The toolkit→auth_config mapping
  is now resolved at request time from the project's /auth_configs (cached,
  5-min TTL), so enabling a toolkit is a dashboard action, not a redeploy.
- Do NOT add COMPOSIO_PROJECT_ID. The project API key (x-api-key) authenticates
  to exactly one project; the project is resolved from the key. Only org-level
  endpoints use x-org-api-key, which this integration never calls.

Backend:
- SDK: server/pkg/composio/auth_configs.go — ListAuthConfigs (toolkit_slug,
  is_composio_managed, show_disabled, limit, cursor).
- service: dynamic resolver (authConfigMap cache; betterAuthConfig prefers a
  custom/white-label config over Composio-managed, newest wins); BeginConnect
  and CompleteCallback resolve via it; ListToolkits fetches the full catalog
  (paginated, capped) annotated with connectable = has an enabled auth config,
  connectable-first ordering.
- handler + route: GET /api/integrations/composio/toolkits (user-scoped, 503
  when COMPOSIO_API_KEY unset) returning slug/name/logo/category/connectable.

Frontend:
- core: ComposioToolkit/ComposioConnection types, api client methods, and
  composio query options (@multica/core/composio).
- views: Settings → Integrations now has a Composio section rendering every
  toolkit as a card with search. Connect is gated on `connectable`;
  non-connectable toolkits show a muted "not configured" hint instead of a
  dead button. Connected toolkits show a badge + Disconnect (with confirm).
- i18n: composio block added to en/zh-Hans/ja/ko settings.

Tests: SDK + service (dynamic resolution, custom-over-managed preference,
connectable flag, resolver-error soft-degrade) and handler toolkits endpoint;
composio pkg 85.7% coverage. go build/vet/gofmt clean; core+views typecheck,
core+views lint, and core tests (691) all green.

Co-authored-by: multica-agent <github@multica.ai>

* fix(composio): close cross-toolkit callback fail-open by signing auth_config_id into state (MUL-3720)

Re-review blocker: CompleteCallback resolved the toolkit's auth config at
callback time and ignored a resolve error/empty result, while
verifyAccountOwnership skipped the auth-config comparison when the expected
value was empty. A user could then pass another toolkit's connected_account_id
into this toolkit's callback — the owner check passed and it was written under
the wrong toolkit_slug/account binding.

Fix: the auth_config_id is already resolved in BeginConnect (before the state
is signed), so sign it into the state and compare it exactly at callback. No
re-resolve, no fail-open. verifyAccountOwnership now fails closed when the
expected auth config is empty (rejects instead of skipping) and requires an
exact match — closing the cross-toolkit binding gap.

Tests: state round-trips auth_config_id; BeginConnect signs it; callback
rejects wrong/cross-toolkit auth config and an empty (no-mapping) auth config
fails closed. composio pkg 85.2% coverage, all green.
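
The fail-closed comparison can be illustrated with a short Go sketch. `verifyAccountOwnership` and `ErrAccountVerification` are named in the commits above, but the signature and the `connectedAccount` struct below are assumptions made for the example.

```go
package main

import "errors"

// ErrAccountVerification is returned for every rejection so the caller
// writes no row on any mismatch.
var ErrAccountVerification = errors.New("connected account verification failed")

// connectedAccount is a hypothetical projection of an upstream account.
type connectedAccount struct {
	ID           string
	UserID       string
	AuthConfigID string
}

// verifyAccountOwnership accepts only an account that exists upstream,
// belongs to the user in the signed state, and was created under exactly
// the auth config signed into that state. An empty expectation rejects
// instead of skipping the comparison.
func verifyAccountOwnership(accounts []connectedAccount, accountID, wantUserID, wantAuthConfigID string) error {
	if accountID == "" || wantUserID == "" || wantAuthConfigID == "" {
		return ErrAccountVerification // fail closed on missing expectations
	}
	for _, a := range accounts {
		if a.ID != accountID {
			continue
		}
		if a.UserID == wantUserID && a.AuthConfigID == wantAuthConfigID {
			return nil
		}
		return ErrAccountVerification
	}
	return ErrAccountVerification // unknown account
}
```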

Frontend (non-blocking): the Composio settings tab now surfaces an error when
the connections query fails instead of silently rendering everything as
unconnected.

Co-authored-by: multica-agent <github@multica.ai>

* fix(composio): hide Settings section entirely when integration unconfigured (MUL-3720)

Decision (option 2, hide-then-merge): don't show a card that leaks the internal
COMPOSIO_API_KEY env-var name to every end user. IntegrationsTab now gates the
whole Composio section (heading + body) on the toolkits query — a 503 means the
key is unset, so the section is withheld instead of rendering the not-configured
card. Admin-only setup guidance is a later, role-gated affordance.

Removed the notConfigured card (and now-unused ApiError import) from
ComposioTab; it only mounts when configured. views typecheck + lint clean.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>

* feat(composio): Stage 2 frontend polish — callback toast, last_used & expired UI, e2e (MUL-3718) (#4688)

* feat(composio): callback toast + refresh, last_used & expired UI, e2e (MUL-3718)

Co-authored-by: multica-agent <github@multica.ai>

* fix(composio): real callback redirect route + StrictMode-safe toast dedup (MUL-3718 review)

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>

* fix(composio): callback endpoint should not require Multica auth (MUL-3843) (#4709)

* fix(composio): move OAuth callback out of the Auth group (MUL-3843)

Composio 302-redirects the browser to /api/integrations/composio/callback
at the end of the OAuth flow, but PR #4608 mounted it inside the cookie-auth
middleware group. When the session cookie is absent (expired session,
SameSite=Strict / Safari ITP, private window, self-hosted callback subdomain)
the Auth middleware returned a hard 401 and a JSON blob instead of the
settings redirect, breaking the flow.

Identity never came from the cookie anyway: it is carried by the HMAC-signed
state param that CompleteCallback verifies (signature, expiry, replay) and
cross-checked by verifyAccountOwnership; h.Composio == nil still 503s. So the
callback is registered alongside the other public OAuth/webhook routes; the
other four composio endpoints stay session-gated.

Refs MUL-3843, MUL-3715.

Co-authored-by: multica-agent <github@multica.ai>

* fix(composio): correct stale callback routing comments (MUL-3843)

The package header and ComposioCallback doc comments still described the
callback as sitting under the Auth middleware group. After the route was
moved out (this PR), update both to state it is a public route whose identity
comes from the signed state — addressing review nit from 张大彪.

Refs MUL-3843.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>

* feat(composio): inject MCP overlay into agent runtime at task dispatch (MUL-3721) (#4704)

Stage 3 of the Composio epic. Wires the per-user Composio MCP session into
every agent task so the agent process sees the initiator's connected tools
without any prompt-time plumbing.

Server side
  - Migration 128 adds agent_task_queue.runtime_mcp_overlay JSONB plus a
    BEFORE-UPDATE trigger that wipes the column on any transition into a
    terminal status (completed / failed / cancelled). A trigger is the single
    source of truth — future queries that flip status cannot bypass it.
  - composio.Service.BuildTaskOverlay(userID) reuses CreateMCPSession and
    emits the Claude-style { mcpServers: { composio: { type: http, url,
    headers } } } shape the daemon's existing sidecar generators consume.
    Returns (nil, nil) on zero active connections so we never burn a
    Composio session for a user with nothing to call.
  - TaskService grows a Composio ComposioOverlayBuilder seam, wired in
    router.go after composiointeg.NewService succeeds. Five enqueue paths
    (issue / mention / quick-create / chat / auto-retry) attach the overlay
    after CreateAgentTask returns and before the daemon is notified — so
    every claim reads a settled row, with no second daemon hop. Best-effort:
    a builder failure logs and proceeds with no overlay.
  - resolveInitiatorFromTriggerComment derives the initiator user from the
    trigger comment when it was authored by a member. Agent-authored
    triggers are not treated as initiators (their connected-apps view is
    empty by construction).

Daemon side
  - handler/daemon.go claim path merges task.runtime_mcp_overlay onto
    agent.mcp_config via mergeMCPOverlay before populating
    TaskAgentData.McpConfig. Overlay wins on server-name collisions
    because it carries the live user-scoped session URL. Errors fall back
    to the agent config unchanged — a bad overlay must not surprise-disable
    saved MCP tools. The existing execenv sidecar generators (cursor /
    codex / openclaw / opencode / hermes-kiro) need no changes: they keep
    consuming the merged result through TaskAgentData.McpConfig.
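
A simplified Go sketch of the merge rule described under "Daemon side". `mergeMCPOverlay` is the name used in the commit, but this body, the `mcpConfigForTask` wrapper, and the byte-slice signature are assumptions; the real function also rejects non-object server entries, which is omitted here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mergeMCPOverlay copies the overlay's mcpServers entries onto the agent
// config. Overlay entries win on name collisions; other top-level keys of
// the agent config are preserved.
func mergeMCPOverlay(agentCfg, overlay []byte) ([]byte, error) {
	if len(overlay) == 0 {
		return agentCfg, nil // nothing to merge
	}
	top := map[string]json.RawMessage{}
	if len(agentCfg) > 0 {
		if err := json.Unmarshal(agentCfg, &top); err != nil {
			return nil, fmt.Errorf("agent mcp_config: %w", err)
		}
	}
	if top == nil { // agent config was JSON null
		top = map[string]json.RawMessage{}
	}
	servers := map[string]json.RawMessage{}
	if raw, ok := top["mcpServers"]; ok {
		if err := json.Unmarshal(raw, &servers); err != nil {
			return nil, fmt.Errorf("agent mcpServers: %w", err)
		}
	}
	if servers == nil { // mcpServers was JSON null
		servers = map[string]json.RawMessage{}
	}
	var over struct {
		Servers map[string]json.RawMessage `json:"mcpServers"`
	}
	if err := json.Unmarshal(overlay, &over); err != nil {
		return nil, fmt.Errorf("overlay: %w", err)
	}
	for name, srv := range over.Servers {
		servers[name] = srv // overlay wins
	}
	merged, err := json.Marshal(servers)
	if err != nil {
		return nil, err
	}
	top["mcpServers"] = merged
	return json.Marshal(top)
}

// mcpConfigForTask is a hypothetical caller: on any merge error it falls
// back to the agent config unchanged.
func mcpConfigForTask(agentCfg, overlay []byte) []byte {
	out, err := mergeMCPOverlay(agentCfg, overlay)
	if err != nil {
		return agentCfg
	}
	return out
}
```

Keeping the fallback in the caller means a malformed overlay costs the task its Composio tools but never the agent's saved MCP servers.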

Tests
  - 9 merge cases (mcp_overlay_test): both-nil short-circuit, agent-only
    pass-through, overlay-only canonicalization, two-side merge, name
    collision (overlay wins), top-level key preservation, malformed agent
    fallback, malformed overlay fallback, non-object server rejection.
  - 4 dispatch cases (composio): zero-connections returns nil without
    CreateSession, happy-path emits the right shape with the right user
    id, empty-URL defensive branch, SDK error surfacing.
  - 4 TaskService helper cases: nil Composio is a no-op (Queries-safe),
    invalid initiator does not call the builder, nil overlay skips the
    UPDATE, builder error swallowed without panic.
  - Migration 128 verified to apply up + down + up cleanly against the test
    database.

Out of scope (deferred): assignment-triggered enqueue paths with no
trigger comment get no overlay attached today (no initiator UUID flows
through enqueueIssueTask in that case). Retry paths recompute the overlay
fresh from the parent's initiator_user_id instead of inheriting the bearer
from the parent row, so a stale token can never resurface on a retry.

Co-authored-by: Eve <eve@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>

* feat(composio): per-agent allowlist + originator-scoped MCP overlay (MUL-3869) (#4736)

* feat(composio): per-agent allowlist + originator-scoped MCP overlay (MUL-3869)

Stage 3.1 of the Composio epic (MUL-3721 parent). PR #4704 wired in the
runtime_mcp_overlay column and a per-task dispatch hook; this change
inverts the default from "all-on" to opt-in and locks the overlay to the
agent owner's own connected apps:

- Agents carry composio_toolkit_allowlist TEXT[]. NULL or [] => no MCP.
  Owner-only read/write; non-owner GET/PUT silently redacts/drops the
  field (same shape as mcp_config).
- agent_task_queue carries originator_user_id UUID. Set from the
  top-of-chain HUMAN at every enqueue path:
    * issue/mention comment by member  -> author_id
    * issue/mention comment by agent   -> inherit via comment.source_task_id
                                          -> parent task originator_user_id
    * quick-create                     -> requester_id
    * chat                             -> initiator_user_id
    * retry                            -> SQL-inherited from parent row
    * autopilot                        -> NULL (system-driven)
- BuildTaskOverlay (composio dispatch) now takes (ctx, originatorUserID,
  agent) and short-circuits on five gates: invalid originator,
  originator != agent.owner_id, empty allowlist, empty intersection of
  allowlist ∩ active connections, defensive empty session URL. Composio
  CreateSession is called with BOTH `toolkits.slugs` (the intersection)
  AND `connected_accounts` (the pinned account ids), narrowing the
  tool-router twice.
- The originator-vs-owner gate closes the agent-fanout privacy hole: any
  workspace member who can @-mention a public agent used to project the
  owner's connected apps into their run. Now the overlay only mounts
  when the human at the top of the chain IS the agent owner.
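
The allowlist ∩ active-connections gate can be sketched in Go as below. The function name `allowedConnections`, the shape of the `active` map (normalised slug to connected account id), and the return values are assumptions for illustration; the commit only states that slugs are normalised and that both the slug list and the pinned account ids are sent to CreateSession.

```go
package main

import (
	"sort"
	"strings"
)

// normaliseSlug lower-cases and trims a toolkit slug.
func normaliseSlug(s string) string {
	return strings.ToLower(strings.TrimSpace(s))
}

// allowedConnections returns the toolkits that are both allowlisted on the
// agent and actively connected, plus the account id pinned for each one.
// It returns (nil, nil) when the intersection is empty, in which case no
// session should be created.
func allowedConnections(allowlist []string, active map[string]string) ([]string, map[string]string) {
	pinned := make(map[string]string)
	for _, raw := range allowlist {
		slug := normaliseSlug(raw)
		if slug == "" {
			continue
		}
		if accountID, ok := active[slug]; ok {
			pinned[slug] = accountID
		}
	}
	if len(pinned) == 0 {
		return nil, nil
	}
	slugs := make([]string, 0, len(pinned))
	for slug := range pinned {
		slugs = append(slugs, slug)
	}
	sort.Strings(slugs) // deterministic order for the session request
	return slugs, pinned
}
```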

Tests:
- dispatch_test.go covers all 5 gates plus uppercase/whitespace slug
  normalisation.
- task_runtime_mcp_overlay_test.go covers the no-op gates of the new
  applyRuntimeMCPOverlay signature.
- agent_composio_allowlist_test.go (handler): owner roundtrip
  (list/empty/null), workspace-admin silent-drop, owner-only GET
  visibility, pure normaliseComposioToolkitAllowlist.
- resolve_originator_test.go (service, DB-backed): member-authored,
  agent-authored inherits via comment.source_task_id, invalid id.

Migration 129 up/down/up verified against docker postgres.

Co-authored-by: multica-agent <github@multica.ai>

* chore(composio): gofmt + regenerate sqlc with v1.31.1 (MUL-3869 review nits)

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>

* fix(composio): accept nested connected account auth config

* feat(views): creator-only MCP tab for per-agent Composio allowlist (MUL-3870) (#4743)

Stage 3.2 frontend on top of the Stage 3.1 backend (MUL-3869, 4708dba97).
Adds an agent-detail tab that lets the agent owner pick which of their own
active Composio connections this agent may mount as MCP servers, writing the
selection to agent.composio_toolkit_allowlist via the existing PUT /api/agents.

- core/types: composio_toolkit_allowlist (+ _redacted) on Agent; tri-state
  composio_toolkit_allowlist on UpdateAgentRequest (omit/no-change, null/clear,
  array/replace), matching the backend contract.
- core/agents: useUpdateAgentAllowlist - optimistic mutation hook (patches the
  cached workspace agent list, rolls back on error, invalidates on settle).
- views: AgentMcpTab renders the owner's active connections as checkboxes;
  empty state links to Settings -> Integrations; defensive redacted state.
- views: wired into AgentOverviewPane as tab "composio_mcp", labeled "MCP Apps"
  to disambiguate from the existing raw-JSON "MCP" (mcp_config) tab. The entry
  is gated to the creator (currentUserId === agent.owner_id), matching the
  backend's owner-only read/write of the allowlist.
- i18n: tabs.composio_mcp + tab_body.composio_mcp.* in en/ja/ko/zh-Hans.
- tests: agent-mcp-tab.test.tsx (gating, toggle->allowlist body, active-only,
  empty, redacted); e2e/agent-mcp.spec.ts (creator sees tab + PUT body,
  non-creator hidden) with Composio + agent endpoints mocked at the boundary.
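
On the backend side, the tri-state contract (omit / null / array) could be decoded along these lines in Go. This is a sketch under assumptions: `optionalStrings`, `updateAgentRequest`, and `describeAllowlistChange` are hypothetical names, and the real handler may represent the three states differently.

```go
package main

import "encoding/json"

// optionalStrings distinguishes "key absent" from "key present with null"
// from "key present with an array".
type optionalStrings struct {
	Present bool     // true when the key appeared in the JSON body
	Value   []string // nil when the key was null
}

// UnmarshalJSON is only invoked when the key is present, including for null.
func (o *optionalStrings) UnmarshalJSON(b []byte) error {
	o.Present = true
	if string(b) == "null" {
		o.Value = nil
		return nil
	}
	return json.Unmarshal(b, &o.Value)
}

// updateAgentRequest is a hypothetical slice of the PUT /api/agents body.
type updateAgentRequest struct {
	ComposioToolkitAllowlist optionalStrings `json:"composio_toolkit_allowlist"`
}

// describeAllowlistChange maps a request body to one of the three states.
func describeAllowlistChange(body []byte) (string, error) {
	var req updateAgentRequest
	if err := json.Unmarshal(body, &req); err != nil {
		return "", err
	}
	switch {
	case !req.ComposioToolkitAllowlist.Present:
		return "no-change", nil
	case req.ComposioToolkitAllowlist.Value == nil:
		return "clear", nil
	default:
		return "replace", nil
	}
}
```

Note that an empty array counts as "replace" here; per the Stage 3.1 rules, NULL and [] both result in no MCP being mounted.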

Note: the product spec says "creator"; the schema has no creator_id - the
backend gate and redaction are keyed on owner_id, so the tab uses owner_id.

Co-authored-by: multica-agent <github@multica.ai>

* fix(composio): mount remote MCP for codex

* feat(agents): agent invocation permission system (MUL-3963) (#4844)

* feat(agents): agent invocation permission system (permission_mode + invocation targets)

MUL-3963: split who may INVOKE an agent out of the overloaded visibility
column into an explicit, extensible model on feature/composio-integration.

- DB: agent.permission_mode (private|public_to) + agent_invocation_target
  table (workspace/member/team targets) + lossless backfill from visibility
  (migration 130).
- canInvokeAgent: owner-only for private (NO admin bypass, NO A2A bypass);
  public_to honours the allow-list; A2A judged by the top-of-chain originator.
- All trigger paths rewired: issue assign, comment @agent/@squad, chat,
  quick-create, autopilot, squad leader, child-done.
- Agent API: permission_mode + invocation_targets on responses and
  create/update (owner-only writes); legacy visibility kept as a derived field
  so old clients never see a permission widening.
- Composio: BuildTaskOverlay now FOLLOWS invocation permission and uses the
  agent OWNER connection (removed the originator==owner gate); front-end warns
  when a shared agent enables Composio apps.
- CLI: --permission-mode / --public-to-workspace / --public-to-member (legacy
  --visibility still mapped).
- Frontend: AccessPicker (Private / workspace / specific people / team soon),
  permission rules mirror canInvokeAgent, Composio warning banner.
- Tests: migration backfill, admin cannot invoke others private, public_to
  workspace/member whitelist, A2A by originator, Composio overlay uses owner
  connection.

Co-authored-by: multica-agent <github@multica.ai>

* feat(agents): stackable, mixed public_to invocation targets (MUL-3963)

Follow-up on PR #4844: public_to now supports selecting MULTIPLE, MIXED
targets on one agent (e.g. Public to workspace + specific people + team),
with canInvokeAgent admitting on ANY matching target (OR).

- Frontend AccessPicker: reworked from a single exclusive kind into a
  stackable multi-select — an "Everyone in workspace" toggle, a member
  multi-select checklist, and a (disabled, v1) team placeholder can be
  combined freely. Emits the full union of selected targets; empty union
  collapses to Private. Existing team targets are preserved across saves.
  Added the access.public_group locale string (en/zh-Hans/ja/ko).
- Backend already supported this (agent_invocation_target is multi-row per
  agent; create/update take a target ARRAY and batch-replace the whole
  allow-list; canInvokeAgent OR-matches). Added tests to lock it in:
  mixed member+team targets, overlapping-member batch replace, and
  workspace+member stacking then narrowing.
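
The OR-matching rule can be illustrated with a small Go sketch. `canInvokeAgent` is the name used in these commits, but the types, the signature, and the way team membership is passed in are assumptions; workspace membership is assumed to be checked by the caller, and the owner is assumed to be admitted in every mode.

```go
package main

// invocationTarget is a hypothetical row of agent_invocation_target.
type invocationTarget struct {
	Kind string // "workspace", "member", or "team"
	ID   string // member or team id; unused for workspace targets
}

// agent carries only the fields this sketch needs.
type agent struct {
	OwnerID        string
	PermissionMode string // "private" or "public_to"
	Targets        []invocationTarget
}

// canInvokeAgent admits the owner, rejects everyone else on a private
// agent (no admin bypass), and for public_to admits on ANY matching target.
func canInvokeAgent(a agent, originatorID string, originatorTeams map[string]bool) bool {
	if originatorID != "" && originatorID == a.OwnerID {
		return true
	}
	if a.PermissionMode != "public_to" {
		return false
	}
	for _, t := range a.Targets {
		switch t.Kind {
		case "workspace":
			return true
		case "member":
			if originatorID != "" && t.ID == originatorID {
				return true
			}
		case "team":
			if originatorTeams[t.ID] {
				return true
			}
		}
	}
	return false
}
```

For agent-to-agent calls, `originatorID` would be the top-of-chain human rather than the calling agent, matching the A2A rule stated in the first commit of this PR.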

Refs MUL-3963.

Co-authored-by: multica-agent <github@multica.ai>

* fix(agents): address review on invocation permission (MUL-3963)

张大彪 review on PR #4844 — three blockers + product ruling + nits:

1. Migration 130: drop the FK/cascade on agent_invocation_target
   (agent_id, created_by) per the Multica no-FK rule; relationships are now
   maintained in the app layer (matching MUL-3515 §4). Added
   DeleteAgentInvocationTargetsByArchivedRuntimeAgents and call it before
   DeleteArchivedAgentsByRuntime in all three runtime-delete paths
   (runtime.go x2, runtime_profile.go) so hard-deleting agents can't orphan
   target rows.
2. revokeAndRemoveMember: prune the leaving member's member-target grants
   (DeleteAgentInvocationTargetsByMember) in the same tx as the member-row
   delete, so a re-invited user can't reclaim a stale invocation grant.
3. Empty public_to is a phantom — parsePermissionInput now normalises a
   public_to with no resolvable targets to a single workspace target, so
   `--permission-mode public_to` alone (and any empty target array) means
   "public to workspace" instead of "shared but nobody can run it".

Product ruling: the system/no-human-originator → workspace-target path in
canInvokeAgent is a deliberate, documented exception (webhook/system/
workspace-wide automation); member/team targets still fail closed without a
resolved originator. Documented in code + locked with a test.
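
The empty-public_to normalisation from item 3 amounts to a small rule, sketched here in Go. `normaliseTargets` and `invocationTarget` are hypothetical names; in the change itself the logic lives inside parsePermissionInput, and the treatment of private mode below is an assumption.

```go
package main

// invocationTarget is a hypothetical target shape for this sketch.
type invocationTarget struct {
	Kind string // "workspace", "member", or "team"
	ID   string
}

// normaliseTargets turns a public_to request with no targets into a single
// workspace target, so "shared with nobody" cannot be stored. Private
// agents carry no targets in this sketch.
func normaliseTargets(mode string, targets []invocationTarget) []invocationTarget {
	if mode != "public_to" {
		return nil
	}
	if len(targets) == 0 {
		return []invocationTarget{{Kind: "workspace"}}
	}
	return targets
}
```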

Nits: refreshed the stale "originator must be owner" comments — models.go
(via migration 130 COMMENT ON COLUMN + sqlc regen for composio_toolkit_allowlist
and originator_user_id) and agent-mcp-tab.tsx — to the owner-connection +
invocation-permission rules.

Tests: member remove/re-add regression, system workspace exception + member
fail-closed, empty public_to → workspace (plus the earlier mixed/overlap/
batch-replace suite). Migration 130 applied to the test DB; Go handler/service/
composio suites green; views typecheck clean.

Refs MUL-3963.

Co-authored-by: multica-agent <github@multica.ai>

* fix(agents): scope member invocation-target cleanup to one workspace (MUL-3963)

张大彪 3rd review — cross-workspace permission bug + comment nits:

- DeleteAgentInvocationTargetsByMember was a GLOBAL delete by user id, so
  removing a user from workspace A also wiped their member-target grants on
  agents in workspace B. Scoped it to a single workspace by joining through
  agent.workspace_id; revokeAndRemoveMember now passes (workspaceID, userID).
- Regression test TestRevokeMember_InvocationTargetCleanupIsWorkspaceScoped:
  same user allow-listed by agents in two workspaces; removal from one leaves
  the other workspace's target intact.
- Nits: refreshed the remaining stale "originator == agent.owner_id" /
  "owner-vs-originator" comments — CreateRetryTask (agent.sql, regenerated),
  and the AgentResponse allowlist doc + ListAgents/UpdateAgent redaction
  rationale in agent.go — to the owner-connection + invocation-permission rule.

Migration 130 applied to the test DB; Go handler/service/composio suites green;
go vet clean.

Refs MUL-3963.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>

* fix(agents): agent access owner-only editable, read-only for others (MUL-3963) (#4853)

* fix(agents): make agent access owner-only editable, read-only for others (MUL-3963)

Interaction bug: a non-owner (incl. workspace admin) could open the AccessPicker
and set an agent public — the backend silently ignored it and the UI bounced
back to private. Access is owner-only, so non-owners must see a read-only state
and the backend must reject real changes explicitly.

Frontend:
- AccessPicker renders a static, non-interactive read-only state when the
  viewer is not the owner: the current access value + a lock affordance + a
  tooltip "Only the agent owner can change who can run this agent." No clickable
  trigger is rendered, so a non-owner can never open a control the backend would
  reject (the GitHub/Notion pattern for permission settings you can see but not
  edit). The editable multi-select picker is unchanged for the owner.
- agent-detail-inspector gates the picker on ownership specifically
  (currentUserId === agent.owner_id), NOT the general canEdit (which also admits
  admins, who may edit other fields but not access).
- New locale key access.owner_only_readonly (en/zh-Hans/ja/ko).

Backend:
- UpdateAgent now returns an explicit 403 when a non-owner submits a REAL
  permission change (permissionInputChangesAgent compares requested mode +
  target set against the persisted state); a no-op resubmit (admin PATCH-as-PUT
  echoing unchanged permission) is still tolerated so admin edits of other
  fields keep working. Replaces the previous silent-drop that caused the bounce.

Tests:
- access-picker.test.tsx: non-owner gets a non-interactive read-only display
  with the owner-only tooltip; owner gets an interactive picker; owner can pick
  a member and stack workspace + member.
- TestUpdateAgent_AccessChangeIsOwnerOnly: admin real change → 403; admin no-op
  resubmit → 200; admin editing other fields → 200; owner change → 200.

Incidental: fixed a pre-existing base typecheck break in
slash-command-suggestion.test.tsx (stray `signal` arg not in the suggestion
items type) that otherwise fails the whole @multica/views typecheck.

Refs MUL-3963.

Co-authored-by: multica-agent <github@multica.ai>

* fix(agents): compare legacy visibility, not expanded permission, for no-op detection (MUL-3963)

PR #4853 review: permissionInputChangesAgent expanded a legacy-only
visibility:"private" into a real private permission and compared it against the
agent's actual permission. A member-only public_to agent derives legacy
visibility "private", so an admin PATCH-as-PUT echoing visibility:"private"
while editing another field was misread as a public_to→private downgrade and
rejected with 403 — contradicting the "unchanged permission no-op is allowed"
contract.

Fix (per review): when a request carries ONLY legacy `visibility` (no
permission_mode / invocation_targets), derive the agent's CURRENT legacy
visibility from its real targets and compare the legacy string values. Equal =
no-op (allowed); a real legacy change (e.g. "workspace") still returns 403.
Requests that carry permission_mode / invocation_targets keep the precise
mode+target comparison.
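
A short Go sketch of the legacy-only comparison. The function names and the `invocationTarget` type are hypothetical; the derivation rule (only a workspace target maps to legacy "workspace", everything else to "private") is inferred from the member-only example in this commit message.

```go
package main

// invocationTarget is a hypothetical target shape for this sketch.
type invocationTarget struct {
	Kind string // "workspace", "member", or "team"
	ID   string
}

// legacyVisibility derives the legacy visibility string from the real
// permission state. A member-only public_to agent derives "private".
func legacyVisibility(mode string, targets []invocationTarget) string {
	if mode == "public_to" {
		for _, t := range targets {
			if t.Kind == "workspace" {
				return "workspace"
			}
		}
	}
	return "private"
}

// legacyOnlyRequestChangesAgent reports whether a request that carries only
// the legacy visibility field asks for a real change. Equal strings are a
// no-op and stay allowed for non-owners.
func legacyOnlyRequestChangesAgent(requested, mode string, targets []invocationTarget) bool {
	return requested != legacyVisibility(mode, targets)
}
```

Comparing at the legacy level keeps an admin's PATCH-as-PUT echo from being misread as a downgrade, while a genuine legacy change is still detected and rejected.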

Regression test TestUpdateAgent_LegacyVisibilityNoOpForMemberOnlyPublicTo:
member-only public_to agent — admin submitting visibility:"private" + a
non-permission field → 200 with targets unchanged; admin submitting
visibility:"workspace" → 403.

Go handler/composio suites green; migration 130 applied; go vet clean.

Refs MUL-3963.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>

* feat(composio): brief agents on connected apps

* feat(composio): gate MCP apps behind feature flag

* fix(mobile): parse agent invocation permissions

* fix(tests): update agent fixtures for access fields

---------

Co-authored-by: multica-agent <github@multica.ai>
Co-authored-by: Multica Eve <eve@devv.ai>
Co-authored-by: Eve <eve@multica.ai>
Co-authored-by: Eve <eve@multica-ai.local>
2026-07-03 14:18:43 +08:00
Multica Eve
5d79696fb5 MUL-3794: rewrite comment routing cascade 2026-06-30 12:24:57 +08:00
LinYushen
3692b6a862 fix(squad): inject leader briefing by task flag, not issue assignee (MUL-3730) (#4606)
* fix(squad): inject leader briefing by task flag, not issue assignee

Key squad-leader briefing injection off task.IsLeaderTask + task.SquadID
instead of issue.AssigneeType=='squad'. The old gate missed the most common
path — an @squad mention in a comment on an issue assigned to a plain agent
(MUL-3724) — so the leader booted with zero squad context and did the work
itself instead of orchestrating.

- migration 127: add agent_task_queue.squad_id (no FK) + partial index
- sqlc: CreateAgentTask stamps squad_id; CreateRetryTask inherits it
- service: thread squadID through EnqueueTaskForSquadLeader(+WithHandoff),
  enqueueMentionTask, and the rerun path; all 5 call sites pass the squad id
- daemon claim: unified injection keyed on leader-task + squad_id, with a
  defensive leader-identity re-check; quick-create block retained (it serves
  issue-less tasks and sets resp.SquadID/SquadName)
- briefing: strengthen leader Operating Protocol opening
- tests: claim-time injection (comment-mention/non-leader/null-squad),
  squad_id enqueue stamping, retry inheritance; existing fixture updated

Co-authored-by: multica-agent <github@multica.ai>

* test+docs(squad): dangling squad_id regression + clarify quick-create path

Address review nits on #4606:
- Add TestClaim_LeaderTaskWithDanglingSquadID_NoBriefing: squad hard-deleted
  after enqueue leaves task.squad_id dangling (no FK); claim still 200 and
  skips injection via the err!=nil guard. This is the load-bearing contract
  for dropping the FK.
- Rewrite the daemon.go injection comment to state quick-create does NOT use
  the is_leader_task/squad_id columns — it routes squad via the context JSON
  branch (qc.SquadID) and must not be folded into the column-based path.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: 魏和尚 <agent@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-26 16:01:33 +08:00
Bohan Jiang
5038c983c0 MUL-3281: Add daemon skill bundle refs (#4445)
* feat: add daemon skill bundle refs

Co-authored-by: multica-agent <github@multica.ai>

* fix: tighten skill bundle resolve safeguards

Co-authored-by: multica-agent <github@multica.ai>

* feat: add task prepare lease

Co-authored-by: multica-agent <github@multica.ai>

* fix: isolate prepare lease concurrent index migration

Co-authored-by: multica-agent <github@multica.ai>

* fix: keep prepare lease active through start

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-23 16:19:16 +08:00
Naiyuan Qing
4ab335b8a5 MUL-3416: Issue pre-trigger preview + Handoff Note (#4383)
* feat(issues): unify run-enqueue decision behind WillEnqueueRun + preview endpoint

Collapse the issue update/batch enqueue copies into one service predicate
service.IssueService.WillEnqueueRun, shared verbatim with a new dry-run
endpoint POST /api/issues/preview-trigger so the four entry points stop
drifting (squad/self-loop/batch omissions, MUL-3375). The private-agent gate
stays at the HTTP boundary: write paths inject allow-all, preview injects the
real gate so it never leaks a private agent's readiness.

Add suppress_run to issue update/batch: the change applies but no run starts.
Remove the now-dead handler mirrors shouldEnqueueSquadLeaderOnAssign /
isSquadLeaderReady. service.Create and the comment trigger chain are untouched.

Tests: preview behavior, preview<->write-path match, batch aggregation,
member no-trigger, suppress_run skip, malformed-body 400.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* feat(issues): inject handoff note into assigned runs via first-class task field

Add an optional handoff_note carried by issue assign/promote into the run's
opening prompt and issue_context.md, via a dedicated agent_task_queue column
(migration 122) and a daemon assignment-handoff render branch — never a
fabricated comment, never trigger_comment_id (MUL-3375 §6.1).

Thread the note through enqueueIssueTask/enqueueMentionTask + WithHandoff
public variants and dispatchIssueRun; suppress_run or a parked write drops it
(no run = nothing to inject). Soft version gate: MinHandoffCLIVersion +
HandoffSupported, surfaced per-trigger as handoff_supported in the preview so
the UI can gray the note box on old daemons; the assignment never hard-fails.

Tests: daemon prompt + issue_context render via the assignment branch (not
quick-create/comment), version helper matrix, note persists on the task,
suppressed assign enqueues nothing.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* feat(issues): leave a display-only handoff record on the timeline

When an assign/promote with a handoff note starts a run, write one
type='handoff' timeline record via TaskService.RecordHandoff — a direct
Queries.CreateComment + timeline event that bypasses Handler.CreateComment, so
it never reaches triggerTasksForComment and cannot start a second run
(MUL-3375 §6.2, the must-not-retrigger invariant). Author is the actor who
handed off; body is the note. Migration 123 admits the 'handoff' comment type.
Recorded only on a real run start: suppress_run or a parked write writes
nothing. enqueueSquadLeaderTask now reports whether it enqueued so the trace
is gated on an actual dispatch.

Test: exactly one handoff record on assign-with-note, exactly one task (no
re-trigger), and no record when suppressed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* feat(issues): frontend plumbing for issue-trigger preview + handoff (core)

Add api.previewIssueTrigger + IssueTriggerPreviewSchema (zod parseWithFallback),
the use-issue-trigger-preview hook, issueKeys.issueTriggerPreview(+All) with WS
queue-state invalidation, suppress_run/handoff_note on UpdateIssueRequest, the
'handoff' CommentType, and stripping of the control fields from optimistic
update/batch cache patches (MUL-3375 §9).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* fix(issues): exclude handoff records from new-comment counting

type='handoff' is a display-only timeline record, not conversation. Exclude it
from CountNewCommentsSince so a handoff note never inflates the count of
"new comments to catch up on" fed to a claiming agent (MUL-3375 §12). Analytics
already excludes it (RecordHandoff is a direct write that emits no analytics
event), and the comment-trigger path is already bypassed.

Test: a handoff record does not bump the new-comment count; a real comment does.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* feat(issues): pre-trigger preview UI, handoff note, timeline card (web/desktop)

Wire the §9 frontend onto the preview endpoint + handoff fields:
- Delete the backlog blocking dialog (backlog-agent-hint*) and its modal type;
  the over-eager nag is gone. Backlog awareness is now a passive label.
- RunConfirmModal: single assign + batch assign/status route here. Shows the
  backend predicate's verdict ("将启动 @X", i.e. "will start @X" / "将启动 N 个",
  i.e. "will start N" / parked), an optional handoff note (assign only,
  soft-gated by handoff_supported), and 暂不启动 ("don't start now"), then
  applies via update/batch. No frontend guessing.
- create modal: passive CreateRunHint ("将启动 @X", i.e. "will start @X" / backlog parked).
- single status change stays a direct apply (unchanged).
- timeline: render type='handoff' as a distinct, non-interactive handoff card.
- i18n run_confirm + handoff_card across en/ja/ko/zh-Hans; drop backlog action
  keys; locale parity green.

Tests: use-issue-actions (assign → run-confirm modal, member → direct),
create-issue + comment-card suites updated/green; views typecheck + lint clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* test(issues): use a valid anchor in the handoff count-exclusion test

CountNewCommentsSince filters id <> @anchor_id; SQL id <> NULL is NULL and
excludes every row, so an empty anchor made the control assertion read 0. The
production caller always passes a real anchor — mirror that with a non-matching
sentinel uuid.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* test(issues): RunConfirmModal apply logic (start/suppress/note-gate/batch)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* test(core): preview schema malformed/missing/null fallback coverage

Cover IssueTriggerPreviewSchema via parseWithFallback (MUL-3375): well-formed
parse, top-level + item default fills (empty/older backend), and fallback to
{ triggers: [], total_count: 0 } for malformed shapes, a dropped required
issue_id, a wrong-typed total_count, and null/non-object bodies — so the four
entry points degrade to "nothing will start" instead of throwing.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* refactor(issues): remove display-only handoff timeline record (留痕, "trace")

The handoff trace ("留痕") timeline record (type='handoff' comment written on run
start) was judged superfluous and dropped per product call. This removes
only the display-only trace; the handoff NOTE injection into the run's
opening prompt + issue_context.md is untouched.

- backend: drop RecordHandoff + its call in dispatchIssueRun
- db: drop the `type <> 'handoff'` exclusion in CountNewCommentsSince and
  migration 123 (comment_type_check reverts to the 4-type set from 001);
  no production data exists for this unreleased feature
- frontend: drop the "handoff" CommentType, HandoffCard, and handoff_card
  i18n (all locales)
- tests: drop handoff_count_test.go and the record-write assertions in
  issue_trigger_preview_test.go (note-injection tests retained)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* feat(issues): dismissable run-confirm modal + team-handoff copy

Two fixes to the pre-trigger confirm modal (MUL-3375).

1. Dismissable: switch RunConfirmModal from AlertDialog to the standard
   shadcn Dialog so it has the close (X) button + Esc + click-outside.
   Previously the only choices were "start" / "don't start now" with no
   way to abort the action entirely; dismissing now cancels with no write.

2. Copy: rework the action-surface wording away from the backend term
   "run" toward team-handoff voice: 指派 (assign) / 开始 (start) / 交接 (hand
   off); run stays only on record surfaces. Unifies the note's three names
   to "交接说明" ("handoff note"), and parallels the rewrite across en/ja/ko.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* chore(agent): bump handoff note min CLI version to 0.3.28

The daemon release that renders handoff notes ships in 0.3.28 (0.3.27
was the prior tag), so move the soft-gate threshold up. Below this the
note is silently dropped and the frontend grays the note box — assignment
is never blocked.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(issues): skip run-confirm when batch-moving issues to backlog

A move into backlog never starts a run (service/issue_trigger.go), so the
pre-trigger confirm modal degenerated to an empty "won't start" box with a
single Apply button — pure friction. Apply directly instead, matching the
single-issue status path. Other target statuses still route through the
modal.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(issues): refine pre-trigger preview hint and copy

- Move the create-issue run hint to a reveal band (grid 0fr→1fr) above the
  property toolbar. It was sharing the footer button row and, lacking a
  width constraint, reflowed the submit buttons whenever it appeared.
  Restyle to a borderless, comment-style avatar+caption that is purely a
  caption (non-interactive avatar).
- Distinguish squad from agent in the pre-trigger copy: a squad's leader
  evaluates and delegates rather than "starting work" itself. Add
  will_start_named_squad / will_start_squad / create_will_start_squad across
  en/zh/ja/ko (reusing the squad_leader_* evaluate→arrange vocabulary) and
  branch run-confirm + the create hint on squad assignees.
- Bold the assignee name in the run-confirm headline via a language-safe
  sentinel split (no per-language prefix/suffix keys).
- Align zh "开始处理" ("start processing") → "开始工作" ("start working") on the
  single-assign copy.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* test(issues): stub ActorAvatar in create-issue suite

CreateRunHint now renders an ActorAvatar for agent/squad assignees, which
pulls in getActorInitials/getActorAvatarUrl + the workspace/presence/navigation
hook tree. This form-focused suite only stubbed getActorName, so the
squad-forwarding test crashed with "getActorInitials is not a function". Stub
the avatar inert — its own behavior is covered elsewhere.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Walt <walt@multica.ai>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-23 13:17:13 +08:00
LinYushen
c5eb778532 Revert "fix: keep runtime provider arbiter during profile rollout (#4251)" (#4258)
This reverts commit a08281a1b2.
2026-06-17 18:23:46 +08:00
Multica Eve
a08281a1b2 fix: keep runtime provider arbiter during profile rollout (#4251)
Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-17 17:51:47 +08:00
LinYushen
52e76e7b23 MUL-3284: server API + daemon (custom runtime PR2) (#4149)
* MUL-3284: add runtime_profile schema (custom runtime PR1)

Schema-only foundation for custom runtimes. Additive migration 120:

- New workspace-level `runtime_profile` table: the shared, team-visible
  definition of a custom runtime (e.g. an in-house Codex wrapper).
  protocol_family is CHECK-constrained to the exact backend list in
  agent.New() (server/pkg/agent/agent.go). The only args column is
  `fixed_args` (args every agent on the runtime must inherit); there is
  deliberately no generic per-agent args field — those stay on
  agent.custom_args.
- `agent_runtime.profile_id` (nullable, FK -> runtime_profile ON DELETE
  CASCADE): NULL = built-in runtime, non-NULL = a registered instance of
  a custom profile.
- Partial unique index agent_runtime_workspace_daemon_profile_key on
  (workspace_id, daemon_id, profile_id) WHERE profile_id IS NOT NULL.

The legacy UNIQUE (workspace_id, daemon_id, provider) constraint is left
INTACT so the existing registration upsert
(ON CONFLICT (workspace_id, daemon_id, provider) in runtime.sql) keeps
resolving its arbiter and the server stays green. Converting that key to
a partial (WHERE profile_id IS NULL) index and making the upsert
profile-aware is PR2's registration work, not this migration.

Verified up + down against Postgres 17: full `migrate up` applies 120;
schema shows the table, column, partial index and intact legacy
constraint; functional checks pass (partial index blocks dup
(ws,daemon,profile), allows same profile on another daemon; CHECK and
display_name uniqueness reject bad input; legacy ON CONFLICT still
resolves; profile delete cascades to instances); down/up round-trip is
clean.

Co-authored-by: multica-agent <github@multica.ai>

* MUL-3284: drop DB FKs/cascade from runtime_profile migration (review fix)

Per review (house rule: no new database foreign keys / cascades; relational
integrity lives in the application layer):

- runtime_profile.workspace_id: drop REFERENCES workspace ON DELETE CASCADE
  -> plain UUID NOT NULL.
- runtime_profile.created_by: drop REFERENCES "user" ON DELETE SET NULL
  -> plain UUID.
- agent_runtime.profile_id: drop REFERENCES runtime_profile ON DELETE CASCADE
  -> plain UUID.

CHECK constraints, UNIQUE (workspace_id, display_name), the workspace index,
and the partial unique index agent_runtime_workspace_daemon_profile_key are
unchanged. The legacy UNIQUE (workspace_id, daemon_id, provider) constraint
remains untouched.

Behavioral consequence: the database no longer auto-removes a profile's
agent_runtime instance rows on profile delete. That cleanup moves into PR2's
profile-delete path. Up-migration comments document this; down-migration
comment no longer references FKs/cascade.

Re-verified on Postgres 17: migrate up applies 120; no FK constraints exist on
the new columns; partial index still blocks dup (ws,daemon,profile_id); CHECK
and display_name uniqueness still reject bad input; deleting a profile now
leaves the runtime row orphaned (proving cascade is gone); down/up round-trip
clean with the legacy constraint intact.

Co-authored-by: multica-agent <github@multica.ai>

* MUL-3284 PR2 (server): runtime_profile CRUD + profile-aware registration

Server/DB half of the custom-runtime feature.

- Migration 121: convert the legacy UNIQUE (workspace_id, daemon_id, provider)
  constraint on agent_runtime into a partial unique index scoped to built-in
  rows (WHERE profile_id IS NULL). With 120's partial index on profile_id this
  lets one daemon host the built-in provider AND custom profiles of the same
  protocol family without collision.
- Queries: runtime_profile CRUD; ListEnabledRuntimeProfilesForWorkspace
  (daemon-facing); CountAgentsByProfile + DeleteAgentRuntimesByProfile for the
  app-layer cascade; profile-aware UpsertAgentRuntimeWithProfile; the built-in
  UpsertAgentRuntime ON CONFLICT now spells out WHERE profile_id IS NULL so it
  targets the right partial index. sqlc regenerated.
- agent.SupportedTypes / IsSupportedType: single-source protocol_family
  whitelist, in lockstep with agent.New and the migration 120 CHECK.
- Handlers + routes: runtime_profile CRUD (member-read, admin-write) with
  protocol_family whitelist validation, display_name uniqueness (409), and
  fixed_args validation (no generic per-agent args — iron rule); a
  daemon-token endpoint GET /api/daemon/workspaces/{id}/runtime-profiles;
  DeleteRuntimeProfile does the app-layer cascade (delete instance rows then
  profile, in one tx) and refuses (409) while active agents are bound.
- DaemonRegister accepts an optional per-runtime profile_id: validates the
  profile belongs to the workspace and is enabled, registers via the
  profile-aware upsert, and skips legacy hostname merge for custom rows.
  AgentRuntimeResponse now carries profile_id.

Verified on Postgres 17: migrate up through 121; built-in + custom codex
coexist on one daemon; both upsert arbiters are idempotent; delete-by-profile
cascade removes only the custom instance; migrate down reverses 121 then 120
and replays clean. go build ./... and go vet pass; handler test package
compiles.

Daemon-side wiring (fetch profiles, PATH-resolve command_name, register with
profile_id, exec uses command_name) lands in a follow-up commit on this branch.

Co-authored-by: multica-agent <github@multica.ai>

* MUL-3284 PR2 (daemon): pull profiles, PATH-resolve, register, exec command

Daemon-side half of custom runtime profiles, against the server contract on
this branch.

- client.go: GetRuntimeProfiles(workspaceID) -> GET
  /api/daemon/workspaces/{id}/runtime-profiles (mirrors GetWorkspaceRepos);
  RuntimeProfile / RuntimeProfilesResponse types.
- types.go: Runtime gains profile_id (parsed from the register response so
  runtimeIndex carries it).
- daemon.go:
  * appendProfileRuntimes — called inside registerRuntimesForWorkspace before
    the empty-runtimes guard. Best-effort fetch (older server 404s are logged
    and swallowed; never fails registration). Per enabled profile: resolve
    command_name via PATH (exec.LookPath, behind a `lookPath` test hook),
    skip+log when absent, best-effort version probe, record the resolved
    absolute path keyed by profile_id, and append a registration entry
    {name, type=protocol_family, version, status:online, profile_id}. A
    custom-only host (no built-in agents) still registers.
  * profileCommandPaths map (guarded by d.mu) + recordProfileCommandPath /
    customCommandPathForRuntime helpers.
  * runTask: looks up the claimed task's RuntimeID -> profile command path and
    overrides the executable path, synthesizing an AgentEntry so a custom
    runtime runs even when the host has no built-in agent of the same
    provider. provider (=protocol_family) is unchanged so agent.New still
    selects the right backend.
- Tests: GetRuntimeProfiles request shape; profile runtime appended + path
  recorded (custom-only host); profile skipped when command not on PATH;
  profiles-fetch-404 is best-effort; customCommandPathForRuntime bookkeeping.
- agent: lockstep test pinning SupportedTypes to agent.New and the migration
  120 protocol_family CHECK.

Iron rule honored: profile carries no generic per-agent args. fixed_args are
parsed and carried but intentionally NOT wired into the launch command yet
(optional/best-effort; explicit TODO(MUL-3284) in appendProfileRuntimes).
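
The PATH-resolution step in the daemon.go bullet above can be sketched as follows. Only `exec.LookPath` and the idea of a `lookPath` test hook come from the log; the `Profile` and `Registration` shapes, `resolveProfileRuntimes`, and `fakeLookPath` are hypothetical names for illustration, and the version probe is left out.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// lookPath is a package-level hook so tests can replace PATH resolution.
var lookPath = exec.LookPath

// Profile and Registration are illustrative shapes, not the real types.
type Profile struct {
	ID, DisplayName, CommandName, ProtocolFamily string
	Enabled                                      bool
}

type Registration struct {
	Name, Type, Status, ProfileID string
}

// resolveProfileRuntimes skips profiles whose command is not on PATH and
// records the resolved command path keyed by profile ID.
func resolveProfileRuntimes(profiles []Profile) ([]Registration, map[string]string) {
	paths := make(map[string]string)
	var regs []Registration
	for _, p := range profiles {
		if !p.Enabled {
			continue
		}
		resolved, err := lookPath(p.CommandName)
		if err != nil {
			fmt.Printf("skip profile %s: %v\n", p.ID, err)
			continue // best-effort: a missing command never fails registration
		}
		paths[p.ID] = resolved
		regs = append(regs, Registration{
			Name: p.DisplayName, Type: p.ProtocolFamily, Status: "online", ProfileID: p.ID,
		})
	}
	return regs, paths
}

// fakeLookPath builds a stub for the hook (hypothetical test helper).
func fakeLookPath(known map[string]string) func(string) (string, error) {
	return func(name string) (string, error) {
		if p, ok := known[name]; ok {
			return p, nil
		}
		return "", errors.New("not found on PATH: " + name)
	}
}

func main() {
	lookPath = fakeLookPath(map[string]string{"my-codex": "/opt/bin/my-codex"})
	regs, paths := resolveProfileRuntimes([]Profile{
		{ID: "p1", DisplayName: "In-house Codex", CommandName: "my-codex", ProtocolFamily: "codex", Enabled: true},
		{ID: "p2", DisplayName: "Missing", CommandName: "nope", ProtocolFamily: "codex", Enabled: true},
	})
	fmt.Println(regs, paths)
}
```

Keeping `Type` equal to the protocol family mirrors the note above that the provider is unchanged, so backend selection is unaffected while the executable path is overridden at run time.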

Verified: go build ./... clean; go vet ./internal/daemon/... clean;
go test ./internal/daemon/... pass (existing + 5 new); full
go test ./internal/handler/ suite passes against a migrated Postgres 17;
agent lockstep test passes.

Co-authored-by: multica-agent <github@multica.ai>

* MUL-3284 PR2: profile delete runs full archived-agent cascade (fix 500)

Review fix. DeleteRuntimeProfile previously guarded only on ACTIVE agents, but
agent.runtime_id is ON DELETE RESTRICT — a profile whose runtimes had only
ARCHIVED agents passed the guard, then DeleteAgentRuntimesByProfile hit the FK
and the handler 500'd.

Now it mirrors the mature runtime-delete cascade (DeleteAgentRuntime): in one
transaction it enumerates the profile's runtime rows, refuses (409) any with
active agents or active squads led by archived agents, then for each runtime
pauses autopilots pinned to its archived agents, drops archived squads led by
them, and hard-deletes the archived agents before removing the runtime rows
and the profile. No code path can now fall through to a raw FK error.

- queries: ListAgentRuntimeIDsByProfile (sqlc regen). Reuses the existing
  per-runtime teardown queries (CountActiveSquadsWithArchivedLeadersByRuntime,
  ListArchivedAgentIDsByRuntime, PauseAutopilotsByAgentAssignees,
  DeleteSquadsByArchivedAgentsOnRuntime, DeleteArchivedAgentsByRuntime).
- tests: TestDeleteRuntimeProfile_ArchivedAgentCascade (archived-only profile
  deletes cleanly: 204, runtime + archived agent + profile gone) and
  TestDeleteRuntimeProfile_ActiveAgentBlocks (active agent → 409, survives).

Verified against Postgres 17: both new tests pass; full handler suite, daemon
tests, and agent lockstep test pass; go vet clean.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-06-17 11:33:09 +08:00
Bohan Jiang
24b162cdbc feat(daemon): surface the real task initiator to the agent runtime (MUL-2645) (#3899)
* feat(daemon): surface the real task initiator to the agent runtime (MUL-2645)

In a multi-person workspace the agent runtime only ever saw the runtime
OWNER identity: the brief's `## Requesting User` is sourced from
runtime.OwnerID and the task-scoped token is owner-bound, so every
requester (whoever commented, @mentioned, or chatted) appeared to the
agent as the owner. Agents that route by initiator for permission,
privacy, or audit all misjudged.

Resolve the real task initiator at claim time and surface it distinctly
from the owner:
- comment / mention trigger -> triggering comment's author (member or agent)
- chat task -> chat session creator (sessions are creator-only)
- on-assign / autopilot / quick-create -> no attributable initiator (omitted)
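
The mapping listed above can be sketched as a single switch. All type, constant, and field names here are hypothetical. The chat branch follows this first commit, which reads the session creator; the follow-up fix in the same PR replaces that with a sender stored on the task.

```go
package main

import "fmt"

// Trigger names are illustrative, not the server's actual values.
type Trigger string

const (
	TriggerComment     Trigger = "comment"
	TriggerMention     Trigger = "mention"
	TriggerChat        Trigger = "chat"
	TriggerAssign      Trigger = "on_assign"
	TriggerAutopilot   Trigger = "autopilot"
	TriggerQuickCreate Trigger = "quick_create"
)

// Initiator is the attested identity surfaced to the agent. It is not an
// act-as credential; the agent's token stays owner-scoped.
type Initiator struct {
	Type string // "member" or "agent"
	ID   string
}

// Task carries only the fields this sketch needs.
type Task struct {
	Trigger           Trigger
	CommentAuthorType string
	CommentAuthorID   string
	ChatCreatorID     string
}

// resolveInitiator returns ok=false when there is no attributable initiator,
// in which case the brief section would be omitted.
func resolveInitiator(t Task) (Initiator, bool) {
	switch t.Trigger {
	case TriggerComment, TriggerMention:
		if t.CommentAuthorID == "" {
			return Initiator{}, false
		}
		return Initiator{Type: t.CommentAuthorType, ID: t.CommentAuthorID}, true
	case TriggerChat:
		if t.ChatCreatorID == "" {
			return Initiator{}, false
		}
		return Initiator{Type: "member", ID: t.ChatCreatorID}, true
	default: // on-assign, autopilot, quick-create
		return Initiator{}, false
	}
}

func main() {
	in, ok := resolveInitiator(Task{Trigger: TriggerMention, CommentAuthorType: "agent", CommentAuthorID: "a9"})
	fmt.Println(in, ok)
	_, ok = resolveInitiator(Task{Trigger: TriggerAutopilot})
	fmt.Println("autopilot attributable:", ok)
}
```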

Adds initiator_{type,id,name,email} to the claim response, the daemon
Task, and TaskContextForEnv, rendered into the brief as a new
`## Task Initiator` section. The section documents the privacy boundary:
the agent's credentials stay owner-scoped, so this is an attested
identity for the agent's own routing/privacy logic, not act-as. No DB
migration — both paths are derivable from existing rows.

Tests: brief rendering (member/agent/omit/sanitize) + email guard unit
tests, and claim-handler tests for the comment and chat paths.

Co-authored-by: multica-agent <github@multica.ai>

* fix(chat): store real sender as task initiator, not chat_session creator (MUL-2645)

Review fix (Niko, PR #3899). v1 resolved the chat task initiator from
chat_session.creator_id at claim time. That is correct for web chat and
Lark p2p (creator == sender), but WRONG for Lark group chats: the group
session creator is deliberately the installer (stable identity across
member churn), not the message sender. So in a Lark group, every member
who triggered the agent showed up in the brief as the installer/owner —
the exact bug this issue is about, still live at that entry point.

Capture the real sender at enqueue time instead of deriving it from the
session creator at claim time:

- migration 117: agent_task_queue.initiator_user_id (FK user, ON DELETE
  SET NULL); NULL for non-chat and pre-migration rows.
- EnqueueChatTask now takes an explicit initiatorUserID. Web chat passes
  the authenticated request user; the Lark dispatcher threads the inbound
  sender (binding.MulticaUserID) through scheduleRun -> flushChatRun. The
  debouncer keeps the latest scheduled flush per session, so in a multi-
  sender silence window the LATEST sender wins (documented + tested).
- claim handler resolves the initiator from task.initiator_user_id and
  drops the creator_id fallback entirely.
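
The latest-sender-wins behavior can be illustrated with a debouncer that keeps one pending entry per session. This is a simplified sketch under assumptions: the silence-window timer is left out and `Flush` is called explicitly, and `Debouncer`, `PendingRun`, `Schedule`, and `Flush` are hypothetical names rather than the dispatcher's real API.

```go
package main

import (
	"fmt"
	"sync"
)

// PendingRun is what would be handed to the enqueue call on flush.
type PendingRun struct {
	InitiatorUserID string
	Messages        int
}

// Debouncer keeps at most one pending run per chat session.
type Debouncer struct {
	mu      sync.Mutex
	pending map[string]PendingRun
}

func NewDebouncer() *Debouncer {
	return &Debouncer{pending: make(map[string]PendingRun)}
}

// Schedule records a message. A later sender in the same window overwrites
// the initiator, so the latest sender wins.
func (d *Debouncer) Schedule(sessionID, senderID string) {
	d.mu.Lock()
	defer d.mu.Unlock()
	run := d.pending[sessionID]
	run.InitiatorUserID = senderID
	run.Messages++
	d.pending[sessionID] = run
}

// Flush removes and returns the pending run for a session, if any.
func (d *Debouncer) Flush(sessionID string) (PendingRun, bool) {
	d.mu.Lock()
	defer d.mu.Unlock()
	run, ok := d.pending[sessionID]
	delete(d.pending, sessionID)
	return run, ok
}

func main() {
	d := NewDebouncer()
	d.Schedule("group-1", "alice")
	d.Schedule("group-1", "bob")
	run, ok := d.Flush("group-1")
	fmt.Println(run.InitiatorUserID, run.Messages, ok)
}
```

Capturing the sender at schedule time, rather than deriving it later from the session, is what keeps a group session's installer from being reported as the initiator.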

The Lark group session creator stays the installer (unchanged) — only the
task initiator is corrected, keeping the two concepts cleanly separate.

Tests: dispatcher group regression (initiator = sender, not installer),
latest-sender-wins, p2p initiator assertion; the chat claim handler test
now sets creator != initiator and asserts the stored sender wins.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-08 19:29:57 +08:00
Thanh Minh
8abdc77961 MUL-2489 fix(runtime): delete archived squads before runtime teardown (#2955)
* fix(runtime): delete squads referencing archived agents before runtime teardown

The DeleteAgentRuntime handler was failing with 500 'failed to clean up
archived agents' because squad.leader_id has an ON DELETE RESTRICT FK on
agent(id). When an archived agent was still referenced as a squad leader
(even on an archived squad), the DELETE FROM agent query was blocked.

Fix: add DeleteSquadsByArchivedAgentsOnRuntime query that removes squads
whose leader_id points to an archived agent on the target runtime, and
call it before DeleteArchivedAgentsByRuntime in the handler.

Closes TMI-85

* test(runtime): cover squad cleanup before archived-agent deletion

Adds four tests around the DeleteSquadsByArchivedAgentsOnRuntime fix:

* TestDeleteSquadsByArchivedAgentsOnRuntime_Query — query-level: deletes
  squads whose leader is an archived agent on the target runtime, leaves
  squads with active leaders or archived leaders on a different runtime
  alone, and is safe to call when nothing matches. Covers the archived-
  squad case that originally hid the FK blocker from `multica squad list`.
* TestDeleteAgentRuntime_RemovesSquadsLedByArchivedAgents — handler
  end-to-end regression for TMI-85. Reverting the handler change makes
  this fail with the exact 500 'failed to clean up archived agents' the
  user reported.
* TestDeleteAgentRuntime_NoSquadsRegression — happy path for runtimes
  whose archived agents were never squad leaders, ensuring the new step
  is a no-op there.
* TestDeleteAgentRuntime_StillBlockedByActiveAgents — preserves the 409
  CountActiveAgentsByRuntime guard so the active-agent contract isn't
  silently regressed by the new cleanup ordering.

Refs TMI-85

* chore: remove internal issue tracker references from test comments

* fix(runtime): keep active squads during runtime teardown

* fix(runtime): block runtime delete on active archived-leader squads

* fix(runtime): make runtime delete 409 path a no-op

---------

Co-authored-by: Kiro <kiro@multica.ai>
2026-06-08 13:08:38 +08:00
Bohan Jiang
341ce7bfa5 feat: support local working directory for projects (MUL-2618 v1) (#3283)
* feat(project): add local_directory project_resource type (MUL-2662)

Adds a second project_resource type alongside github_repo so a project
can be pinned to an existing directory on a specific daemon (the v1 of
the local-working-directory flow tracked in MUL-2618). The ref schema is
{ local_path, daemon_id, label? }; local_path must be absolute and
daemon_id is required. The same (daemon_id, local_path) pair is allowed
on multiple projects by design — no UNIQUE constraint is added.

Implementation reuses the existing project_resource API surface: the new
type is wired through the validator switch with no migration, no new
events, and no daemon-handler changes (daemon already passes through
arbitrary resource types via ProjectResources). The CLI gains
--local-path / --daemon-id / --ref-label shortcuts so
`multica project resource add --type local_directory` mirrors the
existing `--type github_repo --url ...` ergonomics; the generic --ref
flag still works for both types.

Tests cover the full CRUD lifecycle, the same-path-across-projects
allowance, the same-path-same-project conflict, the validator rejections
(missing/blank/relative path, missing daemon_id, wrong payload type),
and the cross-platform isAbsoluteLocalPath helper.
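
A cross-platform absolute-path check might look like the sketch below. It is a guess at the idea, not the repository's isAbsoluteLocalPath: the function name `isAbsLocalPath` and the exact set of accepted forms are assumptions. The motivation assumed here is that `filepath.IsAbs` follows the rules of the OS the server runs on, while the path belongs to whatever OS the daemon runs on.

```go
package main

import (
	"fmt"
	"strings"
)

// isAbsLocalPath accepts POSIX absolute paths, Windows drive paths with a
// separator after the colon, and UNC paths, regardless of the host OS.
// Hypothetical rules chosen for illustration.
func isAbsLocalPath(p string) bool {
	if strings.TrimSpace(p) == "" {
		return false
	}
	if strings.HasPrefix(p, "/") {
		return true // POSIX: /srv/repo
	}
	if strings.HasPrefix(p, `\\`) {
		return true // UNC: \\server\share
	}
	if len(p) >= 3 && isASCIILetter(p[0]) && p[1] == ':' && (p[2] == '\\' || p[2] == '/') {
		return true // Windows drive: C:\work or C:/work
	}
	return false
}

func isASCIILetter(b byte) bool {
	return (b >= 'a' && b <= 'z') || (b >= 'A' && b <= 'Z')
}

func main() {
	for _, p := range []string{"/srv/repo", `C:\work\repo`, "C:repo", "relative/path", ""} {
		fmt.Printf("%q absolute: %v\n", p, isAbsLocalPath(p))
	}
}
```

Note that a drive-relative path such as `C:repo` is rejected in this sketch because it depends on the current directory of that drive.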

Co-authored-by: multica-agent <github@multica.ai>

* feat(project): add update endpoint + label-shadow guard for project_resource (MUL-2662)

Addresses the Elon review on PR #3263:

- Add PUT /api/projects/{id}/resources/{resourceId} with sqlc query,
  matching handler, CLI `project resource update`, and a new
  EventProjectResourceUpdated WS event. resource_type stays immutable;
  ref/label/position are all individually optional.
- Catch same-project (daemon_id, local_path) collisions where only the
  embedded label differs — the row-level UNIQUE only matches the full
  ref JSON, so a label typo would otherwise let the same working
  directory bind twice.
- Tests cover the update lifecycle (label-only / ref / clear / 404 /
  invalid path) and the label-shadow conflict on both create and
  update; the in-place rename still succeeds because the conflict
  scan ignores the row being edited.

Incidental: regenerating sqlc picked up a missing skills_local scan in
UpdateAgentCustomEnv that drifted in from #3200.

Co-authored-by: multica-agent <github@multica.ai>

* fix(project): close bundled-create label-shadow gap + merge resource_ref on CLI update (MUL-2662)

Two follow-ups from MUL-2662 review round 2:

- CreateProject inline resources path now dedupes local_directory entries on
  (daemon_id, local_path) before opening the transaction. The DB-level
  UNIQUE(project_id, resource_type, resource_ref) constraint only fires on a
  full JSON match, so two rows with the same target but different `label`
  would otherwise slip past. Standalone POST/PUT already cover this via
  findLocalDirectoryConflict; bundled create was the missing surface.
- `multica project resource update` now seeds resource_ref from the existing
  row before applying per-type shortcut flags, so `--default-branch-hint x`
  on its own no longer constructs a payload missing `url` (which the server
  400s on). local_directory partial edits get the same merge behavior.
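
The label-shadow gap comes from uniqueness being checked on the whole ref JSON, label included. A dedupe keyed only on (daemon_id, local_path) can be sketched as below. `LocalDirRef` and `firstDuplicateTarget` are hypothetical names, and the sketch compares paths as exact strings with no normalization.

```go
package main

import "fmt"

// LocalDirRef mirrors the ref schema named in the log:
// { local_path, daemon_id, label? }.
type LocalDirRef struct {
	DaemonID  string
	LocalPath string
	Label     string
}

// firstDuplicateTarget returns the index of the first entry whose
// (daemon_id, local_path) pair was already seen, ignoring the label.
func firstDuplicateTarget(refs []LocalDirRef) (int, bool) {
	type target struct{ daemonID, localPath string }
	seen := make(map[target]struct{}, len(refs))
	for i, r := range refs {
		k := target{r.DaemonID, r.LocalPath}
		if _, dup := seen[k]; dup {
			return i, true
		}
		seen[k] = struct{}{}
	}
	return -1, false
}

func main() {
	refs := []LocalDirRef{
		{DaemonID: "d1", LocalPath: "/srv/app", Label: "app"},
		{DaemonID: "d1", LocalPath: "/srv/app", Label: "ap"}, // label typo, same target
	}
	idx, dup := firstDuplicateTarget(refs)
	fmt.Println(idx, dup)
}
```

Running this check before opening the transaction lets a bundled create reject the request up front instead of relying on a constraint that cannot see the collision.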

Co-authored-by: multica-agent <github@multica.ai>

* feat(desktop): local_directory project_resource UI (MUL-2665) (#3273)

* feat(desktop): local_directory project_resource UI (MUL-2665)

First UI surface for the local-working-directory flow tracked in MUL-2618.
Lets users on the desktop pin a project to an existing folder on this
machine; web stays read-only since the per-daemon check can't be done in
the browser.

What's new for the renderer:

- ProjectResourcesSection grows a desktop-only "Add local directory"
  button next to the existing GitHub-repo popover. Clicking it opens
  Electron's native folder picker, validates the path through a new
  IPC pair (existence + r/w), and submits a project_resource of
  resource_type=local_directory with daemon_id pulled live from
  daemonAPI.getStatus.
- LocalDirectoryRow renders the rename pencil + path tooltip, and
  greys out when ref.daemon_id != this machine's daemon_id (with a
  "only available on the machine that registered this directory"
  tooltip). Delete stays enabled so users can drop stale registrations
  from any device.
- LocalDirectoryHint sits above the issue-detail comment composer and
  shows "Agent will work in-place at {label} ({path})" when the issue's
  project has a local_directory matching this daemon. Hidden on web.
- TaskStatusPill picks up a new "waiting_for_directory_release" stage
  that the daemon will publish when it dequeues a task but can't
  acquire the path lock. The render is in place now so the daemon
  sibling subtask can wire the status string without an additional UI
  PR.

Plumbing:

- @multica/core/types gains LocalDirectoryResourceRef +
  UpdateProjectResourceRequest, and the api client gets the matching
  PUT method backed by the server endpoint that landed in
  2ac3faebb (MUL-2662). A useUpdateProjectResource hook drives the
  in-place label edit.
- New Electron handlers under apps/desktop/src/main/local-directory.ts:
    local-directory:pick     -> dialog.showOpenDialog (openDirectory)
    local-directory:validate -> stat + access(R_OK + W_OK)
  exposed through the preload as desktopAPI.pickDirectory /
  validateLocalDirectory. View code talks to them via a thin
  packages/views/platform helper that returns reason=unsupported on
  web instead of crashing.
- useLocalDaemonStatus exposes the local daemon's id, device name, and
  running flag from daemonAPI.onStatusChange so the renderer can do the
  cross-device match without coupling to the desktop preload typings.

Tests:

- pickStageKeys gets a unit test covering the new stage and proving
  the directory-release status outranks availability hints.
- LocalDirectoryHint tests cover the four render branches (no project,
  no daemon, foreign daemon, matching daemon).
- i18n parity stays green; new keys added under projects.resources.*
  and chat.status_pill.stages.waiting_for_directory_release in both
  locales.

Out of scope (will land separately):
- The daemon-side waiting/lock signal that flips the pill into the
  new state.
- Adding local_directory to the create-project modal's bulk
  attach flow.
- Docs page refresh for project-resources.mdx — left for the
  MUL-2618 umbrella sweep.

Co-authored-by: multica-agent <github@multica.ai>

* fix(desktop): hide rename for foreign daemon local_directory rows (MUL-2618)

Address review nit on #3273: the rename pencil was gated only by
`canEdit`, so a foreign / unknown-daemon row still showed it even
though the spec says cross-device rows are disabled. Gate rename on
`!mismatch` so it disappears on those rows; delete stays available
so a stale registration can still be dropped from any device.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>

* feat(daemon): local_directory execution + path mutex + GC exception (MUL-2663) (#3274)

* feat(daemon): local_directory execution + path mutex + GC exception (MUL-2663)

Wires up the daemon side of the local_directory project_resource introduced
in MUL-2662. When a task is dispatched against a project whose resources
include a local_directory pinned to this daemon's UUID, the daemon now:

  - Validates the path (absolute, exists, daemon process can read+write,
    not in the system-root / $HOME blacklist) and fails the task fast on
    any precondition violation, with a user-readable reason.
  - Serialises concurrent tasks on the same on-disk path via a
    daemon-local LocalPathLocker keyed by symlink-resolved realpath. The
    lock is held for the entire task lifetime (claim → context write →
    agent → result report).
  - When the lock is contended, the daemon flips the row to a new
    waiting_local_directory status on the server (carrying a wait_reason
    like "<path> (held by task <short id>)") so the UI can render
    "等待本地目录释放" ("waiting for local directory release") instead of
    leaving the row silently in dispatched past the sweeper timeout. The
    status accepts being woken into running once the lock is acquired.
  - Sets execenv.WorkDir to the user's path (no copy, no mount). envRoot
    still lives under workspacesRoot/<wsID>/ and hosts output/, logs/, and
    .gc_meta.json — the daemon's logbook for the run.
  - Stamps GCMeta.LocalDirectory=true so the GC loop never RemoveAlls
    envRoot for these tasks (gcActionClean → gcActionCleanArtifacts,
    gcActionOrphan → gcActionSkip). The user's directory was never under
    envRoot to begin with, so this is defense in depth.
  - Skips execenv.Reuse for local_directory tasks because the prior
    WorkDir is the user's path and reusing it through that code path
    loses the envRoot association the GC loop needs. Prepare is cheap
    here (no clone, no copy), so always running it is fine.
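
The path mutex described above can be sketched as a map of one-slot channels. This is a minimal illustration, not the daemon's LocalPathLocker: the `pathLocker` type, its `Acquire` signature, and the `onWait` hook are hypothetical, and the key is assumed to be a path the caller has already resolved (for example with `filepath.EvalSymlinks`).

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// pathLocker is a hypothetical stand-in for a daemon-local path mutex.
// Each key owns a one-slot channel that acts as a cancellable lock.
type pathLocker struct {
	mu    sync.Mutex
	slots map[string]chan struct{}
}

func newPathLocker() *pathLocker {
	return &pathLocker{slots: make(map[string]chan struct{})}
}

// slot returns the channel for key, creating it on first use.
// Entries are never deleted in this sketch.
func (l *pathLocker) slot(key string) chan struct{} {
	l.mu.Lock()
	defer l.mu.Unlock()
	ch, ok := l.slots[key]
	if !ok {
		ch = make(chan struct{}, 1)
		l.slots[key] = ch
	}
	return ch
}

// Acquire takes the lock for key. If the lock is contended, onWait runs once
// (the place to report a waiting status) and the call blocks until the lock
// is free or ctx is done.
func (l *pathLocker) Acquire(ctx context.Context, key string, onWait func()) (func(), error) {
	ch := l.slot(key)
	select {
	case ch <- struct{}{}: // fast path: uncontended
	default:
		if onWait != nil {
			onWait()
		}
		select {
		case ch <- struct{}{}:
		case <-ctx.Done():
			return nil, ctx.Err()
		}
	}
	var once sync.Once
	return func() { once.Do(func() { <-ch }) }, nil
}

// demoLock holds a path, lets a second acquire time out, then re-acquires
// after release. It reports whether onWait fired and whether the wait failed.
func demoLock() (waited, timedOut bool) {
	l := newPathLocker()
	release, _ := l.Acquire(context.Background(), "/srv/proj", nil)
	ctx, cancel := context.WithTimeout(context.Background(), 20*time.Millisecond)
	defer cancel()
	_, err := l.Acquire(ctx, "/srv/proj", func() { waited = true })
	timedOut = err != nil
	release()
	if again, err := l.Acquire(context.Background(), "/srv/proj", nil); err == nil {
		again()
	}
	return waited, timedOut
}

func main() {
	fmt.Println(demoLock())
}
```

Using a channel instead of `sync.Mutex` lets a parked task give up when its context is cancelled.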

Server-side protocol changes:

  - New CHECK value 'waiting_local_directory' on agent_task_queue.status
    plus a wait_reason TEXT column (migration 109).
  - All cancel / active / counted-as-running / orphan-recovery queries
    expanded to include the new status; FailStaleTasks intentionally
    excludes it (the daemon owns the wait).
  - New SQL MarkAgentTaskWaitingLocalDirectory(id, reason) and a relaxed
    StartAgentTask that accepts both dispatched and
    waiting_local_directory as preconditions (and clears wait_reason on
    the way through).
  - New POST /api/daemon/tasks/{taskId}/wait-local-directory endpoint,
    TaskService.MarkTaskWaitingLocalDirectory broadcaster, and matching
    daemon Client.MarkTaskWaitingLocalDirectory.

Tests cover: path blacklist + R/W enforcement, mutex serialisation +
ctx-cancelled wait, lock handover between two tasks, GC never returns
gcActionClean / gcActionOrphan for local_directory rows (with negative
control for the standard path), and Prepare/Cleanup correctly substitute
+ protect the user's WorkDir.

The desktop UI side (UI for adding a local_directory resource, surfacing
the "等待本地目录" ("Waiting for local directory") badge) is MUL-2665; the
agent-task lifecycle changes (no branch switch, dirty-tree tolerant,
auto-commit) are MUL-2664.

This PR targets the shared MUL-2618 v1 feature branch agent/j/912b8cb1,
not main; the whole v1 will be merged to main together when complete.

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): tighten local_directory status, symlink, cancel handling (MUL-2618)

Address the 3 must-fix items from Elon's review of PR #3274.

1. Status string unified. The server / daemon publish
   `waiting_local_directory`; align views, locales, and the
   pickStageKeys test (PR #3273 had used `waiting_for_directory_release`
   as a placeholder string). Without this, the daemon's wait state
   never reached the pill once the two siblings merged.

2. validateLocalPath now also runs the blacklist against the
   symlink-resolved realpath, with macOS's `/etc` -> `/private/etc`
   redirect handled via `isBlacklistedRealPath` which compares
   canonical forms. Without this, a symlink such as
   `/Users/me/proj/home -> /Users/me` slipped the literal $HOME check
   while every daemon write still landed in the user's home. Tests
   cover symlink-to-home, symlink-to-system-root, and the negative
   case (symlink to a regular subdirectory).

3. acquireLocalDirectoryLockIfNeeded now spins up a cancellation
   watcher inside `onWait` (lazy — the fast path stays free) so the
   gap between dispatch and StartTask responds to server-side cancel
   or row deletion. If the watcher fires while the daemon is parked
   on the path mutex, the lock-wait context is cancelled, Acquire
   returns promptly, and the helper exits silently the same way the
   run-phase poller does. New TestAcquireLocalDirectoryLock_CancelDuringWait
   exercises the path end-to-end with a fake server.
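
The symlink check in item 2 can be illustrated with a comparison of canonical forms. This is a sketch only: `isBannedRealPath` and `demoSymlinkCheck` are hypothetical names, and the real `isBlacklistedRealPath` may compare prefixes or handle more cases. Both the candidate path and each banned root are resolved with `filepath.EvalSymlinks` before comparison.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// isBannedRealPath reports whether path resolves to one of the banned roots.
// Both sides are canonicalised so OS-level redirects compare equal.
func isBannedRealPath(path string, banned []string) (bool, error) {
	realPath, err := filepath.EvalSymlinks(path)
	if err != nil {
		return false, err
	}
	for _, b := range banned {
		realBanned, err := filepath.EvalSymlinks(b)
		if err != nil {
			continue // banned root does not exist on this host
		}
		if realPath == realBanned {
			return true, nil
		}
	}
	return false, nil
}

// demoSymlinkCheck builds <home>/proj/home -> <home> in a temp dir and checks
// the symlink and a plain subdirectory against a blacklist containing <home>.
func demoSymlinkCheck() (viaLink, plainSubdir bool, err error) {
	home, err := os.MkdirTemp("", "home")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(home)
	proj := filepath.Join(home, "proj")
	if err := os.Mkdir(proj, 0o755); err != nil {
		return false, false, err
	}
	link := filepath.Join(proj, "home")
	if err := os.Symlink(home, link); err != nil {
		return false, false, err
	}
	banned := []string{home}
	if viaLink, err = isBannedRealPath(link, banned); err != nil {
		return false, false, err
	}
	plainSubdir, err = isBannedRealPath(proj, banned)
	return viaLink, plainSubdir, err
}

func main() {
	fmt.Println(demoSymlinkCheck())
}
```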

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): unconditional canonical blacklist + Windows drive-root generalisation (MUL-2618)

- validateLocalPath now always runs isBlacklistedRealPath on the
  symlink-resolved path, not only when it differs from absPath. The old
  guard let users type the canonical form of an OS-symlinked banned root
  (e.g. /private/tmp, /private/etc, /private/var on macOS) straight
  through, since EvalSymlinks is a no-op on already-canonical input.
- Windows drive-root rejection moved off the static C/D/E/F enumeration
  onto filepath.VolumeName via a new isDriveRoot helper, so removable /
  network drives mounted at G:..Z: and UNC \\server\share roots are also
  blocked. systemRootBlacklist keeps the well-known C:\ trees only.
- Tests: macOS-only case exercises direct /private/{tmp,etc,var}; a
  new TestIsDriveRoot covers the Windows generalisation (skipped on
  POSIX runners by runtime guard).

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>

* feat(views): wire waiting_local_directory end-to-end in issue UI + presence (MUL-2618)

Connect the daemon-emitted `task:waiting_local_directory` and `task:running`
events through to issue execution log, sticky agent banner, activity indicator,
and agent presence so a parked task is no longer invisible on the issue page.

- Add `waiting_local_directory` to `AgentTask.status` and the typed
  `task:running` / `task:waiting_local_directory` WS event payloads.
- Chat realtime sync writes both new statuses into the pending-task cache so
  the chat StatusPill flips out of a stale `dispatched` frame.
- ExecutionLogSection: count `waiting_local_directory` as active, add tone +
  status label, treat parked tasks the same as dispatched for time anchor /
  transcript visibility / terminate-confirm note.
- AgentLiveCard: subscribe to both new events, rank the parked state between
  dispatched and queued, and surface a "is waiting for the local directory"
  banner with the muted "Clock" treatment used for queued.
- IssueAgentActivityIndicator: route parked tasks into the queued bucket so
  the hover stack and chip stay visible.
- derive-presence: parked tasks count toward `queuedCount` so the agent
  workload chip stays out of `idle` while the daemon waits on the path lock.
- Locales: add `agent_live.is_waiting_local_directory` and
  `execution_log.status_waiting_local_directory` (en + zh-Hans).

Co-authored-by: multica-agent <github@multica.ai>

* feat(project): enforce one local_directory per (project, daemon) (MUL-2618)

The daemon-side resolver picks the first matching local_directory by
daemon_id, so allowing two rows on the same daemon — even at different
paths — let the agent silently write into whichever sorted first. Tighten
the invariant top to bottom:

- server: `findLocalDirectoryConflict` rejects any second row sharing a
  daemon_id, regardless of `local_path` or label. Bundled-create surface in
  `CreateProject` runs the same daemon-scoped dedupe up front.
- daemon: `findLocalDirectoryAssignment` fails fast when it finds more than
  one row pinned to the current daemon (older API client / direct DB
  writes can still produce that state — refuse to guess).
- desktop UI: hide the "Add local directory" action once the current
  daemon owns a row on this project, with a hint and a defensive toast on
  the call path; foreign-daemon rows stay visible read-only as before.
- Tests:
  * daemon: new `two local_directory rows on this daemon fail fast` /
    `local_directory rows on different daemons coexist` cases.
  * handler: rewrite the legacy `LabelShadow` cases as
    `DaemonScopedConflict` / `BundledLocalDirectoryDaemonConflict` —
    asserts 409 on same-daemon different-path, 201 on per-daemon bundles.
- Locales: en + zh-Hans copy for the new hint + toast.

Co-authored-by: multica-agent <github@multica.ai>

* chore(sqlc): drop stale skills_local in UpdateAgentCustomEnv (MUL-2618)

Follow-up to the main-merge in 0f8e8ca7: the auto-merge preserved most
of main's skills_local revert but kept the column reference inside the
UpdateAgentCustomEnv scanner because that block hadn't been touched by
either side. Re-running `sqlc generate` regenerates the file without
skills_local in this query, matching the rest of the file and the
post-revert schema.

Co-authored-by: multica-agent <github@multica.ai>

* feat(create-project): binary source picker — repos OR local directory

Turn the create-project dialog's "Repos" pill into a binary Source
picker. A project's source is mutually exclusive: either a set of
GitHub repos (worktree mode, default) or a single local working
directory (local mode, desktop-only). Mirrors the constraint the
backend will enforce next.

Behavior:
- Pill shows the active mode's selection (GitHub icon + repo count, or
  folder icon + local label/path).
- Popover has a 2-tab segmented control at the top; the Local tab is
  hidden entirely on web (local_directory needs a daemon_id).
- Local tab requires the daemon online — amber notice + disabled picker
  when offline, re-renders automatically via useLocalDaemonStatus.
- Switching tabs preserves the other side's stash, but handleSubmit
  only emits the resource matching the active sourceMode, so abandoned
  picks never leak into the created project.

Backend mutual-exclusion validation + the resources-section
conditional-add-button still to come — this PR just unblocks the
dialog so it can be demoed.

* fix(mobile): cover waiting_local_directory in run row status maps (MUL-2618)

---------

Co-authored-by: multica-agent <github@multica.ai>
Co-authored-by: Multica J <j@multica.ai>
2026-05-27 13:44:31 +08:00
LinYushen
bf8a346cf0 feat(runtimes): cascade-archive agents on runtime delete (MUL-2667) (#3266)
* feat(runtimes): cascade-archive agents on runtime delete (MUL-2667)

Replace the bare 409 "cannot delete runtime: it has active agents" with a structured response carrying the blocking agent list, and wire a cascade endpoint that archives those agents, cancels their tasks, pauses dangling autopilots and deletes the runtime in a single transaction. The unified DeleteRuntimeDialog opens directly in cascade mode when the runtime has bound agents, pivots from light to cascade if the strict DELETE refuses with runtime_has_active_agents, and re-prompts when the cascade refuses with runtime_delete_plan_changed (live agent set drifted while the dialog was open). The online-local self-healing rule is preserved at the affordance level (kebab hidden, Diagnostics button disabled with tooltip) and re-checked at confirm time as defence in depth.

Co-authored-by: multica-agent <github@multica.ai>

* fix(runtimes): close cascade race + i18n delete dialog (PR #3266 review)

- Acquire FOR UPDATE on the runtime row at the top of the cascade tx so
  FK-validated agent INSERTs/UPDATEs that would point at this runtime
  block until commit, and lock each currently-active agent row via
  ListActiveAgentsByRuntimeForUpdate so a concurrent archive/move of
  an existing active row also blocks.
- Switch the bulk archive from runtime-keyed (ArchiveAgentsByRuntime)
  to ID-keyed (ArchiveAgentsByIDs), narrowed to the user-confirmed
  expected_active_agent_ids set. Combined with the runtime row lock,
  this guarantees no agent outside the confirmed plan can be silently
  archived between plan-compare and archive even at read-committed.
- Wire delete-runtime-dialog.tsx to runtimes locale via useT(); add
  detail.delete_dialog.{light,cascade} keys (EN with _one/_other
  plurals, zh-Hans _other) covering titles, descriptions, warning,
  notices, checkbox, buttons, table headers, presence labels, and
  toasts. Resolves the i18next/no-literal-string CI failure.
- Locale parity test passes (51 tests). All 4 dialog test cases pass
  unmodified (EN copy preserves original wording). Full views vitest:
  91 files / 792 tests green; full server go test: green.
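
The plan-compare step can be sketched as a set comparison between the confirmed IDs and the live active IDs. The helper names below are hypothetical, and treating any difference (extra or missing agents) as drift is an assumption; the commit only states that the cascade refuses with runtime_delete_plan_changed when the live set drifted.

```go
package main

import (
	"errors"
	"fmt"
)

// errPlanChanged mirrors the runtime_delete_plan_changed refusal code.
var errPlanChanged = errors.New("runtime_delete_plan_changed")

// checkPlan compares the user-confirmed agent IDs with the live set read
// under lock. Order and duplicates are ignored.
func checkPlan(expected, live []string) error {
	want := make(map[string]struct{}, len(expected))
	for _, id := range expected {
		want[id] = struct{}{}
	}
	seen := make(map[string]struct{}, len(live))
	for _, id := range live {
		if _, ok := want[id]; !ok {
			return errPlanChanged // an agent the user never confirmed
		}
		seen[id] = struct{}{}
	}
	if len(seen) != len(want) {
		return errPlanChanged // a confirmed agent is no longer active
	}
	return nil
}

// planOK is a convenience wrapper for callers that only need a boolean.
func planOK(expected, live []string) bool {
	return checkPlan(expected, live) == nil
}

func main() {
	fmt.Println(planOK([]string{"a", "b"}, []string{"b", "a"}))
	fmt.Println(planOK([]string{"a"}, []string{"a", "c"}))
}
```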

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-26 14:59:38 +08:00
LinYushen
5bacfd9742 MUL-2526 feat: add member(user_id, workspace_id) index + upgrade sqlc to v1.31.1 (#3046)
- Add migration 106: CREATE INDEX CONCURRENTLY on member(user_id, workspace_id)
- Rewrite ListWorkspaces to drive from member table with explicit fields
- Regenerate all sqlc code with v1.31.1 (intentional version upgrade)

Co-authored-by: multica-agent <github@multica.ai>
2026-05-22 12:26:56 +08:00
YYClaw
614dfae884 MUL-2488 feat(timezone): Scheduling / Viewing two-layer timezone architecture (#2968)
* docs(timezone): add scheduling/viewing timezone architecture RFC

* feat(db): replace daily rollups with task_usage_hourly, add user.timezone

Migrations 100-104: add "user".timezone (Viewing tz), build the UTC
hourly task_usage_hourly rollup with its pipeline, drop the legacy
task_usage_daily / task_usage_dashboard_daily pipelines, and drop the
agent_runtime.timezone column. Report queries now slice day boundaries
at read time by the caller-supplied @tz instead of materialising in a
fixed tz. Regenerate sqlc.

* feat(server): add task_usage_hourly backfill command

Replace the two legacy backfill commands (daily / dashboard_daily) with
a single backfill_task_usage_hourly that loads historical task_usage
into the new UTC hourly rollup, sliced per workspace.

* refactor(server): resolve viewing timezone in report handlers

Report handlers resolve the Viewing tz per request (?tz query param,
then user.timezone, then UTC) and pass it to the hourly-rollup queries.
Drop the UseDailyRollup feature flags and the old raw-scan/daily-rollup
dual paths, remove the /api/usage endpoints, and stop the daemon from
reporting and the runtime handler from accepting host timezone.
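
The resolution order (?tz query param, then user.timezone, then UTC) might look like the sketch below. `resolveViewingTZ` is a hypothetical name, and falling through to the next candidate on an invalid zone name is an assumption; the actual handlers may reject bad input instead. The `time/tzdata` import embeds the zone database so the example does not depend on the host.

```go
package main

import (
	"fmt"
	"time"
	_ "time/tzdata" // embed the IANA database for hosts without zoneinfo
)

// resolveViewingTZ returns the first usable zone among the query parameter
// and the stored user preference, defaulting to UTC.
func resolveViewingTZ(queryTZ, userTZ string) *time.Location {
	for _, name := range []string{queryTZ, userTZ} {
		// "" and "Local" are valid for LoadLocation but not IANA names.
		if name == "" || name == "Local" {
			continue
		}
		if loc, err := time.LoadLocation(name); err == nil {
			return loc
		}
	}
	return time.UTC
}

func main() {
	fmt.Println(resolveViewingTZ("Asia/Shanghai", "Europe/Berlin"))
	fmt.Println(resolveViewingTZ("", "Europe/Berlin"))
	fmt.Println(resolveViewingTZ("", ""))
}
```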

* refactor(core): switch report queries to viewing timezone

API client and dashboard/runtime queries send ?tz with each report
request, the user schema/types carry the new timezone field, and the
runtime timezone field/mutation is removed.

* feat(views): add viewing timezone preference and UI

Add the useViewingTimezone hook and a Timezone setting in Preferences;
report charts and the dashboard week boundary follow the viewer tz.
Remove the runtime detail timezone editor and its locale strings.

* fix(test): update fixtures and stabilize tests for timezone refactor

The timezone architecture refactor changed several types without
updating dependent test code:

- RuntimeDevice no longer has a timezone field — drop it from the
  create-agent-dialog runtime fixture.
- User now requires a timezone field — add it to the apps/web mockUser
  fixture.
- The PreferencesTab timezone tests asserted on the async save handler
  (PATCH then store update) with a bare expect, racing the mutation's
  settle callback, and timed out querying the Select's ~600-option IANA
  list on a loaded CI runner. Wrap the assertions in waitFor and extend
  the timeout for those three tests.

* docs(timezone): document self-host migration order and trigger invariant

Add a SELF-HOST UPGRADE ORDER runbook to the backfill command's package
comment: applying migrations 100-104 in a single migrate-up drops the
legacy daily rollups before the hourly backfill runs, leaving dashboards
empty until cron catches up.

Add an INVARIANT comment on trg_atq_dirty_hourly noting that agent_id
must be added to the trigger's OF list if it ever becomes mutable,
otherwise dirty buckets for the old agent_id are silently missed.

* style(runtimes): drop trailing blank line in runtime-detail
2026-05-21 15:33:47 +08:00
Jiayuan Zhang
fc8528d64d feat(autopilot): support assigning to a squad (MUL-2429) (#2888)
* feat(autopilot): support assigning autopilot to a squad (MUL-2429)

Path A (Squad-as-Leader) from the RFC: when an autopilot's assignee is a
squad, dispatch resolves to squad.leader_id and executes against the
leader's runtime — semantics match a human manually assigning the issue
to that squad, no fan-out.

Backend scope only; frontend picker change is a follow-up PR.

Changes:
- 096_autopilot_squad_assignee migration: drop agent FK on
  autopilot.assignee_id, add assignee_type column (default 'agent'),
  add autopilot_run.squad_id attribution column.
- service.AgentReadiness: single source of truth for archived /
  runtime-bound / runtime-online checks. Shared by autopilot
  admission gate, run_only dispatch, and isSquadLeaderReady.
- service.resolveAutopilotLeader: translates assignee_type/id to the
  agent that actually runs the work.
- dispatchCreateIssue: stamps issue with assignee_type='squad' for
  squad autopilots and enqueues via EnqueueTaskForSquadLeader.
- dispatchRunOnly: belt-and-braces readiness re-check after resolving
  squad → leader so a leader that went offline between admission and
  dispatch produces a clean failure instead of a doomed task.
- handler.CreateAutopilot / UpdateAutopilot: accept assignee_type with
  squad/agent existence + leader-archived validation. Backward-compatible
  default of "agent" preserves the contract for older clients.
- Analytics: AutopilotRunStarted/Completed/Failed events carry
  assignee_type and squad_id; PostHog can now group autopilot runs by
  squad without joining back to the autopilot row.

Co-authored-by: multica-agent <github@multica.ai>

* fix(autopilot): reject archived squads, route post-admission skips, cleanup dangling-agent autopilots (MUL-2429)

Addresses three review findings on PR #2888:

1. Archived squad handling: validateAutopilotAssignee now rejects squads
   with archived_at set; resolveAutopilotLeader returns errSquadArchived
   so the admission gate fails closed; DeleteSquad now mirrors the issue
   transfer for autopilot rows (TransferSquadAutopilotsToLeader) so
   surviving autopilots flip to assignee_type='agent' (leader) instead
   of dangling at the archived squad.

2. dispatchRunOnly post-admission readiness: introduces errDispatchSkipped
   sentinel, recognised by DispatchAutopilot via handleDispatchSkip so
   the run is recorded as `skipped` (not `failed`). Manual triggers no
   longer 500 when the leader's runtime goes offline between admission
   and task creation. New TestManualTriggerDoesNotErrorOnPostAdmissionSkip
   locks the behaviour in.

3. Dangling agent assignee after migration 096 dropped the FK:
   shouldSkipDispatch now distinguishes pgx.ErrNoRows / errSquadArchived
   (hard skip — retrying won't help) from transient DB errors
   (fail-open). DeleteAgentRuntime pauses autopilots that target agents
   about to be hard-deleted (ListArchivedAgentIDsByRuntime +
   PauseAutopilotsByAgentAssignees) so the breakage surfaces as a paused
   row in the UI instead of a quiet skip-burning loop.
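
The sentinel pattern from item 2 can be sketched with `errors.Is`. Only the name errDispatchSkipped comes from the commit; `sketchDispatch`, `runStatus`, and the status strings are illustrative assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// errDispatchSkipped marks a dispatch that was abandoned on purpose.
var errDispatchSkipped = errors.New("dispatch skipped")

// sketchDispatch stands in for a dispatch step that re-checks readiness.
func sketchDispatch(leaderOnline, dbHealthy bool) error {
	if !dbHealthy {
		return errors.New("transient db error")
	}
	if !leaderOnline {
		// Wrap the sentinel so callers can still detect it after context is added.
		return fmt.Errorf("%w: leader runtime offline", errDispatchSkipped)
	}
	return nil
}

// runStatus maps a dispatch error to the status recorded for the run.
func runStatus(err error) string {
	switch {
	case err == nil:
		return "completed"
	case errors.Is(err, errDispatchSkipped):
		return "skipped"
	default:
		return "failed"
	}
}

func main() {
	fmt.Println(runStatus(sketchDispatch(true, true)))
	fmt.Println(runStatus(sketchDispatch(false, true)))
	fmt.Println(runStatus(sketchDispatch(true, false)))
}
```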

Unit tests cover the sentinel unwrap contract and errSquadArchived
errors.Is behaviour. Integration test
TestAutopilotDispatchSkipsWhenRuntimeOffline re-verified against a fresh
DB with migration 096 applied.

Co-authored-by: multica-agent <github@multica.ai>

* fix(autopilot): bump last_run_at on post-admission skip (MUL-2429)

Match recordSkippedRun (pre-flight skip) and the success path so the
scheduler / "last seen" UI both reflect that this tick evaluated the
trigger, even when the post-admission readiness gate caught a late
regression.

Addresses Emacs review caveat #1 on PR #2888.

Co-authored-by: multica-agent <github@multica.ai>

* feat(autopilot): mixed agent/squad assignee picker in dialog (MUL-2429)

End-to-end UI for assigning an autopilot to a squad. Closes the PR #2888
backend gap: the squad-as-assignee feature was already wired in Go (Path A,
RFC §4) but the desktop dialog never offered the choice.

- core/types/autopilot: add `AutopilotAssigneeType`, surface
  `assignee_type` on `Autopilot` + Create/Update request payloads.
- views/autopilots/pickers/agent-picker: switch to a polymorphic
  AssigneeSelection (`{type, id}`); render agents and squads as two
  grouped sections with shared pinyin search.
- views/autopilots/autopilot-dialog: maintain `assigneeType` state, send
  it on create/update, render the trigger avatar / hover dot with
  `assignee.type`.
- views/autopilots/autopilots-page + autopilot-detail-page: render the
  assignee row using `autopilot.assignee_type` so squad-typed autopilots
  show the squad avatar + name, not a broken agent lookup.
- locales: add `agents_group` / `squads_group` / `select_assignee` keys
  (en + zh-Hans), keep legacy `select_agent` for callers that still
  reference it.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Lambda <lambda@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-20 05:30:13 +02:00
LinYushen
319b23eb39 Revert "feat(task): add claim lease mechanism (Phase 2, MUL-2246) (#2660)" (#2674)
This reverts commit 3137feecdf.
2026-05-15 16:07:23 +08:00
LinYushen
3137feecdf feat(task): add claim lease mechanism (Phase 2, MUL-2246) (#2660)
Add claim_token + claim_expires_at columns to agent_task_queue and three
new SQL queries for the claim lease protocol:

- ClaimAgentTaskWithLease: generates a UUID token and sets a lease expiry
  when claiming a task, so the daemon must prove it received the response
- StartAgentTaskWithClaimToken: validates the token on StartTask, preventing
  stale daemons from starting requeued tasks
- RequeueExpiredClaimLeases: moves dispatched tasks with expired leases back
  to queued for re-claim

This closes the reliability gap where a claim response lost in transit
leaves a task stuck in dispatched until the 60s dispatch timeout fires.
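
As an in-memory illustration of the lease protocol (the real implementation is SQL), the three operations could behave as follows. The struct, function names, and the 30-second TTL are assumptions made for the sketch.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"time"
)

// leasedTask is a hypothetical in-memory view of a queue row.
type leasedTask struct {
	Status         string
	ClaimToken     string
	ClaimExpiresAt time.Time
}

func newToken() string {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return hex.EncodeToString(b)
}

// claim moves a queued task to dispatched and issues a token with an expiry.
func claim(t *leasedTask, now time.Time, ttl time.Duration) (string, bool) {
	if t.Status != "queued" {
		return "", false
	}
	t.Status, t.ClaimToken, t.ClaimExpiresAt = "dispatched", newToken(), now.Add(ttl)
	return t.ClaimToken, true
}

// start succeeds only for the holder of the current token.
func start(t *leasedTask, token string) bool {
	if t.Status != "dispatched" || token == "" || token != t.ClaimToken {
		return false
	}
	t.Status = "running"
	return true
}

// requeueExpired returns a dispatched task with an expired lease to queued.
func requeueExpired(t *leasedTask, now time.Time) bool {
	if t.Status != "dispatched" || now.Before(t.ClaimExpiresAt) {
		return false
	}
	t.Status, t.ClaimToken = "queued", ""
	return true
}

// demoLease lets a first claim expire, re-claims, and tries both tokens.
func demoLease() (staleStarted, freshStarted bool) {
	now := time.Unix(0, 0)
	t := &leasedTask{Status: "queued"}
	stale, _ := claim(t, now, 30*time.Second)
	requeueExpired(t, now.Add(31*time.Second))
	fresh, _ := claim(t, now.Add(32*time.Second), 30*time.Second)
	return start(t, stale), start(t, fresh)
}

func main() {
	fmt.Println(demoLease())
}
```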

Co-authored-by: multica-agent <github@multica.ai>
2026-05-15 15:14:05 +08:00
Jiayuan Zhang
4d6b5ad06f fix(squad): wake leader when dual-role agent posts as worker (MUL-2218) (#2626)
* fix(squad): wake leader when dual-role agent posts as worker (MUL-2218)

The squad-leader self-trigger guard skipped a comment whenever the
author equalled the squad's leader id, regardless of the role the agent
was acting in. For an agent that holds both leader and worker roles in
the same squad, this meant the leader role never reacted to its own
worker output and the issue stalled.

Tag each enqueued task with is_leader_task and consult the agent's
most recent task on the issue from both self-trigger guards (comment
path + @squad mention path) — skip only when that task was itself a
leader task.

Co-authored-by: multica-agent <github@multica.ai>

* fix(squad): inherit is_leader_task on retry task clone (MUL-2218)

CreateRetryTask cloned a parent task into a fresh queued attempt but
omitted is_leader_task from the column list, so the child silently fell
back to the column default (false). For a leader task that hit auto-retry
through MaybeRetryFailedTask, the retried task posed as a worker task —
the self-trigger guard then no longer recognised the leader's own
comments, re-opening the very loop MUL-2218 closes.

Inherit p.is_leader_task in the clone and add a query-level test that
covers both leader and worker retries.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-14 15:23:36 +02:00
Bohan Jiang
63d215e1c3 feat(runtime): visibility (public/private) gate on CreateAgent / UpdateAgent (#2419)
* feat(runtime): visibility (public/private) gate on CreateAgent / UpdateAgent

Closes the hole where a plain workspace member could pick another member's
runtime in the Create Agent dialog and bind an agent to it — the backend
wasn't checking runtime ownership, so the agent ran on someone else's
hardware / tokens. Reported on GH #1804.

Schema
- Migration 083 adds agent_runtime.visibility ('private' default, 'public')
  with a CHECK constraint. Existing rows default to private — same
  ownership semantics as before, no behavior change for legacy data.

Backend
- canUseRuntimeForAgent predicate: allow when caller is workspace
  owner/admin, the runtime owner, or the runtime is public.
- CreateAgent and UpdateAgent both gate on it: UpdateAgent matters because
  a plain member could otherwise create on their own runtime, then re-bind
  to a private one.
- PATCH /api/runtimes/:id accepts { visibility } — owner/admin only,
  validated against the same private/public allow-list.
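
The predicate's three allow conditions can be written out as a short sketch. The struct fields, role strings, and function name below are assumptions for illustration; only the rule itself (workspace owner/admin, runtime owner, or public runtime) comes from the commit.

```go
package main

import "fmt"

// runtimeInfo is a hypothetical subset of an agent_runtime row.
type runtimeInfo struct {
	OwnerID    string
	Visibility string // "private" or "public"
}

// canUseRuntime allows workspace owners/admins, the runtime's owner, and
// anyone when the runtime is public.
func canUseRuntime(callerID, callerRole string, rt runtimeInfo) bool {
	if callerRole == "owner" || callerRole == "admin" {
		return true
	}
	if rt.OwnerID == callerID {
		return true
	}
	return rt.Visibility == "public"
}

func main() {
	private := runtimeInfo{OwnerID: "u1", Visibility: "private"}
	fmt.Println(canUseRuntime("u2", "member", private))
	fmt.Println(canUseRuntime("u1", "member", private))
}
```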

Frontend
- Create-agent dialog renders other-owned private runtimes disabled with a
  Lock badge + tooltip explaining who to ask.
- Inspector runtime-picker disables the same set so re-binding fails
  the same way at the UI layer.
- Runtime detail diagnostics gains a Visibility editor (owner/admin) or
  read-only chip (everyone else).
- Runtime list shows a private/public chip next to the name.

Tests
- Go: canUseRuntimeForAgent truth table; CreateAgent / UpdateAgent
  end-to-end gate tests (admin / runtime owner / plain member);
  PATCH visibility owner / admin / member / invalid-value coverage.
- Vitest: create-agent dialog disabled state on private/public runtimes,
  default-runtime selection skips locked rows; runtime detail visibility
  editor → mutation, read-only fallback.

Migrating runtimes: existing rows default to private to preserve the
"owner only" status quo. Owners switch to public via the detail page
diagnostics card.

Co-authored-by: multica-agent <github@multica.ai>

* fix(runtime): apply timezone+visibility atomically; don't seed locked template runtime

Two issues surfaced in review of MUL-2062:

1. PATCH /api/runtimes/:id ran the timezone branch first, which:
   - returned early on a tz no-op, silently dropping a concurrent
     `visibility` patch in the same body;
   - committed the timezone mutation (+ usage rollup rebuild) before
     validating visibility, so an invalid visibility left the row
     half-updated.

   Validate every field first, then run the mutations in order. The
   no-op short-circuit now only triggers when nothing else is requested.

2. The Create Agent dialog in duplicate mode unconditionally seeded
   `template.runtime_id` as the selected runtime, even when that runtime
   is now private and owned by someone else — the user saw a selected
   row they couldn't submit (Create → backend 403). Fall back to the
   first usable runtime when the template's runtime is locked, and gate
   the Create button on `selectedRuntimeLocked` as defense in depth.
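
The validate-first ordering from item 1 can be sketched as two phases over an optional-field patch. All type and function names here are hypothetical, and the in-memory row stands in for the database update.

```go
package main

import (
	"errors"
	"fmt"
	"time"
	_ "time/tzdata" // embed the IANA database for hosts without zoneinfo
)

type runtimeRow struct{ Timezone, Visibility string }

// runtimePatch uses nil to mean "field not present in the request body".
type runtimePatch struct{ Timezone, Visibility *string }

// applyPatch validates every requested field before mutating anything, so an
// invalid field can never leave the row half-updated.
func applyPatch(row *runtimeRow, p runtimePatch) error {
	if p.Timezone != nil {
		if _, err := time.LoadLocation(*p.Timezone); err != nil {
			return fmt.Errorf("invalid timezone: %w", err)
		}
	}
	if p.Visibility != nil && *p.Visibility != "private" && *p.Visibility != "public" {
		return errors.New("invalid visibility")
	}
	if p.Timezone != nil {
		row.Timezone = *p.Timezone
	}
	if p.Visibility != nil {
		row.Visibility = *p.Visibility
	}
	return nil
}

// tryPatch applies a patch to a fresh UTC/private row; "" means field absent.
func tryPatch(tz, vis string) (runtimeRow, bool) {
	row := runtimeRow{Timezone: "UTC", Visibility: "private"}
	var p runtimePatch
	if tz != "" {
		p.Timezone = &tz
	}
	if vis != "" {
		p.Visibility = &vis
	}
	err := applyPatch(&row, p)
	return row, err == nil
}

func main() {
	fmt.Println(tryPatch("Asia/Tokyo", "bogus"))
	fmt.Println(tryPatch("UTC", "public"))
}
```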

Tests:
- Go: TestUpdateAgentRuntime_CombinedPatchAppliesBoth (tz no-op +
  visibility flip), TestUpdateAgentRuntime_InvalidVisibilityDoesNotMutateTimezone
  (atomic-fail invariant).
- Vitest: duplicate template pointing at a locked runtime now seeds
  the first usable one; Create button stays disabled when no usable
  alternative exists.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-11 22:53:07 +08:00
Bohan Jiang
f5c2994aed feat(workspace): revoke a member's runtimes when they leave or are removed (#2401)
* feat(workspace): revoke a member's runtimes when they leave or are removed

Previously, leaving or being removed from a workspace only deleted the
member row — every runtime the departed user owned in that workspace
remained in the DB, kept its daemon_token valid, and stayed reachable to
the workspace's other members. The departed user lost access but their
machine kept doing work.

This change converges the runtime state in the same transaction as the
member-row deletion: agents pinned to those runtimes are archived,
in-flight tasks are cancelled (so the daemon's per-task status poller
interrupts the running agent gracefully), the runtimes are forced
offline, and the daemon_token rows are deleted. After commit the
DaemonTokenCache is invalidated and agent:archived / daemon:register
events fire so connected clients reconcile immediately.

Server-side state convergence is the production safety net; the
daemon_token revoke takes effect once the mdt_ flow is live (today most
daemons fall back to PAT/JWT, and the member-row deletion is what stops
those requests via requireWorkspaceMember).

Daemon-side handling (recognising the resulting 401/404 and tearing down
the local pairing for that workspace) lands in a follow-up.

Co-authored-by: multica-agent <github@multica.ai>

* fix(workspace): also cancel tasks for archived agents on member revoke

CancelAgentTasksByRuntime only matched tasks whose runtime_id was in the
revoked set, missing a real path: agent.runtime_id can be reassigned via
UpdateAgent, but agent_task_queue.runtime_id keeps the value from when
the task was queued. So an agent currently bound to the leaving member's
runtime gets archived correctly, but its older tasks still pinned to a
prior runtime stay 'queued' — and ClaimAgentTask does not gate on
agent.archived_at, so those orphaned tasks remain claimable by the
prior runtime.

Replace CancelAgentTasksByRuntime with CancelAgentTasksByRuntimeOrAgent,
which OR-matches runtime_ids and the archived agent IDs in one UPDATE.
Pass the archived agent IDs through from revokeAndRemoveMember.

Adds TestDeleteMember_CancelsTasksFromAgentReassignment as a regression
guard: same agent, two runtimes, the older task on the surviving runtime
must end up cancelled while the surviving runtime stays online.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-11 15:06:50 +08:00
Multica Eve
d6349c16ec feat(runtime): per-runtime timezone for token-usage aggregation (MUL-1950) (#2394)
* feat: per-runtime timezone for token usage aggregation

The runtime token-usage charts (daily and hourly tabs on the
runtime-detail page) bucketed every event by the Postgres session
timezone, which is UTC in production. For an operator in UTC+8 that
meant a Tuesday afternoon's tasks landed in Tuesday early-morning's
bar — the chart was always one off.

Fix: store an IANA timezone on agent_runtime and aggregate under it.

* migrations 081 / 082 add agent_runtime.timezone (TEXT NOT NULL
  DEFAULT 'UTC') and rebuild the rollup pipeline (window function
  and both trigger functions) to compute bucket_date with
  AT TIME ZONE rt.timezone instead of bare DATE().
* No historical backfill — task_usage_daily rows already on disk
  keep their UTC bucket_date; only future writes / re-touches
  recompute under the new tz. (Product call from MUL-1950: 'guarantee
  future correctness'.)
* runtime_usage.sql gains a @tz parameter on ListRuntimeUsage and
  GetRuntimeUsageByHour and threads tz through
  GetRuntimeTaskHourlyActivity. ListRuntimeUsageDaily reads bucket_date
  as-is since the rollup already wrote it in tz.
* parseSinceParamInTZ replaces the raw N×24h cutoff with start-of-
  day-N in the runtime's tz so 'last 7 days' lines up with bucket
  boundaries.
* Daemon registration sends the host's IANA tz (TZ env, then
  time.Local), and UpsertAgentRuntime preserves any user override
  via a CASE-on-existing-value pattern so a daemon reconnect can't
  silently revert the operator's setting.
* New PATCH /api/runtimes/:id endpoint (UpdateAgentRuntime) lets
  the runtime detail page edit the tz; the editor seeds with the
  browser tz on first interaction.
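
The start-of-day cutoff can be illustrated with `time.Date`, which normalises out-of-range day values. This is a sketch rather than the actual parseSinceParamInTZ: the function names are hypothetical, and reading "start-of-day-N" as midnight N calendar days before today in the given zone is an assumption.

```go
package main

import (
	"fmt"
	"time"
	_ "time/tzdata" // embed the IANA database for hosts without zoneinfo
)

// startOfDayNAgo returns local midnight `days` calendar days before now in loc.
func startOfDayNAgo(now time.Time, days int, loc *time.Location) time.Time {
	local := now.In(loc)
	// time.Date normalises Day()-days across month and year boundaries.
	return time.Date(local.Year(), local.Month(), local.Day()-days, 0, 0, 0, 0, loc)
}

// cutoffRFC3339 is a string-in, string-out wrapper for quick experiments.
func cutoffRFC3339(now string, days int, tz string) (string, error) {
	t, err := time.Parse(time.RFC3339, now)
	if err != nil {
		return "", err
	}
	loc, err := time.LoadLocation(tz)
	if err != nil {
		return "", err
	}
	return startOfDayNAgo(t, days, loc).Format(time.RFC3339), nil
}

func main() {
	// 18:00 UTC on May 10 is already 02:00 on May 11 in Asia/Shanghai.
	fmt.Println(cutoffRFC3339("2026-05-10T18:00:00Z", 1, "Asia/Shanghai"))
}
```

A raw 24h cutoff for the same input would land at 02:00 local time on May 10, in the middle of a daily bucket.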

Refs: MUL-1950

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: multica-agent <github@multica.ai>

* fix: harden runtime timezone rollups

Co-authored-by: multica-agent <github@multica.ai>

* fix: address runtime timezone review nits

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica.ai>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: multica-agent <github@multica.ai>
Co-authored-by: Eve <eve@multica-ai.local>
2026-05-11 14:39:35 +08:00
Multica Eve
ce00e05169 Add canonical PostHog core metrics events (#2302)
* Add canonical PostHog core metrics events

Co-authored-by: multica-agent <github@multica.ai>

* Address analytics review feedback

Co-authored-by: multica-agent <github@multica.ai>

* Tighten analytics review follow-ups

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Devv <devv@Devvs-Mac-mini.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-09 13:12:00 +08:00
LinYushen
cc527c34be perf(heartbeat): batch runtime last_seen_at writes (#2213)
Batches runtime heartbeat last_seen_at updates while preserving the 60s flush / 150s sweeper stale-window invariant. Also drains pending heartbeat writes during graceful shutdown.
2026-05-07 15:50:27 +08:00
LinYushen
250ada1fb3 chore(db): drop unused agent_task_queue.last_heartbeat_at (#2212)
Drops the unused agent_task_queue.last_heartbeat_at column and removes the hot-path task heartbeat write.
2026-05-07 15:45:29 +08:00
Bohan Jiang
09f04847d3 feat(server): redis-backed runtime liveness with DB fallback (#2121) 2026-05-06 14:31:33 +08:00
Bohan Jiang
b1345685a3 fix(task): rerun starts a fresh session, skip poisoned resume (#1928)
* fix(task): rerun starts a fresh session, skip poisoned resume

When a task ended in a known agent fallback ("I reached the iteration
limit and couldn't generate a summary.", "Put your final update inside
the content string. Keep it concise.") the (agent_id, issue_id) resume
lookup would still pick that session, so a manual rerun inherited the
poisoned state and reproduced the same bad output.

Two complementary guards:

1. Daemon classifies poisoned terminal output and routes it through the
   blocked path with failure_reason set ('iteration_limit' /
   'agent_fallback_message'). GetLastTaskSession excludes failed tasks
   with those reasons, so even comment-triggered tasks no longer resume
   them. Tasks that failed mid-flight (timeout, runtime_recovery, etc.)
   are still resumable, preserving MUL-1128's auto-retry contract.

2. Manual rerun marks the new task force_fresh_session=true. The daemon
   claim handler skips the resume lookup entirely when the flag is set,
   capturing the user-intent signal that "the prior output was bad" even
   when poisoned classification misses a future fallback wording.

Auto-retry of orphaned mid-flight failures (MaybeRetryFailedTask →
CreateRetryTask) does not take this path, so it keeps resuming.

Tests: classifyPoisonedOutput unit test; integration tests assert the
SQL filter excludes poisoned classifiers, RerunIssue flips the flag,
and the normal enqueue path leaves it false.

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): cap poisoned-output matcher to short trimmed text

GPT-Boy review on MUL-1630: the previous strings.Contains match would
classify any output that quoted the marker substring — including a
review/analysis that simply discussed the marker itself. Real fallback
messages are short single-sentence affairs, so cap the candidate at
~one paragraph and trim whitespace before matching. Adds regression
tests covering a long quoting review and a marker buried in a long
real conclusion; both must stay classified as completed.

Co-authored-by: multica-agent <github@multica.ai>

* fix(migrations): rename 065 force_fresh_session → 066 to clear collision

main introduced 065_project_resources after this branch was cut, so
both files shared the 065_ prefix. The readiness check
(server/cmd/server/health.go → migrations.LatestVersion) takes the
last entry by lexical order, which is 065_project_resources, leaving
this branch's 065_force_fresh_session unguarded — a deploy that
applied project_resources but not force_fresh_session would still
report ready, and the next enqueue / rerun / claim would crash on
"column force_fresh_session does not exist".

Renaming to 066_force_fresh_session puts it strictly after
project_resources so readiness blocks until it's applied.
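
The collision can be reproduced with a few lines, assuming the readiness check simply takes the lexically last migration name. latestVersion here is a hypothetical stand-in for migrations.LatestVersion.

```go
package main

import (
	"fmt"
	"sort"
)

// latestVersion mimics a check that takes the last name in lexical order.
// Hypothetical stand-in, not the repository's function.
func latestVersion(names []string) string {
	if len(names) == 0 {
		return ""
	}
	sorted := append([]string(nil), names...)
	sort.Strings(sorted)
	return sorted[len(sorted)-1]
}

func main() {
	before := []string{"065_force_fresh_session", "065_project_resources"}
	after := []string{"065_project_resources", "066_force_fresh_session"}
	fmt.Println("before rename:", latestVersion(before))
	fmt.Println("after rename: ", latestVersion(after))
}
```

Before the rename the guard only ever waits for 065_project_resources, because "f" sorts before "p"; after it, the force_fresh_session migration is the one readiness waits on.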

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-04-30 14:17:53 +08:00
Naiyuan Qing
f745a3bbbe feat(agent): presence v3 + execution log + trigger summary (#1823)
* refactor(views): migrate agent/runtime/skill lists to TanStack DataTable

Replace the per-page CSS Grid + minmax(min, fr) + sticky-first-col + truncate
implementation with a TanStack Table backend rendered through a Dice UI-style
DataTable shell. Column widths are now px-based via column.size, so cells
no longer shrink or auto-truncate as the viewport narrows; when the sum of
columns exceeds the viewport, the container scrolls horizontally instead.

- Add @tanstack/react-table to the catalog (8.21.3) and wire it into
  packages/ui (dep) and packages/views (peerDep).
- packages/ui: new DataTable + DataTableColumnHeader + lib/data-table.ts
  (getColumnPinningStyle), adapted from Dice UI's registry. The shell
  renders <table> directly (skipping shadcn's <Table> wrapper) so its own
  outer overflow controls both axes — no nested overflow conflicts.
- packages/views: each list now declares ColumnDef[] with explicit
  cell renderers. Row click navigates to detail via onRowClick (instead of
  wrapping <tr> in <a>, which is invalid HTML); kebab dropdowns
  stopPropagation so they don't trigger the row navigation.
- Drop the previous AGENT_LIST_GRID / GRID_WITH_OWNER / ROW_GRID
  templates and the sticky-first-col / subgrid mechanics that came with
  them. agent-list-item.tsx is removed; runtime-list.tsx and
  skills-page.tsx are trimmed to thin wrappers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(agent): cap description at 255 chars (db + api + ui)

Symmetric enforcement across DB, server, and UI:

- Migration 060: pre-flight truncate of any oversize rows, then ADD
  CONSTRAINT NOT VALID + VALIDATE CONSTRAINT so the new check doesn't
  block writes during validation.
- Server handler validates utf8.RuneCountInString on Create/Update and
  rejects over-limit input with 400.
- Front-end gets AGENT_DESCRIPTION_MAX_LENGTH in core/agents/constants
  (single source of truth shared by the create dialog + edit modal +
  test suite) and a CharCounter component that warns at 90% and errors
  past the cap.
- Description editor moves from a 288px popover to a roomy modal.
  Editor body is mounted only while the dialog is open, so the local
  draft state is locked in at mount time and never reset by an external
  WS update — the React-recommended replacement for the
  useEffect(reset, [value]) anti-pattern.

Counted in code points everywhere (rune count / spread length /
char_length) so multibyte input agrees across all three layers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(views): data-table polish across runtime + skill lists

Builds on the DataTable migration in 2be0f287:

- Add ColumnMeta.grow flag — declared via TanStack module augmentation
  in ui/lib/data-table.ts. Columns marked meta.grow skip their inline
  width so fixed table-layout assigns them the leftover container space
  (no spacer column). The Title-grows / others-fixed pattern from
  Linear / GitHub PR rows.
- Authoritative table min-width = sum of column.size, applied to the
  <table> itself (fixed-layout ignores cell-level min-width per spec,
  so the floor has to live on the table).
- Header tightens to h-8 + uppercase + tracking-wider; pinned cells
  switch to opaque bg + group-hover so they cover content scrolling
  beneath them and follow row hover state.
- Toolbar slot removed from DataTable (callers wrap the toolbar
  themselves now — keeps DataTable single-purpose).

Also: hover-card popup stops contextmenu / auxclick / dblclick from
bubbling out (in addition to click). Stops the popup from triggering
ancestor handlers (e.g. issue list rows) on right-click / middle-click
without breaking Base UI's outside-click dismiss, which listens to
pointerdown — pointerdown is deliberately NOT stopped.

Runtime + skill list pages updated to use the new sizing model.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(agent): drop LastTaskState, introduce 3-state Workload

Continues the presence-model rework started in #1794 / #1798.

The previous LastTaskState union (running / completed / failed /
cancelled / idle) carried historical outcome at the list level — a
runtime-healthy agent whose last task failed showed a sticky red dot
indistinguishable from a daemon-dead agent.

New model: presence is two orthogonal "right-now" dimensions:

  AgentAvailability — runtime reachability only (online / unstable /
                      offline). Drives the dot colour everywhere.
  Workload          — current load (working / queued / idle). Three
                      states, never historical. Failure / completion /
                      cancellation are surfaced via Recent Work + Inbox,
                      not list-level state.

`queued` (= nothing running, ≥1 queued) is an honest "stuck on offline
runtime" signal. To avoid amber flashes during the brief enqueue→claim
race on healthy runtimes, the queued chip composes with availability:
muted on online, warning amber otherwise.
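
The model is implemented in the front-end, but the derivation can be sketched in Go for illustration. All names below are hypothetical, and treating "working" as "at least one running task" is an assumption.

```go
package main

import "fmt"

type Workload string
type Availability string

const (
	Working Workload = "working"
	Queued  Workload = "queued"
	Idle    Workload = "idle"

	Online   Availability = "online"
	Unstable Availability = "unstable"
	Offline  Availability = "offline"
)

// deriveWorkload looks only at current load, never at past outcomes.
func deriveWorkload(running, queued int) Workload {
	switch {
	case running > 0:
		return Working
	case queued > 0:
		return Queued
	default:
		return Idle
	}
}

// queuedChipTone composes the queued chip with availability:
// muted on a healthy runtime, warning otherwise.
func queuedChipTone(a Availability) string {
	if a == Online {
		return "muted"
	}
	return "warning"
}

func main() {
	fmt.Println(deriveWorkload(0, 2), queuedChipTone(Offline))
	fmt.Println(deriveWorkload(0, 1), queuedChipTone(Online))
}
```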

Activity tab cleanup that follows from the new model:
  - failureReasonLabel relocated from agents/presence.ts to
    tabs/task-failure.ts (presence no longer owns historical state).
  - Recent Work paginates (5 initial, +20 per "Show more"); chat-session
    tasks are filtered out of every Agent-scoped surface to keep
    "team work" separate from private chat.
  - Agents page drops the lastTaskFilter chip group; users find broken
    agents via Inbox / Recent Work, not a list-level filter.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(task): trigger summary snapshot + task:queued lifecycle event

Two task-lifecycle improvements that ship together because they share
the same enqueue/retry hot paths and changes interleave inside task.go:

1. trigger_summary snapshot (migration 061)

   New nullable column on agent_task_queue. Comment-triggered tasks
   snapshot the comment content; autopilot tasks snapshot the run title.
   Truncated to 200 runes via strings.Builder so multibyte input counts
   correctly without O(N²) concatenation. Snapshot survives source
   edits/deletes — every task row self-describes across surfaces (issue
   detail Execution log, agent activity tooltip, inbox) without joining
   back to the originating row.

   Retry rows inherit the parent's snapshot (CreateRetryTask SELECT) so
   the description stays meaningful across attempts. The UI is
   responsible for stacking "Retry #N" context on top.

2. task:queued WS event

   New protocol event covering the ∅ → queued transition. Front-end
   types/events.ts registers it; use-realtime-sync's task: prefix path
   already invalidates task caches via onAny, so old clients without
   this exact-match subscription still refresh correctly. Specific
   subscribers (sticky banner) get sub-second updates instead of
   waiting for daemon claim.

   Retry path now broadcasts task:queued (not task:dispatch) — same
   status transition shape as enqueue, so all "new task created" paths
   agree on one event type.

   Ordering: broadcastTaskEvent runs *before* notifyTaskAvailable so
   the queued event is published into the WS bus before the daemon is
   poked. Without this, a fast daemon could claim and emit task:dispatch
   over the wire before the in-process queued broadcast fan-out reached
   clients — race window is tiny but unsafe-by-construction.

   Per-agent task list (agentTasksKeys.all) and per-issue task list
   (["issues","tasks"]) added to the task: invalidation set so Activity
   tab Recent Work and the Execution log section stay fresh.

Type contracts: AgentTask gains parent_task_id / attempt /
trigger_comment_id (already returned by the API, just missing from TS)
plus the new trigger_summary field.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(issue): ExecutionLogSection — unified active+past runs panel

Replaces two pieces:
  - the click-to-expand timeline that lived inside AgentLiveCard
  - the standalone TaskRunHistory below the main content

with a single right-panel section that lists every agent run for the
issue. Active runs sit at the top (always visible when present); past
runs collapse behind a "Show past runs (N)" toggle, sorted failed →
cancelled → completed within group.

Active rows show the trigger summary, status + relative time, and
Cancel / Transcript actions on hover (gradient backdrop fades the
status text rather than hard-clipping). Past rows show the same
shape minus Cancel.

Retry tasks prepend "Retry #N · " to the inherited summary so they're
distinguishable from their parent (which would otherwise share the
exact same trigger text).

Cache key registered as issueKeys.tasks(issueId); the global
useRealtimeSync task: prefix path already invalidates ["issues","tasks"]
on every task lifecycle event, so the section stays fresh without
local WS subscriptions.

AgentLiveCard slims down to a header-only "agent is working" sticky
banner — keeps the at-a-glance "is anyone working on this right now"
signal and the Stop / Transcript actions, drops the inline timeline
that ExecutionLogSection now owns. Subscribes to both task:queued and
task:dispatch so retries (which only emit queued) land in the banner
without waiting for daemon claim.

issue-detail mounts ExecutionLogSection in the right panel and removes
the now-defunct TaskRunHistory call site.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 14:50:58 +08:00
LinYushen
0b1333fb00 feat(server): orphan-task recovery + auto-retry + manual rerun (MUL-1128) (#1476)
* feat(server): orphan-task recovery + auto-retry + manual rerun (MUL-1128)

When the daemon process crashed mid-task the issue was stuck at
in_progress for up to 2.5h: the in-flight task timeout was the only
mechanism that ever moved the row, and the runtime heartbeat sweeper
only fires after the runtime stays offline for 45s — a quick restart
beats both windows.

This change implements the A+B plan from the issue thread:

A. lifecycle hygiene
- migration 055 adds attempt / max_attempts / parent_task_id /
  failure_reason / last_heartbeat_at to agent_task_queue
- new daemon-auth endpoint POST /runtimes/{id}/recover-orphans:
  daemon calls it on every register so the server fails any
  dispatched/running tasks the previous process left behind
- new daemon-auth endpoint POST /tasks/{id}/session: persists the
  agent's session_id + work_dir mid-flight so a crash doesn't
  lose the resume pointer (claude+codex emit MessageStatus with
  SessionID; daemon forwards on the first one it sees)
- FailAgentTask / FailStaleTasks / FailTasksForOfflineRuntimes
  now set failure_reason ('agent_error' / 'timeout' /
  'runtime_offline')

B. auto-retry with resume context
- TaskService.MaybeRetryFailedTask spawns a fresh queued attempt
  carrying parent's session_id/work_dir when the failure reason
  is infrastructure-shaped (timeout, runtime_offline,
  runtime_recovery) and attempt < max_attempts; skips autopilot
- wired into the runtime sweeper paths and TaskService.FailTask
  so the user transparently sees a new in_progress run instead of
  a stuck row
- new user-auth POST /api/issues/{id}/rerun + multica issue rerun
  CLI for the manual escape hatch
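
The retry decision in part B reduces to a small predicate. The sketch below uses a hypothetical failedTask struct and function name; the retryable reasons, the attempt bound, and the autopilot exclusion are the ones listed above.

```go
package main

import "fmt"

// failedTask carries just the fields the decision needs (hypothetical shape).
type failedTask struct {
	FailureReason string
	Attempt       int
	MaxAttempts   int
	IsAutopilot   bool
}

// retryableReasons are the infrastructure-shaped failures named in the commit.
var retryableReasons = map[string]bool{
	"timeout":          true,
	"runtime_offline":  true,
	"runtime_recovery": true,
}

// shouldAutoRetry reports whether a fresh queued attempt should be spawned.
func shouldAutoRetry(t failedTask) bool {
	if t.IsAutopilot {
		return false
	}
	return retryableReasons[t.FailureReason] && t.Attempt < t.MaxAttempts
}

func main() {
	fmt.Println(shouldAutoRetry(failedTask{FailureReason: "timeout", Attempt: 1, MaxAttempts: 2}))
	fmt.Println(shouldAutoRetry(failedTask{FailureReason: "agent_error", Attempt: 1, MaxAttempts: 2}))
}
```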

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(server): address PR review for orphan-task recovery (MUL-1128)

Three review-must-fix items on top of the A+B implementation:

1. recover-orphans now funnels through TaskService.HandleFailedTasks,
   the same shared post-failure pipeline used by the runtime sweeper.
   This guarantees task:failed events are emitted, agent status is
   reconciled, and issues stuck in_progress with no remaining active
   task are reset to todo even when no auto-retry is created
   (max_attempts exhausted, autopilot, non-retryable reason).

2. RerunIssue now uses CancelAgentTasksByIssueAndAgent, scoped to the
   issue's current assignee. The previous implementation called
   CancelAgentTasksByIssue, which would collateral-cancel parallel
   @-mention agents on the same issue.

3. GetLastTaskSession now considers both completed and failed tasks
   (mirroring GetLastChatTaskSession), ordering by the most recent
   timestamp. With UpdateAgentTaskSession pinning session_id/work_dir
   mid-flight, an auto-retry or manual rerun of a daemon-crash failure
   now actually resumes the prior conversation context instead of
   starting fresh — matching the stated B-branch behaviour.

go build / go vet pass; the existing service and agent test suites pass.
runtime_sweeper / handler integration tests require a local DB with the
055 migration (and the pre-existing 050 first_executed_at column).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-22 13:08:37 +08:00
devv-eve
637bdc8eb3 feat(analytics): full PostHog pipeline + 6 funnel events (MUL-1122) (#1367)
* feat(analytics): add PostHog client with async batch shipping

Introduces server/internal/analytics, the shipping layer for the product
funnel defined in docs/analytics.md. Capture is non-blocking — events are
enqueued into a bounded channel and a background worker batches them to
PostHog's /batch/ endpoint. A broken backend drops events rather than
blocking request handlers.
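
The non-blocking enqueue can be illustrated with a bounded channel and a select with a default branch. This is a sketch with hypothetical type and method names, and it omits the HTTP shipping to /batch/.

```go
package main

import "fmt"

// event and asyncClient are hypothetical types for illustration.
type event struct{ Name string }

type asyncClient struct{ queue chan event }

func newAsyncClient(size int) *asyncClient {
	return &asyncClient{queue: make(chan event, size)}
}

// Capture never blocks: when the buffer is full the event is dropped
// and false is returned.
func (c *asyncClient) Capture(e event) bool {
	select {
	case c.queue <- e:
		return true
	default:
		return false
	}
}

// nextBatch pulls up to max queued events without blocking; a background
// worker would send each batch to the analytics backend.
func (c *asyncClient) nextBatch(max int) []event {
	batch := make([]event, 0, max)
	for len(batch) < max {
		select {
		case e := <-c.queue:
			batch = append(batch, e)
		default:
			return batch
		}
	}
	return batch
}

func main() {
	c := newAsyncClient(2)
	for _, name := range []string{"signup", "workspace_created", "overflow"} {
		fmt.Println(name, "accepted:", c.Capture(event{Name: name}))
	}
	fmt.Println("batch size:", len(c.nextBatch(10)))
}
```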

Local dev and self-hosted instances run a noop client until the operator
sets POSTHOG_API_KEY. This is PR 1 of MUL-1122; signup and workspace_created
emission land in the follow-up commit so this change is independently
reviewable.

* feat(server): emit signup and workspace_created analytics events

Wires analytics.Client through handler.New and main, then emits the first
two funnel events:

- signup fires from findOrCreateUser (which now reports isNew), covering
  both the verification-code and Google OAuth entry points — a single
  emission site guarantees Google signups aren't missed.
- workspace_created fires after the CreateWorkspace transaction commits,
  with is_first_workspace computed from a post-commit ListWorkspaces count
  so we can distinguish fresh-user activation from returning-user
  expansion.

Tests use analytics.NoopClient so nothing ships from test runs. PR 1 of
MUL-1122; runtime_registered and issue_executed follow in later PRs per
the plan.

* refactor(analytics): drop is_first_workspace from workspace_created

Stamping "is this the user's first workspace?" at emit time races under
concurrent CreateWorkspace requests: two transactions committing close
together can both read a post-commit count greater than one and both emit
false. Fixing it at the SQL layer requires a schema change we don't want in
PR 1.

PostHog answers the same question exactly from the event stream (funnel on
"first time user does X" / cohort on $initial_event), so removing the
property loses no information and makes the emit side race-free.

* docs(analytics): document self-host safety defaults

Spell out why self-hosted instances never ship events upstream by default
(empty POSTHOG_API_KEY → noop client) and explain how operators can point
at their own PostHog project without any code change.

* feat(analytics): emit runtime_registered, issue_executed, team_invite_*

Three server-side funnel events, all gated on first-time state transitions
so retries and re-runs don't inflate the WAW buckets:

- runtime_registered fires from DaemonRegister when UpsertAgentRuntime
  reports (xmax = 0) — i.e. the row was inserted, not updated. Heartbeats
  and re-registrations stay silent.
- issue_executed fires from CompleteTask after an atomic
  UPDATE issue SET first_executed_at = now() WHERE id = $1 AND
  first_executed_at IS NULL flips the column for the first time. Retries,
  re-assignments, and comment-triggered follow-up tasks hit the WHERE
  clause and no-op. Carries nth_issue_for_workspace so the ≥1/≥2/≥5/≥10
  buckets filter without extra queries.
- team_invite_sent fires from CreateInvitation and team_invite_accepted
  from AcceptInvitation, closing the expansion funnel.

Adds a 050 migration for issue.first_executed_at plus a partial index so
the workspace-scoped executed-count query doesn't scan the never-executed
tail.

* feat(config): surface PostHog key via /api/config

Extends AppConfig with posthog_key / posthog_host sourced from env on
every request (so operators can rotate the key via secret refresh without
a restart). Reading the key off the server — rather than baking it into
the frontend bundle via NEXT_PUBLIC_* — means self-hosted instances
inherit the blank key automatically and never ship events upstream.

* feat(analytics): wire posthog-js identify + UTM capture on the client

Adds @multica/core/analytics — a thin wrapper around posthog-js that owns
attribution capture and identity merge. Posthog-js config comes from
/api/config (not NEXT_PUBLIC_*), so self-hosted instances whose server
returns an empty key automatically run the SDK inert.

captureSignupSource stamps a multica_signup_source cookie with UTM params
and the referrer's origin (never the full referrer — that can leak OAuth
code/state in the callback URL). The backend signup event reads this
cookie on new-user creation.

Identity flows:
- auth-initializer fires identify() right after getMe() resolves, on both
  cookie and token paths. A getConfig/getMe race is handled by buffering
  a pending identify inside the analytics module and flushing it once
  initAnalytics finishes.
- auth store calls identify() on verifyCode / loginWithGoogle /
  loginWithToken and resetAnalytics() on logout so the next login merges
  cleanly without bleeding events.

* docs(analytics): describe runtime_registered, issue_executed, invite events

Fills in the schema for the remaining funnel events. Captures the
design commentary that belongs next to the contract rather than in a PR
description — in particular why issue_executed uses the atomic
first_executed_at flip instead of counting task-terminal events, and why
runtime_registered relies on xmax = 0 rather than a query-then-write.

* fix(analytics): drop non-atomic nth_issue_for_workspace from issue_executed

Computing the workspace's Nth-issue ordinal at emit time is not atomic
under concurrent first-completions — two transactions can both run
MarkIssueFirstExecuted, then both run CountExecutedIssuesInWorkspace, and
both observe count=1 before either has committed, so both events go out
stamped as n=1. Serialising it would mean a per-workspace advisory lock
or a SERIALIZABLE-isolated tx; PostHog answers the same question exactly
at query time via row_number() partitioned by workspace_id, so the
emit-time property adds risk without adding information.

Removes the property from analytics.IssueExecuted, deletes the unused
CountExecutedIssuesInWorkspace query, and regenerates sqlc. The partial
index stays — any future workspace-scoped executed-issue query will want
it.

* fix(analytics): wire $pageview and harden signup_source cookie payload

Two frontend fixes from the PR review:

- PageviewTracker, mounted under WebProviders, fires capturePageview on
  every Next.js App Router path / query-string change. Without this the
  capturePageview helper in @multica/core/analytics was never called and
  the acquisition funnel's / → signup step was empty.
- captureSignupSource now caps each UTM / referrer value at 96 chars
  *before* JSON.stringify, and drops the whole cookie when the serialised
  payload still exceeds 512 chars. Previously the overall slice(0, 256)
  could leave a half-JSON string on the wire that neither the backend nor
  PostHog could parse.

Both capturePageview and identify now buffer a single pending call when
fired before initAnalytics resolves — otherwise the initial "/" pageview
and same-turn login identify race the /api/config fetch and get dropped.
resetAnalytics clears both buffers so a logout→login cycle stays clean.

* fix(analytics): URL-decode signup_source cookie on read

Go does not URL-decode Cookie.Value automatically, so the frontend's
JSON-then-encodeURIComponent payload was landing in PostHog as
percent-encoded garbage (%7B%22utm_source...). Unescape on read so the
backend receives the original JSON string the frontend intended, and
drop values that fail to decode or exceed the server-side cap — sending
truncated garbage is worse than sending nothing. Oversized-cookie guard
matches the frontend's SIGNUP_SOURCE_MAX_LEN.
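
A minimal sketch of the decode-on-read step is shown below. The function name, the 512 value for the cap, and applying the cap to the raw cookie value are assumptions; url.PathUnescape is used because it reverses percent-encoding without treating "+" as a space.

```go
package main

import (
	"fmt"
	"net/url"
)

// signupSourceMaxLen is assumed to match the frontend cap.
const signupSourceMaxLen = 512

// decodeSignupSource returns the original JSON string, or false when the
// value is empty, oversized, or not valid percent-encoding.
func decodeSignupSource(raw string) (string, bool) {
	if raw == "" || len(raw) > signupSourceMaxLen {
		return "", false
	}
	decoded, err := url.PathUnescape(raw)
	if err != nil {
		return "", false
	}
	return decoded, true
}

func main() {
	fmt.Println(decodeSignupSource("%7B%22utm_source%22%3A%22x%22%7D"))
	fmt.Println(decodeSignupSource("%zz"))
}
```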

* docs(analytics): reflect nth-issue drop, $pageview wiring, cookie encoding

Pulls the schema doc back in line with the code: issue_executed no longer
advertises nth_issue_for_workspace (with a note about why PostHog derives
it at query time instead), the frontend $pageview section names the
actual PageviewTracker component that fires it, and the signup_source
section documents the per-value cap / overall drop rule and the
encode-on-write / decode-on-read contract.

---------

Co-authored-by: Jiang Bohan <bhjiang@outlook.com>
2026-04-21 14:42:52 +08:00
Bohan Jiang
a73336dcf8 feat(daemon): persistent UUID identity + legacy-id merge at register-time (#1220)
* feat(daemon): persistent UUID identity + legacy-id merge at register-time

daemon_id is now a stable UUID persisted to `<profile-dir>/daemon.id` on
first start, replacing the hostname-derived id that drifted whenever
`.local` appeared/disappeared, a system was renamed, or a profile
switched — each of which used to mint a fresh `agent_runtime` row and
strand agents on the old one.

To migrate existing installs without operator intervention, the daemon
reports every legacy id it may have registered under previously
(`host`, `host` with `.local` stripped, and `host[-profile]` variants
for both). At register-time the server looks up each candidate row
scoped to (workspace, provider), re-points its agents and tasks onto
the new UUID-keyed row, records which legacy id was subsumed in the
new `legacy_daemon_id` column for audit, and deletes the stale row.
Result: users running `xxx.local`-keyed runtimes today transparently
land on the new UUID row on next daemon restart.

The hostname-prefix `MigrateAgentsToRuntime` / `daemon_id LIKE '...-%'`
compatibility shim is no longer needed and has been removed along with
the handler call that invoked it.

* fix(daemon): handle bidirectional .local drift and case drift in legacy merge

Review on #1220 flagged two gaps in the legacy-id migration candidate set:

1. Reverse .local: LegacyDaemonIDs only added the stripped variant when the
   current hostname ended in `.local`. The opposite direction — DB has
   `foo.local`, current host is `foo` — was missed, so runtimes registered
   under the `.local` variant stayed orphaned after upgrade. Now both
   variants (`foo` and `foo.local`) are always emitted, regardless of what
   `os.Hostname()` currently returns, plus their `-<profile>` suffix forms.

2. Case drift: os.Hostname() has been observed returning different casings
   on the same machine across mDNS/reboot state. A case-sensitive `=`
   comparison stranded rows like `Jiayuans-MacBook-Pro.local` when the
   daemon later reported `jiayuans-macbook-pro.local`. FindLegacyRuntimeByDaemonID
   now uses `LOWER(daemon_id) = LOWER(@daemon_id)` on both sides, so casing
   differences merge rather than orphan. The (workspace_id, provider) prefix
   still bounds the scan to a tiny set of rows so the non-indexed LOWER()
   comparison has negligible cost.

Tests: TestLegacyDaemonIDs gets the mixed-case + reverse-direction cases;
daemon_test.go adds TestDaemonRegister_MergesLegacyDaemonIDRuntime_ReverseDotLocal
and TestDaemonRegister_MergesLegacyDaemonIDRuntime_CaseDrift.

* fix(daemon): consolidate every case-duplicate legacy runtime, not just the first

Follow-up review on #1220: after switching to `LOWER(daemon_id) =
LOWER(@daemon_id)`, the single-row lookup still only merged one legacy
row per candidate. If a machine already had two rows in the DB that
differed only in casing (e.g. `Jiayuans-MacBook-Pro.local` AND
`jiayuans-macbook-pro.local` coexisting because earlier hostname drift
already minted a duplicate), only one of them got consolidated and the
other stayed orphaned — violating the "no duplicate runtime per machine
after backfill" acceptance.

- FindLegacyRuntimeByDaemonID → FindLegacyRuntimesByDaemonID (:many)
- mergeLegacyRuntimes iterates every returned row and dedupes across
  overlapping legacy candidates so `foo` and `foo.local` both resolving
  to the same stored row don't double-process

Test: TestDaemonRegister_MergesAllCaseDuplicateLegacyRuntimes seeds two
case-duplicate rows with one agent each and confirms both rows are
deleted and both agents end up on the new UUID-keyed row.
2026-04-17 15:10:38 +08:00
Jiayuan Zhang
ff5f6ac2ee fix(daemon): prevent duplicate runtime registration on profile switch (#906)
* fix(daemon): prevent duplicate runtime registration on profile switch

The daemon_id included a profile name suffix (e.g. "hostname-staging"),
so switching profiles created a new daemon_id that bypassed the UPSERT
dedup constraint, leaving orphaned runtime records in the database.

Three changes:
- Remove profile suffix from daemon_id — use stable hostname only.
  The unique constraint (workspace_id, daemon_id, provider) already
  prevents collisions within the same workspace.
- Auto-migrate agents from old offline runtimes to the newly registered
  runtime during DaemonRegister (same workspace/provider/owner).
- Add TTL-based GC in the runtime sweeper to delete offline runtimes
  with no active agents after 7 days.

Closes MUL-695

* fix(daemon): address code review issues on PR #906

1. Move gcRuntimes() to the main sweep loop — previously it was inside
   sweepStaleRuntimes() after an early return, so it only ran when new
   runtimes were marked stale. Now it runs every sweep cycle independently.

2. Fix DeleteStaleOfflineRuntimes to exclude runtimes with ANY agent
   reference (not just active ones). The FK agent.runtime_id is ON DELETE
   RESTRICT, so archived agents also block deletion.

3. Scope MigrateAgentsToRuntime to the same machine by matching
   daemon_id LIKE '<current_daemon_id>-%'. This prevents cross-machine
   agent migration when the same user has multiple devices.
2026-04-14 01:52:34 +08:00
Bohan Jiang
6209e2f3ae fix(server): allow deleting runtimes when all bound agents are archived (#589)
Previously, runtimes could never be deleted once an agent was created
because agents can only be archived (not deleted) and the count check
included archived agents. Now the check only counts active agents, and
archived agents are cleaned up before runtime deletion.
2026-04-09 19:17:54 +08:00
LinYushen
ff27a249cc feat(runtime): add owner tracking, filtering, and delete (#535)
Add owner_id to agent_runtime table to track who registered each runtime.
Backend: new delete endpoint with role-based permissions (owner/admin can
delete any, members only their own), list filtering by owner (?owner=me),
and agent dependency check before deletion.
Frontend: Mine/All filter toggle in runtime list, owner display in list
items and detail view, delete button with AlertDialog confirmation.
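
The delete permission rule can be written as a small predicate. The sketch below assumes role strings "owner", "admin", and "member" and a hypothetical function name; it leaves out the agent dependency check.

```go
package main

import "fmt"

// canDeleteRuntime: owners and admins may delete any runtime, members only
// the ones they registered. Role names are assumed for illustration.
func canDeleteRuntime(role, userID, runtimeOwnerID string) bool {
	switch role {
	case "owner", "admin":
		return true
	case "member":
		return userID == runtimeOwnerID
	default:
		return false
	}
}

func main() {
	fmt.Println(canDeleteRuntime("admin", "u1", "u2"))
	fmt.Println(canDeleteRuntime("member", "u1", "u2"))
	fmt.Println(canDeleteRuntime("member", "u1", "u1"))
}
```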

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 13:38:46 +08:00
Jiayuan
b112d1f1ae feat(tasks): add coalescing queue and task lifecycle guards
- Coalescing queue: use HasPendingTaskForIssue (queued/dispatched only)
  instead of HasActiveTaskForIssue so comments during a running task
  enqueue exactly one follow-up task that picks up all new comments.
- Stale task cleanup: runtime sweeper now fails orphaned tasks when
  their runtime goes offline (daemon crash/network partition).
- Cancel-aware daemon: handleTask checks task status after execution
  and discards results if the task was cancelled mid-run (e.g. reassign).
- Terminal issue guard: ClaimTaskForRuntime auto-cancels pending tasks
  for done/cancelled issues instead of executing them.
- Race condition safety net: unique partial index ensures at most one
  pending task per issue at the DB level.
2026-03-29 17:52:35 +08:00
Jiayuan
b3bbf92a1d fix(runtime): add server-side sweeper to detect stale runtimes
The only path to marking a runtime offline was the daemon's deregister
call on graceful shutdown. If the daemon crashed, was killed, or lost
network, the status stayed "online" forever. Add a background goroutine
that sweeps every 30s and marks runtimes offline after 45s without a
heartbeat (3 missed intervals).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-29 14:22:12 +08:00
Jiayuan
38d595d81d feat(cli): restructure CLI commands for better UX
- Add top-level `multica login` that combines auth + workspace auto-discovery
- Restructure daemon into subcommands: start, stop, status, logs
- Add background daemon mode with PID management
- Add daemon deregistration on shutdown (new API endpoint + SQL query)
- Remove unused commands: runtime list, status, agent get/delete/stop
- Make `config` show config directly instead of requiring `config show`
- Update README to reflect new CLI structure

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-29 01:43:45 +08:00
Jiayuan Zhang
cdfa63af15 feat(runtime): add local codex daemon pairing 2026-03-24 12:03:14 +08:00