Commit Graph

183 Commits

Author SHA1 Message Date
Jiayuan Zhang
fc8528d64d feat(autopilot): support assigning to a squad (MUL-2429) (#2888)
* feat(autopilot): support assigning autopilot to a squad (MUL-2429)

Path A (Squad-as-Leader) from the RFC: when an autopilot's assignee is a
squad, dispatch resolves to squad.leader_id and executes against the
leader's runtime — semantics match a human manually assigning the issue
to that squad, no fan-out.

Backend scope only; frontend picker change is a follow-up PR.

Changes:
- 096_autopilot_squad_assignee migration: drop agent FK on
  autopilot.assignee_id, add assignee_type column (default 'agent'),
  add autopilot_run.squad_id attribution column.
- service.AgentReadiness: single source of truth for archived /
  runtime-bound / runtime-online checks. Shared by autopilot
  admission gate, run_only dispatch, and isSquadLeaderReady.
- service.resolveAutopilotLeader: translates assignee_type/id to the
  agent that actually runs the work.
- dispatchCreateIssue: stamps issue with assignee_type='squad' for
  squad autopilots and enqueues via EnqueueTaskForSquadLeader.
- dispatchRunOnly: belt-and-braces readiness re-check after resolving
  squad → leader so a leader that went offline between admission and
  dispatch produces a clean failure instead of a doomed task.
- handler.CreateAutopilot / UpdateAutopilot: accept assignee_type with
  squad/agent existence + leader-archived validation. Backward-compatible
  default of "agent" preserves the contract for older clients.
- Analytics: AutopilotRunStarted/Completed/Failed events carry
  assignee_type and squad_id; PostHog can now group autopilot runs by
  squad without joining back to the autopilot row.
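
A minimal sketch of the Path A resolution described above, assuming an in-memory squad map in place of the database. The `Squad` struct, `resolveLeader`, and both error values are hypothetical names for illustration, not the actual `service.resolveAutopilotLeader` signature. Treating an empty `assignee_type` as "agent" is an assumption based on the backward-compatible default.

```go
package main

import (
	"errors"
	"fmt"
)

// Squad is a hypothetical stand-in for the squad row; only the fields the
// sketch needs are included.
type Squad struct {
	ID       string
	LeaderID string
}

var (
	errSquadNotFound       = errors.New("squad not found")
	errUnknownAssigneeType = errors.New("unknown assignee type")
)

// resolveLeader maps (assigneeType, assigneeID) to the agent that runs the
// work. A squad assignee resolves to its leader, with no fan-out.
func resolveLeader(assigneeType, assigneeID string, squads map[string]Squad) (string, error) {
	switch assigneeType {
	case "", "agent": // assumption: empty means the default, "agent"
		return assigneeID, nil
	case "squad":
		s, ok := squads[assigneeID]
		if !ok {
			return "", errSquadNotFound
		}
		return s.LeaderID, nil
	default:
		return "", fmt.Errorf("%w: %q", errUnknownAssigneeType, assigneeType)
	}
}

func main() {
	squads := map[string]Squad{"squad-1": {ID: "squad-1", LeaderID: "agent-9"}}
	id, err := resolveLeader("squad", "squad-1", squads)
	fmt.Println(id, err)
}
```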

Co-authored-by: multica-agent <github@multica.ai>

* fix(autopilot): reject archived squads, route post-admission skips, cleanup dangling-agent autopilots (MUL-2429)

Addresses three review findings on PR #2888:

1. Archived squad handling: validateAutopilotAssignee now rejects squads
   with archived_at set; resolveAutopilotLeader returns errSquadArchived
   so the admission gate fails closed; DeleteSquad now mirrors the issue
   transfer for autopilot rows (TransferSquadAutopilotsToLeader) so
   surviving autopilots flip to assignee_type='agent' (leader) instead
   of dangling at the archived squad.

2. dispatchRunOnly post-admission readiness: introduces errDispatchSkipped
   sentinel, recognised by DispatchAutopilot via handleDispatchSkip so
   the run is recorded as `skipped` (not `failed`). Manual triggers no
   longer 500 when the leader's runtime goes offline between admission
   and task creation. New TestManualTriggerDoesNotErrorOnPostAdmissionSkip
   locks the behaviour in.

3. Dangling agent assignee after migration 096 dropped the FK:
   shouldSkipDispatch now distinguishes pgx.ErrNoRows / errSquadArchived
   (hard skip — retrying won't help) from transient DB errors
   (fail-open). DeleteAgentRuntime pauses autopilots that target agents
   about to be hard-deleted (ListArchivedAgentIDsByRuntime +
   PauseAutopilotsByAgentAssignees) so the breakage surfaces as a paused
   row in the UI instead of a quiet skip-burning loop.

Unit tests cover the sentinel unwrap contract and errSquadArchived
errors.Is behaviour. Integration test
TestAutopilotDispatchSkipsWhenRuntimeOffline re-verified against a fresh
DB with migration 096 applied.
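
The hard-skip versus fail-open split in item 3 can be sketched with sentinel errors and `errors.Is`. This is an illustration only: `errNoRows` stands in for `pgx.ErrNoRows` so the block needs no third-party import, and `classify`, `decision`, and `wrap` are hypothetical names rather than the real `shouldSkipDispatch` code.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for pgx.ErrNoRows and the commit's errSquadArchived.
var (
	errNoRows        = errors.New("no rows in result set")
	errSquadArchived = errors.New("squad archived")
)

type decision int

const (
	proceed  decision = iota // no error, or a transient error (fail-open)
	hardSkip                 // retrying will not help
)

// classify treats a missing row or an archived squad as a hard skip and lets
// every other error fall through to the fail-open path.
func classify(err error) decision {
	if errors.Is(err, errNoRows) || errors.Is(err, errSquadArchived) {
		return hardSkip
	}
	return proceed
}

// wrap adds context the way a caller typically would; errors.Is still
// matches the sentinel through the %w wrapping.
func wrap(err error) error {
	return fmt.Errorf("resolve leader: %w", err)
}

func main() {
	fmt.Println(classify(wrap(errSquadArchived)) == hardSkip)
	fmt.Println(classify(errors.New("connection reset")) == hardSkip)
}
```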

Co-authored-by: multica-agent <github@multica.ai>

* fix(autopilot): bump last_run_at on post-admission skip (MUL-2429)

Match recordSkippedRun (pre-flight skip) and the success path so the
scheduler / "last seen" UI both reflect that this tick evaluated the
trigger, even when the post-admission readiness gate caught a late
regression.

Addresses Emacs review caveat #1 on PR #2888.

Co-authored-by: multica-agent <github@multica.ai>

* feat(autopilot): mixed agent/squad assignee picker in dialog (MUL-2429)

End-to-end UI for assigning an autopilot to a squad. Closes the PR #2888
backend gap: the squad-as-assignee feature was already wired in Go (Path A,
RFC §4) but the desktop dialog never offered the choice.

- core/types/autopilot: add `AutopilotAssigneeType`, surface
  `assignee_type` on `Autopilot` + Create/Update request payloads.
- views/autopilots/pickers/agent-picker: switch to a polymorphic
  AssigneeSelection (`{type, id}`); render agents and squads as two
  grouped sections with shared pinyin search.
- views/autopilots/autopilot-dialog: maintain `assigneeType` state, send
  it on create/update, render the trigger avatar / hover dot with
  `assignee.type`.
- views/autopilots/autopilots-page + autopilot-detail-page: render the
  assignee row using `autopilot.assignee_type` so squad-typed autopilots
  show the squad avatar + name, not a broken agent lookup.
- locales: add `agents_group` / `squads_group` / `select_assignee` keys
  (en + zh-Hans), keep legacy `select_agent` for callers that still
  reference it.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Lambda <lambda@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-20 05:30:13 +02:00
Jiayuan Zhang
2ad1cd8ff8 feat(profile): user profile description injected into agent brief (MUL-2406)
## Summary

Adds per-user `profile_description` so coding agents have cheap, durable context about who is asking. v1 per the brief Xeon locked in on [MUL-2406](mention://issue/63a7247c-4f6a-42cf-90d1-7c746e77158a):

- **DB** — `user.profile_description TEXT NOT NULL DEFAULT ''` (migration 096). 2000-rune cap enforced server-side. No nullable / privacy state to manage.
- **API** — `PATCH /api/me` accepts the field; `UserResponse` always emits it. Client wraps `updateMe` in a lenient `UserSchema` + `EMPTY_USER` fallback per CLAUDE.md API Response Compatibility.
- **UI** — Settings → Account gains an "About you" textarea with live `n/2000` counter, `maxLength` guard, and a localized too-long error (EN + zh-Hans).
- **CLI** — `multica user profile get` / `multica user profile update` with `--description / --description-stdin / --description-file / --clear`, mirroring the existing `issue comment add` input-mode menu.
- **Daemon injection** — claim handler resolves the runtime owner and stamps `requesting_user_name` + `requesting_user_profile_description` on the task. `buildMetaSkillContent` emits `## Requesting User` between `## Agent Identity` and `## Available Commands`, blockquoted and framed as background context. The block is omitted entirely when the description is empty (no token cost when unused).
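
As an illustration of the 2000-rune cap on the DB bullet, the sketch below counts runes rather than bytes, so multi-byte text such as Chinese is not penalised. The function, constant, and error names are hypothetical; the handler's actual validation code is not shown in this commit message.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"unicode/utf8"
)

const maxProfileDescriptionRunes = 2000

var errDescriptionTooLong = errors.New("profile description exceeds 2000 runes")

// validateProfileDescription enforces the cap in runes, not bytes.
func validateProfileDescription(s string) error {
	if utf8.RuneCountInString(s) > maxProfileDescriptionRunes {
		return errDescriptionTooLong
	}
	return nil
}

// nRunes builds a string of n copies of r, used only to exercise the cap.
func nRunes(r string, n int) string {
	return strings.Repeat(r, n)
}

func main() {
	cjk := nRunes("队", 2000) // 2000 runes, but 6000 bytes in UTF-8
	fmt.Println(len(cjk), validateProfileDescription(cjk))
	fmt.Println(validateProfileDescription(cjk + "!"))
}
```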

Brief is written **once per task** via `CLAUDE.md` / `AGENTS.md`, not the per-turn prompt — same path the agent already reads for identity, so no extra per-turn cost.

## Test plan

- [x] `go build ./...`, `go vet ./...`, `go test ./internal/cli/ ./internal/daemon/ ./internal/daemon/execenv/ ./cmd/multica/`
- [x] New brief tests: `TestBuildMetaSkillContentEmitsRequestingUser`, `TestBuildMetaSkillContentOmitsRequestingUserWhenEmpty`
- [x] `pnpm typecheck`, `pnpm lint`, `pnpm test` (74 files, 644 tests pass)
- [ ] Handler DB tests (`TestUpdateMe*`) require a migrated test DB — not runnable in this sandbox
- [ ] Manual: open Settings → Account, set a description, confirm the next daemon-run agent's `CLAUDE.md` shows `## Requesting User`
2026-05-19 19:51:28 +02:00
Bohan Jiang
54368fd826 feat(projects): scheduled-only Gantt data source + WS reactivity (MUL-1881) (#2856)
* feat(projects): scheduled-only Gantt data source + WS reactivity (MUL-1881)

Project Gantt now fetches its own scheduled-only data instead of riding the
Board/List pagination cache. The Unscheduled drawer and pagination warning
banner are gone, and any WS-driven issue change (create / update / delete)
invalidates the new cache so the timeline stays live.

- Backend: `GET /api/issues?scheduled=true` adds an
  `(i.start_date IS NOT NULL OR i.due_date IS NOT NULL)` predicate on both
  ListIssues and CountIssues. New SQL filter is plumbed through sqlc + handler.
- Frontend: new `projectGanttIssuesOptions(wsId, projectId)` issues a single
  fetch and lives under its own cache key. WS handlers and mutations
  invalidate the prefix on create/update/delete so the bar reacts to
  start_date / due_date changes from other tabs and from this tab without
  waiting on the WS round-trip.
- GanttView: drops the Unscheduled section, the pagination warning banner,
  and the load-all button; renders only scheduled rows.
- Removes now-dead `useLoadAllRemaining`, `myIssueListPaginationOptions`,
  `summarizeIssueListPagination`, and the gantt locale strings that
  supported the old plumbing.

Co-authored-by: multica-agent <github@multica.ai>

* fix(projects): page through Gantt fetch and isolate per-view data sources

- Walk paginated `scheduled=true` issues until total is reached so projects
  with more than 500 scheduled bars no longer silently truncate.
- Gantt mode disables the bucketed Board/List query and reads its own
  scheduled cache for the project empty-state check, so the page never
  short-circuits Gantt with a Board-derived "no issues" CTA.
- `onIssueLabelsChanged` patches matching rows in the Project Gantt cache
  in-place, keeping label filters consistent after attach/detach from
  other tabs or agents.
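
The "walk until total is reached" loop from the first bullet can be sketched as follows. The real change lives in the frontend; this Go version only illustrates the loop shape. `page`, `fetchFunc`, `fetchAll`, and `fakeServer` are hypothetical names, and the extra stop on an empty page is an assumption added to avoid looping forever if the total changes mid-walk.

```go
package main

import "fmt"

// page is a hypothetical response shape: one slice of issues plus the total.
type page struct {
	Issues []string
	Total  int
}

// fetchFunc stands in for the HTTP call to the scheduled=true list endpoint.
type fetchFunc func(offset, limit int) page

// fetchAll keeps requesting pages until the reported total is reached, or
// until a page comes back empty.
func fetchAll(fetch fetchFunc, limit int) []string {
	var all []string
	for {
		p := fetch(len(all), limit)
		all = append(all, p.Issues...)
		if len(all) >= p.Total || len(p.Issues) == 0 {
			return all
		}
	}
}

// fakeServer simulates a backend holding n scheduled issues.
func fakeServer(n int) fetchFunc {
	return func(offset, limit int) page {
		end := offset + limit
		if end > n {
			end = n
		}
		if offset > n {
			offset = n
		}
		items := make([]string, 0, end-offset)
		for i := offset; i < end; i++ {
			items = append(items, fmt.Sprintf("issue-%d", i))
		}
		return page{Issues: items, Total: n}
	}
}

func main() {
	all := fetchAll(fakeServer(1200), 500)
	fmt.Println(len(all))
}
```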

MUL-1881

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-19 17:04:16 +08:00
Naiyuan Qing
93153d08b7 feat(my-issues): cover squad assignees via involves_user_id (MUL-2397) (#2829)
Re-introduces the `involves_user_id` filter on the issues list / open-list /
count / grouped paths, this time with the semantics pinned down: tab 3 surfaces
issues whose assignee is an *indirect* extension of the user (an agent they own,
or a squad they are a human member of, lead via an owned agent, or have an owned
agent inside). It explicitly does NOT cover direct member assignment, which is
tab 1's meaning.

- server/pkg/db/queries/issue.sql: 4-branch filter on ListIssues /
  ListOpenIssues / CountIssues. Each subquery clamps workspace_id because
  issue.assignee_id is polymorphic with no FK. Leader resolution reads
  squad.leader_id directly, not the squad_member copy row (squad.go ignores
  errors when seeding that copy, so it can be missing). FindActiveDuplicateIssue
  switched from positional $2/$3/$4 to named sqlc.arg() — pure hygiene so the
  generated struct field names don't drift when new nargs are added.
- server/internal/handler/issue.go: parse involves_user_id and plumb it into
  the three sqlc params; ListGroupedIssues (hand-written dynamic SQL) gets a
  mirrored 4-branch fragment, no shortcut.
- packages/core: ListIssuesParams / ListGroupedIssuesParams / MyIssuesFilter /
  api.listIssues / api.listGroupedIssues all carry the new param through.
- packages/views/my-issues: tab 3 switches from client-side agent-fanout to
  involves_user_id=user.id. agentListOptions import and the myAgentIds memo
  go away.
- server/internal/handler/issue_involves_test.go: 13 integration tests cover
  every branch (positive + cross-workspace negatives) plus the critical
  ExcludesDirectMemberAssignee negative on BOTH the sqlc and the grouped paths,
  locking tab 3 ∩ tab 1 = ∅.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-19 10:37:38 +08:00
Naiyuan Qing
5476e7678d Revert "feat(my-issues): cover squad assignees via involves_user_id (MUL-2364…" (#2828)
This reverts commit 3c510c31ed.
2026-05-19 09:31:43 +08:00
Naiyuan Qing
3c510c31ed feat(my-issues): cover squad assignees via involves_user_id (MUL-2364) (#2801)
* feat(my-issues): cover squad assignees via involves_user_id (MUL-2364)

The "My Agents" tab on /my-issues only resolved agents owned by the
caller, so issues assigned to squads the caller belongs to, leads, or has
an owned agent in never surfaced. This adds a UNION-based involves_user_id
filter that the backend expands to "me + agents I own + squads I relate to"
in a single query.

- SQL: ListIssues / ListOpenIssues / CountIssues accept narg
  involves_user_id and OR a workspace-scoped 3-branch UNION on the
  squad assignee subquery. Leader is sourced from canonical
  squad.leader_id (not the best-effort squad_member copy row whose
  AddSquadMember error is dropped in squad.go:177-188 and :259-263).
- Handler: parses involves_user_id via parseUUIDOrBadRequest, plumbs
  into all three list params, and mirrors the same UNION fragment into
  the grouped dynamic SQL path.
- Frontend: ListIssuesParams / ListGroupedIssuesParams / MyIssuesFilter
  gain involves_user_id; api client forwards it to the querystring.
- My Issues page: "agents" scope now passes involves_user_id instead of
  fanning out owned-agent IDs client-side. Tab label widens to
  "我的智能体 / 小队" / "My Agents / Squads".
- Tests: Go suite covers all three squad relations including the
  canonical-leader-without-squad_member-copy variant, cross-workspace
  isolation for agent / leader / squad_member branches, combination
  with creator_id, and the malformed-UUID 400 path. Client test pins
  the involves_user_id querystring wiring for both list endpoints.

The FindActiveDuplicateIssue query gets explicit sqlc.arg() names so
sqlc regeneration keeps the existing struct field names regardless of
the local sqlc version (no behavior change).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* test(my-issues): tighten cross-workspace negatives for involves_user_id UNION

Cross-workspace negative tests previously put both the foreign actor and the
foreign issue in the foreign workspace, so the outer i.workspace_id = $1
already excluded the row before the UNION branches were exercised. Stripping
a.workspace_id = $1 / s.workspace_id = $1 from any of the UNION subqueries
would not have failed the tests.

Rewrite the three existing negative cases to seed the issue in
testWorkspaceID with a polymorphic assignee_id pointing at a foreign-workspace
agent or squad (issue.assignee_id has no FK per migrations/001_init.up.sql:61).
Now each UNION branch must enforce its own workspace scoping for the issue to
stay out of the result.

Also add ExcludesOtherWorkspaceSquadAgentMember: the squad_member.agent UNION
branch had only positive coverage; this test pins that s.workspace_id = $1
and a.workspace_id = $1 must both hold there too.

Verified by mutation: stripping the workspace clause from each branch makes
the corresponding test fail.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-19 09:01:51 +08:00
Bohan Jiang
6f5fbb7813 feat(comments): thread-aware list with composite cursor (MUL-2340) (#2787)
* feat(comments): thread-aware list with composite cursor (MUL-2340)

Adds three optional query params to GET /api/issues/{id}/comments and the
matching `multica issue comment list` flags:

- `thread=<comment-uuid>` resolves the anchor to the thread root via a
  recursive CTE (defends against any future nested replies) and returns
  root + all descendants chronologically. Anchor can be any comment in
  the thread, root or reply.
- `recent=<N>` returns the newest N comments for the issue, ordered
  chronologically in the response.
- `before=<RFC3339>` + `before-id=<uuid>` form a composite cursor for
  stable pagination of `recent`. Both must be set together; a
  timestamp-only cursor is rejected because ties on `created_at` would
  let the existing `(created_at ASC, id ASC)` total order skip or
  duplicate rows across pages.
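
To make the tie problem concrete, here is a small in-memory sketch of a `(created_at, id)` composite cursor. All names are hypothetical and the real implementation is a SQL query. Three comments share one timestamp, and the walk still visits each exactly once because the id breaks the tie.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

type comment struct {
	ID        string
	CreatedAt time.Time
}

// less orders comments by (created_at ASC, id ASC).
func less(a, b comment) bool {
	if !a.CreatedAt.Equal(b.CreatedAt) {
		return a.CreatedAt.Before(b.CreatedAt)
	}
	return a.ID < b.ID
}

// pageBefore returns up to n comments strictly before the cursor, in
// chronological order. A nil cursor starts from the newest comment.
func pageBefore(all []comment, cursor *comment, n int) []comment {
	sorted := append([]comment(nil), all...)
	sort.Slice(sorted, func(i, j int) bool { return less(sorted[i], sorted[j]) })
	end := len(sorted)
	if cursor != nil {
		end = sort.Search(len(sorted), func(i int) bool { return !less(sorted[i], *cursor) })
	}
	start := end - n
	if start < 0 {
		start = 0
	}
	return sorted[start:end]
}

// walk pages backwards until exhausted and returns the ids in visit order.
func walk(all []comment, n int) []string {
	var ids []string
	var cursor *comment
	for {
		p := pageBefore(all, cursor, n)
		if len(p) == 0 {
			return ids
		}
		for _, c := range p {
			ids = append(ids, c.ID)
		}
		oldest := p[0]
		cursor = &oldest
	}
}

// tied builds comments that all share one created_at value.
func tied(ids ...string) []comment {
	ts := time.Unix(0, 0)
	out := make([]comment, 0, len(ids))
	for _, id := range ids {
		out = append(out, comment{ID: id, CreatedAt: ts})
	}
	return out
}

func main() {
	fmt.Println(walk(tied("a", "b", "c"), 1))
}
```

With a timestamp-only cursor, the second page would ask for rows with `created_at` strictly before the shared timestamp and return nothing, silently dropping the other two comments.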

Flag combination rules: `thread` is exclusive with `recent` and the
cursor; both may combine with `since`. Server and CLI enforce the same
matrix; the CLI fails fast locally so callers don't pay for a 400
round-trip.

Default behaviour (no params) is unchanged — full chronological dump
capped at commentHardCap — so the desktop UI and existing `--since`
polling are untouched. Agent prompt updates land in a follow-up PR so
the new CLI capabilities ship and bake first.

Co-authored-by: multica-agent <github@multica.ai>

* fix(comments): reject cursor without recent and align CLI/server on invalid --recent (MUL-2340)

Elon's second review of PR #2787 flagged two gaps in the flag combination
matrix:

- server: GET /comments?before=...&before_id=... without `recent` was
  silently dropped by fetchCommentsForList (RecentN=0 fell through to
  the default / since path), so callers got the full timeline instead
  of the documented "before X" semantics. Now returns 400.
- CLI: --recent 0 / --recent -3 were collapsed with "flag not passed"
  by `recent > 0`, so an explicit invalid value silently fell back to
  the default list. Switched to Flags().Changed("recent") so explicit
  non-positive values fail loudly. Also enforces that --before /
  --before-id only appear with explicit --recent (mirrors the new
  server-side rule).
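
The guard matrix can be sketched without a CLI framework by using a nil pointer for "flag not passed", which plays the role that `Flags().Changed("recent")` plays in the commit. The `listFlags` struct, `validate`, and the error messages are hypothetical; only the rules themselves come from the commit messages above.

```go
package main

import (
	"errors"
	"fmt"
)

// listFlags models the relevant flags. Recent is a pointer so that an
// explicit 0 can be told apart from "not passed".
type listFlags struct {
	Thread   string
	Recent   *int
	Before   string
	BeforeID string
}

// validate applies the flag combination rules before any HTTP request.
func validate(f listFlags) error {
	hasCursor := f.Before != "" || f.BeforeID != ""
	if f.Recent != nil && *f.Recent <= 0 {
		return errors.New("--recent must be a positive integer")
	}
	if (f.Before == "") != (f.BeforeID == "") {
		return errors.New("--before and --before-id must be set together")
	}
	if hasCursor && f.Recent == nil {
		return errors.New("--before / --before-id require --recent")
	}
	if f.Thread != "" && (f.Recent != nil || hasCursor) {
		return errors.New("--thread cannot be combined with --recent or the cursor")
	}
	return nil
}

func intPtr(n int) *int { return &n }

func main() {
	fmt.Println(validate(listFlags{Recent: intPtr(0)}))
	fmt.Println(validate(listFlags{Before: "2026-05-18T00:00:00Z", BeforeID: "abc"}))
	fmt.Println(validate(listFlags{Recent: intPtr(5)}))
}
```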

Tests:
- server flag matrix gains `before + before_id without recent → 400`.
- CLI gains TestRunIssueCommentListFlagGuards covering `--recent 0`,
  `--recent -3`, cursor-without-recent, and the thread/recent
  exclusivity path under the new Changed()-based check. The mock
  server fatals if a request reaches /comments, proving the guards
  fire before any HTTP round-trip.

Co-authored-by: multica-agent <github@multica.ai>

* feat(comments): make `recent` thread-grouped with a thread cursor (MUL-2340)

Bohan pushed back on the row-based `recent=N` shape: comments form a tree,
not a list, and the newest N rows can come from N unrelated threads, giving
the agent N disjoint conversational tails. Replace the row-based query with
a thread-grouped one before #2787 merges so we never ship the wrong shape:

- `recent=N` now returns the N most recently active threads (root + every
  descendant per thread). A thread's recency is MAX(created_at) across its
  whole subtree, so a stale-but-recently-replied thread outranks an old
  quiet one — exactly the property row-recent loses.
- The cursor is now a *thread* cursor: `before` = a thread's
  last_activity_at, `before_id` = its root comment id. The pair walks
  threads strictly less recent than the page's oldest-active thread. The
  cursor surfaces via `X-Multica-Next-Before` / `X-Multica-Next-Before-Id`
  response headers (empty when there are no older threads); the CLI
  forwards the same pair to stderr after listing.
- Row-based `recent` is gone — there is no internal caller and the prompt
  update has not shipped yet, so there is no compat surface to preserve.
- Response body shape unchanged (flat JSON array, chronological). Default
  and `--since` paths untouched. Desktop UI keeps working.
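
A rough in-memory sketch of thread-grouped recency, assuming one level of replies and integer timestamps for brevity. The types and `recentThreads` are hypothetical. The real query uses a recursive CTE and returns a flat, chronological array, whereas this sketch only shows how threads are ranked by their latest activity.

```go
package main

import (
	"fmt"
	"sort"
)

type comment struct {
	ID       string
	ParentID string // "" for a thread root
	Created  int64  // Unix seconds
}

type thread struct {
	RootID       string
	LastActivity int64 // MAX(created) across the root and its replies
	Comments     []comment
}

// recentThreads groups comments by root and returns the n most recently
// active threads, newest activity first, with the root id breaking ties.
func recentThreads(all []comment, n int) []thread {
	byRoot := map[string]*thread{}
	for _, c := range all {
		root := c.ParentID
		if root == "" {
			root = c.ID
		}
		t, ok := byRoot[root]
		if !ok {
			t = &thread{RootID: root}
			byRoot[root] = t
		}
		t.Comments = append(t.Comments, c)
		if c.Created > t.LastActivity {
			t.LastActivity = c.Created
		}
	}
	threads := make([]thread, 0, len(byRoot))
	for _, t := range byRoot {
		threads = append(threads, *t)
	}
	sort.Slice(threads, func(i, j int) bool {
		if threads[i].LastActivity != threads[j].LastActivity {
			return threads[i].LastActivity > threads[j].LastActivity
		}
		return threads[i].RootID > threads[j].RootID
	})
	if n < 0 {
		n = 0
	}
	if n < len(threads) {
		threads = threads[:n]
	}
	return threads
}

func main() {
	all := []comment{
		{ID: "root1", Created: 100},
		{ID: "root2", Created: 200},
		{ID: "reply1", ParentID: "root1", Created: 300},
	}
	for _, t := range recentThreads(all, 1) {
		fmt.Println(t.RootID, t.LastActivity, len(t.Comments))
	}
}
```

Here `root1` is older than `root2` but wins because its reply is the newest activity, which is the stale-but-recently-replied case the tests describe.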

Tests:
- recent=1 returns the freshest-active thread fully; recent=2 returns both
  with the older-active thread first (oldest-active → freshest tail).
- Stale-but-fresh: a thread whose root is older but has a fresh reply
  outranks a thread whose root is newer but quiet.
- Cursor headers emitted only on full pages; empty on the final page.
- Pagination walks threads root2 → root1 → empty, no skips/duplicates.
- Tie-break: three threads sharing last_activity_at paginate one-at-a-time
  using (last_activity_at, root_id) ordering — verifies the timestamp-only
  cursor failure mode is fixed for the thread case too.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-18 19:28:26 +08:00
Jiayuan Zhang
46c1e2c889 feat(squads): show member working status on squad detail page (#2768)
* feat(squads): show member working status on squad detail page

Add a new GET /api/squads/{id}/members/status endpoint that returns each
member's derived working/idle/offline/unstable status, the issues each
agent is currently running, and the last observed activity timestamp.
The Squad detail page's Members tab consumes this snapshot to render a
status pill and an active-issue link next to each agent, with live
refresh wired through the existing task/agent/daemon WS events.

Human members are returned with status=null so the UI can keep them in
the same list without implying a presence signal. Archived agents stay
in the response and surface as offline rather than being filtered out.

Co-authored-by: multica-agent <github@multica.ai>

* fix(squads): address review feedback on member status endpoint

- i18n the "blocked" issue-status pill in squad members tab (was a
  bare literal that failed `i18next/no-literal-string` lint).
- Treat any dispatched/running task as working, even when its
  `agent_task_queue.issue_id` is NULL (chat / quick-create tasks).
  The agent slot is occupied regardless of whether we can render an
  issue link.
- Force `offline` for archived agents so they appear in the list
  but never look like they're still on duty, matching the RFC
  decision in MUL-2319.
- Include `workspaceKeys.squads` in the post-reconnect /
  workspace-switch bulk invalidation so members-status recovers
  after a disconnect during which task/runtime events were missed.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-18 10:35:18 +02:00
Bohan Jiang
2323b72710 feat(autopilots): webhook delivery layer + idempotency/signature/replay (MUL-2334) [PR1] (#2774)
* feat(autopilots): webhook delivery layer + idempotency / signature / replay (MUL-2334)

Splits "inbound webhook receipt" from "autopilot run creation" so we can
record duplicate attempts, signature outcomes, and ignored/skipped
deliveries — and replay a delivery on demand. v1 ingress wrote straight
into autopilot_run.trigger_payload, which collapsed the two concerns and
left run_only autopilots vulnerable to provider retry storms.

Backend only (PR1). UI Deliveries tab follows in PR2.

Schema (migration 093):
  - autopilot_trigger.provider: 'generic' | 'github' (default 'generic').
  - autopilot_trigger.signing_secret: nullable plaintext (HMAC needs it
    cleartext; mirrors how webhook_token is stored).
  - webhook_delivery: one row per inbound POST. Carries raw_body,
    selected_headers, dedupe_key/source, signature_status,
    autopilot_run_id, replayed_from_delivery_id, response_status / body.
  - Partial unique index on (trigger_id, dedupe_key) excludes NULL and
    'rejected' rows, so a wrong-secret 401 does NOT permanently block a
    future retry with the same X-GitHub-Delivery once the operator fixes
    the secret.

Ingress flow (autopilot_webhook.go), persist-first + sync dispatch:
  1. IP rate limit -> 2. token lookup -> 3. token rate limit ->
  4. read raw body -> 5. autopilot/workspace cross-check ->
  6. normalize JSON (400 without persistence on parse failure) ->
  7. compute dedupe key + signature status ->
  8. INSERT delivery (status=queued). On (trigger_id, dedupe_key)
     unique-violation: bump attempt_count on existing row and return
     the original delivery_id + autopilot_run_id with 200 ->
  9. invalid/missing signature: UPDATE -> rejected, return 401 with
     delivery_id (no dispatch, not replayable) ->
 10. trigger disabled / autopilot paused/archived: UPDATE -> ignored,
     return 200 ->
 11. DispatchAutopilot synchronously, UPDATE -> dispatched/skipped/failed
     with autopilot_run_id and the response body we returned ->
 12. TouchAutopilotTriggerFiredAt and return 200.

No new long-running worker. A stale 'queued' row only happens if the
process dies between INSERT and UPDATE; that's a follow-up sweeper, not
this PR.

Authenticated API:
  - GET    /api/autopilots/{id}/deliveries (slim list)
  - GET    /api/autopilots/{id}/deliveries/{deliveryId} (with raw_body)
  - POST   /api/autopilots/{id}/deliveries/{deliveryId}/replay -> creates
    a new delivery row (replayed_from_delivery_id set), dispatches a
    new run, never collapses onto the original via dedupe.
  - PUT    /api/autopilots/{id}/triggers/{triggerId}/signing-secret
    Write-only; trigger response surfaces has_signing_secret +
    signing_secret_hint (last 4 chars), never the secret itself.

Signature verification reuses the GitHub-compatible
X-Hub-Signature-256: sha256=<hex(hmac(body, secret))> scheme; the
HMAC helper is constant-time. Invalid/missing signatures still count
against per-IP and per-token rate limits.
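
A minimal sketch of the `X-Hub-Signature-256` check using only the Go standard library. `sign` and `verify` are hypothetical helper names, not the repository's HMAC helper. The comparison goes through `hmac.Equal`, which is the standard library's constant-time comparison for MACs.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

const sigPrefix = "sha256="

// sign produces the header value a sender would attach to the request.
func sign(body []byte, secret string) string {
	mac := hmac.New(sha256.New, []byte(secret))
	mac.Write(body)
	return sigPrefix + hex.EncodeToString(mac.Sum(nil))
}

// verify recomputes the HMAC over the raw body and compares it with the
// header value in constant time.
func verify(body []byte, secret, header string) bool {
	if !strings.HasPrefix(header, sigPrefix) {
		return false
	}
	got, err := hex.DecodeString(strings.TrimPrefix(header, sigPrefix))
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, []byte(secret))
	mac.Write(body)
	return hmac.Equal(got, mac.Sum(nil))
}

func main() {
	body := []byte(`{"action":"opened"}`)
	header := sign(body, "example-secret")
	fmt.Println(verify(body, "example-secret", header))
	fmt.Println(verify(body, "wrong-secret", header))
}
```

The signature must be computed over the raw request bytes, which is one reason the ingress flow reads the raw body before normalizing the JSON.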

autopilot_run.trigger_payload is intentionally preserved — delivery
records the HTTP receipt; run records the normalized envelope handed
to the agent. They are two different views.

Tests (Postgres-backed):
  - delivery persistence on accept
  - dedupe via Idempotency-Key and X-GitHub-Delivery; run_only retry
    storm pin (3 retries -> 1 run)
  - invalid signature: 401 + rejected row + no run linkage
  - missing signature when secret configured: 401 + 'missing' state
  - valid signature dispatches
  - signing secret never echoed in trigger responses; hint shows last 4
  - min-length and clear-by-empty for signing secret PUT
  - replay creates a NEW delivery + new run; rejected deliveries cannot
    be replayed
  - list omits raw_body; detail includes it; cross-autopilot ID returns
    404 (workspace isolation defense in depth)
  - provider validation: unknown -> 400, github -> 201 round-trips
  - bad-signature stream still counts against per-token rate limit

Co-authored-by: multica-agent <github@multica.ai>

* fix(autopilots): address PR review on webhook delivery layer (MUL-2334)

- Exclude `failed` from the (trigger_id, dedupe_key) partial unique index
  alongside `rejected`, so a transient ingress failure does not strand the
  provider's stable X-GitHub-Delivery / Idempotency-Key retry. Update the
  dedupe lookup to prefer non-terminal rows under the same predicate.
- Tighten delivery status enum: drop `skipped` from the CHECK constraint
  and from the handler. A run that was admission-skipped (e.g. runtime
  offline) is now recorded as delivery=`dispatched` linked to the
  skipped run, with the response payload carrying status=`skipped`.
  Source of truth for skipped-ness is autopilot_run.status, not the
  delivery row — keeps the Deliveries UI enum unambiguous.
- On dispatch error, link the (possibly non-nil) autopilot_run returned
  by DispatchAutopilot to the failed delivery so Deliveries UI can
  navigate to the run row for debugging.
- Slim list projection: ListWebhookDeliveriesByAutopilot no longer pulls
  raw_body / selected_headers / response_body — a 100-row page × 256 KiB
  would otherwise round-trip ~25 MiB from Postgres per Deliveries reload.
  Detail endpoint continues to return the full row.
- Fix backend CI: TestGetDelivery_ReturnsFullPayload now decodes the
  response and asserts on the parsed raw_body instead of substring-
  matching against an escaped JSON string; raise the test-suite default
  webhook rate limits in TestMain so the shared 192.0.2.1 IP bucket
  doesn't fill across the suite and leak 429s into unrelated tests.
- Add regression coverage for the dedupe-after-failure path.

cd server && go test ./... is green locally.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-18 14:59:40 +08:00
Kerim Incedayi
9418d2a2c1 feat(autopilots): webhook triggers (server + CLI + UI + docs) MUL-2049 (#2348)
* feat(server): add webhook trigger DB migration + sqlc queries

Lays the foundation for webhook autopilot triggers:
- partial unique index on autopilot_trigger.webhook_token (kind=webhook only)
  so the public ingress route can resolve a trigger in O(1)
- GetWebhookTriggerByToken / TouchAutopilotTriggerFiredAt /
  RotateAutopilotTriggerWebhookToken / SetAutopilotTriggerWebhookToken
  queries, regenerated with sqlc

* feat(server): webhook token generator + payload normalizer

Two pure helpers for the webhook autopilot work:
- generateWebhookToken: 32 random bytes -> base64-url, "awt_" prefix.
  256 bits of entropy keeps brute-force off the table; the prefix makes
  leaked tokens recognisable in logs.
- normalizeWebhookPayload: turns arbitrary JSON into the WebhookEnvelope
  shape (event/eventPayload/request) used by trigger_payload. Header- and
  body-based event inference covers GitHub, GitLab, X-Event-Type, and
  caller-provided envelopes; scalar/empty/invalid bodies are rejected so
  the handler can answer 400.
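
The token shape described in the first bullet can be sketched as below. Using the unpadded `base64.RawURLEncoding` variant is an assumption, since the commit only says "base64-url", and the function body is an illustration rather than the repository's code.

```go
package main

import (
	"crypto/rand"
	"encoding/base64"
	"fmt"
)

const webhookTokenPrefix = "awt_"

// generateWebhookToken reads 32 bytes (256 bits) from the OS CSPRNG and
// encodes them as URL-safe base64 behind a recognisable prefix.
func generateWebhookToken() (string, error) {
	buf := make([]byte, 32)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return webhookTokenPrefix + base64.RawURLEncoding.EncodeToString(buf), nil
}

func main() {
	tok, err := generateWebhookToken()
	if err != nil {
		panic(err)
	}
	fmt.Println(len(tok), tok[:4])
}
```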

* feat(server): generate webhook tokens and expose rotate endpoint

- New handler.Config.PublicURL fed by MULTICA_PUBLIC_URL env so
  /api/autopilots/.../triggers responses can include an absolute
  webhook_url alongside the always-present webhook_path.
- CreateAutopilotTrigger now mints a webhook_token via crypto/rand
  for kind=webhook and ignores cron/timezone for non-schedule kinds.
  api triggers stay accepted-but-inert per PLAN.md.
- New POST /api/autopilots/{id}/triggers/{triggerId}/rotate-webhook-token
  protected by the existing workspace auth group; old tokens stop
  working immediately because the unique-index lookup keys on the
  current row value.

* feat(server): public webhook ingress route + per-token rate limiter

- New POST /api/webhooks/autopilots/{token} route, mounted outside the
  authenticated group: the path token is the credential. Workspace
  context is derived from the joined autopilot row, never headers.
- Body capped at 256 KiB via http.MaxBytesReader; oversized payloads
  return 413 mid-read instead of being fully buffered.
- Disabled triggers / paused / archived autopilots return
  200 {"status":"ignored"} so providers stop retrying.
- Skipped-runtime dispatches surface 200 {"status":"skipped"} with the
  reason from the autopilot service's pre-flight admission check.
- WebhookRateLimiter interface with sliding-window in-memory + Redis
  Lua-script implementations. Default 60 req/min per token. Test
  coverage on the in-memory path; Redis variant fails open on cache
  errors so a Redis hiccup never blocks ingress.
- Integration tests exercise token generation, dispatch, payload
  envelope persistence, GitHub-header inference, paused/disabled
  short-circuits, oversized rejection, and rotate-then-old-token-404.
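
A sketch of an in-memory sliding-window limiter in the spirit of the one described above. The type and method names are hypothetical and do not reproduce the `WebhookRateLimiter` interface. Passing `now` in explicitly is a choice made here so the behaviour is easy to test, and rejected hits are not counted against the window in this version.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type slidingWindowLimiter struct {
	mu     sync.Mutex
	limit  int
	window time.Duration
	hits   map[string][]time.Time
}

func newLimiter(limit, windowSeconds int) *slidingWindowLimiter {
	return &slidingWindowLimiter{
		limit:  limit,
		window: time.Duration(windowSeconds) * time.Second,
		hits:   map[string][]time.Time{},
	}
}

// Allow drops hits older than the window, then admits the request only if
// fewer than limit hits remain for this key.
func (l *slidingWindowLimiter) Allow(key string, now time.Time) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	cutoff := now.Add(-l.window)
	kept := l.hits[key][:0]
	for _, t := range l.hits[key] {
		if t.After(cutoff) {
			kept = append(kept, t)
		}
	}
	if len(kept) >= l.limit {
		l.hits[key] = kept
		return false
	}
	l.hits[key] = append(kept, now)
	return true
}

// at converts a second offset into a timestamp for deterministic examples.
func at(sec int) time.Time { return time.Unix(int64(sec), 0) }

func main() {
	l := newLimiter(2, 60)
	fmt.Println(l.Allow("tok", at(0)), l.Allow("tok", at(1)), l.Allow("tok", at(2)))
	fmt.Println(l.Allow("tok", at(61)))
}
```

A production version would also evict keys that have gone idle so the map does not grow without bound.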

* feat(server): include webhook payload in create_issue description

When an autopilot run is triggered by a webhook and execution_mode is
create_issue, the agent only sees the issue body — never the run's
trigger_payload. Append a 'Webhook event:' line and a fenced JSON block
with the normalized eventPayload so the agent has the inbound context
inline. Schedule / manual runs are unchanged.

Tests cover:
  - schedule path keeps existing italic note, no webhook block
  - webhook path emits event line + payload block, italic before block
  - non-envelope JSON falls back to raw body (defensive)
  - non-webhook source with payload still gets no webhook block

* feat(core): types, API client and mutations for webhook triggers

- AutopilotRunStatus gains 'skipped' so the run-list UI handles the
  admission-skipped state explicitly instead of falling through to a
  generic case (the backend already emits it via MUL-1899).
- AutopilotTrigger picks up optional webhook_path / webhook_url. Both
  are optional so older self-hosted servers that pre-date this change
  still parse cleanly.
- buildAutopilotWebhookUrl helper composes a usable absolute URL with
  the priority webhook_url > apiBaseUrl + path > origin + path > path.
  Tested with seven cases covering each branch.
- ApiClient.rotateAutopilotTriggerWebhookToken posts to
  /api/autopilots/{id}/triggers/{triggerId}/rotate-webhook-token; the
  HTTP-contract test pins URL + method.
- useRotateAutopilotTriggerWebhookToken mutation invalidates
  autopilotKeys.detail on settle, mirroring the existing trigger-mutation
  pattern.

* feat(views): webhook trigger UI in Add Trigger dialog and trigger row

Add Trigger dialog gains a Schedule/Webhook segmented toggle:
  - Schedule reuses TriggerConfigSection unchanged.
  - Webhook hides the cron config and shows a help line; the trigger is
    created with kind=webhook and the URL is generated server-side.
  - Toast text differentiates schedule vs webhook on success.

TriggerRow grows a webhook branch:
  - Webhook icon, kind translated via trigger_kind.
  - URL shown in a truncating monospace pill, with copy + rotate
    buttons. Copy uses navigator.clipboard with toast feedback; rotate
    uses an AlertDialog confirm because the old URL stops working
    immediately.
  - api triggers render a Deprecated badge and skip URL/copy/rotate
    affordances.

RunRow gains a 'skipped' RUN_VISUAL entry (muted dash) so admission-
skipped runs don't fall through to a generic case. Source label uses the
new run_source i18n key instead of capitalize.

Locales: en + zh-Hans gain run_status.skipped, run_source.*,
trigger_kind.*, trigger_row.{copy_url,rotate_url,*_confirm_*,toast_*},
add_trigger_dialog.{type_*,webhook_help,toast_added_{schedule,webhook}}.

* feat(cli): support webhook trigger creation and URL rotation

- multica autopilot trigger-add now takes --kind schedule|webhook
  (default schedule for backward compatibility). For webhook it skips
  --cron / --timezone validation and prints the resulting webhook URL,
  preferring the server-provided webhook_url and falling back to
  client.BaseURL + webhook_path.
- New multica autopilot trigger-rotate-url <autopilot-id> <trigger-id>
  command for rotating the bearer URL of a webhook trigger.
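
The URL fallback described for trigger-add can be sketched roughly as
follows. This is an illustration only: resolveWebhookURL and its
parameters are hypothetical names, not the CLI's actual code, and the
slash handling is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveWebhookURL is a hypothetical helper. It prefers the absolute
// webhook_url returned by the server and otherwise joins the client's base
// URL with the relative webhook_path.
func resolveWebhookURL(webhookURL, baseURL, webhookPath string) string {
	if webhookURL != "" {
		return webhookURL
	}
	if webhookPath == "" {
		return ""
	}
	// Assumption: normalize slashes so the join never yields "//" or none.
	return strings.TrimRight(baseURL, "/") + "/" + strings.TrimLeft(webhookPath, "/")
}

func main() {
	// Older server: only webhook_path is present in the response.
	fmt.Println(resolveWebhookURL("", "https://multica.example/", "/api/webhooks/autopilots/abc"))
	// prints "https://multica.example/api/webhooks/autopilots/abc"
}
```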

* docs(autopilots): add webhook trigger guide (en + zh)

Replaces the 'Webhook and API triggers are not available yet' section
with end-to-end webhook documentation: how the URL is generated, what
payload shapes are accepted, the inferred-event rules, the bearer-secret
warning + rotate flow, status-code semantics for accepted/skipped/
ignored/4xx/5xx outcomes, and the MULTICA_PUBLIC_URL self-host
configuration.

Run history list now mentions skipped status. The 'unavailable
features' section narrows to api-kind triggers, HMAC signing, IP
allowlists, and provider presets.

* feat(views): add Schedule/Webhook toggle to the create autopilot dialog

Closes the gap where a brand-new autopilot could only be created with a
schedule trigger. The right-column config now has a Trigger section
with a segmented Schedule/Webhook control:
  - Schedule keeps the existing cron/timezone UI.
  - Webhook hides the cron UI and shows a help line; on submit, a
    kind=webhook trigger is created right after the autopilot.

In edit mode the toggle is intentionally hidden (PLAN.md treats trigger-
type changes as delete-old + create-new, not in-place updates), but the
panel still picks the right kind based on props.triggers[0].kind so a
webhook autopilot doesn't render an irrelevant cron form.

Locales: section_trigger_kind, trigger_kind_{schedule,webhook},
section_webhook, webhook_help_{create,edit} added in en + zh-Hans.

* feat(views): show webhook URL inline after creating a webhook autopilot

After a successful create with kind=webhook, the dialog stays open and
swaps to a confirmation panel showing the freshly minted URL with a
copy button + 'Treat this URL like a password' warning + Done button.
Avoids the friction of "create the autopilot, then go find it in the
list, click in, scroll to triggers, copy URL."

Locales: dialog.webhook_created_{title,description,warning,done} added
in en + zh-Hans.

Schedule create flow is unchanged (toast + close). The success panel is
gated on the trigger returned from the create mutation, so a partial
failure (autopilot created, trigger creation errored) still falls
through to the toast_create_partial path.

* feat(views): show webhook payload in run detail dialog

The agent transcript dialog now accepts an optional headerSlot that
sits above the event list. The autopilot RunRow drops a
WebhookPayloadPreview into that slot when the run came from a webhook
and trigger_payload is non-empty.

The preview is collapsed by default (the transcript itself is the main
event), shows the inferred event name + receivedAt in the header, and
reveals the eventPayload as pretty-printed JSON with a copy button on
expand. Falls back gracefully if the row's trigger_payload doesn't
match the WebhookEnvelope shape — the whole value is shown instead so
nothing is hidden.

Closes the "agent didn't echo the payload, now I can't see what
triggered the run" gap. PLAN.md tracked this as
"Payload preview in run history" under follow-ups.

Locales: webhook_payload.{label, unknown_event, payload, content_type,
copy, copied, copied_short, copy_failed} added in en + zh-Hans.

* chore(server): wire MULTICA_PUBLIC_URL through self-host compose

Two small follow-ups split out of the webhook trigger PR:

- docker-compose.selfhost.yml passes MULTICA_PUBLIC_URL into the
  backend container so a self-hosted deployment behind a real domain
  gets absolute webhook URLs in the trigger response. Documented in
  .env.example with the rationale for not deriving the public host
  from request headers.
- Drop a duplicated 'invalid json:' prefix in the webhook ingress
  400 error path. normalizeWebhookPayload already prefixes its
  errors, so the handler doesn't need to re-prefix.

* fix(migrations): renumber webhook trigger migration 081 → 089 to avoid collision

The branch's 081_autopilot_webhook_triggers.{up,down}.sql collided
numerically with 081_runtime_timezone.{up,down}.sql that landed on
main, making migration apply order undefined. Renumber to 089 so the
file slots after the latest main migration (088_squad_instructions).

The SQL itself doesn't conflict — it only creates a partial unique
index on autopilot_trigger.webhook_token — but the duplicate prefix
is what the migration runner sees, so the filename must move.

* fix(autopilot-webhook): address PR review blocking issues

- Redact bearer tokens from request logs: paths matching
  /api/webhooks/autopilots/<token> now log "[redacted]" instead of the
  token. The resolved trigger ID is plumbed via context so audit lines
  stay useful for debugging. (Review item Blocking #1.)
- Distinguish pgx.ErrNoRows from transient DB errors in token lookup:
  no-row stays 404 (so providers don't retry on a deleted webhook),
  other errors return 500 (which providers DO retry, avoiding silent
  drops on DB blips). (Review item Blocking #2.)
- Add per-IP sliding-window rate limiter that runs BEFORE the token
  lookup, so spraying random tokens can no longer probe the
  autopilot_trigger index unboundedly. Reuses the existing Lua script
  with a separate Redis key namespace; falls open on Redis errors.
  Default budget 30 req/min/IP. (Review item Blocking #3.)

The webhook handler now applies the gates in the order: per-IP rate
limit → token lookup → per-token rate limit → handler logic.
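
A minimal sketch of the path redaction, assuming the token is the single
path segment after /api/webhooks/autopilots/. The function name matches
the one the tests mention, but the body here is a guess at the shape, not
the shipped implementation.

```go
package main

import (
	"fmt"
	"strings"
)

const webhookPrefix = "/api/webhooks/autopilots/"

// redactWebhookPath replaces the bearer-token segment of a webhook ingress
// path with "[redacted]" and leaves every other path untouched.
func redactWebhookPath(path string) string {
	if !strings.HasPrefix(path, webhookPrefix) {
		return path
	}
	rest := path[len(webhookPrefix):]
	if rest == "" {
		return path // no token segment to hide
	}
	// Assumption: keep anything after the token segment for debugging.
	if i := strings.IndexByte(rest, '/'); i >= 0 {
		return webhookPrefix + "[redacted]" + rest[i:]
	}
	return webhookPrefix + "[redacted]"
}

func main() {
	fmt.Println(redactWebhookPath("/api/webhooks/autopilots/s3cr3t-token"))
	// prints "/api/webhooks/autopilots/[redacted]"
	fmt.Println(redactWebhookPath("/api/issues/42"))
	// prints "/api/issues/42"
}
```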

* fix(autopilot): atomic webhook trigger creation + strict kind/timezone validation

- Mint the webhook bearer token BEFORE the INSERT and pass it via
  CreateAutopilotTriggerParams so the row never exists in a half-written
  kind=webhook + webhook_token=NULL state. On the (vanishingly rare)
  unique-index collision the whole INSERT is retried with a fresh token
  — no UPDATE second step. Removes the now-dead attachFreshWebhookToken
  helper. (Review item Recommended #4.)
- Add new GET /api/autopilots/{id}/runs/{runId} endpoint that returns a
  single run including the full trigger_payload. The list response is
  now slim (omits trigger_payload) so worst-case payload size drops
  from ~5 MB to ~5 KB. (Review item Recommended #5, server side.)
- Reject kind=api with 400 ("kind=api is deprecated; use schedule or
  webhook") and reject kind=webhook with --timezone with 400; both
  surface stragglers loudly instead of silently dropping fields.
  CLI mirrors the check so --timezone with --kind webhook errors
  client-side. (Review nits.)
- Add --yes (-y) flag and an interactive y/N confirmation prompt to
  `multica autopilot trigger-rotate-url` so the destructive rotate
  matches the UI's AlertDialog safety. (Review item Recommended #6.)
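
The mint-before-INSERT retry can be sketched like this. Everything named
here is hypothetical: the insert callback stands in for the real INSERT,
errTokenCollision stands in for a unique-index violation, and the 32-byte
hex token format is an assumption.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"errors"
	"fmt"
)

// errTokenCollision stands in for a unique-index violation on webhook_token.
var errTokenCollision = errors.New("unique violation on webhook_token")

// mintToken returns a random hex token (length is an assumption).
func mintToken() (string, error) {
	b := make([]byte, 32)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

// createWebhookTrigger mints the token first and passes it to the insert, so
// no row ever exists with kind=webhook and a NULL token. On a collision the
// whole insert is retried with a fresh token; there is no UPDATE step.
func createWebhookTrigger(insert func(token string) error, maxAttempts int) (string, error) {
	for attempt := 0; attempt < maxAttempts; attempt++ {
		token, err := mintToken()
		if err != nil {
			return "", err
		}
		err = insert(token)
		if err == nil {
			return token, nil
		}
		if !errors.Is(err, errTokenCollision) {
			return "", err // unrelated failures are not retried
		}
	}
	return "", fmt.Errorf("webhook token still colliding after %d attempts", maxAttempts)
}

func main() {
	token, err := createWebhookTrigger(func(string) error { return nil }, 3)
	fmt.Println(len(token), err)
}
```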

* fix(views): fetch webhook payload on-demand and truncate at 4 KiB

- Add useAutopilotRun query hook + getAutopilotRun API client method
  paired with the new server endpoint. The run-detail dialog now mounts
  a WebhookPayloadSlot that fetches the full run (incl. trigger_payload)
  lazily — list responses no longer carry up to 256 KiB × N runs of
  envelope data.
- WebhookPayloadPreview truncates its in-DOM <pre> at 4 KiB with a
  localized marker so jank-y machines aren't asked to render a 256 KiB
  JSON blob. The Copy button still yields the full string.
- Adds the truncated_marker i18n string to en + zh-Hans.

Review items Recommended #5 (frontend) and a nit on the preview's
unbounded <pre>.

* test(autopilot-webhook): close coverage gaps flagged in PR review

- request_logger: redactWebhookPath unit tests + integration test
  proving the bearer token never lands in slog output, plus the
  webhook_trigger_id context plumbing.
- autopilot_webhook_handler: empty body → 400, archived autopilot →
  200 ignored, per-IP rate limiter trips before DB lookup, kind=api
  and webhook+timezone are rejected at 400, slim list + full detail
  endpoint round-trip.
- webhook_rate_limiter: Lua script structure guard (catches reordering
  even without a live Redis), plus live-Redis tests for both per-token
  and per-IP limiters (REDIS_TEST_URL gated, matching the existing
  Redis test pattern in the package).
- WebhookPayloadPreview: envelope rendering, fallback shape, and the
  >4 KiB truncation path with full-payload-on-Copy guarantee.

Two branches are documented as code-review-protected rather than
covered by tests: the 500-on-DB-error path requires injecting a stub
Queries (no interface here), and the cross-workspace defense-in-depth
check is unreachable from valid SQL state.

* fix(middleware): SetWebhookTriggerID must mutate request in place

The round-1 helper returned a fresh *http.Request from WithContext, and
the webhook handler did `r = SetWebhookTriggerID(r, ...)`. That swaps
the handler's local pointer but doesn't propagate the new context back
to RequestLogger, which is still holding the original *http.Request —
so the audit line never actually included webhook_trigger_id in
production. The round-1 test happened to pass because it pre-stashed
the value on the request before calling ServeHTTP, bypassing the bug
it was meant to verify.

Switch to in-place mutation via `*r = *r.WithContext(...)` so the
wrapping middleware sees the new context after next.ServeHTTP returns,
and update the test to exercise the real call pattern (set the context
from inside the handler, assert the surrounding logger reads it).

Verified live: an accepted webhook now logs
  path=/api/webhooks/autopilots/[redacted] webhook_trigger_id=<uuid>
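
The difference between the two call patterns can be reproduced in
isolation. The sketch below uses hypothetical names (ctxKey, setTriggerID,
loggedTriggerID) and a stand-in logger; it only illustrates why rebinding
the handler's local pointer is invisible to the wrapping middleware while
in-place mutation is not.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
)

type ctxKey struct{}

// setTriggerID overwrites the request the caller already holds, so every
// holder of the same *http.Request sees the new context.
func setTriggerID(r *http.Request, id string) {
	*r = *r.WithContext(context.WithValue(r.Context(), ctxKey{}, id))
}

// loggedTriggerID runs a handler behind a logger-style wrapper and returns
// the trigger ID the wrapper can read after the inner handler returns.
func loggedTriggerID(inPlace bool) string {
	var seen string
	handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if inPlace {
			setTriggerID(r, "trigger-123")
		} else {
			// Rebinds only this function's local pointer.
			r = r.WithContext(context.WithValue(r.Context(), ctxKey{}, "trigger-123"))
		}
		w.WriteHeader(http.StatusOK)
	})
	logger := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		handler.ServeHTTP(w, r)
		seen, _ = r.Context().Value(ctxKey{}).(string)
	})
	req := httptest.NewRequest(http.MethodPost, "/api/webhooks/autopilots/token", nil)
	logger.ServeHTTP(httptest.NewRecorder(), req)
	return seen
}

func main() {
	fmt.Printf("rebind=%q in-place=%q\n", loggedTriggerID(false), loggedTriggerID(true))
	// prints: rebind="" in-place="trigger-123"
}
```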

* fix(autopilot-webhook): symmetric ErrNoRows split + trusted-proxy gate

Round-2 review (Bohan-J, PR #2348 follow-up):

- Must-fix #1: the second lookup at autopilot_webhook.go:258
  (GetAutopilot after the token resolves) was folding every error into
  404. A transient DB blip would tell a webhook sender "not found" and
  it would never retry. Apply the same errors.Is(err, pgx.ErrNoRows)
  → 404 / else → 500 split as the first lookup got in round 1.

- Must-fix #2: clientIPForRateLimit was honoring X-Forwarded-For /
  X-Real-IP from any caller. An attacker spraying random tokens could
  just rotate the XFF header and the per-IP bucket became per-request,
  so the limiter that's specifically supposed to gate spraying before
  it hits the DB unique index was bypassed.

  New shape — matches Bohan's suggestion exactly:
  * Default: r.RemoteAddr only, headers ignored.
  * Operator opt-in via MULTICA_TRUSTED_PROXIES (comma-separated
    CIDRs). XFF/X-Real-IP are honored only when r.RemoteAddr is
    inside one of the listed prefixes; otherwise they're dropped.

  Wired through .env.example and docker-compose.selfhost.yml so
  self-host operators can configure their reverse-proxy's CIDR.
  Invalid CIDRs in the env var are dropped with a single slog.Warn at
  startup rather than crashing the server. Uses net/netip (stdlib,
  value-typed) for parsing and containment checks.

Verified live on the rebuilt self-host backend: a 35-request spray
from one source with rotating XFF gets the expected 30× 404 + 5× 429,
proving the per-IP bucket is keyed on the real connection IP.
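
A rough sketch of the trusted-proxy gate using net/netip. The function
bodies are assumptions, as is the choice to read the first
X-Forwarded-For entry; newRequest is a helper that exists only for this
example, and the startup warning for invalid CIDRs is omitted.

```go
package main

import (
	"fmt"
	"net/http"
	"net/netip"
	"strings"
)

// parseTrustedProxies parses comma-separated CIDRs and skips invalid entries.
func parseTrustedProxies(env string) []netip.Prefix {
	var out []netip.Prefix
	for _, part := range strings.Split(env, ",") {
		if p, err := netip.ParsePrefix(strings.TrimSpace(part)); err == nil {
			out = append(out, p.Masked())
		}
	}
	return out
}

// clientIPForRateLimit keys on r.RemoteAddr unless the direct peer is inside
// a trusted prefix; only then are the forwarding headers consulted.
func clientIPForRateLimit(r *http.Request, trusted []netip.Prefix) string {
	ap, err := netip.ParseAddrPort(r.RemoteAddr)
	if err != nil {
		return r.RemoteAddr // unparsable; key on the raw value
	}
	remote := ap.Addr().Unmap()
	peerTrusted := false
	for _, p := range trusted {
		if p.Contains(remote) {
			peerTrusted = true
			break
		}
	}
	if !peerTrusted {
		return remote.String() // headers dropped for untrusted peers
	}
	for _, h := range []string{"X-Forwarded-For", "X-Real-IP"} {
		first, _, _ := strings.Cut(r.Header.Get(h), ",")
		if a, err := netip.ParseAddr(strings.TrimSpace(first)); err == nil {
			return a.String()
		}
	}
	return remote.String()
}

// newRequest builds a bare request for the example.
func newRequest(remoteAddr, xff string) *http.Request {
	r := &http.Request{RemoteAddr: remoteAddr, Header: http.Header{}}
	if xff != "" {
		r.Header.Set("X-Forwarded-For", xff)
	}
	return r
}

func main() {
	trusted := parseTrustedProxies("10.0.0.0/8, not-a-cidr")
	fmt.Println(clientIPForRateLimit(newRequest("203.0.113.7:4000", "1.2.3.4"), trusted))
	// prints "203.0.113.7" (spoofed header ignored)
	fmt.Println(clientIPForRateLimit(newRequest("10.1.2.3:5000", "198.51.100.9"), trusted))
	// prints "198.51.100.9" (header honored behind the trusted proxy)
}
```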

* fix(autopilot): reject cron/timezone PATCH on non-schedule triggers

Round-2 review should-fix. CreateAutopilotTrigger already 400s on
kind=webhook + timezone/cron_expression, but UpdateAutopilotTrigger
silently wrote those fields regardless of prev.Kind. The values then
sat in the DB visible to nobody and read by nothing — a back door that
left the API contract fuzzy across create vs update.

Mirror the create-path discipline: after loading prev, if prev.Kind
!= "schedule" and the PATCH body sets cron_expression or timezone,
return 400 with a clear message. enabled and label remain accepted on
every kind.

The existing prev.Kind == "schedule" guard on next_run_at recompute
stays as belt-and-braces, but with this gate in place the recompute
branch is now reachable only for the kind it was meant for.
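
The gate amounts to a small check on the PATCH body. The sketch below is
an assumption about its shape: triggerPatch and validateTriggerPatch are
hypothetical names, and pointer fields model "field present in the body".

```go
package main

import "fmt"

// triggerPatch models a PATCH body; nil means the field was not sent.
type triggerPatch struct {
	CronExpression *string
	Timezone       *string
	Enabled        *bool
	Label          *string
}

// validateTriggerPatch rejects schedule-only fields on any other kind.
// enabled and label stay accepted on every kind.
func validateTriggerPatch(prevKind string, p triggerPatch) error {
	if prevKind != "schedule" && (p.CronExpression != nil || p.Timezone != nil) {
		return fmt.Errorf("cron_expression and timezone can only be set on schedule triggers (kind is %q)", prevKind)
	}
	return nil
}

func main() {
	tz := "UTC"
	on := true
	fmt.Println(validateTriggerPatch("webhook", triggerPatch{Timezone: &tz}) != nil) // prints "true"
	fmt.Println(validateTriggerPatch("webhook", triggerPatch{Enabled: &on}) == nil)  // prints "true"
}
```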

* test(autopilot-webhook): close round-2 coverage gaps

- IPRateLimitNotBypassedByXFFSpoof: drives the must-fix #2 invariant
  by rotating XFF across three calls from the same RemoteAddr and
  asserting the third gets 429. Pre-round-2 this test would have
  passed for the wrong reason (limiter trusted XFF, so per-bucket
  collision was incidental); now it pins the bypass-closed property.
- IPRateLimitReturns429BeforeDBLookup: updated to set RemoteAddr
  explicitly and drop the XFF header it was leaning on. With
  TrustedProxies empty (test default) the limiter keys on the real
  connection IP, which is what the test wants to assert anyway.
- UpdateAutopilotTrigger_RejectsCronExpressionOnWebhookKind +
  UpdateAutopilotTrigger_RejectsTimezoneOnWebhookKind: drive the
  round-2 should-fix from the handler boundary.
- UpdateAutopilotTrigger_AcceptsEnabledAndLabelOnWebhookKind: counter
  test so a regression to a blanket reject is caught.

* fix(migrations): bump webhook trigger migration 089 → 091

origin/main added 089_squad_no_action_activity_index (and 090_task_is_leader)
since our last rebase, re-colliding with our 089_autopilot_webhook_triggers.
Bump to 091 so the filename ordering is unambiguous again. The SQL is
unchanged — same partial unique index on autopilot_trigger.webhook_token —
only the filename moves.

* fix(views): dedupe skipped icon in autopilot RUN_VISUAL after rebase

The rebase against origin/main merged main's add of `Ban` for the
skipped status next to our round-1 `MinusCircle` entry, leaving the
RUN_VISUAL map with two `skipped` keys (only the last would have been
read at runtime, and MinusCircle had been dropped from the imports
during conflict resolution — so the file would not compile).

Keep main's `Ban` icon (latest design) and a single `skipped` entry.
Carry over the round-1 comment about why the muted styling matters
for failure-ratio readability.

---------

Co-authored-by: Kerim Incedayi <kerim.incedayi@digitalchargingsolutions.com>
2026-05-18 12:17:39 +08:00
Bohan Jiang
3645bdb5b6 feat(issues): add start_date field with progressive disclosure (MUL-2274) (#2696)
* feat(issues): add start_date field with progressive disclosure (MUL-2274)

Mirrors the existing due_date implementation end-to-end so an issue can
express a planned start in addition to a deadline. Surfaces start_date as
an optional sidebar property alongside priority / due_date / labels (added
in MUL-2275), with consistent picker, board/list/sort, activity, and inbox
plumbing.

Backs the Project Gantt work (parent MUL-1881) and keeps the
progressive-disclosure attribute experience consistent.

- DB: migration 091 adds issue.start_date TIMESTAMPTZ.
- sqlc: ListIssues / CreateIssue / UpdateIssue / CreateIssueWithOrigin /
  ListOpenIssues read & write start_date.
- Backend: IssueResponse + create/update/batch-update handlers parse and
  emit start_date with RFC3339 validation; new start_date_changed activity
  event + subscriber notification (with prev_start_date in event payload).
- CLI: --start-date flag on `multica issue create` / `issue update`.
- Frontend: StartDatePicker component, start_date wired into Issue type,
  Zod schema, draft / view stores, sort util, header sort + card-property
  options, list-row / board-card display, create-issue modal, and the
  issue-detail progressive-disclosure "+ Add property" surface (visibility
  rule, picker row, add-property menu icon + label).
- i18n: en + zh-Hans for sort_start_date / card_start_date /
  prop_start_date / activity start_date_set / start_date_removed /
  picker start_date.trigger_label / clear_action / inbox labels.
- Tests: new TestNotification_StartDateChanged; existing Issue / draft /
  modal fixtures extended with start_date.

Co-authored-by: multica-agent <github@multica.ai>

* feat(issues): align start_date with due_date in actions menu and CLI table

- Add Start Date submenu (today / tomorrow / next week / clear) in
  actions menu, mirroring Due Date — parity with the Due Date quick
  setters in list/board context and 3-dot menus.
- Add corresponding en / zh-Hans i18n keys
  (actions.start_date / start_today / start_tomorrow / start_next_week
  / start_clear).
- CLI human table for `multica issue list` and `multica issue get`
  now shows a START DATE column next to DUE DATE; --full-id variant
  too.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-17 15:01:38 +08:00
Jiayuan Zhang
668cab6022 feat(github): mirror PR CI checks and merge conflict status (MUL-2228) (#2632)
* feat(github): mirror PR CI checks and merge conflict status (MUL-2228)

Surface "checks passed/failed" and "conflicts/no conflicts" badges under
each linked PR on the issue page so users can judge readiness without
flipping over to GitHub. CI state is fed by check_suite webhooks
(GitHub Actions + apps using the Checks API; legacy status events are
out of scope for MVP); conflicts are read from pull_request.mergeable_state.

Data model:
  * github_pull_request: add head_sha + mergeable_state
  * github_pull_request_check_suite: per-suite rows keyed by (pr_id, suite_id)
  * Aggregation done at query time, filtering by current head_sha so
    late-arriving suites for a stale head can't contaminate the new head's
    pending view; per-app latest suite chosen first so a single app firing
    multiple suites isn't counted N times.

Webhook hardening:
  * synchronize/opened/reopened/edited(base) explicitly clear mergeable_state
  * single-row ordering protection on the check_suite upsert prevents a
    late-delivered older event from overwriting a newer one
  * check_suite.pull_requests is iterated; unknown PRs are logged and dropped

UI:
  * PR row shows Checks + Conflicts badges; opaque mergeable values
    (blocked/behind/unstable/...) render as no badge, not as conflicts.
  * Terminal PR states (merged/closed) suppress the status row entirely.

Tests:
  * Pure unit coverage for derivePRMergeableState + aggregateChecksConclusion
  * Webhook integration tests: multi-app aggregation, old-head ignore,
    late-older-event ignore, synchronize clears mergeable_state
  * Vitest coverage for pull-request-list badge rendering across CI/conflict
    combinations and the legacy (null) fallback.

Co-authored-by: multica-agent <github@multica.ai>

* fix(github): scope check_suite PR lookup; preserve mergeable on metadata

Addresses code review on PR #2632.

1. check_suite handler now resolves the PR through the workspace-scoped
   GetGitHubPullRequest query instead of GetGitHubPullRequestByRepoNumber.
   The (workspace_id, repo_owner, repo_name, pr_number) tuple is the real
   uniqueness key, so a bare (owner, repo, number) lookup could return a
   stale row from another workspace and either land the suite on the wrong
   PR or skip the right one when the installation ids drifted. The old
   unscoped query is removed.

2. derivePRMergeableState now returns (value, clear) and the upsert SQL
   distinguishes three cases: state-changing actions clear the column to
   NULL, non-empty payloads write the value, and metadata events with an
   empty payload preserve the existing column. Previously every empty
   payload became NULL, so a labeled/assigned event silently wiped a
   known clean/dirty verdict in violation of the RFC's "metadata empty
   payload preserves" rule.

3. ListPullRequestsByIssue narrows to the issue's PR ids before running
   the per-app check_suite aggregation, avoiding a full-table scan over
   github_pull_request_check_suite when only a handful of rows belong to
   the requested issue.

New helper test covers labeled+empty preserves; new integration test
verifies a metadata event after a known mergeable_state keeps the value.
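
The three cases from item 2 can be sketched as a (value, clear) pair plus
the column update it drives. Names and signatures are illustrative, and
representing "edited(base)" as a baseChanged flag is an assumption.

```go
package main

import "fmt"

// deriveMergeableState returns the value to write and whether to clear.
// State-changing actions clear; everything else passes the payload through.
func deriveMergeableState(action string, baseChanged bool, payloadState string) (value string, clear bool) {
	switch action {
	case "synchronize", "opened", "reopened":
		return "", true
	case "edited":
		if baseChanged {
			return "", true
		}
	}
	return payloadState, false
}

// applyMergeableState mirrors the three-way upsert: clear sets NULL, a
// non-empty value overwrites, and an empty value preserves the column.
func applyMergeableState(existing *string, value string, clear bool) *string {
	switch {
	case clear:
		return nil
	case value != "":
		return &value
	default:
		return existing
	}
}

func main() {
	prev := "clean"
	v, c := deriveMergeableState("labeled", false, "")
	fmt.Println(*applyMergeableState(&prev, v, c)) // prints "clean"
	v, c = deriveMergeableState("synchronize", false, "dirty")
	fmt.Println(applyMergeableState(&prev, v, c) == nil) // prints "true"
}
```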

Co-authored-by: multica-agent <github@multica.ai>

* feat(github): PR card layout v3 increment — stats + segmented progress bar

Replaces the row + badge layout under "Pull requests" on the issue
detail sidebar with a card that mirrors the GitHub PR summary look:
title, author/avatar, +N −M · K files diff stats, segmented progress
bar (failed → pending → passed, failure leftmost), and a one-line
status caption following an explicit priority pass-through.

Backend
- Migration 092: github_pull_request adds additions / deletions /
  changed_files (INT NOT NULL DEFAULT 0). Zero defaults are what the
  new frontend treats as "legacy backend — hide the stats row" so old
  PR rows that pre-date this migration don't render "+0 −0 · 0 files".
- pull_request webhook handler reads stats off the top-level payload.
- ListPullRequestsByIssue now surfaces per-suite counts
  (checks_passed / failed / pending) alongside the existing aggregate
  conclusion, so the segmented bar reuses the already-computed counts
  with no new aggregation.

Frontend (packages)
- core/github/pull-request-status.{ts,test.ts}: pure-function module
  for the status-kind priority table and the segment derivation; 15
  cases covered, includes the "all-zero → hide stats" guard.
- views/issues/components/pull-request-list.tsx: PullRequestCard plus
  a compact-row fallback used when count > 4 (first 3 as cards, the
  remainder collapsed behind a Show more toggle).
- i18n: new `pull_request_card_*` keys in en + zh-Hans.

Tests
- 12 component tests covering each rule of the priority table, the
  legacy-zero stats fallback, and the collapse threshold.
- Reuse of the v3 webhook handler tests confirmed.

Verification
- pnpm typecheck + pnpm test green (60 test files, 536 tests).
- go build ./... + go vet ./... clean.
- 6 demo issues (DEV-2..DEV-7) screenshotted via Playwright; see the
  PR comments for the visual check matrix.

Co-authored-by: multica-agent <github@multica.ai>

* fix(views): collapse PR cards at N>=4, not N>4

The card-vs-collapse threshold used `>` so 4 PRs slipped past it and
all rendered as full cards, contrary to RFC v3 (N >= 4 collapses to
3 cards + compact tail). Switch to `>=` and update the threshold-
boundary test to expect "Show 1 more".
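
The boundary is easy to check by hand. The component itself lives in the
frontend; the arithmetic is sketched in Go here with hypothetical names
(splitPRs, collapseThreshold, visibleCards).

```go
package main

import "fmt"

const (
	collapseThreshold = 4 // N >= 4 collapses
	visibleCards      = 3 // cards kept when collapsed
)

// splitPRs returns how many PRs render as full cards and how many are
// hidden behind the "Show more" toggle.
func splitPRs(n int) (cards, hidden int) {
	if n >= collapseThreshold {
		return visibleCards, n - visibleCards
	}
	return n, 0
}

func main() {
	cards, hidden := splitPRs(4)
	fmt.Printf("%d cards, Show %d more\n", cards, hidden)
	// prints "3 cards, Show 1 more"
}
```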

Co-authored-by: multica-agent <github@multica.ai>

* fix(views): align PR sidebar rows with existing list style

Co-authored-by: multica-agent <github@multica.ai>

* fix(views): hide terminal PR status badges

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-16 21:26:30 +02:00
Jiayuan Zhang
380c6b5122 feat(usage): add Time and Tasks to daily-trend toggle (MUL-2283) (#2709)
Extends the workspace /usage page Daily tokens chart toggle from
Tokens | Cost to Tokens | Cost | Time | Tasks, so users see daily
run-time and task-count trends alongside spend without leaving the page.

- New SQL `ListDashboardRunTimeDaily`: per-date totals from
  agent_task_queue (terminal tasks only), scoped to workspace and
  optionally project. Same time anchor as ListDashboardAgentRunTime
  so day boundaries line up.
- New handler GET /api/dashboard/runtime/daily + TanStack Query option.
- New DailyTimeChart (single-series, smart h/m/s unit) and
  DailyTasksChart (completed + failed stacked).
- Empty-state is per-metric so a workspace with tokens but no terminal
  runs (or vice-versa) doesn't get a false "no data".
- i18n: en + zh-Hans daily.metric_time / metric_tasks + titles.

Co-authored-by: multica-agent <github@multica.ai>
2026-05-15 18:51:02 +02:00
iYuan
d8635ad580 fix(issues): prevent duplicate active issue creation (MUL-2225) (#2602)
* fix: prevent duplicate active issue creation

* fix(issues): address duplicate guard review

* fix(autopilot): skip duplicate issue admissions

* fix(issueguard): tighten duplicate lookup edge cases

* test(issues): cover duplicate guard autopilot skips

* feat(autopilots): group skipped runs in history
2026-05-15 18:27:56 +08:00
Naiyuan Qing
5ad1641b72 Revert "Squad archive dialog + role editor + transactional DeleteSquad (#2680)" (#2687)
This reverts commit 2980ead4c7.
2026-05-15 17:44:59 +08:00
Naiyuan Qing
2980ead4c7 Squad archive dialog + role editor + transactional DeleteSquad (#2680)
* docs(squad): address plan-review feedback for archive + role plan

Resolve the 4 items the reviewer raised on MUL-2265:

1. TS schema: declare `active_issue_count` as optional (`number | null | undefined`)
   so list/create/update Squad responses don't lie about their shape; only
   `getSquad` parses through SquadSchema.
2. Archive semantics: restrict TransferSquadAssignees to active issues
   (status NOT IN done, cancelled) so dialog count and SQL operate on one set
   and terminal-state issues keep their historical assignee.
3. Index assumption: corrected — `idx_issue_assignee (assignee_type,
   assignee_id)` exists and is sufficient at realistic squad cardinality;
   no new index needed.
4. Fixed `*int64` test comparison and added `.loose()` to SquadSchema per
   the local schemas.ts convention.

Co-authored-by: multica-agent <github@multica.ai>

* docs(squad): plan v3 — revert to count-all/transfer-all on archive

Reviewer round 2 surfaced two structural problems with plan v2's
active-only carve-out:

1. useActorName resolves squad names via ListSquads, which filters
   archived_at IS NULL. A closed issue with an archived-squad assignee
   would render as "Unknown Squad".

2. The status-only update path in UpdateIssue skips validateAssigneePair,
   so a done/cancelled issue with an archived-squad assignee could be
   reopened to in_progress, violating the "no active issue on an archived
   squad" invariant enforced elsewhere.

Both problems disappear by reverting to count-all + transfer-all: after
ArchiveSquad runs, no issue points at the archived squad, so neither
case can occur. The product trade-off is that closed historical issues
now show the leader agent instead of the archived squad in their
"Assigned to" badge — consistent with existing agent-level reassignment
behavior elsewhere in the product.

Field rename: active_issue_count -> issue_count.
TransferSquadAssignees SQL is unchanged (already transfers all).

Co-authored-by: multica-agent <github@multica.ai>

* docs(squad): add Task 2b — wrap DeleteSquad transfer + archive in one tx

Reviewer round-3 flagged that the v3 invariant ("after archive no
issue points to the squad") was asserted on the happy path only.
DeleteSquad's current best-effort impl breaks it two ways:
- transfer failure → slog.Warn but archive proceeds (Unknown Squad,
  reopen-into-archived-squad bugs reappear)
- archive failure after a committed transfer → 500 with squad still
  active but emptied

Task 2b rewrites DeleteSquad to run TransferSquadAssignees +
ArchiveSquad inside one pgx tx, mirroring the project.go:266-314
pattern. Publish moves below Commit. Adds two regression tests that
lock both partial-write failure modes.

Co-authored-by: multica-agent <github@multica.ai>

* feat(squad): replace native confirm() with AlertDialog and rewrite role editor as combobox

Backend:
- Add CountIssuesForSquad sqlc query (counts every issue assigned to a squad,
  no status filter — matches the existing transfer-all archive semantics).
- Extend SquadResponse with optional `issue_count` (`*int64` + omitempty,
  populated only by GetSquad to avoid an N+1 in the list endpoint).
- Wrap DeleteSquad's transfer + archive in a single pgx transaction so the
  v3 invariant ("after archive, no issue points to the squad") is durable
  rather than best-effort. Promote slog.Warn to slog.Error and check the
  parseUUIDOrBadRequest ok flag (silent zero-UUID was a #1661-class latent
  bug). Publish only after Commit so realtime never sees rolled-back state.
- Tests cover happy path (count, transfer-all including terminal statuses)
  and both rollback directions (transfer fail / archive fail) via a
  fault-injecting tx wrapper.

Frontend:
- Extend Squad TS type with `issue_count?: number | null` (optional —
  list/create/update legitimately omit it). Add SquadSchema with `.loose()`
  and wrap getSquad with parseWithFallback so older servers and count-error
  responses degrade to the dialog's "no count" copy variant.
- Replace `window.confirm()` with shadcn `ArchiveSquadConfirmDialog`
  (destructive variant, leader name + count + closed-issue caveat in the
  copy, Loader2 while pending). i18n keys added under squads.archive_dialog.
- Rewrite RoleEditor as a Popover + Command combobox: Pencil affordance is
  always visible, suggestions aggregate other members' roles, commit only
  on Enter or selecting a suggestion (blur discards), per-member savingId
  drives Loader2 so the spinner only renders on the row being saved.

Co-authored-by: multica-agent <github@multica.ai>

* fix(squad): discard RoleEditor draft on close and no-op blank Enter

Two reviewer findings on e0d754bf:

1. Closing the Popover (outside click, Esc, trigger re-click) left `query`
   in state, so reopening + Enter would commit the stale draft. Clear
   `query` on every non-saving close path.
2. With an existing role, opening the editor and pressing Enter on an
   empty input committed "" — `commit` only no-op'd when trimmed matched
   value. Treat blank Enter as a no-op; clearing a role would need an
   explicit clear action that doesn't exist yet.

Add two regression tests:
- close (via outside click) → reopen surfaces a clean input; Enter does
  not commit the stale draft
- blank Enter on an existing role does not call onSave

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* fix(squad): add explicit Clear button to RoleEditor

Role is optional, but the previous fix turned blank Enter into a no-op
without exposing any other way to clear an existing role — that broke a
valid terminal state. Keep blank Enter as no-op; add a "Clear role"
button at the bottom of the popover that only renders when value is
non-empty and routes through onSave("").

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-15 17:29:37 +08:00
LinYushen
319b23eb39 Revert "feat(task): add claim lease mechanism (Phase 2, MUL-2246) (#2660)" (#2674)
This reverts commit 3137feecdf.
2026-05-15 16:07:23 +08:00
LinYushen
b7a58c06ac Revert "feat(task): wire claim lease into TaskService and sweeper (MUL-2246) …" (#2673)
This reverts commit bb32be0e50.
2026-05-15 16:06:58 +08:00
LinYushen
bb32be0e50 feat(task): wire claim lease into TaskService and sweeper (MUL-2246) (#2662)
* feat(task): wire claim lease queries into TaskService and sweeper (MUL-2246)

- ClaimTask now uses ClaimAgentTaskWithLease (generates claim_token + lease)
- StartTask accepts optional claim_token for token-verified start
- AgentTaskResponse includes claim_token for daemon to use
- Daemon client sends claim_token in StartTask body
- Sweeper calls RequeueExpiredClaimLeases each tick
- Legacy daemons without claim_token still work (graceful fallback)

Co-authored-by: multica-agent <github@multica.ai>

* fix(task): address PR #2662 review blockers (MUL-2246)

1. ClaimAgentTaskForRuntime: push runtime_id into atomic SQL WHERE clause
   so runtime A cannot claim tasks queued for runtime B under the same agent.

2. Legacy StartAgentTask: add claim_token IS NULL guard so leased rows
   cannot be started without token verification. Handler rejects malformed
   tokens with 400 instead of silently degrading to legacy path.

3. StartAgentTaskWithClaimToken: validate claim_expires_at >= now(),
   preserve claim_token until terminal state (only clear claim_expires_at),
   use CTE + UNION ALL for idempotent retry when daemon resends after a
   lost StartTask response. Return 409 Conflict on token mismatch/expiry.

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): StartTask 409 handling, transport retry, claim_token on FailTask (MUL-2246)

- StartTask 409 (claim superseded): release slot, don't call FailTask
- StartTask transport timeout/5xx: retry once with same token, then
  check task status before failing
- FailTask now sends claim_token; server-side FailAgentTask SQL adds
  AND (claim_token IS NULL OR claim_token = @claim_token) guard so
  stale daemons cannot fail tasks that have been re-claimed

Co-authored-by: multica-agent <github@multica.ai>

* fix(task): close FailTask token bypass and RequeueExpiredClaimLeases liveness gap (MUL-2246)

Blocker 1 - FailTask token validation:
- SQL: change (param IS NULL OR claim_token = param) to
  (param IS NULL AND claim_token IS NULL) OR claim_token = param
  so tokenless requests can only fail legacy (tokenless) rows.
- task.go: malformed claim_token now returns ErrInvalidClaimToken (400)
  instead of being silently dropped to NULL.
- Handler: maps ErrInvalidClaimToken→400, ErrClaimTokenInvalid→409.
- Service: when UPDATE returns no rows but task is still active,
  return ErrClaimTokenInvalid (token mismatch) instead of silent success.

Blocker 2 - RequeueExpiredClaimLeases runtime liveness:
- SQL: JOIN agent_runtime, only requeue tasks where runtime is 'online'.
  Dead/offline runtime tasks stay dispatched for FailTasksForOfflineRuntimes.
- FOR UPDATE → FOR UPDATE OF atq (required with JOIN).

Regression tests:
- task_claim_token_test.go: malformed, tokenless-on-tokened, wrong-token
- requeue_lease_test.go: SQL must JOIN agent_runtime with online filter

Co-authored-by: multica-agent <github@multica.ai>
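
The tightened FailTask guard described above can be read as a small truth table. The following is a minimal Go sketch of that predicate, not the project's code: `canFailTask` is a hypothetical name, and `nil` stands in for SQL NULL.

```go
package main

import "fmt"

// canFailTask mirrors the SQL guard quoted in the commit message:
//   (param IS NULL AND claim_token IS NULL) OR claim_token = param
// The function name is hypothetical; nil stands in for SQL NULL.
func canFailTask(param, rowToken *string) bool {
	if param == nil {
		// Tokenless requests may only fail legacy (tokenless) rows.
		return rowToken == nil
	}
	// A presented token must match the token stored on the row.
	return rowToken != nil && *rowToken == *param
}

func main() {
	current, stale := "token-a", "token-b"
	fmt.Println(canFailTask(nil, nil))            // true: legacy daemon, legacy row
	fmt.Println(canFailTask(nil, &current))       // false: tokenless request on a leased row
	fmt.Println(canFailTask(&current, &current))  // true: matching token
	fmt.Println(canFailTask(&stale, &current))    // false: stale daemon after re-claim
}
```

Under the earlier form, `(param IS NULL OR claim_token = param)`, the second case would have passed, which is the bypass this commit closes.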

* fix(task): move expired lease requeue to ClaimTaskForRuntime preflight, add heartbeat freshness backstop (MUL-2246)

- Add RequeueExpiredClaimLeasesForRuntime: per-runtime preflight self-requeue
  in ClaimTaskForRuntime. Runtime proves liveness by actively claiming, so no
  heartbeat check needed.
- Update global RequeueExpiredClaimLeases to require ar.last_seen_at freshness
  (stale_threshold_secs param). Prevents requeuing to a dead runtime in the
  90s gap between lease expiry (60s) and offline detection (150s).
- Add regression tests verifying the heartbeat freshness check and that the
  preflight query does not join agent_runtime.

Co-authored-by: multica-agent <github@multica.ai>

* fix(task): use LivenessStore for global requeue, move preflight before empty-cache (MUL-2246)

Blocker 1: Global RequeueExpiredClaimLeases now uses LivenessStore.IsAliveBatch
to verify runtimes are truly alive before requeuing expired leases. When
LivenessStore is unavailable (no Redis), global requeue is skipped entirely —
the preflight self-requeue in ClaimTaskForRuntime handles live runtimes. This
closes the 60-150s gap where a dead runtime still appears online in DB.

Blocker 2: Moved RequeueExpiredClaimLeasesForRuntime BEFORE EmptyClaim.IsEmpty
fast-path in ClaimTaskForRuntime. Expired leases are now requeued (which bumps
the empty cache via notifyTaskAvailable) before the empty check can
short-circuit the claim path.

Also adds ListRuntimesWithExpiredClaimLeases SQL query and LivenessChecker
interface on TaskService.

Co-authored-by: multica-agent <github@multica.ai>

* fix(task): wire EmptyClaimCache into backend taskSvc for backstop requeue (MUL-2246)

The backend taskSvc used by the sweeper only had Liveness wired but not
EmptyClaim. When global backstop requeue called notifyTaskAvailable,
s.EmptyClaim.Bump() was a nil no-op — the handler's empty-cache was never
invalidated, so the daemon's next claim hit a stale empty verdict.

Fix: wire the same Redis-backed EmptyClaimCache into the backend taskSvc
in main.go (same Redis keys as router.go:139 handler instance).

Add regression test verifying backstop requeue invalidates the handler's
empty-cache.

Co-authored-by: multica-agent <github@multica.ai>

* fix(task): global backstop must not requeue — alive runtimes use preflight, dead stay dispatched (MUL-2246)

- RequeueExpiredClaimLeases is now a no-op (returns 0 always)
- Alive runtimes self-requeue via ClaimTaskForRuntime preflight
- Dead runtimes stay dispatched for FailTasksForOfflineRuntimes
- Rewriting to queued on dead runtime creates 2h blackhole (offline
  sweeper only handles dispatched/running)
- Test actually calls RequeueExpiredClaimLeases and asserts 0 in all cases

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): remove duplicate usage reporting block after merge conflict (MUL-2246)

The merge resolution introduced a second ReportTaskUsage call after the
status check, duplicating the usage-before-early-return block that already
runs right after runner.run. Remove the duplicate and add a regression test
asserting /usage is called exactly once on the normal completion path.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-15 15:15:31 +08:00
LinYushen
3137feecdf feat(task): add claim lease mechanism (Phase 2, MUL-2246) (#2660)
Add claim_token + claim_expires_at columns to agent_task_queue and three
new SQL queries for the claim lease protocol:

- ClaimAgentTaskWithLease: generates a UUID token and sets a lease expiry
  when claiming a task, so the daemon must prove it received the response
- StartAgentTaskWithClaimToken: validates the token on StartTask, preventing
  stale daemons from starting requeued tasks
- RequeueExpiredClaimLeases: moves dispatched tasks with expired leases back
  to queued for re-claim

This closes the reliability gap where a claim response lost in transit
leaves a task stuck in dispatched until the 60s dispatch timeout fires.

Co-authored-by: multica-agent <github@multica.ai>
2026-05-15 15:14:05 +08:00
Naiyuan Qing
f29bd93444 feat(squads): rework Create Squad modal (MUL-2233) (#2645)
* feat(squad): accept avatar_url on CreateSquad

Threads avatar_url through the SQL query, sqlc-generated code, and the Go
handler so the create-squad flow can persist an avatar at creation time
instead of forcing a follow-up PATCH.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* feat(squad): add avatar_url to CreateSquadRequest

Extends the TS contract for the new backend field so the frontend can pass
an uploaded avatar URL through api.createSquad.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* feat(squads): rework Create Squad modal to match CreateAgentDialog (MUL-2233)

Replaces the cramped small-dialog flow with the same large-dialog shape used
by Create Agent: identity row (AvatarPicker + name + description with char
counter), grouped Leader picker (My Agents first, then Workspace Agents),
and a new multi-select Additional Members picker covering agents and
workspace members. The members trigger collapses to "+N" once more than
three are selected; promoting an agent to leader auto-drops it from the
additional-members list.

After createSquad, additional members are attached via Promise.allSettled
so a single failure surfaces a warning toast without blocking navigation —
the squad still exists and the user can retry from the Members tab.

Adds packages/views/modals/create-squad.test.tsx covering identity binding,
leader-group ordering, leader/member conflict sanitization, the empty- and
partial-failure success paths, and the create-failure recovery path.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* fix(squads): valid trigger HTML + drop conflicted leader from members

Two issues from PR #2645 review:

1. AdditionalMembersPicker's PopoverTrigger was a <button> containing
   MemberChip's remove <button>, which React/HTML flags as nested
   interactive content (hydration + a11y warning). Render the trigger as
   a <div role="combobox"> via Base UI's render prop so the chip's
   remove button is valid.

2. sanitizedMembers only hid the leader from rendered/submitted output,
   so promoting an additional member to leader then switching leader
   away resurrected the hidden pick. Drop it from selectedMembers at
   the moment of promotion via handleLeaderChange; sanitizedMembers is
   no longer needed.

Adds a test that promotes → switches leader and asserts the member is
not resubmitted.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-15 13:11:08 +08:00
Jiayuan Zhang
4d6b5ad06f fix(squad): wake leader when dual-role agent posts as worker (MUL-2218) (#2626)
* fix(squad): wake leader when dual-role agent posts as worker (MUL-2218)

The squad-leader self-trigger guard skipped a comment whenever the
author equalled the squad's leader id, regardless of the role the agent
was acting in. For an agent that holds both leader and worker roles in
the same squad, this meant the leader role never reacted to its own
worker output and the issue stalled.

Tag each enqueued task with is_leader_task and consult the agent's
most recent task on the issue from both self-trigger guards (comment
path + @squad mention path) — skip only when that task was itself a
leader task.

Co-authored-by: multica-agent <github@multica.ai>
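
The role-aware guard described above could look roughly like this. It is a sketch under assumptions: the `task` struct and `shouldSkipLeaderTrigger` are hypothetical names, and treating "no prior task" as "do not skip" is a guess the commit message does not confirm.

```go
package main

import "fmt"

// task is a hypothetical, trimmed-down view of an enqueued task row.
type task struct {
	IsLeaderTask bool
}

// shouldSkipLeaderTrigger reports whether a comment should be ignored by the
// self-trigger guard. lastTask is the author's most recent task on the issue
// (nil when there is none, which this sketch assumes means "do not skip").
func shouldSkipLeaderTrigger(authorID, leaderID string, lastTask *task) bool {
	if authorID != leaderID {
		return false // someone else wrote the comment
	}
	// Skip only when the leader is reacting to its own leader-role output.
	return lastTask != nil && lastTask.IsLeaderTask
}

func main() {
	// Dual-role agent posted as a worker: the leader role should wake up.
	fmt.Println(shouldSkipLeaderTrigger("agent-1", "agent-1", &task{IsLeaderTask: false})) // false
	// Same agent posted as leader: skip to avoid a self-trigger loop.
	fmt.Println(shouldSkipLeaderTrigger("agent-1", "agent-1", &task{IsLeaderTask: true})) // true
	// A different author never hits the guard.
	fmt.Println(shouldSkipLeaderTrigger("agent-2", "agent-1", &task{IsLeaderTask: true})) // false
}
```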

* fix(squad): inherit is_leader_task on retry task clone (MUL-2218)

CreateRetryTask cloned a parent task into a fresh queued attempt but
omitted is_leader_task from the column list, so the child silently fell
back to the column default (false). For a leader task that hit auto-retry
through MaybeRetryFailedTask, the retried task posed as a worker task —
the self-trigger guard then no longer recognised the leader's own
comments, re-opening the very loop MUL-2218 closes.

Inherit p.is_leader_task in the clone and add a query-level test that
covers both leader and worker retries.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-14 15:23:36 +02:00
LinYushen
0cb759b446 fix(squad): suppress no-action leader comments (#2583)
2026-05-14 14:07:26 +08:00
LinYushen
add3135a42 feat(cli): add squad create/update/delete and member add/remove (#2574)
* feat(cli): add squad create/update/delete and member add/remove commands

Implement missing squad management commands in the CLI:
- squad create --name --leader [--description]
- squad update <id> [--name] [--description] [--instructions] [--leader] [--avatar-url]
- squad delete <id>
- squad member add <squad-id> --member-id --type [--role]
- squad member remove <squad-id> --member-id --type

Also adds DeleteJSONWithBody to the API client for the member remove
endpoint which uses DELETE with a JSON body.

All commands support --output json for structured output.

Co-authored-by: multica-agent <github@multica.ai>
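
For readers unfamiliar with DELETE requests that carry a body, here is a minimal sketch using only the Go standard library. `newDeleteJSONRequest`, the URL, and the payload field names are all hypothetical; the actual DeleteJSONWithBody helper may be shaped differently.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// newDeleteJSONRequest builds a DELETE request whose body is the JSON
// encoding of payload. Hypothetical helper, for illustration only.
func newDeleteJSONRequest(url string, payload any) (*http.Request, error) {
	buf, err := json.Marshal(payload)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodDelete, url, bytes.NewReader(buf))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// Field names below are assumptions, loosely based on the CLI flags.
	payload := map[string]string{"member_id": "m-1", "member_type": "agent"}
	req, err := newDeleteJSONRequest("http://example.invalid/api/squads/s-1/members", payload)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.Header.Get("Content-Type"), req.ContentLength > 0)
}
```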

* fix(squad): add --output json to delete/member remove, return 404 on 0-row delete

- squad delete: add --output json flag, emit {id, deleted} on success
- squad member remove: add --output json flag, emit {squad_id, member_id, removed}
- Backend RemoveSquadMember: change query to :execrows, check RowsAffected
  and return 404 'squad member not found' when 0 rows deleted

Co-authored-by: multica-agent <github@multica.ai>
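
The 0-row check above amounts to mapping the affected-row count to an HTTP status. A minimal sketch follows; `removeMemberStatus` is a hypothetical name and the 204 success status is an assumption, since the commit only specifies the 404 case.

```go
package main

import (
	"fmt"
	"net/http"
)

// removeMemberStatus maps the row count returned by an :execrows-style
// delete to an HTTP status and message. The 204 success code is assumed.
func removeMemberStatus(rowsAffected int64) (int, string) {
	if rowsAffected == 0 {
		return http.StatusNotFound, "squad member not found"
	}
	return http.StatusNoContent, ""
}

func main() {
	code, msg := removeMemberStatus(0)
	fmt.Println(code, msg) // 404 squad member not found
	code, _ = removeMemberStatus(1)
	fmt.Println(code) // 204
}
```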

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-14 12:51:44 +08:00
LinYushen
29082f7cfe feat: implement Squad feature MVP (#2505)
* feat: implement Squad feature MVP

- Add migration 084_squad: squad, squad_member, squad_activity_log tables
- Extend issue.assignee_type to support 'squad'
- Add sqlc queries for squad CRUD, member management, activity logs
- Add Go handler with full Squad API (CRUD, members, activity log)
- Register routes: /api/squads/*, /api/issues/{id}/squad-activity, /api/squad-activity
- Add Squad trigger logic:
  - Assign Squad immediately triggers leader
  - Every external comment on squad-assigned issue triggers leader
  - Anti-loop: squad members' comments don't trigger leader
  - Dedup: skip if leader already has pending task
- Add squad activity log API (Option B) for leader no-op recording
- Add frontend TypeScript types (Squad, SquadMember, SquadActivityLog)
- Add protocol events: squad:created, squad:updated, squad:deleted

Co-authored-by: multica-agent <github@multica.ai>

* fix: address PR review blocking issues

1. validateAssigneePair now accepts 'squad' assignee_type
2. All squad endpoints validate workspace ownership via GetSquadInWorkspace
3. CreateSquadActivityLog restricted to squad leader agent only
4. AddSquadMember validates member exists in workspace
5. UpdateSquad auto-adds new leader to squad members
6. DeleteSquad transfers assigned issues to leader before deletion
7. IssueAssigneeType includes 'squad' in frontend types

Co-authored-by: multica-agent <github@multica.ai>

* feat: soft-delete squads via archive instead of hard delete

- Add migration 085: archived_at + archived_by columns on squad table
- ListSquads now excludes archived squads (ListAllSquads for admin)
- DeleteSquad → ArchiveSquad (sets archived_at, preserves all records)
- Transfer squad-assigned issues to leader before archiving
- SquadResponse includes archived_at/archived_by fields
- Frontend Squad type updated with nullable archived fields

Co-authored-by: multica-agent <github@multica.ai>

* feat: re-add Squads frontend entry (sidebar nav + pages)

Re-applies the frontend squad entry that was lost during a merge:
- Sidebar nav: Squads item with Users icon
- Paths: squads() and squadDetail() in workspace paths
- Routes: /squads and /squads/[id] pages
- Views: SquadsPage (list) and SquadDetailPage
- i18n: en 'Squads' / zh '小队'
- Reserved slug: 'squads'

Co-authored-by: multica-agent <github@multica.ai>

* fix: fix SquadsPage rendering - use PageHeader children pattern

PageHeader takes children, not title/actions props. The incorrect
usage caused a React rendering error. Now matches the pattern used
by autopilots and agents pages.

Co-authored-by: multica-agent <github@multica.ai>

* fix(squads): add API client methods and package export for squads pages

* feat: complete Squad frontend - create dialog, member management, API methods

- Add CreateSquadModal with name/description/leader selection
- Register 'create-squad' in modal registry
- Wire 'New Squad' button to open the modal
- Add full API client methods: createSquad, updateSquad, deleteSquad,
  addSquadMember, removeSquadMember
- Rewrite SquadDetailPage with:
  - Member list showing resolved names
  - Add/remove member UI
  - Archive squad button
  - Back navigation to squads list

Co-authored-by: multica-agent <github@multica.ai>

* feat: improve Squad UI - match create agent dialog style

- CreateSquadModal: proper Dialog with Header/Description/Footer,
  agent picker with avatars, textarea for description
- SquadDetailPage: centered max-w-2xl layout, ActorAvatar for members,
  Crown badge for leader, textarea for member description,
  improved spacing and visual hierarchy
- Renamed 'role' field label to 'Description' in add member form
  (describes the member's responsibilities in the squad)

Co-authored-by: multica-agent <github@multica.ai>

* feat(squad): add avatar, instructions; drop unique-name constraint

- 086: add squad.avatar_url
- 087: drop unique constraint on squad.name (squads with the same
  name are legitimate across teams; uniqueness was an accidental
  product constraint)
- 088: add squad.instructions (text, default '')
- UpdateSquad now COALESCEs avatar_url + instructions
- handler exposes Instructions in SquadResponse and accepts it in
  UpdateSquad

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(squad): assignable + mention target; trigger leader on assign

- assignee picker and @mention suggestion list squads alongside
  agents and members; renders squad avatar/icon
- creating or updating an issue with assignee_type=squad enqueues
  a task for the squad's current leader (mirrors agent-assignee
  parking-lot rule: skip backlog only)
- workspace queries/hooks expose squads where needed for the
  pickers
- locales updated for new picker copy

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(squad): agent-style detail page with members + instructions tabs

- restructure squad detail page to mirror the agent detail page:
  320px inspector (creator, leader, created/updated) + tabbed
  pane (Members | Instructions) with dirty-guard AlertDialog
- inline name + avatar editing on the inspector
- inline description editor (modal textarea)
- members tab: leader + member picker with role descriptions,
  swap leader, edit member roles, remove
- instructions tab: ContentEditor + Save (mirrors agent pattern)
- squads list shows the squad avatar/icon
- core types + api.updateSquad accept avatar_url + instructions

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(squad): inject leader briefing on claim (protocol + roster + instructions)

When a squad's leader agent claims a task on a squad-assigned issue,
append a system-level briefing to the agent's Instructions composed of:

1. Squad Operating Protocol — hard-coded rules: leader is a
   coordinator, dispatch via @mention, stop after dispatching,
   resume on re-trigger, do not work outside the roster.
2. Squad Roster — leader self-row plus one row per non-archived
   member with a literal mention markdown string ([@Name](mention://
   agent|member/<UUID>)) the leader can paste verbatim. Round-trips
   through util.ParseMentions, enforced by a contract test.
3. Squad Instructions — the user-defined squad.instructions block,
   omitted entirely when empty so we do not leave a dangling heading.

Non-leader members claiming the same issue receive no briefing.

Tests cover: full squad with mixed agent/human members, lone leader,
archived agents skipped, empty user instructions, mention round-trip,
and the leader/non-leader claim-handler gate.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
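
The roster's literal mention string can be sketched as below. This only illustrates the `[@Name](mention://agent|member/<UUID>)` shape quoted above; `mentionMarkdown` is a hypothetical helper and does not reproduce util.ParseMentions or the real briefing builder.

```go
package main

import "fmt"

// mentionMarkdown renders the mention markdown a leader can paste verbatim.
// kind must be "agent" or "member", matching the two forms in the commit text.
func mentionMarkdown(kind, name, id string) (string, error) {
	if kind != "agent" && kind != "member" {
		return "", fmt.Errorf("unsupported mention kind %q", kind)
	}
	return fmt.Sprintf("[@%s](mention://%s/%s)", name, kind, id), nil
}

func main() {
	// The ID is a placeholder, not a real UUID from the project.
	s, err := mentionMarkdown("agent", "Reviewer", "00000000-0000-0000-0000-000000000001")
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```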

* fix(squad): tell leader not to restate issue context in dispatch comment

After observing leaders padding their delegation comments with full
re-summaries of the issue body and prior discussion, make the
Operating Protocol explicit:

- assignees on Multica already have the full issue (title,
  description, all comments, attachments) and workspace context;
- delegation comments should add only what cannot be inferred
  (who is picked, why, extra constraints), aim for two or three
  sentences;
- restating context is now an explicit hard rule violation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(squad): unify leader evaluation into activity_log, add CLI command

- Squad member comments now trigger leader (only leader self-excluded)
- Replace squad_activity_log with activity_log (action: squad_leader_evaluated)
- Add CLI: multica squad activity <issue-id> <outcome> --reason
- Add API: POST /api/issues/{id}/squad-evaluated
- Update squad operating protocol to require evaluation recording
- Remove squad_activity_log table from schema and generated code

* feat(cli): add squad list, get, member list commands

* fix(squad): address review findings (P1+P2)

P1 fixes:
- Add 'squads' to reserved_slugs.json (source of truth)
- Add 'create-squad' to ModalType union
- Remove unused leaderOpen/selectedLeader in create-squad modal
- Replace literal JSX strings with i18n selectors (en + zh-Hans)

P2 fixes:
- Add 'squad' to mention regex (MentionRe)
- Fix human member lookup in squad briefing (use GetUser directly)
- Add squads routes to desktop app
- Add squad:created/updated/deleted to WSEventType + invalidation
- Reject archived squads as issue assignees

* fix(squad): restore zh-Hans key, publish activity event, invalidate issues on archive

- Restore create_project.title in zh-Hans modals.json (dropped by prior edit)
- Publish activity:created WS event after squad leader evaluation
- Invalidate issue queries on squad:deleted (archive transfers assignees)
- Add creator info to squad list cards

* fix(squad): realtime sync, rerun support, leader validation

- Use workspaceKeys.squads prefix for detail/member queries (realtime invalidation)
- Publish squad:updated after add/remove/role-change member mutations
- Support rerun for squad-assigned issues (targets leader agent)
- Reject assignment to squads whose leader is archived

---------

Co-authored-by: multica-agent <github@multica.ai>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-13 18:46:20 +08:00
Naiyuan Qing
623d29f276 feat(agents): one-click create from curated templates (Phase 1) (#2520)
* docs(agents): three-phase agent quick-create plan

Captures the full design for moving agent creation from manual form +
one-by-one skill attachment to a tiered experience:

- Phase 1 (this PR): one-click curated templates, AI-free.
- Phase 2 (next): AI-recommended skills via the existing quick-create
  task mechanism — no new server-side LLM dependency.
- Phase 3 (later): AI creates the whole agent end-to-end, composing
  Phase 2 with a new `multica agent create` CLI driver.

Documents the architectural decisions that keep all three phases on
existing infrastructure (no SSE, no server-side LLM SDK, no new WS
channels), the two soft blockers Phase 1 unlocks for later phases
(createSkillWithFiles TX composability + skill same-name dedupe), and
the scope decisions we explicitly opted out of (Anthropic plugin
marketplace, ClawHub UI affordances).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(skills): harden import against invalid UTF-8 and binary files

PG rejects two byte patterns in a TEXT column. Both crashed real skill
imports we hit while assembling the template catalog:

- Embedded NUL (0x00) -> SQLSTATE 22021. Already stripped by
  sanitizeNullBytes, kept as-is.
- Other invalid UTF-8 (e.g. 0x91 — Windows-1252 smart quote in a skill
  whose author saved prose from Word). sanitizeNullBytes now also runs
  strings.ToValidUTF8 over the content so the second class no longer
  takes the whole import down.

For non-text payloads (images, fonts, archives, compiled binaries),
sanitization isn't the right fix — agents never read those as text,
and the bytes can't survive a TEXT column at all. addFile now skips
them by extension before the per-bundle cap counters tick, logging
the skip so an unexpected drop leaves a breadcrumb.

Function name kept for compatibility with the many call sites; both
behaviours are strict supersets of the original.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
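
A minimal sketch of the two defenses described above, using only the standard library. `sanitizeText` and `isBinaryExt` are hypothetical names, the extension list is an assumed sample rather than the project's real list, and dropping invalid bytes (instead of substituting a replacement character) is also an assumption.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// sanitizeText removes NUL bytes and any invalid UTF-8 sequences.
// NUL is valid UTF-8, so it needs its own pass before ToValidUTF8.
func sanitizeText(s string) string {
	s = strings.ReplaceAll(s, "\x00", "")
	return strings.ToValidUTF8(s, "")
}

// binaryExts is an assumed sample of extensions to skip on import.
var binaryExts = map[string]bool{
	".png": true, ".jpg": true, ".gif": true, // images
	".woff": true, ".ttf": true, // fonts
	".zip": true, ".gz": true, // archives
	".exe": true, ".so": true, // compiled binaries
}

// isBinaryExt reports whether a file should be skipped based on its extension.
func isBinaryExt(name string) bool {
	return binaryExts[strings.ToLower(filepath.Ext(name))]
}

func main() {
	// 0x91 is a Windows-1252 smart quote and is not valid UTF-8 on its own.
	fmt.Println(sanitizeText("caf\x91e\x00")) // cafe
	fmt.Println(isBinaryExt("assets/logo.png"), isBinaryExt("SKILL.md")) // true false
}
```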

* refactor(skills): split createSkillWithFiles for tx composition + add workspace find-or-create query

Two soft blockers cleared so create-from-template (next commit) can
fold N skill creates and the agent + binding writes into one outer
transaction:

1. createSkillWithFiles used to Begin/Commit its own tx. Caller
   composition was impossible — N invocations meant N separate
   transactions and no atomicity over the whole materialise step.
   Pull the body into createSkillWithFilesInTx(ctx, qtx, input); the
   original function becomes a thin wrapper that manages its own tx
   for standalone callers. Existing call sites: zero behaviour change.

2. Add GetSkillByWorkspaceAndName sqlc query — workspace skill lookup
   by name, anchored to UNIQUE(workspace_id, name) from migration
   008. Lets the template materialiser implement find-or-create:
   reuse the workspace's existing skill row when a template
   references the same name, rather than crashing on the unique
   constraint or polluting the workspace with `<name>-2` clones.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(agents): agent template catalog + create-from-template endpoint

Server-side foundation for Phase 1 of the quick-create roadmap (see
docs/agent-quick-create-plan.md). Adds:

- server/internal/agenttmpl/ — embed-loaded catalog of curated agent
  templates. Each template ships pre-written instructions plus a list
  of skill URLs that get materialised into the workspace at create
  time. Validation runs at startup (init() panics on a malformed
  template) so a bad JSON ships as a deploy-time defect, not a
  runtime 500. Slug must equal the filename basename so the URL
  router is mirror-symmetric with the file layout.

- 11 starter templates covering Engineering / Writing / Building /
  Testing (code-reviewer, frontend-builder, planner, docs-writer,
  one-pager, html-slides, full-stack-engineer, …).

- Three new endpoints, all behind RequireWorkspaceMember:
    GET  /api/agent-templates           — picker list (no instructions)
    GET  /api/agent-templates/:slug     — detail with instructions
    POST /api/agents/from-template      — materialise + create

  Create flow:
    1. Auth + runtime authorization happen BEFORE the GitHub fan-out
       so a 403 never wastes 20s of upstream fetches.
    2. Pre-flight dedupe by cached_name reuses workspace skills
       without an HTTP fetch — second create-from-the-same-template
       drops from 20s to <100ms.
    3. Parallel fetch (30s per-URL timeout) for the remaining skills.
    4. Single transaction: every skill insert, the agent insert, and
       the agent_skill bindings. On any upstream fetch failure the TX
       rolls back and the API returns 422 with `failed_urls` so the
       UI can name the bad source(s).
    5. extra_skill_ids (user-supplied additions) are verified through
       GetSkillInWorkspace per id before attach, so a malicious client
       can't graft a skill from another workspace via UUID guessing.

- multica agent create --from-template <slug> CLI flag dispatches to
  the new endpoint with a 60s ceiling, matching `multica skill import`.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(agents): one-click create-from-template UI

Frontend half of Phase 1. CreateAgentDialog becomes a state machine
spanning four steps:

  chooser          → Start blank / From template cards
  blank-form       → existing manual form (post-chooser)
  duplicate-form   → existing form pre-filled from a duplicated agent
  template-picker  → grid of templates, click navigates to detail
  template-detail  → instructions + skill list preview + one-click Use

Picking a template never lands on the form: name auto-deduped against
existingAgentNames, runtime = first usable one, visibility = private.
Refinement happens on the agent detail page if needed. Same rationale
the doc spells out — templates exist precisely to skip configuration.

New components, all collapsible-by-default so quick-create stays fast:
  - template-picker.tsx — categorised grid, lucide icons + semantic
    accent tokens resolved through static maps so Tailwind's JIT picks
    up every variant (dynamic class strings would silently miss).
  - template-detail.tsx — instructions preview, skill list with cached
    descriptions, Use CTA. Renders the failedURLs banner when a 422
    fires — the only step that can trigger that response.
  - instructions-editor.tsx — collapsed preview-card / expanded full
    ContentEditor.
  - skill-multi-select.tsx + skill-picker-list.tsx — shared multi-
    select surface, also adopted by the existing skill-add-dialog.
  - avatar-picker.tsx — agent avatar upload, mirrors the inspector's
    visual language.

Schema-defended client (CLAUDE.md → API Response Compatibility): the
three new endpoints are wired through parseWithFallback with lenient
zod schemas. Desktop builds outlive any given server — a future
field rename / wrapping must not white-screen older installs.
listAgentTemplates accepts both the current bare array and a future
{templates: [...]} envelope. Coverage: 7 new schema-test cases in
schema.test.ts (null body, missing skills/instructions, malformed
create response, envelope migration).

Catalog + detail go through TanStack Query with staleTime: Infinity —
workspace-independent static data, no per-mount refetch.

Other:
- skill-add-dialog becomes a true multi-select (Confirm button +
  checkbox list); attached skills are filtered out of the list.
- agents-page hands the freshly-created Agent back to the dialog so a
  follow-up setAgentSkills can attach the form-selected skills.
- agent-overview-pane drops the mx-auto/max-w-2xl frame on config-
  tab content; the wider dialog visual language reads better with
  tabs filling the column.
- Every new UI string lives in both en/agents.json and
  zh-Hans/agents.json under create_dialog.* / tab_body.skills.* —
  locales/parity.test.ts blocks drift in CI.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(ci): align skill import test + drop next-only lint suppression

- TestFetchFromSkillsSh_ResolvesRootLevelSkillMd now expects assets/logo.png
  to be skipped; matches the new addFile binary-extension guard
  (6fafd86e). The .png is intentionally dropped so PG TEXT inserts don't
  hit SQLSTATE 22021.
- packages/views shares zero next/* deps, so the @next/next/no-img-element
  eslint plugin isn't loaded there. The eslint-disable directive
  referencing it produced a hard "rule not found" error in CI lint. Raw
  <img> is the right primitive in views; remove the disable comment.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* test(agents): wrap CreateAgentDialog tests in workspace/navigation providers

The dialog now calls useNavigation() and useWorkspacePaths(), both of
which throw outside their providers. The existing tests rendered the
dialog bare and tripped both new requirements:

- NavigationProvider — supply a stub adapter so push() works for the
  agent-detail redirect.
- WorkspaceSlugProvider — useWorkspacePaths() requires a slug.

The blank-vs-template chooser is now the default first step; the
existing tests target the runtime picker on the manual form, so the
helper auto-clicks "Start blank" when no template is passed
(duplicate-mode tests skip the chooser).

Manual afterEach(cleanup) + document.body wipe. Base UI's Dialog
portal renders into document.body and leaves focus-guard/inert wrapper
divs behind across tests, so the second test in the suite saw two
"All" / "My Runtime" matches and getByText failed. The wipe is local
to this file rather than the shared setup because it isn't a global
issue — only suites that open Base UI dialogs hit it.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-13 18:26:04 +08:00
Bohan Jiang
96695a79c5 feat(dashboard): workspace/project token + run-time dashboard MUL-1882 (#2462)
* feat(dashboard): workspace/project token + run-time dashboard

Add a `/{slug}/dashboard` page showing per-agent token spend and execution
time across the whole workspace, with an optional project filter.

Backend:
  - Three new sqlc queries against task_usage + agent_task_queue: daily
    usage, per-agent usage, per-agent total run-time. All optionally
    scoped to a project via sqlc.narg('project_id'), reaching project
    through the issue join.
  - Handlers under /api/dashboard return the same wire shape the runtime
    page already consumes (model preserved for client-side cost math).

Frontend:
  - Shared DashboardPage in packages/views/dashboard reusing KpiCard,
    DailyCostChart, ActorAvatar, and estimateCost from the runtime page
    so the visual style and pricing math stay in lock-step.
  - Period selector (7/30/90d), project dropdown, four KPI tiles
    (cost, tokens, run time, tasks), daily cost chart, and a combined
    "cost + run time by agent" list.
  - Routed in both web (app/[slug]/(dashboard)/dashboard) and desktop
    (memory router); sidebar nav entry added under Workspace group.

Co-authored-by: multica-agent <github@multica.ai>

* fix(dashboard): drop stale project filter and stop double-counting tasks

Two issues caught in PR #2462 review:

1. Project filter held the previous selection's UUID across workspace
   switches and project deletions: the dropdown gracefully showed
   "All projects" (because the title lookup missed) while the three
   dashboard queries kept forwarding the dead UUID, leaving the UI
   looking like a full-workspace view but populated with empty
   project-scoped data. Validate the picked UUID against the current
   projects list before passing it to the queries.

2. The "by agent" table read its task count from the token rollup,
   which is grouped per (agent, model). A single task that spans two
   models lands twice and the agent's row reads e.g. "2 tasks" when
   the real count is 1. Prefer `ListDashboardAgentRunTime`'s per-agent
   distinct count when available; fall back to the token aggregate
   only for agents with no terminal run yet (in-flight tasks).

Extract the merge into `mergeAgentDashboardRows` so the precedence
rules are unit-tested directly.
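
The precedence rule can be illustrated with a minimal Go sketch. This is not the `mergeAgentDashboardRows` helper itself (which is not shown here); the type and function names below are hypothetical, and the sketch assumes the run-time aggregate returns at most one row per agent.

```go
package main

import "fmt"

// tokenRow is one line of a token rollup grouped per (agent, model).
// A task that used two models contributes to two rows.
type tokenRow struct {
	AgentID   string
	Model     string
	Tokens    int64
	TaskCount int
}

// runTimeRow is one line of a per-agent run-time aggregate whose
// TaskCount is a distinct count of that agent's tasks.
type runTimeRow struct {
	AgentID    string
	RunSeconds int64
	TaskCount  int
}

// agentRow is the merged per-agent row shown in the table.
type agentRow struct {
	Tokens     int64
	RunSeconds int64
	TaskCount  int
}

// mergeAgentRows (hypothetical name) sums tokens across models and
// prefers the distinct task count from the run-time aggregate. The
// summed token-side count is only a fallback for agents that have no
// run-time row yet.
func mergeAgentRows(tokens []tokenRow, runs []runTimeRow) map[string]agentRow {
	out := make(map[string]agentRow)
	for _, t := range tokens {
		row := out[t.AgentID]
		row.Tokens += t.Tokens
		row.TaskCount += t.TaskCount // fallback, may double-count
		out[t.AgentID] = row
	}
	for _, r := range runs {
		row := out[r.AgentID]
		row.RunSeconds = r.RunSeconds
		row.TaskCount = r.TaskCount // distinct count wins
		out[r.AgentID] = row
	}
	return out
}

func main() {
	tokens := []tokenRow{
		{AgentID: "a1", Model: "model-x", Tokens: 100, TaskCount: 1},
		{AgentID: "a1", Model: "model-y", Tokens: 50, TaskCount: 1},
		{AgentID: "a2", Model: "model-x", Tokens: 10, TaskCount: 1},
	}
	runs := []runTimeRow{{AgentID: "a1", RunSeconds: 42, TaskCount: 1}}
	merged := mergeAgentRows(tokens, runs)
	fmt.Println(merged["a1"].TaskCount, merged["a2"].TaskCount)
}
```

Agent `a1` reports one task even though its single task spans two models, while `a2` keeps the token-side count because it has no run-time row.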

Co-authored-by: multica-agent <github@multica.ai>

* test(dashboard): allocate per-workspace issue.number explicitly

TestDashboardEndpoints creates two issues in the shared fixture
workspace. issue.number defaults to 0 (migration 020), and the table
carries UNIQUE (workspace_id, number), so the second insert raced the
first on the same default and failed in CI.

Allocate MAX(number) + 1 per insert so each row gets a fresh number
without stepping on rows other tests left behind in the same workspace.

Co-authored-by: multica-agent <github@multica.ai>

* feat(dashboard): rollup table + cron-driven aggregation for dashboard

Mirror the per-runtime rollup in `task_usage_daily` (migrations 073/077/082)
to remove the per-request raw aggregation the dashboard was doing.

Migration 084 adds:
  - `task_usage_dashboard_daily` keyed on
    (bucket_date, workspace_id, agent_id, project_id, model) — the
    dimensions the dashboard actually queries, with project_id nullable
    via UNIQUE NULLS NOT DISTINCT (PG15+) so "no-project" buckets
    upsert cleanly.
  - `task_usage_dashboard_rollup_state` watermark table.
  - `task_usage_dashboard_dirty` invalidation queue.
  - Triggers on agent_task_queue DELETE, task_usage DELETE, and
    issue.project_id UPDATE — the cases the updated_at watermark can't
    see. The project_id trigger re-attributes existing rollup rows when
    a user moves an issue across projects.
  - `rollup_task_usage_dashboard_daily_window(from, to)` —
    idempotent recompute primitive (same shape as 077).
  - `rollup_task_usage_dashboard_daily()` cron entry — own advisory
    lock (4244) so it serialises independently of the runtime rollup.
  - `task_usage_dashboard_rollup_lag_seconds()` health helper.

Sqlc queries `ListDashboardUsageDailyRollup` /
`ListDashboardUsageByAgentRollup` read from the new table; the handler
dispatches between rollup and raw on a separate
`UseDailyRollupForDashboard` config flag
(`USAGE_DASHBOARD_ROLLUP_ENABLED` env). Same fail-safe default (false →
raw) so operators can roll out independently of the per-runtime flag.

Bucket date is UTC (the dashboard aggregates across runtimes that may
sit in different tzs; there's no single correct local boundary).
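
A small Go sketch of what a UTC bucket date means for events stamped in different offsets. The function name `utcBucketDate` is an assumption for illustration; the actual bucketing happens in SQL inside the rollup functions.

```go
package main

import (
	"fmt"
	"time"
)

// utcBucketDate (hypothetical name) parses an RFC 3339 timestamp and
// returns the calendar date it falls on in UTC, regardless of the
// offset the timestamp was written with.
func utcBucketDate(rfc3339 string) (string, error) {
	t, err := time.Parse(time.RFC3339, rfc3339)
	if err != nil {
		return "", err
	}
	return t.UTC().Format("2006-01-02"), nil
}

func main() {
	for _, ts := range []string{
		"2026-05-13T02:00:00+08:00", // early morning in UTC+8
		"2026-05-12T20:00:00-05:00", // evening in UTC-5
	} {
		d, err := utcBucketDate(ts)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Println(ts, "->", d)
	}
}
```

The first event lands in the 2026-05-12 bucket and the second in 2026-05-13, which is the trade-off of a single UTC boundary across runtimes in different zones.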

Adds `cmd/backfill_task_usage_dashboard_daily` mirroring the existing
per-runtime backfill — operator runs it once before flipping the flag.

Tests:
  - TestDashboardEndpoints now also exercises the rollup read path
    (raw vs. rollup, same project-scoped totals).
  - TestDashboardRollupReattributesOnProjectChange verifies the
    issue.project_id trigger enqueues both old + new buckets and the
    next rollup tick zeroes the old project + populates the new one.

Co-authored-by: multica-agent <github@multica.ai>

* fix(dashboard-rollup): close two invalidation gaps

Two leak paths missed by migration 084 review:

1. Issue cascade DELETE — the atq BEFORE DELETE trigger runs AFTER the
   issue row is gone, so `LEFT JOIN issue` returns NULL project_id and
   the original-project bucket never gets cleared (migration 077 calls this
   out for the runtime rollup but didn't need to act on it). Adds an
   `issue BEFORE DELETE` trigger that enqueues using OLD.project_id
   while the issue row is still readable.

2. `LinkTaskToIssue` (quick-create task attaching to a real issue post-
   completion) UPDATEs `agent_task_queue.issue_id` from NULL to a real
   id. Migration 084 only watched DELETE on atq, so usage already
   rolled up under the no-project bucket stayed attributed to NULL
   forever. Extends the atq trigger to fire on UPDATE OF issue_id too,
   enqueueing both OLD (NULL project) and NEW (linked issue's project).

Tests:
  - TestDashboardRollupClearsOnIssueDelete asserts rollup row drops to
    zero after issue delete + rollup tick.
  - TestDashboardRollupReattributesOnLinkTaskToIssue verifies tokens
    move from the NULL bucket to the project bucket after the UPDATE.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-13 12:51:16 +08:00
Bohan Jiang
a02e58b488 fix(github): only auto-close issue after all linked PRs resolve (#2470)
* fix(github): only auto-close issue when all linked PRs have resolved

Previously, the webhook handler unconditionally moved an issue to `done`
as soon as a single linked PR was merged. If a second PR was also linked
to the same issue and still open / draft, the issue would close before
the work was actually finished.

Add `CountOpenSiblingPullRequestsForIssue` and gate the auto-status
transition on it: a merged PR advances its linked issues only when no
sibling PR linked to the same issue is still in flight. Issues stay put
while siblings are open or draft, and the merge that resolves the last
in-flight PR is the one that closes the issue.

Adds an integration test that opens two PRs against the same issue,
merges the first, asserts the issue stays in_progress, then merges the
second and asserts the issue advances to done.

Co-authored-by: multica-agent <github@multica.ai>

* fix(github): re-evaluate auto-close on closed-without-merge events too

GPT-Boy review on #2470: gating only the `state == "merged"` branch left
one ordering hole. PR-A merges first → issue stays in_progress because
PR-B is open; PR-B later closes WITHOUT merging → no event ever re-runs
the auto-close check, so the issue is stuck in_progress.

Generalise the trigger to every terminal PR event (`merged` or `closed`)
and advance the issue only when:
1. the issue is not already terminal (done / cancelled);
2. no sibling PR is still in flight (open / draft);
3. at least one linked PR (current or sibling) actually merged.

Rule (3) preserves "user closed every PR without merging → leave the
issue alone": if no work was delivered, the user decides what to do.
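
The three conditions above can be sketched as a single predicate. This is a minimal illustration, not the handler's code: `shouldAdvanceIssue` and `siblingCounts` are hypothetical names, and the status strings are taken from this message.

```go
package main

import "fmt"

// siblingCounts mirrors the two numbers the message says the sibling
// query returns: PRs still in flight (open / draft) and PRs merged.
type siblingCounts struct {
	InFlight int
	Merged   int
}

// shouldAdvanceIssue (hypothetical name) reports whether a terminal PR
// event (merged or closed) should move the linked issue to done.
func shouldAdvanceIssue(issueStatus string, currentPRMerged bool, siblings siblingCounts) bool {
	// Condition 1: never touch an issue that is already terminal.
	if issueStatus == "done" || issueStatus == "cancelled" {
		return false
	}
	// Condition 2: wait while any sibling PR is still open or draft.
	if siblings.InFlight > 0 {
		return false
	}
	// Condition 3: some linked PR must actually have delivered work.
	return currentPRMerged || siblings.Merged > 0
}

func main() {
	// PR-A merged earlier, PR-B now closes without merging.
	fmt.Println(shouldAdvanceIssue("in_progress", false, siblingCounts{InFlight: 0, Merged: 1}))
	// Every linked PR closed without merging.
	fmt.Println(shouldAdvanceIssue("in_progress", false, siblingCounts{InFlight: 0, Merged: 0}))
}
```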

Replace `CountOpenSiblingPullRequestsForIssue` with
`GetSiblingPullRequestStateCountsForIssue`, which returns both the
in-flight count and the merged count in a single roundtrip.

Adds `TestWebhook_ClosedSiblingAfterMerge` (the regression GPT-Boy
flagged) and `TestWebhook_AllClosedWithoutMerge` (the negative case
guarding rule 3). Refactors the multi-PR webhook helper out of the
existing two-merge test so all three multi-PR scenarios share it.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-12 15:39:55 +08:00
Bohan Jiang
caeb146bac feat(github): GitHub App integration for PR ↔ issue linking (#1817)
* feat(github): GitHub App backend for PR ↔ issue linking

- New tables: github_installation (workspace ↔ App install), github_pull_request (mirrored PR state), issue_pull_request (M:N link).
- Webhook handler verifies HMAC-SHA256, upserts PR rows, parses issue identifiers from PR title/body/branch and auto-links them. Merging a linked PR moves the issue to done.
- Connect/setup endpoints power the zero-config "Connect GitHub" install flow; state token is HMAC-signed so the setup callback can recover the workspace.
- Workspace-scoped admin routes for listing/disconnecting installations, plus a per-issue `pull-requests` list endpoint.
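
For readers unfamiliar with the webhook check, here is a minimal Go sketch of HMAC-SHA256 signature verification. It assumes the header value has the `sha256=<hex digest>` form GitHub sends in `X-Hub-Signature-256`; the function names are hypothetical and the real handler may be structured differently.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// signBody computes the header value for body under secret. It exists
// here only so the example can produce a valid signature to verify.
func signBody(secret, body string) string {
	mac := hmac.New(sha256.New, []byte(secret))
	mac.Write([]byte(body))
	return "sha256=" + hex.EncodeToString(mac.Sum(nil))
}

// verifySignature (hypothetical name) recomputes the HMAC over the raw
// request body and compares it to the header in constant time.
func verifySignature(secret, body, header string) bool {
	const prefix = "sha256="
	if !strings.HasPrefix(header, prefix) {
		return false
	}
	got, err := hex.DecodeString(strings.TrimPrefix(header, prefix))
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, []byte(secret))
	mac.Write([]byte(body))
	return hmac.Equal(got, mac.Sum(nil))
}

func main() {
	body := `{"action":"closed"}`
	header := signBody("webhook-secret", body)
	fmt.Println(verifySignature("webhook-secret", body, header))
	fmt.Println(verifySignature("webhook-secret", body+" ", header))
}
```

The comparison uses `hmac.Equal` rather than `==` so that the check does not leak timing information about the expected digest.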

Co-authored-by: multica-agent <github@multica.ai>

* feat(github): UI for connecting GitHub and viewing linked PRs

- Settings → Integrations: new tab with Connect GitHub / installations list / disconnect, gated on the deployment having the App configured.
- Issue detail sidebar: Pull requests section showing linked PR title, repo, state (open/draft/merged/closed), and author, with deep link to GitHub.
- Real-time refresh: github_installation:* and pull_request:* events invalidate the matching TanStack Query caches.

Co-authored-by: multica-agent <github@multica.ai>

* fix(github): address review — null actor, role gating, configured guard, scoped uninstall broadcast

- listeners: use optionalUUID(e.ActorID) so the system actor on the github-driven issue:updated event no longer panics activity / notification listeners; merged-PR → issue done now produces a status_changed activity and inbox entry.
- IntegrationsTab: gate the admin-only installations query on canManage so members no longer hit /github/installations 403; the configured/not-configured copy is also scoped to admins.
- backend: introduce isGitHubConfigured() requiring both GITHUB_APP_SLUG and GITHUB_WEBHOOK_SECRET, and surface that single flag from list-installations + connect endpoints so the frontend Connect button stays disabled until both are set.
- DeleteGitHubInstallationByInstallationID now RETURNs workspace_id; webhook handler publishes github_installation:deleted scoped to the right workspace so already-open Settings tabs invalidate in real time. ErrNoRows on a re-fired delete short-circuits cleanly.
- tests: focused webhook integration coverage (auto-link + merge → done, cancelled preservation, uninstall returns workspace).

Co-authored-by: multica-agent <github@multica.ai>

* fix(github): i18n the new GitHub UI strings to satisfy lint

CI flagged every literal string in the Integrations tab, the Pull requests
sidebar section, and the per-PR row label. Move them through useT() and
add the matching `integrations.*` block to settings.json (en / zh-Hans)
plus `detail.section_pull_requests` / `detail.pull_request_state_*` /
loading + empty copy under `issues.json`.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-12 13:49:03 +08:00
Naiyuan Qing
86aa5199fc feat(chat): support attachments & images in chat input (#2445)
* docs(plans): chat attachment & image support implementation plan

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* feat(db): add chat_session_id/chat_message_id to attachment

Co-authored-by: multica-agent <github@multica.ai>

* feat(db): sqlc — chat_session_id on CreateAttachment + LinkAttachmentsToChatMessage

Co-authored-by: multica-agent <github@multica.ai>

* feat(file): upload-file accepts chat_session_id form field

Co-authored-by: multica-agent <github@multica.ai>

* feat(chat): SendChatMessage links uploaded attachments to the new message

Co-authored-by: multica-agent <github@multica.ai>

* feat(api): uploadFile accepts chatSessionId; sendChatMessage accepts attachmentIds

Co-authored-by: multica-agent <github@multica.ai>

* feat(core): useFileUpload supports chatSessionId context

Co-authored-by: multica-agent <github@multica.ai>

* feat(chat): support paste/drag/upload attachments in chat input

Co-authored-by: multica-agent <github@multica.ai>

* test(e2e): chat input attachment upload + send round-trip

Co-authored-by: multica-agent <github@multica.ai>

* chore(chat): keep lazy-created session title empty so untitled fallback localizes

Co-authored-by: multica-agent <github@multica.ai>

* fix(chat): address review — dedupe ensureSession + parse upload response

- chat-window: cache in-flight createSession promise in a ref so a file drop
  followed by a quick send no longer spawns two sessions (and orphans the
  attachment on the losing one).
- Attachment type + EMPTY_ATTACHMENT + AttachmentResponseSchema: include the
  new chat_session_id / chat_message_id fields the server now returns.
- uploadFile: route the response through parseWithFallback so a malformed
  body returns EMPTY_ATTACHMENT instead of an undefined-keyed Attachment,
  matching the API boundary rule.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* fix(chat): address PR #2445 review — test ctx, send gating, attachment surface

1. Backend test was 400ing because the handler reads workspace from
   middleware-injected ctx, and `newRequest` only sets the header. Helper
   `withChatTestWorkspaceCtx` mirrors the agent-access-test pattern and
   loads the member row + SetMemberContext before invoking the handler.

2. Attachment metadata now flows end-to-end:
   - new sqlc `ListAttachmentsByChatMessageIDs` (batch lookup, mirrors the
     comment-side query)
   - `chatMessageToResponse` takes `attachments` and `ChatMessageResponse`
     surfaces them — same shape as CommentResponse
   - `ListChatMessages` loads them via a new `groupChatMessageAttachments`
     helper so the chat bubble can render file cards
   - daemon claim path pulls `ListAttachmentsByChatMessage` for the latest
     user message and ships `ChatMessageAttachments` to the daemon
   - `buildChatPrompt` lists id+filename+content_type and instructs the
     agent to `multica attachment download <id>` — fixes the private-CDN
     expiring-URL problem where the markdown URL would have expired by
     the time the agent acts
   - TS `ChatMessage` gains an optional `attachments` field

3. Chat composer now blocks send while uploads are in flight:
   - `pendingUploads` counter increments in handleUpload, SubmitButton
     uses it to disable
   - handleSend also gates on `editorRef.current.hasActiveUploads()` to
     catch the Mod+Enter path that bypasses the button
   - new vitest covers the "drop large file → immediate send" scenario
     where attachment id would otherwise be silently dropped

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* chore: drop implementation plan doc

Process artefact, not something the repo needs to keep.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-12 10:57:54 +08:00
Bohan Jiang
63d215e1c3 feat(runtime): visibility (public/private) gate on CreateAgent / UpdateAgent (#2419)
* feat(runtime): visibility (public/private) gate on CreateAgent / UpdateAgent

Closes the hole where a plain workspace member could pick another member's
runtime in the Create Agent dialog and bind an agent to it — the backend
wasn't checking runtime ownership, so the agent ran on someone else's
hardware / tokens. Reported on GH #1804.

Schema
- Migration 083 adds agent_runtime.visibility ('private' default, 'public')
  with a CHECK constraint. Existing rows default to private — same
  ownership semantics as before, no behavior change for legacy data.

Backend
- canUseRuntimeForAgent predicate: allow when caller is workspace
  owner/admin, the runtime owner, or the runtime is public.
- CreateAgent and UpdateAgent both gate on it: UpdateAgent matters because
  a plain member could otherwise create on their own runtime, then re-bind
  to a private one.
- PATCH /api/runtimes/:id accepts { visibility } — owner/admin only,
  validated against the same private/public allow-list.
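
The predicate described above fits in a few lines. The sketch below is an illustration under assumed types: the `runtime` struct, the role strings, and the function signature are hypothetical, not the actual `canUseRuntimeForAgent` definition.

```go
package main

import "fmt"

// runtime holds only the fields the gate needs in this sketch.
type runtime struct {
	OwnerID    string
	Visibility string // "private" or "public"
}

// canUseRuntime (hypothetical signature) allows binding an agent to a
// runtime when the caller is a workspace owner/admin, owns the
// runtime, or the runtime is public.
func canUseRuntime(callerID, callerRole string, rt runtime) bool {
	if callerRole == "owner" || callerRole == "admin" {
		return true
	}
	if rt.OwnerID == callerID {
		return true
	}
	return rt.Visibility == "public"
}

func main() {
	private := runtime{OwnerID: "alice", Visibility: "private"}
	public := runtime{OwnerID: "alice", Visibility: "public"}
	fmt.Println(canUseRuntime("bob", "member", private)) // plain member, someone else's private runtime
	fmt.Println(canUseRuntime("bob", "member", public))
	fmt.Println(canUseRuntime("bob", "admin", private))
}
```

Because the same check runs on both create and update, a member cannot create an agent on their own runtime and then re-bind it to a private one.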

Frontend
- Create-agent dialog renders other-owned private runtimes disabled with a
  Lock badge + tooltip explaining who to ask.
- Inspector runtime-picker disables the same set so re-binding fails
  the same way at the UI layer.
- Runtime detail diagnostics gains a Visibility editor (owner/admin) or
  read-only chip (everyone else).
- Runtime list shows a private/public chip next to the name.

Tests
- Go: canUseRuntimeForAgent truth table; CreateAgent / UpdateAgent
  end-to-end gate tests (admin / runtime owner / plain member);
  PATCH visibility owner / admin / member / invalid-value coverage.
- Vitest: create-agent dialog disabled state on private/public runtimes,
  default-runtime selection skips locked rows; runtime detail visibility
  editor → mutation, read-only fallback.

Migrating runtimes: existing rows default to private to preserve the
"owner only" status quo. Owners switch to public via the detail page
diagnostics card.

Co-authored-by: multica-agent <github@multica.ai>

* fix(runtime): apply timezone+visibility atomically; don't seed locked template runtime

Two issues surfaced in review of MUL-2062:

1. PATCH /api/runtimes/:id ran the timezone branch first, which:
   - returned early on a tz no-op, silently dropping a concurrent
     `visibility` patch in the same body;
   - committed the timezone mutation (+ usage rollup rebuild) before
     validating visibility, so an invalid visibility left the row
     half-updated.

   Validate every field first, then run the mutations in order. The
   no-op short-circuit now only triggers when nothing else is requested.
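
A minimal sketch of the validate-first, mutate-second ordering, using an in-memory row instead of a database. The types and the `applyRuntimePatch` name are assumptions; the real handler also triggers a usage rollup rebuild, which is omitted here.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

type runtimeRow struct {
	Timezone   string
	Visibility string
}

// runtimePatch uses pointers so an absent field is distinguishable
// from an empty one.
type runtimePatch struct {
	Timezone   *string
	Visibility *string
}

func strPtr(s string) *string { return &s }

// applyRuntimePatch (hypothetical name) validates every requested
// field before touching the row, so an invalid field cannot leave the
// row half-updated, and a timezone no-op cannot drop a visibility
// change sent in the same body.
func applyRuntimePatch(row *runtimeRow, p runtimePatch) error {
	// Phase 1: validation only, no writes.
	if p.Timezone != nil {
		if *p.Timezone == "" {
			return errors.New("timezone must not be empty")
		}
		if _, err := time.LoadLocation(*p.Timezone); err != nil {
			return fmt.Errorf("invalid timezone %q: %w", *p.Timezone, err)
		}
	}
	if p.Visibility != nil && *p.Visibility != "private" && *p.Visibility != "public" {
		return errors.New("visibility must be private or public")
	}
	// Phase 2: mutations, in order.
	if p.Timezone != nil && *p.Timezone != row.Timezone {
		row.Timezone = *p.Timezone
	}
	if p.Visibility != nil {
		row.Visibility = *p.Visibility
	}
	return nil
}

func main() {
	row := runtimeRow{Timezone: "UTC", Visibility: "private"}
	// Timezone is a no-op, visibility still has to be applied.
	err := applyRuntimePatch(&row, runtimePatch{Timezone: strPtr("UTC"), Visibility: strPtr("public")})
	fmt.Println(row.Timezone, row.Visibility, err)
}
```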

2. The Create Agent dialog in duplicate mode unconditionally seeded
   `template.runtime_id` as the selected runtime, even when that runtime
   is now private and owned by someone else — the user saw a selected
   row they couldn't submit (Create → backend 403). Fall back to the
   first usable runtime when the template's runtime is locked, and gate
   the Create button on `selectedRuntimeLocked` as defense in depth.

Tests:
- Go: TestUpdateAgentRuntime_CombinedPatchAppliesBoth (tz no-op +
  visibility flip), TestUpdateAgentRuntime_InvalidVisibilityDoesNotMutateTimezone
  (atomic-fail invariant).
- Vitest: duplicate template pointing at a locked runtime now seeds
  the first usable one; Create button stays disabled when no usable
  alternative exists.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-11 22:53:07 +08:00
Bohan Jiang
f5c2994aed feat(workspace): revoke a member's runtimes when they leave or are removed (#2401)
* feat(workspace): revoke a member's runtimes when they leave or are removed

Previously, leaving or being removed from a workspace only deleted the
member row — every runtime the departed user owned in that workspace
remained in the DB, kept its daemon_token valid, and stayed reachable to
the workspace's other members. The departed user lost access but their
machine kept doing work.

This change converges the runtime state in the same transaction as the
member-row deletion: agents pinned to those runtimes are archived,
in-flight tasks are cancelled (so the daemon's per-task status poller
interrupts the running agent gracefully), the runtimes are forced
offline, and the daemon_token rows are deleted. After commit the
DaemonTokenCache is invalidated and agent:archived / daemon:register
events fire so connected clients reconcile immediately.

Server-side state convergence is the production safety net; the
daemon_token revoke takes effect once the mdt_ flow is live (today most
daemons fall back to PAT/JWT, and the member-row deletion is what stops
those requests via requireWorkspaceMember).

Daemon-side handling (recognising the resulting 401/404 and tearing down
the local pairing for that workspace) lands in a follow-up.

Co-authored-by: multica-agent <github@multica.ai>

* fix(workspace): also cancel tasks for archived agents on member revoke

CancelAgentTasksByRuntime only matched tasks whose runtime_id was in the
revoked set, missing a real path: agent.runtime_id can be reassigned via
UpdateAgent, but agent_task_queue.runtime_id keeps the value from when
the task was queued. So an agent currently bound to the leaving member's
runtime gets archived correctly, but its older tasks still pinned to a
prior runtime stay 'queued' — and ClaimAgentTask does not gate on
agent.archived_at, so those orphaned tasks remain claimable by the
prior runtime.

Replace CancelAgentTasksByRuntime with CancelAgentTasksByRuntimeOrAgent,
which OR-matches runtime_ids and the archived agent IDs in one UPDATE.
Pass the archived agent IDs through from revokeAndRemoveMember.
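
The OR-match can be shown with an in-memory stand-in for the UPDATE. This is only a sketch: the `task` struct and function name are hypothetical, and the set of in-flight statuses (queued, dispatched, running) is an assumption based on statuses mentioned elsewhere in this log.

```go
package main

import "fmt"

type task struct {
	ID        string
	AgentID   string
	RuntimeID string // value captured when the task was queued
	Status    string
}

// isInFlight lists the statuses treated as cancellable in this sketch.
func isInFlight(status string) bool {
	return status == "queued" || status == "dispatched" || status == "running"
}

// cancelByRuntimeOrAgent (hypothetical name) cancels in-flight tasks
// that are pinned to a revoked runtime OR belong to an archived agent,
// and returns how many tasks it cancelled.
func cancelByRuntimeOrAgent(tasks []task, runtimeIDs, agentIDs map[string]bool) int {
	n := 0
	for i := range tasks {
		t := &tasks[i]
		if !isInFlight(t.Status) {
			continue
		}
		if runtimeIDs[t.RuntimeID] || agentIDs[t.AgentID] {
			t.Status = "cancelled"
			n++
		}
	}
	return n
}

func main() {
	tasks := []task{
		// Older task of the archived agent, still pinned to a surviving runtime.
		{ID: "t1", AgentID: "a1", RuntimeID: "rt-surviving", Status: "queued"},
		{ID: "t2", AgentID: "a1", RuntimeID: "rt-leaving", Status: "running"},
		{ID: "t3", AgentID: "a2", RuntimeID: "rt-surviving", Status: "queued"},
	}
	n := cancelByRuntimeOrAgent(tasks,
		map[string]bool{"rt-leaving": true},
		map[string]bool{"a1": true})
	fmt.Println(n, tasks[0].Status, tasks[2].Status)
}
```

Matching on the runtime alone would miss `t1`, which is the case the regression test guards against.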

Adds TestDeleteMember_CancelsTasksFromAgentReassignment as a regression
guard: same agent, two runtimes, the older task on the surviving runtime
must end up cancelled while the surviving runtime stays online.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-11 15:06:50 +08:00
Multica Eve
d6349c16ec feat(runtime): per-runtime timezone for token-usage aggregation (MUL-1950) (#2394)
* feat: per-runtime timezone for token usage aggregation

The runtime token-usage charts (daily and hourly tabs on the
runtime-detail page) bucketed every event by the Postgres session
timezone, which is UTC in production. For an operator in UTC+8 that
meant a Tuesday afternoon's tasks landed in Tuesday early-morning's
bar — the chart was always one off.

Fix: store an IANA timezone on agent_runtime and aggregate under it.

* migrations 081 / 082 add agent_runtime.timezone (TEXT NOT NULL
  DEFAULT 'UTC') and rebuild the rollup pipeline (window function
  and both trigger functions) to compute bucket_date with
  AT TIME ZONE rt.timezone instead of bare DATE().
* No historical backfill — task_usage_daily rows already on disk
  keep their UTC bucket_date; only future writes / re-touches
  recompute under the new tz. (Product call from MUL-1950: 'guarantee
  future correctness'.)
* runtime_usage.sql gains a @tz parameter on ListRuntimeUsage and
  GetRuntimeUsageByHour and threads tz through
  GetRuntimeTaskHourlyActivity. ListRuntimeUsageDaily reads bucket_date
  as-is since the rollup already wrote it in tz.
* parseSinceParamInTZ replaces the raw N×24h cutoff with start-of-
  day-N in the runtime's tz so 'last 7 days' lines up with bucket
  boundaries.
* Daemon registration sends the host's IANA tz (TZ env, then
  time.Local), and UpsertAgentRuntime preserves any user override
  via a CASE-on-existing-value pattern so a daemon reconnect can't
  silently revert the operator's setting.
* New PATCH /api/runtimes/:id endpoint (UpdateAgentRuntime) lets
  the runtime detail page edit the tz; the editor seeds with the
  browser tz on first interaction.
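
To make the cutoff change concrete, here is a hedged Go sketch of a start-of-day cutoff in a runtime's timezone. It is not the `parseSinceParamInTZ` implementation; `sinceCutoff` is a hypothetical name, and the sketch assumes "start-of-day-N" means local midnight N calendar days before today.

```go
package main

import (
	"fmt"
	"time"
	_ "time/tzdata" // embed the IANA database so LoadLocation works without system zoneinfo
)

// sinceCutoff (hypothetical name) returns local midnight `days`
// calendar days before `now`, evaluated in the named IANA timezone.
// A raw now-minus-N*24h cutoff would instead land mid-day and cut the
// oldest bucket in half.
func sinceCutoff(nowRFC3339 string, days int, tzName string) (string, error) {
	now, err := time.Parse(time.RFC3339, nowRFC3339)
	if err != nil {
		return "", err
	}
	loc, err := time.LoadLocation(tzName)
	if err != nil {
		return "", err
	}
	y, m, d := now.In(loc).Date()
	// time.Date normalises out-of-range days, so d-days may go below 1.
	cutoff := time.Date(y, m, d-days, 0, 0, 0, 0, loc)
	return cutoff.Format(time.RFC3339), nil
}

func main() {
	// 17:00 UTC on May 10 is already May 11 in Asia/Shanghai.
	c, err := sinceCutoff("2026-05-10T17:00:00Z", 7, "Asia/Shanghai")
	fmt.Println(c, err)
}
```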

Refs: MUL-1950

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: multica-agent <github@multica.ai>

* fix: harden runtime timezone rollups

Co-authored-by: multica-agent <github@multica.ai>

* fix: address runtime timezone review nits

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica.ai>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: multica-agent <github@multica.ai>
Co-authored-by: Eve <eve@multica-ai.local>
2026-05-11 14:39:35 +08:00
Bohan Jiang
3f20999597 refactor(timeline): drop server-side comment + timeline pagination (#2322)
* refactor(timeline): drop server-side comment + timeline pagination (MUL-1929)

The cursor-paginated /timeline and /comments endpoints were sized for a
problem the data shape doesn't have: prod p99 is ~30 comments per issue
and the all-time max is ~1.1k. Time-based pagination also splits reply
threads across page boundaries (orphan replies), which the frontend was
papering over with an "orphan rescue" that promoted disconnected replies
to top-level — confusing UX with no real benefit.

Replace both endpoints with a single full-issue fetch, capped server-side
at 2000 rows as a defensive safety net (never hit in practice).

Server
- /api/issues/:id/timeline now returns a flat ASC TimelineEntry[]
  (matches the legacy desktop contract — older Multica.app builds keep
  working because the wrapped TimelineResponse + cursors are gone, and
  the raw array shape was always what they consumed).
- /api/issues/:id/comments drops limit/offset; only ?since is honoured
  for the CLI agent-polling flow.
- Drop ListCommentsBefore/After/Latest, ListActivitiesBefore/After/Latest
  and the timelineCursor encoding.
- Replace with ListCommentsForIssue / ListCommentsSinceForIssue /
  ListActivitiesForIssue (capped by argument).

CLI
- multica issue comment list drops --limit / --offset and the X-Total-Count
  reporting; --since is preserved for incremental polling.

Frontend
- Replace useInfiniteQuery with useQuery in useIssueTimeline; drop
  fetchOlder/Newer, jumpToLatest, isAtLatest, newEntriesBelowCount.
- Remove timeline-cache helpers (mapAllEntries / filterAllEntries /
  prependToLatestPage) and the TimelinePage / TimelinePageParam types.
- WS event handlers update the single flat-array cache directly.
- Drop the orphan-reply rescue in issue-detail — every reply's parent
  is now guaranteed to be in the same array.
- Strip the "show older / show newer / jump to latest" buttons and their
  i18n strings.

Co-authored-by: multica-agent <github@multica.ai>

* fix(timeline): address review feedback on pagination removal

Three issues caught in PR #2322 review:

1. /timeline broke for stale clients between #2128 and this PR. They send
   ?limit/?before/?after/?around and parse with the wrapped TimelinePageSchema;
   the new flat-array response was failing schema validation and falling back
   to an empty timeline. Restore the wrapped shape on those query params
   (DESC entries, null cursors, has_more_*=false), keeping the flat ASC array
   for bare requests. Around-mode now also fills target_index from the merged
   slice so legacy clients can still scroll-to-anchor without a follow-up.

2. The agent prompts in runtime_config.go and prompt.go still told agents
   that `multica issue comment list` accepts --limit/--offset and to use
   `--limit 30` on truncated output. With those flags removed in this PR,
   new agent runs would hit "unknown flag" or skip context. Update the
   prompt copy to "returns all comments, capped at 2000; --since for
   incremental polling".

3. useCreateComment's onSuccess was a bare append to the timeline cache
   with no id-dedupe, so a fast comment:created WS event firing before
   onSuccess produced a transient duplicate. Restore the id guard the old
   prependToLatestPage helper used to provide.
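
For item 1, the shape decision can be sketched as a check on the query string. This is an assumption-laden illustration: `wantsLegacyShape` is a hypothetical name, and the parameter list is taken from the description above rather than from the handler source.

```go
package main

import (
	"fmt"
	"net/url"
)

// wantsLegacyShape (hypothetical name) reports whether a request
// carries any of the old pagination parameters, in which case the
// server would answer with the wrapped page shape instead of the flat
// ascending array.
func wantsLegacyShape(rawQuery string) bool {
	q, err := url.ParseQuery(rawQuery)
	if err != nil {
		return false
	}
	for _, key := range []string{"limit", "before", "after", "around"} {
		if q.Has(key) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(wantsLegacyShape(""))         // bare request
	fmt.Println(wantsLegacyShape("limit=30")) // stale client
}
```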

Adds two new boundary tests:
- TestListTimeline_LegacyWrappedShape_OnPaginationParams
- TestListTimeline_LegacyWrappedShape_AroundFillsTargetIndex

Co-authored-by: multica-agent <github@multica.ai>

* test(handler): fix timeline test assertions for handler-package isolation

The TestListTimeline_* assertions assumed CreateIssue would seed an
"issue_created" activity_log row, but the activity listener that publishes
those rows is registered in cmd/server/main.go — handler-package tests
don't wire it up. CI saw 5 entries (3 comments + 2 activities) where the
test expected ≥6.

Drop the auto-activity assumption: assert exactly 5 entries in
TestListTimeline_MergesCommentsAndActivities, and tighten
TestListTimeline_EmptyIssue to assert a fully-empty timeline.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-09 16:11:58 +08:00
Bohan Jiang
9ded462ecc feat(inbox): auto-archive stale task_failed rows on terminal status (#2319)
When an issue progresses to in_review / done / cancelled, archive any
pre-existing task_failed inbox rows for that issue across all member
recipients and emit inbox:batch-archived per recipient so connected
clients self-heal. Reuses the existing archived column rather than
introducing a parallel dismissed flag; the activity log preserves the
full failure history for audit independently of the inbox surface.

Closes #2291.

Co-authored-by: multica-agent <github@multica.ai>
2026-05-09 15:53:25 +08:00
Multica Eve
a2dd80d4f6 feat(autopilot): skip dispatch when assignee runtime is offline (MUL-1899) (#2311)
* feat(autopilot): skip dispatch when assignee runtime is offline (MUL-1899)

Prevents scheduled autopilots from accumulating doomed tasks against
offline / archived / unbound agents. Before this change, a paused laptop
or crashed daemon would let a 5-minute-cron autopilot pile up thousands
of queued agent_task_queue rows that no runtime would ever drain — this
is the dominant source of the 89k stuck-task backlog flagged in MUL-1899.

DispatchAutopilot now performs a pre-flight admission check on the
assignee agent's runtime status. If the runtime is not 'online' (or the
agent is archived / has no runtime bound / has no assignee), the run is
recorded as 'skipped' with a failure_reason and no task is enqueued.
Skipped runs still emit autopilot:run.done so the UI / activity feed
reflect that the trigger fired and was evaluated.
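
The admission check above amounts to a short decision function. The sketch below uses hypothetical names (`assignee`, `admit`) and hypothetical reason strings; only the 'online' status value and the four skip conditions come from this message.

```go
package main

import "fmt"

// assignee captures the facts the pre-flight check needs.
type assignee struct {
	Present       bool   // the autopilot has an assignee at all
	Archived      bool   // the agent is archived
	RuntimeBound  bool   // the agent has a runtime bound
	RuntimeStatus string // e.g. "online" or "offline"
}

// admit (hypothetical name) returns whether a task may be enqueued
// and, if not, a reason to record on the skipped run. The reason
// strings are illustrative placeholders.
func admit(a assignee) (ok bool, skipReason string) {
	switch {
	case !a.Present:
		return false, "no_assignee"
	case a.Archived:
		return false, "agent_archived"
	case !a.RuntimeBound:
		return false, "no_runtime"
	case a.RuntimeStatus != "online":
		return false, "runtime_offline"
	}
	return true, ""
}

func main() {
	ok, reason := admit(assignee{Present: true, RuntimeBound: true, RuntimeStatus: "offline"})
	fmt.Println(ok, reason)
}
```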

Skipped runs are deliberately NOT counted toward the failure-ratio
auto-pause: a user who closes their laptop overnight should not have
their autopilot paused. Sustained server-side failures keep their
existing pause path via the failure monitor.

Tests: added an integration test that creates an offline runtime and
asserts DispatchAutopilot records a skipped run with no task enqueued.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: multica-agent <github@multica.ai>

* feat(scheduler): expire stale queued tasks via TTL sweeper (MUL-1899)

Companion to the dispatch-time admission gate added in this PR. The
admission gate prevents *new* tasks from being enqueued against an
offline runtime, but it does not drain the historical backlog
(~89k stuck queued rows observed at MUL-1899 baseline) and does not
help when a runtime goes offline *after* a task has already been
queued. This adds a passive TTL sweeper:

- New SQL query `ExpireStaleQueuedTasks` transitions queued tasks
  older than the TTL to status='failed' with
  failure_reason='queued_expired' and a clear error message.
- Sweep is capped per tick (`queuedExpireBatchSize`, default 500) via
  a CTE+LIMIT so that draining a large backlog cannot monopolise the
  DB on a single tick. At 30s ticks the worst case is 60k rows/hour.
- Wired into the existing 30s `runRuntimeSweeper` loop alongside
  `sweepStaleTasks` and reuses `taskSvc.HandleFailedTasks` so the
  expired tasks broadcast `task:failed` events, reconcile agent
  status, and roll back any in-progress issues — same lifecycle as
  any other failed task.
- Default TTL = 2h. Conservatively above any reasonable
  "queued behind a long-running task" window (default agent timeout
  is 2h, sweeper runs every 30s) so legitimate work isn't expired.
- Integration tests cover the happy path (stale → expired, fresh →
  left alone, correct status/reason/error) and the per-tick batch cap.
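
The capped sweep and the worst-case throughput figure can be checked with a small in-memory sketch. The names below are hypothetical and the real sweep is a SQL query; ages are plain seconds here to keep the example deterministic.

```go
package main

import (
	"fmt"
	"sort"
)

type queuedTask struct {
	ID            string
	Status        string
	AgeSeconds    int
	FailureReason string
}

// expireStale (hypothetical name) fails at most batchSize queued tasks
// older than ttlSeconds, oldest first, and returns how many it expired.
func expireStale(tasks []queuedTask, ttlSeconds, batchSize int) int {
	var victims []int
	for i, t := range tasks {
		if t.Status == "queued" && t.AgeSeconds > ttlSeconds {
			victims = append(victims, i)
		}
	}
	// Oldest first, so a backlog drains in creation order.
	sort.SliceStable(victims, func(a, b int) bool {
		return tasks[victims[a]].AgeSeconds > tasks[victims[b]].AgeSeconds
	})
	if len(victims) > batchSize {
		victims = victims[:batchSize]
	}
	for _, i := range victims {
		tasks[i].Status = "failed"
		tasks[i].FailureReason = "queued_expired"
	}
	return len(victims)
}

// rowsPerHour is the worst-case drain rate for a given batch cap and
// tick interval.
func rowsPerHour(batchSize, tickSeconds int) int {
	return batchSize * (3600 / tickSeconds)
}

func main() {
	tasks := []queuedTask{
		{ID: "old", Status: "queued", AgeSeconds: 9000},
		{ID: "fresh", Status: "queued", AgeSeconds: 100},
	}
	fmt.Println(expireStale(tasks, 7200, 500), tasks[0].Status, tasks[1].Status)
	fmt.Println(rowsPerHour(500, 30)) // 500 rows x 120 ticks = 60000
}
```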

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: multica-agent <github@multica.ai>

* fix(autopilot): address review blockers from PR #2311 (MUL-1899)

GPT-Boy review of the offline-runtime + queued-TTL PR flagged four
blockers; this commit addresses them all.

1. Restore the 'skipped' autopilot_run status in the DB constraint.
   Migration 043 had removed 'skipped' along with the now-defunct
   concurrency_policy feature, so the new admission gate's INSERT of
   status='skipped' violated `autopilot_run_status_check` and broke
   `TestAutopilotDispatchSkipsWhenRuntimeOffline` in CI. New
   migration 079 re-adds 'skipped' to the CHECK list. The down
   migration migrates skipped → failed before re-tightening,
   mirroring what 043 did for the original removal.

2. Make `ExpireStaleQueuedTasks` race-safe.
   The CTE-then-UPDATE pattern could clobber a task that the daemon
   claimed between victim selection and the outer update. Two
   guards added:
     - `FOR UPDATE SKIP LOCKED` in the CTE so we never wait on a
       row that's currently being claimed (and never block the
       claim path either).
     - The outer UPDATE now re-checks `t.status = 'queued'` AND the
       TTL predicate so even if a row's lock is released after a
       successful claim, we cannot transition a now-dispatched/
       running task to 'failed'.

3. Add a partial index for the queued-TTL sweeper.
   `idx_agent_task_queue_queued_created_at` on `created_at WHERE
   status = 'queued'` — keeps the 30s sweep query (status=queued
   AND created_at < ... ORDER BY created_at LIMIT 500) cheap even
   when historical terminal rows accumulate (~89k+ at MUL-1899
   baseline). The partial predicate keeps the index tiny because
   only in-flight rows live in 'queued'.

4. Fix the failure-monitor denominator.
   `SelectAutopilotsExceedingFailureThreshold` had been counting
   'skipped' toward total runs, which would have diluted the failure
   ratio: a 100%-failing autopilot could mask itself behind a wall
   of admission skips. With 'skipped' restored as a real status,
   the auto-pause monitor must explicitly exclude it from BOTH
   numerator and denominator — admission skips are neither a
   success nor a failure.
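
The dilution problem in item 4 is easiest to see with numbers. The sketch below is illustrative only and does not reproduce `SelectAutopilotsExceedingFailureThreshold`; `RunCounts`, `failureRatio`, and `dilutedRatio` are hypothetical names, and "completed" is assumed to be the success status.

```go
package main

import "fmt"

// RunCounts is a hypothetical tally of autopilot runs in the lookback window.
type RunCounts struct {
	Completed int
	Failed    int
	Skipped   int
}

// failureRatio excludes skipped runs from both numerator and denominator.
// ok is false when there are no terminal runs to judge.
func failureRatio(c RunCounts) (ratio float64, ok bool) {
	terminal := c.Completed + c.Failed
	if terminal == 0 {
		return 0, false
	}
	return float64(c.Failed) / float64(terminal), true
}

// dilutedRatio models the earlier behaviour, where skips inflate the denominator.
func dilutedRatio(c RunCounts) float64 {
	total := c.Completed + c.Failed + c.Skipped
	if total == 0 {
		return 0
	}
	return float64(c.Failed) / float64(total)
}

func main() {
	// Every run that was actually admitted failed; 90 more were skipped at admission.
	c := RunCounts{Completed: 0, Failed: 10, Skipped: 90}
	r, _ := failureRatio(c)
	fmt.Printf("excluding skips: %.2f, counting skips: %.2f\n", r, dilutedRatio(c))
	// prints "excluding skips: 1.00, counting skips: 0.10"
}
```

In this example the autopilot fails every admitted run, yet counting skips would report a 10% failure ratio and keep it under any high threshold.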

Verified: `go test ./cmd/server/... ./internal/service/...` passes
(including TestAutopilotDispatchSkipsWhenRuntimeOffline,
TestExpireStaleQueuedTasks,
TestExpireStaleQueuedTasksRespectsBatchLimit). `go build ./... && go vet ./...` clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: multica-agent <github@multica.ai>

* fix(migrations): split queued-task TTL index into concurrent migration

Per PR #2311 review: agent_task_queue is a hot table, so building the
new partial index with plain CREATE INDEX inside migration 079 would
hold ACCESS EXCLUSIVE on the queue and block dispatch during deploy.

The migration runner does not allow CONCURRENTLY to share a file with
other statements (documented in 068), so split the index into its own
single-statement file 080 — matching the existing pattern in 035 /
067 / 074 / 075 / 078. Migration 079 keeps the autopilot_run
constraint change.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-09 15:07:57 +08:00
Bohan Jiang
6d9ebb0fdd fix(daemon): unblock issues stuck on a poisoned-image agent session (#2314)
* fix(daemon): treat upstream API 400 invalid_request_error as poisoned session

A markdown-linked image in an issue description that the agent downloads as
a tiny CDN auth-error file and Reads as a PNG poisons the conversation:
the LLM API rejects the bad image with 400 invalid_request_error, the
session_id is pinned mid-flight, and every follow-up task on the issue
(comment-trigger, auto-retry) resumes the same poisoned conversation and
hits the same 400 — the issue can no longer be executed even after the
description is cleaned up.

Mirror the existing fallback-output classifier on the error side: detect
"API Error: ... 400 ... invalid_request_error" in the agent error string,
persist failure_reason='api_invalid_request', and add it to the
GetLastTaskSession exclusion list so the next task starts a fresh
session that re-reads the (now-clean) description.
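
A classifier of this shape could look roughly like the following. It is a sketch under assumptions, not the daemon's actual code: `classifyAgentError` and `containsInOrder` are hypothetical helpers, and the sample error string in `main` is invented to match the "API Error: ... 400 ... invalid_request_error" pattern quoted above.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	reasonAgentError        = "agent_error"         // pre-existing default
	reasonAPIInvalidRequest = "api_invalid_request" // poisoned-session marker
)

// containsInOrder reports whether every part occurs in s, in the given order.
func containsInOrder(s string, parts ...string) bool {
	for _, p := range parts {
		i := strings.Index(s, p)
		if i < 0 {
			return false
		}
		s = s[i+len(p):]
	}
	return true
}

// classifyAgentError maps an agent error string to a failure_reason value.
func classifyAgentError(msg string) string {
	if containsInOrder(msg, "API Error", "400", "invalid_request_error") {
		return reasonAPIInvalidRequest
	}
	return reasonAgentError
}

func main() {
	// Invented sample; the real upstream message format may differ.
	poisoned := `API Error: 400 {"type":"error","error":{"type":"invalid_request_error"}}`
	fmt.Println(classifyAgentError(poisoned))                  // api_invalid_request
	fmt.Println(classifyAgentError("tool call timed out"))     // agent_error
	fmt.Println(classifyAgentError("API Error: 529 overloaded")) // agent_error
}
```

Keeping the match narrow matters here: only the tagged reason is excluded from session resume, so a benign failure that is misclassified would lose its conversation history unnecessarily.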

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): unblock issues already poisoned by API 400 invalid_request_error

The forward-only classifier from the previous commit only tags new failures.
Issues like MUL-1918 already have multiple failed-task rows whose
failure_reason is the pre-fix default 'agent_error', and GetLastTaskSession
falls back to those legacy rows on the next claim — so deploying the
classifier alone leaves existing poisoned issues stuck (GPT-Boy review
on PR #2314).

Two complementary changes:

- Migration 079 backfills failure_reason='api_invalid_request' on every
  pre-existing 'agent_error' row whose error text matches the canonical
  Anthropic 400 invalid_request_error shape. Keeps observability
  consistent (multica issue runs / UI now report the right reason).

- GetLastTaskSession adds a defensive ILIKE clause on error text. Closes
  the deploy-window gap where the old binary could write a new
  'agent_error' row between the migration running and the new code
  taking over, and protects against future error-format variants the
  daemon classifier might miss.

Plus regression tests covering the legacy + new coexistence case GPT-Boy
flagged, and a guard rail asserting benign 'agent_error' failures
(timeouts, tool errors) still resume their session.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-09 14:39:10 +08:00
Jiayuan Zhang
0cd50e14eb feat(agent-live-card): show queued tasks in issue live banner (MUL-1897) (#2307)
The issue-detail "agent live" banner only showed dispatched/running tasks.
A task that was queued — runtime offline, busy on a prior task, or held
behind a coalesced sibling — left the issue silent until claim, which
reads as "the trigger never landed".

Include 'queued' in `ListActiveTasksByIssue`, then branch the renderer:
queued banners use a non-spinning Clock, "{name} 排队中 / is queued"
copy, "queued for Ns" elapsed anchored on `created_at`, and hide the
transcript button (no execution log yet). Cancel still works because
`CancelAgentTask` already accepts queued.

Client-side re-sort by lifecycle (running → dispatched → queued) so the
sticky slot stays on the most-active task even when a queued sibling
was created more recently.
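
The lifecycle ordering can be expressed as a stable sort on a rank table. The real re-sort runs client-side; the Go sketch below only illustrates the ordering rule, and `ActiveTask`, `lifecycleRank`, and `sortByLifecycle` are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
)

// ActiveTask is a hypothetical, reduced view of a task shown in the banner.
type ActiveTask struct {
	ID     string
	Status string
}

// lifecycleRank orders statuses from most to least active.
var lifecycleRank = map[string]int{"running": 0, "dispatched": 1, "queued": 2}

// rank returns the sort position of a status; unknown statuses sort last.
func rank(status string) int {
	if r, ok := lifecycleRank[status]; ok {
		return r
	}
	return len(lifecycleRank)
}

// sortByLifecycle reorders tasks running → dispatched → queued while
// preserving the incoming order within each status.
func sortByLifecycle(tasks []ActiveTask) {
	sort.SliceStable(tasks, func(i, j int) bool {
		return rank(tasks[i].Status) < rank(tasks[j].Status)
	})
}

func main() {
	// Input is newest-first, so the queued sibling arrives on top.
	tasks := []ActiveTask{
		{ID: "t3", Status: "queued"},
		{ID: "t2", Status: "running"},
		{ID: "t1", Status: "dispatched"},
	}
	sortByLifecycle(tasks)
	fmt.Println(tasks) // [{t2 running} {t1 dispatched} {t3 queued}]
}
```

A stable sort keeps the server's recency order as the tiebreaker inside each status group.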

Co-authored-by: multica-agent <github@multica.ai>
2026-05-09 07:33:12 +02:00
Multica Eve
ce00e05169 Add canonical PostHog core metrics events (#2302)
* Add canonical PostHog core metrics events

Co-authored-by: multica-agent <github@multica.ai>

* Address analytics review feedback

Co-authored-by: multica-agent <github@multica.ai>

* Tighten analytics review follow-ups

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Devv <devv@Devvs-Mac-mini.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-09 13:12:00 +08:00
Bohan Jiang
b17f975a17 docs(cli): clarify issue rerun semantics (current assignee, fresh session) (#2304)
* docs(cli): clarify `issue rerun` semantics

The CLI table described `multica issue rerun <id>` as "Rerun the most
recent agent task", which led users to expect it would re-run whichever
agent ran last. The actual behavior is to enqueue a fresh task for the
issue's **current** agent assignee, regardless of who ran most
recently — see `TaskService.RerunIssue` in
`server/internal/service/task.go`.

Also fix a stale claim in `tasks.mdx`: the "Manual rerun" section
described session inheritance as "Yes", but commit b1345685 made manual
rerun pass `force_fresh_session=true` precisely to avoid replaying a
poisoned session. Only **automatic retry** still inherits the session.

Updates EN + ZH mirrors of `cli.mdx` and `tasks.mdx`.

Co-authored-by: multica-agent <github@multica.ai>

* docs(tasks): tighten rerun trigger surface; clean stale Go comments

Apply review feedback on PR #2304:

- `tasks.mdx` / `tasks.zh.mdx`: rerun is triggered via CLI or the
  `/api/issues/{id}/rerun` endpoint, not "UI or CLI" — there's no rerun
  affordance in web/desktop today.
- `tasks.mdx` / `tasks.zh.mdx`: comparison table — manual rerun applies
  to "Issues with an agent assignee", not "All sources". The handler
  rejects with `issue is not assigned to an agent` for anything else,
  and there's no rerun path for chat or autopilot tasks.
- `task_lifecycle.go`: `RerunIssue` doc comment claimed the new task
  "carries the most recent session_id/work_dir so the agent can resume".
  That has been false since b1345685 — rewrite to reflect the actual
  `force_fresh_session=true` contract.
- `agent.sql` (regenerated `agent.sql.go`): `GetLastTaskSession` doc
  said it serves "auto-retry / manual rerun"; manual rerun is now
  routed around it via `force_fresh_session=true`. Note both the
  auto-retry path it does serve and the rerun escape hatch.

No logic change.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-09 12:46:37 +08:00
Jiayuan Zhang
3b3be9d7bd feat(comments): resolve threads with collapsible bar (MUL-1895) (#2300)
* feat(comments): resolve threads with collapsible bar (MUL-1895)

Adds a Linear-style resolve action on comment thread roots. Resolved
threads collapse to a single "N resolved comments from X" bar in the
activity feed; clicking expands the thread inline (per-session, not
persisted). Replying inside a resolved thread auto-unresolves it.

Backend
- migration 069: resolved_at, resolved_by_type, resolved_by_id on comment
- sqlc ResolveComment / UnresolveComment queries (idempotent via COALESCE)
- POST/DELETE /api/comments/{id}/resolve handlers, root-only validation
- CreateComment auto-clears resolved_at when a reply lands in a resolved
  thread, publishing comment:unresolved
- comment:resolved / comment:unresolved events; CommentResponse and
  TimelineEntry both surface the new fields

Frontend
- Comment + TimelineEntry types extended; payloads typed; WS sync wired
- useResolveComment optimistic mutation with rollback
- ResolvedThreadBar component for the collapsed view
- Resolve / Unresolve menu items on root comments; Collapse strip on the
  expanded resolved card
- en + zh-Hans locale strings

Co-authored-by: multica-agent <github@multica.ai>

* fix(comments): cover agent reply path, expand-state hygiene, nested counts (MUL-1895)

Addresses three review issues from Emacs on PR #2300:

1. TaskService.createAgentComment bypasses Handler.CreateComment, so the
   auto-unresolve wired into the handler did not fire when an agent replied
   in a resolved thread (task / mention / on_comment paths). Extracted the
   logic to TaskService.AutoUnresolveThreadOnReply so both reply paths share
   it; rewired Handler.CreateComment to call the new method.

2. Resolving an already-expanded thread did not collapse it back to the
   bar because expandedResolved still contained the id. Added
   clearResolvedExpand + handleResolveToggle wrapper so resolve / unresolve
   always wipe the session expand entry.

3. ResolvedThreadBar received only direct children, while CommentCard's
   expanded view recurses through descendants. Extracted the recursive
   walk into thread-utils.collectThreadReplies and called from both —
   counts and author lists now match.

Co-authored-by: multica-agent <github@multica.ai>

* test(comments): mock useResolveComment + add zh-Hans plural key

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-09 05:49:33 +02:00
Multica Eve
eb067ff077 fix(server): aggregate task_usage into daily rollup table to cut DB load (#2256)
* fix(server): aggregate task_usage into daily rollup table to cut DB load

ListRuntimeUsage previously did a SUM(...) GROUP BY DATE(created_at), provider,
model over the raw task_usage stream once per runtime row on the runtimes
list and once per detail page load, scaling O(events) per call. This is the
hot read path responsible for sustained load on Postgres.

Switch the read path to a materialized daily rollup table maintained by a
pg_cron job:

- 072_task_usage_daily_rollup: schema for task_usage_daily +
  task_usage_rollup_state, plus rollup_task_usage_daily_window(p_from, p_to)
  (window primitive used by both cron and offline backfill, idempotent via
  ON CONFLICT DO UPDATE adding deltas) and rollup_task_usage_daily() (cron
  entry point — pg_try_advisory_lock(4242) for serialization, watermark
  advancement, 5-minute safety lag for late-visible inserts). Also adds
  idx_task_usage_created_at to help the two lazy endpoints
  (ListRuntimeUsageByAgent / GetRuntimeUsageByHour) that still hit the
  raw table.

- 073_task_usage_daily_pgcron: CREATE EXTENSION IF NOT EXISTS pg_cron in a
  DO/EXCEPTION block (mirrors the migration 032 pg_bigm pattern so envs
  without shared_preload_libraries=pg_cron skip gracefully) and schedules
  rollup_task_usage_daily() every 5 minutes when the extension is present.

- queries/runtime_usage.sql ListRuntimeUsage rewritten to read from
  task_usage_daily; sqlc regenerated. Other usage queries unchanged.

- cmd/backfill_task_usage_daily: one-shot Go command that walks
  task_usage in monthly slices through rollup_task_usage_daily_window,
  then stamps the watermark to now()-5m so the cron resumes cleanly.
  Run once after migrations have applied, before relying on the rollup.

- runtime_test.go: TestGetRuntimeUsage_BucketsByUsageTime now invokes
  rollup_task_usage_daily_window after fixture inserts so the handler
  sees the rolled-up rows. Synthetic daily rows cleaned up after each
  test.

- runtime_rollup_test.go: new tests covering aggregation correctness,
  idempotency contract of ON CONFLICT DO UPDATE, and the watermark
  advancing exactly to now()-5m via the cron entry point.

Deployment order: apply migrations → run backfill_task_usage_daily once
→ pg_cron picks up subsequent windows automatically. Today's bucket may be
up to ~10 minutes stale (5 min cron + 5 min lag) by design.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: multica-agent <github@multica.ai>

* fix(server): make task_usage_daily rollup safe to overlap, replay, and correct

Addresses 4 review blockers on the original PR:

1. Cron/backfill double-count race: the rollup function is now idempotent.
   Window calls find DIRTY KEYS via task_usage.updated_at, then RECOMPUTE
   each bucket from ground truth and REPLACE the daily row (no more
   additive ON CONFLICT). Cron and backfill can now overlap safely.

2. Silent pg_cron absence: the read path is gated behind a new
   USAGE_DAILY_ROLLUP_ENABLED feature flag (default off). The raw
   task_usage scan is preserved as the fallback. Operators flip the
   flag per-environment after backfill + cron are confirmed healthy
   (task_usage_rollup_lag_seconds() helper added for monitoring).

3. UpsertTaskUsage corrections invisible to rollup: added
   task_usage.updated_at column (default now(), backfilled from
   created_at), and bumped it on conflict. Corrections now mark the
   bucket dirty and the next window call recomputes it correctly.

4. CREATE INDEX blocking writes on hot table: split into separate
   single-statement migrations using CREATE INDEX CONCURRENTLY
   (074, 075), matching the 035/067 pattern.
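
The difference between the additive upsert and the recompute-and-replace approach in item 1 can be shown with a small in-memory model. This is only a sketch of the idea, not the SQL in the migration; `usageEvent`, `additiveRollup`, and `recomputeRollup` are hypothetical names, and a map stands in for the daily table.

```go
package main

import "fmt"

// usageEvent is a hypothetical stand-in for one raw task_usage row.
type usageEvent struct {
	Day    string
	Tokens int
}

// additiveRollup models the original ON CONFLICT DO UPDATE that added deltas:
// replaying the same window counts every event again.
func additiveRollup(daily map[string]int, events []usageEvent) {
	for _, e := range events {
		daily[e.Day] += e.Tokens
	}
}

// recomputeRollup recomputes each dirty day from the raw events and replaces
// the daily value, so running it twice yields the same result.
func recomputeRollup(daily map[string]int, events []usageEvent, dirty []string) {
	for _, day := range dirty {
		sum, found := 0, false
		for _, e := range events {
			if e.Day == day {
				sum += e.Tokens
				found = true
			}
		}
		if found {
			daily[day] = sum
		} else {
			delete(daily, day) // no raw rows left: drop the bucket
		}
	}
}

func main() {
	events := []usageEvent{{"2026-05-08", 100}, {"2026-05-08", 50}}

	additive := map[string]int{}
	additiveRollup(additive, events) // cron
	additiveRollup(additive, events) // overlapping backfill
	fmt.Println(additive["2026-05-08"]) // 300 (double-counted)

	replaced := map[string]int{}
	recomputeRollup(replaced, events, []string{"2026-05-08"})
	recomputeRollup(replaced, events, []string{"2026-05-08"})
	fmt.Println(replaced["2026-05-08"]) // 150
}
```

Because the replace variant derives each bucket from ground truth, overlap between cron and backfill costs extra work but cannot change the stored totals.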

Also: cron.schedule() removed from migrations entirely. Migration 076
only enables the extension (gracefully on unsupported envs); the actual
schedule is a documented operator runbook step that runs AFTER backfill.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: multica-agent <github@multica.ai>

* fix(server): trigger-driven invalidation + online-safe migration for task_usage_daily

Round-2 review feedback on PR #2256:

1. Add explicit dirty-bucket queue (task_usage_daily_dirty) populated by
   triggers on agent_task_queue (UPDATE OF runtime_id, DELETE) and
   task_usage (DELETE). The rollup window function drains both this queue
   and the updated_at-based discovery, so runtime reassignment and
   issue-cascade deletes no longer leave the rollup divergent from the
   raw query.

   Triggers join via agent (not issue) to look up workspace_id, because
   when the cascade comes from issue, the issue row is already gone by
   the time atq's BEFORE DELETE fires; agent stays alive.

2. Make migration 072 online-safe: only ADD COLUMN updated_at TIMESTAMPTZ
   (nullable, no default → metadata-only ALTER, no row rewrite) and a
   separate ALTER for SET DEFAULT now() (also metadata-only). No bulk
   UPDATE on the hot task_usage table. The rollup window function's
   dirty_keys CTE handles legacy NULL rows via an OR branch, supported
   by partial index idx_task_usage_created_at_legacy.

3. Refresh stale documentation in cmd/backfill_task_usage_daily/main.go
   header to describe the current recompute/replace semantics, idempotent
   re-runnability, and the actual migration numbering (072..077).

Tests:
- TestRollupTaskUsageDaily_InvalidationOnReassign: verifies usage moves
  between runtime buckets after ReassignTasksToRuntime-style update.
- TestRollupTaskUsageDaily_InvalidationOnIssueDelete: verifies daily
  bucket is cleared after issue delete cascades through atq → task_usage.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: multica-agent <github@multica.ai>

* fix(server): close dirty-queue race + move legacy partial index to its own concurrent migration

Round-3 review feedback on PR #2256:

1. Blocker: dirty-queue invalidations could be silently lost under
   concurrency. ON CONFLICT DO NOTHING let a late trigger see the row
   already enqueued, no-op, and then the rollup drain (WHERE
   enqueued_at < p_to) would delete the original row — losing the
   late invalidation. Switched all three trigger enqueue paths to
   ON CONFLICT DO UPDATE SET enqueued_at = GREATEST(existing,
   EXCLUDED.enqueued_at), so any invalidation arriving during a
   rollup tick keeps enqueued_at > p_to (p_to = now() - 5min) and
   survives the post-tick drain.

2. High: idx_task_usage_created_at_legacy (partial index on hot
   task_usage table) was being created in the regular 077 migration
   without CONCURRENTLY. Moved to new migration 078 with
   CREATE INDEX CONCURRENTLY, matching the pattern of 074/075.
   077's down migration leaves the index alone (it is owned by 078).

3. Minor: gofmt -w on runtime_rollup_test.go and
   backfill_task_usage_daily/main.go (tabs were lost in the original
   heredoc append). PR description rewritten to describe the current
   recompute/replace + dirty queue + feature flag design and the
   072..078 migration ordering.

Tests still green: TestRollupTaskUsageDaily_* (including both new
invalidation regressions), TestGetRuntimeUsage_*, TestWorkspaceUsage_*.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: multica-agent <github@multica.ai>

* fix(server): unify workspace_id source via agent in rollup window function

Round-4 review feedback (J) on PR #2256:

M1 (must-fix): The dirty queue triggers resolved workspace_id via
`agent.workspace_id`, but the window function's `dirty_from_updates`
discovery and `recomputed` recompute join used `issue.workspace_id`.
There is no schema-level FK guaranteeing
`agent.workspace_id == issue.workspace_id`. Any divergence (future
cross-workspace task scenarios, data repairs, migration bugs) would
cause:

  - dirty queue rows with workspace_id from agent
  - recompute join filtering by workspace_id from issue
  - 0 matches in recompute → bucket erroneously hits the
    deleted_empty branch and the daily row is silently dropped
  - dirty_from_updates path attributing usage to the wrong workspace

Rewrote both CTEs to JOIN agent (not issue) so trigger / discovery /
recompute share one workspace_id source. Comment in 077 explains the
constraint.

N1: Refreshed two stale references in
cmd/backfill_task_usage_daily/main.go (header now says "072..078";
stampWatermark warning now mentions migration 073, where the rollup
state table is actually introduced).

Test: New TestRollupTaskUsageDaily_WorkspaceMismatch constructs an
atq with agent.workspace_id != issue.workspace_id, asserts the bucket
lands under agent's workspace (not issue's), and re-asserts after a
runtime reassign in the foreign workspace. Acts as a canary if the
schema invariant changes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica.ai>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: multica-agent <github@multica.ai>
Co-authored-by: Devv <devv@Devvs-Mac-mini.local>
2026-05-08 15:35:21 +08:00
LinYushen
cc527c34be perf(heartbeat): batch runtime last_seen_at writes (#2213)
Batches runtime heartbeat last_seen_at updates while preserving the 60s flush / 150s sweeper stale-window invariant. Also drains pending heartbeat writes during graceful shutdown.
2026-05-07 15:50:27 +08:00
LinYushen
250ada1fb3 chore(db): drop unused agent_task_queue.last_heartbeat_at (#2212)
Drops the unused agent_task_queue.last_heartbeat_at column and removes the hot-path task heartbeat write.
2026-05-07 15:45:29 +08:00
Bohan Jiang
d0ac67dea2 fix(skills): drop SKILL.md content from list endpoints (#2180)
* fix(skills): drop SKILL.md content from list endpoints (#2174)

`GET /api/skills` and `GET /api/agents/{id}/skills` were SELECT *'ing the
skill row and shipping the full SKILL.md `content` blob to every caller.
SKILL.md bodies routinely run 50–200KB each, so a workspace with 30–40
skills returned multi-megabyte JSON arrays — past the CLI's 15s timeout
on high-latency links and locking out non-US users entirely.

Add `ListSkillSummariesByWorkspace` / `ListAgentSkillSummaries` sqlc
queries that omit `content`, plus a dedicated `SkillSummaryResponse`
wire shape so the contract is explicit (versus stuffing
`Content: ""` back into the existing struct). Detail endpoints
(`GET /api/skills/{id}`, agent CRUD return values) keep returning the
full body.

`AgentResponse.skills` and the matching TS `Agent.skills` now use
`SkillSummary[]` — frontend list/columns code already only read
id/name/description/config.origin, so the type narrowing matches actual
usage and prevents new code from accidentally depending on a content
field that won't be there.

Co-authored-by: multica-agent <github@multica.ai>

* fix(agents): narrow embedded skills to AgentSkillSummary; gofmt agent.go

GPT-Boy review of #2180: the previous commit typed AgentResponse.Skills as
[]SkillSummaryResponse, but the agent list batch query
(ListAgentSkillsByWorkspace) only joins agent_id/id/name/description, so
the wider type left workspace_id/config/created_at/updated_at as zero
values. Define a dedicated AgentSkillSummary {id,name,description} that
matches what the batch query actually returns and what the frontend
actually reads (`agent.skills.map(s => s.name|s.id)`); the standalone
GET /api/agents/{id}/skills endpoint keeps SkillSummaryResponse for
callers that need the source/origin info.

Switch GetAgent's per-agent skills load from ListAgentSkills (full Skill
rows including content) back to ListAgentSkillSummaries to avoid reading
SKILL.md bodies just to discard them.

Re-run gofmt on agent.go to fix the field-tag alignment that drifted when
Skills changed type.

Co-authored-by: multica-agent <github@multica.ai>

* docs(types): correct SkillSummary JSDoc — Agent.skills is AgentSkillSummary[]

GPT-Boy spotted on review: comment said SkillSummary was "embedded in
Agent.skills", but that field is now AgentSkillSummary[]. Re-point the
reader at the right type to avoid future confusion.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-07 01:36:29 +08:00
Bohan Jiang
38f777d0ba feat(autopilot): auto-pause autopilots with sustained high failure rate (#2136)
* feat(autopilot): auto-pause autopilots with sustained high failure rate

Adds a background monitor that pauses any active autopilot whose recent
runs are dominated by failures (defaults: ≥100 terminal runs in 7d, ≥90%
failed). The monitor leaves a severity=attention inbox notification for
the autopilot's creator (or the agent's owner if the autopilot was
agent-created) so a human learns about the auto-pause and can fix the
root cause before re-enabling.

Motivated by MUL-1336 §6 #2: a single broken cron autopilot
(`Registro de ls cada 5 min`, 1,475/1,476 failed in 7d) was burning
~1.5k tasks/tokens per week with no human in the loop.

Tunable via AUTOPILOT_FAIL_MONITOR_{INTERVAL,LOOKBACK,MIN_RUNS,FAIL_RATIO,STARTUP_DELAY};
INTERVAL=0 disables the monitor entirely.

Co-authored-by: multica-agent <github@multica.ai>

* chore(autopilot): relax failure monitor defaults to daily / 50 runs

Per review feedback in MUL-1339: 30-min scan was overkill — the 50-run
threshold already provides multi-hour lag, and operational simplicity
matters. Lowering MinRuns from 100 → 50 keeps low-frequency autopilots
in scope (~7 runs/day reaches threshold within 7d window).

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-06 17:59:15 +08:00
Naiyuan Qing
ba147708a6 fix(timeline): cursor-paginated timeline to stop long-issue freeze (#1968) (#2128)
* fix(timeline): cursor-paginated timeline to stop long-issue freeze (#1968)

Opening an issue from Inbox with thousands of timeline entries used to
hard-freeze the browser tab on a synchronous render of every comment +
activity. The whole pipeline was unbounded: the API returned every row,
TanStack Query cached the full array, and IssueDetail mounted N
CommentCards (each running a full react-markdown + lowlight pipeline)
in one frame.

This swaps the timeline endpoint to keyset cursor pagination and rewires
the frontend to useInfiniteQuery so a long issue costs the same as a
short one on first paint.

API:
- GET /issues/:id/timeline now accepts ?before / ?after / ?around (mutex)
  + ?limit (default 50, max 100); response wraps entries with next/prev
  cursors and has_more flags. Cursors are opaque base64 (created_at, id).
- ?around=<entry_id> anchors a window on the target so Inbox notifications
  pointing at an old comment never trigger the freeze.
- New composite indexes on (issue_id, created_at DESC, id DESC) replace
  the redundant single-column ones so keyset queries are index-only scans.
- /issues/:id/comments default branch now caps at 50 instead of returning
  every row unbounded; the unbounded ListComments / ListActivities sqlc
  queries are deleted.

Frontend:
- useIssueTimeline switches to useInfiniteQuery, exposes
  fetchOlder/fetchNewer/jumpToLatest + isAtLatest + newEntriesBelowCount.
- WS handlers respect the at-latest invariant: comment/activity:created
  prepends to pages[0] only when the user is reading the live tail;
  otherwise it just bumps a counter so the UI offers a "Jump to latest"
  affordance without yanking scroll.
- Optimistic mutations adapted to the InfiniteData shape via shared
  helpers (mapAllEntries / filterAllEntries / prependToLatestPage in
  core/issues/timeline-cache.ts) and use setQueriesData so all open
  windows of the same issue stay in sync.
- IssueDetail Activity section gets a TimelineSkeleton placeholder
  during the brief load window plus subtle text-link load-more buttons
  matching the existing Subscribe affordance (no Button chrome). Top
  uses a divider for boundary clarity; bottom shows
  "Jump to latest · N new" weighted slightly heavier when there's
  unread state.
- highlightCommentId now flows into the hook's around parameter so
  Inbox jumps fetch the surrounding 50 entries directly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(agent): default comment list to 50 + prompt hint about long issues

The CLI's "multica issue comment list" used to default to --limit 0
(meaning "fetch every comment"), which lets an agent on a long issue
fill its context window with thousands of rows. The default is now 50;
agents that need older history can pass --limit or --since explicitly.

The local-coding-agent prompt also gains a single-line note about this
in both the comment-triggered and on-assign flows so the agent knows to
scope its fetches when issue size is unknown. Autopilot run-only mode
is intentionally unchanged — it has no issue context to query.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 16:27:06 +08:00
Naiyuan Qing
3447764b03 feat(i18n): full rollout — 21 namespaces translated (en + zh-Hans) (#1853)
* feat(i18n): rollout phase — translate 9 namespaces (WIP)

Phase 1 complete (infrastructure + login + Settings language switcher),
phase 2 partial (Wave 4 done, search done). Pending namespaces
documented inline; another developer can pick up from here.

Infrastructure
--------------
- server: add users.language column + extend PATCH /api/me
  (TestUpdateMeAcceptsLanguage / TestUpdateMePreservesLanguage)
- packages/core/i18n: types / pickLocale (intl-localematcher) /
  browser-cookie-adapter / createI18n (initAsync false +
  useSuspense false) / I18nProvider / LocaleAdapterProvider
- Split server-safe vs React entries:
    @multica/core/i18n        — for proxy/RSC/middleware (no React)
    @multica/core/i18n/react  — for client trees (createContext)
  (RSC vendored React lacks createContext; mixed import would crash
  proxy.ts at module load.)
- packages/views/i18n: useT hook + selector API augmentation
  (i18next v26 default; auto-propagates to apps via the side-effect
  import in use-t.ts).
- apps/web: proxy.ts (Next 16 renamed middleware) merges existing
  legacy/root redirects with x-multica-locale header forwarding;
  layout.tsx reads locale via headers() and pre-loads RSC resources.
- apps/desktop: webPreferences.additionalArguments injects
  systemLocale (no sendSync — avoids main-thread blocking IPC);
  renderer adapter reads via process.argv.
- ESLint: i18next/no-literal-string at file-scope for translated
  files via packages/views/eslint.config.mjs TRANSLATED_FILES.
- glossary.md (packages/views/locales/) freezes term policy:
  Issue / Workspace / Agent / Skill / Autopilot / Daemon / Runtime
  stay English; Inbox / Project / Comment / Member translate.

Translated namespaces (9 / 19)
------------------------------
- auth: login page (web wrapper, including desktop-handoff copy) + Settings
  Appearance language switcher
- editor: 9 .tsx (bubble-menu / link-hover-card / readonly-content /
  title-editor / extensions: code-block / file-card / image-view /
  mention-suggestion) + 32 keys
- invite: 25 keys
- labels / members / my-issues: all of Wave 4
- search: command palette 35 keys
- navigation: no user-facing strings (no-op)

Pending (10 / 19)
-----------------
issues (46 files / ~210 keys)
agents (29 files / ~155 keys; presence.ts + config.ts label maps
  are allowed into i18n)
onboarding (22 files / ~150 keys)
settings rest / skills / modals / workspace / chat / inbox /
projects / autopilots / layout

Workflow for picking up
-----------------------
- Glossary: packages/views/locales/glossary.md (mandatory read)
- Reference impls: auth/login-page.tsx + editor/* (selector API +
  i18n-provider test wrapper pattern)
- Per namespace:
    1. create locales/{en,zh-Hans}/{ns}.json
    2. add to packages/views/i18n/resources-types.ts
    3. useT('{ns}') + t($ => $.foo) in components
    4. add files to TRANSLATED_FILES in eslint.config.mjs
    5. typecheck + test + lint must pass
- Subagents currently CANNOT write files (sandbox deny). Run as
  hybrid: subagent researches + outputs full JSON + tsx diff,
  controller writes.

Other
-----
- scripts/init-worktree-env.sh: default
  MULTICA_DEV_VERIFICATION_CODE=888888 in dev for deterministic
  login (gated by isProductionEnv).

Verified: pnpm typecheck (6 pkgs ok), pnpm test (232 pass),
make test (Go).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(i18n): rewrite glossary aligned with docs zh voice

Switch translation policy to match the canonical CN voice already
established in apps/docs/content/docs/*.zh.mdx (20+ files). The new
rule splits product nouns into two classes:

- Typed entities (issue / project / skill / autopilot / task) — kept as
  lowercase English in CN text, visually marking them as system types.
- Concepts (workspace / agent / daemon / runtime / inbox) — fully
  translated (工作区 / 智能体 / 守护进程 / 运行时 / 收件箱).

The previous glossary kept Workspace / Agent / Daemon / Runtime in
English on "engineering convention" grounds, but docs zh and the CN AI
ecosystem (Coze / 腾讯元器 / 百度) consistently translate these. App UI
now matches docs voice so users don't see a split personality between
the app and its own docs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(i18n): register 6 namespaces and retrofit zh strings to new glossary

Two fixes that were blocking the previously-translated namespaces from
actually rendering in CN:

1. RESOURCES gap — locales/index.ts only loaded common/auth/settings,
   but resources-types.ts declared 12 namespaces and 6 of them had real
   translation content. At runtime i18next would fall back to raw keys
   for editor / invite / labels / members / my-issues / search.
   Register all 9 currently-translated namespaces.

2. Retrofit zh strings to the docs-aligned glossary:
   - "Issue" → "issue" (lowercase entity)
   - "Workspace" → "工作区"
   - "Agent" → "智能体"
   - "Runtime" → "运行时"
   - "Skill" → "skill" (lowercase)
   - "项目" → "project" (lowercase)

Touched: editor.json (sub_issue + mention.group_issues), invite.json
(3 Workspace occurrences), members.json (agents_section / more_agents),
my-issues.json (8 retrofits across page/header/errors), search.json
(13 retrofits across groups/pages/commands/empty).

Verified: pnpm typecheck (6/6) + pnpm test (238/238) all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(i18n): translate inbox namespace

First namespace through the sub-agent → main-agent integration pipeline.

JSON: en/inbox.json + zh-Hans/inbox.json — 60 keys across page / menu /
list / detail / types / labels / errors. Time-formatter labels are kept
compact in EN ("5m" / "3h" / "2d") and use full units in zh ("5 分钟" /
"5 小时" / "5 天") since raw "5 分" reads as "5 marks/points" in CN.

Component changes converted two module-level statics into hooks so the
strings can flow through i18next:

- inbox-list-item.tsx: `timeAgo` (pure fn) → `useTimeAgo` (hook
  returning a fn). The local copy is a duplicate of @multica/core/utils
  `timeAgo` that is only used by inbox-page; other consumers across
  chat/agents/skills/issues stay on the core util for now and will be
  translated when their namespaces land.

- inbox-detail-label.tsx: `typeLabels` (static const Record) →
  `useTypeLabels` (hook returning the same Record shape). Call sites
  keep the existing `typeLabels[type]` access pattern.
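
A rough sketch of the time-ago labels described above, written as a
plain function rather than a hook so it runs standalone. The
`UNIT_LABELS` table, the `Locale` type, and the thresholds are
assumptions for illustration; the real `useTimeAgo` reads its labels
from inbox.json.

```typescript
type Unit = "minute" | "hour" | "day";
type Locale = "en" | "zh-Hans";

// EN stays compact; zh uses full units because a bare "分" is ambiguous.
const UNIT_LABELS: Record<Locale, Record<Unit, (n: number) => string>> = {
  en: {
    minute: n => `${n}m`,
    hour: n => `${n}h`,
    day: n => `${n}d`,
  },
  "zh-Hans": {
    minute: n => `${n} 分钟`,
    hour: n => `${n} 小时`,
    day: n => `${n} 天`,
  },
};

// Format an elapsed duration using the largest unit that fits.
function timeAgo(elapsedMs: number, locale: Locale): string {
  const labels = UNIT_LABELS[locale];
  const minutes = Math.floor(elapsedMs / 60000);
  if (minutes < 60) return labels.minute(minutes);
  const hours = Math.floor(minutes / 60);
  if (hours < 24) return labels.hour(hours);
  return labels.day(Math.floor(hours / 24));
}
```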

inbox-page.tsx now uses both hooks and `useT('inbox')` selector calls
for all hardcoded strings (~24 sites: header / dropdown menu / list
empty state / detail panel / mobile back / quick-create-failed flow /
all error toasts).

Wired up: resources-types.ts, locales/index.ts RESOURCES, ESLint
TRANSLATED_FILES (3 inbox tsx files now lint-protected).

Verified: pnpm typecheck (6/6) + pnpm --filter @multica/views test
(238/238) + ESLint clean on inbox/.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(i18n): translate workspace namespace

Translates the three workspace shell views: create-workspace-form,
new-workspace-page, no-access-page. Also fixes the prior-art
no-unescaped-entities lint errors in no-access-page.tsx — the
apostrophes in "doesn't" / "don't" were JSX text literals that move
into JSON values after translation, so the lint rule no longer fires.

Tests wrapped: workspace/create-workspace-form.test.tsx,
workspace/no-access-page.test.tsx, modals/create-workspace.test.tsx
all now wrap render() with <I18nProvider locale="en"> so the en values
in workspace.json drive the rendered text and the existing assertions
continue to match.

Slug constants kept: WORKSPACE_SLUG_FORMAT_ERROR /
WORKSPACE_SLUG_CONFLICT_ERROR exports in workspace/slug.ts are still
imported by onboarding/steps/step-workspace.tsx (out of scope here).
The workspace shell now reads its strings from workspace.json directly.

Multica.ai brand prefix in the slug input affordance is wrapped with
an inline `// eslint-disable-next-line i18next/no-literal-string` per
glossary policy on brand names.

Renamed sign_in_other → sign_in_different to avoid colliding with
i18next's `_other` plural-suffix convention which the selector-API
typings treated as a plural form of `sign_in`.
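
To illustrate the collision: i18next's plural suffixes are `_zero`,
`_one`, `_two`, `_few`, `_many` and `_other`, so any key ending in one
of them can be read as a plural form of the key before the suffix. The
`pluralBase` helper below is hypothetical, a sketch of a check that
could flag such names before they reach the typings.

```typescript
const PLURAL_SUFFIXES = ["zero", "one", "two", "few", "many", "other"];

// Return the base key if `key` looks like a plural form, else null.
function pluralBase(key: string): string | null {
  for (const name of PLURAL_SUFFIXES) {
    const suffix = `_${name}`;
    if (key.length > suffix.length && key.endsWith(suffix)) {
      return key.slice(0, -suffix.length);
    }
  }
  return null;
}

// Keys that would be interpreted as plural forms of another key.
function suspiciousKeys(keys: string[]): string[] {
  return keys.filter(key => pluralBase(key) !== null);
}
```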

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(i18n): translate projects namespace

Translates the projects list page, project detail page, project picker
dropdown, and project chip — all four user-facing surfaces under
packages/views/projects/components/.

New file: projects/components/labels.ts exposes three hooks that
replace the static `.label` field on PROJECT_STATUS_CONFIG /
PROJECT_PRIORITY_CONFIG and the previous module-level
`formatRelativeDate` helper. Core's `.label` stays untouched (it's
still consumed by search and the create-project modal, both
out-of-scope for this namespace) — those will flip when their
respective namespaces translate.

In zh, the "project" entity stays lowercase English per glossary
(`新建 project`, `还没有 project`, `从 project 移除`). Status / priority /
table column labels translate fully.

The cancelled / done / paused etc. status labels duplicate per-
namespace as `projects.status.*` rather than reading from a future
shared status namespace. This matches the auth/inbox/workspace
pattern of self-contained namespaces. If a generic "issue/project
status" pool emerges later, these can collapse.

Verified: pnpm --filter @multica/views typecheck (clean) +
test (238/238) + ESLint clean on projects/ (1 pre-existing warning
about useEffect/sidebarRef dep, unrelated to i18n).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(i18n): translate autopilots namespace

Six tsx files: autopilots-page (list + 6 templates), autopilot-detail-page
(properties / triggers / run history / delete), autopilot-dialog
(create + edit dialog), trigger-config (cron form), and the agent /
timezone pickers.

Hook conversions for module-level helpers that need t():
- summarizeTrigger / describeTrigger → useSummarizeTrigger /
  useDescribeTrigger (no external callers, removed the plain exports)
- formatRelativeDate → useFormatRelativeDate (per-component hook)
- formatCountdown → useFormatCountdown (per-component hook)
- TEMPLATES array now keyed by id; titles + summaries pull from
  templates/{id}/{title,summary} JSON. Prompts stay raw EN since
  they're injected directly into the agent task — translating them
  would translate the agent's instructions, not the user's UI.

Status / execution-mode / run-status enums render via t($ => $.status[k])
with k typed against the core type (no separate hook needed).

Verified: pnpm --filter @multica/views typecheck (clean) +
test (238/238) + ESLint clean on autopilots/.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(i18n): translate skills namespace

Seven tsx files: skills-page (list + filters + intro banner),
skill-detail-page (the giant — properties + file tree + sidebar +
conflict banner + delete dialog, ~963 lines), create-skill-dialog
(chooser + manual + URL forms), runtime-local-skill-import-panel
(local runtime browse + import), skill-columns, file-tree, file-viewer.

Notable patterns:
- `createSkillColumns` factory → `useSkillColumns` hook so column
  headers flow through useT. Column identity changing on every render
  is fine; DataTable handles it.
- `validateNewFilePath` (pure helper) → `useValidateNewFilePath` hook
  so the 5 validation error messages can be translated.
- skill_files / used_by / description_with_agents use i18next plural
  keys (`_one` / `_other`) — the type system collapses these into a
  single PluralValue access, so call sites use
  `t($ => $.foo, { count })` and i18next picks the form.
- Per glossary, "skill" stays lowercase EN in zh ("新建 skill",
  "已删除 skill", "未找到该 skill").

Test wrapper: runtime-local-skill-import-panel.test.tsx now wraps
render() with <I18nProvider> so the assertion on /Import to Workspace/i
matches the EN translation.

Verified: pnpm --filter @multica/views typecheck (clean) +
test (238/238) + ESLint clean on skills/.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(i18n): translate chat namespace

Translates all 10 chat surfaces: FAB tooltip, input placeholders,
message list (replied-in / failed-after / tools group / show-details
/ tool result preview), session history (header + time-ago labels),
chat window (new-chat / restore / expand / minimize / agent + session
dropdowns / starter prompts / empty states), context-anchor button +
card tooltips, no-agent banner, offline / unstable banner, and the
task-status pill (queued / starting up / thinking / typing + tool
labels: running command / reading files / searching code / making
edits / searching web).

Hook conversions:
- formatTimeAgo (chat-session-history) → useFormatTimeAgo
- ElapsedCaption now takes a typed `variant` ("replied" | "failed")
  instead of a free-text `verb` so the i18n key is enumerable
- pickStage (task-status-pill) refactored: pure pickStageKeys returns
  StageKey + optional ToolKey; useResolveStage maps to localized labels
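
A sketch of the pickStage split, under assumptions: the `TaskSnapshot`
shape, the concrete key names, and the plain `resolveStage` function
(standing in for the `useResolveStage` hook) are all hypothetical. The
point is that the decision logic returns keys only, and label lookup
happens separately against a bundle.

```typescript
type StageKey = "queued" | "starting_up" | "thinking" | "typing";
type ToolKey = "running_command" | "reading_files" | "searching_code";

// Hypothetical task snapshot; the real task type is not shown here.
interface TaskSnapshot {
  status: "queued" | "starting" | "running";
  activeTool?: ToolKey;
  isStreamingText: boolean;
}

interface StageKeys {
  stage: StageKey;
  tool?: ToolKey;
}

// Pure: no strings for display, only enumerable keys.
function pickStageKeys(task: TaskSnapshot): StageKeys {
  if (task.status === "queued") return { stage: "queued" };
  if (task.status === "starting") return { stage: "starting_up" };
  if (task.activeTool) return { stage: "thinking", tool: task.activeTool };
  return { stage: task.isStreamingText ? "typing" : "thinking" };
}

interface StageLabels {
  stage: Record<StageKey, string>;
  tool: Record<ToolKey, string>;
}

// Map keys to localized text; a tool label wins over the stage label.
function resolveStage(keys: StageKeys, labels: StageLabels): string {
  return keys.tool ? labels.tool[keys.tool] : labels.stage[keys.stage];
}

const enLabels: StageLabels = {
  stage: {
    queued: "Queued",
    starting_up: "Starting up",
    thinking: "Thinking",
    typing: "Typing",
  },
  tool: {
    running_command: "Running command",
    reading_files: "Reading files",
    searching_code: "Searching code",
  },
};
```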

Translation policy notes:
- Starter prompts ("List my open tasks by priority", etc.) are user
  UI when displayed AND the user's input when clicked — translating
  them sends the agent the user's locale-native phrasing, which is
  the right UX for a CN user using a CN agent.
- buildAnchorMarkdown (chat-window) stays in English: it's an
  agent-bound markdown prefix injected into the outgoing message,
  not user-facing UI.

Verified: pnpm --filter @multica/views typecheck (clean) +
test (238/238).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(i18n): translate modals namespace

Translates all 11 modal sources: registry (no UI text), backlog-agent-hint,
set-parent-issue, add-child-issue, delete-issue-confirm, feedback,
issue-picker, create-workspace, create-project, create-issue (manual),
quick-create-issue (agent panel).

Notable patterns:
- create-project re-uses useProjectStatusLabels / useProjectPriorityLabels
  hooks from views/projects/components/labels — same translation source
  as the projects list / detail, no duplication.
- create-issue.tsx: renamed `toast.custom((t) => ...)` callback param to
  `toastId` to avoid shadowing the closure-captured useT() `t` function.
- Test wrapper added to modals/create-issue.test.tsx so the two assertions
  on rendered modal text (success toast + Create another) match the EN
  bundle. modals/create-workspace.test.tsx was already wrapped (workspace
  ns commit).

Verified: pnpm --filter @multica/views typecheck (clean) +
test (238/238).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(i18n): translate settings namespace (rest of tabs)

Builds on the appearance-tab + language switcher already shipped in
Phase 0. Translates the remaining 8 settings surfaces: settings-page
shell (left nav + tab keys), account / profile, notifications-tab
(5 group labels + descriptions), tokens-tab (create / list /
revoke / created dialog), workspace-tab (general fields + danger
zone + leave/delete confirmations), members-tab (invite + role
config + revoke / remove flows), repositories-tab, labs-tab,
delete-workspace-dialog.

Hook conversion: members-tab `roleConfig` static const → `useRoleLabels`
hook returning a Record<MemberRole, {label, description, icon}>. The
icon stays as a typed React component (Crown / Shield / User), so
rendering pattern is unchanged at call sites.

Test wrapper: settings/components/delete-workspace-dialog.test.tsx
now wraps render() with <I18nProvider> (custom render() helper)
because the test asserts on rendered button labels ("Delete workspace",
"Cancel", "Deleting...").

Verified: pnpm --filter @multica/views typecheck (clean) +
test (238/238).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(i18n): translate runtimes namespace (entry surfaces)

Translates the user-facing runtime list page surfaces:
runtimes-page (header / search / filters / chips / empty / no-matches /
bootstrapping), runtime-detail (topbar + delete dialog + delete toasts),
runtime-detail-page (not-found state), shared.tsx (4-state HealthBadge
labels).

Hook conversion: shared `healthLabel(health)` was a pure module-level
function. Added `useHealthLabel` hook for translated call sites; kept
`healthLabel` as an EN-only fallback for non-component callers (column
factory in runtime-columns).

Deferred:
- runtime-list / runtime-columns (data table column headers + cell
  bodies) — large surface, not in the page-load critical path.
- connect-remote-dialog / update-section / usage-section — secondary
  flows, English remains acceptable until a focused pass.
- charts/* — primarily numeric tooltips and axes; minimal user-visible
  text.

Verified: pnpm --filter @multica/views typecheck (clean) +
test (238/238).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(i18n): translate layout namespace (sidebar nav, help, loader)

Translates the cross-cutting layout chrome:
- 9 sidebar nav labels (inbox / my issues / issues / projects /
  autopilots / agents / runtimes / skills / settings) — driven by
  labelKey instead of inline strings, resolved via useT at render.
- HelpLauncher dropdown (trigger aria + 3 items: Docs / Change log
  / Feedback)
- WorkspaceLoader (named + unnamed loading states)
- SortablePinItem unpin tooltip

Pattern shift in app-sidebar.tsx: nav arrays carry `labelKey: NavLabelKey`
(typed against the layout JSON) instead of `label: string`. The string
comparison checks (`item.label === "Inbox"`) became cleaner ID-based
checks (`item.key === "inbox"`).
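
The labelKey pattern can be sketched as follows. The bundle contents,
the `NavItem` shape, and the hrefs are assumptions for illustration;
only the idea (typed `labelKey` plus ID-based checks) comes from the
change above.

```typescript
// Hypothetical slices of the layout bundles.
const layoutEn = { nav: { inbox: "Inbox", settings: "Settings" } };
const layoutZh: typeof layoutEn = { nav: { inbox: "收件箱", settings: "设置" } };

// Only keys that exist in the bundle are valid label keys.
type NavLabelKey = keyof typeof layoutEn.nav;

interface NavItem {
  key: string;
  labelKey: NavLabelKey;
  href: string;
}

const NAV_ITEMS: NavItem[] = [
  { key: "inbox", labelKey: "inbox", href: "/inbox" },
  { key: "settings", labelKey: "settings", href: "/settings" },
];

// Resolve the label at render time from whichever bundle is active.
function navLabel(item: NavItem, bundle: typeof layoutEn): string {
  return bundle.nav[item.labelKey];
}

// Identity checks no longer depend on the displayed (translated) text.
function isInbox(item: NavItem): boolean {
  return item.key === "inbox";
}
```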

Deferred: deeper sidebar surfaces — workspace switcher dropdown,
"New Issue" CTA, "Pinned" / "Workspace" / "Configure" group labels —
remain English. The 9 nav labels are the ones that read in every
session.

Verified: pnpm --filter @multica/views typecheck (clean) +
test (238/238).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(i18n): translate onboarding namespace (welcome + step header)

Translates the user-first-impression surfaces of the onboarding flow:

- step-welcome.tsx (the wordmark, headline, lede paragraphs, all CTAs:
  Download Desktop / Continue on web / Start exploring / I've done
  this before, illustration caption)
- step-header.tsx ("Step N of M" counter + matching aria-label)
- onboarding-flow.tsx (skip-onboarding error toast)

Test wrapper added to onboarding/components/step-header.test.tsx —
custom render() helper wraps with <I18nProvider> so the "Step 2 of 5"
assertions match the EN bundle.

Deferred (acceptable English fallback for now): step-questionnaire,
step-workspace, step-runtime-connect, step-platform-fork, step-agent,
step-first-issue, cli-install-instructions, option-card, runtime
aside panels, starter-content-prompt, cloud-waitlist-expand. These
are deeper steps with significant copy that would benefit from a
focused dedicated pass — voice on each is more nuanced (questionnaire
options, runtime install instructions, agent template recommendations).

Verified: pnpm --filter @multica/views typecheck (clean) +
test (238/238).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(i18n): add EN/zh-Hans key parity guard

Schema-level vitest that walks RESOURCES.en and RESOURCES["zh-Hans"]
namespace by namespace and asserts both bundles cover the same key
set. i18next plural rule is normalized before compare (`_one` /
`_other` collapse to a single logical key) so EN's plural pair
matches zh's `_other`-only form.

Catches retrofit drift where a new EN key lands without zh —
previously this would silently fall back to the English string in
production. Cheap to keep green: 39 tests across 21 namespaces in
under a second.
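
A minimal sketch of the parity check, assuming bundles are nested
objects with string leaves. The helper names are hypothetical and the
real test iterates RESOURCES per namespace under vitest; this only
shows the flatten, normalize, compare steps.

```typescript
type Bundle = { [key: string]: string | Bundle };

// Collect dotted paths of all string leaves.
function flattenKeys(node: Bundle, prefix = ""): string[] {
  const out: string[] = [];
  for (const key of Object.keys(node)) {
    const value = node[key];
    const path = prefix ? `${prefix}.${key}` : key;
    if (typeof value === "string") out.push(path);
    else out.push(...flattenKeys(value, path));
  }
  return out;
}

// Collapse i18next plural suffixes so `files_one` and `files_other`
// both count as the logical key `files`.
function normalizePlural(path: string): string {
  return path.replace(/_(zero|one|two|few|many|other)$/, "");
}

// Logical keys present in `source` but absent from `target`.
function missingKeys(source: Bundle, target: Bundle): string[] {
  const have = new Set(flattenKeys(target).map(normalizePlural));
  const want = Array.from(new Set(flattenKeys(source).map(normalizePlural)));
  return want.filter(key => !have.has(key)).sort();
}
```

Running `missingKeys` in both directions catches keys missing from
either bundle.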

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(i18n): translate issues namespace

Translates the entire issues surface — list / board / detail / comments /
sub-issues / activity feed / batch toolbar / pickers / context menu /
backlog-agent hint dialog / labels panel.

Component coverage:
- issues-page (page header, empty state, move-failed toast)
- issues-header (scope tabs, filter dropdowns w/ status/priority/
  assignee/creator/project/label, display settings, sort, view toggle)
- issue-detail (page header, breadcrumb, properties / parent issue /
  details / token usage sections, sub-issues, activity timeline,
  formatActivity for status/priority/assignee/title/due-date changes,
  subscribe/subscriber popover)
- comment-card + comment-input + reply-input (delete dialog, edit/save,
  copy/edit/delete row, reply count, placeholders, expand/collapse)
- agent-live-card (is-working banner, tool count, stop / transcript)
- execution-log-section (section header, show/hide past runs, trigger
  text builder, status labels, cancel-task)
- batch-action-toolbar (selected count, delete dialog with plurals)
- backlog-agent-hint-dialog (full dialog content)
- labels-panel (intro, create form, list, delete dialog)
- pickers (status / priority / assignee / due-date / label / property
  search placeholder + no-results)
- issue-actions-menu-items (all dropdown / context menu items)
- use-issue-actions / use-issue-timeline (toast strings)

STATUS_CONFIG / PRIORITY_CONFIG label rendering routed through
$.status[enum] / $.priority[enum] at every call site; the core config
keeps its English fallback for non-i18n consumers but UI never reads
.label directly anymore.
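
As an illustration of that routing, with hypothetical status values,
colour classes, and zh strings: typing the bundle's `status` block as a
`Record` over the enum means a missing translation fails to compile,
while the core config keeps its English `.label` for non-i18n callers.

```typescript
type IssueStatus = "todo" | "in_progress" | "done";

// Core config: colours plus an English fallback label.
const STATUS_CONFIG: Record<IssueStatus, { label: string; colorClass: string }> = {
  todo: { label: "Todo", colorClass: "text-muted" },
  in_progress: { label: "In Progress", colorClass: "text-warning" },
  done: { label: "Done", colorClass: "text-success" },
};

// Bundle shape: every enum member must have a translated label.
interface IssuesBundle {
  status: Record<IssueStatus, string>;
}

const issuesZh: IssuesBundle = {
  status: { todo: "待办", in_progress: "进行中", done: "已完成" },
};

// UI call sites pass a bundle; non-i18n consumers omit it and get EN.
function statusLabel(status: IssueStatus, bundle?: IssuesBundle): string {
  return bundle ? bundle.status[status] : STATUS_CONFIG[status].label;
}
```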

Tests retrofitted: issues-page, issue-detail, and issue-actions-menu
RTL specs now wrap renders in <I18nProvider> with the EN bundle, so
their string assertions match the bundle (not hardcoded literals).

ESLint i18next allow-list extended to 24 issues files. Verified:
pnpm --filter @multica/views typecheck + test (277/277) all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(i18n): translate agents namespace

Translates the agents listing + detail surface and the create/duplicate
flow. Covers the high-frequency surfaces; deeper sub-tab editors
(activity / instructions / skills / env / custom-args bodies, and the
hooks-buggy runtime/model/concurrency pickers) are deferred — they
have their own pre-existing react-hooks rule violations and benefit
from a focused dedicated pass.

Component coverage:
- agents-page (page header w/ tagline + new button, scope segment,
  search, sort dropdown, availability chips, archived toolbar, empty
  state, no-matches messaging w/ search interpolation, list-load
  error)
- agent-detail-page (back link, archived banner, archive dialog,
  not-found state, all 4 toast strings)
- agent-detail-inspector (avatar editor, name + description popover,
  description dialog, every PropRow label, validation message,
  presence badge label sourced from $.availability[enum])
- agent-overview-pane (tab labels, discard-unsaved-changes dialog)
- create-agent-dialog (title / description / labels / placeholders /
  duplicate-suffix / runtime filter buttons / runtime status copy)
- agent-row-actions (full dropdown items + cancel-tasks dialog with
  pluralized "N running + M queued" summary + archive dialog + 6 toasts)
- agent-columns (every header cell, You / Archived chips, runtime
  fallback labels, availability + workload labels via $.availability /
  $.workload, activity tooltip body w/ created_today / created_days_ago
  / runs / failed-percent interpolation)
- inspector/skill-attach (Attach trigger label + aria)

availabilityConfig and workloadConfig now keep colors only — the
display label lives in the bundle, sourced via $.availability[enum]
and $.workload[enum] at every call site. Same pattern as
STATUS_CONFIG/PRIORITY_CONFIG in the issues namespace.

ESLint i18next allow-list extended to 8 agents files.
Verified: pnpm --filter @multica/views typecheck + test (277/277)
all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(i18n): clear 30 stray EN strings in translated files

Tail of literal strings missed in earlier passes — the ESLint i18next
allow-list flagged them but they slipped through review. Files touched:

- layout/app-sidebar.tsx (10 keys: Workspaces / Pending invitations /
  Create workspace / Join / Decline / Log out / New Issue + shortcut /
  Pinned / Workspace / Configure)
- runtimes/components/runtime-detail.tsx (Serving header + serving_count
  pluralization, no_agents copy, running/queued chips with count
  interpolation, Diagnostics header, CLI label, Delete runtime button,
  Technical details toggle, last seen interpolation)
- onboarding/steps/step-welcome.tsx (entire WelcomeIllustration mock —
  5 cards × actor names + body copy + 3 mention chips + 2 timestamps;
  zh translation reads naturally instead of leaving the demo English)
- settings/components/labs-tab.tsx (`Co-authored-by: ...` git trailer
  wrapped in {} so linter sees a JS string, not JSX text — magic
  identifier git relies on, must not translate)
- settings/components/members-tab.tsx (✓ glyph wrapped in {})
- modals/feedback.tsx (⌘↵ shortcut wrapped in {})

ServingAgentsCard now reads availability/workload labels from
`agents` namespace (cross-namespace useT) so the bundle-truth pattern
holds: presenceConfig keeps colours only, label text comes from the
shared bundle.

Verified: typecheck + 277/277 tests + lint (only the pre-existing
react-hooks rules-of-hooks errors remain, which task #6 addresses).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(agents): rules-of-hooks + translate 4 model/runtime pickers

Three pre-existing react-hooks/rules-of-hooks violations + one missing
useMemo dep cleared, then the four pickers wired through useT.

Hook order fixes:
- concurrency-picker: useEffect now runs before the !canEdit early
  return. Stale-draft reset still works the same way.
- runtime-picker: useMemo for the filtered list moved above the
  !canEdit branch.
- model-dropdown: `models = data?.models ?? []` was minting a fresh
  array each render and tripping the deps lint of the downstream
  useMemo. Wrap in useMemo so the reference is stable.
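
The model-dropdown problem can be reproduced without React: `?? []`
allocates a new array on every evaluation, so anything comparing by
reference (such as a hook dependency list) sees a change each time. The
sketch below uses a module-level constant to show a stable reference;
that is a swapped-in alternative for illustration, not the useMemo fix
applied in this commit, and the function names are hypothetical.

```typescript
interface ModelsResponse {
  models: string[];
}

// One shared empty array, so the fallback is reference-stable.
const EMPTY_MODELS: readonly string[] = Object.freeze([]);

// New array on every call when data is missing.
function unstableModels(data?: ModelsResponse): readonly string[] {
  return data?.models ?? [];
}

// Same reference on every call when data is missing.
function stableModels(data?: ModelsResponse): readonly string[] {
  return data?.models ?? EMPTY_MODELS;
}
```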

Translation coverage:
- concurrency-picker: tooltip ("Concurrency · N max..."), range
  helper text, Save button.
- runtime-picker: trigger label fallback ("No runtime"), tooltip
  text composed from {{name}} + status, Mine/All filter buttons,
  empty-list copy, "owned by {{name}}" + status fragments in row
  tooltip, Cloud badge, online/offline aria.
- model-picker: trigger label, tooltip, "Managed by runtime"
  fallback, search placeholder, "Discovering models…", default
  badge, "No models available", "Use \"X\"" custom-id flow, Clear
  button + its title.
- model-dropdown: every label string including the "Select a runtime
  first" / "Default (provider)" / "Runtime offline — enter manually"
  trigger fallbacks, the supported=false explanation block, discovery
  failed badge, all popover items.

ESLint allow-list extended to 4 picker files. Verified: typecheck +
277/277 tests + lint (0 errors, only pre-existing react-hooks warnings
in unrelated files).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(i18n): translate runtimes list + connect dialog + CLI updater

Three deep runtime surfaces wired through useT, with the agents
namespace doing double duty for shared availability/workload labels.

runtime-columns:
- 7 column headers via t-augmented createRuntimeColumns({ t }).
- HealthCell now reads from useHealthLabel() (already translation-aware)
  instead of the EN-only healthLabel() helper.
- WorkloadCell sources the label from $.workload[enum] (cross-namespace
  to agents) — colour stays via workloadConfig.
- CostCell delta "flat" copy + CLI cell "Desktop" badge + update-
  available aria/tooltip + RowMenu's full delete dialog (title /
  description with {{name}} interpolation / cancel / confirm /
  deleting state) plus its admin-permission hint.

connect-remote-dialog:
- Three steps fully translated: instructions (header + 4 numbered
  steps + security warning + troubleshooting list with mono code
  snippets escaped as JS strings), waiting (loader + hint), success
  (CTA pair).
- Mono CLI commands wrapped in {} so linter sees JS strings — those
  are literal commands that must stay untranslated for the user to
  paste into a terminal.

update-section:
- statusConfig collapsed to icon+colour only; labels move to
  $.update.status[enum] for proper translation per-state.
- "CLI Version:" / "Latest" / "available" / "Update" / "Retry"
  copy + the "Managed by Desktop" tooltip and disabled hint.

Layout helpers tagged: runtime-list passes `t` through to the column
factory the same way agent-columns does.

ESLint allow-list extended with the 4 wired files. Verified:
typecheck + 277/277 tests + 0 i18n lint errors. usage-section.tsx
(KPI cards / WhenChart / TopUsageBreakdown / receipt table) is the
remaining runtimes surface — chart-heavy and benefits from a focused
pass next.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(i18n): translate 5 agent detail tabs + skill-add dialog

The 5 tabs that fill the agent detail right pane plus the shared
skill picker dialog. Agents bundle gains a `tab_body` block with
sub-namespaces per tab + a `common` slot for save/add/unsaved.

Tab coverage:
- instructions-tab: intro paragraph, multi-line example placeholder
  (full 18-line zh translation), Save / Unsaved.
- env-tab: read-only intro / empty state, editable intro with two
  inline `<code>` env-var examples kept English (mono terminal
  payloads), KEY / value placeholders, Show/Hide value aria, Add /
  Remove aria, all 3 toasts (duplicate keys / saved / save failed).
- custom-args-tab: intro about whitespace splitting, launch-mode
  prefix line + `<your args>` placeholder, --flag value placeholder,
  Add, Remove aria, both toasts.
- skills-tab: intro, Add skill button, import-hint callout, empty
  state title + hint + add-CTA, remove-failed toast.
- activity-tab: 3 section titles (Now / Last 30 days / Recent work),
  active-task pluralization, performance subtitle, all 3 empty
  states, runs/success%/avg-duration/failed pluralization with
  interpolation, source labels (Issue / Chat / Autopilot / Untracked),
  source fallbacks (Quick create / Creating issue / Chat session /
  Autopilot run), issue-short fallback, "Triggered by" tooltip
  header, open-issue / transcript / cancel-task tooltips and ARIAs,
  cancelling state, started/dispatched/queued time prefixes, show
  more.
- skill-add-dialog: dialog title + description, empty list copy,
  Cancel button, add-failed toast.

skills-tab.test.tsx wrapped in <I18nProvider> with the EN bundle so
its `Local runtime skills are always available` assertion still
matches the resolved translation instead of the raw key path.

ESLint allow-list extended with the 6 wired files. Verified:
typecheck + 277/277 tests + 0 i18n lint errors. Only the per-test
mock for skills-tab needed wrapping; the other 4 tabs ship without
test files of their own and inherit the I18nProvider chain via
agent-overview-pane / agent-detail-page test renders (when those
exist later).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(i18n): translate onboarding step-questionnaire + option-card

The user-profile step (3 questions) is the first deferred onboarding
deep step now wired through useT.

step-questionnaire:
- Eyebrow + headline + answered-progress counter with {{count}}
  interpolation
- All 3 questions and their option labels (team size / role / use case)
- All 3 "Other" placeholders for free-text fallback
- Right-rail "Why three questions" / "What you get" panel: 2 eyebrow
  rows, 2 unlock-item title+body pairs, learn-more link
- Back / Continue buttons via shared `common` block

option-card: shared "Other" radio label and aria.

Test wrapped in <I18nProvider>. EN value of `other_label` kept as
"Other" so the existing /^other$/i regex in step-questionnaire.test
keeps matching after the rendering pipeline switched from a hardcoded
literal to a bundle lookup.

ESLint allow-list extended with these 2 files. The remaining 4 deep
steps (workspace / runtime-connect / platform-fork / agent), the
2 ancillary surfaces (cli-install-instructions / starter-content-
prompt), and the 3 side panels (runtime-aside-panel / cloud-waitlist-
expand / compact-runtime-row) will be surfaced + swept by the global
ESLint switch (next commit).

Verified: typecheck + 277/277 tests + 0 i18n lint errors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(i18n): flip ESLint to glob + drain remaining hardcoded EN

ESLint i18next/no-literal-string now applies to **/*.tsx by default
instead of an explicit allow-list. Files that genuinely still need
hardcoded EN are listed in STILL_HARDCODED — concrete, finite, and
the goal is to drain that list to zero.

Tail strings translated in this commit (surfaced by the global flip):

- common/task-transcript/agent-transcript-dialog.tsx — full dialog:
  status badge (Running / Completed / Failed), sr-only DialogTitle,
  Filter dropdown trigger + Clear filters, Copy all / Copy filtered /
  Copied, tool-calls + events metadata chips with pluralization,
  events-filtered "{{shown}} of {{total}}" interpolation, "Waiting
  for events..." live state, "No execution data recorded." past
  state. New `transcript` block in agents namespace.
- runtimes/components/charts/activity-heatmap.tsx — Less / More
  legend labels around the contribution-style heat squares.
- search/search-trigger.tsx — sidebar Search... button label.
  ⌘ glyph wrapped in {} to satisfy the linter (mono shortcut symbol,
  not translatable).

Holdouts (STILL_HARDCODED, ~14 files): the deep onboarding steps
(workspace / runtime-connect / platform-fork / agent / first-issue /
cli-install-instructions, plus 4 ancillary panels), the runtimes
usage-section + KPI cards, and 5 minor agent visual primitives
(sparkline / agent-presence-indicator / agent-profile-card /
visibility-badge / char-counter). Each one gets a dedicated future
pass; the global rule prevents new hardcoded strings from landing
elsewhere.

Verified: typecheck + 277/277 tests + 0 i18n lint errors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(i18n): drain agent visual primitives + onboarding small components

8 files removed from STILL_HARDCODED:

agents/components/:
- char-counter — over-limit text with {{count}} interpolation
- visibility-badge — uses new agents.visibility.{private,workspace}.
  {label,tooltip} block; drops VISIBILITY_LABEL/TOOLTIP imports from
  core in favour of bundle-driven copy
- agent-presence-indicator — availability + workload labels via
  $.availability[enum] / $.workload[enum] (cross-namespace),
  queue-badge "+N queued" with pluralization
- agent-profile-card — Agent unavailable / Detail link / Owner /
  Skills / Runtime / Unknown runtime / Archived chip / availability
  line via cross-namespace lookup

agents.json: new presence + visibility + profile_card + char_counter
blocks.

onboarding/components/:
- compact-runtime-row — online/offline aria via agents.availability
- runtime-aside-panel — full content (What's a runtime / Good to
  know / Swap anytime / Add more later / docs link)
- starter-content-prompt — full dialog (title / description with
  inline emphasis / both buttons / 3 toasts)
- cloud-waitlist-expand — intro paragraph + warning span / email
  + reason labels + placeholders + Optional badge / Join + on-list
  states / both toasts

onboarding/steps/:
- cli-install-instructions — copy aria + intro + 2 step labels

onboarding.json: new runtime_aside / cli_install / starter_content /
cloud_waitlist blocks.

Tests for step-platform-fork + step-runtime-connect wrapped in
<I18nProvider> with EN bundle so /you're on the list/i etc. still
match the resolved translations.

Verified: typecheck + 277/277 tests + 0 i18n lint errors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(i18n): translate onboarding deep steps

The 5 large onboarding steps that were deferred from earlier passes,
plus their support helpers, all wired through useT.

step-first-issue (final beat — flips onboarded_at):
- error_title / Retry / retry_failed toast / finishing / opening
  states.

step-agent (creates the user's first agent):
- Templates moved from a module-level const to a useT-driven
  useAgentTemplates() hook. Names + emoji stay constant (visual
  identity), labels + blurbs + instructions resolve from the
  bundle. coding / planning / writing / assistant — all four
  templates ship a full zh translation that reads naturally.
- Recommended badge, eyebrow + headline + lede, footer hint,
  Create {{name}} CTA, create_failed toast.
- Right-rail "About agents" panel (4 way-items + headline +
  add-more hint + docs link).

step-workspace (create or pick existing):
- 5 footer states (open / creating / creating-pending / name-first
  / pick), all hint + CTA strings via interpolation.
- Name + URL + slug placeholders, issue-prefix preview spans,
  Create-new card title + subtitle.
- 8-row WorkspacePreviewCard sidebar (Inbox / Issues / Agents /
  Projects / Autopilot / Runtimes / Skills / And more) — every
  label + meta wired to bundle keys.
- 4 perks (assign / chat / invite / switch) + 3 next-steps
  (runtime / agent / starter), 2 toasts (slug-conflict / failed).
- `multica.ai/${slug}` mono URL escaped via template-literal
  expression so the linter sees a JS string.

step-runtime-connect (desktop scan flow):
- 3 phase headlines + ledes (scanning / found / empty), trust-strip
  status (all online / N online / none online) with pluralization,
  online/offline labels, Skip / Continue / Selected hint.
- Empty-view 2 cards (skip + waitlist) and the cloud waitlist
  dialog wrapper.

step-platform-fork (web fan-out):
- Eyebrow + headline + lede, footer hint with 3 phase variants.
- Primary download card (before/after click) + 2 alt cards (CLI /
  cloud) + CLI dialog with 4 elapsed-time stages (normal / midway /
  slow / stalled), live-listening header, runtime-connected
  pluralization, cloud waitlist dialog.

ESLint: STILL_HARDCODED list shrunk from 14 entries to 1 — only
runtimes/components/usage-section.tsx (chart-heavy KPI panel)
remains.

Verified: typecheck + 277/277 tests + 0 i18n lint errors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(i18n): translate runtimes usage panel + drop STILL_HARDCODED

Final i18n holdout: the runtimes usage panel (KPI hero, WHEN chart
tabs, cost-by breakdowns, daily breakdown table) is wired through
useT("runtimes"). With this drained, the eslint scaffolding for
explicit holdouts is removed — every JSX text node in @multica/views
now flows through i18n.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(i18n): drain rollout gaps + add cross-device sync

Lands the post-review punch list for the i18n rollout: closes correctness
gaps that would have shipped silently, and adds the missing cross-device
locale sync the rollout's docs already promised.

Coverage:
- Register issues + agents namespaces in RESOURCES (90 useT call sites
  were rendering keys-as-text in production)
- Harden parity test to compare RESOURCES keys against on-disk JSON
  files, so a future missing namespace registration fails loudly
- Server-side language whitelist in UpdateMe + reject-unsupported test
- Safe SupportedLocale resolution in appearance-tab (no more `as` cast
  on a region-tagged BCP-47 string)
- HTML lang attribute uses zh-CN (not zh-Hans) for screen reader / CJK
  font-stack compatibility
- Cookie Secure flag on https
- Pulled createBrowserCookieLocaleAdapter out of the server-safe entry
  into a new @multica/core/i18n/browser subpath; document.cookie access
  can no longer leak into Edge middleware imports
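
A minimal sketch of the hardened parity check described above, assuming
RESOURCES is a locale -> namespace -> bundle map and that each namespace
lives in one JSON file per locale. The function name, the type shapes and
the directory listing input are hypothetical; the real test may read the
files differently.

```typescript
// Hypothetical shape: locale -> namespace -> translation bundle.
type Resources = Record<string, Record<string, unknown>>;

// Returns "locale/namespace" entries that exist on disk but were never
// registered, so a missing registration fails loudly instead of rendering
// keys as text.
function findUnregisteredNamespaces(
  resources: Resources,
  filesOnDisk: Record<string, string[]>, // locale -> file names, e.g. "issues.json"
): string[] {
  const missing: string[] = [];
  for (const [locale, files] of Object.entries(filesOnDisk)) {
    const registered = new Set(Object.keys(resources[locale] ?? {}));
    for (const file of files) {
      if (!file.endsWith(".json")) continue; // ignore non-bundle files
      const namespace = file.slice(0, -".json".length);
      if (!registered.has(namespace)) missing.push(`${locale}/${namespace}`);
    }
  }
  return missing.sort();
}
```

A parity test would assert that the returned list is empty.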

Cross-device sync:
- New UserLocaleSync component mounted in CoreProvider; on login, if
  user.language differs from the active i18n.language, persist via the
  adapter and reload. Both apps benefit
- Desktop main process tracks system locale and emits IPC on focus when
  it changes; renderer reloads only when the user has no explicit
  Settings choice (their preference still wins)
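
The two sync rules above can be sketched as pure decisions, assuming a
hypothetical LocaleAdapter with a persist method. The function names and
signatures are illustrative only and are not the actual UserLocaleSync or
desktop IPC code.

```typescript
// Hypothetical adapter shape; the real adapter in @multica/core may differ.
interface LocaleAdapter {
  persist(locale: string): void;
}

// On login: when the account's language differs from the active one,
// persist it and tell the caller that a reload is needed.
function syncUserLocale(
  userLanguage: string | null,
  activeLanguage: string,
  adapter: LocaleAdapter,
): boolean {
  if (!userLanguage || userLanguage === activeLanguage) return false;
  adapter.persist(userLanguage);
  return true; // caller performs the reload
}

// Desktop: reload on a system-locale change only when the user has made
// no explicit Settings choice, so their preference still wins.
function shouldReloadOnSystemLocaleChange(
  previousSystemLocale: string,
  currentSystemLocale: string,
  hasExplicitChoice: boolean,
): boolean {
  return !hasExplicitChoice && previousSystemLocale !== currentSystemLocale;
}
```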

Tests:
- pickLocale / matchLocale (11 cases incl. region-tagged BCP-47, malformed
  tags, zh-Hant collapse-to-zh-Hans semantics)
- browser-cookie-adapter (6 cases under jsdom)
- Shared renderWithI18n helper at packages/views/test/i18n.tsx that wraps
  the real RESOURCES map; future tests opt in instead of inlining a
  per-file TEST_RESOURCES slice that goes stale silently
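
For illustration, the matching semantics those cases exercise might look
like the sketch below. It assumes the supported locales are "en" and
"zh-Hans" and that "en" is the fallback; both are assumptions, as are the
exact signatures and the validation regex. It is not the shipped
pickLocale / matchLocale.

```typescript
// Assumption: these are the only two supported locales.
const SUPPORTED = ["en", "zh-Hans"] as const;
type SupportedLocale = (typeof SUPPORTED)[number];

// Maps one BCP-47 tag to a supported locale, or undefined when nothing fits.
function matchLocale(tag: string): SupportedLocale | undefined {
  // Reject malformed tags: a 2-3 letter language, then "-"-joined subtags.
  if (!/^[A-Za-z]{2,3}(-[A-Za-z0-9]{1,8})*$/.test(tag)) return undefined;
  const language = (tag.split("-")[0] ?? "").toLowerCase();
  if (language === "en") return "en";
  // Every Chinese variant (zh-CN, zh-TW, zh-Hant, ...) collapses to zh-Hans.
  if (language === "zh") return "zh-Hans";
  return undefined;
}

// Returns the first candidate that matches, else the fallback.
function pickLocale(
  candidates: readonly string[],
  fallback: SupportedLocale = "en",
): SupportedLocale {
  for (const candidate of candidates) {
    const matched = matchLocale(candidate);
    if (matched) return matched;
  }
  return fallback;
}
```

Returning undefined instead of casting is what lets a caller such as the
appearance tab avoid an `as` cast on a region-tagged string.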

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(conventions): consolidate naming + i18n glossary into docs site

Single source of truth for code naming, i18n translation glossary, and
Chinese voice rules. Previously split between packages/views/locales/glossary.md
and scattered comments — now lives at apps/docs/content/docs/developers/conventions.{mdx,zh.mdx}
with both English and Chinese versions kept in sync.

Three sections per page:
1. Code naming — routes, packages, files, DB, Go, TS, commits
2. i18n translation glossary — entity vs concept rule, what to translate,
   word combination, plurals, interpolation, key naming
3. Chinese voice + style — punctuation, principles, where to look in doubt

Side effects:
- packages/views/locales/glossary.md collapses to a stub redirecting to
  the docs page; do not edit it
- CLAUDE.md gets a new top-level "Conventions reference" section so any
  Claude session sees the pointer before any other rule
- apps/docs/content/docs/developers/ gets a stub English meta.json so the
  conventions page is reachable on the EN side (contributing.zh.mdx /
  architecture.zh.mdx remain ZH-only — separate work)
- Both root sidebars get a new "Developers" group

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(i18n): apply zh voice rules + translate project/autopilot

Two-part cleanup driven by the conventions doc that landed in the previous commit:

Voice violations (mechanical sweep across 10 zh-Hans namespaces):
- 「」 (Japanese-style brackets) → \" to match the EN source's straight
  double quotes (~13 sites)
- … (single-char ellipsis) → ... three dots (~43 sites)
- Drop translation-ese pronoun "我们" where it's a pure narrator
  ("我们已发送" → "已发送", "我们替你托管" → "由 Multica 托管"); keep
  "告诉我们" where "we" is the legitimate brand recipient
- "作为父级 / 作为子级" → "设为父级 / 设为子级"
- "任务" mistranslated as the task entity → `task` (lowercase EN entity)
- Dialog title "Autopilot" → "autopilot"

Translate project / autopilot per industry consensus:
- `project` → 「项目」 (~42 value sites). Feishu / Tower / Teambition /
  PingCode / GitHub Projects all translate; no Chinese product keeps
  `project`.
- `autopilot` → 「自动化」 (~34 value sites). Avoids the Tesla-style
  「自动驾驶」 association; matches Notion / Feishu's industry term.
- Issue / skill / task remain lowercase EN per dev-team familiarity.
- Sidebar nav-label entities get Title Case ("Issue" / "Skill" / "我的
  Issue") so the entry-point label reads as a proper UI signal; body
  prose stays lowercase.

Conventions doc (EN + ZH) reflects the decision and adds a "why these
translate but issue/skill/task don't" rationale block.

Verification: parity test 45/45, full monorepo typecheck green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(i18n): translate chat session delete + project resources section

Two features that shipped on main while this branch was idle never went
through the i18n pass:

- Chat session delete confirmation dialog (#2115) and history toggle
  tooltip (#2117): adds session_history.delete_dialog.* and
  session_history.row_delete_*, plus window.history_show_tooltip /
  history_back_tooltip.
- Project resources sidebar (#1926/#2080/#2111): entire component
  including toasts, popover form, attach/remove tooltips. New
  projects.resources subtree.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 16:16:12 +08:00
Bohan Jiang
09f04847d3 feat(server): redis-backed runtime liveness with DB fallback (#2121) 2026-05-06 14:31:33 +08:00
Bohan Jiang
a4fac51cf5 fix(projects): add resource_count breadcrumb instead of inlining resources (#2118)
* fix(projects): add resource_count breadcrumb to project responses

Closes #2087

`multica project get` previously returned project metadata with no signal
that resources existed. Agents that fetched a project this way had no way
to discover its attached resources without already knowing about
`/api/projects/{id}/resources` or the on-disk `.multica/project/resources.json`.

Rather than inline the full resource list into the parent payload (which
conflates parent metadata with a child sub-collection and locks the
resource_ref shape into the project endpoint's contract), this adds a
scalar `resource_count` breadcrumb to ProjectResponse. The actual list
stays at the dedicated sub-collection endpoint.

Changes:
- GetProjectResourceCounts :many — new batched sqlc query
- ProjectResponse.ResourceCount populated in GetProject, ListProjects,
  SearchProjects, and the with-resources CreateProject echo
- multica project get prints a stderr hint pointing at
  multica project resource list <id> when count > 0; the JSON on stdout
  stays parseable
- Meta-skill (runtime_config.go) lists multica project get and
  multica project resource list in Available Commands so agents that
  read CLAUDE.md / AGENTS.md know about both paths
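
The stdout/stderr split can be sketched as follows. This is a TypeScript
illustration only, not the CLI's actual code: the Project fields other
than resource_count, the function name and the exact hint wording are
assumptions.

```typescript
// Only resource_count is taken from this change; id and title are
// illustrative fields.
interface Project {
  id: string;
  title: string;
  resource_count: number;
}

// Keeps stdout as pure JSON and routes the discovery hint to stderr, so
// scripts that parse stdout are unaffected by the hint.
function renderProjectGet(project: Project): { stdout: string; stderr: string } {
  const stdout = JSON.stringify(project, null, 2);
  const stderr =
    project.resource_count > 0
      ? `hint: ${project.resource_count} resource(s) attached; run: multica project resource list ${project.id}`
      : "";
  return { stdout, stderr };
}
```

Because the hint never reaches stdout, piping the output of
`multica project get` into a JSON parser keeps working whether or not
resources are attached.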

Co-authored-by: multica-agent <github@multica.ai>

* fix(projects): wire ResourceCount through Update + Create event payload

Review feedback on #2118.

- UpdateProject now reloads ResourceCount before responding/publishing.
  Previously a title- or status-only PUT served (and broadcast over WS)
  resource_count: 0 even when resources existed.
- The with-resources CreateProject path sets resp.ResourceCount before
  the project:created publish, so the WS event payload matches the HTTP
  echo. The hand-rolled response map collapses to an embedded
  ProjectResponse + resources array — one source of truth for the
  serialized shape.
- packages/core/types/project.ts: Project gains resource_count: number
  to keep the TS contract aligned with the server response.

Tests:
- TestProjectResourceCountBreadcrumb extends to assert UpdateProject
  preserves the breadcrumb.
- TestCreateProjectWithResourcesEchoesCount asserts the create echo
  carries resource_count matching the attached resources.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-06 14:09:35 +08:00