Commit Graph

361 Commits

Author SHA1 Message Date
Jiayuan Zhang
fc8528d64d feat(autopilot): support assigning to a squad (MUL-2429) (#2888)
* feat(autopilot): support assigning autopilot to a squad (MUL-2429)

Path A (Squad-as-Leader) from the RFC: when an autopilot's assignee is a
squad, dispatch resolves to squad.leader_id and executes against the
leader's runtime — semantics match a human manually assigning the issue
to that squad, no fan-out.

Backend scope only; frontend picker change is a follow-up PR.

Changes:
- 096_autopilot_squad_assignee migration: drop agent FK on
  autopilot.assignee_id, add assignee_type column (default 'agent'),
  add autopilot_run.squad_id attribution column.
- service.AgentReadiness: single source of truth for archived /
  runtime-bound / runtime-online checks. Shared by autopilot
  admission gate, run_only dispatch, and isSquadLeaderReady.
- service.resolveAutopilotLeader: translates assignee_type/id to the
  agent that actually runs the work.
- dispatchCreateIssue: stamps issue with assignee_type='squad' for
  squad autopilots and enqueues via EnqueueTaskForSquadLeader.
- dispatchRunOnly: belt-and-braces readiness re-check after resolving
  squad → leader so a leader that went offline between admission and
  dispatch produces a clean failure instead of a doomed task.
- handler.CreateAutopilot / UpdateAutopilot: accept assignee_type with
  squad/agent existence + leader-archived validation. Backward-compatible
  default of "agent" preserves the contract for older clients.
- Analytics: AutopilotRunStarted/Completed/Failed events carry
  assignee_type and squad_id; PostHog can now group autopilot runs by
  squad without joining back to the autopilot row.

Co-authored-by: multica-agent <github@multica.ai>

* fix(autopilot): reject archived squads, route post-admission skips, cleanup dangling-agent autopilots (MUL-2429)

Addresses three review findings on PR #2888:

1. Archived squad handling: validateAutopilotAssignee now rejects squads
   with archived_at set; resolveAutopilotLeader returns errSquadArchived
   so the admission gate fails closed; DeleteSquad now mirrors the issue
   transfer for autopilot rows (TransferSquadAutopilotsToLeader) so
   surviving autopilots flip to assignee_type='agent' (leader) instead
   of dangling at the archived squad.

2. dispatchRunOnly post-admission readiness: introduces errDispatchSkipped
   sentinel, recognised by DispatchAutopilot via handleDispatchSkip so
   the run is recorded as `skipped` (not `failed`). Manual triggers no
   longer 500 when the leader's runtime goes offline between admission
   and task creation. New TestManualTriggerDoesNotErrorOnPostAdmissionSkip
   locks the behaviour in.

3. Dangling agent assignee after migration 096 dropped the FK:
   shouldSkipDispatch now distinguishes pgx.ErrNoRows / errSquadArchived
   (hard skip — retrying won't help) from transient DB errors
   (fail-open). DeleteAgentRuntime pauses autopilots that target agents
   about to be hard-deleted (ListArchivedAgentIDsByRuntime +
   PauseAutopilotsByAgentAssignees) so the breakage surfaces as a paused
   row in the UI instead of a quiet skip-burning loop.

Unit tests cover the sentinel unwrap contract and errSquadArchived
errors.Is behaviour. Integration test
TestAutopilotDispatchSkipsWhenRuntimeOffline re-verified against a fresh
DB with migration 096 applied.

Co-authored-by: multica-agent <github@multica.ai>

* fix(autopilot): bump last_run_at on post-admission skip (MUL-2429)

Match recordSkippedRun (pre-flight skip) and the success path so the
scheduler / "last seen" UI both reflect that this tick evaluated the
trigger, even when the post-admission readiness gate caught a late
regression.

Addresses Emacs review caveat #1 on PR #2888.

Co-authored-by: multica-agent <github@multica.ai>

* feat(autopilot): mixed agent/squad assignee picker in dialog (MUL-2429)

End-to-end UI for assigning an autopilot to a squad. Closes the PR #2888
backend gap: the squad-as-assignee feature was already wired in Go (Path A,
RFC §4) but the desktop dialog never offered the choice.

- core/types/autopilot: add `AutopilotAssigneeType`, surface
  `assignee_type` on `Autopilot` + Create/Update request payloads.
- views/autopilots/pickers/agent-picker: switch to a polymorphic
  AssigneeSelection (`{type, id}`); render agents and squads as two
  grouped sections with shared pinyin search.
- views/autopilots/autopilot-dialog: maintain `assigneeType` state, send
  it on create/update, render the trigger avatar / hover dot with
  `assignee.type`.
- views/autopilots/autopilots-page + autopilot-detail-page: render the
  assignee row using `autopilot.assignee_type` so squad-typed autopilots
  show the squad avatar + name, not a broken agent lookup.
- locales: add `agents_group` / `squads_group` / `select_assignee` keys
  (en + zh-Hans), keep legacy `select_agent` for callers that still
  reference it.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Lambda <lambda@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-20 05:30:13 +02:00
Jiayuan Zhang
e48f6a84d6 feat(github): expose read-only installation list to workspace members (MUL-2413) (#2886)
* feat(github): expose read-only installation list to workspace members (MUL-2413)

Relax `GET /api/workspaces/{id}/github/installations` from owner/admin-only
to any workspace member so the Settings → Integrations tab no longer renders
blank for non-admins (the original symptom of MUL-2413).

The handler now reads the caller's role from the workspace middleware:
- owner / admin keep the full row including the numeric `installation_id`
  (the connect / disconnect handle) and receive `can_manage: true`.
- every other role (member / guest) receives rows with `installation_id`
  omitted and `can_manage: false`, giving them visibility into "is GitHub
  wired up?" without the management handle.

`GET /github/connect` and `DELETE /github/installations/{id}` stay under
the admin/owner middleware group — this PR only relaxes the read path.

Tests: `TestListGitHubInstallations_RoleGating` exercises admin, owner,
member, and guest paths against the real DB-backed handler fixture and
asserts the field stripping + `can_manage` contract.

Refs: MUL-2413
Co-authored-by: multica-agent <github@multica.ai>

* fix(github): redact installation_id from realtime broadcasts (MUL-2413)

GET /github/installations strips the numeric installation_id for non-admin
members, but the github_installation:created / uninstall / suspend WS
events were still publishing it, so the same handle was reachable from
any workspace client subscribed to the workspace scope. Broadcast both
payload variants without it — the frontend uses these events only to
invalidate the installations query, so admins re-query the list endpoint
to recover the management handle.

Also adds a router-level test that mounts the production middleware split
(member-visible list vs. owner/admin connect+delete) so a future routing
change can't silently widen the write surface.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Lambda <lambda@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-20 04:17:45 +02:00
Jiayuan Zhang
2ad1cd8ff8 feat(profile): user profile description injected into agent brief (MUL-2406)
## Summary

Adds per-user `profile_description` so coding agents have cheap, durable context about who is asking. v1 per the brief Xeon locked in on [MUL-2406](mention://issue/63a7247c-4f6a-42cf-90d1-7c746e77158a):

- **DB** — `user.profile_description TEXT NOT NULL DEFAULT ''` (migration 096). 2000-rune cap enforced server-side. No nullable / privacy state to manage.
- **API** — `PATCH /api/me` accepts the field; `UserResponse` always emits it. Client wraps `updateMe` in a lenient `UserSchema` + `EMPTY_USER` fallback per CLAUDE.md API Response Compatibility.
- **UI** — Settings → Account gains an "About you" textarea with live `n/2000` counter, `maxLength` guard, and a localized too-long error (EN + zh-Hans).
- **CLI** — `multica user profile get` / `multica user profile update` with `--description / --description-stdin / --description-file / --clear`, mirroring the existing `issue comment add` input-mode menu.
- **Daemon injection** — claim handler resolves the runtime owner and stamps `requesting_user_name` + `requesting_user_profile_description` on the task. `buildMetaSkillContent` emits `## Requesting User` between `## Agent Identity` and `## Available Commands`, blockquoted and framed as background context. The block is omitted entirely when the description is empty (no token cost when unused).

Brief is written **once per task** via `CLAUDE.md` / `AGENTS.md`, not the per-turn prompt — same path the agent already reads for identity, so no extra per-turn cost.

## Test plan

- [x] `go build ./...`, `go vet ./...`, `go test ./internal/cli/ ./internal/daemon/ ./internal/daemon/execenv/ ./cmd/multica/`
- [x] New brief tests: `TestBuildMetaSkillContentEmitsRequestingUser`, `TestBuildMetaSkillContentOmitsRequestingUserWhenEmpty`
- [x] `pnpm typecheck`, `pnpm lint`, `pnpm test` (74 files, 644 tests pass)
- [ ] Handler DB tests (`TestUpdateMe*`) require a migrated test DB — not runnable in this sandbox
- [ ] Manual: open Settings → Account, set a description, confirm the next daemon-run agent's `CLAUDE.md` shows `## Requesting User`
2026-05-19 19:51:28 +02:00
Jiayuan Zhang
591e47842d refactor(onboarding): remove starter-content kit; unify install-runtime issue across mark-onboarded paths (MUL-2438) (#2884)
* refactor(onboarding): remove starter-content kit, unify install-runtime issue across mark-onboarded paths (MUL-2438)

Drops the post-onboarding ImportStarterContent / DismissStarterContent
flow (handler + routes + StarterContentPrompt + templates + locale
strings + analytics event). The bug — web onboarding seeding 6+ starter
issues without a runtime — only existed through that path; with it gone
the source disappears.

The "install a runtime" issue from BootstrapOnboardingNoRuntime is now
the canonical no-runtime onboarding seed. The title/description and a
LockAndFindActiveDuplicate-deduped seeder move to
handler/no_runtime_issue.go, and CompleteOnboarding / CreateWorkspace /
AcceptInvitation seed it whenever the workspace has no runtime yet, so
every mark-onboarded entry point lands the user on a concrete next
step.

starter_content_state column is kept and continues to be claimed as
'imported' in all five entry points so older desktop builds (which
still render the legacy dialog on NULL) don't surface it to accounts
created after this change.

Co-authored-by: multica-agent <github@multica.ai>

* fix(onboarding): backfill starter_content_state for in-window NULL users (MUL-2438)

054 only covered pre-feature users. Anyone onboarded between then and the
starter-content kit removal could still sit at NULL, and old desktop
clients gate the legacy StarterContentPrompt on `starter_content_state
IS NULL`. The import/dismiss routes are gone, so leaving these rows NULL
would surface a dialog whose buttons 404. Mark them 'imported' to match
the new helper's claim semantics.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Lambda <lambda@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-19 18:37:48 +02:00
Bohan Jiang
f120e0ef43 refactor(cli): tidy workspace subtree (MUL-2386) (#2866)
- Drop `workspace current`; `workspace get` (no args) already prints the
  current default workspace, so the two were doing the same thing.
- Rename `workspace members` to `workspace member list` to free up the
  `member` namespace for future `add` / `remove` subcommands and align
  with the rest of the CLI's `<resource> <verb>` shape.
- Add `--full-id` to `workspace list`, matching `project list`,
  `autopilot list`, and friends.

Docs and the daemon prompt are updated to match.

Co-authored-by: multica-agent <github@multica.ai>
2026-05-19 17:54:21 +08:00
Jiayuan Zhang
6f21cb8f3e [codex] Simplify onboarding runtime bootstrap (#2836)
* feat(onboarding): simplify runtime bootstrap

* fix(onboarding): close private-helper reuse hole and guide-issue nav race

- server: when bootstrap looks for an existing Multica Helper, require
  Visibility="workspace" so a private helper owned by another member
  can't be auto-assigned to the onboarding issue (and trigger a task as
  that private agent), which would have bypassed canAccessPrivateAgent.
- web onboarding page: refreshMe() inside bootstrap flips hasOnboarded
  before onComplete fires, letting the guard's router.replace overtake
  onComplete's router.push to the new guide issue. Mark the page as
  "completing" right before navigating so the guard stays silent during
  the in-flight transition.

Co-authored-by: multica-agent <github@multica.ai>

* fix(runtimes): escape daemon command literals to satisfy i18next/no-literal-string

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
Co-authored-by: Lambda <lambda@multica.ai>
2026-05-19 09:52:35 +02:00
Bohan Jiang
b5102eb3d2 feat(cli): add workspace switch + current commands (MUL-2386) (#2838)
`multica workspace switch <id|slug>` is the product-semantic entry point for
changing the default workspace on the current profile. It looks the target up
in the user's accessible workspace list (an access check by construction —
the server only returns workspaces the user is a member of), persists the
chosen UUID via the existing CLI config layer, and prints the resolved name.
`config set workspace_id` stays as the low-level escape hatch.

`multica workspace switch` resolves the workspace before saving, so an
unknown id or slug fails fast and leaves the previous default intact.

`multica workspace current` and a `*` marker in `multica workspace list`
expose which workspace commands without --workspace-id/MULTICA_WORKSPACE_ID
will target. `multica login` reuses the same marker when listing discovered
workspaces and points multi-workspace users at switch.

Docs gain a "Working with multiple workspaces" section spelling out the
resolution priority (--workspace-id flag > env > profile default) and
calling out config set workspace_id as low-level.

Addresses GitHub#2750.

Co-authored-by: multica-agent <github@multica.ai>
2026-05-19 14:43:20 +08:00
Bohan Jiang
6f5fbb7813 feat(comments): thread-aware list with composite cursor (MUL-2340) (#2787)
* feat(comments): thread-aware list with composite cursor (MUL-2340)

Adds three optional query params to GET /api/issues/{id}/comments and the
matching `multica issue comment list` flags:

- `thread=<comment-uuid>` resolves the anchor to the thread root via a
  recursive CTE (defends against any future nested replies) and returns
  root + all descendants chronologically. Anchor can be any comment in
  the thread, root or reply.
- `recent=<N>` returns the newest N comments for the issue, ordered
  chronologically in the response.
- `before=<RFC3339>` + `before-id=<uuid>` form a composite cursor for
  stable pagination of `recent`. Both must be set together; a
  timestamp-only cursor is rejected because ties on `created_at` would
  let the existing `(created_at ASC, id ASC)` total order skip or
  duplicate rows across pages.

Flag combination rules: `thread` is exclusive with `recent` and the
cursor; both may combine with `since`. Server and CLI enforce the same
matrix; the CLI fails fast locally so callers don't pay for a 400
round-trip.

Default behaviour (no params) is unchanged — full chronological dump
capped at commentHardCap — so the desktop UI and existing `--since`
polling are untouched. Agent prompt updates land in a follow-up PR so
the new CLI capabilities ship and bake first.

Co-authored-by: multica-agent <github@multica.ai>

* fix(comments): reject cursor without recent and align CLI/server on invalid --recent (MUL-2340)

Elon's PR #2787 second review flagged two gaps in the flag combination
matrix:

- server: GET /comments?before=...&before_id=... without `recent` was
  silently dropped by fetchCommentsForList (RecentN=0 fell through to
  the default / since path), so callers got the full timeline instead
  of the documented "before X" semantics. Now returns 400.
- CLI: --recent 0 / --recent -3 were collapsed with "flag not passed"
  by `recent > 0`, so an explicit invalid value silently fell back to
  the default list. Switched to Flags().Changed("recent") so explicit
  non-positive values fail loudly. Also enforces that --before /
  --before-id only appear with explicit --recent (mirrors the new
  server-side rule).

Tests:
- server flag matrix gains `before + before_id without recent → 400`.
- CLI gains TestRunIssueCommentListFlagGuards covering `--recent 0`,
  `--recent -3`, cursor-without-recent, and the thread/recent
  exclusivity path under the new Changed()-based check. The mock
  server fatals if a request reaches /comments, proving the guards
  fire before any HTTP round-trip.

Co-authored-by: multica-agent <github@multica.ai>

* feat(comments): make `recent` thread-grouped with a thread cursor (MUL-2340)

Bohan pushed back on the row-based `recent=N` shape: comments form a tree,
not a list, and the newest N rows can come from N unrelated threads, giving
the agent N disjoint conversational tails. Replace the row-based query with
a thread-grouped one before #2787 merges so we never ship the wrong shape:

- `recent=N` now returns the N most recently active threads (root + every
  descendant per thread). A thread's recency is MAX(created_at) across its
  whole subtree, so a stale-but-recently-replied thread outranks an old
  quiet one — exactly the property row-recent loses.
- The cursor is now a *thread* cursor: `before` = a thread's
  last_activity_at, `before_id` = its root comment id. The pair walks
  threads strictly less recent than the page's oldest-active thread. The
  cursor surfaces via `X-Multica-Next-Before` / `X-Multica-Next-Before-Id`
  response headers (empty when there are no older threads); the CLI
  forwards the same pair to stderr after listing.
- Row-based `recent` is gone — there is no internal caller and the prompt
  update has not shipped yet, so there is no compat surface to preserve.
- Response body shape unchanged (flat JSON array, chronological). Default
  and `--since` paths untouched. Desktop UI keeps working.

Tests:
- recent=1 returns the freshest-active thread fully; recent=2 returns both
  with the older-active thread first (oldest-active → freshest tail).
- Stale-but-fresh: a thread whose root is older but has a fresh reply
  outranks a thread whose root is newer but quiet.
- Cursor headers emitted only on full pages; empty on the final page.
- Pagination walks threads root2 → root1 → empty, no skips/duplicates.
- Tie-break: three threads sharing last_activity_at paginate one-at-a-time
  using (last_activity_at, root_id) ordering — verifies the timestamp-only
  cursor failure mode is fixed for the thread case too.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-18 19:28:26 +08:00
Bohan Jiang
eabfb8f3d1 fix(autopilots): reject unknown {{...}} tokens in issue title template (MUL-2370) (#2799)
* fix(autopilots): reject unknown {{...}} tokens in issue title template (MUL-2370)

`--issue-title-template` (and the matching `issue_title_template` API
field) silently kept any placeholder other than `{{date}}` as a literal
string in the rendered issue title — `{{.TriggeredAt}}`, `{{trigger_id}}`,
`${date}`, etc. would all slip through `strings.ReplaceAll` unchanged
because the renderer only knew one token. The flag name and help text
("Template for issue titles (create_issue mode)") and the docs phrasing
("the title supports interpolation like `{{date}}`") both implied a
richer placeholder set existed.

Tightens the contract on three fronts:
- Reject any `{{...}}` token other than `{{date}}` at create/update time
  with `unknown template variable %q; supported: {{date}}` — turns the
  silent-on-trigger surprise into an explicit 400 the moment the user
  sets the template.
- Update CLI flag help on `autopilot create --issue-title-template` and
  `autopilot update --issue-title-template` to spell out that only
  `{{date}}` (UTC, YYYY-MM-DD) is interpolated.
- Update `apps/docs/content/docs/autopilots{,.zh}.mdx` to drop the
  "like `{{date}}`" phrasing for the single supported placeholder.

Adds service-layer tests covering `interpolateTemplate` (substitution,
empty-template fallback, no-placeholder verbatim) and
`ValidateIssueTitleTemplate` (accepts empty / plain / `{{date}}` /
`{{ date }}`; rejects Go-template, Mustache-style, future placeholders
like `{{datetime}}`, and templates that mix one valid and one invalid
token).

Expanding the placeholder set (`{{datetime}}`, `{{trigger_id}}`,
`{{trigger_source}}`) is tracked as a separate enhancement — those
need run/trigger context plumbed into the renderer, which is out of
scope for this bug fix.

Closes #2732

Co-authored-by: multica-agent <github@multica.ai>

* fix(autopilots): render {{ date }} whitespace form too (MUL-2370)

Validator permitted {{ date }} but interpolateTemplate only matched the
exact string {{date}}, so a template that passed create/update could
still emit a literal {{ date }} at trigger time — re-introducing the
silent-literal behaviour the validator was meant to remove.

Route rendering through the same regex as validation so every accepted
form is also a substituted form. Cover {{ date }} substitution in
TestInterpolateTemplate.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-18 18:12:14 +08:00
Jiayuan Zhang
46c1e2c889 feat(squads): show member working status on squad detail page (#2768)
* feat(squads): show member working status on squad detail page

Add a new GET /api/squads/{id}/members/status endpoint that returns each
member's derived working/idle/offline/unstable status, the issues each
agent is currently running, and the last observed activity timestamp.
The Squad detail page's Members tab consumes this snapshot to render a
status pill and an active-issue link next to each agent, with live
refresh wired through the existing task/agent/daemon WS events.

Human members are returned with status=null so the UI can keep them in
the same list without implying a presence signal. Archived agents stay
in the response and surface as offline rather than being filtered out.

Co-authored-by: multica-agent <github@multica.ai>

* fix(squads): address review feedback on member status endpoint

- i18n the "blocked" issue-status pill in squad members tab (was a
  bare literal that failed `i18next/no-literal-string` lint).
- Treat any dispatched/running task as working, even when its
  `agent_task_queue.issue_id` is NULL (chat / quick-create tasks).
  The agent slot is occupied regardless of whether we can render an
  issue link.
- Force `offline` for archived agents so they appear in the list
  but never look like they're still on duty, matching the RFC
  decision in MUL-2319.
- Include `workspaceKeys.squads` in the post-reconnect /
  workspace-switch bulk invalidation so members-status recovers
  after a disconnect during which task/runtime events were missed.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-18 10:35:18 +02:00
Bohan Jiang
2323b72710 feat(autopilots): webhook delivery layer + idempotency/signature/replay (MUL-2334) [PR1] (#2774)
* feat(autopilots): webhook delivery layer + idempotency / signature / replay (MUL-2334)

Splits "inbound webhook receipt" from "autopilot run creation" so we can
record duplicate attempts, signature outcomes, and ignored/skipped
deliveries — and replay a delivery on demand. v1 ingress wrote straight
into autopilot_run.trigger_payload, which collapsed the two concerns and
left run_only autopilots vulnerable to provider retry storms.

Backend only (PR1). UI Deliveries tab follows in PR2.

Schema (migration 093):
  - autopilot_trigger.provider: 'generic' | 'github' (default 'generic').
  - autopilot_trigger.signing_secret: nullable plaintext (HMAC needs it
    cleartext; mirrors how webhook_token is stored).
  - webhook_delivery: one row per inbound POST. Carries raw_body,
    selected_headers, dedupe_key/source, signature_status,
    autopilot_run_id, replayed_from_delivery_id, response_status / body.
  - Partial unique index on (trigger_id, dedupe_key) excludes NULL and
    'rejected' rows, so a wrong-secret 401 does NOT permanently block a
    future retry with the same X-GitHub-Delivery once the operator fixes
    the secret.

Ingress flow (autopilot_webhook.go), persist-first + sync dispatch:
  1. IP rate limit -> 2. token lookup -> 3. token rate limit ->
  4. read raw body -> 5. autopilot/workspace cross-check ->
  6. normalize JSON (400 without persistence on parse failure) ->
  7. compute dedupe key + signature status ->
  8. INSERT delivery (status=queued). On (trigger_id, dedupe_key)
     unique-violation: bump attempt_count on existing row and return
     the original delivery_id + autopilot_run_id with 200 ->
  9. invalid/missing signature: UPDATE -> rejected, return 401 with
     delivery_id (no dispatch, not replayable) ->
 10. trigger disabled / autopilot paused/archived: UPDATE -> ignored,
     return 200 ->
 11. DispatchAutopilot synchronously, UPDATE -> dispatched/skipped/failed
     with autopilot_run_id and the response body we returned ->
 12. TouchAutopilotTriggerFiredAt and return 200.

No new long-running worker. A stale 'queued' row only happens if the
process dies between INSERT and UPDATE; that's a follow-up sweeper, not
this PR.

Authenticated API:
  - GET    /api/autopilots/{id}/deliveries (slim list)
  - GET    /api/autopilots/{id}/deliveries/{deliveryId} (with raw_body)
  - POST   /api/autopilots/{id}/deliveries/{deliveryId}/replay -> creates
    a new delivery row (replayed_from_delivery_id set), dispatches a
    new run, never collapses onto the original via dedupe.
  - PUT    /api/autopilots/{id}/triggers/{triggerId}/signing-secret
    Write-only; trigger response surfaces has_signing_secret +
    signing_secret_hint (last 4 chars), never the secret itself.

Signature verification reuses the GitHub-compatible
X-Hub-Signature-256: sha256=<hex(hmac(body, secret))> scheme; the
HMAC helper is constant-time. Invalid/missing signatures still count
against per-IP and per-token rate limits.

autopilot_run.trigger_payload is intentionally preserved — delivery
records the HTTP receipt; run records the normalized envelope handed
to the agent. They are two different views.

Tests (Postgres-backed):
  - delivery persistence on accept
  - dedupe via Idempotency-Key and X-GitHub-Delivery; run_only retry
    storm pin (3 retries -> 1 run)
  - invalid signature: 401 + rejected row + no run linkage
  - missing signature when secret configured: 401 + 'missing' state
  - valid signature dispatches
  - signing secret never echoed in trigger responses; hint shows last 4
  - min-length and clear-by-empty for signing secret PUT
  - replay creates a NEW delivery + new run; rejected deliveries cannot
    be replayed
  - list omits raw_body; detail includes it; cross-autopilot ID returns
    404 (workspace isolation defense in depth)
  - provider validation: unknown -> 400, github -> 201 round-trips
  - bad-signature stream still counts against per-token rate limit

Co-authored-by: multica-agent <github@multica.ai>

* fix(autopilots): address PR review on webhook delivery layer (MUL-2334)

- Exclude `failed` from the (trigger_id, dedupe_key) partial unique index
  alongside `rejected`, so a transient ingress failure does not strand the
  provider's stable X-GitHub-Delivery / Idempotency-Key retry. Update the
  dedupe lookup to prefer non-terminal rows under the same predicate.
- Tighten delivery status enum: drop `skipped` from the CHECK constraint
  and from the handler. A run that was admission-skipped (e.g. runtime
  offline) is now recorded as delivery=`dispatched` linked to the
  skipped run, with the response payload carrying status=`skipped`.
  Source of truth for skipped-ness is autopilot_run.status, not the
  delivery row — keeps the Deliveries UI enum unambiguous.
- On dispatch error, link the (possibly non-nil) autopilot_run returned
  by DispatchAutopilot to the failed delivery so Deliveries UI can
  navigate to the run row for debugging.
- Slim list projection: ListWebhookDeliveriesByAutopilot no longer pulls
  raw_body / selected_headers / response_body — a 100-row page × 256 KiB
  would otherwise round-trip ~25 MiB from Postgres per Deliveries reload.
  Detail endpoint continues to return the full row.
- Fix backend CI: TestGetDelivery_ReturnsFullPayload now decodes the
  response and asserts on the parsed raw_body instead of substring-
  matching against an escaped JSON string; raise the test-suite default
  webhook rate limits in TestMain so the shared 192.0.2.1 IP bucket
  doesn't fill across the suite and leak 429s into unrelated tests.
- Add regression coverage for the dedupe-after-failure path.

cd server && go test ./... is green locally.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-18 14:59:40 +08:00
Zohar Babin
15152c6ccd feat(auth): cache workspace membership for daemon heartbeat path (MUL-2247) (#2638)
* feat(auth): cache workspace membership for daemon heartbeat path

Cache workspace membership existence (not role) in Redis to eliminate a
DB round-trip on every PAT-authenticated daemon heartbeat. Follows the
existing PATCache nil-safe pattern.

Key design decisions per reviewer feedback:
- Cache existence only (sentinel "1"), not role string. Authorization
  decisions that depend on role always hit the DB directly. This
  eliminates the cache-aside race where a stale elevated role could
  persist after a downgrade.
- Proactive invalidation on UpdateMember, DeleteMember, LeaveWorkspace,
  and DeleteWorkspace (iterates members before cascade delete).
- 5 min TTL. Combined with PATCache (10 min), worst-case revocation
  delay is max(10m, 5m) = 10 min — consistent with original PATCache
  design decision.

Limitations:
- Non-members still hit DB on every request (negative caching not
  implemented — the scenario is rare for daemon endpoints which require
  valid workspace-scoped tokens).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* test(auth): drive membership cache invalidation through real handlers

- TestRequireDaemonWorkspaceAccess_CacheHit now uses a ghost user with no
  member row, so the only path to a granted access is the cache short-circuit.
  Without priming the cache the access check must fail; with priming it must
  succeed. A future change that bypasses the cache would fail the second
  assertion.
- Replaces the cache-only InvalidatedOnMemberRemoval test (which only
  re-exercised the auth-package primitive) with four handler-driven tests
  that exercise DeleteMember, UpdateMember, LeaveWorkspace and
  DeleteWorkspace via their real HTTP handlers. Each test prepares a real
  member, primes the cache, calls the handler, and asserts the cache entry
  is gone — so a refactor that drops one of the Invalidate(...) calls in
  workspace.go will fail CI.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
Co-authored-by: Jiang Bohan <bhjiang@outlook.com>
2026-05-18 13:30:35 +08:00
Zohar Babin
e50bfc88da fix(auth): add per-IP rate limiting on public auth endpoints (#2636)
Adds a Redis-backed fixed-window rate limiter middleware on /auth/send-code,
/auth/verify-code, and /auth/google. Prevents brute-force enumeration,
verification_code table flooding, and connection pool exhaustion from
rapid-fire unauthenticated requests.

Key design decisions per reviewer feedback:

- X-Forwarded-For trust model: XFF is NEVER trusted by default. Only
  honored when RemoteAddr is from a CIDR in RATE_LIMIT_TRUSTED_PROXIES.
  Uses rightmost-untrusted algorithm (walks XFF right-to-left, returns
  first non-trusted IP). Matches the project's conservative model in
  health_realtime.go.

- Atomic INCR+EXPIRE via Lua script: prevents a stuck key (permanent
  ban) if EXPIRE fails independently. Follows existing Lua script
  pattern in runtime_local_skills_redis_store.go.

- Fixed-window counter (not sliding-window): simple, adequate for auth
  rate limiting where precision at window boundaries is acceptable.

- Fail-open with startup warning: nil Redis disables rate limiting
  (same as PATCache), but logs a warning at startup so ops can see.

- IPv6 normalization: net.ParseIP().String() produces canonical form.

- Configurable via env vars: RATE_LIMIT_AUTH (default 5/min),
  RATE_LIMIT_AUTH_VERIFY (default 20/min), RATE_LIMIT_TRUSTED_PROXIES.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-18 12:59:28 +08:00
Kerim Incedayi
9418d2a2c1 feat(autopilots): webhook triggers (server + CLI + UI + docs) MUL-2049 (#2348)
* feat(server): add webhook trigger DB migration + sqlc queries

Lays the foundation for webhook autopilot triggers:
- partial unique index on autopilot_trigger.webhook_token (kind=webhook only)
  so the public ingress route can resolve a trigger in O(1)
- GetWebhookTriggerByToken / TouchAutopilotTriggerFiredAt /
  RotateAutopilotTriggerWebhookToken / SetAutopilotTriggerWebhookToken
  queries, regenerated with sqlc

* feat(server): webhook token generator + payload normalizer

Two pure helpers for the webhook autopilot work:
- generateWebhookToken: 32 random bytes -> base64-url, "awt_" prefix.
  256 bits of entropy keeps brute-force off the table; the prefix makes
  leaked tokens recognisable in logs.
- normalizeWebhookPayload: turns arbitrary JSON into the WebhookEnvelope
  shape (event/eventPayload/request) used by trigger_payload. Header- and
  body-based event inference covers GitHub, GitLab, X-Event-Type, and
  caller-provided envelopes; scalar/empty/invalid bodies are rejected so
  the handler can answer 400.

* feat(server): generate webhook tokens and expose rotate endpoint

- New handler.Config.PublicURL fed by MULTICA_PUBLIC_URL env so
  /api/autopilots/.../triggers responses can include an absolute
  webhook_url alongside the always-present webhook_path.
- CreateAutopilotTrigger now mints a webhook_token via crypto/rand
  for kind=webhook and ignores cron/timezone for non-schedule kinds.
  api triggers stay accepted-but-inert per PLAN.md.
- New POST /api/autopilots/{id}/triggers/{triggerId}/rotate-webhook-token
  protected by the existing workspace auth group; old tokens stop
  working immediately because the unique-index lookup keys on the
  current row value.

* feat(server): public webhook ingress route + per-token rate limiter

- New POST /api/webhooks/autopilots/{token} route, mounted outside the
  authenticated group: the path token is the credential. Workspace
  context is derived from the joined autopilot row, never headers.
- Body capped at 256 KiB via http.MaxBytesReader; oversized payloads
  return 413 mid-read instead of being fully buffered.
- Disabled triggers / paused / archived autopilots return
  200 {"status":"ignored"} so providers stop retrying.
- Skipped-runtime dispatches surface 200 {"status":"skipped"} with the
  reason from the autopilot service's pre-flight admission check.
- WebhookRateLimiter interface with sliding-window in-memory + Redis
  Lua-script implementations. Default 60 req/min per token. Test
  coverage on the in-memory path; Redis variant fails open on cache
  errors so a Redis hiccup never blocks ingress.
- Integration tests exercise token generation, dispatch, payload
  envelope persistence, GitHub-header inference, paused/disabled
  short-circuits, oversized rejection, and rotate-then-old-token-404.

* feat(server): include webhook payload in create_issue description

When an autopilot run is triggered by a webhook and execution_mode is
create_issue, the agent only sees the issue body — never the run's
trigger_payload. Append a 'Webhook event:' line and a fenced JSON block
with the normalized eventPayload so the agent has the inbound context
inline. Schedule / manual runs are unchanged.

Tests cover:
  - schedule path keeps existing italic note, no webhook block
  - webhook path emits event line + payload block, italic before block
  - non-envelope JSON falls back to raw body (defensive)
  - non-webhook source with payload still gets no webhook block

* feat(core): types, API client and mutations for webhook triggers

- AutopilotRunStatus gains 'skipped' so the run-list UI handles the
  admission-skipped state explicitly instead of falling through to a
  generic case (the backend already emits it via MUL-1899).
- AutopilotTrigger picks up optional webhook_path / webhook_url. Both
  are optional so older self-hosted servers that pre-date this change
  still parse cleanly.
- buildAutopilotWebhookUrl helper composes a usable absolute URL with
  the priority webhook_url > apiBaseUrl + path > origin + path > path.
  Tested with seven cases covering each branch.
- ApiClient.rotateAutopilotTriggerWebhookToken posts to
  /api/autopilots/{id}/triggers/{triggerId}/rotate-webhook-token; the
  HTTP-contract test pins URL + method.
- useRotateAutopilotTriggerWebhookToken mutation invalidates
  autopilotKeys.detail on settle, mirroring the existing trigger-mutation
  pattern.

* feat(views): webhook trigger UI in Add Trigger dialog and trigger row

Add Trigger dialog gains a Schedule/Webhook segmented toggle:
  - Schedule reuses TriggerConfigSection unchanged.
  - Webhook hides the cron config and shows a help line; the trigger is
    created with kind=webhook and the URL is generated server-side.
  - Toast text differentiates schedule vs webhook on success.

TriggerRow grows a webhook branch:
  - Webhook icon, kind translated via trigger_kind.
  - URL shown in a truncating monospace pill, with copy + rotate
    buttons. Copy uses navigator.clipboard with toast feedback; rotate
    uses an AlertDialog confirm because the old URL stops working
    immediately.
  - api triggers render a Deprecated badge and skip URL/copy/rotate
    affordances.

RunRow gains a 'skipped' RUN_VISUAL entry (muted dash) so admission-
skipped runs don't fall through to a generic case. Source label uses the
new run_source i18n key instead of capitalize.

Locales: en + zh-Hans gain run_status.skipped, run_source.*,
trigger_kind.*, trigger_row.{copy_url,rotate_url,*_confirm_*,toast_*},
add_trigger_dialog.{type_*,webhook_help,toast_added_{schedule,webhook}}.

* feat(cli): support webhook trigger creation and URL rotation

- multica autopilot trigger-add now takes --kind schedule|webhook
  (default schedule for backward compatibility). For webhook it skips
  --cron / --timezone validation and prints the resulting webhook URL,
  preferring the server-provided webhook_url and falling back to
  client.BaseURL + webhook_path.
- New multica autopilot trigger-rotate-url <autopilot-id> <trigger-id>
  command for rotating the bearer URL of a webhook trigger.

* docs(autopilots): add webhook trigger guide (en + zh)

Replaces the 'Webhook and API triggers are not available yet' section
with end-to-end webhook documentation: how the URL is generated, what
payload shapes are accepted, the inferred-event rules, the bearer-secret
warning + rotate flow, status-code semantics for accepted/skipped/
ignored/4xx/5xx outcomes, and the MULTICA_PUBLIC_URL self-host
configuration.

Run history list now mentions skipped status. The 'unavailable
features' section narrows to api-kind triggers, HMAC signing, IP
allowlists, and provider presets.

* feat(views): add Schedule/Webhook toggle to the create autopilot dialog

Closes the gap where a brand-new autopilot could only be created with a
schedule trigger. The right-column config now has a Trigger section
with a segmented Schedule/Webhook control:
  - Schedule keeps the existing cron/timezone UI.
  - Webhook hides the cron UI and shows a help line; on submit, a
    kind=webhook trigger is created right after the autopilot.

In edit mode the toggle is intentionally hidden (PLAN.md treats trigger-
type changes as delete-old + create-new, not in-place updates), but the
panel still picks the right kind based on props.triggers[0].kind so a
webhook autopilot doesn't render an irrelevant cron form.

Locales: section_trigger_kind, trigger_kind_{schedule,webhook},
section_webhook, webhook_help_{create,edit} added in en + zh-Hans.

* feat(views): show webhook URL inline after creating a webhook autopilot

After a successful create with kind=webhook, the dialog stays open and
swaps to a confirmation panel showing the freshly minted URL with a
copy button + 'Treat this URL like a password' warning + Done button.
Avoids the friction of "create the autopilot, then go find it in the
list, click in, scroll to triggers, copy URL."

Locales: dialog.webhook_created_{title,description,warning,done} added
in en + zh-Hans.

Schedule create flow is unchanged (toast + close). The success panel is
gated on the trigger returned from the create mutation, so a partial
failure (autopilot created, trigger creation errored) still falls
through to the toast_create_partial path.

* feat(views): show webhook payload in run detail dialog

The agent transcript dialog now accepts an optional headerSlot that
sits above the event list. The autopilot RunRow drops a
WebhookPayloadPreview into that slot when the run came from a webhook
and trigger_payload is non-empty.

The preview is collapsed by default (the transcript itself is the main
event), shows the inferred event name + receivedAt in the header, and
reveals the eventPayload as pretty-printed JSON with a copy button on
expand. Falls back gracefully if the row's trigger_payload doesn't
match the WebhookEnvelope shape — the whole value is shown instead so
nothing is hidden.

Closes the "agent didn't echo the payload, now I can't see what
triggered the run" gap. PLAN.md tracked this as
"Payload preview in run history" under follow-ups.

Locales: webhook_payload.{label, unknown_event, payload, content_type,
copy, copied, copied_short, copy_failed} added in en + zh-Hans.

* chore(server): wire MULTICA_PUBLIC_URL through self-host compose

Two small follow-ups split out of the webhook trigger PR:

- docker-compose.selfhost.yml passes MULTICA_PUBLIC_URL into the
  backend container so a self-hosted deployment behind a real domain
  gets absolute webhook URLs in the trigger response. Documented in
  .env.example with the rationale for not deriving the public host
  from request headers.
- Drop a duplicated 'invalid json:' prefix in the webhook ingress
  400 error path. normalizeWebhookPayload already prefixes its
  errors, so the handler doesn't need to re-prefix.

* fix(migrations): renumber webhook trigger migration 081 → 089 to avoid collision

The branch's 081_autopilot_webhook_triggers.{up,down}.sql collided
numerically with 081_runtime_timezone.{up,down}.sql that landed on
main, making migration apply order undefined. Renumber to 089 so the
file slots after the latest main migration (088_squad_instructions).

The SQL itself doesn't conflict — it only creates a partial unique
index on autopilot_trigger.webhook_token — but the duplicate prefix
is what the migration runner sees, so the filename must move.

* fix(autopilot-webhook): address PR review blocking issues

- Redact bearer tokens from request logs: paths matching
  /api/webhooks/autopilots/<token> now log "[redacted]" instead of the
  token. The resolved trigger ID is plumbed via context so audit lines
  stay useful for debugging. (Review item Blocking #1.)
- Distinguish pgx.ErrNoRows from transient DB errors in token lookup:
  no-row stays 404 (so providers don't retry on a deleted webhook),
  other errors return 500 (which providers DO retry, avoiding silent
  drops on DB blips). (Review item Blocking #2.)
- Add per-IP sliding-window rate limiter that runs BEFORE the token
  lookup, so spraying random tokens can no longer probe the
  autopilot_trigger index unboundedly. Reuses the existing Lua script
  with a separate Redis key namespace; falls open on Redis errors.
  Default budget 30 req/min/IP. (Review item Blocking #3.)

The webhook handler now applies the gates in the order: per-IP rate
limit → token lookup → per-token rate limit → handler logic.

* fix(autopilot): atomic webhook trigger creation + strict kind/timezone validation

- Mint the webhook bearer token BEFORE the INSERT and pass it via
  CreateAutopilotTriggerParams so the row never exists in a half-written
  kind=webhook + webhook_token=NULL state. On the (vanishingly rare)
  unique-index collision the whole INSERT is retried with a fresh token
  — no UPDATE second step. Removes the now-dead attachFreshWebhookToken
  helper. (Review item Recommended #4.)
- Add new GET /api/autopilots/{id}/runs/{runId} endpoint that returns a
  single run including the full trigger_payload. The list response is
  now slim (omits trigger_payload) so worst-case payload size drops
  from ~5 MB to ~5 KB. (Review item Recommended #5, server side.)
- Reject kind=api with 400 ("kind=api is deprecated; use schedule or
  webhook") and reject kind=webhook with --timezone with 400 — both
  surfaces stragglers loudly instead of silently dropping fields.
  CLI mirrors the check so --timezone with --kind webhook errors
  client-side. (Review nits.)
- Add --yes (-y) flag and an interactive y/N confirmation prompt to
  `multica autopilot trigger-rotate-url` so the destructive rotate
  matches the UI's AlertDialog safety. (Review item Recommended #6.)

* fix(views): fetch webhook payload on-demand and truncate at 4 KiB

- Add useAutopilotRun query hook + getAutopilotRun API client method
  paired with the new server endpoint. The run-detail dialog now mounts
  a WebhookPayloadSlot that fetches the full run (incl. trigger_payload)
  lazily — list responses no longer carry up to 256 KiB × N runs of
  envelope data.
- WebhookPayloadPreview truncates its in-DOM <pre> at 4 KiB with a
  localized marker so jank-y machines aren't asked to render a 256 KiB
  JSON blob. The Copy button still yields the full string.
- Adds the truncated_marker i18n string to en + zh-Hans.

Review items Recommended #5 (frontend) and a nit on the preview's
unbounded <pre>.

* test(autopilot-webhook): close coverage gaps flagged in PR review

- request_logger: redactWebhookPath unit tests + integration test
  proving the bearer token never lands in slog output, plus the
  webhook_trigger_id context plumbing.
- autopilot_webhook_handler: empty body → 400, archived autopilot →
  200 ignored, per-IP rate limiter trips before DB lookup, kind=api
  and webhook+timezone are rejected at 400, slim list + full detail
  endpoint round-trip.
- webhook_rate_limiter: Lua script structure guard (catches reordering
  even without a live Redis), plus live-Redis tests for both per-token
  and per-IP limiters (REDIS_TEST_URL gated, matching the existing
  Redis test pattern in the package).
- WebhookPayloadPreview: envelope rendering, fallback shape, and the
  >4 KiB truncation path with full-payload-on-Copy guarantee.

Two branches are documented as code-review-protected rather than
covered by tests: the 500-on-DB-error path requires injecting a stub
Queries (no interface here), and the cross-workspace defense-in-depth
check is unreachable from valid SQL state.

* fix(middleware): SetWebhookTriggerID must mutate request in place

The round-1 helper returned a fresh *http.Request from WithContext, and
the webhook handler did `r = SetWebhookTriggerID(r, ...)`. That swaps
the handler's local pointer but doesn't propagate the new context back
to RequestLogger, which is still holding the original *http.Request —
so the audit line never actually included webhook_trigger_id in
production. The round-1 test happened to pass because it pre-stashed
the value on the request before calling ServeHTTP, bypassing the bug
it was meant to verify.

Switch to in-place mutation via `*r = *r.WithContext(...)` so the
wrapping middleware sees the new context after next.ServeHTTP returns,
and update the test to exercise the real call pattern (set the context
from inside the handler, assert the surrounding logger reads it).

Verified live: an accepted webhook now logs
  path=/api/webhooks/autopilots/[redacted] webhook_trigger_id=<uuid>

* fix(autopilot-webhook): symmetric ErrNoRows split + trusted-proxy gate

Round-2 review (Bohan-J, PR #2348 follow-up):

- Must-fix #1: the second lookup at autopilot_webhook.go:258
  (GetAutopilot after the token resolves) was folding every error into
  404. A transient DB blip would tell a webhook sender "not found" and
  it would never retry. Apply the same errors.Is(err, pgx.ErrNoRows)
  → 404 / else → 500 split as the first lookup got in round 1.

- Must-fix #2: clientIPForRateLimit was honoring X-Forwarded-For /
  X-Real-IP from any caller. An attacker spraying random tokens could
  just rotate the XFF header and the per-IP bucket became per-request,
  so the limiter that's specifically supposed to gate spraying before
  it hits the DB unique index was bypassed.

  New shape — matches Bohan's suggestion exactly:
  * Default: r.RemoteAddr only, headers ignored.
  * Operator opt-in via MULTICA_TRUSTED_PROXIES (comma-separated
    CIDRs). XFF/X-Real-IP are honored only when r.RemoteAddr is
    inside one of the listed prefixes; otherwise they're dropped.

  Wired through .env.example and docker-compose.selfhost.yml so
  self-host operators can configure their reverse-proxy's CIDR.
  Invalid CIDRs in the env var are dropped with a single slog.Warn at
  startup rather than crashing the server. Uses net/netip (stdlib,
  value-typed) for parsing and containment checks.

Verified live on the rebuilt self-host backend: a 35-request spray
from one source with rotating XFF gets the expected 30× 404 + 5× 429,
proving the per-IP bucket is keyed on the real connection IP.

* fix(autopilot): reject cron/timezone PATCH on non-schedule triggers

Round-2 review should-fix. CreateAutopilotTrigger already 400s on
kind=webhook + timezone/cron_expression, but UpdateAutopilotTrigger
silently wrote those fields regardless of prev.Kind. The values then
sat in the DB visible to nobody and read by nothing — a back door that
left the API contract fuzzy across create vs update.

Mirror the create-path discipline: after loading prev, if prev.Kind
!= "schedule" and the PATCH body sets cron_expression or timezone,
return 400 with a clear message. enabled and label remain accepted on
every kind.

The existing prev.Kind == "schedule" guard on next_run_at recompute
stays as belt-and-braces, but with this gate in place the recompute
branch is now reachable only for the kind it was meant for.

* test(autopilot-webhook): close round-2 coverage gaps

- IPRateLimitNotBypassedByXFFSpoof: drives the must-fix #2 invariant
  by rotating XFF across three calls from the same RemoteAddr and
  asserting the third gets 429. Pre-round-2 this test would have
  passed for the wrong reason (limiter trusted XFF, so per-bucket
  collision was incidental); now it pins the bypass-closed property.
- IPRateLimitReturns429BeforeDBLookup: updated to set RemoteAddr
  explicitly and drop the XFF header it was leaning on. With
  TrustedProxies empty (test default) the limiter keys on the real
  connection IP, which is what the test wants to assert anyway.
- UpdateAutopilotTrigger_RejectsCronExpressionOnWebhookKind +
  UpdateAutopilotTrigger_RejectsTimezoneOnWebhookKind: drive the
  round-2 should-fix from the handler boundary.
- UpdateAutopilotTrigger_AcceptsEnabledAndLabelOnWebhookKind: counter
  test so a regression to a blanket reject is caught.

* fix(migrations): bump webhook trigger migration 089 → 091

origin/main added 089_squad_no_action_activity_index (and 090_task_is_leader)
since our last rebase, re-colliding with our 089_autopilot_webhook_triggers.
Bump to 091 so the filename ordering is unambiguous again. The SQL is
unchanged — same partial unique index on autopilot_trigger.webhook_token —
only the filename moves.

* fix(views): dedupe skipped icon in autopilot RUN_VISUAL after rebase

The rebase against origin/main merged main's add of `Ban` for the
skipped status next to our round-1 `MinusCircle` entry, leaving the
RUN_VISUAL map with two `skipped` keys (only the last would have been
read at runtime, and MinusCircle had been dropped from the imports
during conflict resolution — so the file would not compile).

Keep main's `Ban` icon (latest design) and a single `skipped` entry.
Carry over the round-1 comment about why the muted styling matters
for failure-ratio readability.

---------

Co-authored-by: Kerim Incedayi <kerim.incedayi@digitalchargingsolutions.com>
2026-05-18 12:17:39 +08:00
Bohan Jiang
3645bdb5b6 feat(issues): add start_date field with progressive disclosure (MUL-2274) (#2696)
* feat(issues): add start_date field with progressive disclosure (MUL-2274)

Mirrors the existing due_date implementation end-to-end so an issue can
express a planned start in addition to a deadline. Surfaces start_date as
an optional sidebar property alongside priority / due_date / labels (added
in MUL-2275), with consistent picker, board/list/sort, activity, and inbox
plumbing.

Backs the Project Gantt work (parent MUL-1881) and keeps the
progressive-disclosure attribute experience consistent.

- DB: migration 091 adds issue.start_date TIMESTAMPTZ.
- sqlc: ListIssues / CreateIssue / UpdateIssue / CreateIssueWithOrigin /
  ListOpenIssues read & write start_date.
- Backend: IssueResponse + create/update/batch-update handlers parse and
  emit start_date with RFC3339 validation; new start_date_changed activity
  event + subscriber notification (with prev_start_date in event payload).
- CLI: --start-date flag on `multica issue create` / `issue update`.
- Frontend: StartDatePicker component, start_date wired into Issue type,
  Zod schema, draft / view stores, sort util, header sort + card-property
  options, list-row / board-card display, create-issue modal, and the
  issue-detail progressive-disclosure "+ Add property" surface (visibility
  rule, picker row, add-property menu icon + label).
- i18n: en + zh-Hans for sort_start_date / card_start_date /
  prop_start_date / activity start_date_set / start_date_removed /
  picker start_date.trigger_label / clear_action / inbox labels.
- Tests: new TestNotification_StartDateChanged; existing Issue / draft /
  modal fixtures extended with start_date.

Co-authored-by: multica-agent <github@multica.ai>

* feat(issues): align start_date with due_date in actions menu and CLI table

- Add Start Date submenu (today / tomorrow / next week / clear) in
  actions menu, mirroring Due Date — parity with the Due Date quick
  setters in list/board context and 3-dot menus.
- Add corresponding en / zh-Hans i18n keys
  (actions.start_date / start_today / start_tomorrow / start_next_week
  / start_clear).
- CLI human table for `multica issue list` and `multica issue get`
  now shows a START DATE column next to DUE DATE; --full-id variant
  too.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-17 15:01:38 +08:00
Jiayuan Zhang
380c6b5122 feat(usage): add Time and Tasks to daily-trend toggle (MUL-2283) (#2709)
Extends the workspace /usage page Daily tokens chart toggle from
Tokens | Cost to Tokens | Cost | Time | Tasks, so users see daily
run-time and task-count trends alongside spend without leaving the page.

- New SQL `ListDashboardRunTimeDaily`: per-date totals from
  agent_task_queue (terminal tasks only), scoped to workspace and
  optionally project. Same time anchor as ListDashboardAgentRunTime
  so day boundaries line up.
- New handler GET /api/dashboard/runtime/daily + TanStack Query option.
- New DailyTimeChart (single-series, smart h/m/s unit) and
  DailyTasksChart (completed + failed stacked).
- Empty-state is per-metric so a workspace with tokens but no terminal
  runs (or vice-versa) doesn't get a false "no data".
- i18n: en + zh-Hans daily.metric_time / metric_tasks + titles.

Co-authored-by: multica-agent <github@multica.ai>
2026-05-15 18:51:02 +02:00
Jiayuan Zhang
8e88156356 Add assignee grouping for issue boards (#2693) 2026-05-15 18:44:08 +08:00
iYuan
d8635ad580 fix(issues): prevent duplicate active issue creation (MUL-2225) (#2602)
* fix: prevent duplicate active issue creation

* fix(issues): address duplicate guard review

* fix(autopilot): skip duplicate issue admissions

* fix(issueguard): tighten duplicate lookup edge cases

* test(issues): cover duplicate guard autopilot skips

* feat(autopilots): group skipped runs in history
2026-05-15 18:27:56 +08:00
Bohan Jiang
fcd13aece9 feat(daemon): auto-update CLI when idle (MUL-2100) (#2679)
* feat(daemon): auto-update CLI when idle (MUL-2100)

Add a periodic poller that checks GitHub for a newer multica release
every hour and self-updates when the daemon is idle, reusing the same
brew-or-download upgrade path the Runtimes-page "Update" button already
runs.

- Refactor handleUpdate to call a shared runUpdate(target) helper so
  both server-triggered and auto-triggered upgrades go through the same
  brew detection + atomic replace + restart.
- New autoUpdateLoop gates each tick on: opt-out flag, Desktop launch
  source, dev-build version, an in-flight update, and active tasks. The
  idle gate guarantees we never interrupt a running agent — busy ticks
  silently retry at the next interval.
- Config: MULTICA_DAEMON_AUTO_UPDATE=false to disable (also via
  --no-auto-update), MULTICA_DAEMON_AUTO_UPDATE_INTERVAL to retune the
  poll period.
- IsNewerVersion / IsReleaseVersion helpers in the cli package, with
  tests covering patch/minor/major bumps, dev-describe strings, and
  malformed input.
- Daemon-side tests cover every skip path (updating, active tasks,
  fetch failure, no-newer) plus the success path that fires
  triggerRestart while keeping the updating flag held to the end.

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): close idle race + verify checksum in auto-update (MUL-2100)

Two issues raised in PR #2679 review:

1. The first idle check in tryAutoUpdate only ran before the release-metadata
   fetch, so a poller that won the claim race during the fetch could end up
   handing handleTask a task that triggerRestart was about to cancel via root-
   ctx cancellation. Add a strict claim barrier: runRuntimePoller now
   tryEnterClaim()s before ClaimTask, and tryAutoUpdate flips pauseClaims
   under claimMu only after observing claimsInFlight + activeTasks == 0.
   Pollers that were already mid-claim hold claimsInFlight > 0, so the barrier
   refuses to engage and the update defers to the next tick.

2. The direct-download path replaced the running binary with whatever bytes
   GitHub returned, without checking checksums.txt. Pull the manifest first,
   buffer the archive, and reject on SHA-256 mismatch before extraction. The
   GoReleaser config already publishes checksums.txt; we just consume it.

Also tighten parseReleaseVersion so it stops accepting dev-describe shapes
like "v0.1.13-5-gabcdef0" through the patch trim, matching its docstring.
The auto-update loop already guards on IsReleaseVersion, but the lenient
parser was a footgun and the existing test name even said "not newer" while
asserting the opposite.

Tests:
- TestTryAutoUpdate_DefersWhenClaimInFlightAtBarrier (new race coverage)
- TestTryAutoUpdate_HoldsBarrierAcrossRestart / ReleasesBarrierOnUpgradeFailure
- TestTryEnterClaim_RespectsBarrier
- TestFindChecksumManifestAsset / TestParseChecksumManifest / TestVerifyAssetSHA256
- TestIsNewerVersion: dev-describe cases now expect false (matches docstring)

Co-authored-by: multica-agent <github@multica.ai>

* chore(daemon): default auto-update poll interval to 6h (MUL-2100)

1h was overly chatty for a release that lands at most a few times a week.
Operators who want a different cadence can still set
MULTICA_DAEMON_AUTO_UPDATE_INTERVAL or --auto-update-interval.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-15 18:10:22 +08:00
LinYushen
b7a58c06ac Revert "feat(task): wire claim lease into TaskService and sweeper (MUL-2246) …" (#2673)
This reverts commit bb32be0e50.
2026-05-15 16:06:58 +08:00
LinYushen
bb32be0e50 feat(task): wire claim lease into TaskService and sweeper (MUL-2246) (#2662)
* feat(task): wire claim lease queries into TaskService and sweeper (MUL-2246)

- ClaimTask now uses ClaimAgentTaskWithLease (generates claim_token + lease)
- StartTask accepts optional claim_token for token-verified start
- AgentTaskResponse includes claim_token for daemon to use
- Daemon client sends claim_token in StartTask body
- Sweeper calls RequeueExpiredClaimLeases each tick
- Legacy daemons without claim_token still work (graceful fallback)

Co-authored-by: multica-agent <github@multica.ai>

* fix(task): address PR #2662 review blockers (MUL-2246)

1. ClaimAgentTaskForRuntime: push runtime_id into atomic SQL WHERE clause
   so runtime A cannot claim tasks queued for runtime B under the same agent.

2. Legacy StartAgentTask: add claim_token IS NULL guard so leased rows
   cannot be started without token verification. Handler rejects malformed
   tokens with 400 instead of silently degrading to legacy path.

3. StartAgentTaskWithClaimToken: validate claim_expires_at >= now(),
   preserve claim_token until terminal state (only clear claim_expires_at),
   use CTE + UNION ALL for idempotent retry when daemon resends after a
   lost StartTask response. Return 409 Conflict on token mismatch/expiry.

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): StartTask 409 handling, transport retry, claim_token on FailTask (MUL-2246)

- StartTask 409 (claim superseded): release slot, don't call FailTask
- StartTask transport timeout/5xx: retry once with same token, then
  check task status before failing
- FailTask now sends claim_token; server-side FailAgentTask SQL adds
  AND (claim_token IS NULL OR claim_token = @claim_token) guard so
  stale daemons cannot fail tasks that have been re-claimed

Co-authored-by: multica-agent <github@multica.ai>

* fix(task): close FailTask token bypass and RequeueExpiredClaimLeases liveness gap (MUL-2246)

Blocker 1 - FailTask token validation:
- SQL: change (param IS NULL OR claim_token = param) to
  (param IS NULL AND claim_token IS NULL) OR claim_token = param
  so tokenless requests can only fail legacy (tokenless) rows.
- task.go: malformed claim_token now returns ErrInvalidClaimToken (400)
  instead of being silently dropped to NULL.
- Handler: maps ErrInvalidClaimToken→400, ErrClaimTokenInvalid→409.
- Service: when UPDATE returns no rows but task is still active,
  return ErrClaimTokenInvalid (token mismatch) instead of silent success.

Blocker 2 - RequeueExpiredClaimLeases runtime liveness:
- SQL: JOIN agent_runtime, only requeue tasks where runtime is 'online'.
  Dead/offline runtime tasks stay dispatched for FailTasksForOfflineRuntimes.
- FOR UPDATE → FOR UPDATE OF atq (required with JOIN).

Regression tests:
- task_claim_token_test.go: malformed, tokenless-on-tokened, wrong-token
- requeue_lease_test.go: SQL must JOIN agent_runtime with online filter

Co-authored-by: multica-agent <github@multica.ai>

* fix(task): move expired lease requeue to ClaimTaskForRuntime preflight, add heartbeat freshness backstop (MUL-2246)

- Add RequeueExpiredClaimLeasesForRuntime: per-runtime preflight self-requeue
  in ClaimTaskForRuntime. Runtime proves liveness by actively claiming, so no
  heartbeat check needed.
- Update global RequeueExpiredClaimLeases to require ar.last_seen_at freshness
  (stale_threshold_secs param). Prevents requeuing to a dead runtime in the
  90s gap between lease expiry (60s) and offline detection (150s).
- Add regression tests verifying the heartbeat freshness check and that the
  preflight query does not join agent_runtime.

Co-authored-by: multica-agent <github@multica.ai>

* fix(task): use LivenessStore for global requeue, move preflight before empty-cache (MUL-2246)

Blocker 1: Global RequeueExpiredClaimLeases now uses LivenessStore.IsAliveBatch
to verify runtimes are truly alive before requeuing expired leases. When
LivenessStore is unavailable (no Redis), global requeue is skipped entirely —
the preflight self-requeue in ClaimTaskForRuntime handles live runtimes. This
closes the 60-150s gap where a dead runtime still appears online in DB.

Blocker 2: Moved RequeueExpiredClaimLeasesForRuntime BEFORE EmptyClaim.IsEmpty
fast-path in ClaimTaskForRuntime. Expired leases are now requeued (which bumps
the empty cache via notifyTaskAvailable) before the empty check can
short-circuit the claim path.

Also adds ListRuntimesWithExpiredClaimLeases SQL query and LivenessChecker
interface on TaskService.

Co-authored-by: multica-agent <github@multica.ai>

* fix(task): wire EmptyClaimCache into backend taskSvc for backstop requeue (MUL-2246)

The backend taskSvc used by the sweeper only had Liveness wired but not
EmptyClaim. When global backstop requeue called notifyTaskAvailable,
s.EmptyClaim.Bump() was a nil no-op — the handler's empty-cache was never
invalidated, so the daemon's next claim hit a stale empty verdict.

Fix: wire the same Redis-backed EmptyClaimCache into the backend taskSvc
in main.go (same Redis keys as router.go:139 handler instance).

Add regression test verifying backstop requeue invalidates the handler's
empty-cache.

Co-authored-by: multica-agent <github@multica.ai>

* fix(task): global backstop must not requeue — alive runtimes use preflight, dead stay dispatched (MUL-2246)

- RequeueExpiredClaimLeases is now a no-op (returns 0 always)
- Alive runtimes self-requeue via ClaimTaskForRuntime preflight
- Dead runtimes stay dispatched for FailTasksForOfflineRuntimes
- Rewriting to queued on dead runtime creates 2h blackhole (offline
  sweeper only handles dispatched/running)
- Test actually calls RequeueExpiredClaimLeases and asserts 0 in all cases

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): remove duplicate usage reporting block after merge conflict (MUL-2246)

The merge resolution introduced a second ReportTaskUsage call after the
status check, duplicating the usage-before-early-return block that already
runs right after runner.run. Remove the duplicate and add a regression test
asserting /usage is called exactly once on the normal completion path.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-15 15:15:31 +08:00
Bohan Jiang
a23856bae3 MUL-1624 docs(email): clarify 888888 is opt-in; document SMTP option (#2666)
* docs(email): clarify 888888 is opt-in via MULTICA_DEV_VERIFICATION_CODE; document SMTP option in self-host docs

The startup log line, .env.example, and SELF_HOSTING_ADVANCED.md still
implied that the dev master code 888888 is auto-active whenever
APP_ENV != "production". That has not been true since the master code
was gated behind MULTICA_DEV_VERIFICATION_CODE — the fixed code is
disabled by default and must be opted in explicitly.

Also extend the docs site with the SMTP relay backend added in #1877:
auth-setup, environment-variables, and self-host-quickstart now cover
both Resend and SMTP options in EN and ZH.

Co-authored-by: multica-agent <github@multica.ai>

* docs(email): treat SMTP as an email backend in self-host docs and startup warning

Address review feedback on #2666:

- server: startup warning now fires only when both RESEND_API_KEY and SMTP_HOST
  are empty, since either one is a valid email backend. Otherwise the log
  mis-tells SMTP-only operators that verification codes go to stdout.
- self-host-quickstart (EN/ZH): tell readers to fetch the verification code
  from whichever backend they configured (Resend or SMTP); fall back to
  stdout only when neither is configured.
- auth-setup (EN/ZH): \"without Resend\" → \"without any email backend
  configured\" so the wording stays correct now that SMTP is a first-class
  option.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-15 14:18:46 +08:00
LinYushen
75dc70686b fix(realtime): include actor_type in WS broadcast messages (#2668)
* fix(realtime): include actor_type in WebSocket broadcast messages

The WS broadcast message format was {type, payload, actor_id} but missing
actor_type. This meant the web UI could not distinguish agent from human
operations in real-time events at the top level.

While payload data for comments (author_type) and activities (entry.actor_type)
already included the type, the top-level message did not — causing the web UI
to display agent CLI operations as human operations when relying on the
broadcast actor identity.

Changes:
- server/cmd/server/listeners.go: add actor_type to all broadcast messages
- packages/core/types/events.ts: add actor_type to WSMessage interface
- packages/core/api/ws-client.ts: pass actor_type to event handlers
- packages/core/realtime/hooks.ts: update EventHandler type signature
- packages/core/realtime/provider.tsx: update EventHandler type signature

Fixes MUL-2260

Co-authored-by: multica-agent <github@multica.ai>

* test: add frame-shape unit test asserting actor_type in WS frames

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-15 14:10:24 +08:00
yujiawei
a8ce0a8998 feat(cli): add 'multica issue cancel-task <task-id>' command (#2560)
Exposes the existing /api/tasks/{id}/cancel backend endpoint as a CLI
command. Combined with upstream #2107 (cancel running agent on
server-side task delete), this gives operators a way to interrupt a
runaway agent push-storm without resorting to admin-bypass on the
downstream PR.

Use cases:
- Titan / DevBot iterating beyond its boundary (e.g. push-skip loops)
- Codex turn that locked in tool-call spam
- Manual recovery when a long-running task needs to stop NOW

Symmetric with 'issue rerun': accepts the short ID prefix shown by
'issue runs', supports --issue scoping, and reuses resolveTaskRunID
for ambiguity handling.

Refs: PR#19 octo-server post-mortem (2026-05-13)

Co-authored-by: yujiawei <yujiawei@mininglamp.com>
2026-05-14 17:02:58 +08:00
Multica Eve
58cc189dcd fix: honor quick-create squad mentions (#2586)
Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-14 14:01:37 +08:00
LinYushen
add3135a42 feat(cli): add squad create/update/delete and member add/remove (#2574)
* feat(cli): add squad create/update/delete and member add/remove commands

Implement missing squad management commands in the CLI:
- squad create --name --leader [--description]
- squad update <id> [--name] [--description] [--instructions] [--leader] [--avatar-url]
- squad delete <id>
- squad member add <squad-id> --member-id --type [--role]
- squad member remove <squad-id> --member-id --type

Also adds DeleteJSONWithBody to the API client for the member remove
endpoint which uses DELETE with a JSON body.

All commands support --output json for structured output.

Co-authored-by: multica-agent <github@multica.ai>

* fix(squad): add --output json to delete/member remove, return 404 on 0-row delete

- squad delete: add --output json flag, emit {id, deleted} on success
- squad member remove: add --output json flag, emit {squad_id, member_id, removed}
- Backend RemoveSquadMember: change query to :execrows, check RowsAffected
  and return 404 'squad member not found' when 0 rows deleted

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-14 12:51:44 +08:00
Bohan Jiang
21b49eb59b fix(cli): resolve squad assignees in issue create/update/assign (MUL-2165) (#2551)
* fix(cli): resolve squad assignees in issue create/update/assign (MUL-2165)

The CLI assignee resolver only searched workspace members and agents, so a
quick-create input like "assign to <SquadName>" silently fell through to
"Unrecognized assignee: <SquadName>" in the issue description — even though
squads are first-class assignees server-side and the prompt's whole point was
to route the work for the user.

Extend resolveAssignee / resolveAssigneeByID to also fetch /api/squads, teach
the actor display lookup to render squad names in table output, update the
quick-create prompt and runtime-config command listing to mention
`multica squad list` alongside members and agents, and lock in the new
behavior with tests.

Co-authored-by: multica-agent <github@multica.ai>

* fix(cli): gate squad assignee resolution behind an allowed-kinds set (MUL-2165)

The earlier MUL-2165 fix taught resolveAssignee / resolveAssigneeByID to also
return (squad, ...), but those helpers are shared. Project lead and issue
subscriber callers were still using them, and their target schemas reject
squads — project.lead_type has a DB CHECK constraint
(server/migrations/034_projects.up.sql:10) and the subscriber handler's
isWorkspaceEntity switch only knows member/agent
(server/internal/handler/handler.go:414). So
`multica project create --lead "<SquadName>"` and
`multica issue subscriber add --user "<SquadName>"` would resolve to
(squad, ...) and surface as a 500/403 server-side instead of a clean
CLI-side resolution error.

Thread an assigneeKinds set through the resolver and the pickAssigneeFromFlags
helper. Issue create/update/assign/list pass `issueAssigneeKinds` (all three);
project lead and subscriber pass `memberOrAgentKinds`. The squads fetch is
skipped entirely when not allowed, and the not-found / no-match error wording
adapts to the allowed kinds so it never mentions a type the caller cannot use.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-13 22:31:50 +08:00
Bohan Jiang
0345285b86 feat(quick-create): searchable actor picker + squad support (#2552)
* feat(quick-create): searchable actor picker + squad support (MUL-2163)

- Replaces the flat agent dropdown in the "Create with agent" modal with a
  searchable PropertyPicker that lists Agents and Squads in separate
  sections, so users can filter by name and pick a squad as the creator.
- Persists the selection as (lastActorType, lastActorId), removing the
  agent-only lastAgentId field on the quick-create store.
- Adds squad_id to the quick-create API request and stamps it onto the
  task's QuickCreateContext. The handler resolves the squad to its leader
  agent (re-using validateAssigneePair) and the daemon claim path injects
  the squad-leader briefing when the task carries a squad hint, matching
  the behavior of issue-bound squad tasks.

Co-authored-by: multica-agent <github@multica.ai>

* fix(create-issue): forward squad picks across manual→agent switch

Manual mode → agent mode previously only carried `agent_id`, so picking
a squad and then flipping to agent silently fell back to the persisted
actor / first visible agent and lost the user's choice. Carry `squad_id`
on the same branch so the agent panel honors the squad pick.

Adds a sibling test alongside the existing project-carry case.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-13 22:31:17 +08:00
LinYushen
29082f7cfe feat: implement Squad feature MVP (#2505)
* feat: implement Squad feature MVP

- Add migration 084_squad: squad, squad_member, squad_activity_log tables
- Extend issue.assignee_type to support 'squad'
- Add sqlc queries for squad CRUD, member management, activity logs
- Add Go handler with full Squad API (CRUD, members, activity log)
- Register routes: /api/squads/*, /api/issues/{id}/squad-activity, /api/squad-activity
- Add Squad trigger logic:
  - Assign Squad immediately triggers leader
  - Every external comment on squad-assigned issue triggers leader
  - Anti-loop: squad members' comments don't trigger leader
  - Dedup: skip if leader already has pending task
- Add squad activity log API (方案 B) for leader no-op recording
- Add frontend TypeScript types (Squad, SquadMember, SquadActivityLog)
- Add protocol events: squad:created, squad:updated, squad:deleted

Co-authored-by: multica-agent <github@multica.ai>

* fix: address PR review blocking issues

1. validateAssigneePair now accepts 'squad' assignee_type
2. All squad endpoints validate workspace ownership via GetSquadInWorkspace
3. CreateSquadActivityLog restricted to squad leader agent only
4. AddSquadMember validates member exists in workspace
5. UpdateSquad auto-adds new leader to squad members
6. DeleteSquad transfers assigned issues to leader before deletion
7. IssueAssigneeType includes 'squad' in frontend types

Co-authored-by: multica-agent <github@multica.ai>

* feat: soft-delete squads via archive instead of hard delete

- Add migration 085: archived_at + archived_by columns on squad table
- ListSquads now excludes archived squads (ListAllSquads for admin)
- DeleteSquad → ArchiveSquad (sets archived_at, preserves all records)
- Transfer squad-assigned issues to leader before archiving
- SquadResponse includes archived_at/archived_by fields
- Frontend Squad type updated with nullable archived fields

Co-authored-by: multica-agent <github@multica.ai>

* feat: re-add Squads frontend entry (sidebar nav + pages)

Re-applies the frontend squad entry that was lost during a merge:
- Sidebar nav: Squads item with Users icon
- Paths: squads() and squadDetail() in workspace paths
- Routes: /squads and /squads/[id] pages
- Views: SquadsPage (list) and SquadDetailPage
- i18n: en 'Squads' / zh '小队'
- Reserved slug: 'squads'

Co-authored-by: multica-agent <github@multica.ai>

* fix: fix SquadsPage rendering - use PageHeader children pattern

PageHeader takes children, not title/actions props. The incorrect
usage caused a React rendering error. Now matches the pattern used
by autopilots and agents pages.

Co-authored-by: multica-agent <github@multica.ai>

* fix(squads): add API client methods and package export for squads pages

* feat: complete Squad frontend - create dialog, member management, API methods

- Add CreateSquadModal with name/description/leader selection
- Register 'create-squad' in modal registry
- Wire 'New Squad' button to open the modal
- Add full API client methods: createSquad, updateSquad, deleteSquad,
  addSquadMember, removeSquadMember
- Rewrite SquadDetailPage with:
  - Member list showing resolved names
  - Add/remove member UI
  - Archive squad button
  - Back navigation to squads list

Co-authored-by: multica-agent <github@multica.ai>

* feat: improve Squad UI - match create agent dialog style

- CreateSquadModal: proper Dialog with Header/Description/Footer,
  agent picker with avatars, textarea for description
- SquadDetailPage: centered max-w-2xl layout, ActorAvatar for members,
  Crown badge for leader, textarea for member description,
  improved spacing and visual hierarchy
- Renamed 'role' field label to 'Description' in add member form
  (describes the member's responsibilities in the squad)

Co-authored-by: multica-agent <github@multica.ai>

* feat(squad): add avatar, instructions; drop unique-name constraint

- 086: add squad.avatar_url
- 087: drop unique constraint on squad.name (squads with the same
  name are legitimate across teams; uniqueness was an accidental
  product constraint)
- 088: add squad.instructions (text, default '')
- UpdateSquad now COALESCEs avatar_url + instructions
- handler exposes Instructions in SquadResponse and accepts it in
  UpdateSquad

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(squad): assignable + mention target; trigger leader on assign

- assignee picker and @mention suggestion list squads alongside
  agents and members; renders squad avatar/icon
- creating or updating an issue with assignee_type=squad enqueues
  a task for the squad's current leader (mirrors agent-assignee
  parking-lot rule: skip backlog only)
- workspace queries/hooks expose squads where needed for the
  pickers
- locales updated for new picker copy

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(squad): agent-style detail page with members + instructions tabs

- restructure squad detail page to mirror the agent detail page:
  320px inspector (creator, leader, created/updated) + tabbed
  pane (Members | Instructions) with dirty-guard AlertDialog
- inline name + avatar editing on the inspector
- inline description editor (modal textarea)
- members tab: leader + member picker with role descriptions,
  swap leader, edit member roles, remove
- instructions tab: ContentEditor + Save (mirrors agent pattern)
- squads list shows the squad avatar/icon
- core types + api.updateSquad accept avatar_url + instructions

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(squad): inject leader briefing on claim (protocol + roster + instructions)

When a squad's leader agent claims a task on a squad-assigned issue,
append a system-level briefing to the agent's Instructions composed of:

1. Squad Operating Protocol — hard-coded rules: leader is a
   coordinator, dispatch via @mention, stop after dispatching,
   resume on re-trigger, do not work outside the roster.
2. Squad Roster — leader self-row plus one row per non-archived
   member with a literal mention markdown string ([@Name](mention://
   agent|member/<UUID>)) the leader can paste verbatim. Round-trips
   through util.ParseMentions, enforced by a contract test.
3. Squad Instructions — the user-defined squad.instructions block,
   omitted entirely when empty so we do not leave a dangling heading.

Non-leader members claiming the same issue receive no briefing.

Tests cover: full squad with mixed agent/human members, lone leader,
archived agents skipped, empty user instructions, mention round-trip,
and the leader/non-leader claim-handler gate.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(squad): tell leader not to restate issue context in dispatch comment

After observing leaders padding their delegation comments with full
re-summaries of the issue body and prior discussion, make the
Operating Protocol explicit:

- assignees on Multica already have the full issue (title,
  description, all comments, attachments) and workspace context;
- delegation comments should add only what cannot be inferred
  (who is picked, why, extra constraints), aim for two or three
  sentences;
- restating context is now an explicit hard rule violation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(squad): unify leader evaluation into activity_log, add CLI command

- Squad member comments now trigger leader (only leader self-excluded)
- Replace squad_activity_log with activity_log (action: squad_leader_evaluated)
- Add CLI: multica squad activity <issue-id> <outcome> --reason
- Add API: POST /api/issues/{id}/squad-evaluated
- Update squad operating protocol to require evaluation recording
- Remove squad_activity_log table from schema and generated code

* feat(cli): add squad list, get, member list commands

* fix(squad): address review findings (P1+P2)

P1 fixes:
- Add 'squads' to reserved_slugs.json (source of truth)
- Add 'create-squad' to ModalType union
- Remove unused leaderOpen/selectedLeader in create-squad modal
- Replace literal JSX strings with i18n selectors (en + zh-Hans)

P2 fixes:
- Add 'squad' to mention regex (MentionRe)
- Fix human member lookup in squad briefing (use GetUser directly)
- Add squads routes to desktop app
- Add squad:created/updated/deleted to WSEventType + invalidation
- Reject archived squads as issue assignees

* fix(squad): restore zh-Hans key, publish activity event, invalidate issues on archive

- Restore create_project.title in zh-Hans modals.json (dropped by prior edit)
- Publish activity:created WS event after squad leader evaluation
- Invalidate issue queries on squad:deleted (archive transfers assignees)
- Add creator info to squad list cards

* fix(squad): realtime sync, rerun support, leader validation

- Use workspaceKeys.squads prefix for detail/member queries (realtime invalidation)
- Publish squad:updated after add/remove/role-change member mutations
- Support rerun for squad-assigned issues (targets leader agent)
- Reject assignment to squads whose leader is archived

---------

Co-authored-by: multica-agent <github@multica.ai>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-05-13 18:46:20 +08:00
Naiyuan Qing
623d29f276 feat(agents): one-click create from curated templates (Phase 1) (#2520)
* docs(agents): three-phase agent quick-create plan

Captures the full design for moving agent creation from manual form +
one-by-one skill attachment to a tiered experience:

- Phase 1 (this PR): one-click curated templates, AI-free.
- Phase 2 (next): AI-recommended skills via the existing quick-create
  task mechanism — no new server-side LLM dependency.
- Phase 3 (later): AI creates the whole agent end-to-end, composing
  Phase 2 with a new `multica agent create` CLI driver.

Documents the architectural decisions that keep all three phases on
existing infrastructure (no SSE, no server-side LLM SDK, no new WS
channels), the two soft blockers Phase 1 unlocks for later phases
(createSkillWithFiles TX composability + skill same-name dedupe), and
the scope decisions we explicitly opted out of (Anthropic plugin
marketplace, ClawHub UI affordances).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(skills): harden import against invalid UTF-8 and binary files

PG rejects two byte patterns in a TEXT column. Both crashed real skill
imports we hit while assembling the template catalog:

- Embedded NUL (0x00) -> SQLSTATE 22021. Already stripped by
  sanitizeNullBytes, kept as-is.
- Other invalid UTF-8 (e.g. 0x91 — Windows-1252 smart quote in a skill
  whose author saved prose from Word). sanitizeNullBytes now also runs
  strings.ToValidUTF8 over the content so the second class no longer
  takes the whole import down.

For non-text payloads (images, fonts, archives, compiled binaries),
sanitization isn't the right fix — agents never read those as text,
and the bytes can't survive a TEXT column at all. addFile now skips
them by extension before the per-bundle cap counters tick, logging
the skip so an unexpected drop leaves a breadcrumb.

Function name kept for compatibility with the many call sites; both
behaviours are strict supersets of the original.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(skills): split createSkillWithFiles for tx composition + add workspace find-or-create query

Two soft blockers cleared so create-from-template (next commit) can
fold N skill creates and the agent + binding writes into one outer
transaction:

1. createSkillWithFiles used to Begin/Commit its own tx. Caller
   composition was impossible — N invocations meant N separate
   transactions and no atomicity over the whole materialise step.
   Pull the body into createSkillWithFilesInTx(ctx, qtx, input); the
   original function becomes a thin wrapper that manages its own tx
   for standalone callers. Existing call sites: zero behaviour change.

2. Add GetSkillByWorkspaceAndName sqlc query — workspace skill lookup
   by name, anchored to UNIQUE(workspace_id, name) from migration
   008. Lets the template materialiser implement find-or-create:
   reuse the workspace's existing skill row when a template
   references the same name, rather than crashing on the unique
   constraint or polluting the workspace with `<name>-2` clones.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(agents): agent template catalog + create-from-template endpoint

Server-side foundation for Phase 1 of the quick-create roadmap (see
docs/agent-quick-create-plan.md). Adds:

- server/internal/agenttmpl/ — embed-loaded catalog of curated agent
  templates. Each template ships pre-written instructions plus a list
  of skill URLs that get materialised into the workspace at create
  time. Validation runs at startup (init() panics on a malformed
  template) so a bad JSON ships as a deploy-time defect, not a
  runtime 500. Slug must equal the filename basename so the URL
  router is mirror-symmetric with the file layout.

- 11 starter templates covering Engineering / Writing / Building /
  Testing (code-reviewer, frontend-builder, planner, docs-writer,
  one-pager, html-slides, full-stack-engineer, …).

- Three new endpoints, all behind RequireWorkspaceMember:
    GET  /api/agent-templates           — picker list (no instructions)
    GET  /api/agent-templates/:slug     — detail with instructions
    POST /api/agents/from-template      — materialise + create

  Create flow:
    1. Auth + runtime authorization happen BEFORE the GitHub fan-out
       so a 403 never wastes 20s of upstream fetches.
    2. Pre-flight dedupe by cached_name reuses workspace skills
       without an HTTP fetch — second create-from-the-same-template
       drops from 20s to <100ms.
    3. Parallel fetch (30s per-URL timeout) for the remaining skills.
    4. Single transaction: every skill insert, the agent insert, and
       the agent_skill bindings. On any upstream fetch failure the TX
       rolls back and the API returns 422 with `failed_urls` so the
       UI can name the bad source(s).
    5. extra_skill_ids (user-supplied additions) are verified through
       GetSkillInWorkspace per id before attach, so a malicious client
       can't graft a skill from another workspace via UUID guessing.

- multica agent create --from-template <slug> CLI flag dispatches to
  the new endpoint with a 60s ceiling, matching `multica skill import`.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(agents): one-click create-from-template UI

Frontend half of Phase 1. CreateAgentDialog becomes a state machine
spanning four steps:

  chooser          → Start blank / From template cards
  blank-form       → existing manual form (post-chooser)
  duplicate-form   → existing form pre-filled from a duplicated agent
  template-picker  → grid of templates, click navigates to detail
  template-detail  → instructions + skill list preview + one-click Use

Picking a template never lands on the form: name auto-deduped against
existingAgentNames, runtime = first usable one, visibility = private.
Refinement happens on the agent detail page if needed. Same rationale
the doc spells out — templates exist precisely to skip configuration.

New components, all collapsible-by-default so quick-create stays fast:
  - template-picker.tsx — categorised grid, lucide icons + semantic
    accent tokens resolved through static maps so Tailwind's JIT picks
    up every variant (dynamic class strings would silently miss).
  - template-detail.tsx — instructions preview, skill list with cached
    descriptions, Use CTA. Renders the failedURLs banner when a 422
    fires — the only step that can trigger that response.
  - instructions-editor.tsx — collapsed preview-card / expanded full
    ContentEditor.
  - skill-multi-select.tsx + skill-picker-list.tsx — shared multi-
    select surface, also adopted by the existing skill-add-dialog.
  - avatar-picker.tsx — agent avatar upload, mirrors the inspector's
    visual language.

Schema-defended client (CLAUDE.md → API Response Compatibility): the
three new endpoints are wired through parseWithFallback with lenient
zod schemas. Desktop builds outlive any given server — a future
field rename / wrapping must not white-screen older installs.
listAgentTemplates accepts both the current bare array and a future
{templates: [...]} envelope. Coverage: 7 new schema-test cases in
schema.test.ts (null body, missing skills/instructions, malformed
create response, envelope migration).

Catalog + detail go through TanStack Query with staleTime: Infinity —
workspace-independent static data, no per-mount refetch.

Other:
- skill-add-dialog becomes a true multi-select (Confirm button +
  checkbox list); attached skills are filtered out of the list.
- agents-page hands the freshly-created Agent back to the dialog so a
  follow-up setAgentSkills can attach the form-selected skills.
- agent-overview-pane drops the mx-auto/max-w-2xl frame on config-
  tab content; the wider dialog visual language reads better with
  tabs filling the column.
- Every new UI string lives in both en/agents.json and
  zh-Hans/agents.json under create_dialog.* / tab_body.skills.* —
  locales/parity.test.ts blocks drift in CI.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(ci): align skill import test + drop next-only lint suppression

- TestFetchFromSkillsSh_ResolvesRootLevelSkillMd now expects assets/logo.png
  to be skipped; matches the new addFile binary-extension guard
  (6fafd86e). The .png is intentionally dropped so PG TEXT inserts don't
  hit SQLSTATE 22021.
- packages/views shares zero next/* deps, so the @next/next/no-img-element
  eslint plugin isn't loaded there. The eslint-disable directive
  referencing it produced a hard "rule not found" error in CI lint. Raw
  <img> is the right primitive in views; remove the disable comment.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* test(agents): wrap CreateAgentDialog tests in workspace/navigation providers

The dialog now calls useNavigation() and useWorkspacePaths(), both of
which throw outside their providers. The existing tests rendered the
dialog bare and tripped both new requirements:

- NavigationProvider — supply a stub adapter so push() works for the
  agent-detail redirect.
- WorkspaceSlugProvider — useWorkspacePaths() requires a slug.

The blank-vs-template chooser is now the default first step; the
existing tests target the runtime picker on the manual form, so the
helper auto-clicks "Start blank" when no template is passed
(duplicate-mode tests skip the chooser).

Manual afterEach(cleanup) + document.body wipe. Base UI's Dialog
portal renders into document.body and leaves focus-guard/inert wrapper
divs behind across tests, so the second test in the suite saw two
"All" / "My Runtime" matches and getByText failed. The wipe is local
to this file rather than the shared setup because it isn't a global
issue — only suites that open Base UI dialogs hit it.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-13 18:26:04 +08:00
Naiyuan Qing
454c8e3d1a feat: in-app preview for non-image attachments (#2528)
* feat(storage): add GetReader to Storage interface

Adds a streaming read method to the Storage abstraction so callers can
pull object bytes without forcing a full in-memory load. S3Storage wraps
GetObject; LocalStorage opens the file with path-traversal and sidecar
guards. Tests cover happy path, traversal rejection, sidecar rejection,
and missing key.

Used in the next commit by the attachment-preview proxy endpoint.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(server): add attachment preview proxy endpoint

GET /api/attachments/{id}/content streams the raw bytes of a
text-previewable attachment back to the client. Exists to (a) bypass
CloudFront CORS, which is not configured on the CDN, and (b) bypass
Content-Disposition: attachment which Chromium honors for iframe document
loads. Media types (image/video/audio/pdf) intentionally do NOT go through
this endpoint — clients render them directly from the signed CloudFront
download_url, which is already served with Content-Disposition: inline.

Hard cap: 2 MB. Larger files return 413. Anything outside the text
whitelist returns 415. The whitelist (isTextPreviewable) mirrors the
client-side dispatcher; the cross-reference comment in file.go flags
the manual sync until a JSON SSOT generator lands.

Response always uses Content-Type: text/plain; charset=utf-8 so a
hostile HTML payload can't be re-interpreted as a document. The
original MIME ships via X-Original-Content-Type for client dispatch.
Cache-Control: no-store so revoked attachment access takes effect
immediately on the next request.

Tests cover happy path (md), extension fallback when content_type is
generic, 415 (pdf), 413 (>2MB), foreign workspace (404 isolation), and
the isTextPreviewable table.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(core/api): add getAttachmentTextContent + preview error types

Adds an ApiClient method that fetches the text body of an attachment via
the new /api/attachments/{id}/content proxy. Two typed errors —
PreviewTooLargeError (413) and PreviewUnsupportedError (415) — let the
preview modal render specific fallbacks instead of a generic failure.

Refactors the private fetch() into a shared fetchRaw() helper so the
new method inherits the standard infra: auth headers, 401 →
handleUnauthorized recovery, X-Request-ID, error logging, and the
ApiError contract. The previous draft bypassed all of these by calling
window.fetch directly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(views/editor): add AttachmentPreviewModal + Eye entry points

In-app preview for non-image attachments. An Eye icon now sits next to
the existing Download button on file cards / readonly file cards / the
standalone AttachmentList. Clicking it opens a full-screen modal that
dispatches by content_type:

  pdf:      <iframe src={download_url}>           — Chromium PDFium
  video/*:  <video controls src={download_url}>   — native controls
  audio/*:  <audio controls src={download_url}>   — native controls
  md:       <ReadonlyContent>                     — full markdown pipeline
  html:     <iframe srcdoc sandbox="">            — fully restricted
  text:     <code class="hljs">                   — lowlight highlight

Media types render directly from the signed CloudFront download_url
(server marks them inline-disposition). Text types fetch through the
new /api/attachments/{id}/content proxy via TanStack Query, wrapped
in useAttachmentPreview() so each entry point owns its own modal
state without depending on a global Provider mount.

Modal sizing: max-w-6xl × min(90vh, 100vh - 2rem) — slightly larger
than create-issue's max-w-4xl since PDF / video need room, but capped
to viewport on small screens. Sub-renderers use h-full to follow the
fixed modal height instead of viewport-relative units.

Images are intentionally NOT touched — the existing ImageLightbox
(extensions/image-view.tsx) already handles them correctly. The new
modal would be churn without user-visible benefit.

Adds i18n keys under attachment.* (en + zh-Hans) and registers
Preview/Download/Upload in the conventions glossary so future
translations stay consistent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(desktop): enable Chromium PDF viewer for attachment preview

Adds webPreferences.plugins: true to the main BrowserWindow so the
bundled Chromium PDFium plugin activates inside iframes — required for
the attachment preview modal's PDF dispatch. Default is false in Electron;
without it <iframe src=*.pdf> renders blank.

Security trade-off, accepted intentionally and documented inline:
  1. This window already runs with webSecurity: false + sandbox: false,
     so plugins: true does NOT meaningfully widen the renderer's attack
     surface beyond what is already accepted.
  2. The only PDFs that reach an iframe here are signed CloudFront URLs
     we ourselves issued; user-supplied URLs are routed through
     setWindowOpenHandler → openExternalSafely and cannot land in this
     renderer.
  3. Chromium's PDFium plugin is itself sandboxed and only handles
     application/pdf — no Flash/Java/other historical plugin surfaces.

If we ever tighten webSecurity / sandbox, the follow-up is to host the
PDF viewer in a dedicated BrowserView with plugins scoped to that view,
keeping the main renderer plugin-free.

Old desktop builds ship without the preview modal, so the Eye button
never appears and PDF preview is gated by the same release — zero
regression risk for users on stale clients.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 18:24:15 +08:00
Bohan Jiang
51aa924124 feat(chat): support renaming chat sessions inline (#2522)
Adds a pencil icon next to the trash icon on each session row in the chat
dropdown. Clicking it turns the title into an inline editable input:
Enter / blur saves, Escape cancels.

Server: new PATCH /api/chat/sessions/{id} handler that updates the title
via the existing `UpdateChatSessionTitle` sqlc query, broadcasts a new
`chat:session_updated` WS event so other tabs / devices stay in sync, and
rejects blank titles. Frontend mutation is optimistic with rollback,
matching the existing delete-session pattern.

MUL-2110

Co-authored-by: multica-agent <github@multica.ai>
2026-05-13 16:57:34 +08:00
Bohan Jiang
96695a79c5 feat(dashboard): workspace/project token + run-time dashboard MUL-1882 (#2462)
* feat(dashboard): workspace/project token + run-time dashboard

Add a `/{slug}/dashboard` page showing per-agent token spend and execution
time across the whole workspace, with an optional project filter.

Backend:
  - Three new sqlc queries against task_usage + agent_task_queue: daily
    usage, per-agent usage, per-agent total run-time. All optionally
    scoped to a project via sqlc.narg('project_id'), reaching project
    through the issue join.
  - Handlers under /api/dashboard return the same wire shape the runtime
    page already consumes (model preserved for client-side cost math).

Frontend: - Shared DashboardPage in packages/views/dashboard reusing KpiCard,
    DailyCostChart, ActorAvatar, and estimateCost from the runtime page
    so the visual style and pricing math stay in lock-step.
  - Period selector (7/30/90d), project dropdown, four KPI tiles
    (cost, tokens, run time, tasks), daily cost chart, and a combined
    "cost + run time by agent" list.
  - Routed in both web (app/[slug]/(dashboard)/dashboard) and desktop
    (memory router); sidebar nav entry added under Workspace group.
Co-authored-by: multica-agent <github@multica.ai>

* fix(dashboard): drop stale project filter and stop double-counting tasks

Two issues caught in PR #2462 review:

1. Project filter held the previous selection's UUID across workspace
   switches and project deletions: the dropdown gracefully showed
   "All projects" (because the title lookup missed) while the three
   dashboard queries kept forwarding the dead UUID, leaving the UI
   looking like a full-workspace view but populated with empty
   project-scoped data. Validate the picked UUID against the current
   projects list before passing it to the queries.

2. The "by agent" table read its task count from the token rollup,
   which is grouped per (agent, model). A single task that spans two
   models lands twice and the agent's row reads e.g. "2 tasks" when
   the real count is 1. Prefer `ListDashboardAgentRunTime`'s per-agent
   distinct count when available; fall back to the token aggregate
   only for agents with no terminal run yet (in-flight tasks).

Extract the merge into `mergeAgentDashboardRows` so the precedence
rules are unit-tested directly.

Co-authored-by: multica-agent <github@multica.ai>

* test(dashboard): allocate per-workspace issue.number explicitly

TestDashboardEndpoints creates two issues in the shared fixture
workspace. issue.number defaults to 0 (migration 020), and the table
carries UNIQUE (workspace_id, number), so the second insert raced the
first on the same default and failed in CI.

Allocate MAX(number) + 1 per insert so each row gets a fresh number
without stepping on rows other tests left behind in the same workspace.

Co-authored-by: multica-agent <github@multica.ai>

* feat(dashboard): rollup table + cron-driven aggregation for dashboard

Mirror the per-runtime rollup in `task_usage_daily` (migrations 073/077/082)
to remove the per-request raw aggregation the dashboard was doing.

Migration 084 adds:
  - `task_usage_dashboard_daily` keyed on
    (bucket_date, workspace_id, agent_id, project_id, model) — the
    dimensions the dashboard actually queries, with project_id nullable
    via UNIQUE NULLS NOT DISTINCT (PG15+) so "no-project" buckets
    upsert cleanly.
  - `task_usage_dashboard_rollup_state` watermark table.
  - `task_usage_dashboard_dirty` invalidation queue.
  - Triggers on agent_task_queue DELETE, task_usage DELETE, and
    issue.project_id UPDATE — the cases the updated_at watermark can't
    see. The project_id trigger re-attributes existing rollup rows when
    a user moves an issue across projects.
  - `rollup_task_usage_dashboard_daily_window(from, to)` —
    idempotent recompute primitive (same shape as 077).
  - `rollup_task_usage_dashboard_daily()` cron entry — own advisory
    lock (4244) so it serialises independently of the runtime rollup.
  - `task_usage_dashboard_rollup_lag_seconds()` health helper.

Sqlc queries `ListDashboardUsageDailyRollup` /
`ListDashboardUsageByAgentRollup` read from the new table; the handler
dispatches between rollup and raw on a separate
`UseDailyRollupForDashboard` config flag
(`USAGE_DASHBOARD_ROLLUP_ENABLED` env). Same fail-safe default (false →
raw) so operators can roll out independently of the per-runtime flag.

Bucket date is UTC (the dashboard aggregates across runtimes that may
sit in different tzs; there's no single correct local boundary).

Adds `cmd/backfill_task_usage_dashboard_daily` mirroring the existing
per-runtime backfill — operator runs it once before flipping the flag.

Tests: - TestDashboardEndpoints now also exercises the rollup read path
    (raw vs. rollup, same project-scoped totals).
  - TestDashboardRollupReattributesOnProjectChange verifies the
    issue.project_id trigger enqueues both old + new buckets and the
    next rollup tick zeroes the old project + populates the new one.
Co-authored-by: multica-agent <github@multica.ai>

* fix(dashboard-rollup): close two invalidation gaps

Two leak paths missed by migration 084 review:

1. Issue cascade DELETE — the atq BEFORE DELETE trigger runs AFTER the
   issue row is gone, so `LEFT JOIN issue` returns NULL project_id and
   the original-project bucket never gets cleared (issue 077 calls this
   out for the runtime rollup but didn't need to act on it). Adds an
   `issue BEFORE DELETE` trigger that enqueues using OLD.project_id
   while the issue row is still readable.

2. `LinkTaskToIssue` (quick-create task attaching to a real issue post-
   completion) UPDATEs `agent_task_queue.issue_id` from NULL to a real
   id. Migration 084 only watched DELETE on atq, so usage already
   rolled up under the no-project bucket stayed attributed to NULL
   forever. Extends the atq trigger to fire on UPDATE OF issue_id too,
   enqueueing both OLD (NULL project) and NEW (linked issue's project).

Tests: - TestDashboardRollupClearsOnIssueDelete asserts rollup row drops to
    zero after issue delete + rollup tick.
  - TestDashboardRollupReattributesOnLinkTaskToIssue verifies tokens
    move from the NULL bucket to the project bucket after the UPDATE.
Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-13 12:51:16 +08:00
Bohan Jiang
caeb146bac feat(github): GitHub App integration for PR ↔ issue linking (#1817)
* feat(github): GitHub App backend for PR ↔ issue linking

- New tables: github_installation (workspace ↔ App install), github_pull_request (mirrored PR state), issue_pull_request (M:N link).
- Webhook handler verifies HMAC-SHA256, upserts PR rows, parses issue identifiers from PR title/body/branch and auto-links them. Merging a linked PR moves the issue to done.
- Connect/setup endpoints power the zero-config "Connect GitHub" install flow; state token is HMAC-signed so the setup callback can recover the workspace.
- Workspace-scoped admin routes for listing/disconnecting installations, plus a per-issue `pull-requests` list endpoint.

Co-authored-by: multica-agent <github@multica.ai>

* feat(github): UI for connecting GitHub and viewing linked PRs

- Settings → Integrations: new tab with Connect GitHub / installations list / disconnect, gated on the deployment having the App configured.
- Issue detail sidebar: Pull requests section showing linked PR title, repo, state (open/draft/merged/closed), and author, with deep link to GitHub.
- Real-time refresh: github_installation:* and pull_request:* events invalidate the matching TanStack Query caches.

Co-authored-by: multica-agent <github@multica.ai>

* fix(github): address review — null actor, role gating, configured guard, scoped uninstall broadcast

- listeners: use optionalUUID(e.ActorID) so the system actor on the github-driven issue:updated event no longer panics activity / notification listeners; merged-PR → issue done now produces a status_changed activity and inbox entry.
- IntegrationsTab: gate the admin-only installations query on canManage so members no longer hit /github/installations 403; the configured/not-configured copy is also scoped to admins.
- backend: introduce isGitHubConfigured() requiring both GITHUB_APP_SLUG and GITHUB_WEBHOOK_SECRET, and surface that single flag from list-installations + connect endpoints so the frontend Connect button stays disabled until both are set.
- DeleteGitHubInstallationByInstallationID now RETURNs workspace_id; webhook handler publishes github_installation:deleted scoped to the right workspace so already-open Settings tabs invalidate in real time. ErrNoRows on a re-fired delete short-circuits cleanly.
- tests: focused webhook integration coverage (auto-link + merge → done, cancelled preservation, uninstall returns workspace).

Co-authored-by: multica-agent <github@multica.ai>

* fix(github): i18n the new GitHub UI strings to satisfy lint

CI flagged every literal string in the Integrations tab, the Pull requests
sidebar section, and the per-PR row label. Move them through useT() and
add the matching `integrations.*` block to settings.json (en / zh-Hans)
plus `detail.section_pull_requests` / `detail.pull_request_state_*` /
loading + empty copy under `issues.json`.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-12 13:49:03 +08:00
Bohan Jiang
046e4b1efa fix(execenv): switch every provider's Windows reply template to --content-file (#2411)
Three user reports converge on the same Windows-shell encoding bug:

- #2198 / #2236 — Chinese, Codex on Win11. Comments / descriptions
  generated by the agent arrive as `?`.
- #2376 — Cyrillic, non-Codex agent ("Ops Lead") on Win11 Desktop.
  Title preserved (argv → CreateProcessW UTF-16), description / agent
  reply garbled (stdin → shell-codepage re-encoding).

woodcoal's independent diagnosis on #2198 confirms the root cause:
Windows PowerShell 5.1's `$OutputEncoding` defaults to ASCIIEncoding
when piping to a native command, so non-ASCII bytes are silently
replaced with `?` before they reach `multica.exe`. The CLI's stdin
parsing is fine; the bytes are corrupted upstream, in the agent's
shell layer.

This PR ships the fix that supersedes the codex-only attempt in
PR #2265 (which is closed in favour of this one):

## CLI

Add `--content-file <path>` to `multica issue comment add` and
`--description-file <path>` to `multica issue {create,update}`. The
CLI reads bytes off disk via `os.ReadFile` and skips the shell
entirely; UTF-8 survives end-to-end regardless of `$OutputEncoding`
or `chcp`. The three input modes (`--content`, `--content-stdin`,
`--content-file`) are mutually exclusive.

## Runtime config

`buildMetaSkillContent`'s Available Commands section is rewritten as a
neutral three-mode menu. The previous unconditional "MUST pipe via
stdin" / `--description-stdin` mandate (over-spread from #1795 /
#1851's Codex-multi-line fix) is gone for non-Codex providers; the
strong directive now lives only in the Codex-Specific section, which
branches on host:

- Codex / Linux+macOS: `--content-stdin` + HEREDOC (preserves MUL-1467
  fix against codex's literal `\n` habit).
- Codex / Windows: `--content-file` (PowerShell ASCII pipe is the
  exact bug we're patching).

## Per-turn reply template

`BuildCommentReplyInstructions` now takes a provider arg and branches
provider × OS:

- Windows + any provider → `--content-file` (the bug is shell-layer,
  not provider-layer; #2376 shows non-Codex agents on Windows also
  hit it). All providers write a UTF-8 file with their file-write tool
  and post via `--content-file ./reply.md`.
- Linux/macOS + Codex → stdin/HEREDOC (MUL-1467 protection).
- Linux/macOS + non-Codex → lightweight pre-#1795 inline
  `--content "..."`. The CLI server-side decodes `\n`, so escaped
  multi-line works; the agent retains stdin / file as escape hatches
  for richer formatting.

`BuildPrompt` and `buildCommentPrompt` gain a `provider` arg;
`daemon.runTask` already has it in scope.

## Tests

- `TestResolveTextFlag` — file-source verbatim with non-ASCII
  (`标题 / Заголовок / 中文段落`), missing-file error, empty-file
  rejection, three-way mutual exclusion.
- `TestInjectRuntimeConfigAvailableCommandsIsNeutral` — every
  non-Codex provider × {linux, darwin, windows} pins the three-mode
  menu present + over-spread "MUST stdin" substrings absent.
- `TestInjectRuntimeConfigCodexLinuxEmphasizesStdin` +
  `TestInjectRuntimeConfigCodexWindowsUsesContentFile` — Codex
  section's per-OS branch.
- `TestBuildCommentReplyInstructionsCodexLinux` +
  `TestBuildCommentReplyInstructionsNonCodexLinux` +
  `TestBuildCommentReplyInstructionsWindowsUsesContentFile` — the
  reply-template provider × OS matrix.
- `TestInjectRuntimeConfigWindowsCommentTriggerHasNoStdin` — end-to-end
  AGENTS.md / CLAUDE.md on Windows has no prescriptive stdin
  directive, for claude / codex / opencode.

`go test ./...` and `go vet ./...` clean.

Closes #2198, #2236, #2376.

Co-authored-by: multica-agent <github@multica.ai>
2026-05-11 17:05:45 +08:00
Kagura
fb026f2607 fix(daemon): suppress git console windows on Windows (#2358)
* fix(daemon): suppress git console windows on Windows

Apply the same HideConsoleWindow pattern used for agent processes
(PR #1474) to all git commands spawned by the daemon's repo-cache,
execenv, and GC packages. Each exec.Command now calls
util.HideConsoleWindow(cmd) which sets CREATE_NEW_CONSOLE + HideWindow
so grandchildren inherit a hidden console instead of flashing visible
console windows.

Closes #2357

Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>

* refactor: use EnsureHiddenConsole at daemon startup

Replace per-site HideConsoleWindow(cmd) calls with a single
EnsureHiddenConsole() invoked once at daemon startup. The daemon
now owns a hidden console that every child process (git, cmd /c
mklink, etc.) inherits automatically, eliminating the need for
per-call SysProcAttr configuration.

This also covers the previously missed exec.Command in
codex_home_link_windows.go (cmd /c mklink) which never had a
HideConsoleWindow call.

Signed-off-by: kagura-agent <kagura.agent.ai@gmail.com>

---------

Signed-off-by: kagura-agent <kagura.agent.ai@gmail.com>
Co-authored-by: Claude Opus 4 (1M context) <noreply@anthropic.com>
2026-05-11 14:41:07 +08:00
Multica Eve
d6349c16ec feat(runtime): per-runtime timezone for token-usage aggregation (MUL-1950) (#2394)
* feat: per-runtime timezone for token usage aggregation

The runtime token-usage charts (daily and hourly tabs on the
runtime-detail page) bucketed every event by the Postgres session
timezone, which is UTC in production. For an operator in UTC+8 that
meant a Tuesday afternoon's tasks landed in Tuesday early-morning's
bar — the chart was always one off.

Fix: store an IANA timezone on agent_runtime and aggregate under it.

* migrations 081 / 082 add agent_runtime.timezone (TEXT NOT NULL
  DEFAULT 'UTC') and rebuild the rollup pipeline (window function
  and both trigger functions) to compute bucket_date with
  AT TIME ZONE rt.timezone instead of bare DATE().
* No historical backfill — task_usage_daily rows already on disk
  keep their UTC bucket_date; only future writes / re-touches
  recompute under the new tz. (Product call from MUL-1950: 'guarantee
  future correctness'.)
* runtime_usage.sql gains a @tz parameter on ListRuntimeUsage and
  GetRuntimeUsageByHour and threads tz through GetRuntimeTaskHourly  Activity. ListRuntimeUsageDaily reads bucket_date as-is since the
  rollup already wrote it in tz.
* parseSinceParamInTZ replaces the raw N×24h cutoff with start-of-
  day-N in the runtime's tz so 'last 7 days' lines up with bucket
  boundaries.
* Daemon registration sends the host's IANA tz (TZ env, then
  time.Local), and UpsertAgentRuntime preserves any user override
  via a CASE-on-existing-value pattern so a daemon reconnect can't
  silently revert the operator's setting.
* New PATCH /api/runtimes/:id endpoint (UpdateAgentRuntime) lets
  the runtime detail page edit the tz; the editor seeds with the
  browser tz on first interaction.

Refs: MUL-1950

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: multica-agent <github@multica.ai>

* fix: harden runtime timezone rollups

Co-authored-by: multica-agent <github@multica.ai>

* fix: address runtime timezone review nits

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica.ai>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: multica-agent <github@multica.ai>
Co-authored-by: Eve <eve@multica-ai.local>
2026-05-11 14:39:35 +08:00
Bohan Jiang
b26f850d4e feat(agents): gate private-agent surfaces with allowed_principals predicate (#2359)
* feat(agents): gate private-agent surfaces with allowed_principals predicate

Tighten chat/@-mention, history, edit, and delete entry points so private
agents are only reachable by their owner or workspace owner/admin. Agent-to-
agent traffic still bypasses the gate so A2A collaboration keeps working.

- New canAccessPrivateAgent predicate in handler/agent_access.go; used by
  comment.enqueueMentionedAgentTasks (replacing the inline check), GetAgent,
  ListAgents (filter), ListAgentTasks, GetWorkspaceAgentRunCounts /
  Activity30d / TaskSnapshot (workspace-wide aggregations no longer leak
  private-agent existence + counts), chat.CreateChatSession,
  chat.SendChatMessage (re-checks on every send so role changes can't leave
  a stale session as a back-door), and autopilot.shouldSkipDispatch
  (caller = autopilot creator).
- allowed_principals is computed inline as {agent.owner_id} ∪ workspace
  owner/admin members. No new table — manual config is intentionally not
  exposed in v1; the predicate is the extension seam.
- Front-end agent detail page distinguishes 403 (private agent the caller
  can't access) from 404 (deleted/missing) and renders a "no access"
  placeholder with a back-to-agents button.
- Go tests cover the pure predicate matrix + the four protected surfaces;
  vitest passes for the affected views.

Co-authored-by: multica-agent <github@multica.ai>

* feat(agents): gate issue assignment with the private-agent predicate

Refactor validateAssigneePair to call the shared canAccessPrivateAgent
helper. This closes the back door where a plain member could assign a
private agent to an issue and let normal task dispatch run it, side-
stepping the chat / @-mention gate. Agent callers (X-Agent-ID) bypass
so A2A delegation onto a private assignee still works.

Add an integration test covering all three callers (workspace owner,
agent owner, plain member).

Co-authored-by: multica-agent <github@multica.ai>

* fix(agents): close three private-agent gate bypasses found in PR review

1. X-Agent-ID forgery (resolveActor): require X-Task-ID alongside
   X-Agent-ID before trusting the agent identity. Without this a plain
   workspace member could set X-Agent-ID to any visible agent UUID and
   short-circuit the gate to "actor=agent, allow". Daemons already
   pair the two headers, so legitimate A2A traffic is unaffected.

2. Chat history read path (chat.go): GetChatSession / ListChatMessages /
   GetPendingChatTask / MarkChatSessionRead now go through a new
   gateChatSessionForUser helper that re-applies canAccessPrivateAgent
   after the ownership check, so a session creator whose role was later
   downgraded loses transcript access. ListChatSessions and
   ListPendingChatTasks filter their result sets by the same predicate.

3. Cross-workspace @mention (comment.enqueueMentionedAgentTasks):
   resolve the mentioned agent via GetAgentInWorkspace scoped to the
   issue's workspace so a UUID belonging to a different workspace's
   private agent can't slip past the gate (the gate was being applied
   against the current workspace's role table, which is the wrong
   one).

Regression tests cover each bypass, plus an update to the resolveActor
unit test to reflect the new "X-Agent-ID without X-Task-ID falls back
to member" contract.

Co-authored-by: multica-agent <github@multica.ai>

* test(handler): seed X-Task-ID alongside X-Agent-ID in existing agent-caller tests

After tightening resolveActor to require both headers (X-Agent-ID +
X-Task-ID) for the "agent" actor identity, three existing tests that
set only X-Agent-ID started failing because their requests now resolve
to "member" instead of "agent". Add createHandlerTestTaskForAgent
helper and seed a task per agent-caller assertion. Also patch
TestAgentExplicitMentionStillTriggers — it still passed only because
the @mention path doesn't care about author type for member callers,
but the test claims to exercise the agent path, so make it faithful.

Co-authored-by: multica-agent <github@multica.ai>

* test(handler): finish X-Task-ID seeding + fix cross-workspace mention test schema

The previous CI run still failed in two places:

1. server/cmd/server integration tests — postCommentAsAgent → authRequestWithAgent
   only set X-Agent-ID, so resolveActor downgraded the request to "member"
   and the on_comment chain produced the wrong task counts. Fix:
   authRequestWithAgent now also sets X-Task-ID, fetched or seeded by a new
   ensureAgentTask(agentID) helper.

2. TestMentionAgent_RejectsCrossWorkspaceAgentUUID's hand-crafted comment
   INSERT was missing comment.workspace_id, which migration 025 made
   NOT NULL. Pass testWorkspaceID into the seed row.

Build + vet clean locally; both packages compile.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-11 12:39:45 +08:00
Bohan Jiang
39e57b870f fix(cli): allow --mode run_only on autopilot create/update (#2360)
* fix(cli): allow --mode run_only on autopilot create/update

The autopilot run_only dispatch path is wired end-to-end (handler accepts
the mode, AutopilotService.dispatchRunOnly enqueues a task with
AutopilotRunID, daemon resolves workspace via autopilot_run -> autopilot
in ClaimTaskByRuntime and TaskService.ResolveTaskWorkspaceID). The CLI
guard was added before those fixes landed and never removed.

Drop the CLI rejection on both create and update so callers can pick the
same modes the API and UI already support, and remove the stale "unstable"
callout from the autopilots docs.

Closes multica-ai/multica#2347

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): advertise autopilot run_only in agent runtime instructions

The runtime config injected into AGENTS.md / CLAUDE.md only listed
`--mode create_issue` for autopilot create and didn't expose `--mode` on
update at all. So even after the CLI guard was lifted, agents reading
their harness instructions would still believe create_issue was the only
choice — undermining the "agents operate the same surface as humans"
intent.

Update both lines to advertise create_issue|run_only on create and on
update, and add an InjectRuntimeConfig assertion so the runtime prompt
can't drift away from the CLI surface again.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-10 14:12:34 +08:00
Bohan Jiang
003dfd9b4b feat(quick-create): add project picker that remembers last pick (#2321)
* feat(quick-create): add project picker that remembers last pick

Quick-create users targeting one project repeatedly had to restate "in
project X" in every prompt. The modal now exposes a project picker beside
the agent picker, persists the selection per-workspace, and pins the
agent's `multica issue create` invocation to that project so the prompt
text doesn't have to.

The picked project also flows to the daemon as ProjectID/ProjectTitle and
its github_repo resources override the workspace repo fallback — same
treatment issue-bound tasks already get.

Co-authored-by: multica-agent <github@multica.ai>

* fix(quick-create): move project picker into property pill row

Reviewer feedback: the picker felt out of place wedged next to the agent
header. Move it into a property toolbar row above the footer, reusing the
shared `ProjectPicker` + `PillButton` so its placement and styling line up
exactly with the manual create panel.

This also drops the bespoke dropdown / aria / label strings that were only
needed while the picker rendered inline beside "Created by".

Co-authored-by: multica-agent <github@multica.ai>

* fix(quick-create): clear stale persisted project + carry across mode switch

Two review-blocking bugs in PR #2321:

1. The stale-id sweep in AgentCreatePanel only fired when projects.length > 0
   and only cleared local state, leaving lastProjectId pointing at a deleted
   project. The next open re-seeded the dead UUID and submit hit the server's
   `project not found` rejection. Gate on the query's `isSuccess` so we can
   tell "loading" apart from "loaded as empty", and clear both local state
   and the persisted preference when the selection isn't in the resolved list.

2. ManualCreatePanel's switchToAgent dropped the picked project from the carry
   payload, so flipping manual → agent silently fell back to the agent panel's
   own lastProjectId — potentially routing the issue to a different project
   than the one shown in manual mode. Forward project_id alongside prompt /
   agent_id, and add a regression test.

Co-authored-by: multica-agent <github@multica.ai>

* test(quick-create): pass new isExpanded props in stale-project tests

Main got an expand button on AgentCreatePanel via #2320 while this branch
was open, adding `isExpanded` / `setIsExpanded` to the panel's required
props. The two new stale-project tests still passed `{ onClose }` only,
which CI's typecheck (run on the main+branch merge) caught while my
local run did not.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-09 16:12:12 +08:00
Bohan Jiang
3f20999597 refactor(timeline): drop server-side comment + timeline pagination (#2322)
* refactor(timeline): drop server-side comment + timeline pagination (MUL-1929)

The cursor-paginated /timeline and /comments endpoints were sized for a
problem the data shape doesn't have: prod p99 is ~30 comments per issue
and the all-time max is ~1.1k. Time-based pagination also splits reply
threads across page boundaries (orphan replies), which the frontend was
papering over with an "orphan rescue" that promoted disconnected replies
to top-level — confusing UX with no real benefit.

Replace both endpoints with a single full-issue fetch, capped server-side
at 2000 rows as a defensive safety net (never hit in practice).

Server
- /api/issues/:id/timeline now returns a flat ASC TimelineEntry[]
  (matches the legacy desktop contract — older Multica.app builds keep
  working because the wrapped TimelineResponse + cursors are gone, and
  the raw array shape was always what they consumed).
- /api/issues/:id/comments drops limit/offset; only ?since is honoured
  for the CLI agent-polling flow.
- Drop ListCommentsBefore/After/Latest, ListActivitiesBefore/After/Latest
  and the timelineCursor encoding.
- Replace with ListCommentsForIssue / ListCommentsSinceForIssue /
  ListActivitiesForIssue (capped by argument).

CLI
- multica issue comment list drops --limit / --offset and the X-Total-Count
  reporting; --since is preserved for incremental polling.

Frontend
- Replace useInfiniteQuery with useQuery in useIssueTimeline; drop
  fetchOlder/Newer, jumpToLatest, isAtLatest, newEntriesBelowCount.
- Remove timeline-cache helpers (mapAllEntries / filterAllEntries /
  prependToLatestPage) and the TimelinePage / TimelinePageParam types.
- WS event handlers update the single flat-array cache directly.
- Drop the orphan-reply rescue in issue-detail — every reply's parent
  is now guaranteed to be in the same array.
- Strip the "show older / show newer / jump to latest" buttons and their
  i18n strings.

Co-authored-by: multica-agent <github@multica.ai>

* fix(timeline): address review feedback on pagination removal

Three issues caught in PR #2322 review:

1. /timeline broke for stale clients between #2128 and this PR. They send
   ?limit/?before/?after/?around and parse with the wrapped TimelinePageSchema;
   the new flat-array response was failing schema validation and falling back
   to an empty timeline. Restore the wrapped shape on those query params
   (DESC entries, null cursors, has_more_*=false), keeping the flat ASC array
   for bare requests. Around-mode now also fills target_index from the merged
   slice so legacy clients can still scroll-to-anchor without a follow-up.

2. The agent prompts in runtime_config.go and prompt.go still told agents
   that `multica issue comment list` accepts --limit/--offset and to use
   `--limit 30` on truncated output. With those flags removed in this PR,
   new agent runs would hit "unknown flag" or skip context. Update the
   prompt copy to "returns all comments, capped at 2000; --since for
   incremental polling".

3. useCreateComment's onSuccess was a bare append to the timeline cache
   with no id-dedupe, so a fast comment:created WS event firing before
   onSuccess produced a transient duplicate. Restore the id guard the old
   prependToLatestPage helper used to provide.

Adds two new boundary tests:
- TestListTimeline_LegacyWrappedShape_OnPaginationParams
- TestListTimeline_LegacyWrappedShape_AroundFillsTargetIndex

Co-authored-by: multica-agent <github@multica.ai>

* test(handler): fix timeline test assertions for handler-package isolation

The TestListTimeline_* assertions assumed CreateIssue would seed an
"issue_created" activity_log row, but the activity listener that publishes
those rows is registered in cmd/server/main.go — handler-package tests
don't wire it up. CI saw 5 entries (3 comments + 2 activities) where the
test expected ≥6.

Drop the auto-activity assumption: assert exactly 5 entries in
TestListTimeline_MergesCommentsAndActivities, and tighten
TestListTimeline_EmptyIssue to assert a fully-empty timeline.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-09 16:11:58 +08:00
Bohan Jiang
9ded462ecc feat(inbox): auto-archive stale task_failed rows on terminal status (#2319)
When an issue progresses to in_review / done / cancelled, archive any
pre-existing task_failed inbox rows for that issue across all member
recipients and emit inbox:batch-archived per recipient so connected
clients self-heal. Reuses the existing archived column rather than
introducing a parallel dismissed flag; the activity log preserves the
full failure history for audit independently of the inbox surface.

Closes #2291.

Co-authored-by: multica-agent <github@multica.ai>
2026-05-09 15:53:25 +08:00
Multica Eve
a2dd80d4f6 feat(autopilot): skip dispatch when assignee runtime is offline (MUL-1899) (#2311)
* feat(autopilot): skip dispatch when assignee runtime is offline (MUL-1899)

Prevents scheduled autopilots from accumulating doomed tasks against
offline / archived / unbound agents. Before this change, a paused laptop
or crashed daemon would let a 5-minute-cron autopilot pile up thousands
of queued agent_task_queue rows that no runtime would ever drain — this
is the dominant source of the 89k stuck-task backlog flagged in MUL-1899.

DispatchAutopilot now performs a pre-flight admission check on the
assignee agent's runtime status. If the runtime is not 'online' (or the
agent is archived / has no runtime bound / has no assignee), the run is
recorded as 'skipped' with a failure_reason and no task is enqueued.
Skipped runs still emit autopilot:run.done so the UI / activity feed
reflect that the trigger fired and was evaluated.

Skipped runs are deliberately NOT counted toward the failure-ratio
auto-pause: a user who closes their laptop overnight should not have
their autopilot paused. Sustained server-side failures keep their
existing pause path via the failure monitor.

Tests: added an integration test that creates an offline runtime and
asserts DispatchAutopilot records a skipped run with no task enqueued.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: multica-agent <github@multica.ai>

* feat(scheduler): expire stale queued tasks via TTL sweeper (MUL-1899)

Companion to the dispatch-time admission gate added in this PR. The
admission gate prevents *new* tasks from being enqueued against an
offline runtime, but it does not drain the historical backlog
(~89k stuck queued rows observed at MUL-1899 baseline) and does not
help when a runtime goes offline *after* a task has already been
queued. This adds a passive TTL sweeper:

- New SQL query `ExpireStaleQueuedTasks` transitions queued tasks
  older than the TTL to status='failed' with
  failure_reason='queued_expired' and a clear error message.
- Sweep is capped per tick (`queuedExpireBatchSize`, default 500) via
  a CTE+LIMIT so that draining a large backlog cannot monopolise the
  DB on a single tick. At 30s ticks the worst case is 60k rows/hour.
- Wired into the existing 30s `runRuntimeSweeper` loop alongside
  `sweepStaleTasks` and reuses `taskSvc.HandleFailedTasks` so the
  expired tasks broadcast `task:failed` events, reconcile agent
  status, and roll back any in-progress issues — same lifecycle as
  any other failed task.
- Default TTL = 2h. Conservatively above any reasonable
  "queued behind a long-running task" window (default agent timeout
  is 2h, sweeper runs every 30s) so legitimate work isn't expired.
- Integration tests cover the happy path (stale → expired, fresh →
  left alone, correct status/reason/error) and the per-tick batch cap.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: multica-agent <github@multica.ai>

* fix(autopilot): address review blockers from PR #2311 (MUL-1899)

GPT-Boy review of the offline-runtime + queued-TTL PR flagged four
blockers; this commit addresses them all.

1. Restore the 'skipped' autopilot_run status in the DB constraint.
   Migration 043 had removed 'skipped' along with the now-defunct
   concurrency_policy feature, so the new admission gate's INSERT of
   status='skipped' violated `autopilot_run_status_check` and broke
   `TestAutopilotDispatchSkipsWhenRuntimeOffline` in CI. New
   migration 079 re-adds 'skipped' to the CHECK list. The down
   migration migrates skipped → failed before re-tightening, mirror-
   ing what 043 did for the original removal.

2. Make `ExpireStaleQueuedTasks` race-safe.
   The CTE-then-UPDATE pattern could clobber a task that the daemon
   claimed between victim selection and the outer update. Two
   guards added:
     - `FOR UPDATE SKIP LOCKED` in the CTE so we never wait on a
       row that's currently being claimed (and never block the
       claim path either).
     - The outer UPDATE now re-checks `t.status = 'queued'` AND the
       TTL predicate so even if a row's lock is released after a
       successful claim, we cannot transition a now-dispatched/
       running task to 'failed'.

3. Add a partial index for the queued-TTL sweeper.
   `idx_agent_task_queue_queued_created_at` on `created_at WHERE
   status = 'queued'` — keeps the 30s sweep query (status=queued
   AND created_at < ... ORDER BY created_at LIMIT 500) cheap even
   when historical terminal rows accumulate (~89k+ at MUL-1899
   baseline). The partial predicate keeps the index tiny because
   only in-flight rows live in 'queued'.

4. Fix the failure-monitor denominator.
   `SelectAutopilotsExceedingFailureThreshold` had been counting
   'skipped' toward total runs, which would have diluted the failure
   ratio: a 100%-failing autopilot could mask itself behind a wall
   of admission skips. With 'skipped' restored as a real status,
   the auto-pause monitor must explicitly exclude it from BOTH
   numerator and denominator — admission skips are neither a
   success nor a failure.

Verified: `go test ./cmd/server/... ./internal/service/...` passes
(including TestAutopilotDispatchSkipsWhenRuntimeOffline,
TestExpireStaleQueuedTasks, TestExpireStaleQueuedTasksRespectsBatch
Limit). `go build ./... && go vet ./...` clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: multica-agent <github@multica.ai>

* fix(migrations): split queued-task TTL index into concurrent migration

Per PR #2311 review: agent_task_queue is a hot table, so building the
new partial index with plain CREATE INDEX inside migration 079 would
hold ACCESS EXCLUSIVE on the queue and block dispatch during deploy.

The migration runner does not allow CONCURRENTLY to share a file with
other statements (documented in 068), so split the index into its own
single-statement file 080 — matching the existing pattern in 035 /
067 / 074 / 075 / 078. Migration 079 keeps the autopilot_run
constraint change.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-09 15:07:57 +08:00
Bohan Jiang
6d9ebb0fdd fix(daemon): unblock issues stuck on a poisoned-image agent session (#2314)
* fix(daemon): treat upstream API 400 invalid_request_error as poisoned session

A markdown-linked image in an issue description that the agent downloads as
a tiny CDN auth-error file and Read's as a PNG poisons the conversation:
the LLM API rejects the bad image with 400 invalid_request_error, the
session_id is pinned mid-flight, and every follow-up task on the issue
(comment-trigger, auto-retry) resumes the same poisoned conversation and
hits the same 400 — the issue can no longer be executed even after the
description is cleaned up.

Mirror the existing fallback-output classifier on the error side: detect
"API Error: ... 400 ... invalid_request_error" in the agent error string,
persist failure_reason='api_invalid_request', and add it to the
GetLastTaskSession exclusion list so the next task starts a fresh
session that re-reads the (now-clean) description.

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): unblock issues already poisoned by API 400 invalid_request_error

The forward-only classifier from the previous commit only tags new failures.
Issues like MUL-1918 already have multiple failed-task rows whose
failure_reason is the pre-fix default 'agent_error', and GetLastTaskSession
falls back to those legacy rows on the next claim — so deploying the
classifier alone leaves existing poisoned issues stuck (GPT-Boy review
on PR #2314).

Two complementary changes:

- Migration 079 backfills failure_reason='api_invalid_request' on every
  pre-existing 'agent_error' row whose error text matches the canonical
  Anthropic 400 invalid_request_error shape. Keeps observability
  consistent (multica issue runs / UI now report the right reason).

- GetLastTaskSession adds a defensive ILIKE clause on error text. Closes
  the deploy-window gap where the old binary could write a new
  'agent_error' row between the migration running and the new code
  taking over, and protects against future error-format variants the
  daemon classifier might miss.

Plus regression tests covering the legacy + new coexistence case GPT-Boy
flagged, and a guard rail asserting benign 'agent_error' failures
(timeouts, tool errors) still resume their session.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-09 14:39:10 +08:00
Multica Eve
ce00e05169 Add canonical PostHog core metrics events (#2302)
* Add canonical PostHog core metrics events

Co-authored-by: multica-agent <github@multica.ai>

* Address analytics review feedback

Co-authored-by: multica-agent <github@multica.ai>

* Tighten analytics review follow-ups

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Devv <devv@Devvs-Mac-mini.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-05-09 13:12:00 +08:00
Jiayuan Zhang
3b3be9d7bd feat(comments): resolve threads with collapsible bar (MUL-1895) (#2300)
* feat(comments): resolve threads with collapsible bar (MUL-1895)

Adds a Linear-style resolve action on comment thread roots. Resolved
threads collapse to a single "N resolved comments from X" bar in the
activity feed; clicking expands the thread inline (per-session, not
persisted). Replying inside a resolved thread auto-unresolves it.

Backend
- migration 069: resolved_at, resolved_by_type, resolved_by_id on comment
- sqlc ResolveComment / UnresolveComment queries (idempotent via COALESCE)
- POST/DELETE /api/comments/{id}/resolve handlers, root-only validation
- CreateComment auto-clears resolved_at when a reply lands in a resolved
  thread, publishing comment:unresolved
- comment:resolved / comment:unresolved events; CommentResponse and
  TimelineEntry both surface the new fields

Frontend
- Comment + TimelineEntry types extended; payloads typed; WS sync wired
- useResolveComment optimistic mutation with rollback
- ResolvedThreadBar component for the collapsed view
- Resolve / Unresolve menu items on root comments; Collapse strip on the
  expanded resolved card
- en + zh-Hans locale strings

Co-authored-by: multica-agent <github@multica.ai>

* fix(comments): cover agent reply path, expand-state hygiene, nested counts (MUL-1895)

Addresses three review issues from Emacs on PR #2300:

1. TaskService.createAgentComment bypasses Handler.CreateComment, so the
   auto-unresolve wired into the handler did not fire when an agent replied
   in a resolved thread (task / mention / on_comment paths). Extracted the
   logic to TaskService.AutoUnresolveThreadOnReply so both reply paths share
   it; rewired Handler.CreateComment to call the new method.

2. Resolving an already-expanded thread no longer collapses it back to the
   bar because expandedResolved still contained the id. Added
   clearResolvedExpand + handleResolveToggle wrapper so resolve / unresolve
   always wipe the session expand entry.

3. ResolvedThreadBar received only direct children, while CommentCard's
   expanded view recurses through descendants. Extracted the recursive
   walk into thread-utils.collectThreadReplies and called from both —
   counts and author lists now match.

Co-authored-by: multica-agent <github@multica.ai>

* test(comments): mock useResolveComment + add zh-Hans plural key

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-09 05:49:33 +02:00
Bohan Jiang
61ce8a8090 feat(daemon): add disk-usage CLI to surface per-task / per-workspace footprint (#2267)
* feat(daemon): add disk-usage CLI to surface per-task / per-workspace footprint

Adds `multica daemon disk-usage [--by-workspace] [--by-task] [--top N]
[--output json]`, walking the workspaces root to report task and workspace
disk consumption without requiring a running daemon. Sizing reuses the GC
artifact patternSet (basename-only) so the reported "artifact" footprint
matches what `cleanTaskArtifacts` would actually reclaim, and the walk
honors the same safety contract: never enters .git, never follows symlinks,
counts only regular files.

Refactors WorkspacesRoot resolution into an exported `ResolveWorkspacesRoot`
so the read-only CLI picks the same root the running daemon would have.

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): distinguish displayed totals from scan totals; add workspace artifact ratio

- Track scan-wide TotalTaskCount / TotalWorkspaceCount on the report so
  `--top N` no longer leaves the table footer claiming the truncated row
  count is the full count. The CLI now prints a "Showing top N of M …
  Displayed: X. Scan total: Y" line whenever truncation happens, and keeps
  the bare "Total: …" footer for the un-truncated case.
- Add ArtifactRatio (0..1) on WorkspaceDiskUsage and TotalArtifactRatio on
  the report. The workspace table renders an `ARTIFACT %` column. ratio()
  guards size=0 so empty workspaces report 0% instead of NaN%.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-08 17:14:52 +08:00
Bohan Jiang
823f124d67 feat(daemon): extend GC to chat / autopilot / quick-create tasks (#2260)
* feat(daemon): extend GC to chat / autopilot / quick-create tasks

Before this change the daemon's GC was strictly issue-centric: only tasks
with a non-empty issue_id ever wrote .gc_meta.json, and shouldCleanTaskDir
called only the issue gc-check endpoint. Chat / autopilot run / quick-create
tasks fell through to the GCOrphanTTL mtime path, which mis-killed active
chat sessions while leaving deleted ones around far longer than necessary.

Schema:
- GCMeta gains a Kind discriminator and per-kind ID fields
  (ChatSessionID / AutopilotRunID / TaskID). WriteGCMeta now takes a
  GCMeta struct so the call site classifies the task explicitly.
- ReadGCMeta defaults empty Kind to GCKindIssue, so legacy on-disk meta
  files keep flowing through the issue path with no migration required.

Server endpoints (siblings of /api/daemon/issues/{id}/gc-check, all behind
requireDaemonWorkspaceAccess for the same anti-enumeration shape):
- GET /api/daemon/chat-sessions/{id}/gc-check  -> {status, updated_at}
- GET /api/daemon/autopilot-runs/{id}/gc-check -> {status, completed_at}
- GET /api/daemon/tasks/{id}/gc-check          -> {status, completed_at}

shouldCleanTaskDir dispatches on Kind:
- chat: active is hard-skipped (no mtime fallback) so idle sessions are
  never reclaimed; archived + GCTTL cleans; 404 falls back to mtime to
  stay safe for cross-workspace tokens.
- autopilot_run: terminal (completed/failed/skipped/issue_created) +
  GCTTL cleans; running/pending skips. Uses run.completed_at as the TTL
  anchor since autopilot_run has no updated_at column.
- quick_create: terminal task status cleans immediately (workdir is not
  reused by the linked issue task, which has its own envRoot); running
  skips.

Also drops the "skipping .gc_meta.json: issue_id is empty" warn — with
the new kind dispatch, chat/autopilot/quick-create tasks now write a
proper meta file instead of triggering this log.

Refs: GC follow-up to PR #2077 (symptom fix) and #2115 (chat hard delete).
Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): chat gc-check 404 cleans immediately, no mtime gate

PR review caught that the chat 404 path was routing through
orphanByMTime, which deferred reclamation to GCOrphanTTL (72h) when
acceptance #3 calls for cleanup within one GC cycle (≤ 1h) after the
user hard-deletes a session.

Every chat_session_id we ever ask about was written by this same daemon
under its current token, so the cross-workspace probe defense the issue
path needs doesn't apply here. Drop the gate and clean on 404 directly.

Test updates:
- TestShouldCleanTaskDir_KindDispatch/chat_404 flips the locked
  expectation from gcActionSkip to gcActionClean.
- Adds TestShouldCleanTaskDir_ChatHardDeletedFreshMtime: GCOrphanTTL
  set to a year so any mtime-based path is unmistakably out, and the
  fresh-mtime workdir still cleans on the chat-404 fast path.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: multica-agent <github@multica.ai>
2026-05-08 16:12:48 +08:00
Multica Eve
eb067ff077 fix(server): aggregate task_usage into daily rollup table to cut DB load (#2256)
* fix(server): aggregate task_usage into daily rollup table to cut DB load

ListRuntimeUsage previously did a SUM(...) GROUP BY DATE(created_at), provider,
model over the raw task_usage stream once per runtime row on the runtimes
list and once per detail page load, scaling O(events) per call. This is the
hot read path responsible for sustained load on Postgres.

Switch the read path to a materialized daily rollup table maintained by a
pg_cron job:

- 072_task_usage_daily_rollup: schema for task_usage_daily +
  task_usage_rollup_state, plus rollup_task_usage_daily_window(p_from, p_to)
  (window primitive used by both cron and offline backfill, idempotent via
  ON CONFLICT DO UPDATE adding deltas) and rollup_task_usage_daily() (cron
  entry point — pg_try_advisory_lock(4242) for serialization, watermark
  advancement, 5-minute safety lag for late-visible inserts). Also adds
  idx_task_usage_created_at to help the two lazy endpoints
  (ListRuntimeUsageByAgent / GetRuntimeUsageByHour) that still hit the
  raw table.

- 073_task_usage_daily_pgcron: CREATE EXTENSION IF NOT EXISTS pg_cron in a
  DO/EXCEPTION block (mirrors the migration 032 pg_bigm pattern so envs
  without shared_preload_libraries=pg_cron skip gracefully) and schedules
  rollup_task_usage_daily() every 5 minutes when the extension is present.

- queries/runtime_usage.sql ListRuntimeUsage rewritten to read from
  task_usage_daily; sqlc regenerated. Other usage queries unchanged.

- cmd/backfill_task_usage_daily: one-shot Go command that walks
  task_usage in monthly slices through rollup_task_usage_daily_window,
  then stamps the watermark to now()-5m so the cron resumes cleanly.
  Run once after migrations have applied, before relying on the rollup.

- runtime_test.go: TestGetRuntimeUsage_BucketsByUsageTime now invokes
  rollup_task_usage_daily_window after fixture inserts so the handler
  sees the rolled-up rows. Synthetic daily rows cleaned up after each
  test.

- runtime_rollup_test.go: new tests covering aggregation correctness,
  idempotency contract of ON CONFLICT DO UPDATE, and the watermark
  advancing exactly to now()-5m via the cron entry point.

Deployment order: apply migrations → run backfill_task_usage_daily once
→ pg_cron picks up subsequent windows automatically. Today bucket may be
up to ~10 minutes stale (5 min cron + 5 min lag) by design.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: multica-agent <github@multica.ai>

* fix(server): make task_usage_daily rollup safe to overlap, replay, and correct

Addresses 4 review blockers on the original PR:

1. Cron/backfill double-count race: the rollup function is now idempotent.
   Window calls find DIRTY KEYS via task_usage.updated_at, then RECOMPUTE
   each bucket from ground truth and REPLACE the daily row (no more
   additive ON CONFLICT). Cron and backfill can now overlap safely.

2. Silent pg_cron absence: the read path is gated behind a new
   USAGE_DAILY_ROLLUP_ENABLED feature flag (default off). The raw
   task_usage scan is preserved as the fallback. Operators flip the
   flag per-environment after backfill + cron are confirmed healthy
   (task_usage_rollup_lag_seconds() helper added for monitoring).

3. UpsertTaskUsage corrections invisible to rollup: added
   task_usage.updated_at column (default now(), backfilled from
   created_at), and bumped it on conflict. Corrections now mark the
   bucket dirty and the next window call recomputes it correctly.

4. CREATE INDEX blocking writes on hot table: split into separate
   single-statement migrations using CREATE INDEX CONCURRENTLY
   (074, 075), matching the 035/067 pattern.

Also: cron.schedule() removed from migrations entirely. Migration 076
only enables the extension (gracefully on unsupported envs); the actual
schedule is a documented operator runbook step that runs AFTER backfill.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: multica-agent <github@multica.ai>

* fix(server): trigger-driven invalidation + online-safe migration for task_usage_daily

Round-2 review feedback on PR #2256:

1. Add explicit dirty-bucket queue (task_usage_daily_dirty) populated by
   triggers on agent_task_queue (UPDATE OF runtime_id, DELETE) and
   task_usage (DELETE). The rollup window function drains both this queue
   and the updated_at-based discovery, so runtime reassignment and
   issue-cascade deletes no longer leave the rollup divergent from the
   raw query.

   Triggers join via agent (not issue) to look up workspace_id, because
   when the cascade comes from issue, the issue row is already gone by
   the time atq's BEFORE DELETE fires; agent stays alive.

2. Make migration 072 online-safe: only ADD COLUMN updated_at TIMESTAMPTZ
   (nullable, no default → metadata-only ALTER, no row rewrite) and a
   separate ALTER for SET DEFAULT now() (also metadata-only). No bulk
   UPDATE on the hot task_usage table. The rollup window function's
   dirty_keys CTE handles legacy NULL rows via an OR branch, supported
   by partial index idx_task_usage_created_at_legacy.

3. Refresh stale documentation in cmd/backfill_task_usage_daily/main.go
   header to describe the current recompute/replace semantics, idempotent
   re-runnability, and the actual migration numbering (072..077).

Tests:
- TestRollupTaskUsageDaily_InvalidationOnReassign: verifies usage moves
  between runtime buckets after ReassignTasksToRuntime-style update.
- TestRollupTaskUsageDaily_InvalidationOnIssueDelete: verifies daily
  bucket is cleared after issue delete cascades through atq → task_usage.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: multica-agent <github@multica.ai>

* fix(server): close dirty-queue race + move legacy partial index to its own concurrent migration

Round-3 review feedback on PR #2256:

1. Blocker: dirty-queue invalidations could be silently lost under
   concurrency. ON CONFLICT DO NOTHING let a late trigger see the row
   already enqueued, no-op, and then the rollup drain (WHERE
   enqueued_at < p_to) would delete the original row — losing the
   late invalidation. Switched all three trigger enqueue paths to
   ON CONFLICT DO UPDATE SET enqueued_at = GREATEST(existing,
   EXCLUDED.enqueued_at), so any invalidation arriving during a
   rollup tick keeps enqueued_at > p_to (p_to = now() - 5min) and
   survives the post-tick drain.

2. High: idx_task_usage_created_at_legacy (partial index on hot
   task_usage table) was being created in the regular 077 migration
   without CONCURRENTLY. Moved to new migration 078 with
   CREATE INDEX CONCURRENTLY, matching the pattern of 074/075.
   077's down migration leaves the index alone (it is owned by 078).

3. Minor: gofmt -w on runtime_rollup_test.go and
   backfill_task_usage_daily/main.go (tabs were lost in the original
   heredoc append). PR description rewritten to describe the current
   recompute/replace + dirty queue + feature flag design and the
   072..078 migration ordering.

Tests still green: TestRollupTaskUsageDaily_* (including both new
invalidation regressions), TestGetRuntimeUsage_*, TestWorkspaceUsage_*.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: multica-agent <github@multica.ai>

* fix(server): unify workspace_id source via agent in rollup window function

Round-4 review feedback (J) on PR #2256:

M1 (must-fix): The dirty queue triggers resolved workspace_id via
`agent.workspace_id`, but the window function's `dirty_from_updates`
discovery and `recomputed` recompute join used `issue.workspace_id`.
There is no schema-level FK guaranteeing
`agent.workspace_id == issue.workspace_id`. Any divergence (future
cross-workspace task scenarios, data repairs, migration bugs) would
cause:

  - dirty queue rows with workspace_id from agent
  - recompute join filtering by workspace_id from issue
  - 0 matches in recompute → bucket erroneously hits the
    deleted_empty branch and the daily row is silently dropped
  - dirty_from_updates path attributing usage to the wrong workspace

Replaced both CTEs to JOIN agent (not issue) so trigger / discovery /
recompute share one workspace_id source. Comment in 077 explains the
constraint.

N1: Refreshed two stale references in
cmd/backfill_task_usage_daily/main.go (header now says "072..078";
stampWatermark warning now mentions migration 073, where the rollup
state table is actually introduced).

Test: New TestRollupTaskUsageDaily_WorkspaceMismatch constructs an
atq with agent.workspace_id != issue.workspace_id, asserts the bucket
lands under agent's workspace (not issue's), and re-asserts after a
runtime reassign in the foreign workspace. Acts as a canary if the
schema invariant changes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica.ai>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: multica-agent <github@multica.ai>
Co-authored-by: Devv <devv@Devvs-Mac-mini.local>
2026-05-08 15:35:21 +08:00
Multica Eve
9a3a99cef8 fix: make CLI short IDs routable
Make CLI table IDs routable across issue, autopilot, project, label, and task-run workflows. Adds scoped UUID-prefix resolution, --full-id table options, issue KEY display, safer actor/name output, and updated CLI docs/runtime prompt.
2026-05-08 14:32:03 +08:00