Commit Graph

1156 Commits

Author SHA1 Message Date
Bohan Jiang
24754f091b fix: allow framed attachment redirects (#4635)
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-26 23:35:47 +08:00
Bohan Jiang
a252f47337 fix(scheduler): advance autopilot next_run_at after each scheduled dispatch (MUL-3749) (#4618)
* fix(scheduler): advance autopilot next_run_at after each scheduled dispatch

The display-only autopilot_trigger.next_run_at column was written only on
trigger create/update and never advanced afterward, so for a recurring
schedule it froze at a past slot and the list rendered it as a 'next run'
in the past (e.g. '53m ago'). The intended AdvanceTriggerNextRun query was
dead code with zero callers.

Wire it up at the scheduler's existing post-dispatch seam (replacing the
last_fired_at-only TouchAutopilotTriggerFiredAt bump, which AdvanceTrigger-
NextRun already supersets). The advanced value is computed on the app local
clock via ComputeNextRun — the same path create/update use — so the whole
next_run_at display column is owned by one clock and stays consistent;
scheduling itself is untouched and still runs off DB time via
NextOccurrencesUTC. On a cron/timezone parse failure we fall back to the
last_fired_at-only bump.

Adds a deterministic regression test for the reported scenario (hourly
cron in America/New_York) and documents the local-clock ownership on
ComputeNextRun.

MUL-3749

Co-authored-by: multica-agent <github@multica.ai>

* fix(scheduler): floor next_run_at advance at plan_time to survive clock skew

Addresses review feedback on the next_run_at write-back (MUL-3749):

- The post-dispatch advance computed the value from time.Now() alone. The
  handler is entered only after DB time judged the plan due, so if this app
  instance's clock lags the DB clock at a period boundary, time.Now() could
  recompute the slot that just fired and next_run_at would not advance —
  the original staleness bug, at the boundary. Extract advancedNextRun,
  which anchors at max(now, plan_time) via NextOccurrenceAfterUTC so the
  written value is always strictly after the fired plan_time while still
  tracking the local clock in the normal case.
- Add scheduler-layer tests asserting the written value is strictly after
  plan_time across skew / on-slot / normal cases. The previous service-layer
  test only exercised the helper with an explicit after, not this path.
- Sync the stale ListSchedulableAutopilotTriggers comment: the scheduler
  now writes last_fired_at via AdvanceTriggerNextRun (sqlc regenerated).

MUL-3749

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-26 18:54:51 +08:00
Multica Eve
256a0a9b27 fix(squad): skip leader on reply that inherits parent @mention (MUL-3744)
Refs MUL-3744.
2026-06-26 16:53:22 +08:00
LinYushen
3692b6a862 fix(squad): inject leader briefing by task flag, not issue assignee (MUL-3730) (#4606)
* fix(squad): inject leader briefing by task flag, not issue assignee

Key squad-leader briefing injection off task.IsLeaderTask + task.SquadID
instead of issue.AssigneeType=='squad'. The old gate missed the most common
path — an @squad mention in a comment on an issue assigned to a plain agent
(MUL-3724) — so the leader booted with zero squad context and did the work
itself instead of orchestrating.

- migration 127: add agent_task_queue.squad_id (no FK) + partial index
- sqlc: CreateAgentTask stamps squad_id; CreateRetryTask inherits it
- service: thread squadID through EnqueueTaskForSquadLeader(+WithHandoff),
  enqueueMentionTask, and the rerun path; all 5 call sites pass the squad id
- daemon claim: unified injection keyed on leader-task + squad_id, with a
  defensive leader-identity re-check; quick-create block retained (it serves
  issue-less tasks and sets resp.SquadID/SquadName)
- briefing: strengthen leader Operating Protocol opening
- tests: claim-time injection (comment-mention/non-leader/null-squad),
  squad_id enqueue stamping, retry inheritance; existing fixture updated

Co-authored-by: multica-agent <github@multica.ai>

* test+docs(squad): dangling squad_id regression + clarify quick-create path

Address review nits on #4606:
- Add TestClaim_LeaderTaskWithDanglingSquadID_NoBriefing: squad hard-deleted
  after enqueue leaves task.squad_id dangling (no FK); claim still 200 and
  skips injection via the err!=nil guard. This is the load-bearing contract
  for dropping the FK.
- Rewrite the daemon.go injection comment to state quick-create does NOT use
  the is_leader_task/squad_id columns — it routes squad via the context JSON
  branch (qc.SquadID) and must not be folded into the column-based path.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: 魏和尚 <agent@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-26 16:01:33 +08:00
Multica Eve
8d0ea04fb0 feat(composio): add standalone Go SDK client (MVP) (#4603)
* feat(composio): add standalone Go SDK client (MVP)

Adds server/pkg/composio — a self-contained Go SDK for the Composio v3.1
REST API. Built on go-resty/resty v2; zero coupling to other Multica
packages so it can be vendored or extracted later without surgery.

MVP surface (just the endpoints Stage 2 needs):

- POST /connected_accounts/link        Client.CreateLink
- POST /tool_router/session            Client.CreateSession
- GET  /connected_accounts             Client.ListConnectedAccounts
- POST /connected_accounts/{id}/revoke Client.RevokeConnection
- DELETE /connected_accounts/{id}      Client.DeleteConnectedAccount
                                       (404 -> nil, idempotent)
- GET  /toolkits                       Client.ListToolkits
- GET  /toolkits/{slug}                Client.GetToolkit
- POST /tools/execute/{slug}           Client.ExecuteTool
- Webhook HMAC-SHA256 verification     composio.VerifyWebhook /
                                       VerifyHTTPRequest + ParseEvent

Other notes:

- Auth via x-api-key header (Composio v3.1 contract).
- Typed *APIError envelope with IsNotFound / IsUnauthorized /
  IsRateLimited helpers; falls back to raw body when upstream returns
  non-JSON.
- Webhook signature accepts the official "v1,<sig>" format and any
  comma-separated multi-version list; 300s replay tolerance by default,
  honors an injectable clock for tests; RFC3339 timestamps tolerated.
- README.md documents all public APIs and design choices.

Tests:

- All exercise httptest.NewServer - no real Composio calls.
- 36 tests covering happy paths, validation, 404 idempotence, error
  decoding, signature verify (good / tampered / stale / multi-version /
  bare / RFC3339 / missing headers / empty secret).
- go test ./pkg/composio/... -cover -> 82.2%, exceeds the >=80% bar.

Follow-ups (separate PRs):

- server/internal/integrations/composio - DB schema, REST handlers,
  registration_service (CSRF), dispatch hook (MUL-3720 remainder).
- Pagination iterators, retry middleware, proxy execute, triggers.

Refs: MUL-3720, MUL-3715
Co-authored-by: multica-agent <github@multica.ai>

* fix(composio): align SDK with v3.1 wire contract (PR #4603 review)

Addresses GPT-Boy's review on PR #4603:

Must-fix
- ListConnectedAccountsRequest: switch UserID/AuthConfigID singular fields
  to plural slices (UserIDs, AuthConfigIDs, ToolkitSlugs, ConnectedAccountIDs,
  Statuses). The Composio v3.1 spec ships these as `*_ids` array params;
  our singular form silently dropped the user-filter on the wire. Also
  surfaces order_by / order_direction / account_type from the same spec.
- ExecuteToolRequest: rename ToolkitVersions -> Version with json tag
  `version` (the actual v3.1 body field). Marks AllowTracing as
  deprecated per the spec.

Nits
- ListToolkitsRequest.SortBy comment: `popular | alphabetical` -> the
  real enum `usage | alphabetically`.
- Client constructor: when Options.HTTPClient is provided, use
  resty.NewWithClient(hc) so the caller's Transport, Jar, CheckRedirect,
  and Timeout all carry through — the prior code only forwarded
  Transport + Timeout despite the field comment promising the full
  *http.Client.

Tests
- TestListConnectedAccounts_QueryString now asserts plural query keys
  (user_ids, auth_config_ids, connected_account_ids, statuses) and
  explicitly guards that the legacy singular keys do not leak.
- TestExecuteTool_Success decodes the body as a raw map and asserts the
  wire key is `version` (not `toolkit_versions`).
- New TestExecuteToolRequest_VersionSerialization locks the json tag.
- New TestNewClient_HonorsInjectedHTTPClient drives a request through a
  recordingTransport and asserts the inbound *http.Client actually
  handled it.

Verification
- go test ./pkg/composio/... -cover -> 85.1% (38 tests; was 82.2% / 36).
- go vet, go build, gofmt -l all clean.

Refs: PR #4603 review, MUL-3720
Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-26 15:07:47 +08:00
Bohan Jiang
9e807efc62 feat(sidebar): per-workspace switcher dot + count unread per issue (MUL-3695) (#4591)
* feat(sidebar): mark which workspace has unread in the switcher dropdown (MUL-3695)

The aggregate avatar dot only says "some other workspace has unread". When
the user opens the workspace switcher they couldn't tell which one. Add a
per-row brand dot next to each OTHER workspace that has unread inbox items,
in the same right-edge slot as the active-workspace check (the active
workspace is excluded — its unread is the Inbox nav count — so dot and
check never collide on one row).

Reuses the existing cross-workspace summary data; no backend change. New
pure helper unreadWorkspaceIds() + unit tests, and AppSidebar dropdown
tests covering: dot only on the other unread workspace, no dot at count 0,
and never on the active workspace.

Co-authored-by: multica-agent <github@multica.ai>

* fix(inbox): count switcher unread per issue, matching the inbox dedup (MUL-3695)

The unread-summary that drives the workspace-switcher dot counted raw
unread inbox_item rows, but the inbox UI deduplicates notifications per
issue and treats an issue as read when its NEWEST non-archived item is
read. Opening an issue marks only that newest item read (markInboxRead is
per-item; only archive cascades to siblings), so older siblings stay
unread in the DB. Result: a workspace whose inbox the user sees as empty
still lit the dot (reported on bohan-personal showing a dot for Multica AI
with no unread).

Rewrite CountUnreadInboxByWorkspace to pick the newest non-archived item
per (workspace, issue-or-id group) via DISTINCT ON and count only groups
whose newest item is unread — the exact semantics of
deduplicateInboxItems(...).filter(!read) on the client. No schema/handler
change; query-only. Adds TestInboxUnreadSummaryDedupesByIssue covering the
read-newest / unread-older case and its inverse.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-26 13:28:45 +08:00
Bohan Jiang
54145ad72e feat(sidebar): dot the workspace switcher when other workspaces have unread inbox (MUL-3695) (#4577)
Adds a cross-workspace unread summary so the workspace switcher shows the
existing brand dot when a workspace OTHER than the active one has unread
inbox items. The active workspace's own unread stays on the Inbox nav
count to avoid a duplicate signal, and the dot is shared with the pending-
invitation indicator.

Backend: new GET /api/inbox/unread-summary returns per-workspace unread
counts for the user, scoped via a member join so a left workspace can't
light the dot. One account-level query instead of N per-workspace inbox
fetches.

Frontend: schema-guarded api.getInboxUnreadSummary, a single account-level
TanStack Query, and a derived "other workspace has unread" boolean in
AppSidebar (shared by web + desktop). Inbox WS events (new/read/archived/
batch) and reconnect invalidate the summary, so the dot appears and clears
in realtime even for events from a non-active workspace.

Closes multica-ai/multica#3773

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-26 12:09:15 +08:00
Multica Eve
7d0c73d11f MUL-3417: tolerate OpenClaw config file CLI mismatch
Closes MUL-3417
Fixes #4299
2026-06-25 16:58:07 +08:00
Naiyuan Qing
8e7d28bff1 fix(issues): emit project_changed so moved issues leave the old project list (MUL-3669) (#4571)
The per-project issue list rides the filtered myAll cache. Changing an
issue's project is a membership change, but the surgical patch
(patchIssueInBuckets) is filter-blind and never removes a card that no
longer matches the list's project filter — so a moved issue stayed visible
in the old project's list until a manual refetch (#4548 / MUL-3669).

Root cause: project_id was the only membership-affecting field with no
server *_changed flag. The WS handler fell back to diffing project_id
against its own cache, which breaks once onMutate has optimistically
overwritten the cached value on a local move.

- server: stamp project_changed on issue:updated (UpdateIssue + Batch),
  alongside status_changed / assignee_changed.
- events.ts: surface project_changed (optional, additive — old clients ignore).
- ws-updaters: prefer the server flag, fall back to the cache diff only when
  absent (older backend) so a new frontend on an old backend does not regress.
- mutations: onSettled invalidates myAll when project_id changed — a local
  safety net that never depends on the WS echo (update + batch).

Tests: WS flag wins over a matching cache (local-move repro), explicit false
suppresses the legacy diff, the cache-diff fallback still fires, and both
mutations invalidate myAll on a project change.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 16:54:11 +08:00
Bohan Jiang
35e5455953 fix: allow split-origin attachment previews (#4539)
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-25 16:01:09 +08:00
Willow Lopez
33967611b2 fix(cli): add daemon signal check to prevent silent PAT fallback
Closes #4204

Add a daemon signal guard in resolveToken so daemon-managed subprocesses fail closed instead of falling back to the user-global config token when agent/task env markers are missing. Also adds boundary tests for MULTICA_DAEMON_PORT, MULTICA_SERVER_URL, explicit MULTICA_TOKEN priority, and normal config fallback.
2026-06-25 15:25:30 +08:00
苗大
93ed3dc131 fix(lark): use app URL for web links (MUL-3679)
Point Lark binding and issue-created links at the web app URL (MULTICA_APP_URL)
instead of the backend public URL, so users on split-domain / self-host
deployments get working links. appURLFromEnv falls back to FRONTEND_ORIGIN,
matching the backend's existing app-URL resolution.
2026-06-25 15:14:09 +08:00
Multica Eve
aa4478af52 fix(codex): unhang cleanup after stdout scanner overflow (#4520) (#4563)
When codex emits a single stdout line larger than the daemon's 10 MB
bufio.Scanner cap (typical trigger: thread/resume on a long-history
session), the reader goroutine returns scanner.Err()="token too long",
markProcessExited fails the in-flight RPC, and the lifecycle goroutine
enters its failure path. That path calls drainAndWait() — stdin.Close()
+ cmd.Wait() — before sending the failed Result. But cmd.Wait() never
returns: codex is alive and blocked writing the rest of the oversized
line into a stdout pipe nobody is reading, so it never reaches its
stdin read syscall and never sees the EOF. The lifecycle goroutine
therefore never sends Result{failed} to its caller, the outer daemon
blocks on the result channel, and the existing PriorSessionID-with-
empty-SessionID fallback never fires — the task is permanently
stalled and codex (Node wrapper + native Rust app-server) leaks until
the OS reaps them.

The cancel() that would have unblocked things via cmd.WaitDelay's
SIGKILL was registered as a defer AFTER drainAndWait, so LIFO defer
order put cancel last — drainAndWait blocks first, cancel never runs.

Fix:

1. drainAndWait now runs the existing graceful-then-cancel pattern
   itself, in two bounded phases. Phase 1 waits for readerDone (capped
   by codexGracefulShutdownTimeout, so we still give codex its OTEL
   flush window on clean exits); on timeout it cancels the runCtx so
   cmd.Cancel kills the tree and the reader unblocks. Phase 2 bounds
   cmd.Wait() the same way for the scanner-overflow case, where
   readerDone closed early but the process is still alive on a full
   stdout pipe. The success-path cleanup that previously duplicated the
   graceful-cancel pattern around readerDone collapses to a single
   drainAndWait() call.

2. cmd.Cancel is set to send SIGKILL to the whole codex process group
   (Setpgid via configureProcessGroup, signalProcessGroup on cancel)
   instead of just the leader. This addresses YOMXXX's
   orphaned-Codex-child concern: the Node wrapper and the native
   app-server it spawns now both die when cleanup forces the kill,
   rather than the native binary leaking as an orphan reparented to
   init. configureProcessGroup is a no-op on Windows.

3. codexGracefulShutdownTimeoutNanos atomic.Int64 mirrors
   opencodeTerminateGraceNanos so the regression test can shrink the
   grace window from 10 s to 500 ms. Production code is unchanged
   (default 10 s).

Outer daemon (daemon.go) already retries with a fresh session when
result.Status == "failed" && PriorSessionID != "" && result.SessionID
== ""; the failed Result now actually reaches it, so the recovery
fires on its own without any daemon-side change.

Tests:

- New regression TestCodexExecuteCleansUpWhenScannerOverflowsOnResume
  spawns a fake codex that emits an 11 MB single-line thread/resume
  response (trips the scanner cap) and then sleeps without re-reading
  stdin. With the original drainAndWait body it blocks at the 10 s
  executeFakeCodex deadline ("timeout waiting for result") — verified
  by temporarily reverting just the helper body — and with the fix it
  completes in ~1.3 s with Result.Status="failed",
  Result.SessionID="" so the outer fallback can fire.
- Full codex test suite, full agent package, daemon + execenv +
  repocache packages, go build ./..., and go vet on agent/daemon all
  pass.

Out of scope (deferred to follow-up per YOMXXX): bumping the 10 MB
bufio.Scanner cap on codex / claude / copilot / cursor / hermes /
kimi / kiro / codebuddy / antigravity / qoder / openclaw / opencode
(pi already sits at 32 MB), and the shared bounded JSON-RPC line
reader that would eliminate the single-line-overflow risk class
entirely. Buffer size alone is not the fix — recovery behaviour is.

Refs: GH#4520

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-25 14:46:20 +08:00
Bohan Jiang
cb6616f530 feat(slack): Socket Mode channel.Channel adapter (MUL-3516) (#4523)
* feat(slack): Socket Mode channel.Channel adapter (MUL-3516)

First slice of the Slack adapter: implements channel.Channel (Type/Connect/Disconnect/Send/Capabilities) over Slack Socket Mode, normalizes inbound events to channel.InboundMessage (DM, channel @mention, thread reply; bot-loop + edit/delete guards), decodes the per-installation config/secret blob, and registers the Factory under TypeSlack. No engine, core, or channel_* schema change. Unit-tested (translation, capabilities, config decode, chunking, Send via httptest). Resolvers + engine wiring + Block Kit binding replier follow.

Co-authored-by: multica-agent <github@multica.ai>

* fix(slack): address adapter review (MUL-3516)

- Propagate InboundHandler errors through dispatchEventsAPI/handleSocketEvent to Connect so an infra failure tears down the connection for Supervisor reconnect/backoff instead of being silently swallowed (ACK still happens first).
- Capabilities: declare only CapText | CapThreadReply; drop CapRichCard/CapAttachment/CapMessageEdit until those Send paths are wired.
- slackChatType: map mpim (multi-party DM) to group, not p2p, so the 'must address bot' filter applies; only 1:1 im is p2p.
- Document the group-addressing decision: explicit @bot mention required in groups; mention-free thread continuation deferred to the session-aware layer.
- Tests: handler-error propagation, slackChatType table, mpim-requires-mention, capabilities negative assertions.

Co-authored-by: multica-agent <github@multica.ai>

* refactor(channel): shared channel-agnostic ChatSession service (MUL-3516)

Extract the session/append//issue machinery — currently locked inside the Feishu-pinned lark.chatSessionService — into a shared engine.ChatSession parameterized by channel_type + session titles, so every IM adapter reuses it instead of re-implementing it. Logic is verbatim (find-or-create session+binding with unique-violation race re-read; append+touch+reply-target+in-tx dedup Mark; /issue parse with bare-command previous-message fallback) but channel-neutral: command-parse source is supplied by the adapter (enrichment is platform-specific). Backed by a narrow SessionQueries interface so it is unit-tested with an in-memory fake (no DB). /issue parser moved to engine.ParseIssueCommand. Next: migrate Feishu onto it and wire Slack's ResolverSet, removing the lark duplicate.

Co-authored-by: multica-agent <github@multica.ai>

* fix(channel): decouple session binding key from outbound target (MUL-3516)

Addresses Elon's round-2 review. engine.ChatSession.EnsureSession previously keyed the binding on a raw chat id (EnsureSessionInput.ChatID), so a resolver wiring Slack straight through would collapse every @bot thread in one channel into a single chat_session and overwrite last_thread_id. Make the API un-misusable:

- EnsureSessionInput.ChatID -> BindingKey: the explicit session-isolation key (Feishu: chat id; Slack DM: channel id; Slack channel: channel id + thread root), documented so a raw threaded-platform chat id is never passed straight through.
- Add EnsureSessionInput.BindingConfig (opaque) persisted on the binding's config column, so the real outbound channel/thread is preserved when BindingKey is composite — outbound routing stays separate from the isolation key.
- channel.sql CreateChannelChatSessionBinding now writes config (additive, uses the existing NOT NULL column; lark caller passes '{}', no schema change, no Feishu regression).
- Tests: TestEnsureSession_ThreadRootIsolation (two thread roots in one channel -> two sessions; same root reuses) and TestEnsureSession_StoresBindingConfig.

No production wiring change yet (per review, the not-yet-wired shared service is an accepted preparatory state); this makes the API correct before Feishu/Slack are migrated onto it.

Co-authored-by: multica-agent <github@multica.ai>

* feat(slack): Slack ResolverSet with thread-root session isolation (MUL-3516)

Wires Slack into the channel-agnostic engine.Router via a ResolverSet built on the generic channel_* queries (installation route by team_id, identity + workspace-membership recheck, two-phase dedup, audit) plus the shared engine.ChatSession. No new query, no schema change.

slackSessionRouting is the per-message isolation rule (Elon round-2 / Niko round-3): a DM is one session per channel; a channel/group message is isolated by thread root (key = channel:threadRoot, root = inbound thread_ts or the message ts for a top-level @mention), so two @bot threads in one channel are two sessions. The real channel id rides in BindingConfig for outbound; the reply thread is returned separately. Tests cover DM/channel/thread routing, config, and that distinct thread roots isolate while a same-thread follow-up reuses its key.

Not yet wired into router.go (still a preparatory commit, per review); Feishu migration onto the shared service, router/config wiring, and the Slack outbound path follow.

Co-authored-by: multica-agent <github@multica.ai>

* feat(slack): Markdown->mrkdwn outbound formatting (MUL-3516)

Slack renders mrkdwn, not Markdown, so an unconverted agent reply shows literal ** , ## and [text](url). Add formatMrkdwn — a faithful Go port of Hermes Agent's slack format_message (MIT) — and apply it in slackChannel.Send before chunking/posting. Protects fenced+inline code, converted links, and existing Slack entities behind placeholders; converts headers/bold/italic/strike/links; escapes control chars. Unit tests cover each construct plus fenced-code protection and a link nested in bold.

Co-authored-by: multica-agent <github@multica.ai>

* docs(slack): preserve Hermes MIT notice for ported mrkdwn converter (MUL-3516)

Addresses Niko's review. formatMrkdwn is a substantial port of Hermes Agent's slack format_message; MIT requires preserving the copyright + permission notice. Add the full Hermes MIT copyright/permission notice + source URL as a header on mrkdwn.go (no repo-level third-party notice file exists, and the header cannot get separated from the ported code). Also add the suggested Send-layer regression test (TestSend_AppliesMrkdwn) that pins the wiring: slackChannel.Send converts Markdown to mrkdwn before posting.

Co-authored-by: multica-agent <github@multica.ai>

* refactor(lark): migrate Feishu onto shared engine.ChatSession, drop duplicate (MUL-3516)

Completes 'every IM reuses one shared session service' and removes the dual-path the reviewers flagged as temporary. Feishu's ResolverSet now drives the channel-agnostic engine.ChatSession (channel_type=feishu, Lark session titles preserved) instead of the Feishu-specific lark.chatSessionService, which is deleted. Behavior is unchanged: engine.ChatSession is the verbatim port of the old logic and is unit-tested; the new Feishu binder param-mapping (BindingKey=chat id, CommandText=un-enriched CommandBody from Raw) is covered by feishu_resolvers_test.go.

- Delete chat_service.go (chatSessionService + helpers) and issue_command.go/_test.go (parser now engine.ParseIssueCommand). Relocate the shared TxStarter interface to tx.go (still used by binding-token + registration services).
- chat.go keeps only the AuditLogger seam; remove the now-dead ChatSessionService / EnsureChatSessionParams / AppendUserMessageParams / AppendResult / IssueCommand types.
- router.go constructs engine.NewChatSession for Feishu; inbound_enricher_test + doc.go updated.

make-test parity: go build ./..., go vet, gofmt, and go test ./internal/integrations/{lark,channel/...,slack} all pass (full Feishu suite green).

Co-authored-by: multica-agent <github@multica.ai>

* feat(slack): wire Slack adapter + ResolverSet + outbound into router (MUL-3516)

Activates the full Slack pipeline, gated by MULTICA_SLACK_SECRET_KEY (the bot/app-token decryption key). When unset the block is skipped, so existing deployments are unaffected and Feishu is untouched.

- router.go registers slack.RegisterSlack (Socket Mode connect/send Factory) + channelRouter.Register(TypeSlack, NewSlackResolverSet) (inbound pipeline) + slack.NewOutbound(...).Register(bus) (outbound).
- New slack/outbound.go: an EventChatDone subscriber mirroring the Feishu Patcher. It finds the Slack chat binding for the finished session, recovers the real channel from the binding config (the channel_chat_id may be a composite thread-isolation key) + the reply thread from last_thread_id, and posts via slackChannel.Send (reusing formatMrkdwn / chunking / threading). Sessions with no Slack binding are ignored, so it coexists with the Feishu Patcher on the shared bus.
- Tests: posts to the bound channel/thread with the real channel id; ignores non-Slack sessions, empty completions, revoked installations, and non-chat events.

Slack now shares engine.ChatSession, channel_* tables, IssueService and TaskService with Feishu. Remaining: config-driven installation provisioning (an operator currently creates the channel_type='slack' row; the config block shape — which workspace/agent — is a product decision) and a live end-to-end smoke. go build ./..., go vet, gofmt, and go test ./internal/integrations/{slack,channel/...,lark} all pass.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-25 14:29:00 +08:00
YYClaw
f4dba5d6b0 fix(cli): setup self-host respects existing config and shows URL changes (#4537)
Honor an already-configured self-host server_url/app_url when re-running `multica setup self-host` without flags, instead of silently resetting to http://localhost:8080. The existing config is added as a fallback step in the URL resolution chain (flag → env → existing config → localhost), and the overwrite confirmation now shows URL changes as `old -> new`.

Closes #4536

MUL-3660
2026-06-25 13:13:46 +08:00
Multica Eve
b71d9d0ab9 MUL-3674: Preserve Kiro goal completion on close error (#4560)
* fix(daemon): preserve Kiro goal completion on close error

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): require completed Kiro goal marker

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-25 12:43:04 +08:00
Ryan Yu
d9bf4b85c9 test(cli): cover additional subcommands (#4555) 2026-06-25 11:25:00 +08:00
Bohan Jiang
0d3b49f2c7 fix(webhook): use unique ZSET member in Redis rate limiter (#4546)
The sliding-window Lua script used the nanosecond timestamp as both the
ZSET score and member. Two requests landing in the same nanosecond
collided on an identical member, so ZADD updated in place instead of
inserting and the window under-counted — letting requests through past
the limit. This surfaced as a flaky CI failure in
TestRedisWebhookIPRateLimiter_HasSeparateBudgetFromTokenLimiter.

Keep the timestamp as the score (so ZREMRANGEBYSCORE trimming is
unchanged) and use a per-request UUID as the member so each admitted
request is counted exactly once.

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-25 01:29:20 +08:00
Bohan Jiang
a03055b07d fix(agent): terminate opencode process group before closing stdout (#4533) (#4541)
On cancellation/timeout the opencode backend closed the stdout read end
immediately, leaving the child writing into a closed pipe. Every write then
returns EPIPE and, per anomalyco/opencode#33653, can spin an orphaned process
at 100% CPU — surfacing as high idle CPU after a cancelled task or daemon
restart (MUL-3655).

Cleanup now runs opencode in its own process group and, on cancel, drives a
graceful group-wide SIGTERM → grace → SIGKILL, closing the stdout pipe only as
a last-resort unblock once the tree has been signalled (SIGKILL is uncatchable,
so no member can write again — no EPIPE window). The group signal also reaps
tool subprocesses opencode spawned instead of orphaning them. WaitDelay remains
the hard backstop.

Adds unix tests covering the graceful path and the SIGTERM-ignored → SIGKILL
escalation, asserting the whole process group is reaped and the run never
deadlocks on the scanner. Windows behaviour is unchanged (no process groups).

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-25 00:41:25 +08:00
Ryan Yu
bea028784a fix(labels): reject control characters in label names (#4531) 2026-06-25 00:19:13 +08:00
Bohan Jiang
dfa384ffa2 fix(daemon): resolve skill bundles per-skill with size-scaled timeout (#4505) (#4530)
* fix(daemon): resolve skill bundles per-skill with size-scaled timeout (MUL-3650, #4505)

Cold-start skill resolution downloaded the agent's entire bundle in one
atomic request bounded by the shared 30s control-plane http.Client timeout.
On a slow/jittery link a large bundle (15+ skills) could not finish the body
read in 30s, and because the cache was only written after the whole batch
succeeded, nothing was persisted on failure — so every dispatch re-downloaded
the full bundle and timed out again, never converging.

Resolve each missing bundle in its own request and cache it the moment it
arrives:

- daemon: per-skill resolve with a deadline scaled to the bundle's declared
  size (floor 30s, cap 5m, ~50KB/s floor throughput) instead of the fixed
  control-plane timeout; each success is persisted independently, so a
  dispatch that fails on one skill still caches the rest and the next dispatch
  only re-fetches what is missing.
- client: dedicated bundleClient with no fixed Timeout (deadline comes from
  ctx), a singular ResolveSkillBundle, and a short transient-retry schedule.

Tests cover the size-scaled timeout and the cross-dispatch incremental
caching / convergence (a failed skill does not discard its siblings, and
cached skills are not re-fetched).

Co-authored-by: multica-agent <github@multica.ai>

* fix(daemon): accept server-side skill updates in per-skill resolve (MUL-3650)

Address review on #4530: resolveSkillBundle validated the returned bundle
against the claim-time ref, which pinned it to the requested hash. The resolve
endpoint intentionally serves the agent's current bundle and hash when the
requested hash is stale (the skill can be edited between claim and prepare), so
a legitimate updated bundle was rejected as invalid and the task failed.

Confirm only that the server returned the requested skill (source/id), then
validate self-consistency against a ref derived from the returned bundle and
cache it under its own hash — matching the documented endpoint contract. Adds a
regression test covering a stale-hash request answered with an updated bundle.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-24 19:00:13 +08:00
Bohan Jiang
5e824a94ae chore(channel): remove the one-time MULTICA_LARK_HUB_DISABLED cutover switch (#4527)
The lark_*->channel_* cutover (MUL-3515) is deployed to prod, and the
MULTICA_LARK_HUB_DISABLED park-switch was a one-time scaffold for that
rollout — the end state intentionally does not use it (prod never set the
env). Remove the env-gated branch from cmd/server/main.go so the channel
supervisor always starts when built; its existing nil-guard and shutdown
join are unchanged. Trim migration 124's now-obsolete switch runbook to a
short historical note (comment-only; 124 is already applied, so this does
not re-run).

Refs MUL-3515

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-24 18:18:06 +08:00
J
34bd115808 test(execenv): fix stale test name reference in comment (#3028)
Co-authored-by: multica-agent <github@multica.ai>
2026-06-24 18:06:24 +08:00
jockibeard
3adfaf4285 fix(execenv): support OpenClaw 2026.6.x agents schema (#3028) (#4319)
Adapts OpenClaw execenv prep to the 2026.6.x agents schema (agents.list config path removed; agents live in a sqlite registry). Case-insensitive key-missing guard + registry fallback on read, version-aware emission on write so per-task workspace pinning keeps working.

Closes #3028

MUL-3643
2026-06-24 18:05:38 +08:00
Bohan Jiang
c3d8529ba5 fix(channel): eliminate WaitGroup Add/Wait data race in engine.Supervisor (MUL-3620) (#4517)
The race detector caught a flaky failure on main (passes on retry):
Supervisor.startSupervisor does s.wg.Add(1) under s.mu, while Supervisor.Wait
calls s.wg.Wait() with no lock. Calling WaitGroup.Add concurrently with
WaitGroup.Wait is a data race and undefined per the WaitGroup contract — so it
only trips occasionally (it passed locally and in PR CI).

Wait now blocks on stopChan (closed by Run's defer when Run returns) before
calling wg.Wait(). Run is the sole caller of startSupervisor, so once Run has
returned no further Add can happen and wg.Wait is race-free. WaitWithTimeout
inherits the fix (it calls Wait), and its timer still bounds shutdown.

This latent race existed in the original lark.Hub.Wait too; fixed properly in
the generalized Supervisor.

Verified: go test -race -count=300 on the flagged test and -count=8 on the
whole engine package, all clean; no deadlock from the stopChan gate (every
caller pairs Wait with a started Run + cancelled ctx).

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-24 17:21:03 +08:00
Bohan Jiang
9db80a0940 fix(daemon): forbid mid-run progress comments in runtime brief (#4516)
A run could post running progress/plan narration as issue comments, and a
review run surfaced its in-progress narration as the result instead of a
conclusion (MUL-3605).

Add one rule to the Output section's issue-task branch, in both the
legacy and slim briefs: post exactly one comment per run — the final
result, before the turn exits — and keep plans/progress in the agent's
own reasoning. The pre-existing "Final results MUST be delivered … a task
that finishes without a result comment is invisible" line already makes
the comment mandatory, and "state the outcome, not the process" already
rules out progress dumps, so no second rule is added.

Chat / quick-create / autopilot keep their own delivery channels. Adds a
regression test across both brief paths.

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-24 17:20:19 +08:00
Bohan Jiang
3e21e58df0 feat(channel): channel-agnostic engine (Supervisor + Router); Feishu as channel.Channel (MUL-3620) (#4512)
* feat(channel): add channel-agnostic engine Supervisor (MUL-3620)

Stage-1 (MUL-3515) shipped the channel abstraction but nothing drove it.
Add the generic engine that does:

- channel.InboundHandler + Config.Handler: the single shared inbound entry
  the engine injects into every adapter (Hermes set_message_handler model).
- channel.Channel.Connect now blocks for the connection lifetime (doc), so
  the supervisor can tie lease renewal to connection liveness.
- new package channel/engine: Supervisor, generalized out of lark.Hub. It
  enumerates active installations across ALL channel types (no hard-coded
  feishu), fences each behind the WS lease CAS, builds the platform Channel
  via channel.Registry, drives Connect/Disconnect with backoff+jitter, and
  restarts on credential rotation. Knows nothing about any platform.

channel.Channel is now driven by an engine; integrations/channel has an
external consumer. Feishu adapter + boot cutover follow next.

Tests: supervisor_test.go covers lease CAS, reclaim, reap-on-revoke,
rotation restart + token fencing, backoff on build error, lease-loss
teardown, bounded release, shutdown timeout. Race-clean.

Co-authored-by: multica-agent <github@multica.ai>

* feat(lark): drive Feishu through the channel engine; remove lark.Hub (MUL-3620)

Refactor Feishu into the first channel.Channel and cut boot over to the
channel-agnostic engine.Supervisor, removing the Feishu-only Hub.

- feishuChannel implements channel.Channel: Connect runs the existing
  WS long-conn connector for one installation; Send posts a text reply
  via the Lark IM API; Capabilities declares Feishu's feature set.
  RegisterFeishu wires it to channel.TypeFeishu — adding a platform is
  now 'register a Factory', no engine edit.
- FeishuRuntime extracts the former Hub.handleEvent / scheduleReply:
  runs the Dispatcher and drives the detached typing indicator +
  OutcomeReplier off the connector ACK path. main.go drains it on
  shutdown after the supervisor stops delivering events.
- channelInstallationStore (engine.InstallationStore) enumerates active
  installations across ALL channel types via the new de-hardcoded query
  ListAllActiveChannelInstallations; the Supervisor routes each row to
  its registered Factory by channel_type. Generic per-row fingerprint
  replaces the feishu-specific one.
- boot: engine.Supervisor replaces lark.Hub.Run; MULTICA_LARK_HUB_DISABLED
  keeps its name for runbook compatibility.
- delete hub.go / hub_pgx.go / hub_test.go; relocate the connector
  contract (EventConnector/EventEmitter), uuidString, and the reply-path
  tests (-> feishu_runtime_test.go) so coverage is preserved.

No channel_* schema change. Feishu behaviour unchanged; lark + channel +
engine tests green under -race; go build/vet ./... clean.

Remaining (follow-up): lift the Dispatcher pipeline into a channel-
agnostic engine.Router over channel.InboundMessage + resolver interfaces,
so the inbound core stops being Lark-shaped and adding a channel needs
zero core edits (validated by Slack, MUL-3516).

Co-authored-by: multica-agent <github@multica.ai>

* feat(channel): add channel-agnostic engine.Router (inbound pipeline) (MUL-3620)

Generalize lark.Dispatcher's inbound pipeline into engine.Router: the single
shared channel.InboundHandler the Supervisor injects into every Channel. It
routes by ChannelType to a registered ResolverSet and runs the same ordered
pipeline for every platform (install route -> two-phase dedup -> group @bot
filter -> identity+membership -> ensure session -> append+mark -> /issue ->
debounced run), then drives the detached OutboundReplier + typing indicator.

Platform specifics live behind resolver interfaces (InstallationResolver,
IdentityResolver, Deduper, SessionBinder, Auditor, OutboundReplier,
TypingNotifier) + shared services (IssueCreator/TaskEnqueuer/SessionReader).
Adding a platform is 'register a ResolverSet', not 'edit the Router'. Outcome
/ DropReason values match the legacy lark ones 1:1.

Additive: lark.Dispatcher untouched and still wired; the feishu ResolverSet,
the cutover, and the old-path removal land next. channel.InboundMessage gains
ForceFresh (the normalized /fresh affordance). Batcher moved into engine.

router_test.go covers the pipeline invariants (routing, dedup finalize
states, group filter, identity, membership, ensure/append, /issue, debounce,
flush offline, force-fresh, drain) with generic fakes; race-clean.

Co-authored-by: multica-agent <github@multica.ai>

* feat(lark): cut Feishu over to engine.Router; remove lark.Dispatcher; core no longer Lark-shaped (MUL-3620)

Wire the channel-agnostic engine.Router (added in the prior commit) as the
shared inbound handler and refactor Feishu into a ResolverSet, completing the
generic-engine cutover. The inbound core (engine.Router) now contains zero
platform specifics.

- Feishu ResolverSet (feishu_resolvers.go): InstallationResolver,
  IdentityResolver, Deduper, SessionBinder, Auditor, OutboundReplier,
  TypingNotifier — each backed by the existing ChannelStore / ChatSessionService
  / OutcomeReplier / typing indicator, translating at the channel.InboundMessage
  boundary (platform fields read from Raw). origin_type stays 'lark_chat'.
- feishuChannel now produces a normalized channel.InboundMessage and hands it to
  the engine handler via channel.Config.Handler; the old Raw round-trip through
  lark.Dispatcher is gone.
- Remove lark.Dispatcher, FeishuRuntime, and lark's pending_batcher (the engine
  owns the pipeline + batcher now); their behavioural coverage moved to
  engine.Router tests. Surviving native types (InboundMessage / Outcome /
  DispatchResult) relocated to feishu_types.go.

elon review nits addressed:
- The channel engine (Registry + Router + Supervisor) is now built
  UNCONDITIONALLY, outside the MULTICA_LARK_SECRET_KEY gate, so a non-Lark
  deployment runs it; Feishu registers its Factory + ResolverSet only when its
  key is present.
- channel.Config.Raw is now genuinely the platform config JSONB
  (channel_installation.config): the feishu factory builds a credentials-only
  Installation from it, and the workspace/agent identity is resolved per message
  by the Router — no full-db-row marshaling.
- feishuChannel gains direct unit tests: factory config decode, Send text +
  reply-target mapping, Capabilities, inbound normalization + Raw round-trip,
  msg-type + result mapping.

No channel_* schema change. go build/vet ./... clean; channel + engine + lark
green under -race. Feishu behaviour preserved (pipeline logic lifted verbatim,
only generalized).

Co-authored-by: multica-agent <github@multica.ai>

* docs(channel): fix stale comments on the channel engine boot (MUL-3620)

Address Elon's review nit: three comments still described the pre-cutover
behavior.

- handler.go: ChannelSupervisor is built UNCONDITIONALLY now, not nil when
  MULTICA_LARK_SECRET_KEY is unset.
- main.go: same — the supervisor always exists; only MULTICA_LARK_HUB_DISABLED
  parks it.
- router.go: with no platform registered the store still lists active rows;
  Registry.Build returns ErrUnknownType and the supervisor backs off (it does
  not 'find no installations').

Comment-only; no behavior change.

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-24 17:01:33 +08:00
YzKing
4d7111d396 MUL-3605: Fix serialization of agent task claims by capacity
* fix(server): serialize agent task claims by capacity

* test(server): clean up claim race fixture

---------

Co-authored-by: Yanziqing25 <1519319045@qq.com>
2026-06-24 17:00:03 +08:00
beast
20eecfb093 fix(projects): honor repo resource checkout refs (MUL-3593) (#4470) 2026-06-24 16:25:17 +08:00
Multica Eve
1ac3a03e5d MUL-3618: dispatch daemon feature flag snapshots (#4509)
* MUL-3618: dispatch daemon feature flag snapshots

Co-authored-by: multica-agent <github@multica.ai>

* MUL-3618: narrow daemon flag snapshots to process scope

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-24 16:19:30 +08:00
Bohan Jiang
79c9158097 fix(issue): order sub-issues by number ASC instead of position (#4511)
ListChildIssues and ListChildrenByParents ordered by
`position ASC, created_at DESC`. position is assigned by
NextTopPosition as MIN(position)-1 scoped to (workspace, status),
not relative to siblings, so a parent's children interleave
unpredictably across creation batches and statuses.

Order by `number` (a per-workspace monotonic counter) instead.
ASC keeps sub-issues in stable creation order (oldest first), so a
parent's plan reads top-to-bottom in the order tasks were added.

Adds ordering tests covering both queries with scrambled positions
and mixed statuses.

Closes #4232
MUL-3362

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-24 16:06:14 +08:00
Bohan Jiang
cb7cc82ecb fix: allow same-origin attachment previews (#4504)
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-24 15:37:38 +08:00
Bohan Jiang
76c58a4ee8 MUL-3617: remove Gemini CLI runtime (#4503)
* fix: remove gemini cli runtime

Co-authored-by: multica-agent <github@multica.ai>

* fix: skip unsupported custom runtime profiles

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-24 15:15:42 +08:00
Willow Lopez
af34b8f83a feat(lark): add proxy support for WebSocket connections (#4166)
Add ProxyURL field to GorillaDialer so deployments behind a corporate
proxy can route Lark WebSocket connections through an HTTP CONNECT proxy.

- GorillaDialer.ProxyURL: optional proxy URL parsed and applied to the
  underlying gorilla/websocket dialer before each DialContext call.
  Empty value preserves the default ProxyFromEnvironment behaviour.
- Router reads MULTICA_LARK_WS_PROXY_URL env var and sets it on the
  production dialer.
- Three new unit tests cover invalid URL, proxy-applied, and empty-URL
  default paths.

Closes #4032

Co-authored-by: multica-agent <github@multica.ai>
2026-06-24 15:12:53 +08:00
Multica Eve
8ad673fdb7 MUL-3560: gate slim runtime brief behind runtime_brief_slim feature flag (#4449)
The MUL-3560 slim runtime brief — kind-driven dispatcher, per-section
gating, prose compression for ~7k chars saved on the typical
comment-triggered task — now ships behind the `runtime_brief_slim`
feature flag wired via the framework-level service from MUL-3615.

Default: OFF in every environment (production stays on the legacy
brief that has shipped for ~2 years). Staging opts in via the YAML
rule set; ops can override per-process with `FF_RUNTIME_BRIEF_SLIM=true`.
Production is held back until staging has burned in long enough that
we are confident the slim brief does not regress agent behaviour.

Architecture (one toggle point, two code paths, both fully tested):

  buildMetaSkillContent (runtime_config.go)
      │
      └─ useSlimBrief() → false (default)
      │     → fall through to the legacy verbose body that ships on
      │       main today — byte-for-byte unchanged, no migration risk
      │
      └─ useSlimBrief() → true
            → buildMetaSkillContentSlim (runtime_config_sections.go)
              → classifyTask → 5-way kind switch → per-section writers

BuildCommentReplyInstructions takes the same gate, so the per-turn
comment prompt and the runtime brief stay in sync on which template
they emit.

What's in this PR:

- runtime_config_flag.go (new): package-scope `runtimeFlags` atomic
  pointer + `SetFeatureFlags` setter + `useSlimBrief` toggle point.
  Nil-safe: a daemon that forgets to wire the service falls back to
  legacy, no panic.
- runtime_config_kind.go (new): `taskKind` enum + `classifyTask` +
  `hasIssueContext` predicate. Used only by the slim path.
- runtime_config_sections.go (new): the slim brief itself —
  `buildMetaSkillContentSlim` + per-section `writeXxx` helpers
  + `writeAvailableCommandsQuickCreate` minimal variant +
  `writeBackgroundTaskSafetySlim` compressed safety section. The
  Section × Kind matrix is documented inline on
  `buildMetaSkillContentSlim` and the test below checks the
  dispatcher does not diverge from the spec.
- reply_instructions.go: `BuildCommentReplyInstructions` gains a
  short slim-or-legacy prelude; new `buildCommentReplyInstructionsSlim`
  is the compressed cookbook (defers the shell-hazard rationale to
  `## Comment Formatting`).
- runtime_config.go: `buildMetaSkillContent` gains a 2-line
  dispatcher at the top; the legacy body is otherwise untouched.
- runtime_config_kind_test.go (new): canaries for both paths.
  - TestClassifyTask: 5 kinds + 3 tiebreak cases.
  - TestTaskKindHasIssueContext: predicate semantics.
  - TestSlimFlagOffUsesLegacy: nil flag service → legacy path
    (renders "Get full issue details.", a legacy-only substring).
  - TestSlimFlagOnUsesSlim: flag on → slim path (renders "full
    issue.", a slim-only one-liner) AND must NOT render legacy
    "Get full issue details.".
  - TestBuildMetaSkillContentSlimKindMatrix: locks the per-kind
    section set; heading match is line-anchored so inline references
    don't trip absence assertions.
  - TestSlimQuickCreateAvailableCommands: locks the minimal-variant
    content for quick-create (issue create present, every other
    Core command absent).
  - TestSlimBriefIsSubstantiallyShorter: ≥ 30% reduction guard so
    a future change can't accidentally re-bloat the slim path back
    to legacy levels.
- cmd/server/main.go: now calls `execenv.SetFeatureFlags(flags)`
  immediately after constructing the feature flag service.

Measured impact (slim vs legacy, claude provider, realistic fixture
with 2 repos + 2 skills + member initiator):

    legacy = 19567 chars
    slim   = 11868 chars    Δ = -7699  (-39.3%)

Verification:

- go vet ./internal/daemon/... ./cmd/server/...                  ok
- go test ./internal/daemon/...                                  ok
- go test ./pkg/featureflag/...                                  ok
- TestSlimBriefIsSubstantiallyShorter logs the 39.3% ratio
- TestSlimFlagOffUsesLegacy + TestSlimFlagOnUsesSlim pass both
  directions, so the dispatcher is locked in code.

The pre-existing `internal/handler` test failures
(TestLeaveWorkspace_RevokesOwnRuntimes,
TestDeleteMember_CancelsTasksFromAgentReassignment,
TestDeleteMember_NoRuntimes_DeletesMember) reproduce on plain
`origin/main` with the same `relation "channel_user_binding" does
not exist` SQL error — they are a missing-migration bug from the
recent channels foundation PR (ce28d0aa0), not anything this PR
touched.

Rollout plan:

1. Merge this PR. Production daemons keep emitting the legacy brief
   (flag default false).
2. Add a YAML rule to staging's
   `MULTICA_FEATURE_FLAGS_FILE`:

       runtime_brief_slim:
         default: true

   Staging daemons start emitting the slim brief on next restart.
3. Watch `agent prompt prepared` logs + agent behaviour for 7 days.
4. If staging is clean, flip the prod YAML to `default: true`.
   Legacy code path stays in the binary as a kill-switch
   (`FF_RUNTIME_BRIEF_SLIM=false` to revert without a deploy).
5. After ~30 days clean in prod, follow up with a PR that deletes
   the legacy body and the flag — same pattern as docs/feature-flags.md
   recommends ("plan the death of the flag at birth").

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-24 14:23:17 +08:00
Wilson-G
b92e4a53fb DH-106 为飞书接入补上 /new 会话指令 (MUL-3503) (#4396)
Lark/飞书入站消息新增 /new 首行指令,解析为 force_fresh_session,复用既有 daemon 会话续接门控。

Co-authored-by: Wilson-G <Wilson-G@users.noreply.github.com>
2026-06-24 14:16:22 +08:00
Multica Eve
4a8210912a feat(featureflag): framework-level feature flag system (MUL-3615) (#4496)
* feat(featureflag): framework-level feature flag system (MUL-3615)

Introduces a reusable feature flag framework so future features can adopt
flags without writing infrastructure code.

Backend: server/pkg/featureflag (Go)
- Service / Provider / Decision separation per Martin Fowler's Toggle
  Point / Toggle Router / Toggle Configuration pattern.
- Providers: StaticProvider (rules in source control), EnvProvider
  (FF_<KEY> overrides for ops kill switches), ChainProvider
  (first-hit-wins composition).
- EvalContext carried through context.Context with WithEvalContext /
  EvalContextFrom; supports user_id, workspace_id, free-form attributes.
- PercentRollout via deterministic FNV-1a bucketing; same user always
  lands in the same bucket so experiments do not flap between requests.
- Nil-safe Service: a nil *Service or missing flag returns the caller's
  default so business code never panics on a missing flag.
- 100% unit-test coverage with -race; go vet clean.

Frontend: packages/core/feature-flags (TypeScript)
- Same vocabulary as the Go side (Decision, EvalContext, Rule,
  PercentRollout). FNV-1a parity ensures cross-tier bucket agreement.
- FeatureFlagService + StaticProvider + ChainProvider in pure TS.
- React glue: FeatureFlagsProvider, useFlag(key, default),
  useVariant(key, default). Hooks fall back to the default when no
  provider is mounted so Storybook / unit tests stay simple.
- Vitest tests for service, providers, hash, and React hooks.

Docs: docs/feature-flags.md — wiring, EvalContext, toggle points,
backend-protection note, and the standard best-practice checklist.

The framework intentionally has no third-party Go deps and no API
surface beyond what real callers will need. New providers (DB, remote
config, LaunchDarkly) plug in by implementing Provider; no existing
caller has to change.

Co-authored-by: multica-agent <github@multica.ai>

* fix(featureflag): cross-tier hash parity + variant only when enabled (MUL-3615)

Two must-fix issues from the PR review on #4496:

1. TS hash had a trailing zero separator that Go did not emit, so the
   same (key, identifier) bucketed differently on the two tiers. The
   "user lands in the same bucket on server and client" promise was
   broken. For example billing_new_invoice/user-42 was bucket 97 in Go
   and bucket 11 in TS.

   Fix: TS fnv1a now emits the zero separator BETWEEN parts only, never
   after the last one, matching Go's hash.Write byte stream exactly.
   Verified by parallel golden tests on both sides that pin five
   (key, identifier) -> bucket triples; if either side drifts both tests
   fail and one must be brought back in sync.

2. StaticProvider returned `Rule.Variant` regardless of whether the rule
   evaluated to enabled=true. A 0%-rollout user, a deny-listed user, or
   a default-off user would see variant="experiment-v2", so callers
   branching on Variant() would route control users into the experiment
   arm.

   Fix: Rule.Variant is now the ON-variant only. When the rule evaluates
   to enabled=false the Decision's variant is the canonical "off",
   regardless of what Rule.Variant says. Documented as a behavior
   contract in the Rule godoc / JSDoc and covered by regression tests
   on both sides.

Tests: - go test -race ./pkg/featureflag/...  : all green (1.58s).
  - pnpm --filter @multica/core test     : 661/661 (3 new).
  - pnpm --filter @multica/core typecheck: clean.
Co-authored-by: multica-agent <github@multica.ai>

* fix(featureflag): hash UTF-8 bytes on the TS side for cross-tier parity (MUL-3615)

Follow-up review on PR #4496 caught that the previous hash fix was only
correct for ASCII input. The TS side used `charCodeAt`, which returns
UTF-16 code units, while the Go side hashes the UTF-8 byte
representation. Any non-ASCII flag key or identifier — Chinese flag
names, accented user IDs, emoji — would bucket differently on backend
vs frontend, silently breaking the "same user, same bucket" promise the
PR description makes.

Concretely:
  flag/é         Go 53  vs TS-old 68
  flag/🦄        Go 82  vs TS-old 75
  实验/user-1    Go 90  vs TS-old 4
  flag/用户-1    Go 95  vs TS-old 2

Fix: replace per-char charCodeAt with a module-level `TextEncoder`
('utf-8') and hash each encoded byte. After the fix all four cases above
match Go exactly, and the existing ASCII cases continue to match.

The cross-language golden tables on both sides now include the 5 new
non-ASCII cases alongside the 5 ASCII cases, so any future regression
that swaps UTF-8 for charCodeAt (or vice versa) will fail loudly on
both Go and TS simultaneously.

TextEncoder is part of WHATWG Encoding and is available in every
evergreen browser, in Node 11+, and in Hermes (React Native) >= 0.74,
which covers every runtime that imports @multica/core/feature-flags.

Tests: - go test -race ./pkg/featureflag/...   : all green.
  - pnpm --filter @multica/core test      : 661/661.
  - pnpm --filter @multica/core typecheck : clean.
Co-authored-by: multica-agent <github@multica.ai>

* feat(featureflag): wire into main app config — YAML file + env override (MUL-3615)

Follow-up requested by Yushen on PR #4496: make the feature flag
framework configurable through the existing main-program config system
instead of requiring Go code edits. multica's main app is purely env-var
driven (see .env.example) with optional MULTICA_*_FILE knobs for richer
config; feature flags now follow the same pattern.

server/pkg/featureflag/config.go
  - LoadRulesFromYAMLFile(path) parses a YAML rule set into runtime
    Rule structs. Empty files are a valid "no flags yet" state; missing
    or malformed files surface a hard error so operators see misconfig
    the same way DATABASE_URL parse errors do.
  - NewServiceFromEnv composes the standard provider chain:
      1. EnvProvider("FF_")               (runtime kill-switch path)
      2. StaticProvider from YAML file    (declarative rule set)
    When MULTICA_FEATURE_FLAGS_FILE is unset, only the env layer is
    active and every IsEnabled call falls through to the caller's
    default, so the server can boot before any flag is authored.

server/cmd/server/main.go
  - Construct the Service once at startup right after env-var warnings,
    fail loudly on malformed YAML, log the loaded rule count via the
    Service logger. The Service is held in a local `flags` variable
    ready to be threaded into handler.Handler / service constructors
    when the first flag user lands. Threading is deferred to the PR
    that adds the first business consumer so this PR stays a pure
    framework + config layer.

.env.example
  - New "Feature flags" section documents MULTICA_FEATURE_FLAGS_FILE and
    the FF_<KEY> override convention, with a minimal YAML schema example
    inline.

docs/feature-flags.md
  - Replace the "build a provider manually" example with the
    NewServiceFromEnv pattern that now matches what main.go actually
    does. Show the YAML schema in one place. Note the on-variant /
    off semantics from the previous review round.

server/pkg/featureflag/doc.go
  - Update package doc to mention the gopkg.in/yaml.v3 dependency
    (already a server-level dep) instead of the now-inaccurate
    "no third-party dependencies" claim.

Tests: - go test -race -count=1 ./pkg/featureflag/...   all green; new
    config_test.go covers: simple YAML, full-shape YAML, empty file,
    missing file, malformed YAML, no env var, file-only, env-beats-file,
    bad file surfaces error.
  - go test -race -count=1 -run TestHealth ./cmd/server/...   sanity
    check that the main.go boot path with the new wiring still passes.
  - go vet ./...   clean.
Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-24 13:49:59 +08:00
Multica Eve
00b9668cd2 fix(autopilot): cold-start planner honors trigger.last_fired_at (MUL-3551) (#4495)
Post-deploy of the new scheduled-dispatch scheduler (PR #4444), an
autopilot configured for "weekdays 17:10 Asia/Shanghai" fired at
~12:30 Beijing the day after deploy — ~4h 38m before the next
scheduled time the UI showed. Traced to a cold-start regression in
the planner hook:

Old behaviour
-------------
On the first tick after migration the hook found no
`sys_cron_executions` row for the trigger
(`latestPlan(...).Found == false`) and anchored on the trigger's
`created_at`, then applied the 24h replay cap:

  after := cfg.CreatedAt
  if oldest := now.Add(-replayWindow); after.Before(oldest) {
      after = oldest // now - 24h
  }

For a trigger created days/weeks earlier and last fired by the
legacy goroutine at Mon 17:10 Beijing (= Mon 09:10 UTC), this set
`after = Tue 04:13 UTC - 24h ≈ Mon 04:13 UTC`. The half-open
enumeration `(Mon 04:13 UTC, Tue 04:13 UTC]` STILL contained Mon
09:10 UTC — the occurrence the legacy code had already handled —
so the new scheduler dispatched it again the moment it took over.
The result: a SCHEDULED-source autopilot_run with planned_at = Mon
17:10 Beijing but a wall-clock dispatch at Tue ~12:30 Beijing.

Timezone math was correct; the bug was purely the cold-start
anchor not respecting prior-fire history.

Fix

Co-authored-by: multica-agent <github@multica.ai>
---
The `autopilot_trigger.last_fired_at` column is maintained by both
the legacy goroutine and the new scheduler (via
TouchAutopilotTriggerFiredAt), so it is the authoritative
"most-recent successful fire" cursor across the migration boundary.
The planner hook now anchors cold-start enumeration on it:

  case latest.Found:        after = latest.PlanTime
  case lastFiredAt != zero: after = lastFiredAt
  default:                  after = cfg.CreatedAt

For the regressed case, `after = Mon 17:10 Beijing`, the next
enumeration window is `(Mon 17:10, Tue 12:30]`, and Tue 17:10 is
in the future — the hook returns nothing and the trigger waits
quietly for Tue 17:10 as the UI promised. For brand-new triggers
(last_fired_at NULL), the original `created_at` path still
applies. For long-dormant triggers the `replayWindow` cap remains.

Changes
-------
* `ListSchedulableAutopilotTriggers` SQL now returns
  `last_fired_at`.
* `autopilotTriggerConfig.LastFiredAt` is populated by the scope
  provider on every tick.
* `autopilotPlansForScope` cold-start branch uses the new anchor.

Tests
-----
* TestAutopilotScheduleJobColdStartHonorsLastFiredAt — seeds the
  exact dev-environment shape (created 3 days ago, last_fired_at
  5 hours ago, no sys_cron_executions row), runs a tick, asserts
  zero exec rows AND zero autopilot_run rows. Without the fix this
  test produces one of each at a historical plan_time.
* TestAutopilotScheduleJobColdStartBrandNewTriggerStillFires —
  asserts a brand-new trigger (last_fired_at NULL) still fires its
  first due occurrence on cold start.

All existing `TestAutopilotScheduleJob*` tests still pass.

Refs MUL-3551

Co-authored-by: Eve <eve@multica-ai.local>
2026-06-24 13:01:59 +08:00
Bohan Jiang
ce28d0aa0e feat(integrations): add platform-agnostic channel foundation (MUL-3515) (#4412)
* feat(integrations): add platform-agnostic channel foundation

Introduce server/internal/integrations/channel — the contract every
inbound IM integration implements, so the core never learns a platform's
event JSON. Four pieces:

- Channel interface (Type/Connect/Disconnect/Send/Capabilities) + Factory
  + Config (channel_type + opaque JSON blob, maps to channel_installation).
- Normalized InboundMessage/OutboundMessage envelopes + Source/MediaRef/
  ReplyCtx/MsgType/ChatType. Envelope holds only cross-platform-true
  fields; platform specifics live in Raw, read only by the adapter.
- Capability bitmask: declaration only, no degrade logic in core.
- Registry: Type->Factory map, last-writer-wins, concurrency-safe.

Pure package (no DB/network/platform deps). Foundation for MUL-3515; the
lark cutover + lark_*->channel_* generalization land in follow-up PRs.

MUL-3515

Co-authored-by: multica-agent <github@multica.ai>

* feat(channel): generalize lark_* tables into channel_* (DB layer)

Migration 123 creates channel_installation / channel_user_binding /
channel_chat_session_binding / channel_inbound_message_dedup /
channel_inbound_audit / channel_outbound_card_message /
channel_binding_token. Each carries a channel_type discriminator and a
JSONB config for platform-specific identifiers/credentials; cross-platform
columns stay flat. Existing Feishu rows are backfilled (channel_type=
'feishu', app_secret_encrypted via base64). NO foreign keys / cascades
(MUL-3515 §4) — integrity moves to the app layer in the cutover.

queries/channel.sql ports the lark query surface to channel_*, JSONB-aware,
plus DeleteChannelUserBindingsByWorkspaceMember /
DeleteChannelChatSessionBindingBySession for the app-layer cleanup that
replaces the removed cascades.

lark_* tables/queries are left in place here and removed once the Go
cutover lands, so this commit ships green on its own.

Verified: sqlc generate, go build ./..., full migrate chain (1..123) on
Postgres 17, and a real-data backfill spot-check (base64 round-trip,
NULL-strip, functional unique index on (channel_type, app_id)).

MUL-3515

Co-authored-by: multica-agent <github@multica.ai>

* fix(channel): name app_id query param + multi-IM install key + null-safe binding merge

Addresses review on MUL-3515 (PR #4412):

- GetChannelInstallationByAppID: explicitly name params and cast app_id to
  ::text so sqlc emits AppID string. A bare $2 next to `config ->> 'app_id'`
  was mis-attributed to the JSONB config column, generating Config []byte.

- channel_installation uniqueness -> (workspace_id, agent_id, channel_type),
  with the UpsertChannelInstallation conflict key matched. Lets one agent
  hold one installation per IM (feishu + slack + ...) instead of a later
  install clobbering an earlier one. Behaviorally identical in the current
  feishu-only world; "one agent, at most one IM overall" stays an app-layer
  rule per MUL-3515 §4, not a DB constraint.

- CreateChannelUserBinding merges jsonb_strip_nulls(EXCLUDED.config) so a
  re-bind carrying {"union_id": null} no longer erases an already-captured
  union_id, restoring the old COALESCE(EXCLUDED.union_id, ...) semantics.

Regenerated with sqlc v1.31.1. Verified on PG17: re-install replaces in
place, feishu+slack coexist, null re-bind keeps union_id, real union_id wins.

Co-authored-by: multica-agent <github@multica.ai>

* feat(lark): channel-backed Feishu store + fix base64 backfill wrapping

Cutover step 1 of switching the lark Go code from lark_* onto the channel_*
tables (MUL-3515). Introduces the JSONB config boundary the rest of the
cutover sits on, and fixes a latent backfill bug surfaced while building it.

- migration 123: strip newlines from the app_secret_encrypted base64 backfill.
  PostgreSQL encode(...,'base64') MIME-wraps at 76 chars, and a secretbox-
  sealed ~72-byte secret exceeds that. Go's encoding/json decodes a JSON
  string into []byte with base64.StdEncoding, which rejects embedded newlines,
  so without the strip every migrated installation would fail to decrypt its
  app secret once reads move to channel_installation.config.

- store.go: flat domain types (Installation / UserBinding / ChatSessionBinding)
  with field parity to the retired db.Lark* rows, plus the feishu config codec.
  Row->domain mappers decode the JSONB config; the secret decoder is
  whitespace-tolerant so legacy MIME-wrapped data still round-trips, while the
  encoder emits unwrapped base64. Binding config encodes an absent union_id as
  "{}" so the upsert's jsonb_strip_nulls merge never clobbers a stored union_id.

- store_test.go: 72-byte secret round-trip, MIME-wrapped tolerance, optional
  null-strip, and flat-column preservation. Verified on PG17.

Field parity keeps the upcoming ~190 db.LarkInstallation call sites a
mechanical rename. No call sites switched yet; behavior unchanged.

Co-authored-by: multica-agent <github@multica.ai>

* feat(lark): route inbound integration onto channel_* + explicit membership checks

Cutover step 2 (MUL-3515): switch the Feishu Go code from the lark_* queries to
channel_* via a ChannelStore adapter, and replace the removed member foreign key
with explicit application-layer membership checks. No user-visible behavior change.

- channel_store.go: ChannelStore embeds *db.Queries and SHADOWS the ~24 lark
  query methods with channel_*-backed equivalents, keeping the db.Lark*
  signatures so the dispatcher/hub/services and their ~20k lines of tests stay
  untouched; the feishu JSONB config is (de)coded by store.go. Adds
  IsWorkspaceMember and a tx-aware WithTx. Only production wiring swaps
  *db.Queries for *ChannelStore.

- Membership re-check (§4 removed the lark_user_binding -> member FK, so a
  binding row no longer proves current membership):
  * the dispatcher inbound identity step verifies membership after the binding
    lookup; a former member's stale binding is dropped as non_workspace_member
    + audited and never reaches chat_session (§4.3 safety property).
  * RedeemAndBind and BindInstallerTx replace the now-dead FK (23503) branch
    with an explicit IsWorkspaceMember gate, preserving the existing
    ErrBindingNotWorkspaceMember outcome without burning the token.

- router wires the ChannelStore into the patcher, typing indicator, dispatcher,
  hub, and the union_id/region backfills; constructor-based services wrap
  *db.Queries internally so their signatures and nil-check tests are unchanged.

Verified: go build ./... ; go vet ; gofmt ; go test -race ./internal/integrations/...
(full lark suite green unchanged + new membership drop/error tests). Adapter
field mappings (secret base64, union_id RMW, chat-id/open-id remaps, dedup,
token, card) checked end-to-end against a PG17 channel_* schema.

lark_* tables and queries remain (unused at runtime) until the S3 cleanup-hooks
and S4 drop-tables/rename commits.

Co-authored-by: multica-agent <github@multica.ai>

* fix(channel): renumber generalization migration 123 -> 124

main merged 123_issue_stage after this branch forked, so the branch's 123_channel_generalization now collides on the migration number. The runner keys schema_migrations by full version string and would still apply both, but a duplicate number is a merge hazard and convention violation, so move the channel migration to the next free slot (124).

issue_stage (ALTER issue ADD COLUMN stage) and the channel generalization touch disjoint tables; verified on PG17 that 123_issue_stage applies cleanly on a DB already carrying 124_channel_generalization, so the two are order-independent. sqlc regenerated (v1.31.1): only the migration-number comment changed.

MUL-3515

Co-authored-by: multica-agent <github@multica.ai>

* feat(channel): prune channel bindings on member removal + chat session delete

MUL-3515 §4 dropped every channel_* foreign key, so the old ON DELETE CASCADE that cleared a user's channel_user_binding when they left a workspace, and a chat's channel_chat_session_binding when its chat_session was deleted, no longer fires. Re-establish that integrity in the application layer, inside the existing transactions: revokeAndRemoveMember -> DeleteChannelUserBindingsByWorkspaceMember, DeleteChatSession -> DeleteChannelChatSessionBindingBySession.

Adds real-DB tests for both paths, including a scoping check that a remaining member's binding survives the prune. Verified on PG17: both new tests plus the existing revocation tests and the full handler package pass.

MUL-3515

Co-authored-by: multica-agent <github@multica.ai>

* fix(channel): scope Lark/Feishu store reads to channel_type='feishu'

The S2 cutover routed the Feishu integration onto channel_*, but the Lark-facing ChannelStore wrappers read installation / chat-session-binding / outbound-card rows across ALL channel_type values. Once a second IM exists, that would let the Lark hub supervise a non-Feishu installation, the Lark install list show it, /lark/installations/{id} revoke another channel's row, and the outbound patcher / typing indicator act on a non-Feishu chat binding or card.

Add a channel_type predicate to the six read/list channel queries and pass channelTypeFeishu from every wrapper: GetChannelInstallation, GetChannelInstallationInWorkspace, ListChannelInstallationsByWorkspace, ListActiveChannelInstallations, GetChannelChatSessionBindingBySession, GetChannelOutboundCardByTask.

The S3 cleanup deletes (DeleteChannelUserBindingsByWorkspaceMember / DeleteChannelChatSessionBindingBySession) stay all-channel on purpose: a member leaving or a chat_session being deleted should clear every IM's binding. Adds a real-DB test that seeds a Slack installation/binding/card next to the Feishu ones and asserts the Lark wrappers never return them.

MUL-3515

Co-authored-by: multica-agent <github@multica.ai>

* refactor(channel): replace db.Lark* translation layer with lark domain types

S2 introduced ChannelStore as a translation layer that read/wrote channel_* but kept the retired db.Lark* struct/param shapes so the dispatcher/hub/services and their ~20k lines of tests did not have to change. This collapses that layer: the store now takes and returns the package's flat domain types (Installation, UserBinding, ChatSessionBinding, InboundMessageDedup, BindingTokenRow, OutboundCardMessage) and the *Params types in params.go, with channel-neutral field names (ChannelUserID / ChannelChatID / ...). All call sites, fakes, and tests move to the domain types.

No behavior change: only channel_* is read/written (as before); db.Lark* is now unused, and the lark_* tables + queries/lark.sql are removed in the next commit. Verified on PG17: go build / vet / gofmt clean, go test -race ./internal/integrations/... green (the ~20k-line fake suite), and the lark + handler suites pass.

MUL-3515

Co-authored-by: multica-agent <github@multica.ai>

* refactor(channel): drop lark_* tables and queries (remove old path)

The Go cutover (previous commit) moved the lark package entirely onto channel_* and the domain types, leaving the lark_* tables, queries/lark.sql, and the generated db.Lark* models unused. Remove them per the design (§5: replace, do not keep both): migration 125 drops the seven lark_* tables (data already lives in channel_* since migration 124), and queries/lark.sql is deleted + sqlc regenerated, removing the db.Lark* models and lark query methods.

The 125 down recreates the authoritative pre-drop schema (bot_union_id, region, per-installation dedup PK, thread-reply columns). Verified on PG17: fresh migrate up ends with lark_* gone + channel_* present; isolated 125 down/up round-trips correctly; go build / vet / gofmt clean; go test -race ./internal/integrations/... and the handler suite pass.

MUL-3515

Co-authored-by: multica-agent <github@multica.ai>

* fix(migrations): remove trailing blank line at EOF of 125 down migration

git diff --check flagged a blank line at EOF of 125_drop_lark_tables.down.sql (a pg_dump-generation artifact). Whitespace only; the recreate SQL is unchanged.

MUL-3515

Co-authored-by: multica-agent <github@multica.ai>

* refactor(channel): defer lark_* table drop to a follow-up migration

Preflight deploy review: dropping lark_* in the same release that cuts over (old migration 125) is not rollback/rolling-safe — the v0.3.27 release still reads lark_*, so a rolling deploy or a post-deploy code rollback would hit "relation does not exist". Remove the drop and keep the old tables for one release (standard expand/contract): migration 124 already backfilled lark_* -> channel_*, the new code reads/writes only channel_*, and the physical drop moves to a separate cleanup migration once this ships and is observed.

The lark_* tables remain in the schema, so sqlc regenerates the (now unused) db.Lark* models; queries/lark.sql stays deleted (the new code uses channel_*). No code path reads lark_* — only the destructive drop is deferred, keeping the design's no-compat-layer / no-dual-write rule while being deploy-safe.

MUL-3515

Co-authored-by: multica-agent <github@multica.ai>

* fix(channel): skip orphaned installations in hub-boot active scan

Preflight deploy review: channel_installation dropped the workspace/agent FK (MUL-3515 §4), so unlike lark_installation it does not cascade away when its workspace is deleted or its agent is hard-deleted (e.g. runtime teardown). The hub-boot query then keeps opening a WebSocket for a bot whose owner is gone.

JOIN ListActiveChannelInstallations to live workspace + agent so an orphaned installation is never connected, uniformly for every deletion path. The JOIN matches the old ON DELETE CASCADE semantics (row existence, not agent archival), so an archived-but-present agent's installation is still listed; the orphaned row's encrypted secret is thereby never decrypted/used.

Tests: a real-DB handler test asserts a deleted-workspace/agent installation and a non-Feishu one are both excluded; the lark scope test's active-list assertion moved there since the JOIN now needs real workspace/agent fixtures. (Physically deleting dormant orphaned channel rows on workspace/agent deletion is a separate app-layer-cleanup follow-up.)

MUL-3515

Co-authored-by: multica-agent <github@multica.ai>

* docs(channel): document non-rolling cutover constraint for the lark->channel migration

Elon deploy review: keeping the lark_* tables (deferred drop) stops old v0.3.27 code from crashing, but is not full expand/contract. Migration 124 is a one-time backfill; afterwards new code runs on channel_* (lease + dedup on channel_*) while pre-cutover code runs on lark_* (lease + dedup on lark_*). If both run concurrently during a rolling deploy, each side claims the same Feishu bot's WS lease on its own table and double-processes inbound events.

This release therefore requires a NON-ROLLING cutover (stop the old hub before applying migration 124 + starting new code; rollback is not lossless once new code writes channel_*). Documented where deployers/reviewers see it: migration 124 header gains a ROLLOUT note; the channel_store.go header is corrected (lark_* tables are retained one release for rollback safety, not "gone"; the store still never touches them). Comment-only — no schema/codegen/behavior change.

MUL-3515

Co-authored-by: multica-agent <github@multica.ai>

* feat(lark): add MULTICA_LARK_HUB_DISABLED switch for the channel cutover

The lark_*->channel_* cutover needs a way to make the Feishu bot briefly unavailable WITHOUT taking down the whole multica-api process — the Lark hub is a goroutine inside it, not a separate Deployment. MULTICA_LARK_HUB_DISABLED=true parks the hub at startup: the API serves HTTP normally but never claims a WS lease or opens a Feishu connection.

Rollout (see migration 124 ROLLOUT note): ship the new release with the flag SET so new pods run API-only while old pods (hub on lark_*) drain during the rolling deploy — the two hubs never overlap. After the old pods are gone and migration 124 has run, flip the flag off; the new hub comes up on channel_*. The old backend does NOT need this switch — its hub stops when k8s terminates the old pods, not via a flag. Nil-ing LarkHub reuses the existing not-configured path so both the startup start and the shutdown join skip it.

MUL-3515

Co-authored-by: multica-agent <github@multica.ai>

* docs(channel): point migration 124 ROLLOUT note at the hub-disable switch

Refine the rollout note to use MULTICA_LARK_HUB_DISABLED for a bot-only cutover (new pods serve API with the hub parked while old pods drain; flip the switch off after the migration), instead of the earlier whole-API recreate. Comment-only.

MUL-3515

Co-authored-by: multica-agent <github@multica.ai>

* docs(channel): fix migration 124 rollout order and document self-host cutover

The previous ROLLOUT note shipped the new (channel_*) build before
running migration 124, so the channel_*-backed HTTP paths (installation
list/install/revoke, chat-session delete, member revoke) would 500 in
the window between new-pod boot and the deferred migration. Restate the
runbook around two explicit invariants — channel_* must exist before the
new build serves those paths, and the old/new hubs must never overlap —
and order the steps so channel_* is created first (park old hub -> snapshot
-> deploy parked new build -> unpark). Document that default self-host
(entrypoint migrate + single-replica Recreate) satisfies both invariants
automatically and needs no manual steps; only prd / multi-replica rolling
self-host needs the switch procedure. Clarify in main.go that the
hub-park switch is generation-agnostic (parks whichever hub the build
carries), which is what enables the preparatory release.

Refs MUL-3515

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-24 12:46:20 +08:00
Bohan Jiang
3103ed1082 fix(agent): surface Antigravity provider log failures (#4494)
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-24 12:45:52 +08:00
Multica Eve
131ca80a6c refactor(autopilot): migrate scheduled dispatch to scheduler.Manager (3/3 MUL-3551) (#4444)
* refactor(autopilot): migrate scheduled dispatch to scheduler.Manager

PR 3 of 3 for the scheduled-Autopilot refactor on MUL-3551.

Replaces the legacy cmd/server/autopilot_scheduler.go goroutine
(30 s app-clock polling, app-time cron advancement, weak crash
recovery) with a JobSpec registered on the existing
scheduler.Manager. sys_cron_executions is now the lease + audit
table for scheduled Autopilot occurrences, and the unique key on
(job_name, scope_kind, scope_id, plan_time) is the primary
guarantee that the same planned fire time cannot produce two runs.

What changed

  * server/internal/scheduler/jobs_autopilot.go
    New AutopilotScheduleDispatchJob factory:
      - scope_kind = "autopilot_trigger", scope_id = trigger.id
      - PlansForScope hook (from PR 1) enumerates cron occurrences
        in (lastPlan, dbNow] and collapses missed fires to the most
        recent one (CatchUpLatestOnly — same policy the legacy
        goroutine had, now provable via a one-row-per-tick audit).
      - Handler re-loads trigger + autopilot inside the handler so a
        between-tick state change (paused, disabled, deleted) takes
        effect immediately and is recorded as a no-op SUCCESS row
        with skipped_reason in the result JSON.
      - Calls AutopilotService.DispatchAutopilotForPlan (from PR 2)
        for the actual run creation; that path is itself idempotent
        on (trigger_id, planned_at), so a stale-steal retry reuses
        the run created by the prior attempt instead of duplicating.
      - RunTimeout=2m, StaleTimeout=5m, HeartbeatInterval=30s,
        AllowStaleReentry=true, MaxAttempts=3, RetryBackoff
        [1m, 5m, 15m], MaxPlansPerTick=5 (safety cap).

  * server/internal/scheduler/manager.go
    Manager.runOnce promoted to RunOnce (exported) so external test
    packages can drive deterministic ticks; existing call sites in
    this package + cmd/server tests updated.

  * server/internal/service/cron.go
    NextOccurrenceAfterUTC and NextOccurrencesUTC: cron evaluators
    that take an explicit "now" instant. Callers pass dbNow() so
    schedule decisions stay consistent across app instances with
    clock skew. Legacy ComputeNextRun is preserved (delegating to
    NextOccurrenceAfterUTC with time.Now()) for the display-only
    autopilot_trigger.next_run_at write path — scheduling decisions
    no longer use it.

  * server/pkg/db/queries/autopilot.sql
    ListSchedulableAutopilotTriggers replaces the legacy
    ClaimDueScheduleTriggers (the new path no longer mutates
    autopilot_trigger.next_run_at on claim). RecoverLostTriggers
    removed — sys_cron_executions lease theft now handles crash
    recovery without an in-handler restart sweep.

  * server/cmd/server/main.go
    The "go runAutopilotScheduler(...)" line is gone. The new
    JobSpec is registered alongside TaskUsageHourlyJob on the
    existing schedulerMgr (still using sweepCtx for lifecycle).

  * server/cmd/server/autopilot_scheduler.go DELETED.

Tests

  * server/internal/service/cron_test.go — unit tests for the cron
    helpers: timezone-aware enumeration, half-open (after, until]
    window, plan_time-exclusive "after", invalid inputs surface
    parse errors, and the "ignores wall clock" property the
    scheduler relies on.

  * server/cmd/server/autopilot_schedule_job_test.go — DB-backed
    integration tests:
      - DispatchesOnce: one tick → 1 SUCCESS exec row + 1
        autopilot_run with planned_at set; a second tick does not
        regress the count.
      - MissedSchedulesCollapse: an hour of missed */5 fires
        produce a single autopilot_run, not 12.
      - CrashRecovery: simulated stale RUNNING lease at the same
        plan_time → second tick reclaims it and DOES NOT duplicate
        autopilot_run.
      - TwoRunnersSingleWinner: two concurrent
        scheduler.Manager instances on the same trigger →
        per-plan_time uniqueness holds (sys_cron_executions never
        has two RUNNING rows at the same plan_time, autopilot_run
        count == exec row count).
      - DisabledTriggerSkips: a trigger disabled between
        scope-list and tick produces no exec row.
      - PausedAutopilotSkipsAtHandler: an autopilot paused after
        the first tick does not produce a new exec row.
      - BadCronFailsLoudly: an invalid cron expression never fires
        dispatch (parse error surfaces in the plan hook).
    Existing autopilot listener / squad / dispatch tests still
    pass.

  * server/internal/scheduler/plans_for_scope_test.go from PR 1
    still passes (RunOnce rename only).

Verification

  * go build ./...
  * go vet ./...
  * go test ./internal/scheduler ./internal/service ./cmd/server
    ./internal/handler — all green.

Rollback

  * Reverting this commit re-introduces the legacy goroutine.
    Migration 124 (PR 2) and the scheduler hook (PR 1) stay in
    place. Autopilot data on disk is forward- and backward-
    compatible: planned_at columns are nullable, the legacy
    goroutine never reads planned_at and the new job never reads
    autopilot_trigger.next_run_at.

Refs MUL-3551

Co-authored-by: multica-agent <github@multica.ai>

* fix(autopilot): scheduler hook retries FAILED plans + tighten tests

Review fix for #4444 (MUL-3551).

Blocker: hook planner skipped the FAILED-with-retry plan_time

`autopilotPlansForScope` unconditionally set
`after = latest.PlanTime` when `latest.Found`, then enumerated cron
occurrences in the half-open interval `(after, dbNow]`. That
EXCLUDED the FAILED plan_time itself, so `tryClaim`'s
"FAILED-with-retry" branch — which only fires when the planner
returns the same plan_time — never ran. A claim + crash sequence
left the FAILED row stuck at attempt<max_attempts forever and the
scheduled occurrence was lost (MUL-3551 acceptance ③).

Fix: hook now branches on `latest.RetryEligible(now)` BEFORE
computing `after`. When the most recent stored row is FAILED with
attempts remaining and next_retry_at <= dbNow, the hook returns
`[latest.PlanTime]` unchanged. tryClaim's retry-from-FAILED path
fires, attempt increments, the run is retried, and the audit row
reaches SUCCESS at the same plan_time. Mirrors the cadence
planner's `info.RetryEligible(now)` branch in manager.plansForTick.

Tests

  * TestAutopilotScheduleJobCrashRecovery rewritten to actually
    pin the retry contract instead of just "no duplicate run":
      - assert first attempt completes at attempt=1 with a real
        task_id linkage (the "complete" snapshot the retry must
        reuse);
      - simulate a crash mid-dispatch (status=RUNNING, expired
        stale_after, ghost lease_token);
      - assert tick 2 transitions the SAME exec row (same plan_time)
        to status=SUCCESS at attempt=2 (proving the planner did
        NOT skip past the FAILED bucket);
      - assert autopilot_run stays at exactly one row, reused from
        the first attempt — proving DispatchAutopilotForPlan's
        complete-run reuse path is what closes the loop.

  * TestAutopilotScheduleJobPausedAutopilotSkipsAtHandler rewritten
    to invoke `job.Handler` directly (the previous version drove
    `mgr.RunOnce` which short-circuited at the scope-list SQL
    filter and never reached the handler). The new test pauses the
    autopilot AFTER setup, calls the handler with a fabricated
    HandlerInput, and asserts the handler returns
    skipped_reason=autopilot_inactive without creating an
    autopilot_run.

  * TestAutopilotScheduleJobBadCronFailsLoudly renamed to
    TestAutopilotScheduleJobBadCronStaysSilent and updated to
    match the real implementation: a parse error in the plan hook
    surfaces as a manager-level warning log, NOT a
    sys_cron_executions row (no plan_time was ever claimed). The
    test now asserts zero exec rows AND zero autopilot_run rows,
    documenting that bad cron is a permanent configuration error
    (caught at HTTP create/update time first), not a transient
    failure that belongs in the retry envelope.

Refs MUL-3551

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-24 12:09:07 +08:00
Multica Eve
eca6102365 refactor(autopilot): autopilot_run.planned_at + DispatchAutopilotForPlan (2/3 MUL-3551) (#4443)
* refactor(scheduler): add PlansForScope hook for non-cadence jobs

The current Manager.plansForTick assumes a uniform Cadence grid:
plan_times are derived via FloorPlan(db_now, Cadence). That works for
rollup_task_usage_hourly but not for the upcoming Autopilot schedule
dispatch job, where each trigger has its own cron expression and the
plan_times do not snap to a single global grid.

This change adds an optional JobSpec.PlansForScope hook. When set:

  * Manager loads the latest stored plan for (job, scope) and passes
    a new LatestPlanInfo to the hook (exported from the previously
    private latestPlanInfo). The hook returns the plan_times to attempt
    this tick.
  * Cadence, CatchUpMode and CatchUpWindow are bypassed; the hook is
    in full control of plan_time selection.
  * MaxPlansPerTick still acts as a safety cap on the hook's output.
  * All other timing fields (RunTimeout / StaleTimeout /
    HeartbeatInterval / MaxAttempts / RetryBackoff / AllowStaleReentry)
    and the lease/heartbeat/terminal-write SQL primitives are reused
    unchanged.

JobSpec.validate now allows Cadence=0 when PlansForScope is set, and
makes the every_plan MaxPlansPerTick > 0 invariant fire only on
Cadence-driven every_plan jobs. Existing rollup_task_usage_hourly
behaviour is unchanged — that JobSpec leaves PlansForScope nil.

Tests:
  * TestJobSpecValidatePlansForScopeRelaxesCadence — validate() rules.
  * TestManagerPlansForScopeHookDrivesPlans — end-to-end hook delegation
    through the manager (DB-backed), proving that hook-returned
    plan_times go through the same tryClaim path, MaxPlansPerTick
    truncates without erroring, and LatestPlanInfo is populated on the
    second tick.
  * TestManagerPlansForScopeHookEmptyIsNoOp — empty hook output is a
    valid no-op.

No behaviour change for callers that don't set PlansForScope.

Refs MUL-3551

Co-authored-by: multica-agent <github@multica.ai>

* refactor(autopilot): add planned_at + DispatchAutopilotForPlan for occurrence idempotency

PR 2 of 3 for the scheduled-Autopilot refactor on MUL-3551.

Adds dispatch-layer idempotency for scheduled triggers. This is the
second line of defence behind the primary uq_sys_cron_execution
guarantee in sys_cron_executions: if a runner crashes between
"create autopilot_run" and "write SUCCESS in sys_cron_executions",
the next stale-steal retry re-enters dispatch with the SAME
(trigger_id, planned_at). Without a row-level guard, that retry
would create a duplicate autopilot_run, issue, and task.

Changes:

  * Migration 124: ALTER TABLE autopilot_run ADD COLUMN planned_at
    TIMESTAMPTZ + partial unique index on (trigger_id, planned_at)
    WHERE both are NOT NULL. Manual / webhook / api dispatch leaves
    planned_at NULL so they keep the existing semantics unchanged.

  * autopilot.sql: CreateAutopilotRun now takes planned_at;
    GetAutopilotRunByTriggerAndPlanned is the fast-path lookup used
    by DispatchAutopilotForPlan to detect a prior attempt's row
    without burning an INSERT.

  * service.DispatchAutopilotForPlan: new entry point for scheduled
    triggers that already know the canonical UTC plan_time of the
    occurrence they are firing. Looks up an existing run for
    (trigger_id, planned_at) and reuses it on a stale-steal retry;
    otherwise dispatches normally with planned_at stamped on the
    new run.

  * service.DispatchAutopilot keeps its current signature for
    manual / webhook / api callers (planned_at stays NULL).

  * recordSkippedRun also threads planned_at so the skip path
    participates in the same partial-unique guarantee.

  * sqlc v1.31.1 regenerated autopilot.sql.go + models.go.
    Unrelated workspace.sql.go drift restored.

Tests (against local Postgres):

  * TestDispatchAutopilotForPlanIsIdempotent — first call creates a
    run; second call with same (trigger, planned_at) reuses it
    (autopilot_run row count stays at 1); third call with a different
    planned_at on the same trigger creates a second run (proves we
    are not collapsing legitimate occurrences).

  * TestDispatchAutopilotForPlanRejectsZeroArgs — invalid trigger_id
    and zero planned_at both fail loudly so callers cannot silently
    disable the idempotency guard.

  * Existing autopilot listener / squad / dispatch tests all still
    pass.

This PR has no scheduler / handler / UI behaviour change on its own:
the new entry point exists but is not yet wired into the schedule
goroutine. PR 3 will register the autopilot_schedule_dispatch
JobSpec that consumes it and remove the legacy
cmd/server/autopilot_scheduler.go path.

Refs MUL-3551

Co-authored-by: multica-agent <github@multica.ai>

* fix(autopilot): DispatchAutopilotForPlan recovers partial-state runs

Review fix for #4443 (MUL-3551).

Before this change, DispatchAutopilotForPlan returned ANY existing
autopilot_run for (trigger_id, planned_at), including the
half-written rows produced when a runner crashed between
"CreateAutopilotRun" and "create downstream issue/task". The
scheduler handler would then write SUCCESS in sys_cron_executions
even though no issue or agent task was ever created, silently
losing the scheduled occurrence.

Fix:

  * New isAutopilotRunComplete helper classifies an existing run:
      - terminal status (completed / failed / skipped) → reuse.
      - issue_created with valid issue_id → reuse (the issue
        listener owns task creation from here).
      - running with valid task_id → reuse (the task is queued).
      - anything else → partial; must NOT short-circuit.

  * New SQL RecoverPartialAutopilotRun marks a partial row FAILED
    with a recovery reason AND clears its planned_at. The cleared
    planned_at releases the partial-unique slot in
    uq_autopilot_run_trigger_planned, letting the fresh dispatch
    INSERT a new row at the same (trigger_id, planned_at) without
    conflict.

  * DispatchAutopilotForPlan now branches on the lookup:
    complete run → return; partial run → recover-then-fresh-
    dispatch; not-found → fresh dispatch. The fresh dispatch path
    still goes through dispatchAutopilot, so the new row carries
    the real issue_id / task_id by the time the handler returns.

  * Tests: TestDispatchAutopilotForPlanRecoversPartialRun seeds a
    partial run (status='running', task_id=NULL for run_only;
    status='issue_created', issue_id=NULL for create_issue) and
    asserts the retry:
      - returns a DIFFERENT run row (no false reuse);
      - leaves the partial row in status='failed', planned_at=NULL,
        with a non-empty failure_reason for ops;
      - produces a fresh row with planned_at preserved AND the
        appropriate downstream linkage (task_id for run_only,
        issue_id for create_issue);
      - exactly one live row at (trigger_id, planned_at) after
        recovery, so the partial-unique constraint is honoured.

Existing TestDispatchAutopilotForPlanIsIdempotent and
TestDispatchAutopilotForPlanRejectsZeroArgs still pass — the
complete-reuse path is unchanged for the realistic SUCCESS-state
case.

Refs MUL-3551

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-24 12:01:10 +08:00
Multica Eve
f9ed62f075 refactor(scheduler): add PlansForScope hook for non-cadence jobs (#4442)
The current Manager.plansForTick assumes a uniform Cadence grid:
plan_times are derived via FloorPlan(db_now, Cadence). That works for
rollup_task_usage_hourly but not for the upcoming Autopilot schedule
dispatch job, where each trigger has its own cron expression and the
plan_times do not snap to a single global grid.

This change adds an optional JobSpec.PlansForScope hook. When set:

  * Manager loads the latest stored plan for (job, scope) and passes
    a new LatestPlanInfo to the hook (exported from the previously
    private latestPlanInfo). The hook returns the plan_times to attempt
    this tick.
  * Cadence, CatchUpMode and CatchUpWindow are bypassed; the hook is
    in full control of plan_time selection.
  * MaxPlansPerTick still acts as a safety cap on the hook's output.
  * All other timing fields (RunTimeout / StaleTimeout /
    HeartbeatInterval / MaxAttempts / RetryBackoff / AllowStaleReentry)
    and the lease/heartbeat/terminal-write SQL primitives are reused
    unchanged.

JobSpec.validate now allows Cadence=0 when PlansForScope is set, and
makes the every_plan MaxPlansPerTick > 0 invariant fire only on
Cadence-driven every_plan jobs. Existing rollup_task_usage_hourly
behaviour is unchanged — that JobSpec leaves PlansForScope nil.

Tests:
  * TestJobSpecValidatePlansForScopeRelaxesCadence — validate() rules.
  * TestManagerPlansForScopeHookDrivesPlans — end-to-end hook delegation
    through the manager (DB-backed), proving that hook-returned
    plan_times go through the same tryClaim path, MaxPlansPerTick
    truncates without erroring, and LatestPlanInfo is populated on the
    second tick.
  * TestManagerPlansForScopeHookEmptyIsNoOp — empty hook output is a
    valid no-op.

No behaviour change for callers that don't set PlansForScope.

Refs MUL-3551

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-24 12:00:34 +08:00
Naiyuan Qing
b79777caec feat(comments): resolve-aware fold for agent comment reads (MUL-3555) (#4463)
* feat(comments): resolve-aware fold for agent comment reads (MUL-3555)

Agents reading a long issue paid tokens for settled discussion. The human
timeline already folds resolved threads, but the agent read path
(`comment list`) ignored resolved_at entirely — humans saw the conclusion,
agents got the full raw discussion.

Add an opt-in `fold=true` projection to ListComments that collapses each
resolved thread to root + conclusion (reply-resolved) or root only
(root-resolved), reusing the human timeline's deriveThreadResolution
semantics. The resolved thread's root carries `thread_resolved` +
`folded_count`; `--full` brings the dropped comments back. Fold is rejected
on partial-thread reads (since/tail) and roots_only, where a resolution
comment could be unfetched and silently dropped.

CLI `comment list` folds by default on the complete-thread reads (default,
--recent, untailed --thread) with a `--full` escape hatch; the agent
prompts and runtime brief document the fold + escape. No new endpoint, no
human UI change, no SQL/migration change — in-memory projection, same
precedent as summary/roots_only.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

* refactor(daemon): dedupe fold prompt restatements per review (MUL-3555)

Howard's PR review flagged DRY redundancy: the resolve-fold rule was
restated in full in the task prompt (prompt.go:41/:182) and the brief
workflow steps (runtime_config.go:673/:692, reply_instructions cold
hint) even though the canonical command catalog (runtime_config.go:477)
— always present in the brief — already documents it in full, and the
task prompt explicitly defers to it ("follow the rule in your runtime
workflow file").

Keep the catalog entry full (the canonical reference); shrink the five
inline restatements to a short "resolved threads come back folded —
`--full` to expand" pointer. No loss of signal (the agent always has the
full catalog in context), ~80-120 tokens/run saved on the worst-case
assignment / cold paths.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-24 09:52:18 +08:00
Bohan Jiang
ac84b8c70c fix(agent): stop Antigravity turns dying at agy's hidden 5m print-timeout (#4462)
agy's --print-timeout defaults to 5m when the flag is omitted, but the
daemon treated "omit the flag" as "no cap". In the default no-cap config
every Antigravity turn was therefore silently capped at 5 minutes: any
run whose build/tests outlived the budget had agy abort mid-turn, print
"Error: timed out waiting for response", and exit 0 — which the backend
recorded as a successful "completed" with truncated output (the reported
"Antigravity disconnects", MUL-3570 / #4453).

- Always pass --print-timeout: the configured cap when positive, else a
  large value (24h) that defers to the daemon's idle/tool watchdogs.
- Detect agy's print-mode timeout marker in the run log and surface the
  result as a timeout instead of a truncated success.

Verified by reproducing against agy 1.0.8 and with new unit + end-to-end
backend tests.

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-23 18:57:50 +08:00
Bohan Jiang
294953ba37 fix: delete custom runtime profiles from runtime rows (#4456)
Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-23 16:52:46 +08:00
Bohan Jiang
5038c983c0 MUL-3281: Add daemon skill bundle refs (#4445)
* feat: add daemon skill bundle refs

Co-authored-by: multica-agent <github@multica.ai>

* fix: tighten skill bundle resolve safeguards

Co-authored-by: multica-agent <github@multica.ai>

* feat: add task prepare lease

Co-authored-by: multica-agent <github@multica.ai>

* fix: isolate prepare lease concurrent index migration

Co-authored-by: multica-agent <github@multica.ai>

* fix: keep prepare lease active through start

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: J <j@multica.ai>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-23 16:19:16 +08:00
Naiyuan Qing
3ce97453b3 fix(issues): pre-trigger preview + run-confirm + handoff UX polish (MUL-3375) (#4454)
* fix(issues): stop issue-trigger preview flicker

The pre-trigger preview re-rendered/refetched on every workspace task
event: WS task lifecycle invalidated issueTriggerPreviewAll (staleTime 0),
forcing a background refetch whose isFetching was surfaced as isLoading,
collapsing and reopening CreateRunHint's reveal band.

The assign source (create / assignee change) cancels existing tasks before
enqueuing, so its verdict can't shift from a task event at all; the status
source's pending dedup could, but the preview is advisory and the write
path re-evaluates authoritatively, so a rare stale label is harmless. Drop
the WS invalidation so the preview refetches only on input (signature)
change. Keep the comment-trigger invalidation — its verdict genuinely
changes mid-compose and its chips drive an immediate, unconfirmed send.

Align the hook's data handling with the comment-trigger preview:
keepPreviousData so an input switch swaps in place instead of collapsing,
and treat only the first load (no prior data) as loading.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(issues): skip run-confirm modal for backlog assign

Assigning a Backlog issue to an agent/squad never starts a run (the
parking lot — server/internal/service/issue_trigger.go), so the
pre-trigger confirm modal only rendered an empty "won't start" box with
a single Apply button. Apply directly instead: the single path checks
issue.status, the batch path skips only when every selected issue is
Backlog (mixed selections still confirm — the non-backlog ones trigger).
Mirrors the existing backlog short-circuit in handleBatchStatus.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix(modals): run-confirm loading state + submit spinner

The dialog grew in height after open: it rendered the short "won't
start" variant while POST /api/issues/preview-trigger was in flight, then
the note box appeared when the predicate landed. Keep the note box
mounted (disabled) during loading so assign mode opens at its resolved
height, and show a Spinner + 'checking' headline while loading.

Submit had no feedback — buttons only disabled, which read as frozen for
note assigns (the request starts an agent server-side). Track which
footer action is in flight and show a Spinner on the clicked button.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* feat(issues): show handoff note in execution-log trigger text

An assignment-triggered run that carried a handoff note showed the
generic "Initial run" label. Surface the note inline (truncated, like
comment triggers show their text) so the row reads as the handoff.

taskToResponse now populates handoff_note for all callers (dropping the
now-redundant explicit set in ClaimTaskByRuntime); the field is added to
the AgentTask type + zod schema (optional, additive — old clients ignore
it via the loose schema, new clients fall back to "Initial run").

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 16:15:44 +08:00
Multica Eve
12ea1f6a8c MUL-3495: support custom runtime args and registration errors (#4408)
* feat: support custom runtime args

Co-authored-by: multica-agent <github@multica.ai>

* fix: address custom runtime review nits

Co-authored-by: multica-agent <github@multica.ai>

---------

Co-authored-by: Eve <eve@multica-ai.local>
Co-authored-by: multica-agent <github@multica.ai>
2026-06-23 14:20:18 +08:00
Reinhard Schneidewind
7008f4276d MUL-3558 feat: add 'issue usage' CLI command for aggregated issue token usage
`GET /api/issues/:id/usage` endpoint (handler `GetIssueUsage`). Returns the
aggregated token usage for an issue, summed across all of its task runs.

The usage is already captured server-side and shown in the issue detail view,
but was not reachable from the CLI, so it couldn't feed billing/cost scripts.
This closes that gap with no backend changes.

$ multica issue usage MUL-139
INPUT_TOKENS  OUTPUT_TOKENS  CACHE_READ  CACHE_WRITE  RUNS
5625          83880          9190806     154078       1

$ multica issue usage MUL-139 --output json
{ "total_input_tokens": 5625, ... "task_count": 1 }

Accepts an issue key (`MUL-139`) or UUID, mirroring `issue runs`.

- Table cells use the existing `formatMetadataValue` helper so large cache-token
  counts render as plain integers (not scientific notation).
- Scope is the per-issue aggregate the endpoint already returns (`task_count` =
  run count); a per-run token breakdown is out of scope.

- `go test ./cmd/multica/ -run TestRunIssueUsage` (added) 
- `go vet ./cmd/multica/` 
- Verified against a live self-hosted server; numbers match the issue UI.

- `server/cmd/multica/cmd_issue.go` — command + handler
- `server/cmd/multica/cmd_issue_test.go` — unit test
- `CLI_AND_DAEMON.md` — docs
2026-06-23 14:13:57 +08:00